mirror of https://gitee.com/openkylin/linux.git
5727 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
Yu Kuai | af25dbc846 |
blk-cgroup: synchronize blkg creation against policy deactivation
[ Upstream commit |
|
Andrea Righi | 3e6e11f853 |
blk-wbt: prevent NULL pointer dereference in wb_timer_fn
[ Upstream commit |
|
Jens Axboe | a3d08aae18 |
block: remove inaccurate requeue check
[ Upstream commit
|
|
Jens Axboe | 396c9e834d |
block: bump max plugged deferred size from 16 to 32
[ Upstream commit
|
|
Linus Torvalds | a379fbbcb8 |
block-5.15-2021-10-29
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmF8DN8QHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgptM8D/44F9YcY8qRhrZmsUFr0QFlvFHHUCVCWtDR JW3JQN3hV0zBEVIvc0P3NSKAih/1+rJ3WZmVZA0lczm5OHv4C+ESZSmcl3Muv4Tk skOWxwDTIoSCvC+DzDw8k5UluOucLU9V7uLHQYqDOsqHngLwUerGDMwGfkMXKkNb zRvVaqQMUJufY0tN5QaEjl+GsaXiZJ0pid0MOtXo8NeU+K0BDyoBUF5Gco3/8ZYa NtD4hwM48kYoCNJDAeJmRNo3vArPpZdiJ77jeVXHrj42Mp20LK/jD7PdguEbUzq5 3uXhn0boZCKFGhWQntkL18WwaZbFRZzTBpBqpFIjQKIvicNRoGArIguwwmwdt53P lbsgGgyMqQ3KvuOIEgrAFieA/mQ8iw8Pf/QWiQRk2aYA5n+miex1XfmVX7dVipdm OcV5HtLrKPR1newr0/eZIvN31C3tgaViYxxQOunfW57fXPthCazal+ON6K5w9ZZ8 y79P+K1czCS/edKLTB+idvmWWijoF4GRguUMoCKsD4uXOZO0tk/f/U/ds8/+LBxm KQv8T9wBd+r5h225cB+boM+zslkB4vqnCT+MyiIp070ZAbi/ohirYcFC2fM5j57B UZ58C+WjGjtC0nhb0xS8EGM43ow6hkS+8LPub3fb01AKKvfaw22grc9vHBaFgXgW 5rqLDaAIcg== =73x5 -----END PGP SIGNATURE----- Merge tag 'block-5.15-2021-10-29' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: - NVMe pull request: - fix nvmet-tcp header digest verification (Amit Engel) - fix a memory leak in nvmet-tcp when releasing a queue (Maurizio Lombardi) - fix nvme-tcp H2CData PDU send accounting again (Sagi Grimberg) - fix digest pointer calculation in nvme-tcp and nvmet-tcp (Varun Prakash) - fix possible nvme-tcp req->offset corruption (Varun Prakash) - Queue drain ordering fix (Ming) - Partition check regression for zoned devices (Shin'ichiro) - Zone queue restart fix (Naohiro) * tag 'block-5.15-2021-10-29' of git://git.kernel.dk/linux-block: block: Fix partition check for host-aware zoned block devices nvmet-tcp: fix header digest verification nvmet-tcp: fix data digest pointer calculation nvme-tcp: fix data digest pointer calculation nvme-tcp: fix possible req->offset corruption block: schedule queue restart after BLK_STS_ZONE_RESOURCE block: drain queue after disk is removed from sysfs nvme-tcp: fix H2CData PDU send accounting (again) nvmet-tcp: fix a memory leak when releasing a queue |
|
Shin'ichiro Kawasaki | e0c60d0102 |
block: Fix partition check for host-aware zoned block devices
Commit |
|
Naohiro Aota | 9586e67b91 |
block: schedule queue restart after BLK_STS_ZONE_RESOURCE
When dispatching a zone append write request to a SCSI zoned block device, if the target zone of the request is already locked, the device driver will return BLK_STS_ZONE_RESOURCE and the request will be pushed back to the hctx dipatch queue. The queue will be marked as RESTART in dd_finish_request() and restarted in __blk_mq_free_request(). However, this restart applies to the hctx of the completed request. If the requeued request is on a different hctx, dispatch will no be retried until another request is submitted or the next periodic queue run triggers, leading to up to 30 seconds latency for the requeued request. Fix this problem by scheduling a queue restart similarly to the BLK_STS_RESOURCE case or when we cannot get the budget. Also, consolidate the checks into the "need_resource" variable to simplify the condition. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Niklas Cassel <Niklas.Cassel@wdc.com> Link: https://lore.kernel.org/r/20211026165127.4151055-1-naohiro.aota@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Ming Lei | d308ae0d29 |
block: drain queue after disk is removed from sysfs
Before removing disk from sysfs, userspace still may change queue via
sysfs, such as switching elevator or setting wbt latency, both may
reinitialize wbt, then the warning in blk_free_queue_stats() will be
triggered since rq_qos_exit() is moved to del_gendisk().
Fixes the issue by moving draining queue & tearing down after disk is
removed from sysfs, at that time no one can come into queue's
store()/show().
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Fixes:
|
|
Linus Torvalds | 9c0c4d24ac |
block-5.15-2021-10-22
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmFzfzwQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpqvoEACZB9dFKiYyFcv6X6ARAhfKDuE4ImaJ/hLI VCt4M52P06nqywPQ7iyZMRWR/EVKW8ADDJiiAeXy3mDBgMqD6O27u894i1JP06sn 5gHcSvfH1b9PNFMUw04aI3iXXFUzQU7pwn5z6o2nXlA4onPVW7vwTp8fcHd41Kep LqvZJbihW/iF9d0Wjs9LqPRBWXchtsVyxiNDBgC+kx5IYFn+oTnZOhlxw8ZiT/KH 0v8FIq9HY+5n6UP7InZF2gtIQUyDTR5L1zKKvJu5LDoHvcNlM6Ke0m3DVPcgP79D 2kKGyHOGfqC9Gr37qqOjgKqRO/Z/9SCvG39dmocAd/hh3AfUgKpDQs3HgLyx7ECT aRAe5n0XbfIVcHX1XaOc8cGrszan9YhJvt/dMCmkjaG/3hASlzl2kV4QF3f5IVjx oMgB1Kj8kyu6SqG8mCCjyGCxPpzNq8lVplJRlpifoz+ID/+hgt03aDoYVfPZkDRL nf4VdQCRSl3ZEXkHy1j6l6Nb2UgNEZP1B3a/9onSyBJ/WYqSfFMXrx29PSirz7m7 x4jGOJvdqtNx09zjWHXc/d+I8BEXp4JDXe0GH0OHMiwCwz5PoMo99HRb+IuffKjR lWl4EimH0bfzOA/3vFr5TigfqbnDJ7HCRrGsodQX8gJhVaxVTWxeZG+7Y9qkLqnD JGlZeMQ37w== =uhGw -----END PGP SIGNATURE----- Merge tag 'block-5.15-2021-10-22' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "Fix for the cgroup code not ussing irq safe stats updates, and one fix for an error handling condition in add_partition()" * tag 'block-5.15-2021-10-22' of git://git.kernel.dk/linux-block: block: fix incorrect references to disk objects blk-cgroup: blk_cgroup_bio_start() should use irq-safe operations on blkg->iostat_cpu |
|
Zqiang | 9fbfabfda2 |
block: fix incorrect references to disk objects
When adding partitions to the disk, the reference count of the disk object is increased. then alloc partition device and called device_add(), if the device_add() return error, the reference count of the disk object will be reduced twice, at put_device(pdev) and put_disk(disk). this leads to the end of the object's life cycle prematurely, and trigger following calltrace. __init_work+0x2d/0x50 kernel/workqueue.c:519 synchronize_rcu_expedited+0x3af/0x650 kernel/rcu/tree_exp.h:847 bdi_remove_from_list mm/backing-dev.c:938 [inline] bdi_unregister+0x17f/0x5c0 mm/backing-dev.c:946 release_bdi+0xa1/0xc0 mm/backing-dev.c:968 kref_put include/linux/kref.h:65 [inline] bdi_put+0x72/0xa0 mm/backing-dev.c:976 bdev_free_inode+0x11e/0x220 block/bdev.c:408 i_callback+0x3f/0x70 fs/inode.c:226 rcu_do_batch kernel/rcu/tree.c:2508 [inline] rcu_core+0x76d/0x16c0 kernel/rcu/tree.c:2743 __do_softirq+0x1d7/0x93b kernel/softirq.c:558 invoke_softirq kernel/softirq.c:432 [inline] __irq_exit_rcu kernel/softirq.c:636 [inline] irq_exit_rcu+0xf2/0x130 kernel/softirq.c:648 sysvec_apic_timer_interrupt+0x93/0xc0 making disk is NULL when calling put_disk(). Reported-by: Hao Sun <sunhao.th@gmail.com> Signed-off-by: Zqiang <qiang.zhang1211@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211018103422.2043-1-qiang.zhang1211@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Linus Torvalds | f2b3420b92 |
block-5.15-2021-10-17
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmFsIqAQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgppbBEACDewLUv7bg1VFIGdroRN51OGiOv1oV+8HP ruY7O9CPtV7wcb3lA1Zy9igICuzuC5culHjbRJrNIUeTdWCQHCFk/sfKSD6VGMoT cFTqpKxV7M3vYr9G2m5TFWgY2mfS+I5fxyDZxK2z2esHCFw6TZ7A5W13xScVXKP+ QdNFSlTrGkpggsSIEeHApG+NLsIecnkT4qzm8zPfUodUtQ3A8JMjQjnYUFEAWfWv l9x9zDIzaGjPtXf5soFEvmdh1ALh3WWiYb1kIwK1FeP/PYX0JV/3zCMgqOwpK+4b 69OM3Q0NPHvu2TgSRK+ghekAtz5qgPDMCrzdhSgLYJEL/PGAOboqjrB9E+wWoEjd IKrYLx4Xao2TUZLJF2y34hHfODGdasx7d+wS191UpVFEZHFhDhIaazZ2rDd5xnQK LdzQw1JQF/igJovHauhSkGFIdJWBSDneLQoMimBnitZlsWARUmFSZej34FFRLZsW 8ZXfqipn/x+fh4sQ/HdEfWxnGHtveDpU+0Ka5bMUe/tJ9RPtmn/Ye7nFjYecC6NY 4UzFSNn+4e9DpHaDuP3I/eA1YBmVlcB5Hum3ve7X6ovwpjArYg3dgJOEi8uCZjfb hdMANmkVptcPiEO9njEHhC7S8+Nm3t+8o3qQceN81j6Vcjgzt/Y/n3Z6UkKeSlkn Ila+cZI1oA== =J/e4 -----END PGP SIGNATURE----- Merge tag 'block-5.15-2021-10-17' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "Bigger than usual for this point in time, the majority is fixing some issues around BDI lifetimes with the move from the request_queue to the disk in this release. In detail: - Series on draining fs IO for del_gendisk() (Christoph) - NVMe pull request via Christoph: - fix the abort command id (Keith Busch) - nvme: fix per-namespace chardev deletion (Adam Manzanares) - brd locking scope fix (Tetsuo) - BFQ fix (Paolo)" * tag 'block-5.15-2021-10-17' of git://git.kernel.dk/linux-block: block, bfq: reset last_bfqq_created on group change block: warn when putting the final reference on a registered disk brd: reduce the brd_devices_mutex scope kyber: avoid q->disk dereferences in trace points block: keep q_usage_counter in atomic mode after del_gendisk block: drain file system I/O on del_gendisk block: split bio_queue_enter from blk_queue_enter block: factor out a blk_try_enter_queue helper block: call submit_bio_checks under q_usage_counter nvme: fix per-namespace chardev deletion block/rnbd-clt-sysfs: fix a couple uninitialized variable bugs nvme-pci: Fix abort command id |
|
Tejun Heo | 5370b0f490 |
blk-cgroup: blk_cgroup_bio_start() should use irq-safe operations on blkg->iostat_cpu
|
|
Paolo Valente | d29bd41428 |
block, bfq: reset last_bfqq_created on group change
Since commit |
|
Christoph Hellwig | a20417611b |
block: warn when putting the final reference on a registered disk
Warn when the last reference on a live disk is put without calling del_gendisk first. There are some BDI related bug reports that look like a case of this, so make sure we have the proper instrumentation to catch it. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211014130231.1468538-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Christoph Hellwig | c41108049d |
kyber: avoid q->disk dereferences in trace points
q->disk becomes invalid after the gendisk is removed. Work around this by caching the dev_t for the tracepoints. The real fix would be to properly tear down the I/O schedulers with the gendisk, but that is a much more invasive change. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211012093301.GA27795@lst.de Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Christoph Hellwig | aec89dc5d4 |
block: keep q_usage_counter in atomic mode after del_gendisk
Don't switch back to percpu mode to avoid the double RCU grace period when tearing down SCSI devices. After removing the disk only passthrough commands can be send anyway. Suggested-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20210929071241.934472-6-hch@lst.de Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Christoph Hellwig | 8e141f9eb8 |
block: drain file system I/O on del_gendisk
Instead of delaying draining of file system I/O related items like the
blk-qos queues, the integrity read workqueue and timeouts only when the
request_queue is removed, do that when del_gendisk is called. This is
important for SCSI where the upper level drivers that control the gendisk
are separate entities, and the disk can be freed much earlier than the
request_queue, or can even be unbound without tearing down the queue.
Fixes:
|
|
Christoph Hellwig | a6741536f4 |
block: split bio_queue_enter from blk_queue_enter
To prepare for fixing a gendisk shutdown race, open code the blk_queue_enter logic in bio_queue_enter. This also removes the pointless flags translation. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20210929071241.934472-4-hch@lst.de Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Christoph Hellwig | 1f14a09890 |
block: factor out a blk_try_enter_queue helper
Factor out the code to try to get q_usage_counter without blocking into a separate helper. Both to improve code readability and to prepare for splitting bio_queue_enter from blk_queue_enter. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20210929071241.934472-3-hch@lst.de Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Christoph Hellwig | cc9c884dd7 |
block: call submit_bio_checks under q_usage_counter
Ensure all bios check the current values of the queue under freeze protection, i.e. to make sure the zero capacity set by del_gendisk is actually seen before dispatching to the driver. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210929071241.934472-2-hch@lst.de Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Linus Torvalds | 50eb0a06e6 |
block-5.15-2021-10-09
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmFh4SQQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgphKJEACwasArWhI3zxFrl4QQfGsILTbqn7Ent7qp XsEunsQ/27wp8Pv2nTrQT/7Z2qYVuIs+EzPR1+DTFTVWycG3ZZCMp6ACIuAVH73b nD2vNQdHGcWqeipxbRYA/JBON+tVcFGbsFPREQUqn++3FIBHhKUvd/NGRPrim62s a63WEKZuHo/HHptvG9zmtqQAJ1N9wdA75VBn9GNgpqpW/lLePufNykEdox7/HHFg jJzvW8rppRBaLzoDj4RC4hp7Xgf0CPWScKqI3R20CHg5WjXkjDSvIWT3VQuFA/kU aRBO2cvoVYUw7w4xRosxTdJObaDXZZnXewA0SOGwGVAb9pIW5dhzby+bpuRx/Wkw 3WrYGxndu1eis3YAoCVxllzo0zFMXGLjM3nfERketPHFPicLVVd0Pye+YAemyvMp Y9RWIRTRg6w/t0NicfYdK8LBTqEqdoys5OZZgaCyIPDV5dwhHCSDuACAwpR4qTAF PodCBG4DAc5TpE0Vy5QPywIe1cRvF8lnTZ+ZYXR7g3Ub1KoIl24gXuLvNMYrm0qb 92l9S1+Bk1lT3nVThxz+rJSHsumlZHROd5TLkQs1S2bb7E9Pxyc6IW+H0hviutgc bN+aBf7O+U+oDoqkOGWFXB4aJMqFjjVW6z3DmhqE7MyVPE5jkpFSXY19ge2HOVwd AVXocaNAdw== =AbHC -----END PGP SIGNATURE----- Merge tag 'block-5.15-2021-10-09' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "Two small fixes for this release: - Add missing QUEUE_FLAG_HCTX_ACTIVE in the debugfs handling (Johannes) - Fix double free / UAF issue in __alloc_disk_node (Tetsuo)" * tag 'block-5.15-2021-10-09' of git://git.kernel.dk/linux-block: block: decode QUEUE_FLAG_HCTX_ACTIVE in debugfs output block: genhd: fix double kfree() in __alloc_disk_node() |
|
Johannes Thumshirn | 1dbdd99b51 |
block: decode QUEUE_FLAG_HCTX_ACTIVE in debugfs output
While debugging an issue we've found that $DEBUGFS/block/$disk/state doesn't decode QUEUE_FLAG_HCTX_ACTIVE but only displays its numerical value. Add QUEUE_FLAG(HCTX_ACTIVE) to the blk_queue_flag_name array so it'll get decoded properly. Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/4351076388918075bd80ef07756f9d2ce63be12c.1633332053.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Linus Torvalds | ab2a7a35c4 |
block-5.15-2021-10-01
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmFXvGIQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgprAgD/9WKQIVPyzXC3Ui8OkSb9DM+dnU47+4b6sL Naud4khgrvMjTIdZC9YjwAhKBRNsXbcTXRLbBsxVzk7HhGI31jR711qaOZ/jMRbc dPwcpF82/Yr8BlIl5i3SBvxFNW8FAJWLTVvPvJXSjUBPECTgY6MnchojOjYz1CrN a0TmtHs5Feu+nLyzKgQlPMv3P8tvrS1hQNpfvbK1oWeEQuyONGacK82nl853ak30 c0uywVsvO+yiBY2xWhiTqperbPwnLxCYOMOdBqbNvlPlGIkEbkW1YWhm2/5f3vRi 37C3ibmt2SCnJaAyiNUKrYSTCK9BoeWFqtxVwyGB8WxWsUXrdocEOoBdohxDHrcp gHIqxPVUD2hyEroRQseq4NTvERkRvOiF4jAFA3lHOw5zLtPGNEvC0EUFRMm3Ur3/ DyviDBPy321pvrbqGg5uIL3o6wuXDQNnwxdXa2wElWDLuTYp8tB25rqAe2IRJ+iS k32AsgfL+y5fhjsA6ros/hpQ6FQTAKSW3i6S/0L+QB6H0jYb1ue9JPrA+3MoizvI p+PdmpLeFVQxnfthT2J+6pzQADoxcMwg5ALgyLFvhhL1YAqyhHkKgdAIS3gKXxYm GbQLd2hkv7wPZHEsS/XiGCj+zzyawjsr5hSHRqkbZ1QlyfnuTz5uI65WzST68o4d KOAvQTx4aQ== =SMoX -----END PGP SIGNATURE----- Merge tag 'block-5.15-2021-10-01' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "A few block fixes for this release: - Revert a BFQ commit that causes breakage for people. Unfortunately it was auto-selected for stable as well, so now 5.14.7 suffers from it too. Hopefully stable will pick up this revert quickly too, so we can remove the issue on that end as well. - Add a quirk for Apple NVMe controllers, which due to their non-compliance broke due to the introduction of command sequences (Keith) - Use shifts in nbd, fixing a __divdi3 issue (Nick)" * tag 'block-5.15-2021-10-01' of git://git.kernel.dk/linux-block: nbd: use shifts rather than multiplies Revert "block, bfq: honor already-setup queue merges" nvme: add command id quirk for apple controllers |
|
Tetsuo Handa | 06cc978d3f |
block: genhd: fix double kfree() in __alloc_disk_node()
syzbot is reporting use-after-free read at bdev_free_inode() [1], for kfree() from __alloc_disk_node() is called before bdev_free_inode() (which is called after RCU grace period) reads bdev->bd_disk and calls kfree(bdev->bd_disk). Fix use-after-free read followed by double kfree() problem by making sure that bdev->bd_disk is NULL when calling iput(). Link: https://syzkaller.appspot.com/bug?extid=8281086e8a6fbfbd952a [1] Reported-by: syzbot <syzbot+8281086e8a6fbfbd952a@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/e6dd13c5-8db0-4392-6e78-a42ee5d2a1c4@i-love.sakura.ne.jp Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Jens Axboe | ebc69e897e |
Revert "block, bfq: honor already-setup queue merges"
This reverts commit |
|
Linus Torvalds | bb19237bf6 |
SCSI fixes on 20210925
Thirty Three fixes, I'm afraid. Essentially the build up from the last couple of weeks while I've been dealling with Linux Plumbers conference infrastructure issues. It's mostly the usual assortment of spelling fixes and minor corrections. The only core relevant changes are to the sd driver to reduce the spin up message spew and fix a small memory leak on the freeing path. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJsEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYU+VfiYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishVuMAP9Y2+rE AyQ3Y9dubie/AyKQ/2ZEFO/G2wz+jEJvfppXJwD4ulhhDFh/l4iNPVa7GW9M/ti3 gd+6gMzzRC9/B+u2gQ== =+K+l -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Thirty-three fixes, I'm afraid. Essentially the build up from the last couple of weeks while I've been dealling with Linux Plumbers conference infrastructure issues. It's mostly the usual assortment of spelling fixes and minor corrections. The only core relevant changes are to the sd driver to reduce the spin up message spew and fix a small memory leak on the freeing path" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (33 commits) scsi: ses: Retry failed Send/Receive Diagnostic commands scsi: target: Fix spelling mistake "CONFLIFT" -> "CONFLICT" scsi: lpfc: Fix gcc -Wstringop-overread warning, again scsi: lpfc: Use correct scnprintf() limit scsi: lpfc: Fix sprintf() overflow in lpfc_display_fpin_wwpn() scsi: core: Remove 'current_tag' scsi: acornscsi: Remove tagged queuing vestiges scsi: fas216: Kill scmd->tag scsi: qla2xxx: Restore initiator in dual mode scsi: ufs: core: Unbreak the reset handler scsi: sd_zbc: Support disks with more than 2**32 logical blocks scsi: ufs: core: Revert "scsi: ufs: Synchronize SCSI and UFS error handling" scsi: bsg: Fix device unregistration scsi: sd: Make sd_spinup_disk() less noisy scsi: ufs: ufs-pci: Fix Intel LKF link stability scsi: mpt3sas: Clean up some inconsistent indenting scsi: megaraid: Clean up some inconsistent indenting scsi: sr: Fix spelling mistake "does'nt" -> "doesn't" scsi: Remove SCSI CDROM MAINTAINERS entry scsi: megaraid: Fix Coccinelle warning ... |
|
Ming Lei | f278eb3d81 |
block: hold ->invalidate_lock in blkdev_fallocate
When running ->fallocate(), blkdev_fallocate() should hold mapping->invalidate_lock to prevent page cache from being accessed, otherwise stale data may be read in page cache. Without this patch, blktests block/009 fails sometimes. With this patch, block/009 can pass always. Also as Jan pointed out, no pages can be created in the discarded area while you are holding the invalidate_lock, so remove the 2nd truncate_bdev_range(). Cc: Jan Kara <jack@suse.cz> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210923023751.1441091-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Ming Lei | a647a524a4 |
block: don't call rq_qos_ops->done_bio if the bio isn't tracked
rq_qos framework is only applied on request based driver, so: 1) rq_qos_done_bio() needn't to be called for bio based driver 2) rq_qos_done_bio() needn't to be called for bio which isn't tracked, such as bios ended from error handling code. Especially in bio_endio(): 1) request queue is referred via bio->bi_bdev->bd_disk->queue, which may be gone since request queue refcount may not be held in above two cases 2) q->rq_qos may be freed in blk_cleanup_queue() when calling into __rq_qos_done_bio() Fix the potential kernel panic by not calling rq_qos_ops->done_bio if the bio isn't tracked. This way is safe because both ioc_rqos_done_bio() and blkcg_iolatency_done_bio() are nop if the bio isn't tracked. Reported-by: Yu Kuai <yukuai3@huawei.com> Cc: tj@kernel.org Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20210924110704.1541818-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Li Jinlin | 858560b276 |
blk-cgroup: fix UAF by grabbing blkcg lock before destroying blkg pd
KASAN reports a use-after-free report when doing fuzz test: [693354.104835] ================================================================== [693354.105094] BUG: KASAN: use-after-free in bfq_io_set_weight_legacy+0xd3/0x160 [693354.105336] Read of size 4 at addr ffff888be0a35664 by task sh/1453338 [693354.105607] CPU: 41 PID: 1453338 Comm: sh Kdump: loaded Not tainted 4.18.0-147 [693354.105610] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 0.81 07/02/2018 [693354.105612] Call Trace: [693354.105621] dump_stack+0xf1/0x19b [693354.105626] ? show_regs_print_info+0x5/0x5 [693354.105634] ? printk+0x9c/0xc3 [693354.105638] ? cpumask_weight+0x1f/0x1f [693354.105648] print_address_description+0x70/0x360 [693354.105654] kasan_report+0x1b2/0x330 [693354.105659] ? bfq_io_set_weight_legacy+0xd3/0x160 [693354.105665] ? bfq_io_set_weight_legacy+0xd3/0x160 [693354.105670] bfq_io_set_weight_legacy+0xd3/0x160 [693354.105675] ? bfq_cpd_init+0x20/0x20 [693354.105683] cgroup_file_write+0x3aa/0x510 [693354.105693] ? ___slab_alloc+0x507/0x540 [693354.105698] ? cgroup_file_poll+0x60/0x60 [693354.105702] ? 0xffffffff89600000 [693354.105708] ? usercopy_abort+0x90/0x90 [693354.105716] ? mutex_lock+0xef/0x180 [693354.105726] kernfs_fop_write+0x1ab/0x280 [693354.105732] ? cgroup_file_poll+0x60/0x60 [693354.105738] vfs_write+0xe7/0x230 [693354.105744] ksys_write+0xb0/0x140 [693354.105749] ? __ia32_sys_read+0x50/0x50 [693354.105760] do_syscall_64+0x112/0x370 [693354.105766] ? syscall_return_slowpath+0x260/0x260 [693354.105772] ? do_page_fault+0x9b/0x270 [693354.105779] ? prepare_exit_to_usermode+0xf9/0x1a0 [693354.105784] ? enter_from_user_mode+0x30/0x30 [693354.105793] entry_SYSCALL_64_after_hwframe+0x65/0xca [693354.105875] Allocated by task 1453337: [693354.106001] kasan_kmalloc+0xa0/0xd0 [693354.106006] kmem_cache_alloc_node_trace+0x108/0x220 [693354.106010] bfq_pd_alloc+0x96/0x120 [693354.106015] blkcg_activate_policy+0x1b7/0x2b0 [693354.106020] bfq_create_group_hierarchy+0x1e/0x80 [693354.106026] bfq_init_queue+0x678/0x8c0 [693354.106031] blk_mq_init_sched+0x1f8/0x460 [693354.106037] elevator_switch_mq+0xe1/0x240 [693354.106041] elevator_switch+0x25/0x40 [693354.106045] elv_iosched_store+0x1a1/0x230 [693354.106049] queue_attr_store+0x78/0xb0 [693354.106053] kernfs_fop_write+0x1ab/0x280 [693354.106056] vfs_write+0xe7/0x230 [693354.106060] ksys_write+0xb0/0x140 [693354.106064] do_syscall_64+0x112/0x370 [693354.106069] entry_SYSCALL_64_after_hwframe+0x65/0xca [693354.106114] Freed by task 1453336: [693354.106225] __kasan_slab_free+0x130/0x180 [693354.106229] kfree+0x90/0x1b0 [693354.106233] blkcg_deactivate_policy+0x12c/0x220 [693354.106238] bfq_exit_queue+0xf5/0x110 [693354.106241] blk_mq_exit_sched+0x104/0x130 [693354.106245] __elevator_exit+0x45/0x60 [693354.106249] elevator_switch_mq+0xd6/0x240 [693354.106253] elevator_switch+0x25/0x40 [693354.106257] elv_iosched_store+0x1a1/0x230 [693354.106261] queue_attr_store+0x78/0xb0 [693354.106264] kernfs_fop_write+0x1ab/0x280 [693354.106268] vfs_write+0xe7/0x230 [693354.106271] ksys_write+0xb0/0x140 [693354.106275] do_syscall_64+0x112/0x370 [693354.106280] entry_SYSCALL_64_after_hwframe+0x65/0xca [693354.106329] The buggy address belongs to the object at ffff888be0a35580 which belongs to the cache kmalloc-1k of size 1024 [693354.106736] The buggy address is located 228 bytes inside of 1024-byte region [ffff888be0a35580, ffff888be0a35980) [693354.107114] The buggy address belongs to the page: [693354.107273] page:ffffea002f828c00 count:1 mapcount:0 mapping:ffff888107c17080 index:0x0 compound_mapcount: 0 [693354.107606] flags: 0x17ffffc0008100(slab|head) [693354.107760] raw: 0017ffffc0008100 ffffea002fcbc808 ffffea0030bd3a08 ffff888107c17080 [693354.108020] raw: 0000000000000000 00000000001c001c 00000001ffffffff 0000000000000000 [693354.108278] page dumped because: kasan: bad access detected [693354.108511] Memory state around the buggy address: [693354.108671] ffff888be0a35500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [693354.116396] ffff888be0a35580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [693354.124473] >ffff888be0a35600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [693354.132421] ^ [693354.140284] ffff888be0a35680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [693354.147912] ffff888be0a35700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [693354.155281] ================================================================== blkgs are protected by both queue and blkcg locks and holding either should stabilize them. However, the path of destroying blkg policy data is only protected by queue lock in blkcg_activate_policy()/blkcg_deactivate_policy(). Other tasks can get the blkg policy data before the blkg policy data is destroyed, and use it after destroyed, which will result in a use-after-free. CPU0 CPU1 blkcg_deactivate_policy spin_lock_irq(&q->queue_lock) bfq_io_set_weight_legacy spin_lock_irq(&blkcg->lock) blkg_to_bfqg(blkg) pd_to_bfqg(blkg->pd[pol->plid]) ^^^^^^blkg->pd[pol->plid] != NULL bfqg != NULL pol->pd_free_fn(blkg->pd[pol->plid]) pd_to_bfqg(blkg->pd[pol->plid]) bfqg_put(bfqg) kfree(bfqg) blkg->pd[pol->plid] = NULL spin_unlock_irq(q->queue_lock); bfq_group_set_weight(bfqg, val, 0) bfqg->entity.new_weight ^^^^^^trigger uaf here spin_unlock_irq(&blkcg->lock); Fix by grabbing the matching blkcg lock before trying to destroy blkg policy data. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Li Jinlin <lijinlin3@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20210914042605.3260596-1-lijinlin3@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Yanfei Xu | 6f5ddde410 |
blkcg: fix memory leak in blk_iolatency_init
BUG: memory leak
unreferenced object 0xffff888129acdb80 (size 96):
comm "syz-executor.1", pid 12661, jiffies 4294962682 (age 15.220s)
hex dump (first 32 bytes):
20 47 c9 85 ff ff ff ff 20 d4 8e 29 81 88 ff ff G...... ..)....
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff82264ec8>] kmalloc include/linux/slab.h:591 [inline]
[<ffffffff82264ec8>] kzalloc include/linux/slab.h:721 [inline]
[<ffffffff82264ec8>] blk_iolatency_init+0x28/0x190 block/blk-iolatency.c:724
[<ffffffff8225b8c4>] blkcg_init_queue+0xb4/0x1c0 block/blk-cgroup.c:1185
[<ffffffff822253da>] blk_alloc_queue+0x22a/0x2e0 block/blk-core.c:566
[<ffffffff8223b175>] blk_mq_init_queue_data block/blk-mq.c:3100 [inline]
[<ffffffff8223b175>] __blk_mq_alloc_disk+0x25/0xd0 block/blk-mq.c:3124
[<ffffffff826a9303>] loop_add+0x1c3/0x360 drivers/block/loop.c:2344
[<ffffffff826a966e>] loop_control_get_free drivers/block/loop.c:2501 [inline]
[<ffffffff826a966e>] loop_control_ioctl+0x17e/0x2e0 drivers/block/loop.c:2516
[<ffffffff81597eec>] vfs_ioctl fs/ioctl.c:51 [inline]
[<ffffffff81597eec>] __do_sys_ioctl fs/ioctl.c:874 [inline]
[<ffffffff81597eec>] __se_sys_ioctl fs/ioctl.c:860 [inline]
[<ffffffff81597eec>] __x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:860
[<ffffffff843fa745>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
[<ffffffff843fa745>] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
[<ffffffff84600068>] entry_SYSCALL_64_after_hwframe+0x44/0xae
Once blk_throtl_init() queue init failed, blkcg_iolatency_exit() will
not be invoked for cleanup. That leads a memory leak. Swap the
blk_throtl_init() and blk_iolatency_init() calls can solve this.
Reported-by: syzbot+01321b15cc98e6bf96d6@syzkaller.appspotmail.com
Fixes:
|
|
Lihong Kou | 3df49967f6 |
block: flush the integrity workqueue in blk_integrity_unregister
When the integrity profile is unregistered there can still be integrity reads queued up which could see a NULL verify_fn as shown by the race window below: CPU0 CPU1 process_one_work nvme_validate_ns bio_integrity_verify_fn nvme_update_ns_info nvme_update_disk_info blk_integrity_unregister ---set queue->integrity as 0 bio_integrity_process --access bi->profile->verify_fn(bi is a pointer of queue->integity) Before calling blk_integrity_unregister in nvme_update_disk_info, we must make sure that there is no work item in the kintegrityd_wq. Just call blk_flush_integrity to flush the work queue so the bug can be resolved. Signed-off-by: Lihong Kou <koulihong@huawei.com> [hch: split up and shortened the changelog] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Link: https://lore.kernel.org/r/20210914070657.87677-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Christoph Hellwig | 783a40a1b3 |
block: check if a profile is actually registered in blk_integrity_unregister
While clearing the profile itself is harmless, we really should not clear the stable writes flag if it wasn't set due to a registered integrity profile. Reported-by: Lihong Kou <koulihong@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Link: https://lore.kernel.org/r/20210914070657.87677-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Zenghui Yu | 1a0db7744e |
scsi: bsg: Fix device unregistration
device_initialize() is used to take a refcount on the device. However,
put_device() is not called during device teardown. This leads to a
leak of private data of the driver core, dev_name(), etc. This is
reported by kmemleak at boot time if we compile kernel with
DEBUG_TEST_DRIVER_REMOVE.
Fix memory leaks during unregistration and implement a release
function.
Link: https://lore.kernel.org/r/20210911105306.1511-1-yuzenghui@huawei.com
Fixes:
|
|
Ming Lei | 67f3b2f822 |
blk-mq: avoid to iterate over stale request
blk-mq can't run allocating driver tag and updating ->rqs[tag] atomically, meantime blk-mq doesn't clear ->rqs[tag] after the driver tag is released. So there is chance to iterating over one stale request just after the tag is allocated and before updating ->rqs[tag]. scsi_host_busy_iter() calls scsi_host_check_in_flight() to count scsi in-flight requests after scsi host is blocked, so no new scsi command can be marked as SCMD_STATE_INFLIGHT. However, driver tag allocation still can be run by blk-mq core. One request is marked as SCMD_STATE_INFLIGHT, but this request may have been kept in another slot of ->rqs[], meantime the slot can be allocated out but ->rqs[] isn't updated yet. Then this in-flight request is counted twice as SCMD_STATE_INFLIGHT. This way causes trouble in handling scsi error. Fixes the issue by not iterating over stale request. Cc: linux-scsi@vger.kernel.org Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Reported-by: luojiaxing <luojiaxing@huawei.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210906065003.439019-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Linus Torvalds | c0f7e49fc4 |
block-5.15-2021-09-11
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmE8ueIQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpkSYD/9eaQ1Hxc+X+4eVb3A9Cpy36Qy/uY/hArnT kSUDtQitrRigqhStaD0MGpknWFnZE4cSojbYN0OoEWL7GC8idSZXx7KrVJpSHGbM XGVEflohvjDLNPkV99gmlzF2o6zPlWESApU1/HO2x+Ws1oKaYDAfFVf0CPGPe2C6 MRerU5v3HSmTC0eFZxU246bwwX/phNuNDokndR27rrsjK0mLF5UoMKySeqy3INp5 6mj3R+HNIW5j8eQk/HJPW7dgiKpWYneWV2Z90DuOLbcJ+wnx7s07wT1yRnOFUTsb p2ojVWmXtCJ1kRex6bK/eeIJC5TYvT3bNwsnIRmJHd9btHqhm2uKy77m3S1AuE7w K8bN581aXlr/3pUbFyYZDZQbYshUn25YP9OlyS9r4pklCh9C5KneL1b4xswWTDTB whvPZlkot3rGD8LHDpV5xVVzeaAcbSXanIRROjxHqQSRRTA9BjG3E4A2cDh8nmYD mRGEimfZcoojF2EQJYswPOQ24cZwpnihPpJO9NkOodRqfasn6XakAGg6SONFYyQ0 Ewa6QzIOCebBgOVGbzMtpoDpnySE12ONmrDCbSEiYFJLXBMMiqgNON/Xaq0tmXHT lsDpyz3ytWAB9OZ3M0/9arZzlFf/E+FRqt4ExelmwxiutKRb1dIKQq8xip/YxdA+ Y86kwUoAXQ== =1ajD -----END PGP SIGNATURE----- Merge tag 'block-5.15-2021-09-11' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: - NVMe pull request from Christoph: - fix nvmet command set reporting for passthrough controllers (Adam Manzanares) - update a MAINTAINERS email address (Chaitanya Kulkarni) - set QUEUE_FLAG_NOWAIT for nvme-multipth (me) - handle errors from add_disk() (Luis Chamberlain) - update the keep alive interval when kato is modified (Tatsuya Sasaki) - fix a buffer overrun in nvmet_subsys_attr_serial (Hannes Reinecke) - do not reset transport on data digest errors in nvme-tcp (Daniel Wagner) - only call synchronize_srcu when clearing current path (Daniel Wagner) - revalidate paths during rescan (Hannes Reinecke) - Split out the fs/block_dev into block/fops.c and block/bdev.c, which has been long overdue. Do this now before -rc1, to avoid annoying conflicts due to this (Christoph) - blk-throtl use-after-free fix (Li) - Improve plug depth for multi-device plugs, greatly increasing md resync performance (Song) - blkdev_show() locking fix (Tetsuo) - n64cart error check fix (Yang) * tag 'block-5.15-2021-09-11' of git://git.kernel.dk/linux-block: n64cart: fix return value check in n64cart_probe() blk-mq: allow 4x BLK_MAX_REQUEST_COUNT at blk_plug for multiple_queues block: move fs/block_dev.c to block/bdev.c block: split out operations on block special files blk-throttle: fix UAF by deleteing timer in blk_throtl_exit() block: genhd: don't call blkdev_show() with major_names_lock held nvme: update MAINTAINERS email address nvme: add error handling support for add_disk() nvme: only call synchronize_srcu when clearing current path nvme: update keep alive interval when kato is modified nvme-tcp: Do not reset transport on data digest errors nvmet: fixup buffer overrun in nvmet_subsys_attr_serial() nvmet: return bool from nvmet_passthru_ctrl and nvmet_is_passthru_req nvmet: looks at the passthrough controller when initializing CAP nvme: move nvme_multi_css into nvme.h nvme-multipath: revalidate paths during rescan nvme-multipath: set QUEUE_FLAG_NOWAIT |
|
Song Liu | 7f2a6a69f7 |
blk-mq: allow 4x BLK_MAX_REQUEST_COUNT at blk_plug for multiple_queues
Limiting number of request to BLK_MAX_REQUEST_COUNT at blk_plug hurts performance for large md arrays. [1] shows resync speed of md array drops for md array with more than 16 HDDs. Fix this by allowing more request at plug queue. The multiple_queue flag is used to only apply higher limit to multiple queue cases. [1] https://lore.kernel.org/linux-raid/CAFDAVznS71BXW8Jxv6k9dXc2iR3ysX3iZRBww_rzA8WifBFxGg@mail.gmail.com/ Tested-by: Marcin Wanat <marcin.wanat@gmail.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Christoph Hellwig | 0dca4462ed |
block: move fs/block_dev.c to block/bdev.c
Move it together with the rest of the block layer. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210907141303.1371844-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Christoph Hellwig | cd82cca7eb |
block: split out operations on block special files
Add a new block/fops.c for all the file and address_space operations that provide the block special file support. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210907141303.1371844-2-hch@lst.de [axboe: correct trailing whitespace while at it] Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Li Jinlin | 884f0e84f1 |
blk-throttle: fix UAF by deleteing timer in blk_throtl_exit()
The pending timer has been set up in blk_throtl_init(). However, the timer is not deleted in blk_throtl_exit(). This means that the timer handler may still be running after freeing the timer, which would result in a use-after-free. Fix by calling del_timer_sync() to delete the timer in blk_throtl_exit(). Signed-off-by: Li Jinlin <lijinlin3@huawei.com> Link: https://lore.kernel.org/r/20210907121242.2885564-1-lijinlin3@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Tetsuo Handa | dfbb3409b2 |
block: genhd: don't call blkdev_show() with major_names_lock held
If CONFIG_BLK_DEV_LOOP && CONFIG_MTD (at least; there might be other
combinations), lockdep complains circular locking dependency at
__loop_clr_fd(), for major_names_lock serves as a locking dependency
aggregating hub across multiple block modules.
======================================================
WARNING: possible circular locking dependency detected
5.14.0+ #757 Tainted: G E
------------------------------------------------------
systemd-udevd/7568 is trying to acquire lock:
ffff88800f334d48 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x70/0x560
but task is already holding lock:
ffff888014a7d4a0 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x4d/0x400 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #6 (&lo->lo_mutex){+.+.}-{3:3}:
lock_acquire+0xbe/0x1f0
__mutex_lock_common+0xb6/0xe10
mutex_lock_killable_nested+0x17/0x20
lo_open+0x23/0x50 [loop]
blkdev_get_by_dev+0x199/0x540
blkdev_open+0x58/0x90
do_dentry_open+0x144/0x3a0
path_openat+0xa57/0xda0
do_filp_open+0x9f/0x140
do_sys_openat2+0x71/0x150
__x64_sys_openat+0x78/0xa0
do_syscall_64+0x3d/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #5 (&disk->open_mutex){+.+.}-{3:3}:
lock_acquire+0xbe/0x1f0
__mutex_lock_common+0xb6/0xe10
mutex_lock_nested+0x17/0x20
bd_register_pending_holders+0x20/0x100
device_add_disk+0x1ae/0x390
loop_add+0x29c/0x2d0 [loop]
blk_request_module+0x5a/0xb0
blkdev_get_no_open+0x27/0xa0
blkdev_get_by_dev+0x5f/0x540
blkdev_open+0x58/0x90
do_dentry_open+0x144/0x3a0
path_openat+0xa57/0xda0
do_filp_open+0x9f/0x140
do_sys_openat2+0x71/0x150
__x64_sys_openat+0x78/0xa0
do_syscall_64+0x3d/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #4 (major_names_lock){+.+.}-{3:3}:
lock_acquire+0xbe/0x1f0
__mutex_lock_common+0xb6/0xe10
mutex_lock_nested+0x17/0x20
blkdev_show+0x19/0x80
devinfo_show+0x52/0x60
seq_read_iter+0x2d5/0x3e0
proc_reg_read_iter+0x41/0x80
vfs_read+0x2ac/0x330
ksys_read+0x6b/0xd0
do_syscall_64+0x3d/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (&p->lock){+.+.}-{3:3}:
lock_acquire+0xbe/0x1f0
__mutex_lock_common+0xb6/0xe10
mutex_lock_nested+0x17/0x20
seq_read_iter+0x37/0x3e0
generic_file_splice_read+0xf3/0x170
splice_direct_to_actor+0x14e/0x350
do_splice_direct+0x84/0xd0
do_sendfile+0x263/0x430
__se_sys_sendfile64+0x96/0xc0
do_syscall_64+0x3d/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#3){.+.+}-{0:0}:
lock_acquire+0xbe/0x1f0
lo_write_bvec+0x96/0x280 [loop]
loop_process_work+0xa68/0xc10 [loop]
process_one_work+0x293/0x480
worker_thread+0x23d/0x4b0
kthread+0x163/0x180
ret_from_fork+0x1f/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}:
lock_acquire+0xbe/0x1f0
process_one_work+0x280/0x480
worker_thread+0x23d/0x4b0
kthread+0x163/0x180
ret_from_fork+0x1f/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}:
validate_chain+0x1f0d/0x33e0
__lock_acquire+0x92d/0x1030
lock_acquire+0xbe/0x1f0
flush_workqueue+0x8c/0x560
drain_workqueue+0x80/0x140
destroy_workqueue+0x47/0x4f0
__loop_clr_fd+0xb4/0x400 [loop]
blkdev_put+0x14a/0x1d0
blkdev_close+0x1c/0x20
__fput+0xfd/0x220
task_work_run+0x69/0xc0
exit_to_user_mode_prepare+0x1ce/0x1f0
syscall_exit_to_user_mode+0x26/0x60
do_syscall_64+0x4c/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of:
(wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&lo->lo_mutex);
lock(&disk->open_mutex);
lock(&lo->lo_mutex);
lock((wq_completion)loop0);
*** DEADLOCK ***
2 locks held by systemd-udevd/7568:
#0: ffff888012554128 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_put+0x4c/0x1d0
#1: ffff888014a7d4a0 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x4d/0x400 [loop]
stack backtrace:
CPU: 0 PID: 7568 Comm: systemd-udevd Tainted: G E 5.14.0+ #757
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020
Call Trace:
dump_stack_lvl+0x79/0xbf
print_circular_bug+0x5d6/0x5e0
? stack_trace_save+0x42/0x60
? save_trace+0x3d/0x2d0
check_noncircular+0x10b/0x120
validate_chain+0x1f0d/0x33e0
? __lock_acquire+0x953/0x1030
? __lock_acquire+0x953/0x1030
__lock_acquire+0x92d/0x1030
? flush_workqueue+0x70/0x560
lock_acquire+0xbe/0x1f0
? flush_workqueue+0x70/0x560
flush_workqueue+0x8c/0x560
? flush_workqueue+0x70/0x560
? sched_clock_cpu+0xe/0x1a0
? drain_workqueue+0x41/0x140
drain_workqueue+0x80/0x140
destroy_workqueue+0x47/0x4f0
? blk_mq_freeze_queue_wait+0xac/0xd0
__loop_clr_fd+0xb4/0x400 [loop]
? __mutex_unlock_slowpath+0x35/0x230
blkdev_put+0x14a/0x1d0
blkdev_close+0x1c/0x20
__fput+0xfd/0x220
task_work_run+0x69/0xc0
exit_to_user_mode_prepare+0x1ce/0x1f0
syscall_exit_to_user_mode+0x26/0x60
do_syscall_64+0x4c/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f0fd4c661f7
Code: 00 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 13 fc ff ff
RSP: 002b:00007ffd1c9e9fd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 00007f0fd46be6c8 RCX: 00007f0fd4c661f7
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000006
RBP: 0000000000000006 R08: 000055fff1eaf400 R09: 0000000000000000
R10: 00007f0fd46be6c8 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000002f08 R15: 00007ffd1c9ea050
Commit
|
|
Linus Torvalds | 1dbe7e386f |
block-5.15-2021-09-05
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmE1hQcQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgprjfEACdG+medwQOPpKNSoAvQmYyQnZRMbPjiruv A4nW2L6MKaExO59qQLVbBYHaH2+ng2UR/p5jNi2AKm+hrQEYllxlNvuCkRBIn97J r45R48mzBbHjR4kE3Fdu1mOFpBWOuU9JrtzHI+JF/Sl/qPIxKYNHf5E66T6l90Fz 0hJkorAoVB7+hQYixdmkM9quZy11D5SY3aM+bG8r2uNjZTBEHMfmOen8o1giR0vC EOHzObuC6WLjLGQInNW+Cq2//vVVybQa79mhOUMp93z5nhDMtwUu7MH4B4kmGpix GLjDa1DukUZe7nGcnsRKmjjXQ+BpG6YF52Z2RfVZpWZn83t5c4YQsq++TPZ8KfpK 4NAFFuSbGM/+QWwEiiyWu00syvpzrEJ4ZIJyZX3FYEeKyKWVRGHqlMDcS9LstYOk 4OfgQUcJ7f/fXeedwi0OGJS1BLr6fi8RnazIafCNIIJLe1XIwTsNufPCNxWYqDAi 0XhH+uYGD38VoUiR5JymZku6frwY4kxssA1khPPE5jWbzCZXiHprwwzaP4hBNNeZ c5cn9/1ZQSoTE3ebrX9pzTn5wRZwAL+iDhZ2SpLlN2Ji1BJ4EM9H8qFGj3U/CSM4 OWKY2c0VwJYQUhjO4QDBx0MblJgNy8HsvmqGETuxUlk56j3Q1Mx3ViPV43amP9eM OM4mGige3Q== =4SCA -----END PGP SIGNATURE----- Merge tag 'block-5.15-2021-09-05' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "Was going to send this one in later this week, but given that -Werror is now enabled (or at least available), the mq-deadline fix really should go in for the folks hitting that. - Ensure dd_queued() is only there if needed (Geert) - Fix a kerneldoc warning for bio_alloc_kiocb() - BFQ fix for queue merging - loop locking fix (Tetsuo)" * tag 'block-5.15-2021-09-05' of git://git.kernel.dk/linux-block: loop: reduce the loop_ctl_mutex scope bio: fix kerneldoc documentation for bio_alloc_kiocb() block, bfq: honor already-setup queue merges block/mq-deadline: Move dd_queued() to fix defined but not used warning |
|
Linus Torvalds | 14726903c8 |
Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton: "173 patches. Subsystems affected by this series: ia64, ocfs2, block, and mm (debug, pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap, bootmem, sparsemem, vmalloc, kasan, pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, compaction, mempolicy, memblock, oom-kill, migration, ksm, percpu, vmstat, and madvise)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (173 commits) mm/madvise: add MADV_WILLNEED to process_madvise() mm/vmstat: remove unneeded return value mm/vmstat: simplify the array size calculation mm/vmstat: correct some wrong comments mm/percpu,c: remove obsolete comments of pcpu_chunk_populated() selftests: vm: add COW time test for KSM pages selftests: vm: add KSM merging time test mm: KSM: fix data type selftests: vm: add KSM merging across nodes test selftests: vm: add KSM zero page merging test selftests: vm: add KSM unmerge test selftests: vm: add KSM merge test mm/migrate: correct kernel-doc notation mm: wire up syscall process_mrelease mm: introduce process_mrelease system call memblock: make memblock_find_in_range method private mm/mempolicy.c: use in_task() in mempolicy_slab_node() mm/mempolicy: unify the create() func for bind/interleave/prefer-many policies mm/mempolicy: advertise new MPOL_PREFERRED_MANY mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY ... |
|
Christoph Hellwig | f358afc52c |
mm: remove flush_kernel_dcache_page
flush_kernel_dcache_page is a rather confusing interface that implements a subset of flush_dcache_page by not being able to properly handle page cache mapped pages. The only callers left are in the exec code as all other previous callers were incorrect as they could have dealt with page cache pages. Replace the calls to flush_kernel_dcache_page with calls to flush_dcache_page, which for all architectures does either exactly the same thing, can contains one or more of the following: 1) an optimization to defer the cache flush for page cache pages not mapped into userspace 2) additional flushing for mapped page cache pages if cache aliases are possible Link: https://lkml.kernel.org/r/20210712060928.4161649-7-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Cc: Alex Shi <alexs@kernel.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Guo Ren <guoren@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Cercueil <paul@crapouillou.net> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Jens Axboe | 0ef47db1cb |
bio: fix kerneldoc documentation for bio_alloc_kiocb()
Apparently the last fixup got butter fingered a bit, the correct variable name is 'nr_vecs', not 'nr_iovecs'. Link: https://lore.kernel.org/lkml/20210903164939.02f6e8c5@canb.auug.org.au/ Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Linus Torvalds | a9c9a6f741 |
SCSI misc on 20210902
This series consists of the usual driver updates (ufs, qla2xxx, target, smartpqi, lpfc, mpt3sas). The core change causing the most churn was replacing the command request field request with a macro, allowing us to offset map to it and remove the redundant field; the same was also done for the tag field. The most impactful change is the final removal of scsi_ioctl, which has been deprecated for over a decade. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYTD/TiYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdUkAQCjb3Ux 4K9438mMelHlzM4er1S1IJ0WNnvObaVMNO9LBwD+JUz+rHsrKvuEX9j3g3C3u6JH hC3BUEW8f2LLnujWanQ= =lC5o -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This series consists of the usual driver updates (ufs, qla2xxx, target, smartpqi, lpfc, mpt3sas). The core change causing the most churn was replacing the command request field request with a macro, allowing us to offset map to it and remove the redundant field; the same was also done for the tag field. The most impactful change is the final removal of scsi_ioctl, which has been deprecated for over a decade" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (293 commits) scsi: ufs: Fix ufshcd_request_sense_async() for Samsung KLUFG8RHDA-B2D1 scsi: ufs: ufs-exynos: Fix static checker warning scsi: mpt3sas: Use the proper SCSI midlayer interfaces for PI scsi: lpfc: Use the proper SCSI midlayer interfaces for PI scsi: lpfc: Copyright updates for 14.0.0.1 patches scsi: lpfc: Update lpfc version to 14.0.0.1 scsi: lpfc: Add bsg support for retrieving adapter cmf data scsi: lpfc: Add cmf_info sysfs entry scsi: lpfc: Add debugfs support for cm framework buffers scsi: lpfc: Add support for maintaining the cm statistics buffer scsi: lpfc: Add rx monitoring statistics scsi: lpfc: Add support for the CM framework scsi: lpfc: Add cmfsync WQE support scsi: lpfc: Add support for cm enablement buffer scsi: lpfc: Add cm statistics buffer support scsi: lpfc: Add EDC ELS support scsi: lpfc: Expand FPIN and RDF receive logging scsi: lpfc: Add MIB feature enablement support scsi: lpfc: Add SET_HOST_DATA mbox cmd to pass date/time info to firmware scsi: fc: Add EDC ELS definition ... |
|
Paolo Valente | 2d52c58b9c |
block, bfq: honor already-setup queue merges
The function bfq_setup_merge prepares the merging between two bfq_queues, say bfqq and new_bfqq. To this goal, it assigns bfqq->new_bfqq = new_bfqq. Then, each time some I/O for bfqq arrives, the process that generated that I/O is disassociated from bfqq and associated with new_bfqq (merging is actually a redirection). In this respect, bfq_setup_merge increases new_bfqq->ref in advance, adding the number of processes that are expected to be associated with new_bfqq. Unfortunately, the stable-merging mechanism interferes with this setup. After bfqq->new_bfqq has been set by bfq_setup_merge, and before all the expected processes have been associated with bfqq->new_bfqq, bfqq may happen to be stably merged with a different queue than the current bfqq->new_bfqq. In this case, bfqq->new_bfqq gets changed. So, some of the processes that have been already accounted for in the ref counter of the previous new_bfqq will not be associated with that queue. This creates an unbalance, because those references will never be decremented. This commit fixes this issue by reestablishing the previous, natural behaviour: once bfqq->new_bfqq has been set, it will not be changed until all expected redirections have occurred. Signed-off-by: Davide Zini <davidezini2@gmail.com> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20210802141352.74353-2-paolo.valente@linaro.org Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Geert Uytterhoeven | 55a51ea140 |
block/mq-deadline: Move dd_queued() to fix defined but not used warning
If CONFIG_BLK_DEBUG_FS=n: block/mq-deadline.c:274:12: warning: ‘dd_queued’ defined but not used [-Wunused-function] 274 | static u32 dd_queued(struct deadline_data *dd, enum dd_prio prio) | ^~~~~~~~~ Fix this by moving dd_queued() just before the sole function that calls it. Fixes: |
|
Linus Torvalds | 87045e6546 |
for-5.15-tag
-----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmEs2NIACgkQxWXV+ddt WDsJMQ/+PJ/yXfI85mAeAzTJLWQ0zD6YO3iBhf3wOeyychWC4on435pj+zW8zR/U /bix25ygoWF4MvGF6p0uyv4Z5mnvkZXE5lapUcJu6wXG7se1QRPH0broTh05IBXK SnT93Eb9RexaiNFk7DVma9XkviqZ/ZISPtkJ9wYrfIba7j/U/wa+PtEFS7wk58hP rFQXgV64xm/pcP28YYHfOkCjdyUMdJrnBUvfKOlX6d94lmYbP5lyiTL+XJEXExzN wPakD0UsnXPr4TRvf+YRTPeFHPPUgyORII7otVUOKmGywWtcJrELX8rXFoW+6GwB dzZIcSYXHUxU5UrtMbZgiztVBJ+bQY5juYMIrj13eYOMYkijxAqPP84iDO15+TSV zNqyAVjUglHCGUGjhSpAxnAmtp+IJTZfVAWcvIKq3VqvJtb8tssQsk9bqFjH1xlH qNJLE57CYe3tjw05K9y0keMh2iJWRWkXZYkgI/zjwo5nreemobpN+3fO4yneVLh7 ecdBmSl/JVSzAB1NamLOCZNGZLUqiiuTvZlJtI6ZsekrN1+4A6QzVcU/MGjSYL1v C7W0hK0LF+e3xIBkxTKVq8noolsgbmlWacxJq8fZq9HwZy5IVJOVm9STDlCuLaIo gPr0V0itkclcsMU0CHTyCjMsfuHYUwJZXwg93wKfJf5UCzS4OWU= =ALO9 -----END PGP SIGNATURE----- Merge tag 'for-5.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "The highlights of this round are integrations with fs-verity and idmapped mounts, the rest is usual mix of minor improvements, speedups and cleanups. There are some patches outside of btrfs, namely updating some VFS interfaces, all straightforward and acked. Features: - fs-verity support, using standard ioctls, backward compatible with read-only limitation on inodes with previously enabled fs-verity - idmapped mount support - make mount with rescue=ibadroots more tolerant to partially damaged trees - allow raid0 on a single device and raid10 on two devices, degenerate cases but might be useful as an intermediate step during conversion to other profiles - zoned mode block group auto reclaim can be disabled via sysfs knob Performance improvements: - continue readahead of node siblings even if target node is in memory, could speed up full send (on sample test +11%) - batching of delayed items can speed up creating many files - fsync/tree-log speedups - avoid unnecessary work (gains +2% throughput, -2% run time on sample load) - reduced lock contention on renames (on dbench +4% throughput, up to -30% latency) Fixes: - various zoned mode fixes - preemptive flushing threshold tuning, avoid excessive work on almost full filesystems Core: - continued subpage support, preparation for implementing remaining features like compression and defragmentation; with some limitations, write is now enabled on 64K page systems with 4K sectors, still considered experimental - no readahead on compressed reads - inline extents disabled - disabled raid56 profile conversion and mount - improved flushing logic, fixing early ENOSPC on some workloads - inode flags have been internally split to read-only and read-write incompat bit parts, used by fs-verity - new tree items for fs-verity - descriptor item - Merkle tree item - inode operations extended to be namespace-aware - cleanups and refactoring Generic code changes: - fs: new export filemap_fdatawrite_wbc - fs: removed sync_inode - block: bio_trim argument type fixups - vfs: add namespace-aware lookup" * tag 'for-5.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (114 commits) btrfs: reset replace target device to allocation state on close btrfs: zoned: fix ordered extent boundary calculation btrfs: do not do preemptive flushing if the majority is global rsv btrfs: reduce the preemptive flushing threshold to 90% btrfs: tree-log: check btrfs_lookup_data_extent return value btrfs: avoid unnecessarily logging directories that had no changes btrfs: allow idmapped mount btrfs: handle ACLs on idmapped mounts btrfs: allow idmapped INO_LOOKUP_USER ioctl btrfs: allow idmapped SUBVOL_SETFLAGS ioctl btrfs: allow idmapped SET_RECEIVED_SUBVOL ioctls btrfs: relax restrictions for SNAP_DESTROY_V2 with subvolids btrfs: allow idmapped SNAP_DESTROY ioctls btrfs: allow idmapped SNAP_CREATE/SUBVOL_CREATE ioctls btrfs: check whether fsgid/fsuid are mapped during subvolume creation btrfs: allow idmapped permission inode op btrfs: allow idmapped setattr inode op btrfs: allow idmapped tmpfile inode op btrfs: allow idmapped symlink inode op btrfs: allow idmapped mkdir inode op ... |
|
Linus Torvalds | 3b629f8d6d |
io_uring-bio-cache.5-2021-08-30
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmEs8QQQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpgAgD/wP9gGxrFE5oxtdozDPkEYTXn5e0QKseDyV cNxLmSb3wc4WIEPwjCavdQHpy0fnbjaYwGveHf9ygQwDZPj9WBgEL3ipPYXCCzFA ysoV86kBRxKDI476r2InxI8WaW7hV0IWxPlScUTA1QeeNAzRJDymQvRuwg5KvVRS Jt6R58khzWpEGYO2CqFTpGsA7x01R0kvZ54xmFgKZ+Pxo+Bk03fkO32YUFC49Wm8 Zy+JMsaiIlLgucDTJ4zAKjQUXiwP2GMEw5Vk/lLUFGBvyw0AN2rO9g18L7QW2ZUu vnkaJQwBbMUbgveXlI/y6GG/vuKUG2i4AmzNJH17qFCnimO3JY6vgzUOg5dqOiwx bx7ZzmnBWgQp95/cSAlZ4QwRYf3z0hvVFKPj9U3X9wKGmuxUKHiLResQwp7bzRdd 4L4Jo1WFDDHR/1MOOzzW0uxE3uTm0LKcncsi4hJL20dl+16RXCIbzHWUTAd8yyMV 9QeUAumc4GHOeswa1Ms8jLPAgXyEoAkec7ca7cRIY/NW+DXGLG9tYBgCw1eLe6BN M7LwMsPNlS2v2dMUbiuw8XxkA+uYso728e2vd/edca2jxXj8+SVnm020aYBnxIzh nmjbf69+QddBPEnk/EPvRj8tXOhr3k7FklI4R7qlei/+IGTujGPvM4kn3p6fnHrx d7bsu/jtaQ== =izfH -----END PGP SIGNATURE----- Merge tag 'io_uring-bio-cache.5-2021-08-30' of git://git.kernel.dk/linux-block Pull support for struct bio recycling from Jens Axboe: "This adds bio recycling support for polled IO, allowing quick reuse of a bio for high IOPS scenarios via a percpu bio_set list. It's good for almost a 10% improvement in performance, bumping our per-core IO limit from ~3.2M IOPS to ~3.5M IOPS" * tag 'io_uring-bio-cache.5-2021-08-30' of git://git.kernel.dk/linux-block: bio: improve kerneldoc documentation for bio_alloc_kiocb() block: provide bio_clear_hipri() helper block: use the percpu bio cache in __blkdev_direct_IO io_uring: enable use of bio alloc cache block: clear BIO_PERCPU_CACHE flag if polling isn't supported bio: add allocation cache abstraction fs: add kiocb alloc cache flag bio: optimize initialization of a bio |
|
Linus Torvalds | 679369114e |
for-5.15/block-2021-08-30
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmEs6H0QHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpukbD/9Qk9fQte+WJVmpbdvhV40gcKBVnGOVH0ke k+36x6AB/gWKnFHwtprsSyVqPxmzqwTv9VIq5l/s3Vydt3L61znvTneBeN03Wlkn UTxD0lY8HzyVWnZb82LBBjjy7cs6EzrFG4kBH/ZiTAyTcBsCAvzo5J7mywb4gFjj L/HeBq58EJ3WCUlxlVW1ijctvi7wnGoaH5bZY1TE00GGT6TysN2bEPfzjkuYHrDz RqhoQdWPLDz6h3x9lAncPw2MWlcmlGvJ96ABseAKFPKvXxE2PzgolSoQfVUUJtko bqGyy2ns+pxN11SrcGYjogEKVKhONoms/5UN1RtwRBVsgvecxlHER/SgyZ8luBDo lFhVXulkSjpswbWutRy3USge98GwMu2Z4ppP2CDmO7hkQd0DF8sL0kPKyaREkcHi NmsD/0zF2uUhUVN+PRC/MuzngAmL4Mmxjk70L+MohlK7e+H3pnEo1ec3OMcXe+wB dG6t/BFD9bYmj0UjsHeXEoR/iRuvSba1L8zBz5dhRaHH6DvdycYhpynXWWlU3C8K 3nzEVVpcDINMsiRl1Vqb6g6HsMwHIH84FRl7Mc51UmhW9C4gLfWMCt1guQuzOj72 yEbmCLydE/FR2IUPY7eqX8hRG8GTUlMtSvGdgnvBOcWj+K3buT/c5yVTHgTrN8ox LCOXHSvV6w== =S8fs -----END PGP SIGNATURE----- Merge tag 'for-5.15/block-2021-08-30' of git://git.kernel.dk/linux-block Pull block updates from Jens Axboe: "Nothing major in here - lots of good cleanups and tech debt handling, which is also evident in the diffstats. In particular: - Add disk sequence numbers (Matteo) - Discard merge fix (Ming) - Relax disk zoned reporting restrictions (Niklas) - Bio error handling zoned leak fix (Pavel) - Start of proper add_disk() error handling (Luis, Christoph) - blk crypto fix (Eric) - Non-standard GPT location support (Dmitry) - IO priority improvements and cleanups (Damien)o - blk-throtl improvements (Chunguang) - diskstats_show() stack reduction (Abd-Alrhman) - Loop scheduler selection (Bart) - Switch block layer to use kmap_local_page() (Christoph) - Remove obsolete disk_name helper (Christoph) - block_device refcounting improvements (Christoph) - Ensure gendisk always has a request queue reference (Christoph) - Misc fixes/cleanups (Shaokun, Oliver, Guoqing)" * tag 'for-5.15/block-2021-08-30' of git://git.kernel.dk/linux-block: (129 commits) sg: pass the device name to blk_trace_setup block, bfq: cleanup the repeated declaration blk-crypto: fix check for too-large dun_bytes blk-zoned: allow BLKREPORTZONE without CAP_SYS_ADMIN blk-zoned: allow zone management send operations without CAP_SYS_ADMIN block: mark blkdev_fsync static block: refine the disk_live check in del_gendisk mmc: sdhci-tegra: Enable MMC_CAP2_ALT_GPT_TEGRA mmc: block: Support alternative_gpt_sector() operation partitions/efi: Support non-standard GPT location block: Add alternative_gpt_sector() operation bio: fix page leak bio_add_hw_page failure block: remove CONFIG_DEBUG_BLOCK_EXT_DEVT block: remove a pointless call to MINOR() in device_add_disk null_blk: add error handling support for add_disk() virtio_blk: add error handling support for add_disk() block: add error handling for device_add_disk / add_disk block: return errors from disk_alloc_events block: return errors from blk_integrity_add block: call blk_register_queue earlier in device_add_disk ... |