blk-mq: don't redistribute hardware queues on a CPU hotplug event

Currently blk-mq completely remaps the hardware contexts when a CPU
hotplug event happens, which causes major havoc for drivers, as they are
never told about this remapping.  E.g. any carefully sorted out CPU
affinity will just be completely messed up.

The rebuild also doesn't really help for the common case of CPU
hotplug, which is soft onlining / offlining of CPUs - in this case we
should just leave the queue and irq mapping as is.  If it actually
worked it would have helped in the case of physical CPU hotplug,
although for that we'd need a way to actually notify the driver.
Note that drivers may already be able to accommodate such a topology
change on their own, e.g. using the reset_controller sysfs file in NVMe
will cause the driver to get things right for this case.

With the rebuild removed we will simply retain the queue mapping for
a soft offlined CPU, which will still work when it comes back online,
and will map any newly onlined CPU to queue 0 until the driver
initiates a rebuild of the queue map.
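
The resulting behaviour can be illustrated with a small userspace model.
This is only a sketch, not kernel code: NR_CPUS, mq_map as a plain static
array, rebuild_queue_map() and queue_for_cpu() are all hypothetical names
made up for the example, and the round-robin spread is an assumption
rather than the real mapping algorithm.

```c
#include <assert.h>

#define NR_CPUS 8	/* hypothetical number of possible CPUs */

/*
 * Hypothetical stand-in for q->mq_map: possible CPU -> hardware queue index.
 * Static storage is zero-initialized, so a CPU that was never mapped
 * points at queue 0.
 */
static unsigned int mq_map[NR_CPUS];

/*
 * Model of a driver-initiated rebuild: spread the CPUs set in cpu_mask
 * round-robin over nr_queues queues.  Entries for CPUs outside the mask
 * are left untouched.  (The round-robin policy is an assumption.)
 */
static void rebuild_queue_map(unsigned int cpu_mask, unsigned int nr_queues)
{
	unsigned int cpu, next = 0;

	assert(nr_queues > 0);
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_mask & (1u << cpu))
			mq_map[cpu] = next++ % nr_queues;
}

/*
 * Lookup used on the I/O path.  In this model a hotplug event does not
 * modify mq_map at all, so a soft offlined CPU keeps its entry and a
 * newly onlined CPU keeps the default of queue 0.
 */
static unsigned int queue_for_cpu(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return mq_map[cpu];
}
```

In this model, a CPU that was part of the last rebuild keeps its queue
across an offline / online cycle, while a CPU that was never part of a
rebuild resolves to queue 0 until rebuild_queue_map() is called again
with a mask that includes it.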

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

Christoph Hellwig, 2016-09-14 16:18:52 +02:00, committed by Jens Axboe

parent 474b313de7

commit 4e68a01142

1 changed file with 0 additions and 2 deletions

									

block/blk-mq.c

									
@@ -2157,8 +2157,6 @@ static void blk_mq_queue_reinit(struct request_queue *q,
 
 	blk_mq_sysfs_unregister(q);
 
-	blk_mq_update_queue_map(q->mq_map, q->nr_hw_queues, online_mask);
-
 	/*
 	 * redo blk_mq_init_cpu_queues and blk_mq_init_hw_queues. FIXME: maybe
 	 * we should change hctx numa_node according to new topology (this
