linux_old1

Douglas Anderson 9ea61cac0b dm bufio: avoid sleeping while holding the dm_bufio lock

We've seen in-field reports showing _lots_ (18 in one case, 41 in another) of tasks all sitting there blocked on:

  mutex_lock+0x4c/0x68
  dm_bufio_shrink_count+0x38/0x78
  shrink_slab.part.54.constprop.65+0x100/0x464
  shrink_zone+0xa8/0x198

In the two cases analyzed, we see one task that looks like this:

  Workqueue: kverityd verity_prefetch_io

  __switch_to+0x9c/0xa8
  __schedule+0x440/0x6d8
  schedule+0x94/0xb4
  schedule_timeout+0x204/0x27c
  schedule_timeout_uninterruptible+0x44/0x50
  wait_iff_congested+0x9c/0x1f0
  shrink_inactive_list+0x3a0/0x4cc
  shrink_lruvec+0x418/0x5cc
  shrink_zone+0x88/0x198
  try_to_free_pages+0x51c/0x588
  __alloc_pages_nodemask+0x648/0xa88
  __get_free_pages+0x34/0x7c
  alloc_buffer+0xa4/0x144
  __bufio_new+0x84/0x278
  dm_bufio_prefetch+0x9c/0x154
  verity_prefetch_io+0xe8/0x10c
  process_one_work+0x240/0x424
  worker_thread+0x2fc/0x424
  kthread+0x10c/0x114

...and that looks to be the one holding the mutex.

The problem has been reproduced on fairly easily:
0. Be running Chrome OS w/ verity enabled on the root filesystem
1. Pick test patch: http://crosreview.com/412360
2. Install launchBalloons.sh and balloon.arm from
     http://crbug.com/468342
   ...that's just a memory stress test app.
3. On a 4GB rk3399 machine, run
     nice ./launchBalloons.sh 4 900 100000
   ...that tries to eat 4 * 900 MB of memory and keep accessing.
4. Login to the Chrome web browser and restore many tabs

With that, I've seen printouts like:
  DOUG: long bufio 90758 ms
...and stack trace always show's we're in dm_bufio_prefetch().

The problem is that we try to allocate memory with GFP_NOIO while we're holding the dm_bufio lock. Instead we should be using GFP_NOWAIT. Using GFP_NOIO can cause us to sleep while holding the lock and that causes the above problems.

The current behavior explained by David Rientjes:

  It will still try reclaim initially because __GFP_WAIT (or __GFP_KSWAPD_RECLAIM) is set by GFP_NOIO. This is the cause of contention on dm_bufio_lock() that the thread holds. You want to pass GFP_NOWAIT instead of GFP_NOIO to alloc_buffer() when holding a mutex that can be contended by a concurrent slab shrinker (if count_objects didn't use a trylock, this pattern would trivially deadlock).

This change significantly increases responsiveness of the system while in this state. It makes a real difference because it unblocks kswapd. In the bug report analyzed, kswapd was hung:

  kswapd0         D ffffffc000204fd8     0    72      2 0x00000000
  Call trace:
  [<ffffffc000204fd8>] __switch_to+0x9c/0xa8
  [<ffffffc00090b794>] __schedule+0x440/0x6d8
  [<ffffffc00090bac0>] schedule+0x94/0xb4
  [<ffffffc00090be44>] schedule_preempt_disabled+0x28/0x44
  [<ffffffc00090d900>] __mutex_lock_slowpath+0x120/0x1ac
  [<ffffffc00090d9d8>] mutex_lock+0x4c/0x68
  [<ffffffc000708e7c>] dm_bufio_shrink_count+0x38/0x78
  [<ffffffc00030b268>] shrink_slab.part.54.constprop.65+0x100/0x464
  [<ffffffc00030dbd8>] shrink_zone+0xa8/0x198
  [<ffffffc00030e578>] balance_pgdat+0x328/0x508
  [<ffffffc00030eb7c>] kswapd+0x424/0x51c
  [<ffffffc00023f06c>] kthread+0x10c/0x114
  [<ffffffc000203dd0>] ret_from_fork+0x10/0x40

By unblocking kswapd memory pressure should be reduced.

Suggested-by: David Rientjes <rientjes@google.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

2016-12-08 14:13:04 -05:00
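The locking rule described in the commit message above can be illustrated with a small userspace sketch. This is not the dm-bufio code: struct cache, sim_alloc(), cache_new_buffer(), cache_shrink_count() and the ALLOC_NOWAIT / ALLOC_MAY_SLEEP modes are all hypothetical names, a pthread mutex stands in for the dm_bufio lock, and "memory" is just a counter. It only shows the idea that an allocation made while holding a lock that a shrinker also wants should fail fast rather than sleep, and that a shrinker-style count can back off with a trylock.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for GFP_NOWAIT and GFP_NOIO; not kernel flags. */
enum alloc_mode { ALLOC_NOWAIT, ALLOC_MAY_SLEEP };

struct cache {
    pthread_mutex_t lock;  /* plays the role of the dm_bufio lock */
    size_t free_slots;     /* simulated free memory */
    size_t cached;         /* buffers a shrinker could count */
    unsigned sleeps;       /* how often an allocation had to wait */
};

static void cache_init(struct cache *c, size_t free_slots)
{
    pthread_mutex_init(&c->lock, NULL);
    c->free_slots = free_slots;
    c->cached = 0;
    c->sleeps = 0;
}

/* Simulated allocator: only ALLOC_MAY_SLEEP is allowed to wait for reclaim. */
static void *sim_alloc(struct cache *c, enum alloc_mode mode)
{
    if (c->free_slots == 0) {
        if (mode == ALLOC_NOWAIT)
            return NULL;        /* fail fast instead of sleeping */
        c->sleeps++;            /* a real allocator would block here */
        c->free_slots = 1;      /* pretend reclaim freed one slot */
    }
    c->free_slots--;
    return malloc(64);
}

/* Allocate while holding the lock, so never sleep: use ALLOC_NOWAIT. */
static void *cache_new_buffer(struct cache *c)
{
    void *buf;

    pthread_mutex_lock(&c->lock);
    buf = sim_alloc(c, ALLOC_NOWAIT);
    if (buf)
        c->cached++;
    pthread_mutex_unlock(&c->lock);
    return buf;                 /* NULL: caller must reuse or retry later */
}

/* Shrinker-style count: back off with 0 if the lock is contended. */
static size_t cache_shrink_count(struct cache *c)
{
    size_t n;

    if (pthread_mutex_trylock(&c->lock) != 0)
        return 0;
    n = c->cached;
    pthread_mutex_unlock(&c->lock);
    return n;
}
```

In this sketch a failed cache_new_buffer() call returns NULL with the lock already released, so a concurrent cache_shrink_count() is never stuck behind a sleeping allocator. What the caller does on NULL (reuse an existing buffer, or retry without the lock) is left open here.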
..
bcache	block: export bio_free_pages to other modules	2016-09-22 07:48:03 -06:00
persistent-data	dm block manager: make block locking optional	2016-11-14 15:17:47 -05:00
Kconfig	dm block manager: make block locking optional	2016-11-14 15:17:47 -05:00
Makefile	dm: move request-based code out to dm-rq.[hc]	2016-06-10 15:15:44 -04:00
bitmap.c	md/bitmap: fix wrong cleanup	2016-09-21 09:09:44 -07:00
bitmap.h	md-cluster: sync bitmap when node received RESYNCING msg	2016-05-04 12:39:35 -07:00
dm-bio-prison.c	block: add a bi_error field to struct bio	2015-07-29 08:55:15 -06:00
dm-bio-prison.h	dm bio prison: add dm_cell_promote_or_release()	2015-05-29 14:19:06 -04:00
dm-bio-record.h	…
dm-bufio.c	dm bufio: avoid sleeping while holding the dm_bufio lock	2016-12-08 14:13:04 -05:00
dm-bufio.h	…
dm-builtin.c	dm: move request-based code out to dm-rq.[hc]	2016-06-10 15:15:44 -04:00
dm-cache-block-types.h	…
dm-cache-metadata.c	dm cache metadata: remove an extra newline in DMERR and code	2016-11-21 09:52:02 -05:00
dm-cache-metadata.h	dm cache: make sure every metadata function checks fail_io	2016-03-10 17:12:12 -05:00
dm-cache-policy-cleaner.c	dm cache: speed up writing of the hint array	2016-09-22 11:15:02 -04:00
dm-cache-policy-internal.h	dm cache: speed up writing of the hint array	2016-09-22 11:15:02 -04:00
dm-cache-policy-smq.c	dm cache policy smq: distribute entries to random levels when switching to smq	2016-09-22 11:15:03 -04:00
dm-cache-policy.c	…
dm-cache-policy.h	dm cache: speed up writing of the hint array	2016-09-22 11:15:02 -04:00
dm-cache-target.c	dm cache: add missing cache device name to DMERR in set_cache_mode()	2016-11-21 09:52:03 -05:00
dm-core.h	dm: move request-based code out to dm-rq.[hc]	2016-06-10 15:15:44 -04:00
dm-crypt.c	dm crypt: constify crypt_iv_operations structures	2016-11-21 09:52:06 -05:00
dm-delay.c	dm: rename target's per_bio_data_size to per_io_data_size	2016-02-22 22:34:37 -05:00
dm-era-target.c	block: rename bio bi_rw to bi_opf	2016-08-07 14:41:02 -06:00
dm-exception-store.c	- Revert a dm-multipath change that caused a regression for unprivledged	2015-11-04 21:19:53 -08:00
dm-exception-store.h	dm snapshot: fix hung bios when copy error occurs	2016-01-08 20:03:05 -05:00
dm-flakey.c	dm flakey: return -EINVAL on interval bounds error in flakey_ctr()	2016-11-21 09:52:07 -05:00
dm-io.c	dm io: use bvec iterator helpers to implement .get_page and .next_page	2016-11-21 09:51:57 -05:00
dm-ioctl.c	dm: allow bio-based table to be upgraded to bio-based with DAX support	2016-07-20 23:49:52 -04:00
dm-kcopyd.c	dm: move request-based code out to dm-rq.[hc]	2016-06-10 15:15:44 -04:00
dm-linear.c	libnvdimm for 4.8	2016-07-28 17:38:16 -07:00
dm-log-userspace-base.c	dm: drop NULL test before kmem_cache_destroy() and mempool_destroy()	2015-10-31 19:06:00 -04:00
dm-log-userspace-transfer.c	dm log userspace transfer: match wait_for_completion_timeout return type	2015-04-15 12:10:20 -04:00
dm-log-userspace-transfer.h	…
dm-log-writes.c	Merge branch 'for-4.9/block' of git://git.kernel.dk/linux-block	2016-10-07 14:42:05 -07:00
dm-log.c	dm log: fix unitialized bio operation flags	2016-08-24 21:55:05 -04:00
dm-mpath.c	dm mpath: do not modify *__clone if blk_mq_alloc_request() fails	2016-11-21 09:52:10 -05:00
dm-mpath.h	…
dm-path-selector.c	…
dm-path-selector.h	dm path selector: remove 'repeat_count' return from .select_path hook	2016-02-22 22:34:42 -05:00
dm-queue-length.c	dm path selector: remove 'repeat_count' return from .select_path hook	2016-02-22 22:34:42 -05:00
dm-raid.c	dm raid: correct error messages on old metadata validation	2016-11-21 09:52:05 -05:00
dm-raid1.c	dm mirror: use all available legs on multiple failures	2016-10-14 11:55:17 -04:00
dm-region-hash.c	block: rename bio bi_rw to bi_opf	2016-08-07 14:41:02 -06:00
dm-round-robin.c	dm round robin: do not use this_cpu_ptr() without having preemption disabled	2016-08-15 09:23:14 -04:00
dm-rq.c	dm rq: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments	2016-11-21 09:51:57 -05:00
dm-rq.h	dm rq: introduce dm_mq_kick_requeue_list()	2016-09-15 11:16:05 -04:00
dm-service-time.c	dm path selector: remove 'repeat_count' return from .select_path hook	2016-02-22 22:34:42 -05:00
dm-snap-persistent.c	dm: use bio op accessors	2016-06-07 13:41:38 -06:00
dm-snap-transient.c	dm snapshot: fix hung bios when copy error occurs	2016-01-08 20:03:05 -05:00
dm-snap.c	block: rename bio bi_rw to bi_opf	2016-08-07 14:41:02 -06:00
dm-stats.c	dm: move request-based code out to dm-rq.[hc]	2016-06-10 15:15:44 -04:00
dm-stats.h	dm stats: support precise timestamps	2015-06-17 12:40:40 -04:00
dm-stripe.c	block: rename bio bi_rw to bi_opf	2016-08-07 14:41:02 -06:00
dm-switch.c	dm switch: simplify conditional in alloc_region_table()	2015-10-31 19:06:06 -04:00
dm-sysfs.c	dm: move request-based code out to dm-rq.[hc]	2016-06-10 15:15:44 -04:00
dm-table.c	dm table: simplify dm_table_determine_type()	2016-12-08 14:13:03 -05:00
dm-target.c	libnvdimm for 4.8	2016-07-28 17:38:16 -07:00
dm-thin-metadata.c	dm thin: fix a race condition between discarding and provisioning a block	2016-07-20 12:43:35 -04:00
dm-thin-metadata.h	dm thin: fix a race condition between discarding and provisioning a block	2016-07-20 12:43:35 -04:00
dm-thin.c	block: rename bio bi_rw to bi_opf	2016-08-07 14:41:02 -06:00
dm-uevent.c	…
dm-uevent.h	…
dm-verity-fec.c	dm verity fec: fix block calculation	2016-07-01 23:29:08 -04:00
dm-verity-fec.h	dm verity: add support for forward error correction	2015-12-10 10:39:03 -05:00
dm-verity-target.c	dm verity: fix incorrect error message	2016-11-21 09:52:01 -05:00
dm-verity.h	dm verity: add ignore_zero_blocks feature	2015-12-10 10:39:03 -05:00
dm-zero.c	block: rename bio bi_rw to bi_opf	2016-08-07 14:41:02 -06:00
dm.c	- A couple DM raid and DM mirror fixes	2016-10-28 09:27:58 -07:00
dm.h	dm: add infrastructure for DAX support	2016-07-20 23:49:49 -04:00
faulty.c	MD: rename some functions	2016-01-20 13:52:20 -08:00
linear.c	block: rename bio bi_rw to bi_opf	2016-08-07 14:41:02 -06:00
linear.h	…
md-cluster.c	md-cluster: make resync lock also could be interruptted	2016-09-21 09:09:44 -07:00
md-cluster.h	md-cluster: gather resync infos and enable recv_thread after bitmap is ready	2016-05-09 09:24:03 -07:00
md.c	md: be careful not lot leak internal curr_resync value into metadata. -- (all)	2016-10-28 22:04:05 -07:00
md.h	md: changes for MD_STILL_CLOSED flag	2016-09-21 09:09:44 -07:00
multipath.c	block: rename bio bi_rw to bi_opf	2016-08-07 14:41:02 -06:00
multipath.h	…
raid0.c	block: rename bio bi_rw to bi_opf	2016-08-07 14:41:02 -06:00
raid0.h	block: kill merge_bvec_fn() completely	2015-08-13 12:31:57 -06:00
raid1.c	raid1: handle read error also in readonly mode	2016-10-28 22:04:04 -07:00
raid1.h	md-cluster: Use a small window for resync	2015-10-12 01:32:05 -05:00
raid5-cache.c	raid5-cache: correct condition for empty metadata write	2016-10-28 22:04:03 -07:00
raid5.c	Merge tag 'md/4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md	2016-10-07 09:45:43 -07:00
raid5.h	md/raid5: Convert to hotplug state machine	2016-09-06 18:30:23 +02:00
raid10.c	RAID10: ignore discard error	2016-10-24 15:28:17 -07:00
raid10.h	raid10: improve random reads performance	2016-07-19 15:20:28 -07:00