linux/include/trace/events
KOSAKI Motohiro 7d3579e8e6 vmscan: narrow the scenarios in whcih lumpy reclaim uses synchrounous reclaim
shrink_page_list() can decide to give up reclaiming a page under a
number of conditions such as

  1. trylock_page() failure
  2. page is unevictable
  3. zone reclaim and page is mapped
  4. PageWriteback() is true
  5. page is swapbacked and swap is full
  6. add_to_swap() failure
  7. page is dirty and gfpmask doesn't have GFP_IO, GFP_FS
  8. page is pinned
  9. IO queue is congested
 10. pageout() started IO, but it has not finished

With lumpy reclaim, failures result in entering synchronous lumpy reclaim
but this can be unnecessary.  In cases (2), (3), (5), (6), (7) and (8),
there is no point retrying.  This patch causes lumpy reclaim to abort when
it is known it will fail.
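
The case split above can be sketched as a small classifier. This is only an illustration under stated assumptions: the enum labels and the function name are hypothetical and do not appear in the kernel; only the case numbers come from the list above.

```c
#include <stdbool.h>

/*
 * Hypothetical labels for the ten give-up conditions listed above.
 * The numeric values match the case numbers in the commit message;
 * the names themselves are made up for this sketch.
 */
enum giveup_reason {
	GIVEUP_TRYLOCK_FAILED = 1,
	GIVEUP_UNEVICTABLE,
	GIVEUP_ZONE_RECLAIM_MAPPED,
	GIVEUP_WRITEBACK,
	GIVEUP_SWAP_FULL,
	GIVEUP_ADD_TO_SWAP_FAILED,
	GIVEUP_DIRTY_NO_IO_FS,
	GIVEUP_PINNED,
	GIVEUP_QUEUE_CONGESTED,
	GIVEUP_IO_IN_FLIGHT,
};

/*
 * Returns true for cases (2), (3), (5), (6), (7) and (8), where a
 * synchronous retry cannot succeed and lumpy reclaim should abort.
 */
bool sync_retry_is_pointless(enum giveup_reason reason)
{
	switch (reason) {
	case GIVEUP_UNEVICTABLE:
	case GIVEUP_ZONE_RECLAIM_MAPPED:
	case GIVEUP_SWAP_FULL:
	case GIVEUP_ADD_TO_SWAP_FAILED:
	case GIVEUP_DIRTY_NO_IO_FS:
	case GIVEUP_PINNED:
		return true;
	default:
		return false;
	}
}
```

Waiting can still help with the other cases, for example a failed trylock_page() or a page under writeback, so the sketch leaves those eligible for a retry. Case (9) is handled separately.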

Case (9) is more interesting.  The current behavior is:
  1. start shrink_page_list(async)
  2. find queue_congested()
  3. skip pageout write
  4. still start shrink_page_list(sync)
  5. wait on a lot of pages
  6. find queue_congested() again
  7. give up pageout write again

So this is a useless waste of time.  However, just skipping page reclaim is
also not good: allocating a huge page on x86, for example, needs 512 pages,
which can include more dirty pages than the queue congestion threshold
(~=128).

After this patch, pageout() behaves as follows:

 - If order > PAGE_ALLOC_COSTLY_ORDER
	Always ignore queue congestion.
 - If order <= PAGE_ALLOC_COSTLY_ORDER
	Skip writing the page and disable lumpy reclaim.
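
The two-branch rule above might look roughly like the following when the queue is found congested. This is a standalone sketch, not the patch itself: the enum and function names are hypothetical, and PAGE_ALLOC_COSTLY_ORDER is redefined locally with the value 3 that the kernel uses in include/linux/mmzone.h.

```c
#include <stdbool.h>

/* Local copy for the sketch; the kernel defines this as 3. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/* Hypothetical outcome labels, not kernel identifiers. */
enum congested_action {
	CONGESTED_WRITE_ANYWAY,           /* ignore queue congestion */
	CONGESTED_SKIP_AND_DISABLE_LUMPY, /* skip write, stop lumpy reclaim */
};

/*
 * Decide what to do with a dirty page when the backing queue is
 * congested, based only on the allocation order being reclaimed for.
 */
enum congested_action action_on_congested_queue(int order)
{
	if (order > PAGE_ALLOC_COSTLY_ORDER)
		return CONGESTED_WRITE_ANYWAY;
	return CONGESTED_SKIP_AND_DISABLE_LUMPY;
}
```

Under this rule an order-9 request (512 pages, as in the huge page example) still writes pages out despite congestion, while an order-0 to order-3 request backs off and gives up on lumpy reclaim.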

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:07 -07:00
..
bkl.h events: Harmonize event field names and print output names 2009-10-15 12:42:03 +02:00
block.h block: remove wrappers for request type/flags 2010-08-07 18:17:56 +02:00
ext4.h writeback: remove nonblocking/encountered_congestion references 2010-10-26 16:52:05 -07:00
gfpflags.h vmscan: tracing: add trace events for kswapd wakeup, sleeping and direct reclaim 2010-08-09 20:44:59 -07:00
irq.h irq: Add tracepoint to softirq_raise 2010-09-07 17:49:34 +02:00
jbd2.h ext4: Add new tracepoint for jbd2_cleanup_journal_tail 2009-12-23 07:45:44 -05:00
kmem.h vmscan: tracing: add trace events for kswapd wakeup, sleeping and direct reclaim 2010-08-09 20:44:59 -07:00
kvm.h KVM: cleanup kvm trace 2010-05-17 12:15:22 +03:00
lock.h tracing: Factorize lock events in a lock class 2010-05-09 13:45:35 +02:00
mce.h perf_event, x86, mce: Use TRACE_EVENT() for MCE logging 2009-10-13 09:43:38 +02:00
module.h Merge branch 'linus' into tracing/core 2010-04-08 10:18:47 +02:00
napi.h napi: Convert trace_napi_poll to TRACE_EVENT 2010-09-07 17:51:01 +02:00
net.h netdev: Add tracepoints to netdev layer 2010-09-07 17:51:33 +02:00
power.h tracing, perf: Add more power related events 2010-09-17 09:10:43 +02:00
sched.h tracing/sched: Add sched_pi_setprio tracepoint 2010-09-21 10:56:41 -04:00
scsi.h [SCSI] scsi_trace: Enhance SCSI command tracing 2010-04-30 12:52:08 -05:00
signal.h tracing: Fix null pointer deref with SEND_SIG_FORCED 2010-06-08 23:51:32 +02:00
skb.h skb: Add tracepoints to freeing skb 2010-09-07 17:51:53 +02:00
syscalls.h tracing: Separate raw syscall from syscall tracer 2009-11-25 14:20:06 -05:00
timer.h tracing: Fix timer tracing 2010-08-19 13:00:41 +02:00
vmscan.h vmscan: narrow the scenarios in whcih lumpy reclaim uses synchrounous reclaim 2010-10-26 16:52:07 -07:00
workqueue.h workqueue: add queue_work and activate_work trace points 2010-10-05 10:49:55 +02:00
writeback.h writeback: account for time spent congestion_waited 2010-10-26 16:52:07 -07:00