linux_old1/include/trace/events
Mel Gorman 82b212f400 Revert "mm: remove __GFP_NO_KSWAPD"
With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction
based on failures" reverted, Zdenek Kabelac reported the following

  Hmm,  so it's just took longer to hit the problem and observe
  kswapd0 spinning on my CPU again - it's not as endless like before -
  but still it easily eats minutes - it helps to	turn off  Firefox
  or TB  (memory hungry apps) so kswapd0 stops soon - and restart
  those apps again.  (And I still have like >1GB of cached memory)

  kswapd0         R  running task        0    30      2 0x00000000
  Call Trace:
    preempt_schedule+0x42/0x60
    _raw_spin_unlock+0x55/0x60
    put_super+0x31/0x40
    drop_super+0x22/0x30
    prune_super+0x149/0x1b0
    shrink_slab+0xba/0x510

The sysrq+m indicates the system has no swap so it'll never reclaim
anonymous pages as part of reclaim/compaction.  That is one part of the
problem but not the root cause as file-backed pages could also be
reclaimed.

The likely underlying problem is that kswapd is woken up or kept awake
for each THP allocation request in the page allocator slow path.

If compaction fails for the requesting process then compaction will be
deferred for a time and direct reclaim is avoided.  However, if there
are a storm of THP requests that are simply rejected, it will still be
the the case that kswapd is awake for a prolonged period of time as
pgdat->kswapd_max_order is updated each time.  This is noticed by the
main kswapd() loop and it will not call kswapd_try_to_sleep().  Instead
it will loopp, shrinking a small number of pages and calling
shrink_slab() on each iteration.

The temptation is to supply a patch that checks if kswapd was woken for
THP and if so ignore pgdat->kswapd_max_order but it'll be a hack and not
backed up by proper testing.  As 3.7 is very close to release and this
is not a bug we should release with, a safer path is to revert "mm:
remove __GFP_NO_KSWAPD" for now and revisit it with the view to ironing
out the balance_pgdat() logic in general.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Zdenek Kabelac <zkabelac@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-26 17:41:24 -08:00
..
9p.h net/9p: Convert net/9p protocol dumps to tracepoints 2011-10-24 11:13:12 -05:00
asoc.h ASoC: dapm: Fix x86_64 build warning. 2012-04-23 13:15:35 +01:00
block.h blktrace: add FLUSH/FUA support 2011-08-11 10:36:05 +02:00
btrfs.h Btrfs: update delayed ref's tracepoints to show sequence 2012-10-01 15:19:17 -04:00
compaction.h UAPI: (Scripted) Convert #include "..." to #include <path/...> in kernel system headers 2012-10-02 18:01:25 +01:00
ext3.h userns: Convert ext3 to use kuid/kgid where appropriate 2012-05-15 14:59:27 -07:00
ext4.h ext4: add missing space to trace message 2012-08-17 09:52:17 -04:00
gfpflags.h Revert "mm: remove __GFP_NO_KSWAPD" 2012-11-26 17:41:24 -08:00
gpio.h gpio: add trace events for setting direction and value 2011-05-20 00:40:19 -06:00
irq.h rcu: Use softirq to address performance regression 2011-06-14 15:25:39 -07:00
jbd.h jbd: Write journal superblock with WRITE_FUA after checkpointing 2012-05-15 23:34:37 +02:00
jbd2.h jbd2: issue cache flush after checkpointing even with internal journal 2012-03-13 22:22:54 -04:00
kmem.h UAPI: (Scripted) Convert #include "..." to #include <path/...> in kernel system headers 2012-10-02 18:01:25 +01:00
kvm.h KVM: Introduce __KVM_HAVE_IRQ_LINE 2012-06-18 16:06:35 +03:00
lock.h tracing: Factorize lock events in a lock class 2010-05-09 13:45:35 +02:00
mce.h tracing: Fix event alignment: mce:mce_record 2011-03-10 10:34:28 -05:00
module.h include: replace linux/module.h with "struct module" wherever possible 2011-10-31 19:32:32 -04:00
napi.h napi: Convert trace_napi_poll to TRACE_EVENT 2010-09-07 17:51:01 +02:00
net.h net: tracepoint of net_dev_xmit sees freed skb and causes panic 2011-06-02 14:06:31 -07:00
oom.h tracepoint: add tracepoints for debugging oom_score_adj 2012-01-10 16:30:44 -08:00
power.h PM / Sleep: Add wakeup_source_activate and wakeup_source_deactivate tracepoints 2012-05-01 21:25:25 +02:00
printk.h printk/tracing: Add console output tracing 2012-02-13 13:46:05 -05:00
random.h random: add tracepoints for easier debugging and verification 2012-07-14 20:17:48 -04:00
rcu.h rcu: Add tracing for _rcu_barrier() 2012-07-02 12:33:23 -07:00
regmap.h The following text was taken from the original review request: 2012-03-24 10:41:37 -07:00
regulator.h regulator: Add basic trace facilities 2011-01-12 14:33:00 +00:00
rpm.h device.h: audit and cleanup users in main include dir 2012-03-16 10:38:24 -04:00
sched.h perf/trace: Add ability to set a target task for events 2012-07-31 17:02:05 +02:00
scsi.h [SCSI] Include protection operation in SCSI command trace 2011-03-14 18:36:02 -05:00
signal.h tracing: let trace_signal_generate() report more info, kill overflow_fail/lose_info 2012-01-13 18:48:50 +01:00
skb.h tracing: Fix event alignment: skb:kfree_skb 2011-03-10 10:34:31 -05:00
sock.h core: add tracepoints for queueing skb to rcvbuf 2011-06-21 16:06:10 -07:00
sunrpc.h SUNRPC: Adding status trace points 2012-02-06 10:37:53 -05:00
syscalls.h tracing: Allow raw syscall trace events for non privileged users 2010-11-18 14:37:43 +01:00
task.h tracepoint: add tracepoints for debugging oom_score_adj 2012-01-10 16:30:44 -08:00
timer.h tracing: Fix timer tracing 2010-08-19 13:00:41 +02:00
udp.h udp: add tracepoints for queueing skb to rcvbuf 2011-06-21 16:06:10 -07:00
vmscan.h UAPI: (Scripted) Convert #include "..." to #include <path/...> in kernel system headers 2012-10-02 18:01:25 +01:00
workqueue.h workqueue: factor out worker_pool from global_cwq 2012-07-12 14:46:37 -07:00
writeback.h writeback: Move requeueing when I_SYNC set to writeback_sb_inodes() 2012-05-06 13:43:38 +08:00
xen.h xen/mmu: Use Xen specific TLB flush instead of the generic one. 2012-10-31 12:38:31 -04:00