Commit Graph

319 Commits

Author SHA1 Message Date
Ingo Molnar c726b61c6a Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core 2010-06-09 18:55:57 +02:00
Li Zefan 039ca4e74a tracing: Remove kmemtrace ftrace plugin
We have been resisting new ftrace plugins and removing existing
ones, and kmemtrace has been superseded by kmem trace events
and perf-kmem, so we remove it.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
[ remove kmemtrace from the makefile, handle slob too ]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-06-09 17:31:22 +02:00
Linus Torvalds 3b03117c5c Merge branch 'slub/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'slub/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
  SLUB: Allow full duplication of kmalloc array for 390
  slub: move kmem_cache_node into it's own cacheline
2010-05-30 12:46:17 -07:00
Miao Xie c0ff7453bb cpuset,mm: fix no node to alloc memory when changing cpuset's mems
Before applying this patch, cpuset updates task->mems_allowed and
mempolicy by setting all new bits in the nodemask first, and clearing all
old unallowed bits later.  But in the way, the allocator may find that
there is no node to alloc memory.

The reason is that cpuset rebinds the task's mempolicy, it cleans the
nodes which the allocater can alloc pages on, for example:

(mpol: mempolicy)
	task1			task1's mpol	task2
	alloc page		1
	  alloc on node0? NO	1
				1		change mems from 1 to 0
				1		rebind task1's mpol
				0-1		  set new bits
				0	  	  clear disallowed bits
	  alloc on node1? NO	0
	  ...
	can't alloc page
	  goto oom

This patch fixes this problem by expanding the nodes range first(set newly
allowed bits) and shrink it lazily(clear newly disallowed bits).  So we
use a variable to tell the write-side task that read-side task is reading
nodemask, and the write-side task clears newly disallowed nodes after
read-side task ends the current memory allocation.

[akpm@linux-foundation.org: fix spello]
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Paul Menage <menage@google.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Ravikiran Thirumalai <kiran@scalex86.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:06:57 -07:00
Alexander Duyck 73367bd8ee slub: move kmem_cache_node into it's own cacheline
This patch is meant to improve the performance of SLUB by moving the local
kmem_cache_node lock into it's own cacheline separate from kmem_cache.
This is accomplished by simply removing the local_node when NUMA is enabled.

On my system with 2 nodes I saw around a 5% performance increase w/
hackbench times dropping from 6.2 seconds to 5.9 seconds on average.  I
suspect the performance gain would increase as the number of nodes
increases, but I do not have the data to currently back that up.

Bugzilla-Reference: http://bugzilla.kernel.org/show_bug.cgi?id=15713
Cc: <stable@kernel.org>
Reported-by: Alex Shi <alex.shi@intel.com>
Tested-by: Alex Shi <alex.shi@intel.com>
Acked-by: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-05-24 21:11:29 +03:00
Pekka Enberg bb4f6b0cd7 Merge branches 'slab/align', 'slab/cleanups', 'slab/fixes', 'slab/memhotadd' and 'slub/fixes' into slab-for-linus 2010-05-22 10:57:52 +03:00
Minchan Kim 6b65aaf302 slub: Use alloc_pages_exact_node() for page allocation
The alloc_slab_page() in SLUB uses alloc_pages() if node is '-1'.  This means
that node validity check in alloc_pages_node is unnecessary and we can use
alloc_pages_exact_node() to avoid comparison and branch as commit
6484eb3e2a ("page allocator: do not check NUMA node ID when the caller
knows the node is valid") did for the page allocator.

Cc: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-05-22 10:57:31 +03:00
Xiaotian Feng d3e14aa336 slub: __kmalloc_node_track_caller should trace kmalloc_large_node case
commit 94b528d (kmemtrace: SLUB hooks for caller-tracking functions)
missed tracing kmalloc_large_node in __kmalloc_node_track_caller. We
should trace it same as __kmalloc_node.

Acked-by: David Rientjes <rientjes@google.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-05-22 10:57:31 +03:00
Eric Dumazet bbd7d57bfe slub: Potential stack overflow
I discovered that we can overflow stack if CONFIG_SLUB_DEBUG=y and use slabs
with many objects, since list_slab_objects() and process_slab() use
DECLARE_BITMAP(map, page->objects).

With 65535 bits, we use 8192 bytes of stack ...

Switch these allocations to dynamic allocations.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-05-22 10:57:30 +03:00
David Woodhouse 4581ced379 mm: Move ARCH_SLAB_MINALIGN and ARCH_KMALLOC_MINALIGN to <linux/slub_def.h>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-05-19 22:03:13 +03:00
Zhang, Yanmin 111c7d8243 slub: Fix bad boundary check in init_kmem_cache_nodes()
Function init_kmem_cache_nodes is incorrect when checking upper limitation of
kmalloc_caches. The breakage was introduced by commit
91efd773c7 ("dma kmalloc handling fixes").

Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-05-05 21:12:19 +03:00
Pekka Enberg d3e06e2b15 slub: Fix kmem_ptr_validate() for non-kernel pointers
As suggested by Linus, fix up kmem_ptr_validate() to handle non-kernel pointers
more graciously. The patch changes kmem_ptr_validate() to use the newly
introduced kern_ptr_validate() helper to check that a pointer is a valid kernel
pointer before we attempt to convert it into a 'struct page'.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-09 10:09:50 -07:00
Linus Torvalds c32da02342 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (56 commits)
  doc: fix typo in comment explaining rb_tree usage
  Remove fs/ntfs/ChangeLog
  doc: fix console doc typo
  doc: cpuset: Update the cpuset flag file
  Fix of spelling in arch/sparc/kernel/leon_kernel.c no longer needed
  Remove drivers/parport/ChangeLog
  Remove drivers/char/ChangeLog
  doc: typo - Table 1-2 should refer to "status", not "statm"
  tree-wide: fix typos "ass?o[sc]iac?te" -> "associate" in comments
  No need to patch AMD-provided drivers/gpu/drm/radeon/atombios.h
  devres/irq: Fix devm_irq_match comment
  Remove reference to kthread_create_on_cpu
  tree-wide: Assorted spelling fixes
  tree-wide: fix 'lenght' typo in comments and code
  drm/kms: fix spelling in error message
  doc: capitalization and other minor fixes in pnp doc
  devres: typo fix s/dev/devm/
  Remove redundant trailing semicolons from macros
  fix typo "definetly" -> "definitely" in comment
  tree-wide: s/widht/width/g typo in comments
  ...

Fix trivial conflict in Documentation/laptops/00-INDEX
2010-03-12 16:04:50 -08:00
Jiri Kosina 318ae2edc3 Merge branch 'for-next' into for-linus
Conflicts:
	Documentation/filesystems/proc.txt
	arch/arm/mach-u300/include/mach/debug-macro.S
	drivers/net/qlge/qlge_ethtool.c
	drivers/net/qlge/qlge_main.c
	drivers/net/typhoon.c
2010-03-08 16:55:37 +01:00
Emese Revfy 52cf25d0ab Driver core: Constify struct sysfs_ops in struct kobj_type
Constify struct sysfs_ops.

This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

 * prevents modification of data that is shared
   (referenced) by many other structure instances
   at runtime

 * detects/prevents accidental (but not intentional)
   modification attempts on archs that enforce
   read-only kernel data at runtime

 * potentially better optimized code as the compiler
   can assume that the const data cannot be changed

 * the compiler/linker move const data into .rodata
   and therefore exclude them from false sharing

Signed-off-by: Emese Revfy <re.emese@gmail.com>
Acked-by: David Teigland <teigland@redhat.com>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:49 -08:00
Emese Revfy 9cd43611cc kobject: Constify struct kset_uevent_ops
Constify struct kset_uevent_ops.

This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

 * prevents modification of data that is shared
   (referenced) by many other structure instances
   at runtime

 * detects/prevents accidental (but not intentional)
   modification attempts on archs that enforce
   read-only kernel data at runtime

 * potentially better optimized code as the compiler
   can assume that the const data cannot be changed

 * the compiler/linker move const data into .rodata
   and therefore exclude them from false sharing

Signed-off-by: Emese Revfy <re.emese@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:49 -08:00
Stephen Rothwell 1154fab73c SLUB: Fix per-cpu merge conflict
The slab tree adds a percpu variable usage case (commit
9dfc6e68bf "SLUB: Use this_cpu operations in
slub"), but the percpu tree removes the prefixing of percpu variables (commit
dd17c8f729 "percpu: remove per_cpu__ prefix"),
thus causing the following compilation error:

    CC      mm/slub.o
  mm/slub.c: In function ‘alloc_kmem_cache_cpus’:
  mm/slub.c:2078: error: implicit declaration of function ‘per_cpu_var’
  mm/slub.c:2078: warning: assignment makes pointer from integer without a cast
  make[1]: *** [mm/slub.o] Error 1

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-03-04 12:09:43 +02:00
Pekka Enberg e2b093f3e9 Merge branches 'slab/cleanups', 'slab/failslab', 'slab/fixes' and 'slub/percpu' into slab-for-linus 2010-03-04 12:07:50 +02:00
Dmitry Monakhov 4c13dd3b48 failslab: add ability to filter slab caches
This patch allow to inject faults only for specific slabs.
In order to preserve default behavior cache filter is off by
default (all caches are faulty).

One may define specific set of slabs like this:
# mark skbuff_head_cache as faulty
echo 1 > /sys/kernel/slab/skbuff_head_cache/failslab
# Turn on cache filter (off by default)
echo 1 > /sys/kernel/debug/failslab/cache-filter
# Turn on fault injection
echo 1 > /sys/kernel/debug/failslab/times
echo 1 > /sys/kernel/debug/failslab/probability

Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-02-26 19:19:39 +02:00
Adam Buchbinder c9404c9c39 Fix misspelling of "should" and "shouldn't" in comments.
Some comments misspell "should" or "shouldn't"; this fixes them. No code changes.

Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-02-05 12:22:30 +01:00
Christoph Lameter 91efd773c7 dma kmalloc handling fixes
1. We need kmalloc_percpu for all of the now extended kmalloc caches
   array not just for each shift value.

2. init_kmem_cache_nodes() must assume node 0 locality for statically
   allocated dma kmem_cache structures even after boot is complete.

Reported-and-tested-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-01-22 18:33:38 +02:00
David Rientjes 7738dd9e8f slub: remove impossible condition
`s' cannot be NULL if kmalloc_caches is not NULL.

This conditional would trigger a NULL pointer on `s', anyway, since it is
immediately derefernced if true.

Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-01-22 18:33:36 +02:00
Christoph Lameter 84e554e686 SLUB: Make slub statistics use this_cpu_inc
this_cpu_inc() translates into a single instruction on x86 and does not
need any register. So use it in stat(). We also want to avoid the
calculation of the per cpu kmem_cache_cpu structure pointer. So pass
a kmem_cache pointer instead of a kmem_cache_cpu pointer.

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-12-20 10:39:34 +02:00
Christoph Lameter ff12059ed1 SLUB: this_cpu: Remove slub kmem_cache fields
Remove the fields in struct kmem_cache_cpu that were used to cache data from
struct kmem_cache when they were in different cachelines. The cacheline that
holds the per cpu array pointer now also holds these values. We can cut down
the struct kmem_cache_cpu size to almost half.

The get_freepointer() and set_freepointer() functions that used to be only
intended for the slow path now are also useful for the hot path since access
to the size field does not require accessing an additional cacheline anymore.
This results in consistent use of functions for setting the freepointer of
objects throughout SLUB.

Also we initialize all possible kmem_cache_cpu structures when a slab is
created. No need to initialize them when a processor or node comes online.

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-12-20 10:17:59 +02:00
Christoph Lameter 756dee7587 SLUB: Get rid of dynamic DMA kmalloc cache allocation
Dynamic DMA kmalloc cache allocation is troublesome since the
new percpu allocator does not support allocations in atomic contexts.
Reserve some statically allocated kmalloc_cpu structures instead.

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-12-20 09:57:00 +02:00
Christoph Lameter 9dfc6e68bf SLUB: Use this_cpu operations in slub
Using per cpu allocations removes the needs for the per cpu arrays in the
kmem_cache struct. These could get quite big if we have to support systems
with thousands of cpus. The use of this_cpu_xx operations results in:

1. The size of kmem_cache for SMP configuration shrinks since we will only
   need 1 pointer instead of NR_CPUS. The same pointer can be used by all
   processors. Reduces cache footprint of the allocator.

2. We can dynamically size kmem_cache according to the actual nodes in the
   system meaning less memory overhead for configurations that may potentially
   support up to 1k NUMA nodes / 4k cpus.

3. We can remove the diddle widdle with allocating and releasing of
   kmem_cache_cpu structures when bringing up and shutting down cpus. The cpu
   alloc logic will do it all for us. Removes some portions of the cpu hotplug
   functionality.

4. Fastpath performance increases since per cpu pointer lookups and
   address calculations are avoided.

V7-V8
- Convert missed get_cpu_slab() under CONFIG_SLUB_STATS

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-12-20 09:29:18 +02:00
Linus Torvalds 2205afa7d1 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf sched: Fix build failure on sparc
  perf bench: Add "all" pseudo subsystem and "all" pseudo suite
  perf tools: Introduce perf_session class
  perf symbols: Ditch dso->find_symbol
  perf symbols: Allow lookups by symbol name too
  perf symbols: Add missing "Variables" entry to map_type__name
  perf symbols: Add support for 'variable' symtabs
  perf symbols: Introduce ELF counterparts to symbol_type__is_a
  perf symbols: Introduce symbol_type__is_a
  perf symbols: Rename kthreads to kmaps, using another abstraction for it
  perf tools: Allow building for ARM
  hw-breakpoints: Handle bad modify_user_hw_breakpoint off-case return value
  perf tools: Allow cross compiling
  tracing, slab: Fix no callsite ifndef CONFIG_KMEMTRACE
  tracing, slab: Define kmem_cache_alloc_notrace ifdef CONFIG_TRACING

Trivial conflict due to different fixes to modify_user_hw_breakpoint()
in include/linux/hw_breakpoint.h
2009-12-14 10:13:22 -08:00
Pekka Enberg 355d79c87a Merge branches 'slab/fixes', 'slab/kmemleak', 'slub/perf' and 'slub/stats' into for-linus 2009-12-12 10:12:19 +02:00
Li Zefan 0f24f1287a tracing, slab: Define kmem_cache_alloc_notrace ifdef CONFIG_TRACING
Define kmem_trace_alloc_{,node}_notrace() if CONFIG_TRACING is
enabled, otherwise perf-kmem will show wrong stats ifndef
CONFIG_KMEM_TRACE, because a kmalloc() memory allocation may
be traced by both trace_kmalloc() and trace_kmem_cache_alloc().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
LKML-Reference: <4B21F89A.7000801@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-11 09:17:02 +01:00
Pekka Enberg 74e2134ff8 SLUB: Fix __GFP_ZERO unlikely() annotation
The unlikely() annotation in slab_alloc() covers too much of the expression.
It's actually very likely that the object is not NULL so use unlikely() only
for the __GFP_ZERO expression like SLAB does.

The patch reduces kernel text by 29 bytes on x86-64:

   text	   data	    bss	    dec	    hex	filename
  24185	   8560	    176	  32921	   8099	mm/slub.o.orig
  24156	   8560	    176	  32892	   807c	mm/slub.o

Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-11-29 09:01:59 +02:00
David Rientjes 78eb00cc57 slub: allow stats to be cleared
When collecting slub stats for particular workloads, it's necessary to
collect each statistic for all caches before the job is even started
because the counters are usually greater than zero just from boot and
initialization.

This allows a statistic to be cleared on each cpu by writing '0' to its
sysfs file.  This creates a baseline for statistics of interest before
the workload is started.

Setting a statistic to a particular value is not supported, so all values
written to these files other than '0' returns -EINVAL.

Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-10-15 21:34:12 +03:00
Benjamin Herrenschmidt fe1ff49d0d mm: kmem_cache_create(): make it easier to catch NULL cache names
Right now, if you inadvertently pass NULL to kmem_cache_create() at boot
time, it crashes much later after boot somewhere deep inside sysfs which
makes it very non obvious to figure out what's going on.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-22 07:17:33 -07:00
Ingo Molnar fdaa45e95d slub: Fix build error in kmem_cache_open() with !CONFIG_SLUB_DEBUG
This build bug:

 mm/slub.c: In function 'kmem_cache_open':
 mm/slub.c:2476: error: 'disable_higher_order_debug' undeclared (first use in this function)
 mm/slub.c:2476: error: (Each undeclared identifier is reported only once
 mm/slub.c:2476: error: for each function it appears in.)

Triggers because there's no !CONFIG_SLUB_DEBUG definition for
disable_higher_order_debug.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-09-15 22:32:10 +03:00
Linus Torvalds ada3fa1505 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (46 commits)
  powerpc64: convert to dynamic percpu allocator
  sparc64: use embedding percpu first chunk allocator
  percpu: kill lpage first chunk allocator
  x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA
  percpu: update embedding first chunk allocator to handle sparse units
  percpu: use group information to allocate vmap areas sparsely
  vmalloc: implement pcpu_get_vm_areas()
  vmalloc: separate out insert_vmalloc_vm()
  percpu: add chunk->base_addr
  percpu: add pcpu_unit_offsets[]
  percpu: introduce pcpu_alloc_info and pcpu_group_info
  percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward
  percpu: add @align to pcpu_fc_alloc_fn_t
  percpu: make @dyn_size mandatory for pcpu_setup_first_chunk()
  percpu: drop @static_size from first chunk allocators
  percpu: generalize first chunk allocator selection
  percpu: build first chunk allocators selectively
  percpu: rename 4k first chunk allocator to page
  percpu: improve boot messages
  percpu: fix pcpu_reclaim() locking
  ...

Fix trivial conflict as by Tejun Heo in kernel/sched.c
2009-09-15 09:39:44 -07:00
Pekka Enberg aceda77360 Merge branches 'slab/cleanups' and 'slab/fixes' into for-linus 2009-09-14 20:19:06 +03:00
Eric Dumazet 8a3d271deb slub: fix slab_pad_check()
When SLAB_POISON is used and slab_pad_check() finds an overwrite of the
slab padding, we call restore_bytes() on the whole slab, not only
on the padding.

Acked-by: Christoph Lameer <cl@linux-foundation.org>
Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-09-14 07:57:55 +03:00
Eric Dumazet d76b1590e0 slub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU
kmem_cache_destroy() should call rcu_barrier() *after* kmem_cache_close() and
*before* sysfs_slab_remove() or risk rcu_free_slab() being called after
kmem_cache is deleted (kfreed).

rmmod nf_conntrack can crash the machine because it has to kmem_cache_destroy()
a SLAB_DESTROY_BY_RCU enabled cache.

Cc: <stable@kernel.org>
Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-09-03 22:38:59 +03:00
Xiaotian Feng 5788d8ad6c slub: release kobject if sysfs_create_group failed in sysfs_slab_add
When CONFIG_SLUB_DEBUG is enabled, sysfs_slab_add should unlink and put the
kobject if sysfs_create_group failed. Otherwise, sysfs_slab_add returns error
then free kmem_cache s, thus memory of s->kobj is leaked.

Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-09-03 21:11:41 +03:00
Aaro Koskinen acdfcd04d9 SLUB: fix ARCH_KMALLOC_MINALIGN cases 64 and 256
If the minalign is 64 bytes, then the 96 byte cache should not be created
because it would conflict with the 128 byte cache.

If the minalign is 256 bytes, patching the size_index table should not
result in a buffer overrun.

The calculation "(i - 1) / 8" used to access size_index[] is moved to
a separate function as suggested by Christoph Lameter.

Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-08-30 14:56:48 +03:00
Amerigo Wang 5086c389cb SLUB: Fix some coding style issues
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-08-19 21:44:13 +03:00
WANG Cong cf5d11317e SLUB: Drop write permission to /proc/slabinfo
SLUB does not support writes to /proc/slabinfo so there should not be write
permission to do that either.

Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-08-18 19:11:40 +03:00
Tejun Heo 384be2b18a Merge branch 'percpu-for-linus' into percpu-for-next
Conflicts:
	arch/sparc/kernel/smp_64.c
	arch/x86/kernel/cpu/perf_counter.c
	arch/x86/kernel/setup_percpu.c
	drivers/cpufreq/cpufreq_ondemand.c
	mm/percpu.c

Conflicts in core and arch percpu codes are mostly from commit
ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many
num_possible_cpus() with nr_cpu_ids.  As for-next branch has moved all
the first chunk allocators into mm/percpu.c, the changes are moved
from arch code to mm/percpu.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-08-14 14:45:31 +09:00
Zhang, Yanmin dcb0ce1bdf slub: change kmem_cache->align to record the real alignment
kmem_cache->align records the original align parameter value specified
by users. Function calculate_alignment might change it based on cache
line size. So change kmem_cache->align correspondingly.

Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-08-01 18:26:40 +03:00
David Rientjes 3de472138a slub: use size and objsize orders to disable debug flags
This patch moves the masking of debugging flags which increase a cache's
min order due to metadata when `slub_debug=O' is used from
kmem_cache_flags() to kmem_cache_open().

Instead of defining the maximum metadata size increase in a preprocessor
macro, this approach uses the cache's ->size and ->objsize members to
determine if the min order increased due to debugging options.  If so,
the flags specified in the more appropriately named DEBUG_METADATA_FLAGS
are masked off.

This approach was suggested by Christoph Lameter
<cl@linux-foundation.org>.

Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-07-28 10:53:09 +03:00
David Rientjes fa5ec8a1f6 slub: add option to disable higher order debugging slabs
When debugging is enabled, slub requires that additional metadata be
stored in slabs for certain options: SLAB_RED_ZONE, SLAB_POISON, and
SLAB_STORE_USER.

Consequently, it may require that the minimum possible slab order needed
to allocate a single object be greater when using these options.  The
most notable example is for objects that are PAGE_SIZE bytes in size.

Higher minimum slab orders may cause page allocation failures when oom or
under heavy fragmentation.

This patch adds a new slub_debug option, which disables debugging by
default for caches that would have resulted in higher minimum orders:

	slub_debug=O

When this option is used on systems with 4K pages, kmalloc-4096, for
example, will not have debugging enabled by default even if
CONFIG_SLUB_DEBUG_ON is defined because it would have resulted in a
order-1 minimum slab order.

Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-07-10 09:52:55 +03:00
Catalin Marinas e4f7c0b44a kmemleak: Trace the kmalloc_large* functions in slub
The kmalloc_large() and kmalloc_large_node() functions were missed when
adding the kmemleak hooks to the slub allocator. However, they should be
traced to avoid false positives.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-07-08 14:25:14 +01:00
Tejun Heo c43768cbb7 Merge branch 'master' into for-next
Pull linus#master to merge PER_CPU_DEF_ATTRIBUTES and alpha build fix
changes.  As alpha in percpu tree uses 'weak' attribute instead of
inline assembly, there's no need for __used attribute.

Conflicts:
	arch/alpha/include/asm/percpu.h
	arch/mn10300/kernel/vmlinux.lds.S
	include/linux/percpu-defs.h
2009-07-04 07:13:18 +09:00
Paul E. McKenney 7ed9f7e5db fix RCU-callback-after-kmem_cache_destroy problem in sl[aou]b
Jesper noted that kmem_cache_destroy() invokes synchronize_rcu() rather than
rcu_barrier() in the SLAB_DESTROY_BY_RCU case, which could result in RCU
callbacks accessing a kmem_cache after it had been destroyed.

Cc: <stable@kernel.org>
Acked-by: Matt Mackall <mpm@selenic.com>
Reported-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-06-26 12:10:47 +03:00
Pekka Enberg ba52270d18 SLUB: Don't pass __GFP_FAIL for the initial allocation
SLUB uses higher order allocations by default but falls back to small
orders under memory pressure. Make sure the GFP mask used in the initial
allocation doesn't include __GFP_NOFAIL.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-24 12:20:14 -07:00
Tejun Heo 204fba4aa3 percpu: cleanup percpu array definitions
Currently, the following three different ways to define percpu arrays
are in use.

1. DEFINE_PER_CPU(elem_type[array_len], array_name);
2. DEFINE_PER_CPU(elem_type, array_name[array_len]);
3. DEFINE_PER_CPU(elem_type, array_name)[array_len];

Unify to #1 which correctly separates the roles of the two parameters
and thus allows more flexibility in the way percpu variables are
defined.

[ Impact: cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: linux-mm@kvack.org
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: David S. Miller <davem@davemloft.net>
2009-06-24 15:13:45 +09:00