Commit Graph

335839 Commits

Author SHA1 Message Date
Andrew Lunn 34c93c8657 dma: mv_xor: Add a device_control function
The dmatest module for DMA engines calls

device_control(dtc->chan, DMA_TERMINATE_ALL, 0);

after completing the tests. The documentation in
include/linux/dmaengine.h suggests this function is optional and
dma_async_device_register() also does not BUG_ON() when not passed a
function. However, dmatest is not the only code in the kernel
unconditionally calling device_control. So add an implementation
indicating all operations are not implemented.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:59:01 +01:00
Thomas Petazzoni cd09fea446 dma: mv_xor: add missing __devinit and __devexit qualifiers on probe and remove
The ->probe() and ->remove() functions were missing the usual
__devinit and __devexit qualifiers.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:59:00 +01:00
Thomas Petazzoni f7d12ef53d dma: mv_xor: add Device Tree binding
This patch finally adds a Device Tree binding to the mv_xor
driver. Thanks to the previous cleanup patches, the Device Tree
binding is relatively simply: one DT node per XOR engine, with
sub-nodes for each XOR channel of the XOR engine. The binding
obviously comes with the necessary documentation.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: devicetree-discuss@lists.ozlabs.org
2012-11-20 15:59:00 +01:00
Thomas Petazzoni 88eb92cb4d dma: mv_xor: add missing free_irq() call
Even though the driver cannot be unloaded at the moment, it is still
good to properly free the IRQ handlers in the channel removal function.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:59:00 +01:00
Thomas Petazzoni b503fa0199 dma: mv_xor: remove the pool_size from platform_data
The pool_size is always PAGE_SIZE, and since it is a software
configuration paramter (and not a hardware description parameter), we
cannot make it part of the Device Tree binding, so we'd better remove
it from the platform_data as well.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:59:00 +01:00
Thomas Petazzoni 9aedbdbab3 dma: mv_xor: remove hw_id field from platform_data
There is no need for the platform_data to give this ID, it is simply
the channel number, so we can compute it inside the driver when
registering the channels.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:59:00 +01:00
Thomas Petazzoni c819ce177e dma: mv_xor: remove useless backpointer from mv_xor_chan to mv_xor_device
The backpointer from mv_xor_chan to mv_xor_device is now useless, get
rid of it.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:59 +01:00
Thomas Petazzoni 297eedbae1 dma: mv_xor: rename mv_xor_private to mv_xor_device
Now that mv_xor_device is no longer used to designate the per-channel
DMA devices, use it know to designate the XOR engine themselves
(currently composed of two XOR channels).

So, now we have the nice organization where:

 - mv_xor_device represents each XOR engine in the system
 - mv_xor_chan   represents each XOR channel of a given XOR engine

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:59 +01:00
Thomas Petazzoni 1ef48a262b dma: mv_xor: merge mv_xor_device and mv_xor_chan
Even though the DMA engine infrastructure has support for multiple
channels per device, the mv_xor driver registers one DMA engine device
for each channel, because the mv_xor channels inside the same XOR
engine have different capabilities, and the DMA engine infrastructure
only allows to express capabilities at the DMA engine device level.

The mv_xor driver has therefore been registering one DMA engine device
and one DMA engine channel for each XOR channel since its introduction
in the kernel. However, it kept two separate internal structures,
mv_xor_device and mv_xor_channel, which didn't make a lot of sense
since there was a 1:1 mapping between those structures.

This patch gets rid of this duplication, and merges everything into
the mv_xor_chan structure.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:59 +01:00
Thomas Petazzoni 275cc0c8bd dma: mv_xor: use mv_xor_chan pointers as arguments to self-test functions
In preparation for the removal of the mv_xor_device structure, we
directly pass mv_xor_chan pointers to the self-test functions included
in the driver. These functions were anyway selecting the first (and
only channel) available in each DMA device, so the behaviour is
unchanged.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:59 +01:00
Thomas Petazzoni 8c75979d7a dma: mv_xor: in mv_xor_device, rename 'common' to 'dmadev'
The mv_xor_device structure embeds a 'struct dma_device', which is
named 'common', a not very meaningful name. Rename it to 'dmadev',
which will help avoid confusions later as we merge the mv_xor_device
and mv_xor_chan structures together.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:59 +01:00
Thomas Petazzoni 98817b9959 dma: mv_xor: in mv_xor_chan, rename 'common' to 'dmachan'
The mv_xor_chan structure embeds a 'struct dma_chan', which is named
'common', a not very meaningful name. Rename it to 'dmachan', which
will help avoid confusions later as we merge the mv_xor_device and
mv_xor_chan structures together.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:58 +01:00
Thomas Petazzoni ecde6cd492 dma: mv_xor: get rid of the pdev pointer in mv_xor_device
It was only used in places where we could get the 'struct device *'
pointer through a different way.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:58 +01:00
Thomas Petazzoni c98c17813e dma: mv_xor: introduce a mv_chan_to_devp() helper
In many place, we need to get the 'struct device *' pointer from a
'struct mv_chan *', so we add a helper that makes this a bit
easier. It will also help reducing the change noise in further
patches.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:58 +01:00
Thomas Petazzoni c35064c4b6 dma: mv_xor: simplify dma_sync_single_for_cpu() calls
In mv_xor_memcpy_self_test() and mv_xor_xor_self_test(), all DMA
functions are called by passing dma_chan->device->dev as the 'device
*', except the calls to dma_sync_single_for_cpu() which uselessly goes
through mv_chan->device->pdev->dev.

Simplify this by using dma_chan->device->dev direclty in
dma_sync_single_for_cpu() calls.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:58 +01:00
Thomas Petazzoni 01a9508de7 dma: mv_xor: remove unused to_mv_xor_device() macro
The to_mv_xor_device() macro is not being used by the driver, so we
can get rid of it.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:58 +01:00
Thomas Petazzoni 8b5c3f6c8d dma: mv_xor: remove unused id field in mv_xor_device structure
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:58 +01:00
Thomas Petazzoni 61971656ce dma: mv_xor: rename many symbols to remove the 'shared' word
The 'shared' word no longer makes sense in a number of places as we
renamed the 'mv_xor_shared' driver to 'mv_xor'.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:57 +01:00
Thomas Petazzoni 0dddee7a7d dma: mv_xor: change the driver name to 'mv_xor'
Since we got rid of the per-XOR channel 'mv_xor' driver, now the
per-XOR engine driver that used to be called 'mv_xor_shared' can
simply be named 'mv_xor'.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:57 +01:00
Thomas Petazzoni 7dde453d62 dma: mv_xor: rename mv_xor_shared_platform_data to mv_xor_platform_data
'struct mv_xor_shared_platform_data' used to be the platform_data
structure for the 'mv_xor_shared', but this driver is going to be
renamed simply 'mv_xor', so also rename its platform_data structure
accordingly.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:57 +01:00
Thomas Petazzoni e39f6ec1f9 dma: mv_xor: rename mv_xor_platform_data to mv_xor_channel_data
mv_xor_platform_data used to be the platform_data structure associated
to the 'mv_xor' driver. This driver no longer exists, and this data
structure really contains the properties of each XOR channel part of a
given XOR engine. Therefore 'struct mv_xor_channel_data' is a more
appropriate name.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:57 +01:00
Thomas Petazzoni 2ccc469cfe dma: mv_xor: remove 'shared' from mv_xor_platform_data
This member of the platform_data structure is no longer used, so get
rid of it.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:56 +01:00
Thomas Petazzoni 18b2a02c7c dma: mv_xor: remove sub-driver 'mv_xor'
Now that XOR channels are directly registered by the main
'mv_xor_shared' device ->probe() function and all users of the
'mv_xor' device have been removed, we can get rid of the latter.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:56 +01:00
Thomas Petazzoni c08f1495c8 arm: plat-orion: remove unused orion_xor_init_channels()
Now that xor0 and xor1 are registered in a single driver manner, the
orion_xor_init_channels() function has become useless.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:56 +01:00
Thomas Petazzoni dd2c57b822 arm: plat-orion: convert the registration of the xor1 engine to the single driver
Instead of registering one 'mv_xor_shared' device for the XOR engine,
and then two 'mv_xor' devices for the XOR channels, pass the channels
properties as platform_data for the main 'mv_xor_shared' device.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:56 +01:00
Thomas Petazzoni af19e148be arm: plat-orion: convert the registration of the xor0 engine to the single driver
Instead of registering one 'mv_xor_shared' device for the XOR engine,
and then two 'mv_xor' devices for the XOR channels, pass the channels
properties as platform_data for the main 'mv_xor_shared' device.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:56 +01:00
Thomas Petazzoni 60d151f387 dma: mv_xor: allow channels to be registered directly from the main device
Extend the XOR engine driver (currently called "mv_xor_shared") so
that XOR channels can be passed in the platform_data structure, and be
registered from there.

This will allow the users of the driver to be converted to the single
platform_driver variant of the mv_xor driver.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:56 +01:00
Thomas Petazzoni a6b4a9d2c1 dma: mv_xor: split initialization/cleanup of XOR channels
Instead of doing the initialization/cleanup of the XOR channels
directly in the ->probe() and ->remove() hooks, we create separate
utility functions mv_xor_channel_add() and mv_xor_channel_remove().

This will allow to easily introduce in a future patch a different way
of registering XOR channels: instead of having one platform_device per
channel, we'll trigger the registration of all XOR channels of a given
XOR engine directly from the XOR engine ->probe() function.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:55 +01:00
Thomas Petazzoni 09f2b7864c dma: mv_xor: do not use pool_size from platform_data within the driver
The driver currently pokes into the platform_data structure during its
normal operation to get the pool_size value. Poking into the
platform_data structure is not nice when moving to the Device Tree, so
this commit adds a new pool_size field in the mv_xor_device structure,
which gets initialized at ->probe() time. The driver then uses this
field instead of the platform_data.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:55 +01:00
Thomas Petazzoni a3fc74bc9b dma: mv_xor: use dev_(err|info|notice) instead of dev_printk
The usage of dev_printk() is deprecated, and the dev_err(), dev_info()
and dev_notice() functions should be used instead.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 15:58:55 +01:00
Andrew Lunn 1611f87251 ARM: Kirkwood: switch to DT clock providers
With true DT clock providers available switch Kirkwood clock setup in
DT- enabled boards. While AUXDATA can be removed completely from bus
probing, some devices still don't know about DT. Therefore, some clkdev
aliases are created until these devices also move to DT.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
2012-11-20 14:46:50 +01:00
Sebastian Hesselbarth 5b03df9ace ARM: dove: switch to DT clock providers
With true DT clock providers available switch Dove clock setup in DT-
enabled boards. While AUXDATA can be removed completely from bus probing,
some devices still don't know about DT at all. Therefore, some clock
aliases are created until the devices also move to DT.

Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
2012-11-20 14:46:50 +01:00
Gregory CLEMENT 307c2bf467 clocksource: convert time-armada-370-xp to clk framework
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by Gregory CLEMENT <gregory.clement@free-electrons.com>
2012-11-20 14:46:49 +01:00
Gregory CLEMENT 9d2027830c clk: armada-370-xp: add support for clock framework
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Tested-by Gregory CLEMENT <gregory.clement@free-electrons.com>
2012-11-20 14:46:48 +01:00
Gregory CLEMENT c4c34d6084 clk: mvebu: armada 370/XP add clock gating control provider for DT
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
2012-11-20 14:44:00 +01:00
Sebastian Hesselbarth f97d0d7aa8 clk: mvebu: add clock gating control provider for DT
This driver allows to provide DT clocks for clock gates found on
Marvell Dove and Kirkwood SoCs. The clock gates are referenced by
the phandle index of the corresponding bit in the clock gating control
register to ease lookup in the datasheet.

Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
2012-11-20 14:43:24 +01:00
Gregory CLEMENT ab8ba01b3f clk: mvebu: add armada-370-xp CPU specific clocks
Add Armada 370/XP specific CPU clocks

Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2012-11-20 14:35:42 +01:00
Sebastian Hesselbarth 97fa4cf442 clk: mvebu: add mvebu core clocks.
This driver allows to provide DT clocks for core clocks found on
Marvell Kirkwood, Dove & 370/XP SoCs. The core clock frequencies and
ratios are determined by decoding the Sample-At-Reset registers.

Although technically correct, using a divider of 0 will lead to
div_by_zero panic. Let's use a ratio of 0/1 instead to fail later
with a zero clock.

Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by Gregory CLEMENT <gregory.clement@free-electrons.com>
2012-11-20 14:34:08 +01:00
Linus Torvalds f4a75d2eb7 Linux 3.7-rc6 2012-11-16 17:42:40 -08:00
Linus Torvalds 51844b0f04 Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fix from Marcelo Tosatti:
 "A correction for oops on module init with older Intel hosts."

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Fix invalid secondary exec controls in vmx_cpuid_update()
2012-11-16 16:49:10 -08:00
Linus Torvalds 0cad3ff404 Merge branch 'akpm' (Fixes from Andrew)
Merge misc fixes from Andrew Morton.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (12 patches)
  revert "mm: fix-up zone present pages"
  tmpfs: change final i_blocks BUG to WARNING
  tmpfs: fix shmem_getpage_gfp() VM_BUG_ON
  mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address
  mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"
  rapidio: fix kernel-doc warnings
  swapfile: fix name leak in swapoff
  memcg: fix hotplugged memory zone oops
  mips, arc: fix build failure
  memcg: oom: fix totalpages calculation for memory.swappiness==0
  mm: fix build warning for uninitialized value
  mm: add anon_vma_lock to validate_mm()
2012-11-16 15:26:38 -08:00
Andrew Morton 5576646f3c revert "mm: fix-up zone present pages"
Revert commit 7f1290f2f2 ("mm: fix-up zone present pages")

That patch tried to fix a issue when calculating zone->present_pages,
but it caused a regression on 32bit systems with HIGHMEM.  With that
change, reset_zone_present_pages() resets all zone->present_pages to
zero, and fixup_zone_present_pages() is called to recalculate
zone->present_pages when the boot allocator frees core memory pages into
buddy allocator.  Because highmem pages are not freed by bootmem
allocator, all highmem zones' present_pages becomes zero.

Various options for improving the situation are being discussed but for
now, let's return to the 3.6 code.

Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Hugh Dickins 0f3c42f522 tmpfs: change final i_blocks BUG to WARNING
Under a particular load on one machine, I have hit shmem_evict_inode()'s
BUG_ON(inode->i_blocks), enough times to narrow it down to a particular
race between swapout and eviction.

It comes from the "if (freed > 0)" asymmetry in shmem_recalc_inode(),
and the lack of coherent locking between mapping's nrpages and shmem's
swapped count.  There's a window in shmem_writepage(), between lowering
nrpages in shmem_delete_from_page_cache() and then raising swapped
count, when the freed count appears to be +1 when it should be 0, and
then the asymmetry stops it from being corrected with -1 before hitting
the BUG.

One answer is coherent locking: using tree_lock throughout, without
info->lock; reasonable, but the raw_spin_lock in percpu_counter_add() on
used_blocks makes that messier than expected.  Another answer may be a
further effort to eliminate the weird shmem_recalc_inode() altogether,
but previous attempts at that failed.

So far undecided, but for now change the BUG_ON to WARN_ON: in usual
circumstances it remains a useful consistency check.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Hugh Dickins 215c02bc33 tmpfs: fix shmem_getpage_gfp() VM_BUG_ON
Fuzzing with trinity hit the "impossible" VM_BUG_ON(error) (which Fedora
has converted to WARNING) in shmem_getpage_gfp():

  WARNING: at mm/shmem.c:1151 shmem_getpage_gfp+0xa5c/0xa70()
  Pid: 29795, comm: trinity-child4 Not tainted 3.7.0-rc2+ #49
  Call Trace:
    warn_slowpath_common+0x7f/0xc0
    warn_slowpath_null+0x1a/0x20
    shmem_getpage_gfp+0xa5c/0xa70
    shmem_fault+0x4f/0xa0
    __do_fault+0x71/0x5c0
    handle_pte_fault+0x97/0xae0
    handle_mm_fault+0x289/0x350
    __do_page_fault+0x18e/0x530
    do_page_fault+0x2b/0x50
    page_fault+0x28/0x30
    tracesys+0xe1/0xe6

Thanks to Johannes for pointing to truncation: free_swap_and_cache()
only does a trylock on the page, so the page lock we've held since
before confirming swap is not enough to protect against truncation.

What cleanup is needed in this case? Just delete_from_swap_cache(),
which takes care of the memcg uncharge.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Dave Jones <davej@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Will Deacon 498c228021 mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address
kmap_to_page returns the corresponding struct page for a virtual address
of an arbitrary mapping.  This works by checking whether the address
falls in the pkmap region and using the pkmap page tables instead of the
linear mapping if appropriate.

Unfortunately, the bounds checking means that PKMAP_ADDR(LAST_PKMAP) is
incorrectly treated as a highmem address and we can end up walking off
the end of pkmap_page_table and subsequently passing junk to pte_page.

This patch fixes the bound check to stay within the pkmap tables.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Mel Gorman 96710098ee mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"
Jiri Slaby reported the following:

	(It's an effective revert of "mm: vmscan: scale number of pages
	reclaimed by reclaim/compaction based on failures".) Given kswapd
	had hours of runtime in ps/top output yesterday in the morning
	and after the revert it's now 2 minutes in sum for the last 24h,
	I would say, it's gone.

The intention of the patch in question was to compensate for the loss of
lumpy reclaim.  Part of the reason lumpy reclaim worked is because it
aggressively reclaimed pages and this patch was meant to be a sane
compromise.

When compaction fails, it gets deferred and both compaction and
reclaim/compaction is deferred avoid excessive reclaim.  However, since
commit c654345924 ("mm: remove __GFP_NO_KSWAPD"), kswapd is woken up
each time and continues reclaiming which was not taken into account when
the patch was developed.

Attempts to address the problem ended up just changing the shape of the
problem instead of fixing it.  The release window gets closer and while
a THP allocation failing is not a major problem, kswapd chewing up a lot
of CPU is.

This patch reverts commit 83fde0f228 ("mm: vmscan: scale number of
pages reclaimed by reclaim/compaction based on failures") and will be
revisited in the future.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Zdenek Kabelac <zkabelac@redhat.com>
Tested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Randy Dunlap 2ca3cb50ed rapidio: fix kernel-doc warnings
Fix rapidio kernel-doc warnings:

  Warning(drivers/rapidio/rio.c:415): No description found for parameter 'local'
  Warning(drivers/rapidio/rio.c:415): Excess function parameter 'lstart' description in 'rio_map_inb_region'
  Warning(include/linux/rio.h:290): No description found for parameter 'switches'
  Warning(include/linux/rio.h:290): No description found for parameter 'destid_table'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Acked-by: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Xiaotian Feng f58b59c1df swapfile: fix name leak in swapoff
There's a name leak introduced by commit 91a27b2a75 ("vfs: define
struct filename and have getname() return it").  Add the missing
putname.

[akpm@linux-foundation.org: cleanup]
Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Hugh Dickins bea8c150a7 memcg: fix hotplugged memory zone oops
When MEMCG is configured on (even when it's disabled by boot option),
when adding or removing a page to/from its lru list, the zone pointer
used for stats updates is nowadays taken from the struct lruvec.  (On
many configurations, calculating zone from page is slower.)

But we have no code to update all the lruvecs (per zone, per memcg) when
a memory node is hotadded.  Here's an extract from the oops which
results when running numactl to bind a program to a newly onlined node:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000f60
  IP:  __mod_zone_page_state+0x9/0x60
  Pid: 1219, comm: numactl Not tainted 3.6.0-rc5+ #180 Bochs Bochs
  Process numactl (pid: 1219, threadinfo ffff880039abc000, task ffff8800383c4ce0)
  Call Trace:
    __pagevec_lru_add_fn+0xdf/0x140
    pagevec_lru_move_fn+0xb1/0x100
    __pagevec_lru_add+0x1c/0x30
    lru_add_drain_cpu+0xa3/0x130
    lru_add_drain+0x2f/0x40
   ...

The natural solution might be to use a memcg callback whenever memory is
hotadded; but that solution has not been scoped out, and it happens that
we do have an easy location at which to update lruvec->zone.  The lruvec
pointer is discovered either by mem_cgroup_zone_lruvec() or by
mem_cgroup_page_lruvec(), and both of those do know the right zone.

So check and set lruvec->zone in those; and remove the inadequate
attempt to set lruvec->zone from lruvec_init(), which is called before
NODE_DATA(node) has been allocated in such cases.

Ah, there was one exceptionr.  For no particularly good reason,
mem_cgroup_force_empty_list() has its own code for deciding lruvec.
Change it to use the standard mem_cgroup_zone_lruvec() and
mem_cgroup_get_lru_size() too.  In fact it was already safe against such
an oops (the lru lists in danger could only be empty), but we're better
proofed against future changes this way.

I've marked this for stable (3.6) since we introduced the problem in 3.5
(now closed to stable); but I have no idea if this is the only fix
needed to get memory hotadd working with memcg in 3.6, and received no
answer when I enquired twice before.

Reported-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
David Rientjes 18f694271b mips, arc: fix build failure
Using a cross-compiler to fix another issue, the following build error
occurred for mips defconfig:

  arch/mips/fw/arc/misc.c: In function 'ArcHalt':
  arch/mips/fw/arc/misc.c:25:2: error: implicit declaration of function 'local_irq_disable'

Fix it up by including irqflags.h.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00