Commit Graph

344351 Commits

Author SHA1 Message Date
Eric Dumazet 499744209b tuntap: dont use skb after netif_rx_ni(skb)
On Wed, 2012-12-12 at 23:16 -0500, Dave Jones wrote:
> Since todays net merge, I see this when I start openvpn..
>
> general protection fault: 0000 [#1] PREEMPT SMP
> Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables xfs iTCO_wdt iTCO_vendor_support snd_emu10k1 snd_util_mem snd_ac97_codec coretemp ac97_bus microcode snd_hwdep snd_seq pcspkr snd_pcm snd_page_alloc snd_timer lpc_ich i2c_i801 snd_rawmidi mfd_core snd_seq_device snd e1000e soundcore emu10k1_gp gameport i82975x_edac edac_core vhost_net tun macvtap macvlan kvm_intel kvm binfmt_misc nfsd auth_rpcgss nfs_acl lockd sunrpc btrfs libcrc32c zlib_deflate firewire_ohci sata_sil firewire_core crc_itu_t radeon i2c_algo_bit drm_kms_helper ttm drm i2c_core floppy
> CPU 0
> Pid: 1381, comm: openvpn Not tainted 3.7.0+ #14                  /D975XBX
> RIP: 0010:[<ffffffff815b54a4>]  [<ffffffff815b54a4>] skb_flow_dissect+0x314/0x3e0
> RSP: 0018:ffff88007d0d9c48  EFLAGS: 00010206
> RAX: 000000000000055d RBX: 6b6b6b6b6b6b6b4b RCX: 1471030a0180040a
> RDX: 0000000000000005 RSI: 00000000ffffffe0 RDI: ffff8800ba83fa80
> RBP: ffff88007d0d9cb8 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000101 R12: ffff8800ba83fa80
> R13: 0000000000000008 R14: ffff88007d0d9cc8 R15: ffff8800ba83fa80
> FS:  00007f6637104800(0000) GS:ffff8800bf600000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f563f5b01c4 CR3: 000000007d140000 CR4: 00000000000007f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process openvpn (pid: 1381, threadinfo ffff88007d0d8000, task ffff8800a540cd60)
> Stack:
>  ffff8800ba83fa80 0000000000000296 0000000000000000 0000000000000000
>  ffff88007d0d9cc8 ffffffff815bcff4 ffff88007d0d9ce8 ffffffff815b1831
>  ffff88007d0d9ca8 00000000703f6364 ffff8800ba83fa80 0000000000000000
> Call Trace:
>  [<ffffffff815bcff4>] ? netif_rx+0x114/0x4c0
>  [<ffffffff815b1831>] ? skb_copy_datagram_from_iovec+0x61/0x290
>  [<ffffffff815b672a>] __skb_get_rxhash+0x1a/0xd0
>  [<ffffffffa03b9538>] tun_get_user+0x418/0x810 [tun]
>  [<ffffffff8135f468>] ? delay_tsc+0x98/0xf0
>  [<ffffffff8109605c>] ? __rcu_read_unlock+0x5c/0xa0
>  [<ffffffffa03b9a41>] tun_chr_aio_write+0x81/0xb0 [tun]
>  [<ffffffff81145011>] ? __buffer_unlock_commit+0x41/0x50
>  [<ffffffff811db917>] do_sync_write+0xa7/0xe0
>  [<ffffffff811dc01f>] vfs_write+0xaf/0x190
>  [<ffffffff811dc375>] sys_write+0x55/0xa0
>  [<ffffffff81705540>] tracesys+0xdd/0xe2
> Code: 41 8b 44 24 68 41 2b 44 24 6c 01 de 29 f0 83 f8 03 0f 8e a0 00 00 00 48 63 de 49 03 9c 24 e0 00 00 00 48 85 db 0f 84 72 fe ff ff <8b> 03 41 89 46 08 b8 01 00 00 00 e9 43 fd ff ff 0f 1f 40 00 48
> RIP  [<ffffffff815b54a4>] skb_flow_dissect+0x314/0x3e0
>  RSP <ffff88007d0d9c48>
> ---[ end trace 6d42c834c72c002e ]---
>
>
> Faulting instruction is
>
>    0:	8b 03                	mov    (%rbx),%eax
>
> rbx is slab poison (-20) so this looks like a use-after-free here...
>
>                         flow->ports = *ports;
>  314:   8b 03                   mov    (%rbx),%eax
>  316:   41 89 46 08             mov    %eax,0x8(%r14)
>
> in the inlined skb_header_pointer in skb_flow_dissect
>
> 	Dave
>

commit 96442e4242 (tuntap: choose the txq based on rxq) added
a use after free.

Cache rxhash in a temp variable before calling netif_rx_ni()

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jason Wang <jasowang@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-13 12:58:11 -05:00
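The pattern behind the tuntap fix above can be sketched in user-space C. This is not the tun driver code: `struct packet`, `consume_packet()` and `submit_and_get_hash()` are hypothetical stand-ins for the skb, `netif_rx_ni()` and its caller. The sketch only assumes that the consuming function takes ownership of the object and may free it.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb: a heap object carrying a hash. */
struct packet {
	uint32_t rxhash;
};

/* Hypothetical stand-in for netif_rx_ni(): takes ownership and may free. */
static void consume_packet(struct packet *pkt)
{
	free(pkt);
}

/*
 * Read every field that is needed later *before* handing the packet over.
 * After consume_packet() returns, pkt must not be dereferenced.
 */
static uint32_t submit_and_get_hash(struct packet *pkt)
{
	uint32_t rxhash = pkt->rxhash;	/* cache while we still own pkt */

	consume_packet(pkt);
	return rxhash;			/* safe: uses the cached copy only */
}
```

Reading `pkt->rxhash` after `consume_packet()` would be the user-space analogue of the use after free reported in the oops.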
Dave Jones 026e43def7 nfc: remove noisy message from llcp_sock_sendmsg
This is easily triggerable when fuzz-testing as an unprivileged user.
We could rate-limit it, but given we don't print similar messages
for other protocols, I just removed it.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-13 12:58:10 -05:00
Ralf Baechle bdf20507da MIPS: PMC-Sierra Yosemite: Remove support.
Nobody seems to be interested anymore and upstream also never had an
ethernet driver.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:30 +01:00
Thomas Bogendoerfer fa4dbbc602 VIDEO: Newport Fix console crashes
Because of commit e84de0c619 [MIPS: GIO bus
support for SGI IP22/28] newport con is now taking over console from
dummy con, therefore it's necessary to resize the VC to the correct size
to avoid crashes and garbage on console

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: linux-mips@linux-mips.org
Cc: linux-fbdev@vger.kernel.org
Cc: FlorianSchandinat@gmx.de
Patchwork: https://patchwork.linux-mips.org/patch/4138/
Acked-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:29 +01:00
Ralf Baechle 5613d48239 MIPS: wrppmc: Fix build of PCI code.
CC      arch/mips/wrppmc/pci.o
/home/ralf/src/linux/linux-mips/arch/mips/wrppmc/pci.c: In function ‘gt64120_pci_init’:
/home/ralf/src/linux/linux-mips/arch/mips/wrppmc/pci.c:41:6: error: variable ‘tmp’ set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors

This warning exists in gcc 4.6.0 and newer.  Kernels 2.6.40 and newer use
-Wno-unused-but-set-variable to suppress it.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:29 +01:00
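For readers unfamiliar with this diagnostic, here is a minimal sketch of what triggers it and one common way to fix it. `read_status_reg()`, `reg_reads` and `ack_irq()` are hypothetical names standing in for a register read that is done only for its side effect; none of them is taken from arch/mips/wrppmc/pci.c.

```c
/* Hypothetical register read with a side effect (counts accesses). */
static unsigned int reg_reads;

static unsigned int read_status_reg(void)
{
	reg_reads++;
	return 0x1234;
}

static void ack_irq(void)
{
	/*
	 * Before: "unsigned int tmp; tmp = read_status_reg();" -- tmp was
	 * set but never read, which gcc >= 4.6 reports under -Wall.
	 * After: keep the read, discard the value explicitly.
	 */
	(void)read_status_reg();
}
```

Whether to discard the value or delete the read entirely depends on whether the access itself matters to the hardware.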
Ralf Baechle b2f711d485 MIPS: IP22/IP28: Fix build of EISA code.
CC      arch/mips/sgi-ip22/ip22-eisa.o
/home/ralf/src/linux/linux-mips/arch/mips/sgi-ip22/ip22-eisa.c: In function ‘ip22_eisa_intr’:
/home/ralf/src/linux/linux-mips/arch/mips/sgi-ip22/ip22-eisa.c:77:11: error: variable ‘dma2’ set but not used [-Werror=unused-but-set-variable]
/home/ralf/src/linux/linux-mips/arch/mips/sgi-ip22/ip22-eisa.c:77:5: error: variable ‘dma1’ set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors

This warning exists in gcc 4.6.0 and newer.  Kernels 2.6.40 and newer use
-Wno-unused-but-set-variable to suppress it.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:28 +01:00
Ralf Baechle 66315e15cd MIPS: RB532: Fix build of prom code.
CC      arch/mips/rb532/prom.o
/home/ralf/src/linux/linux-mips/arch/mips/rb532/prom.c: In function ‘prom_setup_cmdline’:
/home/ralf/src/linux/linux-mips/arch/mips/rb532/prom.c:75:22: error: variable ‘prom_envp’ set but not used [-Werror=unused-but-set-variable]

This warning exists in gcc 4.6.0 and newer.  Kernels 2.6.40 and newer use
-Wno-unused-but-set-variable to suppress it.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:28 +01:00
Ralf Baechle ae1242a546 MIPS: PowerTV: Fix build.
CC      arch/mips/powertv/init.o
/home/ralf/src/linux/linux-mips/arch/mips/powertv/init.c: In function ‘mips_nmi_setup’:
/home/ralf/src/linux/linux-mips/arch/mips/powertv/init.c:80:8: error: variable ‘base’ set but not used [-Werror=unused-but-set-variable]
/home/ralf/src/linux/linux-mips/arch/mips/powertv/init.c: In function ‘mips_ejtag_setup’:
/home/ralf/src/linux/linux-mips/arch/mips/powertv/init.c:94:8: error: variable ‘base’ set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors

As these two functions are, they don't serve any useful purpose so I've
deleted them entirely.

This warning exists in gcc 4.6.0 and newer.  Kernels 2.6.40 and newer use
-Wno-unused-but-set-variable to suppress it.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:28 +01:00
Dave Jones 686957e71d MIPS: IP27: Correct fucked grammar in ops-bridge.c
I had no idea just how broken IOC3 was until I read this.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:28 +01:00
Ralf Baechle b99fbc10df MIPS: Highmem: Fix build error if CONFIG_DEBUG_HIGHMEM is disabled
CC      arch/mips/mm/highmem.o
/home/ralf/src/linux/linux-mips/arch/mips/mm/highmem.c: In function ‘__kunmap_atomic’:
/home/ralf/src/linux/linux-mips/arch/mips/mm/highmem.c:70:6: error: variable ‘type’ set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors

This warning exists in gcc 4.6.0 and newer.  Kernels 2.6.40 and newer use
-Wno-unused-but-set-variable to suppress it.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:27 +01:00
Ralf Baechle a16dad7763 MIPS: Fix potential corruption
Normally r4k_dma_cache_inv should only ever be called with cacheline
aligned addresses.  If, however, it isn't, there is the theoretical
possibility of data corruption.  There is no correct way of handling this
and anyway, it should only happen if the DMA API is used incorrectly
so drop

There is a different corruption scenario with these CACHE instructions
removed but again there is no way of handling this correctly and it can
be triggered only through incorrect use of the DMA API.

So just get rid of the complexity.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: James Rodriguez <jamesr@juniper.net>
2012-12-13 18:15:27 +01:00
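What "cacheline aligned" means for a DMA buffer in the commit above can be illustrated with a small check. This is only a sketch: `CACHE_LINE_SIZE` and `dma_range_is_line_aligned()` are hypothetical names, and the 32-byte line size is an assumption (real hardware reports its own value).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical line size; must be a power of two for the mask trick. */
#define CACHE_LINE_SIZE 32u

/* True if [addr, addr + size) starts and ends on cache line boundaries. */
static bool dma_range_is_line_aligned(uintptr_t addr, size_t size)
{
	uintptr_t mask = CACHE_LINE_SIZE - 1;

	/* Aligned start plus a whole number of lines implies an aligned end. */
	return ((addr | size) & mask) == 0;
}
```

A range that fails this check shares a cache line with unrelated data, which is the situation the commit message describes as incorrect use of the DMA API.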
Ralf Baechle 51d943f07d MIPS: Fix for warning from FPU emulation code
The default implementation of 'cpu_has_fpu' macro calls
smp_processor_id() which causes this warning to be printed when
preemption is enabled:

[    4.664000] Algorithmics/MIPS FPU Emulator v1.5
[    4.676000] BUG: using smp_processor_id() in preemptible [00000000] code: ini
[    4.700000] caller is fpu_emulator_cop1Handler+0x434/0x27b8

This problem got introduced in November 2009 by
af1d2af877ef6c36990671bc86a5b9c5bb50b1da (lmo) [MIPS: Fix emulation of
64-bit FPU on 64-bit CPUs.] rsp.  da0bac3341
(kernel.org) [MIPS: Fix emulation of 64-bit FPU on FPU-less 64-bit CPUs.]
in 2.6.32.

Fixed by rewriting cop1_64bit() to return a constant whenever possible
but most importantly avoid the use of cpu_has_fpu entirely.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Jayachandran C <jchandra@broadcom.com>
Initial-patch-by: Jayachandran C <jchandra@broadcom.com>
Patchwork: https://patchwork.linux-mips.org/patch/4225/
2012-12-13 18:15:27 +01:00
Maciej W. Rozycki 051ff44a8b MIPS: Handle COP3 Unusable exception as COP1X for FP emulation
Our FP emulator is hardcoded for the MIPS IV FP instruction set and does
not match the FP ISA with the general ISA.  However for the few MIPS IV FP
instructions that use the COP1X major opcode it relies on the Coprocessor
Unusable exception to be delivered as a COP1 rather than COP3 exception.
This includes indexed transfer (LDXC1, etc.) and FP multiply-accumulate
(MADD.D, etc.) instructions.

 All the MIPS I, II, III and IV processors and some newer chips that do not
implement the FPU use the COP3 exception however.  Therefore I believe the
kernel should follow and redirect any COP3 Unusable traps to the emulator
unless an actual FPU part or core is present.

 This is a change that implements it.  Any minor opcode encodings that are
not recognised as valid FP instructions are rejected by the emulator and
will result in a SIGILL signal being delivered as they currently do.  We
do not support vendor-specific coprocessor 3 implementations supported
with MIPS I and MIPS II ISA processors; we never set CP0.Status.CU3.

[Ralf: On MIPS IV processors the kernel always enables the XX bit which
replaces the CU3 bit off earlier architecture revisions.]

 If matching between the CPU and the FPU ISA is considered required one
day, this can still be done in the emulator itself.  I think the CpU
exception dispatcher is not the right place to do this anyway, as there
are further differences between MIPS I, MIPS II, MIPS III, MIPS IV and
MIPS32 FP ISAs.

 Corresponding explanation of this implementation is included within the
change itself.

Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/project/linux-mips/list/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:27 +01:00
Huacai Chen 8add1ecb81 MIPS: Fix poweroff failure when HOTPLUG_CPU configured.
When powering off the machine, kernel_power_off() calls
disable_nonboot_cpus().  If HOTPLUG_CPU is configured,
disable_nonboot_cpus() is not an empty function but attempts to actually
disable the nonboot CPUs.  Since the system state is SYSTEM_POWER_OFF,
play_dead() won't be called and thus disable_nonboot_cpus() hangs.  This
patch avoids the poweroff failure.

Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Hongliang Tao <taohl@lemote.com>
Signed-off-by: Hua Yan <yanh@lemote.com>
Cc: Yong Zhang <yong.zhang@windriver.com>
Cc: stable@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/4211/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:26 +01:00
Florian Fainelli b88fb18e7e MIPS: MT: Fix build with CONFIG_UIDGID_STRICT_TYPE_CHECKS=y
When CONFIG_UIDGID_STRICT_TYPE_CHECKS is enabled, plain integer comparison
between different uids/gids is explicitly turned into a build failure
by making the k{uid,gid}_t types a structure containing a value:

arch/mips/kernel/mips-mt-fpaff.c: In function 'check_same_owner':
arch/mips/kernel/mips-mt-fpaff.c:53:22: error: invalid operands to
binary == (have 'kuid_t' and 'kuid_t')
arch/mips/kernel/mips-mt-fpaff.c:54:15: error: invalid operands to
binary == (have 'kuid_t' and 'kuid_t')

In order to ensure proper comparison between uids, use the helper
function uid_eq(), which does the right thing whether this config
option is turned on or off.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Patchwork: https://patchwork.linux-mips.org/patch/4717/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:26 +01:00
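The effect described in the commit above can be reproduced in user space with a simplified mock. The `kuid_t` and `uid_eq()` definitions below are stand-ins written for this sketch, not copies of the kernel headers, and `same_owner()` is a hypothetical caller loosely modelled on check_same_owner().

```c
#include <stdbool.h>

/* Simplified user-space mock of the strict-type-check variant of kuid_t. */
typedef struct {
	unsigned int val;
} kuid_t;

/* Mock of uid_eq(): compare the wrapped values. */
static inline bool uid_eq(kuid_t left, kuid_t right)
{
	return left.val == right.val;
}

/* Hypothetical caller modelled on check_same_owner(). */
static bool same_owner(kuid_t euid, kuid_t task_euid, kuid_t task_uid)
{
	/* "euid == task_euid" would not compile: == is not defined for structs. */
	return uid_eq(euid, task_euid) || uid_eq(euid, task_uid);
}
```

Wrapping the integer in a struct is what turns an accidental mix of uid and gid values, or of uids and plain integers, into a compile-time error.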
Paul Bolle a685bc3dab MIPS: Remove unused smvp.h
This header was added in commit 39b8d52542
(kernel.org) / b6e90cd0ae7a556080d9ea2ec1b8f6d9accad9d4 (lmo) [[MIPS] Add
support for MIPS CMP platform.].  None of the functions it declared were
ever included in the tree. Commit cb7f39d2bc
(kernel.org) / b6e90cd0ae7a556080d9ea2ec1b8f6d9accad9d4 (lmo) [[MIPS] Remove
unused maltasmp.h.] removed the sole file that included it because that
file was itself unused.

[ralf@linux-mips.org: The whole mess happened because somebody at MIPS
thought it was a good idea to rename VSMP ("Virtual SMP") to SMVP.  Which
is an IBMesque ETLA in contrast to VSMP, so public kernels as opposed to
MTI's inhouse kernels never followed suit.]

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/3950/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:26 +01:00
David Daney e1ced09797 MIPS/EDAC: Improve OCTEON EDAC support.
Some initialization errors are reported with the existing OCTEON EDAC
support patch.  Also some parts have more than one memory controller.

Fix the errors and add multiple controllers if present.

Signed-off-by: David Daney <david.daney@cavium.com>
2012-12-13 18:15:26 +01:00
David Daney abe105a4d8 MIPS: OCTEON: Add definitions for OCTEON memory controller registers.
Signed-off-by: David Daney <david.daney@cavium.com>
2012-12-13 18:15:25 +01:00
David Daney 6bbf6a6d48 MIPS: OCTEON: Add OCTEON family definitions to octeon-model.h
Used by follow-on EDAC patches.

Signed-off-by: David Daney <david.daney@cavium.com>
2012-12-13 18:15:25 +01:00
David Daney 1007c4bc0f ata: pata_octeon_cf: Use correct byte order for DMA in when built little-endian.
We need to set the 'endian' bit in this case.

Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David Daney <david.daney@cavium.com>
2012-12-13 18:15:25 +01:00
David Daney 43f01da0f2 MIPS/OCTEON/ata: Convert pata_octeon_cf.c to use device tree.
The patch needs to eliminate the definition of OCTEON_IRQ_BOOTDMA so
that the device tree code can map the interrupt, so in order to not
temporarily break things, we do a single patch to both the interrupt
registration code and the pata_octeon_cf driver.

Also rolled in is a conversion to use hrtimers and corrections to the
timing calculations.

Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David Daney <david.daney@cavium.com>
2012-12-13 18:15:24 +01:00
Ralf Baechle f772cdb2bd MIPS: Remove usage of CEVT_R4K_LIB config option.
Manuel Lauss <manuel.lauss@gmail.com> writes:

I introduced it as a fallback because early revisions of Alchemy hardware
we shipped had a non-functional 32kHz timer and had to rely on the r4k
timer instead.  Previously the r4k timer was initialized regardless, but
it's useless with the "wait" instruction.

So long story short:   I need either the on-chip 32kHz timer OR the r4k
timer if the 32kHz one is unusable, but not both, and r4k timer is useless
when au1k_idle is in use.

The current in-kernel Alchemy boards all work with the 32kHz timer, so I'm
not against removing R4K_LIB symbols.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:24 +01:00
Steven J. Hill d7ea335c05 MIPS: Remove usage of CSRC_R4K_LIB config option.
Manuel Lauss <manuel.lauss@gmail.com> writes:

I introduced it as a fallback because early revisions of Alchemy hardware
we shipped had a non-functional 32kHz timer and had to rely on the r4k
timer instead.  Previously the r4k timer was initialized regardless, but
it's useless with the "wait" instruction.

So long story short:   I need either the on-chip 32kHz timer OR the r4k
timer if the 32kHz one is unusable, but not both, and r4k timer is useless
when au1k_idle is in use.

The current in-kernel Alchemy boards all work with the 32kHz timer, so I'm
not against removing R4K_LIB symbols.

Signed-off-by: Steven J. Hill <sjhill@mips.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:24 +01:00
Florian Fainelli dcb96a4e36 MIPS: AR7: use part_probe_types to specify the partition parser to use
This patch changes the physmap-flash platform data on AR7 to pass the
correct partition parser, ar7part, to be used by the "physmap-flash" mapping
driver so we get the partitions probed correctly.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Cc: blogic@openwrt.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/4654/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:23 +01:00
Masanari Iida d08be0dbe8 MIPS: Lantiq: Fix typo in "endianness" in dma.c
Correct spelling typo ENDIANESS to ENDIANNESS in arch/mips/lantiq/xway/dma.c

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Cc: trivial@kernel.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/4613/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 18:15:23 +01:00
Ralf Baechle 0e2794b0b7 MIPS: Kconfig: Rename several firmware related config symbols.
With the upcoming merge of the ARC architecture there is a small likelihood
of conflicting use for the CONFIG_ARC config symbol.  Rename it to
CONFIG_FW_ARC.  Also rename CONFIG_ARC32 to CONFIG_FW_ARC32, CONFIG_ARC64
to CONFIG_FW_ARC64.

For consistency also rename CONFIG_SNIPROM to CONFIG_FW_SNIPROM and
CONFIG_CFE to CONFIG_FW_CFE.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 17:02:14 +01:00
Ralf Baechle abe77f90dc MIPS: Octeon: Add kexec and kdump support
[ralf@linux-mips.org: Original patch by Maxim Uvarov <muvarov@gmail.com>
with plenty of further shining, polishing, debugging and testing by me.]

Signed-off-by: Maxim Uvarov <muvarov@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: kexec@lists.infradead.org
Cc: horms@verge.net.au
Patchwork: https://patchwork.linux-mips.org/patch/1026/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 17:00:39 +01:00
Ralf Baechle 7aa1c8f47e MIPS: kdump: Add support
[ralf@linux-mips.org: Original patch by Maxim Uvarov <muvarov@gmail.com>
with plenty of further shining, polishing, debugging and testing by me.]

Signed-off-by: Maxim Uvarov <muvarov@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: kexec@lists.infradead.org
Cc: horms@verge.net.au
Patchwork: https://patchwork.linux-mips.org/patch/1025/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-13 16:46:47 +01:00
Linus Torvalds 6be35c700f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking changes from David Miller:

1) Allow to dump, monitor, and change the bridge multicast database
   using netlink.  From Cong Wang.

2) RFC 5961 TCP blind data injection attack mitigation, from Eric
   Dumazet.

3) Networking user namespace support from Eric W. Biederman.

4) tuntap/virtio-net multiqueue support by Jason Wang.

5) Support for checksum offload of encapsulated packets (basically,
   tunneled traffic can still be checksummed by HW).  From Joseph
   Gasparakis.

6) Allow BPF filter access to VLAN tags, from Eric Dumazet and
   Daniel Borkmann.

7) Bridge port parameters over netlink and BPDU blocking support
   from Stephen Hemminger.

8) Improve data access patterns during inet socket demux by rearranging
   socket layout, from Eric Dumazet.

9) TIPC protocol updates and cleanups from Ying Xue, Paul Gortmaker, and
   Jon Maloy.

10) Update TCP socket hash sizing to be more in line with current day
    realities.  The existing heuristics were chosen a decade ago.
    From Eric Dumazet.

11) Fix races, queue bloat, and excessive wakeups in ATM and
    associated drivers, from Krzysztof Mazur and David Woodhouse.

12) Support DOVE (Distributed Overlay Virtual Ethernet) extensions
    in VXLAN driver, from David Stevens.

13) Add "oops_only" mode to netconsole, from Amerigo Wang.

14) Support set and query of VEB/VEPA bridge mode via PF_BRIDGE, also
    allow DCB netlink to work on namespaces other than the initial
    namespace.  From John Fastabend.

15) Support PTP in the Tigon3 driver, from Matt Carlson.

16) tun/vhost zero copy fixes and improvements, plus turn it on
    by default, from Michael S. Tsirkin.

17) Support per-association statistics in SCTP, from Michele
    Baldessari.

And many, many, driver updates, cleanups, and improvements.  Too
numerous to mention individually.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
  net/mlx4_en: Add support for destination MAC in steering rules
  net/mlx4_en: Use generic etherdevice.h functions.
  net: ethtool: Add destination MAC address to flow steering API
  bridge: add support of adding and deleting mdb entries
  bridge: notify mdb changes via netlink
  ndisc: Unexport ndisc_{build,send}_skb().
  uapi: add missing netconf.h to export list
  pkt_sched: avoid requeues if possible
  solos-pci: fix double-free of TX skb in DMA mode
  bnx2: Fix accidental reversions.
  bna: Driver Version Updated to 3.1.2.1
  bna: Firmware update
  bna: Add RX State
  bna: Rx Page Based Allocation
  bna: TX Intr Coalescing Fix
  bna: Tx and Rx Optimizations
  bna: Code Cleanup and Enhancements
  ath9k: check pdata variable before dereferencing it
  ath5k: RX timestamp is reported at end of frame
  ath9k_htc: RX timestamp is reported at end of frame
  ...
2012-12-12 18:07:07 -08:00
Linus Torvalds e37aa63e87 MN10300 changes 2012-12-12
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIVAwUAUMi2WhOxKuMESys7AQK2bxAApJQL2x6/k4swH933rhdVooA2TiMVST3l
 XSy6yil6Qeqz82RDnVMfxQ069N8iP5x93fE918V6UzeIrUmKEL8xD2UJCZzjW6B9
 vmBNrD6VUGdiBhTcGY7er4EtlnRf1XJgUPmfdIEAJoZ8VMkKyYAGkckW2I8hiYbZ
 gyF+ONc+CHxspqS1CzNUmmbP84T6rij2fydqLaSNNnQYnEfICt7dciv73KBQYMtn
 AsCLcmWW4DkZ37VL6Bg8yvgRaxbNlZpS0Rl5oKS65rYX9azt/SvujSta0UEv+uYF
 m/2HqExwgo8HZHKyIEpRgBLqfOfekJATbSLEq3jEgA73MLdzw2DTgpJQOmWCjtjN
 7bROv2O57e8ttxb81x10YyInzOTYOd18XEb2Qa6O4wbB5TS8MxZywfuTfL+sdfsN
 pquqyKNgxD7HqqxIcWSNKGxkPPZ/Xk/JmgcQFVCjpvvdCizsFTwWeiAd81Jz0Dn+
 SLL345nlDJPVukgIiDiwm9UvkyG0Pg03K5k6+7QOWB/5AdPqgRUeOi6gqQE7ZQ9G
 GK8/2xX4xFJ8LLPqfh2X+1PUesa8Dhph4NorsW4comJtPcLuh30XbwIKTjpBF90y
 7OILeZeQ+qFu8S9lLSQOr6zxs3/9uKP8ADoOAnFmUEE2PkzvZqMrnrlezAyqtNht
 LkVa/IR/z50=
 =ex0l
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20121212' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-mn10300

Pull MN10300 changes from David Howells:
 "miscellaneous MN10300 arch patches.  I've based it on top of Al Viro's
  signal tree - so these patches should be pulled after that."

* tag 'for-linus-20121212' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-mn10300:
  MN10300: Use asm-generic/pci_iomap.h
  MN10300: Get rid of unused variable from ASB2305 PCI code
  MN10300: ASB2305 PCI code needs linux/irq.h
  mn10300/mm/fault.c: Port OOM changes to do_page_fault
  MN10300: Handle cacheable PCI regions in pci_iomap()
  MN10300: fix debug polling in ttySM driver
  MN10300: ttySM: clean up unnecessary casting
  MN10300: fix SMP synchronization between txdma and serial driver
  MN10300: fix serial port vdma irq setup for SMP
  MN10300: cleanup IRQ affinity setting
  MN10300: ttySM: Use memory barriers correctly in circular buffer logic
2012-12-12 17:50:34 -08:00
Lin Feng 98870901cc mm/bootmem.c: remove unused wrapper function reserve_bootmem_generic()
reserve_bootmem_generic() has no caller, so remove it.

Signed-off-by: Lin Feng <linfeng@cn.fujitsu.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:35 -08:00
Dominik Dingel 66521d5aa6 mm/memory.c: remove unused code from do_wp_page()
page_mkwrite is initialized with zero and only set once; from that point
there is no way to get to the oom or oom_free_new labels.

[akpm@linux-foundation.org: cleanup]
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:35 -08:00
Kirill A. Shutemov 816422ad76 asm-generic, mm: pgtable: consolidate zero page helpers
We have two different implementations of is_zero_pfn() and my_zero_pfn()
helpers: for architectures with and without zero page coloring.

Let's consolidate them in <asm-generic/pgtable.h>.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:35 -08:00
Naoya Horiguchi 56f2fb1476 mm/hugetlb.c: fix warning on freeing hwpoisoned hugepage
Fix the warning from __list_del_entry() which is triggered when a process
tries to do free_huge_page() for a hwpoisoned hugepage.

free_huge_page() can be called for hwpoisoned hugepage from
unpoison_memory().  This function gets refcount once and clears
PageHWPoison, and then puts refcount twice to return the hugepage back to
free pool.  The second put_page() finally reaches free_huge_page().

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:35 -08:00
Naoya Horiguchi 5f24ae585b hwpoison, hugetlbfs: fix RSS-counter warning
Memory error handling on hugepages can break a RSS counter, which emits a
message like "Bad rss-counter state mm:ffff88040abecac0 idx:1 val:-1".
This is because PageAnon returns true for hugepage (this behavior is
necessary for reverse mapping to work on hugetlbfs).

[akpm@linux-foundation.org: clean up code layout]
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:35 -08:00
Naoya Horiguchi 8c4894c6bc hwpoison, hugetlbfs: fix "bad pmd" warning in unmapping hwpoisoned hugepage
When a process which used a hwpoisoned hugepage tries to exit() or
munmap(), the kernel can print out "bad pmd" message because page table
walker in free_pgtables() encounters 'hwpoisoned entry' on pmd.

This is because currently we fail to clear the hwpoisoned entry in
__unmap_hugepage_range(), so this patch simply does it.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:35 -08:00
Michel Lespinasse 4128997b5f mm: protect against concurrent vma expansion
expand_stack() runs with a shared mmap_sem lock.  Because of this, there
could be multiple concurrent stack expansions in the same mm, which may
cause problems in the vma gap update code.

I propose to solve this by taking the mm->page_table_lock around such vma
expansions, in order to avoid the concurrency issue.  We only have to
worry about concurrent expand_stack() calls here, since we hold a shared
mmap_sem lock and all vma modifications other than expand_stack() are done
under an exclusive mmap_sem lock.

I previously tried to achieve the same effect by making sure all growable
vmas in a given mm would share the same anon_vma, which we already lock
here.  However this turned out to be difficult - all of the schemes I
tried for refcounting the growable anon_vma and clearing turned out ugly.
So, I'm now proposing only the minimal fix.

The overhead of taking the page table lock during stack expansion is
expected to be small: glibc doesn't use expandable stacks for the threads
it creates, so having multiple growable stacks is actually uncommon and we
don't expect the page table lock to get bounced between threads.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:35 -08:00
Michal Hocko c95d26c2ff memcg: do not check for mm in __mem_cgroup_count_vm_event
The mm given to __mem_cgroup_count_vm_event() cannot be NULL because the
function is either called from the page fault path or vma->vm_mm is used.
So the check can be dropped.

The check was introduced by commit 456f998ec8 ("memcg: add the
pagefault count into memcg stats") because the originally proposed patch
used current->mm for shmem but this has been changed to vma->vm_mm later
on without the check being removed (thanks to Hugh for this
recollection).

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ying Han <yinghan@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:35 -08:00
Hugh Dickins 220f2ac913 tmpfs: support SEEK_DATA and SEEK_HOLE (reprise)
Revert 3.5's commit f21f806220 ("tmpfs: revert SEEK_DATA and
SEEK_HOLE") to reinstate 4fb5ef089b ("tmpfs: support SEEK_DATA and
SEEK_HOLE"), with the intervening additional arg to
generic_file_llseek_size().

In 3.8, ext4 is expected to join btrfs, ocfs2 and xfs with proper
SEEK_DATA and SEEK_HOLE support; and a good case has now been made for
it on tmpfs, so let's join the party.

It's quite easy for tmpfs to scan the radix_tree to support llseek's new
SEEK_DATA and SEEK_HOLE options: so add them while the minutiae are
still on my mind (in particular, the !PageUptodate-ness of pages
fallocated but still unwritten).
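
As a rough userspace illustration of that scan (not the tmpfs code
itself), the sketch below walks a per-page state array instead of the
radix_tree.  All names (sketch_seek, struct sketch_page, example_pages)
and the fixed 4096-byte page size are assumptions made for the example.
A page that is present but not uptodate is treated as a hole, and
end-of-file counts as an implicit hole.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096L

enum sketch_whence { SKETCH_SEEK_DATA, SKETCH_SEEK_HOLE };

/* Hypothetical stand-in for one slot of the file's page cache. */
struct sketch_page {
	bool present;	/* a page exists at this index */
	bool uptodate;	/* false for fallocated but still unwritten pages */
};

/*
 * Return the first offset >= offset that is data (SEEK_DATA) or a hole
 * (SEEK_HOLE), or -1 where llseek would fail with ENXIO.
 */
static long sketch_seek(const struct sketch_page *pages, size_t npages,
			long size, long offset, enum sketch_whence whence)
{
	if (offset < 0 || offset >= size)
		return -1;

	for (size_t idx = (size_t)(offset / SKETCH_PAGE_SIZE); idx < npages; idx++) {
		long start = (long)idx * SKETCH_PAGE_SIZE;
		bool data = pages[idx].present && pages[idx].uptodate;

		if (start >= size)
			break;
		if (data == (whence == SKETCH_SEEK_DATA))
			return start > offset ? start : offset;
	}

	/* Nothing found before EOF: EOF is a hole, but there is no more data. */
	return whence == SKETCH_SEEK_HOLE ? size : -1;
}

/* Example file: data, hole, fallocated-but-unwritten, data; 16284 bytes. */
static const struct sketch_page example_pages[4] = {
	{ true,  true  },
	{ false, false },
	{ true,  false },
	{ true,  true  },
};
```

With this example file, seeking for data from offset 4096 skips both the
hole and the unwritten fallocated page and lands on the last page.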

[akpm@linux-foundation.org: fix warning with CONFIG_TMPFS=n]
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jaegeuk Hanse <jaegeuk.hanse@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Zheng Liu <wenqing.lz@taobao.com>
Cc: Jeff liu <jeff.liu@oracle.com>
Cc: Paul Eggert <eggert@cs.ucla.edu>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Josef Bacik <josef@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Andreas Dilger <adilger@dilger.ca>
Cc: Marco Stornelli <marco.stornelli@gmail.com>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:35 -08:00
Jiang Liu 01cefaef40 mm: provide more accurate estimation of pages occupied by memmap
If SPARSEMEM is enabled, it won't build page structures for non-existing
pages (holes) within a zone, so provide a more accurate estimation of
pages occupied by memmap if there are bigger holes within the zone.

And pages for highmem zones' memmap will be allocated from lowmem, so
charge nr_kernel_pages for that.
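
The estimation can be pictured roughly as follows.  This is a
standalone sketch, not the patch itself: the function name, the
explicit descriptor-size and page-size parameters, and the 1/16 slack
used to decide when holes count as "bigger" are all assumptions made
for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: estimate how many pages the memmap (the array of
 * per-page descriptors) of one zone occupies.
 */
static unsigned long memmap_pages_estimate(unsigned long spanned_pages,
					   unsigned long present_pages,
					   bool sparsemem,
					   unsigned long desc_size,
					   unsigned long page_size)
{
	unsigned long pages = spanned_pages;

	/*
	 * With SPARSEMEM, holes get no descriptors, so count only the
	 * present pages once the holes are big.  The 1/16 slack is an
	 * assumed threshold that leaves room for alignment losses.
	 */
	if (sparsemem && spanned_pages > present_pages + (present_pages >> 4))
		pages = present_pages;

	/* Round the byte size up to whole pages. */
	return (pages * desc_size + page_size - 1) / page_size;
}
```

For a zone spanning 2^20 pages with only 2^18 present, 64-byte
descriptors and 4096-byte pages, the sketch estimates 4096 memmap pages
with SPARSEMEM instead of 16384 without it.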

[akpm@linux-foundation.org: mark calc_memmap_size __paging_init]
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Chris Clayton <chris2553@googlemail.com>
Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan@kernel.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Tested-by: Jianguo Wu <wujianguo@huawei.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:35 -08:00
Yan Hong 02c0ab684f fs/buffer.c: remove redundant initialization in alloc_page_buffers()
buffer_head comes from kmem_cache_zalloc(), no need to zero its fields.

Signed-off-by: Yan Hong <clouds.yan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:35 -08:00
Yan Hong a3f3c29cb2 fs/buffer.c: do not inline exported function
It makes no sense to inline an exported function.

Signed-off-by: Yan Hong <clouds.yan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:34 -08:00
Yan Hong 5aaea51dfb writeback: fix a typo in comment
Signed-off-by: Yan Hong <clouds.yan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:34 -08:00
Jiang Liu 9feedc9d83 mm: introduce new field "managed_pages" to struct zone
Currently a zone's present_pages is calculated as below, which is
inaccurate and may cause trouble for memory hotplug.

	spanned_pages - absent_pages - memmap_pages - dma_reserve.

While fixing bugs caused by inaccurate zone->present_pages, we found that
zone->present_pages has been abused.  The field zone->present_pages may
have different meanings in different contexts:

1) pages existing in a zone.
2) pages managed by the buddy system.

For more discussions about the issue, please refer to:
  http://lkml.org/lkml/2012/11/5/866
  https://patchwork.kernel.org/patch/1346751/

This patchset introduces a new field named "managed_pages" to struct
zone, which counts "pages managed by the buddy system", and reverts
zone->present_pages to count "physical pages existing in a zone", which
also keeps it consistent with pgdat->node_present_pages.

We will set an initial value for zone->managed_pages in function
free_area_init_core() and will adjust it later if the initial value is
inaccurate.

For DMA/normal zones, the initial value is set to:

	(spanned_pages - absent_pages - memmap_pages - dma_reserve)

Later zone->managed_pages will be adjusted to the accurate value when the
bootmem allocator frees all free pages to the buddy system in functions
free_all_bootmem_node() and free_all_bootmem().

The bootmem allocator doesn't touch highmem pages, so highmem zones'
managed_pages is set to the accurate value "spanned_pages - absent_pages"
in function free_area_init_core() and won't be updated anymore.
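
The initial values described above can be sketched as plain arithmetic.
The helper below is illustrative only: its name and parameters are
hypothetical, and skipping a deduction that would underflow is an
assumption of the sketch rather than something stated here.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: initial zone->managed_pages as described in this
 * changelog.  Assumes absent_pages <= spanned_pages.
 */
static unsigned long initial_managed_pages(unsigned long spanned_pages,
					   unsigned long absent_pages,
					   unsigned long memmap_pages,
					   unsigned long dma_reserve,
					   bool is_highmem)
{
	unsigned long pages = spanned_pages - absent_pages;

	/* Highmem: bootmem never touches it, so this is already accurate. */
	if (is_highmem)
		return pages;

	/* DMA/normal: deduct memmap and the DMA reserve when they fit. */
	if (pages >= memmap_pages)
		pages -= memmap_pages;
	if (pages >= dma_reserve)
		pages -= dma_reserve;

	return pages;
}
```

For example, a normal zone with spanned_pages = 1000, absent_pages = 100,
memmap_pages = 16 and dma_reserve = 8 starts at 876 managed pages, while
a highmem zone with the same span and holes starts at 900.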

This patch also adds a new field "managed_pages" to /proc/zoneinfo
and sysrq showmem.

[akpm@linux-foundation.org: small comment tweaks]
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan@kernel.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:34 -08:00
David Rientjes c2d23f919b mm, oom: remove statically defined arch functions of same name
out_of_memory() is a globally defined function to call the oom killer.
x86, sh, and powerpc all use a function of the same name within file scope
in their respective fault.c unnecessarily.  Inline the functions into the
pagefault handlers to clean the code up.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:34 -08:00
David Rientjes 0fa84a4bfa mm, oom: remove redundant sleep in pagefault oom handler
out_of_memory() will already cause current to schedule if it has not been
killed, so doing it again in pagefault_out_of_memory() is redundant.
Remove it.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:34 -08:00
David Rientjes efacd02e4f mm, oom: cleanup pagefault oom handler
To lock the entire system from parallel oom killing, it's possible to pass
in a zonelist with all zones rather than using for_each_populated_zone()
for the iteration.  This obsoletes try_set_system_oom() and
clear_system_oom() so that they can be removed.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:34 -08:00
Lai Jiangshan 09285af75d memory_hotplug: allow online/offline memory to result movable node
Memory management can now handle movable nodes, or nodes which don't have
any normal memory, so we can dynamically configure and add a movable node
by:

	onlining ZONE_MOVABLE memory from a previously offline node
	offlining the last normal memory, which results in a node without
	normal memory

Movable nodes are very important for power saving, hardware partitioning
and high-availability systems (hardware fault management).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:34 -08:00
Lai Jiangshan 20b2f52b73 numa: add CONFIG_MOVABLE_NODE for movable-dedicated node
We need a node which only contains movable memory.  This feature is very
important for node hotplug.  If a node has normal/highmem, the memory may
be used by the kernel and can't be offlined.  If the node only contains
movable memory, we can offline the memory and the node.

All the preparation is done, so we can now actually introduce N_MEMORY.
Add CONFIG_MOVABLE_NODE so that it can be used for movable-dedicated
nodes.

[akpm@linux-foundation.org: fix Kconfig text]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:34 -08:00
David Rientjes 68ae564bba mm, memcg: avoid unnecessary function call when memcg is disabled
While profiling numa/core v16 with cgroup_disable=memory on the command
line, I noticed mem_cgroup_count_vm_event() still showed up as high as
0.60% in perf top.

This occurs because the function is called extremely often even when memcg
is disabled.

To fix this, inline the check for mem_cgroup_disabled() so we avoid the
unnecessary function call if memcg is disabled.
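
The pattern is roughly the following.  This is a userspace sketch with
made-up names (memcg_disabled_sketch, count_vm_event_slow,
count_vm_event_sketch, run_events), not the memcg code: a cheap inline
wrapper tests the "disabled" flag and only then calls the out-of-line
function.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the result of mem_cgroup_disabled(). */
static bool memcg_disabled_sketch;

/* Counts how often the out-of-line slow path was actually entered. */
static unsigned long out_of_line_calls;

/* Out-of-line slow path; stands in for the real accounting work. */
static void count_vm_event_slow(int idx)
{
	(void)idx;
	out_of_line_calls++;
}

/* Inline wrapper: the disabled check happens at the call site. */
static inline void count_vm_event_sketch(int idx)
{
	if (memcg_disabled_sketch)
		return;
	count_vm_event_slow(idx);
}

/* Fire n events and report how many reached the slow path. */
static unsigned long run_events(bool disabled, unsigned long n)
{
	memcg_disabled_sketch = disabled;
	out_of_line_calls = 0;
	for (unsigned long i = 0; i < n; i++)
		count_vm_event_sketch(0);
	return out_of_line_calls;
}
```

With the flag set, none of the events reaches the out-of-line function,
which is the cost the profile above was showing.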

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Glauber Costa <glommer@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:34 -08:00