Commit Graph

634361 Commits

Author SHA1 Message Date
Rob Herring d0f4bce2bc tty: serial_core: fix NULL struct tty pointer access in uart_write_wakeup
Since commit 761ed4a945 ("tty: serial_core: convert uart_close to
use tty_port_close"), the serial console is broken on various systems
and typing "reboot" splats the following on the serial console:

INIT: Sending p[  427.863916] BUG: unable to handle kernel NULL pointer dereference at 00000000000001e0
[  427.885156] IP: [] tty_wakeup+0xc/0x70
[  427.898337] PGD 0 [  427.902051]
[  427.907498] Oops: 0000 [#1] PREEMPT SMP
[  427.917635] Modules linked in: nfsv3 nfs_acl nfs fscache lockd
sunrpc grace edd af_packet cpufreq_conservative cpufreq_userspace
cpufreq_powersave fuse loop md_mod dm_mod joydev hid_generic usbhid
ipmi_ssif ohci_pci ohci_hcd ehci_pci ehci_hcd e1000e ptp firewire_ohci
edac_core pps_core tpm_infineon sp5100_tco firewire_core acpi_cpufreq
serio_raw pcspkr fjes usbcore shpchp edac_mce_amd tpm_tis ipmi_si
tpm_tis_core i2c_piix4 k10temp sg ipmi_msghandler tpm sr_mod button
cdrom kvm_amd kvm irqbypass crc_itu_t ast ttm drm_kms_helper drm
fb_sys_fops sysimgblt sysfillrect syscopyarea i2c_algo_bit scsi_dh_rdac
scsi_dh_alua scsi_dh_emc scsi_dh_hp_sw ata_generic pata_atiixp
[  428.054179] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc1-1.g73e3f23-default #1
[  428.072868] Hardware name: System manufacturer System Product Name/KGP(M)E-D16, BIOS 0902    12/03/2010
[  428.094755] task: ffffffffa2c0d500 task.stack: ffffffffa2c00000
[  428.109717] RIP: 0010:[]  [] tty_wakeup+0xc/0x70
[  428.128407] RSP: 0018:ffff9a1a5fc03df8  EFLAGS: 00010086
[  428.142184] RAX: ffff9a1857258000 RBX: ffffffffa3050ea0 RCX: 0000000000000000
[  428.159649] RDX: 000000000000001b RSI: 0000000000000000 RDI: 0000000000000000
[  428.177109] RBP: ffff9a1a5fc03e08 R08: 0000000000000000 R09: 0000000000000000
[  428.194547] R10: 0000000000021c77 R11: 0000000000000000 R12: ffff9a1857258000
[  428.212002] R13: 0000000000000000 R14: 0000000000000020 R15: 0000000000000020
[  428.229481] FS:  0000000000000000(0000) GS:ffff9a1a5fc00000(0000) knlGS:0000000000000000
[  428.248938] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  428.263726] CR2: 00000000000001e0 CR3: 0000000390c06000 CR4: 00000000000006f0
[  428.281331] Stack:
[  428.288696]  ffffffffa3050ea0 ffff9a1857258000 ffff9a1a5fc03e18 ffffffffa24e0ab1
[  428.307064]  ffff9a1a5fc03e40 ffffffffa24e8865 ffffffffa3050ea0 00000000000000c2
[  428.325456]  0000000000000046 ffff9a1a5fc03e78 ffffffffa24e8a5f ffffffffa3050ea0
[  428.343905] Call Trace:
[  428.352319]   [  428.356216]  [] uart_write_wakeup+0x21/0x30

The problem is for console ports, the serial port is not shutdown and
interrupts may fire after the struct tty is gone. Simply calling the
tty_port helper tty_port_tty_wakeup instead of tty_wakeup directly will
ensure there is a valid struct tty.

Fixes: 761ed4a945 ("tty: serial_core: convert uart_close to use tty_port_close")
Reported-by: Borislav Petkov <bp@alien8.de>
Reported-by: Mike Galbraith <mgalbraith@suse.de>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-serial@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 08:13:07 -04:00
Geert Uytterhoeven 4dda864d73 tty: serial_core: Fix serial console crash on port shutdown
The port->console flag is always false, as uart_console() is called
before the serial console has been registered.

Hence for a serial port used as the console, uart_tty_port_shutdown()
will still be called when userspace closes the port, powering it down.
This may lead to a system lock up when the serial console driver writes
to the serial port's registers.

To fix this, move the setting of port->console after the call to
uart_configure_port(), which registers the serial console.

Fixes: 761ed4a945 ("tty: serial_core: convert uart_close to use tty_port_close")
Reported-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Rob Herring <robh@kernel.org>
Tested-by: Mugunthan V N <mugunthanvnm@ti.com>
Tested-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
[robh: rebased on tty-linus]
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 08:13:07 -04:00
Richard Genoud 9bcffe7575 tty/serial: at91: fix hardware handshake on Atmel platforms
After commit 1cf6e8fc83 ("tty/serial: at91: fix RTS line management
when hardware handshake is enabled"), the hardware handshake wasn't
functional anymore on Atmel platforms (beside SAMA5D2).

To understand why, one has to understand the flag ATMEL_US_USMODE_HWHS
first:
Before commit 1cf6e8fc83 ("tty/serial: at91: fix RTS line management
when hardware handshake is enabled"), this flag was never set.
Thus, the CTS/RTS where only handled by serial_core (and everything
worked just fine).

This commit introduced the use of the ATMEL_US_USMODE_HWHS flag,
enabling it for all boards when the user space enables flow control.

When the ATMEL_US_USMODE_HWHS is set, the Atmel USART controller
handles a part of the flow control job:
- disable the transmitter when the CTS pin gets high.
- drive the RTS pin high when the DMA buffer transfer is completed or
  PDC RX buffer full or RX FIFO is beyond threshold. (depending on the
  controller version).

NB: This feature is *not* mandatory for the flow control to work.
(Nevertheless, it's very useful if low latencies are needed.)

Now, the specifics of the ATMEL_US_USMODE_HWHS flag:

- For platforms with DMAC and no FIFOs (sam9x25, sam9x35, sama5D3,
sama5D4, sam9g15, sam9g25, sam9g35)* this feature simply doesn't work.
( source: https://lkml.org/lkml/2016/9/7/598 )
Tested it on sam9g35, the RTS pins always stays up, even when RXEN=1
or a new DMA transfer descriptor is set.
=> ATMEL_US_USMODE_HWHS must not be used for those platforms

- For platforms with a PDC (sam926{0,1,3}, sam9g10, sam9g20, sam9g45,
sam9g46)*, there's another kind of problem. Once the flag
ATMEL_US_USMODE_HWHS is set, the RTS pin can't be driven anymore via
RTSEN/RTSDIS in USART Control Register. The RTS pin can only be driven
by enabling/disabling the receiver or setting RCR=RNCR=0 in the PDC
(Receive (Next) Counter Register).
=> Doing this is beyond the scope of this patch and could add other
bugs, so the original (and working) behaviour should be set for those
platforms (meaning ATMEL_US_USMODE_HWHS flag should be unset).

- For platforms with a FIFO (sama5d2)*, the RTS pin is driven according
to the RX FIFO thresholds, and can be also driven by RTSEN/RTSDIS in
USART Control Register. No problem here.
(This was the use case of commit 1cf6e8fc83 ("tty/serial: at91: fix
RTS line management when hardware handshake is enabled"))
NB: If the CTS pin declared as a GPIO in the DTS, (for instance
cts-gpios = <&pioA PIN_PB31 GPIO_ACTIVE_LOW>), the transmitter will be
disabled.
=> ATMEL_US_USMODE_HWHS flag can be set for this platform ONLY IF the
CTS pin is not a GPIO.

So, the only case when ATMEL_US_USMODE_HWHS can be enabled is when
(atmel_use_fifo(port) &&
 !mctrl_gpio_to_gpiod(atmel_port->gpios, UART_GPIO_CTS))

Tested on all Atmel USART controller flavours:
AT91SAM9G35-CM (DMAC flavour), AT91SAM9G20-EK (PDC flavour),
SAMA5D2xplained (FIFO flavour).

* the list may not be exhaustive

Cc: <stable@vger.kernel.org> #4.4+ (beware, missing atmel_port variable)
Fixes: 1cf6e8fc83 ("tty/serial: at91: fix RTS line management when hardware handshake is enabled")
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 08:10:48 -04:00
Ido Yariv bd768e1466 KVM: x86: fix wbinvd_dirty_mask use-after-free
vcpu->arch.wbinvd_dirty_mask may still be used after freeing it,
corrupting memory. For example, the following call trace may set a bit
in an already freed cpu mask:
    kvm_arch_vcpu_load
    vcpu_load
    vmx_free_vcpu_nested
    vmx_free_vcpu
    kvm_arch_vcpu_free

Fix this by deferring freeing of wbinvd_dirty_mask.

Cc: stable@vger.kernel.org
Signed-off-by: Ido Yariv <ido@wizery.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-10-28 11:35:21 +02:00
Imre Palik f92b760414 perf/x86/intel: Honour the CPUID for number of fixed counters in hypervisors
perf doesn't seem to honour the number of fixed counters specified by CPUID
leaf 0xa. It always assumes that Intel CPUs have at least 3 fixed counters.

So if some of the fixed counters are masked out by the hypervisor, it still
tries to check/set them.

This patch makes perf behave nicer when the kernel is running under a
hypervisor that doesn't expose all the counters.

This patch contains some ideas from Matt Wilson.

Signed-off-by: Imre Palik <imrep@amazon.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Kozyrev <alexander.kozyrev@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Artyom Kuanbekov <artyom.kuanbekov@intel.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Wilson <msw@amazon.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1477037939-15605-1-git-send-email-imrep.amz@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-28 11:06:25 +02:00
Jiri Olsa 5aab90ce1e perf/powerpc: Don't call perf_event_disable() from atomic context
The trinity syscall fuzzer triggered following WARN() on powerpc:

  WARNING: CPU: 9 PID: 2998 at arch/powerpc/kernel/hw_breakpoint.c:278
  ...
  NIP [c00000000093aedc] .hw_breakpoint_handler+0x28c/0x2b0
  LR [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0
  Call Trace:
  [c0000002f7933580] [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 (unreliable)
  [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0
  [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0
  [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0
  [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100
  [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48

Followed by a lockdep warning:

  ===============================
  [ INFO: suspicious RCU usage. ]
  4.8.0-rc5+ #7 Tainted: G        W
  -------------------------------
  ./include/linux/rcupdate.h:556 Illegal context switch in RCU read-side critical section!

  other info that might help us debug this:

  rcu_scheduler_active = 1, debug_locks = 0
  2 locks held by ls/2998:
   #0:  (rcu_read_lock){......}, at: [<c0000000000f6a00>] .__atomic_notifier_call_chain+0x0/0x1c0
   #1:  (rcu_read_lock){......}, at: [<c00000000093ac50>] .hw_breakpoint_handler+0x0/0x2b0

  stack backtrace:
  CPU: 9 PID: 2998 Comm: ls Tainted: G        W       4.8.0-rc5+ #7
  Call Trace:
  [c0000002f7933150] [c00000000094b1f8] .dump_stack+0xe0/0x14c (unreliable)
  [c0000002f79331e0] [c00000000013c468] .lockdep_rcu_suspicious+0x138/0x180
  [c0000002f7933270] [c0000000001005d8] .___might_sleep+0x278/0x2e0
  [c0000002f7933300] [c000000000935584] .mutex_lock_nested+0x64/0x5a0
  [c0000002f7933410] [c00000000023084c] .perf_event_ctx_lock_nested+0x16c/0x380
  [c0000002f7933500] [c000000000230a80] .perf_event_disable+0x20/0x60
  [c0000002f7933580] [c00000000093aeec] .hw_breakpoint_handler+0x29c/0x2b0
  [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0
  [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0
  [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0
  [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100
  [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48

While it looks like the first WARN() is probably valid, the other one is
triggered by disabling event via perf_event_disable() from atomic context.

The event is disabled here in case we were not able to emulate
the instruction that hit the breakpoint. By disabling the event
we unschedule the event and make sure it's not scheduled back.

But we can't call perf_event_disable() from atomic context, instead
we need to use the event's pending_disable irq_work method to disable it.

Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161026094824.GA21397@krava
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-28 11:06:25 +02:00
Jiri Olsa 0933840acf perf/core: Protect PMU device removal with a 'pmu_bus_running' check, to fix CONFIG_DEBUG_TEST_DRIVER_REMOVE=y kernel panic
CAI Qian reported a crash in the PMU uncore device removal code,
enabled by the CONFIG_DEBUG_TEST_DRIVER_REMOVE=y option:

  https://marc.info/?l=linux-kernel&m=147688837328451

The reason for the crash is that perf_pmu_unregister() tries to remove
a PMU device which is not added at this point. We add PMU devices
only after pmu_bus is registered, which happens in the
perf_event_sysfs_init() call and sets the 'pmu_bus_running' flag.

The fix is to get the 'pmu_bus_running' flag state at the point
the PMU is taken out of the PMU list and remove the device
later only if it's set.

Reported-by: CAI Qian <caiqian@redhat.com>
Tested-by: CAI Qian <caiqian@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161020111011.GA13361@krava
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-28 11:06:25 +02:00
Borislav Petkov 1c27f646b1 x86/microcode/AMD: Fix more fallout from CONFIG_RANDOMIZE_MEMORY=y
We needed the physical address of the container in order to compute the
offset within the relocated ramdisk. And we did this by doing __pa() on
the virtual address.

However, __pa() does checks whether the physical address is within
PAGE_OFFSET and __START_KERNEL_map - see __phys_addr() - which fail
if we have CONFIG_RANDOMIZE_MEMORY enabled: we feed a virtual address
which *doesn't* have the randomization offset into a function which uses
PAGE_OFFSET which *does* have that offset.

This makes this check fire:

	VIRTUAL_BUG_ON((x > y) || !phys_addr_valid(x));
			^^^^^^

due to the randomization offset.

The fix is as simple as using __pa_nodebug() because we do that
randomization offset accounting later in that function ourselves.

Reported-by: Bob Peterson <rpeterso@redhat.com>
Tested-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm <linux-mm@kvack.org>
Cc: stable@vger.kernel.org # 4.9
Link: http://lkml.kernel.org/r/20161027123623.j2jri5bandimboff@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-28 10:29:59 +02:00
Arnd Bergmann 8ff0513bdc mtd: mtk: avoid warning in mtk_ecc_encode
When building with -Wmaybe-uninitialized, gcc produces a silly false positive
warning for the mtk_ecc_encode function:

drivers/mtd/nand/mtk_ecc.c: In function 'mtk_ecc_encode':
drivers/mtd/nand/mtk_ecc.c:402:15: error: 'val' may be used uninitialized in this function [-Werror=maybe-uninitialized]

The function for some reason contains a double byte swap on big-endian
builds to get the OOB data into the correct order again, and is written
in a slightly confusing way.

Using a simple memcpy32_fromio() to read the data simplifies it a lot
so it becomes more readable and produces no warning. However, the
output might not have 32-bit alignment, so we have to use another
memcpy to avoid taking alignment faults or writing beyond the end
of the array.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: RogerCC Lin <rogercc.lin@mediatek.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2016-10-28 10:21:23 +02:00
Boris Brezillon 73f907fd5f mtd: nand: Fix data interface configuration logic
When changing from one data interface setting to another, one has to
ensure a specific sequence which is described in the ONFI spec.

One of these constraints is that the CE line has go high after a reset
before a command can be sent with the new data interface setting, which
is not guaranteed by the current implementation.

Rework the nand_reset() function and all the call sites to make sure the
CE line is asserted and released when required.

Also make sure to actually apply the new data interface setting on the
first die.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Fixes: d8e725dd83 ("mtd: nand: automate NAND timings selection")
Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de>
Tested-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
2016-10-28 09:58:36 +02:00
Fabio Estevam ce93bedb5e mtd: nand: gpmi: disable the clocks on errors
We should disable the previously enabled GPMI clocks in the error paths.

Also, when gpmi_enable_clk() fails simply return the error
code immediately rather than jumping to to the 'err_out' label.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Acked-by: Han Xu <han.xu@nxp.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2016-10-28 09:58:05 +02:00
Linus Torvalds 14970f204b Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "20 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  drivers/misc/sgi-gru/grumain.c: remove bogus 0x prefix from printk
  cris/arch-v32: cryptocop: print a hex number after a 0x prefix
  ipack: print a hex number after a 0x prefix
  block: DAC960: print a hex number after a 0x prefix
  fs: exofs: print a hex number after a 0x prefix
  lib/genalloc.c: start search from start of chunk
  mm: memcontrol: do not recurse in direct reclaim
  CREDITS: update credit information for Martin Kepplinger
  proc: fix NULL dereference when reading /proc/<pid>/auxv
  mm: kmemleak: ensure that the task stack is not freed during scanning
  lib/stackdepot.c: bump stackdepot capacity from 16MB to 128MB
  latent_entropy: raise CONFIG_FRAME_WARN by default
  kconfig.h: remove config_enabled() macro
  ipc: account for kmem usage on mqueue and msg
  mm/slab: improve performance of gathering slabinfo stats
  mm: page_alloc: use KERN_CONT where appropriate
  mm/list_lru.c: avoid error-path NULL pointer deref
  h8300: fix syscall restarting
  kcov: properly check if we are in an interrupt
  mm/slab: fix kmemcg cache creation delayed issue
2016-10-27 19:58:39 -07:00
Dave Airlie 1cfa126c52 Merge branch 'drm-fixes-4.9' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Two sets of amdgpu fixes as I missed one set.
* 'drm-fixes-4.9' of git://people.freedesktop.org/~agd5f/linux: (23 commits)
  drm/amd/powerplay: fix bug get wrong evv voltage of Polaris.
  drm/amdgpu/si_dpm: workaround for SI kickers
  drm/radeon/si_dpm: workaround for SI kickers
  drm/amdgpu: fix s3 resume back, uvd dpm randomly can't disable.
  drm/radeon: drop register readback in cayman_cp_int_cntl_setup
  drm/amdgpu/vce3: only enable 3 rings on new enough firmware (v2)
  drm/amdgpu: fix fence slab teardown
  drm/amdgpu: update kernel-doc for some functions
  drm/amdgpu: fix a vm_flush fence leak
  drm/amdgpu: fix sched fence slab teardown
  Revert "drm/radeon: fix DP link training issue with second 4K monitor"
  drm/amdgpu/dpm: flush any thermal work on fini
  drm/amdgpu: cancel reset work on fini
  drm/amd/powerplay: don't give up if DPM is already running
  drm/amd/powerplay: fix static checker warning in process_pptables_v1_0.c
  drm/amdgpu: avoid drm error log during S3 on RHEL7.3
  drm/amdgpu: explicitly set pg_flags for ST
  drm/amdgpu/st: move ATC CG golden init from gfx to mc
  drm/amd/amdgpu: expose max engine and memory clock for powerplay enabled case
  drm/amdgpu: move atom scratch register save/restore to common code
  ...
2016-10-28 12:25:01 +10:00
Dave Airlie aa72c26c2b Merge tag 'drm-misc-fixes-2016-10-27' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
Set of drm core fixes.

Hopefully fixes a bug in MST unplugs in fbdev.

* tag 'drm-misc-fixes-2016-10-27' of git://anongit.freedesktop.org/git/drm-misc:
  drm/dp/mst: Check peer device type before attempting EDID read
  drm/dp/mst: Clear port->pdt when tearing down the i2c adapter
  drm/fb-helper: Keep references for the current set of used connectors
  drm: Don't force all planes to be added to the state due to zpos
  drm/fb-helper: Fix connector ref leak on error
  drm/fb-helper: Don't call dirty callback for untouched clips
  drm: Release reference from blob lookup after replacing property
2016-10-28 12:24:14 +10:00
Dimitri Sivanich 8e819101ce drivers/misc/sgi-gru/grumain.c: remove bogus 0x prefix from printk
Would like to have this be a decimal number.

Link: http://lkml.kernel.org/r/20161026134746.GA30169@sgi.com
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Uwe Kleine-König 17a8893956 cris/arch-v32: cryptocop: print a hex number after a 0x prefix
It makes the result hard to interpret correctly if a base 10 number is
prefixed by 0x.  So change to a hex number.

Link: http://lkml.kernel.org/r/20161026125658.25728-6-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Uwe Kleine-König 9105585d13 ipack: print a hex number after a 0x prefix
It makes the result hard to interpret correctly if a base 10 number is
prefixed by 0x.  So change to a hex number.

Link: http://lkml.kernel.org/r/20161026125658.25728-4-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Cc: Jens Taprogge <jens.taprogge@taprogge.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Uwe Kleine-König ee52c44dee block: DAC960: print a hex number after a 0x prefix
It makes the message hard to interpret correctly if a base 10 number is
prefixed by 0x.  So change to a hex number.

Link: http://lkml.kernel.org/r/20161026125658.25728-3-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Uwe Kleine-König 14f947c87a fs: exofs: print a hex number after a 0x prefix
It makes the message hard to interpret correctly if a base 10 number is
prefixed by 0x.  So change to a hex number.

Link: http://lkml.kernel.org/r/20161026125658.25728-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Boaz Harrosh <ooo@electrozaur.com>
Cc: Benny Halevy <bhalevy@primarydata.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Daniel Mentz 62e931fac4 lib/genalloc.c: start search from start of chunk
gen_pool_alloc_algo() iterates over the chunks of a pool trying to find
a contiguous block of memory that satisfies the allocation request.

The shortcut

	if (size > atomic_read(&chunk->avail))
		continue;

makes the loop skip over chunks that do not have enough bytes left to
fulfill the request.  There are two situations, though, where an
allocation might still fail:

(1) The available memory is not contiguous, i.e.  the request cannot
    be fulfilled due to external fragmentation.

(2) A race condition.  Another thread runs the same code concurrently
    and is quicker to grab the available memory.

In those situations, the loop calls pool->algo() to search the entire
chunk, and pool->algo() returns some value that is >= end_bit to
indicate that the search failed.  This return value is then assigned to
start_bit.  The variables start_bit and end_bit describe the range that
should be searched, and this range should be reset for every chunk that
is searched.  Today, the code fails to reset start_bit to 0.  As a
result, prefixes of subsequent chunks are ignored.  Memory allocations
might fail even though there is plenty of room left in these prefixes of
those other chunks.

Fixes: 7f184275aa ("lib, Make gen_pool memory allocator lockless")
Link: http://lkml.kernel.org/r/1477420604-28918-1-git-send-email-danielmentz@google.com
Signed-off-by: Daniel Mentz <danielmentz@google.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Johannes Weiner 89a2848381 mm: memcontrol: do not recurse in direct reclaim
On 4.0, we saw a stack corruption from a page fault entering direct
memory cgroup reclaim, calling into btrfs_releasepage(), which then
tried to allocate an extent and recursed back into a kmem charge ad
nauseam:

  [...]
  btrfs_releasepage+0x2c/0x30
  try_to_release_page+0x32/0x50
  shrink_page_list+0x6da/0x7a0
  shrink_inactive_list+0x1e5/0x510
  shrink_lruvec+0x605/0x7f0
  shrink_zone+0xee/0x320
  do_try_to_free_pages+0x174/0x440
  try_to_free_mem_cgroup_pages+0xa7/0x130
  try_charge+0x17b/0x830
  memcg_charge_kmem+0x40/0x80
  new_slab+0x2d9/0x5a0
  __slab_alloc+0x2fd/0x44f
  kmem_cache_alloc+0x193/0x1e0
  alloc_extent_state+0x21/0xc0
  __clear_extent_bit+0x2b5/0x400
  try_release_extent_mapping+0x1a3/0x220
  __btrfs_releasepage+0x31/0x70
  btrfs_releasepage+0x2c/0x30
  try_to_release_page+0x32/0x50
  shrink_page_list+0x6da/0x7a0
  shrink_inactive_list+0x1e5/0x510
  shrink_lruvec+0x605/0x7f0
  shrink_zone+0xee/0x320
  do_try_to_free_pages+0x174/0x440
  try_to_free_mem_cgroup_pages+0xa7/0x130
  try_charge+0x17b/0x830
  mem_cgroup_try_charge+0x65/0x1c0
  handle_mm_fault+0x117f/0x1510
  __do_page_fault+0x177/0x420
  do_page_fault+0xc/0x10
  page_fault+0x22/0x30

On later kernels, kmem charging is opt-in rather than opt-out, and that
particular kmem allocation in btrfs_releasepage() is no longer being
charged and won't recurse and overrun the stack anymore.

But it's not impossible for an accounted allocation to happen from the
memcg direct reclaim context, and we needed to reproduce this crash many
times before we even got a useful stack trace out of it.

Like other direct reclaimers, mark tasks in memcg reclaim PF_MEMALLOC to
avoid recursing into any other form of direct reclaim.  Then let
recursive charges from PF_MEMALLOC contexts bypass the cgroup limit.

Link: http://lkml.kernel.org/r/20161025141050.GA13019@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Martin Kepplinger 8f72cb4ef9 CREDITS: update credit information for Martin Kepplinger
Content and employer changed.

Link: http://lkml.kernel.org/r/1477304102-28830-1-git-send-email-martin.kepplinger@ginzinger.com
Signed-off-by: Martin Kepplinger <martink@posteo.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Leon Yu 06b2849d10 proc: fix NULL dereference when reading /proc/<pid>/auxv
Reading auxv of any kernel thread results in NULL pointer dereferencing
in auxv_read() where mm can be NULL.  Fix that by checking for NULL mm
and bailing out early.  This is also the original behavior changed by
recent commit c531716785 ("proc: switch auxv to use of __mem_open()").

  # cat /proc/2/auxv
  Unable to handle kernel NULL pointer dereference at virtual address 000000a8
  Internal error: Oops: 17 [#1] PREEMPT SMP ARM
  CPU: 3 PID: 113 Comm: cat Not tainted 4.9.0-rc1-ARCH+ #1
  Hardware name: BCM2709
  task: ea3b0b00 task.stack: e99b2000
  PC is at auxv_read+0x24/0x4c
  LR is at do_readv_writev+0x2fc/0x37c
  Process cat (pid: 113, stack limit = 0xe99b2210)
  Call chain:
    auxv_read
    do_readv_writev
    vfs_readv
    default_file_splice_read
    splice_direct_to_actor
    do_splice_direct
    do_sendfile
    SyS_sendfile64
    ret_fast_syscall

Fixes: c531716785 ("proc: switch auxv to use of __mem_open()")
Link: http://lkml.kernel.org/r/1476966200-14457-1-git-send-email-chianglungyu@gmail.com
Signed-off-by: Leon Yu <chianglungyu@gmail.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Janis Danisevskis <jdanis@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Catalin Marinas 37df49f433 mm: kmemleak: ensure that the task stack is not freed during scanning
Commit 68f24b08ee ("sched/core: Free the stack early if
CONFIG_THREAD_INFO_IN_TASK") may cause the task->stack to be freed
during kmemleak_scan() execution, leading to either a NULL pointer fault
(if task->stack is NULL) or kmemleak accessing already freed memory.

This patch uses the new try_get_task_stack() API to ensure that the task
stack is not freed during kmemleak stack scanning.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=173901.

Fixes: 68f24b08ee ("sched/core: Free the stack early if CONFIG_THREAD_INFO_IN_TASK")
Link: http://lkml.kernel.org/r/1476266223-14325-1-git-send-email-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: CAI Qian <caiqian@redhat.com>
Tested-by: CAI Qian <caiqian@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: CAI Qian <caiqian@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Dmitry Vyukov 02754e0a48 lib/stackdepot.c: bump stackdepot capacity from 16MB to 128MB
KASAN uses stackdepot to memorize stacks for all kmalloc/kfree calls.
Current stackdepot capacity is 16MB (1024 top level entries x 4 pages on
second level).  Size of each stack is (num_frames + 3) * sizeof(long).
Which gives us ~84K stacks.  This capacity was chosen empirically and it
is enough to run kernel normally.

However, when lots of configs are enabled and a fuzzer tries to maximize
code coverage, it easily hits the limit within tens of minutes.  I've
tested for long a time with number of top level entries bumped 4x
(4096).  And I think I've seen overflow only once.  But I don't have all
configs enabled and code coverage has not reached maximum yet.  So bump
it 8x to 8192.

Since we have two-level table, memory cost of this is very moderate --
currently the top-level table is 8KB, with this patch it is 64KB, which
is negligible under KASAN.

Here is some approx math.

128MB allows us to memorize ~670K stacks (assuming stack is ~200b).
I've grepped kernel for kmalloc|kfree|kmem_cache_alloc|kmem_cache_free|
kzalloc|kstrdup|kstrndup|kmemdup and it gives ~60K matches.  Most of
alloc/free call sites are reachable with only one stack.  But some
utility functions can have large fanout.  Assuming average fanout is 5x,
total number of alloc/free stacks is ~300K.

Link: http://lkml.kernel.org/r/1476458416-122131-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Baozeng Ding <sploving1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Kees Cook 0e07f663c9 latent_entropy: raise CONFIG_FRAME_WARN by default
When building with the latent_entropy plugin, set the default
CONFIG_FRAME_WARN to 2048, since some __init functions have many basic
blocks that, when instrumented by the latent_entropy plugin, grow beyond
1024 byte stack size on 32-bit builds.

Link: http://lkml.kernel.org/r/20161018211216.GA39687@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Emese Revfy <re.emese@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Masahiro Yamada c0a0aba8e4 kconfig.h: remove config_enabled() macro
The use of config_enabled() is ambiguous.  For config options,
IS_ENABLED(), IS_REACHABLE(), etc.  will make intention clearer.
Sometimes config_enabled() has been used for non-config options because
it is useful to check whether the given symbol is defined or not.

I have been tackling on deprecating config_enabled(), and now is the
time to finish this work.

Some new users have appeared for v4.9-rc1, but it is trivial to replace
them:

 - arch/x86/mm/kaslr.c
  replace config_enabled() with IS_ENABLED() because
  CONFIG_X86_ESPFIX64 and CONFIG_EFI are boolean.

 - include/asm-generic/export.h
  replace config_enabled() with __is_defined().

Then, config_enabled() can be removed now.

Going forward, please use IS_ENABLED(), IS_REACHABLE(), etc. for config
options, and __is_defined() for non-config symbols.

Link: http://lkml.kernel.org/r/1476616078-32252-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Aristeu Rozanski 8c8d4d4520 ipc: account for kmem usage on mqueue and msg
When kmem accounting switched from account by default to only account if
flagged by __GFP_ACCOUNT, IPC mqueue and messages was left out.

The production use case at hand is that mqueues should be customizable
via sysctls in Docker containers in a Kubernetes cluster.  This can only
be safely allowed to the users of the cluster (without the risk that
they can cause resource shortage on a node, influencing other users'
containers) if all resources they control are bounded, i.e.  accounted
for.

Link: http://lkml.kernel.org/r/1476806075-1210-1-git-send-email-arozansk@redhat.com
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Reported-by: Stefan Schimanski <sttts@redhat.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Stefan Schimanski <sttts@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Aruna Ramakrishna 07a63c41fa mm/slab: improve performance of gathering slabinfo stats
On large systems, when some slab caches grow to millions of objects (and
many gigabytes), running 'cat /proc/slabinfo' can take up to 1-2
seconds.  During this time, interrupts are disabled while walking the
slab lists (slabs_full, slabs_partial, and slabs_free) for each node,
and this sometimes causes timeouts in other drivers (for instance,
Infiniband).

This patch optimizes 'cat /proc/slabinfo' by maintaining a counter for
total number of allocated slabs per node, per cache.  This counter is
updated when a slab is created or destroyed.  This enables us to skip
traversing the slabs_full list while gathering slabinfo statistics, and
since slabs_full tends to be the biggest list when the cache is large,
it results in a dramatic performance improvement.  Getting slabinfo
statistics now only requires walking the slabs_free and slabs_partial
lists, and those lists are usually much smaller than slabs_full.

We tested this after growing the dentry cache to 70GB, and the
performance improved from 2s to 5ms.

Link: http://lkml.kernel.org/r/1472517876-26814-1-git-send-email-aruna.ramakrishna@oracle.com
Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Joe Perches 1f84a18fc0 mm: page_alloc: use KERN_CONT where appropriate
Recent changes to printk require KERN_CONT uses to continue logging
messages.  So add KERN_CONT where necessary.

[akpm@linux-foundation.org: coding-style fixes]
Fixes: 4bcc595ccd ("printk: reinstate KERN_CONT for printing continuation lines")
Link: http://lkml.kernel.org/r/c7df37c8665134654a17aaeb8b9f6ace1d6db58b.1476239034.git.joe@perches.com
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:43 -07:00
Alexander Polakov 1bc11d70b5 mm/list_lru.c: avoid error-path NULL pointer deref
As described in https://bugzilla.kernel.org/show_bug.cgi?id=177821:

After some analysis it seems to be that the problem is in alloc_super().
In case list_lru_init_memcg() fails it goes into destroy_super(), which
calls list_lru_destroy().

And in list_lru_init() we see that in case memcg_init_list_lru() fails,
lru->node is freed, but not set NULL, which then leads list_lru_destroy()
to believe it is initialized and call memcg_destroy_list_lru().
memcg_destroy_list_lru() in turn can access lru->node[i].memcg_lrus,
which is NULL.

[akpm@linux-foundation.org: add comment]
Signed-off-by: Alexander Polakov <apolyakov@beget.ru>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:42 -07:00
Mark Rutland 2175358305 h8300: fix syscall restarting
Back in commit f56141e3e2 ("all arches, signal: move restart_block to
struct task_struct"), all architectures and core code were changed to
use task_struct::restart_block.  However, when h8300 support was
subsequently restored in v4.2, it was not updated to account for this,
and maintains thread_info::restart_block, which is not kept in sync.

This patch drops the redundant restart_block from thread_info, and moves
h8300 to the common one in task_struct, ensuring that syscall restarting
always works as expected.

Fixes: f56141e3e2 ("all arches, signal: move restart_block to struct task_struct")
Link: http://lkml.kernel.org/r/1476714934-11635-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: uclinux-h8-devel@lists.sourceforge.jp
Cc: <stable@vger.kernel.org>	[4.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:42 -07:00
Andrey Konovalov b274c0bb39 kcov: properly check if we are in an interrupt
in_interrupt() returns a nonzero value when we are either in an
interrupt or have bh disabled via local_bh_disable().  Since we are
interested in only ignoring coverage from actual interrupts, do a proper
check instead of just calling in_interrupt().

As a result of this change, kcov will start to collect coverage from
within local_bh_disable()/local_bh_enable() sections.

Link: http://lkml.kernel.org/r/1476115803-20712-1-git-send-email-andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Nicolai Stange <nicstange@gmail.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: James Morse <james.morse@arm.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:42 -07:00
Joonsoo Kim 86d9f48534 mm/slab: fix kmemcg cache creation delayed issue
There is a bug report that SLAB makes extreme load average due to over
2000 kworker thread.

  https://bugzilla.kernel.org/show_bug.cgi?id=172981

This issue is caused by kmemcg feature that try to create new set of
kmem_caches for each memcg.  Recently, kmem_cache creation is slowed by
synchronize_sched() and futher kmem_cache creation is also delayed since
kmem_cache creation is synchronized by a global slab_mutex lock.  So,
the number of kworker that try to create kmem_cache increases quietly.

synchronize_sched() is for lockless access to node's shared array but
it's not needed when a new kmem_cache is created.  So, this patch rules
out that case.

Fixes: 801faf0db8 ("mm/slab: lockless decision to grow cache")
Link: http://lkml.kernel.org/r/1475734855-4837-1-git-send-email-iamjoonsoo.kim@lge.com
Reported-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 18:43:42 -07:00
Dan Williams 52e73eb287 device-dax: fix percpu_ref_exit ordering
We need to wait until the percpu_ref is released before exit. Otherwise,
we sometimes lose the race and trigger this new warning that was added
in v4.9 (commit a67823c1ed "percpu-refcount: init ->confirm_switch
member properly"):

 WARNING: CPU: 0 PID: 3629 at lib/percpu-refcount.c:107 percpu_ref_exit+0x51/0x60
 [..]
 Call Trace:
  [<ffffffff814bf093>] dump_stack+0x85/0xc2
  [<ffffffff810b15db>] __warn+0xcb/0xf0
  [<ffffffff810b170d>] warn_slowpath_null+0x1d/0x20
  [<ffffffff814d70c1>] percpu_ref_exit+0x51/0x60
  [<ffffffffa005706a>] dax_pmem_percpu_exit+0x1a/0x50 [dax_pmem]
  [<ffffffff81615f1f>] devm_action_release+0xf/0x20

Cc: <stable@vger.kernel.org>
Fixes: ab68f26221 ("/dev/dax, pmem: direct access to persistent memory")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-10-27 17:04:05 -07:00
Linus Torvalds 67463e54be Allow KASAN and HOTPLUG_MEMORY to co-exist when doing build testing
No, KASAN may not be able to co-exist with HOTPLUG_MEMORY at runtime,
but for build testing there is no reason not to allow them together.

This hopefully means better build coverage and fewer embarrasing silly
problems like the one fixed by commit 9db4f36e82 ("mm: remove unused
variable in memory hotplug") in the future.

Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 16:23:01 -07:00
Arnd Bergmann 867dfe3421 nvdimm: make CONFIG_NVDIMM_DAX 'bool'
A bugfix just tried to address a randconfig build problem and introduced
a variant of the same problem: with CONFIG_LIBNVDIMM=y and
CONFIG_NVDIMM_DAX=m, the nvdimm module now fails to link:

drivers/nvdimm/built-in.o: In function `to_nd_device_type':
bus.c:(.text+0x1b5d): undefined reference to `is_nd_dax'
drivers/nvdimm/built-in.o: In function `nd_region_notify_driver_action.constprop.2':
region_devs.c:(.text+0x6b6c): undefined reference to `is_nd_dax'
region_devs.c:(.text+0x6b8c): undefined reference to `to_nd_dax'
drivers/nvdimm/built-in.o: In function `nd_region_probe':
region.c:(.text+0x70f3): undefined reference to `nd_dax_create'
drivers/nvdimm/built-in.o: In function `mode_show':
namespace_devs.c:(.text+0xa196): undefined reference to `is_nd_dax'
drivers/nvdimm/built-in.o: In function `nvdimm_namespace_common_probe':
(.text+0xa55f): undefined reference to `is_nd_dax'
drivers/nvdimm/built-in.o: In function `nvdimm_namespace_common_probe':
(.text+0xa56e): undefined reference to `to_nd_dax'

This reverts the earlier fix, making NVDIMM_DAX a 'bool' option again
as it should be (it gets linked into the libnvdimm module). To fix
the original problem, I'm adding a dependency on LIBNVDIMM to
DEV_DAX_PMEM, which ensures we can't have that one built-in if the
rest is a module.

Fixes: 4e65e9381c ("/dev/dax: fix Kconfig dependency build breakage")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-10-27 16:16:21 -07:00
Linus Torvalds 9db4f36e82 mm: remove unused variable in memory hotplug
When I removed the per-zone bitlock hashed waitqueues in commit
9dcb8b685f ("mm: remove per-zone hashtable of bitlock waitqueues"), I
removed all the magic hotplug memory initialization of said waitqueues
too.

But when I actually _tested_ the resulting build, I stupidly assumed
that "allmodconfig" would enable memory hotplug.  And it doesn't,
because it enables KASAN instead, which then disables hotplug memory
support.

As a result, my build test of the per-zone waitqueues was totally
broken, and I didn't notice that the compiler warns about the now unused
iterator variable 'i'.

I guess I should be happy that that seems to be the worst breakage from
my clearly horribly failed test coverage.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 15:49:12 -07:00
Linus Torvalds 4e68af0b06 Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "I2C has some driver bugfixes, module autoload fixes, and driver
  enablement on some architectures"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: imx: defer probe if bus recovery GPIOs are not ready
  i2c: designware: Avoid aborted transfers with fast reacting I2C slaves
  i2c: i801: Fix I2C Block Read on 8-Series/C220 and later
  i2c: xgene: Avoid dma_buffer overrun
  i2c: digicolor: Fix module autoload
  i2c: xlr: Fix module autoload for OF registration
  i2c: xlp9xx: Fix module autoload
  i2c: jz4780: Fix module autoload
  i2c: allow configuration of imx driver for ColdFire architecture
  i2c: mark device nodes only in case of successful instantiation
  i2c: rk3x: Give the tuning value 0 during rk3x_i2c_v0_calc_timings
  i2c: hix5hd2: allow build with ARCH_HISI
2016-10-27 15:06:29 -07:00
Linus Torvalds 7f2145b0d0 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal updates from Zhang Rui:
 "The latest Thermal Management updates for v4.9-rc3:

   - Fix a regression introduced by commit
     b721ca0d19(thermal/powerclamp: remove cpu whitelist), that
     powerclamp driver checks cpu support in a wrong way. From: Eric
     Ernst.

   - Fix a problem that intel_pch_thermal driver misses passive trip
     point when the PCH thermal device has an ACPI companion device
     associated. From: Srinivas Pandruvada.

   - Add missing support for Haswell PCH thermal sensor. From: Srinivas
     Pandruvada"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  thermal/powerclamp: correct cpu support check
  thermal: intel_pch_thermal: Enable Haswell PCH
  thermal: intel_pch_thermal: Add an ACPI passive trip
2016-10-27 14:33:08 -07:00
Linus Torvalds 55bea71ed5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "A few more s390 patches for 4.9:
   - a fix for an overflow in the dasd driver reported by UBSAN
   - fix a regression and add hotplug memory to the zone movable again
   - add ignore defines for the pkey system calls
   - fix the ouput of the merged stack tracer
   - replace printk with pr_cont in arch/s390 where appropriate
   - remove the arch specific return_address function again
   - ignore reserved channel paths at boot time
   - add a missing hugetlb_bad_size call to the arch backend"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mm: fix zone calculation in arch_add_memory()
  s390/dumpstack: use pr_cont within show_stack and die
  s390/dumpstack: get rid of return_address again
  s390/disassambler: use pr_cont where appropriate
  s390/dumpstack: use pr_cont where appropriate
  s390/dumpstack: restore reliable indicator for call traces
  s390/mm: use hugetlb_bad_size()
  s390/cio: don't register chpids in reserved state
  s390: ignore pkey system calls
  s390/dasd: avoid undefined behaviour
2016-10-27 14:16:30 -07:00
Huaibin Wang 599b076d15 i40e: fix call of ndo_dflt_bridge_getlink()
Order of arguments is wrong.
The wrong code has been introduced by commit 7d4f8d871a, but is compiled
only since commit 9df70b6641.

Note that this may break netlink dumps.

Fixes: 9df70b6641 ("i40e: Remove incorrect #ifdef's")
Fixes: 7d4f8d871a ("switchdev; add VLAN support for port's bridge_getlink")
CC: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Huaibin Wang <huaibin.wang@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-10-27 14:12:52 -07:00
Jamal Hadi Salim 9ee7837449 net sched filters: fix notification of filter delete with proper handle
Daniel says:

While trying out [1][2], I noticed that tc monitor doesn't show the
correct handle on delete:

$ tc monitor
qdisc clsact ffff: dev eno1 parent ffff:fff1
filter dev eno1 ingress protocol all pref 49152 bpf handle 0x2a [...]
deleted filter dev eno1 ingress protocol all pref 49152 bpf handle 0xf3be0c80

some context to explain the above:
The user identity of any tc filter is represented by a 32-bit
identifier encoded in tcm->tcm_handle. Example 0x2a in the bpf filter
above. A user wishing to delete, get or even modify a specific filter
uses this handle to reference it.
Every classifier is free to provide its own semantics for the 32 bit handle.
Example: classifiers like u32 use schemes like 800:1:801 to describe
the semantics of their filters represented as hash table, bucket and
node ids etc.
Classifiers also have internal per-filter representation which is different
from this externally visible identity. Most classifiers set this
internal representation to be a pointer address (which allows fast retrieval
of said filters in their implementations). This internal representation
is referenced with the "fh" variable in the kernel control code.

When a user successfuly deletes a specific filter, by specifying the correct
tcm->tcm_handle, an event is generated to user space which indicates
which specific filter was deleted.

Before this patch, the "fh" value was sent to user space as the identity.
As an example what is shown in the sample bpf filter delete event above
is 0xf3be0c80. This is infact a 32-bit truncation of 0xffff8807f3be0c80
which happens to be a 64-bit memory address of the internal filter
representation (address of the corresponding filter's struct cls_bpf_prog);

After this patch the appropriate user identifiable handle as encoded
in the originating request tcm->tcm_handle is generated in the event.
One of the cardinal rules of netlink rules is to be able to take an
event (such as a delete in this case) and reflect it back to the
kernel and successfully delete the filter. This patch achieves that.

Note, this issue has existed since the original TC action
infrastructure code patch back in 2004 as found in:
https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/

[1] http://patchwork.ozlabs.org/patch/682828/
[2] http://patchwork.ozlabs.org/patch/682829/

Fixes: 4e54c4816bfe ("[NET]: Add tc extensions infrastructure.")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27 17:12:33 -04:00
Linus Torvalds 7618c6a17f (Quoting from the MAINTAINERS commit:)
Being a Linux kernel maintainer has been my proudest professional
 accomplishment, spanning the last 19 years.  But now we have a surfeit
 of excellent hackers, and I can hand this over without regret.
 
 I'll still be around as co-maintainer for another cycle, but Jessica
 is now the one to convince if you want your patches applied.  She
 rocks, and is far more timely than me too!
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYD+4/AAoJENkgDmzRrbjx4P4P/2sfq3YIeSMrxcsk7wduWNsR
 U+bzrclIkMVnOTGH+t5awFB/vg0VFCywzUs4rqtZ7k4Wb7Ix7MmQD+5Xoe22SIju
 r8uHpJokzkHW1sC+JhOy3rJXivizmdgaVcHZ7U67gMhSD11fZOvIeoa99vs8xzZp
 CC1r0BcllV6o+cNbPoNNm94Ot8GU0//VB3HcwxQoONxGy6bHhbht0ZayPdwV+ARd
 7e8g5PYAEEg9aWDMXNfV/nBEjM1OeiEm9rWuyCJtv5H3eq63Y338KEsTuHQs9GCD
 31+ReFlSSL1HR6+3WCglN9o+rmcwOls0mtEKrAlv7M3tE8M8A28BDMSdwfx7QntD
 PRY9L9h8GWEpf09kM+AffFmrUObhVhpMpIMZuFchuEF/iiZFkzPRrAorVxvmBgaR
 XLTK3oVd2leziIY4uI3JyDEJuQrW3TSOSTcp9AYDteMh1wX1fj9QMO4ayqmeMH+b
 j11mKs2c1guQuWP2nl5vRRZ25vhFH1WdkittAKHzzytxKGdCKOlQdOhPXFQvVc71
 cvx256dTlgjG8HDUhUgNyQdacj6sOMQzk6zxcQkc2pW6edDNBQuC42Tzkb+ElJgv
 rcX7r0ZmoOefGyakx9C0mVkkyNVahWnF5eB0GLebC4ksbMqS5WhrPcHEB7Ukcg9J
 oAxz6kLsGMdI0JOT/3wQ
 =8rM2
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module maintainership updates from Rusty Russell:
 "(Quoting from the MAINTAINERS commit:)

  Being a Linux kernel maintainer has been my proudest professional
  accomplishment, spanning the last 19 years. But now we have a surfeit
  of excellent hackers, and I can hand this over without regret.

  I'll still be around as co-maintainer for another cycle, but Jessica
  is now the one to convince if you want your patches applied. She
  rocks, and is far more timely than me too!"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  MAINTAINERS: Begin module maintainer transition
2016-10-27 14:12:04 -07:00
Guilherme G Piccoli 4c95aa5d8f i40e: disable MSI-X interrupts if we cannot reserve enough vectors
If we fail on allocating enough MSI-X interrupts, we should disable
them since they were previously enabled in this point of code.

Not disabling them can lead to WARN_ON() being triggered and subsequent
failure in enabling MSI as a fallback; the below message was shown without
this patch while we played with interrupt allocation in i40e driver:

[ 21.461346] sysfs: cannot create duplicate filename '/devices/pci0007:00/0007:00:00.0/0007:01:00.3/msi_irqs'
[ 21.461459] ------------[ cut here ]------------
[ 21.461514] WARNING: CPU: 64 PID: 1155 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x88/0xc0

Also, we noticed that without this patch, if we modprobe the module without
enough MSI-X interrupts (triggering the above warning), unload the module
and re-load it again, we got a crash on the system.

Signed-off-by: Guilherme G Piccoli <gpiccoli@linux.vnet.ibm.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-10-27 14:10:49 -07:00
David Ertman ea6acb7ef7 i40e: Fix configure TCs after initial DCB disable
in commit a036244c06 a fix
was put into place to avoid a kernel panic when a non-
supported traffic class configuration was put into place
and then lldp was enabled/disabled on the link partner
switch.  This fix caused it to be necessary to
unload/reload the driver to reenable DCB once a supported
TC config was in place.

The root cause of the original panic was that the function
i40e_pf_get_default_tc was allowing for a default TC other
than TC 0, and only TC 0 is supported as a default.

This patch removes the get_default_tc function and replaces
it with a #define since there is only one TC supported as
a default.

Change-Id: I448371974e946386d0a7718d73668b450b7c72ef
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Ronald Bynoe <ronald.j.bynoe@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-10-27 14:08:56 -07:00
Emil Tantilov a3b8cb1f84 ixgbe: fix panic when using macvlan with l2-fwd-offload enabled
Fix NULL pointer dereference in the case where a macvlan interface is
brought up while the PF is still down:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffffa0170fb2>] ixgbe_alloc_rx_buffers+0x42/0x1a0 [ixgbe]

Call Trace:
[<ffffffffa017336b>] ixgbe_configure_rx_ring+0x2eb/0x3d0 [ixgbe]
[<ffffffffa0173811>] ixgbe_fwd_ring_up+0xd1/0x380 [ixgbe]
[<ffffffffa0179709>] ixgbe_fwd_add+0x149/0x230 [ixgbe]
[<ffffffffa0113480>] macvlan_open+0x260/0x2b0 [macvlan]

Reported-by: Matthew Garrett <mjg59@coreos.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-10-27 14:07:21 -07:00
Colin Ian King c121f72a66 net: bgmac: fix spelling mistake: "connecton" -> "connection"
trivial fix to spelling mistake in dev_err message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27 16:37:24 -04:00
Arnd Bergmann bc72f3dd89 flow_dissector: fix vlan tag handling
gcc warns about an uninitialized pointer dereference in the vlan
priority handling:

net/core/flow_dissector.c: In function '__skb_flow_dissect':
net/core/flow_dissector.c:281:61: error: 'vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]

As pointed out by Jiri Pirko, the variable is never actually used
without being initialized first as the only way it end up uninitialized
is with skb_vlan_tag_present(skb)==true, and that means it does not
get accessed.

However, the warning hints at some related issues that I'm addressing
here:

- the second check for the vlan tag is different from the first one
  that tests the skb for being NULL first, causing both the warning
  and a possible NULL pointer dereference that was not entirely fixed.
- The same patch that introduced the NULL pointer check dropped an
  earlier optimization that skipped the repeated check of the
  protocol type
- The local '_vlan' variable is referenced through the 'vlan' pointer
  but the variable has gone out of scope by the time that it is
  accessed, causing undefined behavior

Caching the result of the 'skb && skb_vlan_tag_present(skb)' check
in a local variable allows the compiler to further optimize the
later check. With those changes, the warning also disappears.

Fixes: 3805a938a6 ("flow_dissector: Check skb for VLAN only if skb specified.")
Fixes: d5709f7ab7 ("flow_dissector: For stripped vlan, get vlan info from skb->vlan_tci")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Eric Garver <e@erig.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27 16:36:03 -04:00
David Ahern d5d32e4b76 net: ipv6: Do not consider link state for nexthop validation
Similar to IPv4, do not consider link state when validating next hops.

Currently, if the link is down default routes can fail to insert:
 $ ip -6 ro add vrf blue default via 2100:2::64 dev eth2
 RTNETLINK answers: No route to host

With this patch the command succeeds.

Fixes: 8c14586fc3 ("net: ipv6: Use passed in table for nexthop lookups")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27 16:33:12 -04:00