Commit Graph

5448 Commits

Author SHA1 Message Date
Gautham R. Shenoy bd00a240dc powerpc/powernv: Fix restore of SPRs upon wake up from hypervisor state loss
pnv_wakeup_tb_loss() currently expects cr4 to be "eq" if the CPU is
waking up from a complete hypervisor state loss. Hence, it currently
restores the SPR contents only if cr4 is "eq".

However, after commit bcef83a00d ("powerpc/powernv: Add platform
support for stop instruction"), on ISA v3.0 CPUs, the function
pnv_restore_hyp_resource() sets cr4 to contain the result of the
comparison between the state the CPU has woken up from and the first
deep stop state before calling pnv_wakeup_tb_loss().

Thus if the CPU woke up from a state that is deeper than the first
deep stop state, cr4 will have "gt" set and hence, pnv_wakeup_tb_loss()
will fail to restore the SPRs on waking up from such a state.

Fix the code in pnv_wakeup_tb_loss() to restore the SPR states when cr4
is "eq" or "gt".

Fixes: bcef83a00d ("powerpc/powernv: Add platform support for stop instruction")
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Shreyas B. Prabhu <shreyasbp@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-12 12:45:50 +10:00
Paolo Bonzini 3f25777499 powerpc: move hmi.c to arch/powerpc/kvm/
hmi.c functions are unused unless sibling_subcore_state is nonzero, and
that in turn happens only if KVM is in use.  So move the code to
arch/powerpc/kvm/, putting it under CONFIG_KVM_BOOK3S_HV_POSSIBLE
rather than CONFIG_PPC_BOOK3S_64.  The sibling_subcore_state is also
included in struct paca_struct only if KVM is supported by the kernel.

Cc: Daniel Axtens <dja@axtens.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: kvm-ppc@vger.kernel.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-09 16:18:07 +10:00
Cyril Bur 78a3e8889b powerpc: signals: Discard transaction state from signal frames
Userspace can begin and suspend a transaction within the signal
handler which means they might enter sys_rt_sigreturn() with the
processor in suspended state.

sys_rt_sigreturn() wants to restore process context (which may have
been in a transaction before signal delivery). To do this it must
restore TM SPRS. To achieve this, any transaction initiated within the
signal frame must be discarded in order to be able to restore TM SPRs
as TM SPRs can only be manipulated non-transactionally..
>From the PowerPC ISA:
  TM Bad Thing Exception [Category: Transactional Memory]
   An attempt is made to execute a mtspr targeting a TM register in
   other than Non-transactional state.

Not doing so results in a TM Bad Thing:
[12045.221359] Kernel BUG at c000000000050a40 [verbose debug info unavailable]
[12045.221470] Unexpected TM Bad Thing exception at c000000000050a40 (msr 0x201033)
[12045.221540] Oops: Unrecoverable exception, sig: 6 [#1]
[12045.221586] SMP NR_CPUS=2048 NUMA PowerNV
[12045.221634] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE
 nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter
 ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables kvm_hv kvm
 uio_pdrv_genirq ipmi_powernv uio powernv_rng ipmi_msghandler autofs4 ses enclosure
 scsi_transport_sas bnx2x ipr mdio libcrc32c
[12045.222167] CPU: 68 PID: 6178 Comm: sigreturnpanic Not tainted 4.7.0 #34
[12045.222224] task: c0000000fce38600 ti: c0000000fceb4000 task.ti: c0000000fceb4000
[12045.222293] NIP: c000000000050a40 LR: c0000000000163bc CTR: 0000000000000000
[12045.222361] REGS: c0000000fceb7ac0 TRAP: 0700   Not tainted (4.7.0)
[12045.222418] MSR: 9000000300201033 <SF,HV,ME,IR,DR,RI,LE,TM[SE]> CR: 28444280  XER: 20000000
[12045.222625] CFAR: c0000000000163b8 SOFTE: 0 PACATMSCRATCH: 900000014280f033
GPR00: 01100000b8000001 c0000000fceb7d40 c00000000139c100 c0000000fce390d0
GPR04: 900000034280f033 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000000 b000000000001033 0000000000000001 0000000000000000
GPR12: 0000000000000000 c000000002926400 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 00003ffff98cadd0 00003ffff98cb470 0000000000000000
GPR28: 900000034280f033 c0000000fceb7ea0 0000000000000001 c0000000fce390d0
[12045.223535] NIP [c000000000050a40] tm_restore_sprs+0xc/0x1c
[12045.223584] LR [c0000000000163bc] tm_recheckpoint+0x5c/0xa0
[12045.223630] Call Trace:
[12045.223655] [c0000000fceb7d80] [c000000000026e74] sys_rt_sigreturn+0x494/0x6c0
[12045.223738] [c0000000fceb7e30] [c0000000000092e0] system_call+0x38/0x108
[12045.223806] Instruction dump:
[12045.223841] 7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8
[12045.223955] 4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020
[12045.224074] ---[ end trace cb8002ee240bae76 ]---

It isn't clear exactly if there is really a use case for userspace
returning with a suspended transaction, however, doing so doesn't (on
its own) constitute a bad frame. As such, this patch simply discards
the transactional state of the context calling the sigreturn and
continues.

Reported-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Acked-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-29 12:48:40 +10:00
Nicholas Piggin cc7786d3ee powerpc/tm: do not use r13 for tabort_syscall
tabort_syscall runs with RI=1, so a nested recoverable machine
check will load the paca into r13 and overwrite what we loaded
it with, because exceptions returning to privileged mode do not
restore r13.

Fixes: b4b56f9eca (powerpc/tm: Abort syscalls in active transactions)
Cc: stable@vger.kernel.org
Signed-off-by: Nick Piggin <npiggin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-29 12:47:56 +10:00
Josh Poimboeuf 9a7c348ba6 ftrace: Add return address pointer to ftrace_ret_stack
Storing this value will help prevent unwinders from getting out of sync
with the function graph tracer ret_stack.  Now instead of needing a
stateful iterator, they can compare the return address pointer to find
the right ret_stack entry.

Note that an array of 50 ftrace_ret_stack structs is allocated for every
task.  So when an arch implements this, it will add either 200 or 400
bytes of memory usage per task (depending on whether it's a 32-bit or
64-bit platform).

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a95cfcc39e8f26b89a430c56926af0bb217bc0a1.1471607358.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:15:14 +02:00
Paolo Bonzini 7c379526d7 powerpc: move hmi.c to arch/powerpc/kvm/
hmi.c functions are unused unless sibling_subcore_state is nonzero, and
that in turn happens only if KVM is in use.  So move the code to
arch/powerpc/kvm/, putting it under CONFIG_KVM_BOOK3S_HV_POSSIBLE
rather than CONFIG_PPC_BOOK3S_64.  The sibling_subcore_state is also
included in struct paca_struct only if KVM is supported by the kernel.

Cc: Daniel Axtens <dja@axtens.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: kvm-ppc@vger.kernel.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
Nicholas Piggin a74599a504 powerpc/pseries: PACA save area fix for MCE vs MCE
MCE must not enable MSR_RI until PACA_EXMC is no longer being used.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
Nicholas Piggin 3f3b5dc14c powerpc/pseries: PACA save area fix for general exception vs MCE
MCE must not use PACA_EXGEN. When a general exception enables MSR_RI,
that means SPRN_SRR[01] and SPRN_SPRG are no longer used. However the
PACA save area is still in use.
Acked-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
Michael Ellerman 66443efa83 powerpc/prom: Fix sub-processor option passed to ibm, client-architecture-support
When booting from an OpenFirmware which supports it, we use the
"ibm,client-architecture-support" firmware call to communicate
our capabilities to firmware.

The format of the structure we pass to firmware is specified in
PAPR (Power Architecture Platform Requirements), or the public version
LoPAPR (Linux on Power Architecture Platform Reference).

Referring to table 244 in LoPAPR v1.1, option vector 5 contains a 4 byte
field at bytes 17-20 for the "Platform Facilities Enable". This is
followed by a 1 byte field at byte 21 for "Sub-Processor Represenation
Level".

Comparing to the code, there we have the Platform Facilities
options (OV5_PFO_*) at byte 17, but we fail to pad that field out to its
full width of 4 bytes. This means the OV5_SUB_PROCESSORS option is
incorrectly placed at byte 18.

Fix it by adding zero bytes for bytes 18, 19, 20, and comment the bytes
to hopefully make it clearer in future.

As far as I'm aware nothing actually consumes this value at this time,
so the effect of this bug is nil in practice.

It does mean we've been incorrectly setting bit 15 of the "Platform
Facilities Enable" option for the past ~3 1/2 years, so we should avoid
allocating that bit to anything else in future.

Fixes: df77c79920 ("powerpc/pseries: Update ibm,architecture.vec for PAPR 2.7/POWER8")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
Boqun Feng 19ab58d19e powerpc, hotplug: Avoid to touch non-existent cpumasks.
We observed a kernel oops when running a PPC guest with config NR_CPUS=4
and qemu option "-smp cores=1,threads=8":

[   30.634781] Unable to handle kernel paging request for data at
address 0xc00000014192eb17
[   30.636173] Faulting instruction address: 0xc00000000003e5cc
[   30.637069] Oops: Kernel access of bad area, sig: 11 [#1]
[   30.637877] SMP NR_CPUS=4 NUMA pSeries
[   30.638471] Modules linked in:
[   30.638949] CPU: 3 PID: 27 Comm: migration/3 Not tainted
4.7.0-07963-g9714b26 #1
[   30.640059] task: c00000001e29c600 task.stack: c00000001e2a8000
[   30.640956] NIP: c00000000003e5cc LR: c00000000003e550 CTR:
0000000000000000
[   30.642001] REGS: c00000001e2ab8e0 TRAP: 0300   Not tainted
(4.7.0-07963-g9714b26)
[   30.643139] MSR: 8000000102803033 <SF,VEC,VSX,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 22004084  XER: 00000000
[   30.644583] CFAR: c000000000009e98 DAR: c00000014192eb17 DSISR: 40000000 SOFTE: 0
GPR00: c00000000140a6b8 c00000001e2abb60 c0000000016dd300 0000000000000003
GPR04: 0000000000000000 0000000000000004 c0000000016e5920 0000000000000008
GPR08: 0000000000000004 c00000014192eb17 0000000000000000 0000000000000020
GPR12: c00000000140a6c0 c00000000ffffc00 c0000000000d3ea8 c00000001e005680
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 c00000001e6b3a00 0000000000000000 0000000000000001
GPR24: c00000001ff85138 c00000001ff85130 000000001eb6f000 0000000000000001
GPR28: 0000000000000000 c0000000017014e0 0000000000000000 0000000000000018
[   30.653882] NIP [c00000000003e5cc] __cpu_disable+0xcc/0x190
[   30.654713] LR [c00000000003e550] __cpu_disable+0x50/0x190
[   30.655528] Call Trace:
[   30.655893] [c00000001e2abb60] [c00000000003e550] __cpu_disable+0x50/0x190 (unreliable)
[   30.657280] [c00000001e2abbb0] [c0000000000aca0c] take_cpu_down+0x5c/0x100
[   30.658365] [c00000001e2abc10] [c000000000163918] multi_cpu_stop+0x1a8/0x1e0
[   30.659617] [c00000001e2abc60] [c000000000163cc0] cpu_stopper_thread+0xf0/0x1d0
[   30.660737] [c00000001e2abd20] [c0000000000d8d70] smpboot_thread_fn+0x290/0x2a0
[   30.661879] [c00000001e2abd80] [c0000000000d3fa8] kthread+0x108/0x130
[   30.662876] [c00000001e2abe30] [c000000000009968] ret_from_kernel_thread+0x5c/0x74
[   30.664017] Instruction dump:
[   30.664477] 7bde1f24 38a00000 787f1f24 3b600001 39890008 7d204b78 7d05e214 7d0b07b4
[   30.665642] 796b1f24 7d26582a 7d204a14 7d29f214 <7d4048a8> 7d4a3878 7d4049ad 40c2fff4
[   30.666854] ---[ end trace 32643b7195717741 ]---

The reason of this is that in __cpu_disable(), when we try to set the
cpu_sibling_mask or cpu_core_mask of the sibling CPUs of the disabled
one, we don't check whether the current configuration employs those
sibling CPUs(hw threads). And if a CPU is not employed by a
configuration, the percpu structures cpu_{sibling,core}_mask are not
allocated, therefore accessing those cpumasks will result in problems as
above.

This patch fixes this problem by adding an addition check on whether the
id is no less than nr_cpu_ids in the sibling CPU iteration code.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
Paul Gortmaker 8a39b05f08 powerpc: migrate exception table users off module.h and onto extable.h
These files were only including module.h for exception table
related functions.  We've now separated that content out into its
own file "extable.h" so now move over to that and avoid all the
extra header content in module.h that we don't really need to compile
these files.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
Mauricio Faria de Oliveira 2dd9c11b9d powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)
This patch leverages 'struct pci_host_bridge' from the PCI subsystem
in order to free the pci_controller only after the last reference to
its devices is dropped (avoiding an oops in pcibios_release_device()
if the last reference is dropped after pcibios_free_controller()).

The patch relies on pci_host_bridge.release_fn() (and .release_data),
which is called automatically by the PCI subsystem when the root bus
is released (i.e., the last reference is dropped).  Those fields are
set via pci_set_host_bridge_release() (e.g. in the platform-specific
implementation of pcibios_root_bridge_prepare()).

It introduces the 'pcibios_free_controller_deferred()' .release_fn()
and it expects .release_data to hold a pointer to the pci_controller.

The function implictly calls 'pcibios_free_controller()', so an user
must *NOT* explicitly call it if using the new _deferred() callback.

The functionality is enabled for pseries (although it isn't platform
specific, and may be used by cxl).

Details on not-so-elegant design choices:

 - Use 'pci_host_bridge.release_data' field as pointer to associated
   'struct pci_controller' so *not* to 'pci_bus_to_host(bridge->bus)'
   in pcibios_free_controller_deferred().

   That's because pci_remove_root_bus() sets 'host_bridge->bus = NULL'
   (so, if the last reference is released after pci_remove_root_bus()
   runs, which eventually reaches pcibios_free_controller_deferred(),
   that would hit a null pointer dereference).

   The cxl/vphb.c code calls pci_remove_root_bus(), and the cxl folks
   are interested in this fix.

Test-case #1 (hold references)

  # ls -ld /sys/block/sd* | grep -m1 0021:01:00.0
  <...> /sys/block/sdaa -> ../devices/pci0021:01/0021:01:00.0/<...>

  # ls -ld /sys/block/sd* | grep -m1 0021:01:00.1
  <...> /sys/block/sdab -> ../devices/pci0021:01/0021:01:00.1/<...>

  # cat >/dev/sdaa & pid1=$!
  # cat >/dev/sdab & pid2=$!

  # drmgr -w 5 -d 1 -c phb -s 'PHB 33' -r
  Validating PHB DLPAR capability...yes.
  [  594.306719] pci_hp_remove_devices: PCI: Removing devices on bus 0021:01
  [  594.306738] pci_hp_remove_devices:    Removing 0021:01:00.0...
  ...
  [  598.236381] pci_hp_remove_devices:    Removing 0021:01:00.1...
  ...
  [  611.972077] pci_bus 0021:01: busn_res: [bus 01-ff] is released
  [  611.972140] rpadlpar_io: slot PHB 33 removed

  # kill -9 $pid1
  # kill -9 $pid2
  [  632.918088] pcibios_free_controller_deferred: domain 33, dynamic 1

Test-case #2 (don't hold references)

  # drmgr -w 5 -d 1 -c phb -s 'PHB 33' -r
  Validating PHB DLPAR capability...yes.
  [  916.357363] pci_hp_remove_devices: PCI: Removing devices on bus 0021:01
  [  916.357386] pci_hp_remove_devices:    Removing 0021:01:00.0...
  ...
  [  920.566527] pci_hp_remove_devices:    Removing 0021:01:00.1...
  ...
  [  933.955873] pci_bus 0021:01: busn_res: [bus 01-ff] is released
  [  933.955977] pcibios_free_controller_deferred: domain 33, dynamic 1
  [  933.955999] rpadlpar_io: slot PHB 33 removed

Suggested-By: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> # cxl
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
Nicholas Piggin b9a4a0d02c powerpc/vdso: Fix build rules to rebuild vdsos correctly
When using if_changed, we need to add FORCE as a dependency (see
Documentation/kbuild/makefiles.txt) otherwise we don't get command line
change checking amongst other things. This has resulted in vdsos not
being rebuilt when switching between big and little endian.

The vdso64/32ld commands have to be changed around to avoid pulling
FORCE into the linker command line (code copied from x86).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-10 23:04:12 +10:00
Benjamin Herrenschmidt 97f6e0cc35 powerpc/32: Fix crash during static key init
We cannot do those initializations from apply_feature_fixups() as
this function runs in a very restricted environment on 32-bit where
the kernel isn't running at its linked address and the PTRRELOC()
macro must be used for any global accesss.

Instead, split them into a separtate steup_feature_keys() function
which is called in a more suitable spot on ppc32.

Fixes: 309b315b6e ("powerpc: Call jump_label_init() in apply_feature_fixups()")
Reported-and-tested-by: Christian Kujau <lists@nerdbynature.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-10 19:41:58 +10:00
Benjamin Herrenschmidt f9cc1d1f80 powerpc: Update obsolete comment in setup_32.c about early_init()
We don't identify the machine type anymore...

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-10 19:38:02 +10:00
Benjamin Herrenschmidt 7d70c63c71 powerpc: Print the kernel load address at the end of prom_init()
This makes it easier to debug crashes that happen very early before
the kernel takes over Open Firmware by allowing us to relate the OF
reported crashing addresses to offsets within the kernel.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-10 19:37:43 +10:00
Cyril Bur c7a318ba86 powerpc/ptrace: Fix coredump since ptrace TM changes
Commit 8d460f6156 ("powerpc/process: Add the function
flush_tmregs_to_thread") added flush_tmregs_to_thread() and included
the assumption that it would only be called for a task which is not
current.

Although this is correct for ptrace, when generating a core dump, some
of the routines which call flush_tmregs_to_thread() are called. This
leads to a WARNing such as:

  Not expecting ptrace on self: TM regs may be incorrect
  ------------[ cut here ]------------
  WARNING: CPU: 123 PID: 7727 at arch/powerpc/kernel/process.c:1088 flush_tmregs_to_thread+0x78/0x80
  CPU: 123 PID: 7727 Comm: libvirtd Not tainted 4.8.0-rc1-gcc6x-g61e8a0d #1
  task: c000000fe631b600 task.stack: c000000fe63b0000
  NIP: c00000000001a1a8 LR: c00000000001a1a4 CTR: c000000000717780
  REGS: c000000fe63b3420 TRAP: 0700   Not tainted  (4.8.0-rc1-gcc6x-g61e8a0d)
  MSR: 900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 28004222  XER: 20000000
  ...
  NIP [c00000000001a1a8] flush_tmregs_to_thread+0x78/0x80
  LR [c00000000001a1a4] flush_tmregs_to_thread+0x74/0x80
  Call Trace:
   flush_tmregs_to_thread+0x74/0x80 (unreliable)
   vsr_get+0x64/0x1a0
   elf_core_dump+0x604/0x1430
   do_coredump+0x5fc/0x1200
   get_signal+0x398/0x740
   do_signal+0x54/0x2b0
   do_notify_resume+0x98/0xb0
   ret_from_except_lite+0x70/0x74

So fix flush_tmregs_to_thread() to detect the case where it is called on
current, and a transaction is active, and in that case flush the TM regs
to the thread_struct.

This patch also moves flush_tmregs_to_thread() into ptrace.c as it is
only called from that file.

Fixes: 8d460f6156 ("powerpc/process: Add the function flush_tmregs_to_thread")
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
[mpe: Flesh out change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-10 16:34:20 +10:00
Mahesh Salgaonkar c74dd88e77 powerpc/book3s: Fix MCE console messages for unrecoverable MCE.
When machine check occurs with MSR(RI=0), it means MC interrupt is
unrecoverable and kernel goes down to panic path. But the console
message still shows it as recovered. This patch fixes the MCE console
messages.

Fixes: 36df96f8ac ("powerpc/book3s: Decode and save machine check event.")
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09 19:46:54 +10:00
Michael Ellerman 61e8a0d5a0 powerpc/pci: Fix endian bug in fixed PHB numbering
The recent commit 63a72284b1 ("powerpc/pci: Assign fixed PHB number
based on device-tree properties"), added code to read a 64-bit property
from the device tree, and if not found read a 32-bit property (reg).

There was a bug in the 32-bit case, on big endian machines, due to the
use of the 64-bit value to read the 32-bit property. The cast of &prop
means we end up writing to the high 32-bit of prop, leaving the low
32-bits containing whatever junk was on the stack.

If that junk value was non-zero, and < MAX_PHBS, we would end up using
it as the PHB id. This results in users seeing what appear to be random
PHB ids.

Fix it by reading into a u32 property and then assigning that to the
u64 value, letting the CPU do the correct conversions for us.

Fixes: 63a72284b1 ("powerpc/pci: Assign fixed PHB number based on device-tree properties")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09 16:52:03 +10:00
Guilherme G. Piccoli 10560b9afc powerpc/eeh: Switch to conventional PCI address output in EEH log
This is a very minor/trivial fix for the output of PCI address on EEH
logs. The PCI address on "OF node" field currently is using ":" as a
separator for the function, but the usual separator is ".". This patch
changes the separator to dot, so the PCI address is printed as usual.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09 16:52:03 +10:00
Guenter Roeck 546c4402c5 powerpc/vdso: Add missing include file
Some powerpc builds fail with the following buld error.

  In file included from ./arch/powerpc/include/asm/mmu_context.h:11:0,
                   from arch/powerpc/kernel/vdso.c:28:
  arch/powerpc/include/asm/cputhreads.h: In function 'get_tensr':
  arch/powerpc/include/asm/cputhreads.h:101:2: error:
  	implicit declaration of function 'cpu_has_feature'

Fixes: b92a226e52 ("powerpc: Move cpu_has_feature() to a separate file")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09 16:52:00 +10:00
Mahesh Salgaonkar bc14c49195 powerpc/powernv: Fix MCE handler to avoid trashing CR0/CR1 registers.
The current implementation of MCE early handling modifies CR0/1 registers
without saving its old values. Fix this by moving early check for
powersaving mode to machine_check_handle_early().

The power architecture 2.06 or later allows the possibility of getting
machine check while in nap/sleep/winkle. The last bit of HSPRG0 is set
to 1, if thread is woken up from winkle. Hence, clear the last bit of
HSPRG0 (r13) before MCE handler starts using it as paca pointer.

Also, the current code always puts the thread into nap state irrespective
of whatever idle state it woke up from. Fix that by looking at
paca->thread_idle_state and put the thread back into same state where it
came from.

Fixes: 1c51089f77 ("powerpc/book3s: Return from interrupt if coming from evil context.")
Reported-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Reviewed-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09 16:51:35 +10:00
Mahesh Salgaonkar 98d8821a47 powerpc/powernv: Move IDLE_STATE_ENTER_SEQ macro to cpuidle.h
Move IDLE_STATE_ENTER_SEQ macro to cpuidle.h so that MCE handler changes
in subsequent patch can use it.

No functionality change.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09 14:50:20 +10:00
Mahesh Salgaonkar e325d76f8b powerpc/powernv: Load correct TOC pointer while waking up from winkle.
The function pnv_restore_hyp_resource() loads the TOC into r2 from
the invalid PACA pointer before fixing r13 value. This do not affect
POWER ISA 3.0 but it does have an impact on POWER ISA 2.07 or less
leading CPU to get stuck forever.

	login: [  471.830433] Processor 120 is stuck.

This can be easily reproducible using following steps:
- Turn off SMT
	$ ppc64_cpu --smt=off
- offline/online any online cpu (Thread 0 of any core which is online)
	$ echo 0 > /sys/devices/system/cpu/cpu<num>/online
	$ echo 1 > /sys/devices/system/cpu/cpu<num>/online

For POWER ISA 2.07 or less, the last bit of HSPRG0 is set indicating
that thread is waking up from winkle. Hence, the last bit of HSPRG0(r13)
needs to be clear before accessing it as PACA to avoid loading invalid
values from invalid PACA pointer.

Fix this by loading TOC after r13 register is corrected.

Fixes: bcef83a00d ("powerpc/powernv: Add platform support for stop instruction")
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09 14:50:19 +10:00
Al Viro 9445aa1a30 ppc: move exports to definitions
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-08-07 23:50:09 -04:00
Linus Torvalds 6c84239d59 RTC for 4.8
Cleanups:
  - huge cleanup of rtc-generic and char/genrtc this allowed to cleanup rtc-cmos,
   rtc-sh, rtc-m68k, rtc-powerpc and rtc-parisc
  - move mn10300 to rtc-cmos
 
 Subsystem:
  - fix wakealarms after hibernate
  - multiples fixes for rctest
  - simplify implementations of .read_alarm
 
 New drivers:
  - Maxim MAX6916
 
 Drivers:
  - ds1307: fix weekday
  - m41t80: add wakeup support
  - pcf85063: add support for PCF85063A variant
  - rv8803: extend i2c fix and other fixes
  - s35390a: fix alarm reading, this fixes instant reboot after shutdown for QNAP
    TS-41x
  - s3c: clock fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJXokhIAAoJENiigzvaE+LCZqQP+wWzintN/N1u3dKiVB7iSdwq
 +S/jAXD9wW8OK9PI60/YUGRYeUXmZW9t4XYg1VKCxU9KpVC17LgOtDyXD8BufP1V
 uREJEzZw9O7zCCjeHp/ICFjBkc62Net6ZDOO+ZyXPNfddpS1Xq1uUgXLZc/202UR
 ID/kewu0pJRDnoxyqznWn9+8D33w/ygXs2slY2Ive0ONtjdgxGcsj2rNbb2RYn2z
 OP7br3lLg7qkFh4TtXb61eh/9GYIk6wzP/CrX5l/jH4SjQnrIk5g/X/Cd1qQ/qso
 JZzFoonOKvIp5Gw/+fZ9NP3YFcnkoRMv4NjZV8PAmsYLds+ibRiBcoB8u6FmiJV7
 WW5uopgPkfCGN5BV3+QHwJDVe+WlgnlzaT5zPUCcP5KWusDts4fWIgzP7vrtAzf4
 3OJLrgSGdBeOqWnJD21nxKUD27JOseX7D+BFtwxR4lMsXHqlHJfETpZ8gts1ZGH3
 2U353j/jkZvGWmc6dMcuxOXT2K4VqpYeIIqs0IcLu6hM9crtR89zPR2Iu1AilfDW
 h2NroF+Q//SgMMzWoTEG6Tn7RAc7MthgA/tRCFZF9CBMzNs988w0CTHnKsIHmjpU
 UKkMeJGAC9YrPYIcqrg0oYsmLUWXc8JuZbGJBnei3BzbaMTlcwIN9qj36zfq6xWc
 TMLpbWEoIsgFIZMP/hAP
 =rpGB
 -----END PGP SIGNATURE-----

Merge tag 'rtc-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
 "RTC for 4.8

  Cleanups:
   - huge cleanup of rtc-generic and char/genrtc this allowed to cleanup
     rtc-cmos, rtc-sh, rtc-m68k, rtc-powerpc and rtc-parisc
   - move mn10300 to rtc-cmos

  Subsystem:
   - fix wakealarms after hibernate
   - multiples fixes for rctest
   - simplify implementations of .read_alarm

  New drivers:
   - Maxim MAX6916

  Drivers:
   - ds1307: fix weekday
   - m41t80: add wakeup support
   - pcf85063: add support for PCF85063A variant
   - rv8803: extend i2c fix and other fixes
   - s35390a: fix alarm reading, this fixes instant reboot after
     shutdown for QNAP TS-41x
   - s3c: clock fixes"

* tag 'rtc-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (65 commits)
  rtc: rv8803: Clear V1F when setting the time
  rtc: rv8803: Stop the clock while setting the time
  rtc: rv8803: Always apply the I²C workaround
  rtc: rv8803: Fix read day of week
  rtc: rv8803: Remove the check for valid time
  rtc: rv8803: Kconfig: Indicate rx8900 support
  rtc: asm9260: remove .owner field for driver
  rtc: at91sam9: Fix missing spin_lock_init()
  rtc: m41t80: add suspend handlers for alarm IRQ
  rtc: m41t80: make it a real error message
  rtc: pcf85063: Add support for the PCF85063A device
  rtc: pcf85063: fix year range
  rtc: hym8563: in .read_alarm set .tm_sec to 0 to signal minute accuracy
  rtc: explicitly set tm_sec = 0 for drivers with minute accurancy
  rtc: s3c: Add s3c_rtc_{enable/disable}_clk in s3c_rtc_setfreq()
  rtc: s3c: Remove unnecessary call to disable already disabled clock
  rtc: abx80x: use devm_add_action_or_reset()
  rtc: m41t80: use devm_add_action_or_reset()
  rtc: fix a typo and reduce three empty lines to one
  rtc: s35390a: improve two comments in .set_alarm
  ...
2016-08-05 09:48:22 -04:00
Linus Torvalds 2cfd716d27 powerpc updates for 4.8 #2
Fixes:
  - Fix early access to cpu_spec relocation from Benjamin Herrenschmidt
  - Fix incorrect event codes in power9-event-list from Madhavan Srinivasan
  - Move register_process_table() out of ppc_md from Michael Ellerman
 
 Use jump_label for [cpu|mmu]_has_feature() from Aneesh Kumar K.V, Kevin Hao and Michael Ellerman:
  - Add mmu_early_init_devtree() from Michael Ellerman
  - Move disable_radix handling into mmu_early_init_devtree() from Michael Ellerman
  - Do hash device tree scanning earlier from Michael Ellerman
  - Do radix device tree scanning earlier from Michael Ellerman
  - Do feature patching before MMU init from Michael Ellerman
  - Check features don't change after patching from Michael Ellerman
  - Make MMU_FTR_RADIX a MMU family feature from Aneesh Kumar K.V
  - Convert mmu_has_feature() to returning bool from Michael Ellerman
  - Convert cpu_has_feature() to returning bool from Michael Ellerman
  - Define radix_enabled() in one place & use static inline from Michael Ellerman
  - Add early_[cpu|mmu]_has_feature() from Michael Ellerman
  - Convert early cpu/mmu feature check to use the new helpers from Aneesh Kumar K.V
  - jump_label: Make it possible for arches to invoke jump_label_init() earlier from Kevin Hao
  - Call jump_label_init() in apply_feature_fixups() from Aneesh Kumar K.V
  - Remove mfvtb() from Kevin Hao
  - Move cpu_has_feature() to a separate file from Kevin Hao
  - Add kconfig option to use jump labels for cpu/mmu_has_feature() from Michael Ellerman
  - Add option to use jump label for cpu_has_feature() from Kevin Hao
  - Add option to use jump label for mmu_has_feature() from Kevin Hao
  - Catch usage of cpu/mmu_has_feature() before jump label init from Aneesh Kumar K.V
  - Annotate jump label assembly from Michael Ellerman
 
 TLB flush enhancements from Aneesh Kumar K.V:
  - radix: Implement tlb mmu gather flush efficiently
  - Add helper for finding SLBE LLP encoding
  - Use hugetlb flush functions
  - Drop multiple definition of mm_is_core_local
  - radix: Add tlb flush of THP ptes
  - radix: Rename function and drop unused arg
  - radix/hugetlb: Add helper for finding page size
  - hugetlb: Add flush_hugetlb_tlb_range
  - remove flush_tlb_page_nohash
 
 Add new ptrace regsets from Anshuman Khandual and Simon Guo:
  - elf: Add powerpc specific core note sections
  - Add the function flush_tmregs_to_thread
  - Enable in transaction NT_PRFPREG ptrace requests
  - Enable in transaction NT_PPC_VMX ptrace requests
  - Enable in transaction NT_PPC_VSX ptrace requests
  - Adapt gpr32_get, gpr32_set functions for transaction
  - Enable support for NT_PPC_CGPR
  - Enable support for NT_PPC_CFPR
  - Enable support for NT_PPC_CVMX
  - Enable support for NT_PPC_CVSX
  - Enable support for TM SPR state
  - Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
  - Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
  - Enable support for EBB registers
  - Enable support for Performance Monitor registers
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXpGaLAAoJEFHr6jzI4aWA9aYP/1AqmRPJ9D0XVUJWT+FVABUK
 LESESoVFF4Hug1j1F8Synhg5o4SzD2t45iGKbclYaFthOIyovMg7Wr1KSu4hQ0go
 rPuQfpXDNQ8jKdDX8hbPXKUxrNRBNfqJGFo5E7mO6wN9AJ9d1LVwQ+jKAva29Tqs
 LaAlMbQNbeObPNzOl73B73iew3aozr+mXjBqv82lqvgYknBD2CLf24xGG3eNIbq5
 ZZk4LPC8pdkaxnajnzRFzqwiyPWzao0yfpVRKh52TKHBQF/prR/KACb6zUuja/61
 krOfegUKob14OYrehjs6X8XNRLnILRI0u1H5bmj7eVEiY/usyNzE93SMHZM3Wdau
 sQF/Au4OLNXj0ZQdNBtzRsZRyp1d560Gsj+lQGBoPd4hfIWkFYHvxzxsUSdqv4uA
 MWDMwN0Vvfk0cpprsabsWNevkaotYYBU00px5hF/e5ZUc9/x/xYUVMgPEDr0QZLr
 cHJo9/Pjk4u/0g4lj+2y1LLl/0tNEZZg69O6bvffPAPVSS4/P4y/bKKYd4I0zL99
 Ykp91mSmkl70F3edgOSFqyda2gN2l2Ekb/i081YGXheFy1rbD29Vxv82BOVog4KY
 ibvOqp38WDzCVk5OXuCRvBl0VudLKGJYdppU1nXg4KgrTZXHeCAC0E+NzUsgOF4k
 OMvQ+5drVxrno+Hw8FVJ
 =0Q8E
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull more powerpc updates from Michael Ellerman:
 "These were delayed for various reasons, so I let them sit in next a
  bit longer, rather than including them in my first pull request.

  Fixes:
   - Fix early access to cpu_spec relocation from Benjamin Herrenschmidt
   - Fix incorrect event codes in power9-event-list from Madhavan Srinivasan
   - Move register_process_table() out of ppc_md from Michael Ellerman

  Use jump_label use for [cpu|mmu]_has_feature():
   - Add mmu_early_init_devtree() from Michael Ellerman
   - Move disable_radix handling into mmu_early_init_devtree() from Michael Ellerman
   - Do hash device tree scanning earlier from Michael Ellerman
   - Do radix device tree scanning earlier from Michael Ellerman
   - Do feature patching before MMU init from Michael Ellerman
   - Check features don't change after patching from Michael Ellerman
   - Make MMU_FTR_RADIX a MMU family feature from Aneesh Kumar K.V
   - Convert mmu_has_feature() to returning bool from Michael Ellerman
   - Convert cpu_has_feature() to returning bool from Michael Ellerman
   - Define radix_enabled() in one place & use static inline from Michael Ellerman
   - Add early_[cpu|mmu]_has_feature() from Michael Ellerman
   - Convert early cpu/mmu feature check to use the new helpers from Aneesh Kumar K.V
   - jump_label: Make it possible for arches to invoke jump_label_init() earlier from Kevin Hao
   - Call jump_label_init() in apply_feature_fixups() from Aneesh Kumar K.V
   - Remove mfvtb() from Kevin Hao
   - Move cpu_has_feature() to a separate file from Kevin Hao
   - Add kconfig option to use jump labels for cpu/mmu_has_feature() from Michael Ellerman
   - Add option to use jump label for cpu_has_feature() from Kevin Hao
   - Add option to use jump label for mmu_has_feature() from Kevin Hao
   - Catch usage of cpu/mmu_has_feature() before jump label init from Aneesh Kumar K.V
   - Annotate jump label assembly from Michael Ellerman

  TLB flush enhancements from Aneesh Kumar K.V:
   - radix: Implement tlb mmu gather flush efficiently
   - Add helper for finding SLBE LLP encoding
   - Use hugetlb flush functions
   - Drop multiple definition of mm_is_core_local
   - radix: Add tlb flush of THP ptes
   - radix: Rename function and drop unused arg
   - radix/hugetlb: Add helper for finding page size
   - hugetlb: Add flush_hugetlb_tlb_range
   - remove flush_tlb_page_nohash

  Add new ptrace regsets from Anshuman Khandual and Simon Guo:
   - elf: Add powerpc specific core note sections
   - Add the function flush_tmregs_to_thread
   - Enable in transaction NT_PRFPREG ptrace requests
   - Enable in transaction NT_PPC_VMX ptrace requests
   - Enable in transaction NT_PPC_VSX ptrace requests
   - Adapt gpr32_get, gpr32_set functions for transaction
   - Enable support for NT_PPC_CGPR
   - Enable support for NT_PPC_CFPR
   - Enable support for NT_PPC_CVMX
   - Enable support for NT_PPC_CVSX
   - Enable support for TM SPR state
   - Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
   - Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
   - Enable support for EBB registers
   - Enable support for Performance Monitor registers"

* tag 'powerpc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (48 commits)
  powerpc/mm: Move register_process_table() out of ppc_md
  powerpc/perf: Fix incorrect event codes in power9-event-list
  powerpc/32: Fix early access to cpu_spec relocation
  powerpc/ptrace: Enable support for Performance Monitor registers
  powerpc/ptrace: Enable support for EBB registers
  powerpc/ptrace: Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
  powerpc/ptrace: Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
  powerpc/ptrace: Enable support for TM SPR state
  powerpc/ptrace: Enable support for NT_PPC_CVSX
  powerpc/ptrace: Enable support for NT_PPC_CVMX
  powerpc/ptrace: Enable support for NT_PPC_CFPR
  powerpc/ptrace: Enable support for NT_PPC_CGPR
  powerpc/ptrace: Adapt gpr32_get, gpr32_set functions for transaction
  powerpc/ptrace: Enable in transaction NT_PPC_VSX ptrace requests
  powerpc/ptrace: Enable in transaction NT_PPC_VMX ptrace requests
  powerpc/ptrace: Enable in transaction NT_PRFPREG ptrace requests
  powerpc/process: Add the function flush_tmregs_to_thread
  elf: Add powerpc specific core note sections
  powerpc/mm: remove flush_tlb_page_nohash
  powerpc/mm/hugetlb: Add flush_hugetlb_tlb_range
  ...
2016-08-05 09:00:54 -04:00
Krzysztof Kozlowski 00085f1efa dma-mapping: use unsigned long for dma_attrs
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer.  Thus the pointer can point to const data.
However the attributes do not have to be a bitfield.  Instead unsigned
long will do fine:

1. This is just simpler.  Both in terms of reading the code and setting
   attributes.  Instead of initializing local attributes on the stack
   and passing pointer to it to dma_set_attr(), just set the bits.

2. It brings safeness and checking for const correctness because the
   attributes are passed by value.

Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;

    @@
    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

and

    // Options: --all-includes
    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;

    @@
    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-04 08:50:07 -04:00
Linus Torvalds c8d0267efd PCI changes for the v4.8 merge window:
Enumeration
     Move ecam.h to linux/include/pci-ecam.h (Jayachandran C)
     Add parent device field to ECAM struct pci_config_window (Jayachandran C)
     Add generic MCFG table handling (Tomasz Nowicki)
     Refactor pci_bus_assign_domain_nr() for CONFIG_PCI_DOMAINS_GENERIC (Tomasz Nowicki)
     Factor DT-specific pci_bus_find_domain_nr() code out (Tomasz Nowicki)
 
   Resource management
     Add devm_request_pci_bus_resources() (Bjorn Helgaas)
     Unify pci_resource_to_user() declarations (Bjorn Helgaas)
     Implement pci_resource_to_user() with pcibios_resource_to_bus() (microblaze, powerpc, sparc) (Bjorn Helgaas)
     Request host bridge window resources (designware, iproc, rcar, xgene, xilinx, xilinx-nwl) (Bjorn Helgaas)
     Make PCI I/O space optional on ARM32 (Bjorn Helgaas)
     Ignore write combining when mapping I/O port space (Bjorn Helgaas)
     Claim bus resources on MIPS PCI_PROBE_ONLY set-ups (Bjorn Helgaas)
     Remove unicore32 pci=firmware command line parameter handling (Bjorn Helgaas)
     Support I/O resources when parsing host bridge resources (Jayachandran C)
     Add helpers to request/release memory and I/O regions (Johannes Thumshirn)
     Use pci_(request|release)_mem_regions (NVMe, lpfc, GenWQE, ethernet/intel, alx) (Johannes Thumshirn)
     Extend pci=resource_alignment to specify device/vendor IDs (Koehrer Mathias (ETAS/ESW5))
     Add generic pci_bus_claim_resources() (Lorenzo Pieralisi)
     Claim bus resources on ARM32 PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi)
     Remove ARM32 and ARM64 arch-specific pcibios_enable_device() (Lorenzo Pieralisi)
     Add pci_unmap_iospace() to unmap I/O resources (Sinan Kaya)
     Remove powerpc __pci_mmap_set_pgprot() (Yinghai Lu)
 
   PCI device hotplug
     Allow additional bus numbers for hotplug bridges (Keith Busch)
     Ignore interrupts during D3cold (Lukas Wunner)
 
   Power management
     Enforce type casting for pci_power_t (Andy Shevchenko)
     Don't clear d3cold_allowed for PCIe ports (Mika Westerberg)
     Put PCIe ports into D3 during suspend (Mika Westerberg)
     Power on bridges before scanning new devices (Mika Westerberg)
     Runtime resume bridge before rescan (Mika Westerberg)
     Add runtime PM support for PCIe ports (Mika Westerberg)
     Remove redundant check of pcie_set_clkpm (Shawn Lin)
 
   Virtualization
     Add function 1 DMA alias quirk for Marvell 88SE9182 (Aaron Sierra)
     Add DMA alias quirk for Adaptec 3805 (Alex Williamson)
     Mark Atheros AR9485 and QCA9882 to avoid bus reset (Chris Blake)
     Add ACS quirk for Solarflare SFC9220 (Edward Cree)
 
   MSI
     Fix PCI_MSI dependencies (Arnd Bergmann)
     Add pci_msix_desc_addr() helper (Christoph Hellwig)
     Switch msix_program_entries() to use pci_msix_desc_addr() (Christoph Hellwig)
     Make the "entries" argument to pci_enable_msix() optional (Christoph Hellwig)
     Provide sensible IRQ vector alloc/free routines (Christoph Hellwig)
     Spread interrupt vectors in pci_alloc_irq_vectors() (Christoph Hellwig)
 
   Error Handling
     Bind DPC to Root Ports as well as Downstream Ports (Keith Busch)
     Remove DPC tristate module option (Keith Busch)
     Convert Downstream Port Containment driver to use devm_* functions (Mika Westerberg)
 
   Generic host bridge driver
     Select IRQ_DOMAIN (Arnd Bergmann)
     Claim bus resources on PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi)
 
   ACPI host bridge driver
     Add ARM64 acpi_pci_bus_find_domain_nr() (Tomasz Nowicki)
     Add ARM64 ACPI support for legacy IRQs parsing and consolidation with DT code (Tomasz Nowicki)
     Implement ARM64 AML accessors for PCI_Config region (Tomasz Nowicki)
     Support ARM64 ACPI-based PCI host controller (Tomasz Nowicki)
 
   Altera host bridge driver
     Check link status before retrain link (Ley Foon Tan)
     Poll for link up status after retraining the link (Ley Foon Tan)
 
   Axis ARTPEC-6 host bridge driver
     Add PCI_MSI_IRQ_DOMAIN dependency (Arnd Bergmann)
     Add DT binding for Axis ARTPEC-6 PCIe controller (Niklas Cassel)
     Add Axis ARTPEC-6 PCIe controller driver (Niklas Cassel)
 
   Intel VMD host bridge driver
     Use lock save/restore in interrupt enable path (Jon Derrick)
     Select device dma ops to override (Keith Busch)
     Initialize list item in IRQ disable (Keith Busch)
     Use x86_vector_domain as parent domain (Keith Busch)
     Separate MSI and MSI-X vector sharing (Keith Busch)
 
   Marvell Aardvark host bridge driver
     Add DT binding for the Aardvark PCIe controller (Thomas Petazzoni)
     Add Aardvark PCI host controller driver (Thomas Petazzoni)
     Add Aardvark PCIe support for Armada 3700 (Thomas Petazzoni)
 
   Microsoft Hyper-V host bridge driver
     Fix interrupt cleanup path (Cathy Avery)
     Don't leak buffer in hv_pci_onchannelcallback() (Vitaly Kuznetsov)
     Handle all pending messages in hv_pci_onchannelcallback() (Vitaly Kuznetsov)
 
   NVIDIA Tegra host bridge driver
     Program PADS_REFCLK_CFG* always, not just on legacy SoCs (Stephen Warren)
     Program PADS_REFCLK_CFG* registers with per-SoC values (Stephen Warren)
     Use lower-case hex consistently for register definitions (Thierry Reding)
     Use generic pci_remap_iospace() rather than ARM32-specific one (Thierry Reding)
     Stop setting pcibios_min_mem (Thierry Reding)
 
   Renesas R-Car host bridge driver
     Drop gen2 dummy I/O port region (Bjorn Helgaas)
 
   TI DRA7xx host bridge driver
     Fix return value in case of error (Christophe JAILLET)
 
   Xilinx AXI host bridge driver
     Fix return value in case of error (Christophe JAILLET)
 
   Miscellaneous
     Make bus_attr_resource_alignment static (Ben Dooks)
     Include <asm/dma.h> for isa_dma_bridge_buggy (Ben Dooks)
     MAINTAINERS: Add file patterns for PCI device tree bindings (Geert Uytterhoeven)
     Make host bridge drivers explicitly non-modular (Paul Gortmaker)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXoNRtAAoJEFmIoMA60/r8LMkP/3kiNh21QFS6RZGOaDft5/Py
 n14Zo0w51avspxoI3iyDlBd5q/SssMqi+2c6Ko/fh2D2xMxJgmQOjdMDrIGARxGA
 qEHk/5IoXquY2/GcptmCk3ap66cJ6kTovS4OPrb73m3fPuknFwFwdzExq22XHbnI
 crPya6xwQxPLc54VpY/TsgW8E+EKZd/3FW9wuzzNHXrXmTILyhBQzQAA0K470GMx
 wEXU6kc3M/XhRuF1zjV9/O+H/xguwfnbTpZLvd2NAF6uXKZoRytEHHtNnVqu1hoe
 UPpDS2xq32pMNbGxGqBetCdIbkY/hWOufmckHI7Yu2OfXBYyHBYMG2je1+nMPkOV
 WiFhhrchGt5KnEMUwXPS4ROqnSZVpZBl1Fd4s10GhUYkoE2HNKJXta398H9FR1jj
 4NEVSi4mSX/+CkaoIN3lXYiaf9P0wv4Wppve4Scr30+VnLjJhm7Vw5La7v12oo6x
 otrJ/g98AkmnbuUdLeWBUS/+TOcdPjZYbw52rqBsbOOjFm51Zcj6D7kf5WcTypQy
 HzbvygSVabcioWehUG1uudC8pdJmQlUGx1aES/iu+mZEae4cuUFALu6hDBD9IYnZ
 5JdwjVzI0UItEwT3rQt3t4xiAqHADQ0NAVNJVCeREdoy/YQpSoTWGXIpyqCZ1yCm
 aBykjRsxbKQXlhVeIxuc
 =NVxu
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Highlights:

   - ARM64 support for ACPI host bridges

   - new drivers for Axis ARTPEC-6 and Marvell Aardvark

   - new pci_alloc_irq_vectors() interface for MSI-X, MSI, legacy INTx

   - pci_resource_to_user() cleanup (more to come)

  Detailed summary:

  Enumeration:
   - Move ecam.h to linux/include/pci-ecam.h (Jayachandran C)
   - Add parent device field to ECAM struct pci_config_window (Jayachandran C)
   - Add generic MCFG table handling (Tomasz Nowicki)
   - Refactor pci_bus_assign_domain_nr() for CONFIG_PCI_DOMAINS_GENERIC (Tomasz Nowicki)
   - Factor DT-specific pci_bus_find_domain_nr() code out (Tomasz Nowicki)

  Resource management:
   - Add devm_request_pci_bus_resources() (Bjorn Helgaas)
   - Unify pci_resource_to_user() declarations (Bjorn Helgaas)
   - Implement pci_resource_to_user() with pcibios_resource_to_bus() (microblaze, powerpc, sparc) (Bjorn Helgaas)
   - Request host bridge window resources (designware, iproc, rcar, xgene, xilinx, xilinx-nwl) (Bjorn Helgaas)
   - Make PCI I/O space optional on ARM32 (Bjorn Helgaas)
   - Ignore write combining when mapping I/O port space (Bjorn Helgaas)
   - Claim bus resources on MIPS PCI_PROBE_ONLY set-ups (Bjorn Helgaas)
   - Remove unicore32 pci=firmware command line parameter handling (Bjorn Helgaas)
   - Support I/O resources when parsing host bridge resources (Jayachandran C)
   - Add helpers to request/release memory and I/O regions (Johannes Thumshirn)
   - Use pci_(request|release)_mem_regions (NVMe, lpfc, GenWQE, ethernet/intel, alx) (Johannes Thumshirn)
   - Extend pci=resource_alignment to specify device/vendor IDs (Koehrer Mathias (ETAS/ESW5))
   - Add generic pci_bus_claim_resources() (Lorenzo Pieralisi)
   - Claim bus resources on ARM32 PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi)
   - Remove ARM32 and ARM64 arch-specific pcibios_enable_device() (Lorenzo Pieralisi)
   - Add pci_unmap_iospace() to unmap I/O resources (Sinan Kaya)
   - Remove powerpc __pci_mmap_set_pgprot() (Yinghai Lu)

  PCI device hotplug:
   - Allow additional bus numbers for hotplug bridges (Keith Busch)
   - Ignore interrupts during D3cold (Lukas Wunner)

  Power management:
   - Enforce type casting for pci_power_t (Andy Shevchenko)
   - Don't clear d3cold_allowed for PCIe ports (Mika Westerberg)
   - Put PCIe ports into D3 during suspend (Mika Westerberg)
   - Power on bridges before scanning new devices (Mika Westerberg)
   - Runtime resume bridge before rescan (Mika Westerberg)
   - Add runtime PM support for PCIe ports (Mika Westerberg)
   - Remove redundant check of pcie_set_clkpm (Shawn Lin)

  Virtualization:
   - Add function 1 DMA alias quirk for Marvell 88SE9182 (Aaron Sierra)
   - Add DMA alias quirk for Adaptec 3805 (Alex Williamson)
   - Mark Atheros AR9485 and QCA9882 to avoid bus reset (Chris Blake)
   - Add ACS quirk for Solarflare SFC9220 (Edward Cree)

  MSI:
   - Fix PCI_MSI dependencies (Arnd Bergmann)
   - Add pci_msix_desc_addr() helper (Christoph Hellwig)
   - Switch msix_program_entries() to use pci_msix_desc_addr() (Christoph Hellwig)
   - Make the "entries" argument to pci_enable_msix() optional (Christoph Hellwig)
   - Provide sensible IRQ vector alloc/free routines (Christoph Hellwig)
   - Spread interrupt vectors in pci_alloc_irq_vectors() (Christoph Hellwig)

  Error Handling:
   - Bind DPC to Root Ports as well as Downstream Ports (Keith Busch)
   - Remove DPC tristate module option (Keith Busch)
   - Convert Downstream Port Containment driver to use devm_* functions (Mika Westerberg)

  Generic host bridge driver:
   - Select IRQ_DOMAIN (Arnd Bergmann)
   - Claim bus resources on PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi)

  ACPI host bridge driver:
   - Add ARM64 acpi_pci_bus_find_domain_nr() (Tomasz Nowicki)
   - Add ARM64 ACPI support for legacy IRQs parsing and consolidation with DT code (Tomasz Nowicki)
   - Implement ARM64 AML accessors for PCI_Config region (Tomasz Nowicki)
   - Support ARM64 ACPI-based PCI host controller (Tomasz Nowicki)

  Altera host bridge driver:
   - Check link status before retrain link (Ley Foon Tan)
   - Poll for link up status after retraining the link (Ley Foon Tan)

  Axis ARTPEC-6 host bridge driver:
   - Add PCI_MSI_IRQ_DOMAIN dependency (Arnd Bergmann)
   - Add DT binding for Axis ARTPEC-6 PCIe controller (Niklas Cassel)
   - Add Axis ARTPEC-6 PCIe controller driver (Niklas Cassel)

  Intel VMD host bridge driver:
   - Use lock save/restore in interrupt enable path (Jon Derrick)
   - Select device dma ops to override (Keith Busch)
   - Initialize list item in IRQ disable (Keith Busch)
   - Use x86_vector_domain as parent domain (Keith Busch)
   - Separate MSI and MSI-X vector sharing (Keith Busch)

  Marvell Aardvark host bridge driver:
   - Add DT binding for the Aardvark PCIe controller (Thomas Petazzoni)
   - Add Aardvark PCI host controller driver (Thomas Petazzoni)
   - Add Aardvark PCIe support for Armada 3700 (Thomas Petazzoni)

  Microsoft Hyper-V host bridge driver:
   - Fix interrupt cleanup path (Cathy Avery)
   - Don't leak buffer in hv_pci_onchannelcallback() (Vitaly Kuznetsov)
   - Handle all pending messages in hv_pci_onchannelcallback() (Vitaly Kuznetsov)

  NVIDIA Tegra host bridge driver:
   - Program PADS_REFCLK_CFG* always, not just on legacy SoCs (Stephen Warren)
   - Program PADS_REFCLK_CFG* registers with per-SoC values (Stephen Warren)
   - Use lower-case hex consistently for register definitions (Thierry Reding)
   - Use generic pci_remap_iospace() rather than ARM32-specific one (Thierry Reding)
   - Stop setting pcibios_min_mem (Thierry Reding)

  Renesas R-Car host bridge driver:
   - Drop gen2 dummy I/O port region (Bjorn Helgaas)

  TI DRA7xx host bridge driver:
   - Fix return value in case of error (Christophe JAILLET)

  Xilinx AXI host bridge driver:
   - Fix return value in case of error (Christophe JAILLET)

  Miscellaneous:
   - Make bus_attr_resource_alignment static (Ben Dooks)
   - Include <asm/dma.h> for isa_dma_bridge_buggy (Ben Dooks)
   - MAINTAINERS: Add file patterns for PCI device tree bindings (Geert Uytterhoeven)
   - Make host bridge drivers explicitly non-modular (Paul Gortmaker)"

* tag 'pci-v4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (125 commits)
  PCI: xgene: Make explicitly non-modular
  PCI: thunder-pem: Make explicitly non-modular
  PCI: thunder-ecam: Make explicitly non-modular
  PCI: tegra: Make explicitly non-modular
  PCI: rcar-gen2: Make explicitly non-modular
  PCI: rcar: Make explicitly non-modular
  PCI: mvebu: Make explicitly non-modular
  PCI: layerscape: Make explicitly non-modular
  PCI: keystone: Make explicitly non-modular
  PCI: hisi: Make explicitly non-modular
  PCI: generic: Make explicitly non-modular
  PCI: designware-plat: Make it explicitly non-modular
  PCI: artpec6: Make explicitly non-modular
  PCI: armada8k: Make explicitly non-modular
  PCI: artpec: Add PCI_MSI_IRQ_DOMAIN dependency
  PCI: Add ACS quirk for Solarflare SFC9220
  arm64: dts: marvell: Add Aardvark PCIe support for Armada 3700
  PCI: aardvark: Add Aardvark PCI host controller driver
  dt-bindings: add DT binding for the Aardvark PCIe controller
  PCI: tegra: Program PADS_REFCLK_CFG* registers with per-SoC values
  ...
2016-08-02 17:12:29 -04:00
Linus Torvalds 221bb8a46e - ARM: GICv3 ITS emulation and various fixes. Removal of the old
VGIC implementation.
 
 - s390: support for trapping software breakpoints, nested virtualization
 (vSIE), the STHYI opcode, initial extensions for CPU model support.
 
 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots of cleanups,
 preliminary to this and the upcoming support for hardware virtualization
 extensions.
 
 - x86: support for execute-only mappings in nested EPT; reduced vmexit
 latency for TSC deadline timer (by about 30%) on Intel hosts; support for
 more than 255 vCPUs.
 
 - PPC: bugfixes.
 
 The ugly bit is the conflicts.  A couple of them are simple conflicts due
 to 4.7 fixes, but most of them are with other trees. There was definitely
 too much reliance on Acked-by here.  Some conflicts are for KVM patches
 where _I_ gave my Acked-by, but the worst are for this pull request's
 patches that touch files outside arch/*/kvm.  KVM submaintainers should
 probably learn to synchronize better with arch maintainers, with the
 latter providing topic branches whenever possible instead of Acked-by.
 This is what we do with arch/x86.  And I should learn to refuse pull
 requests when linux-next sends scary signals, even if that means that
 submaintainers have to rebase their branches.
 
 Anyhow, here's the list:
 
 - arch/x86/kvm/vmx.c: handle_pcommit and EXIT_REASON_PCOMMIT was removed
 by the nvdimm tree.  This tree adds handle_preemption_timer and
 EXIT_REASON_PREEMPTION_TIMER at the same place.  In general all mentions
 of pcommit have to go.
 
 There is also a conflict between a stable fix and this patch, where the
 stable fix removed the vmx_create_pml_buffer function and its call.
 
 - virt/kvm/kvm_main.c: kvm_cpu_notifier was removed by the hotplug tree.
 This tree adds kvm_io_bus_get_dev at the same place.
 
 - virt/kvm/arm/vgic.c: a few final bugfixes went into 4.7 before the
 file was completely removed for 4.8.
 
 - include/linux/irqchip/arm-gic-v3.h: this one is entirely our fault;
 this is a change that should have gone in through the irqchip tree and
 pulled by kvm-arm.  I think I would have rejected this kvm-arm pull
 request.  The KVM version is the right one, except that it lacks
 GITS_BASER_PAGES_SHIFT.
 
 - arch/powerpc: what a mess.  For the idle_book3s.S conflict, the KVM
 tree is the right one; everything else is trivial.  In this case I am
 not quite sure what went wrong.  The commit that is causing the mess
 (fd7bacbca4, "KVM: PPC: Book3S HV: Fix TB corruption in guest exit
 path on HMI interrupt", 2016-05-15) touches both arch/powerpc/kernel/
 and arch/powerpc/kvm/.  It's large, but at 396 insertions/5 deletions
 I guessed that it wasn't really possible to split it and that the 5
 deletions wouldn't conflict.  That wasn't the case.
 
 - arch/s390: also messy.  First is hypfs_diag.c where the KVM tree
 moved some code and the s390 tree patched it.  You have to reapply the
 relevant part of commits 6c22c98637, plus all of e030c1125e, to
 arch/s390/kernel/diag.c.  Or pick the linux-next conflict
 resolution from http://marc.info/?l=kvm&m=146717549531603&w=2.
 Second, there is a conflict in gmap.c between a stable fix and 4.8.
 The KVM version here is the correct one.
 
 I have pushed my resolution at refs/heads/merge-20160802 (commit
 3d1f53419842) at git://git.kernel.org/pub/scm/virt/kvm/kvm.git.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJXoGm7AAoJEL/70l94x66DugQIAIj703ePAFepB/fCrKHkZZia
 SGrsBdvAtNsOhr7FQ5qvvjLxiv/cv7CymeuJivX8H+4kuUHUllDzey+RPHYHD9X7
 U6n1PdCH9F15a3IXc8tDjlDdOMNIKJixYuq1UyNZMU6NFwl00+TZf9JF8A2US65b
 x/41W98ilL6nNBAsoDVmCLtPNWAqQ3lajaZELGfcqRQ9ZGKcAYOaLFXHv2YHf2XC
 qIDMf+slBGSQ66UoATnYV2gAopNlWbZ7n0vO6tE2KyvhHZ1m399aBX1+k8la/0JI
 69r+Tz7ZHUSFtmlmyByi5IAB87myy2WQHyAPwj+4vwJkDGPcl0TrupzbG7+T05Y=
 =42ti
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:

 - ARM: GICv3 ITS emulation and various fixes.  Removal of the
   old VGIC implementation.

 - s390: support for trapping software breakpoints, nested
   virtualization (vSIE), the STHYI opcode, initial extensions
   for CPU model support.

 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots
   of cleanups, preliminary to this and the upcoming support for
   hardware virtualization extensions.

 - x86: support for execute-only mappings in nested EPT; reduced
   vmexit latency for TSC deadline timer (by about 30%) on Intel
   hosts; support for more than 255 vCPUs.

 - PPC: bugfixes.

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
  KVM: PPC: Introduce KVM_CAP_PPC_HTM
  MIPS: Select HAVE_KVM for MIPS64_R{2,6}
  MIPS: KVM: Reset CP0_PageMask during host TLB flush
  MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
  MIPS: KVM: Sign extend MFC0/RDHWR results
  MIPS: KVM: Fix 64-bit big endian dynamic translation
  MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
  MIPS: KVM: Use 64-bit CP0_EBase when appropriate
  MIPS: KVM: Set CP0_Status.KX on MIPS64
  MIPS: KVM: Make entry code MIPS64 friendly
  MIPS: KVM: Use kmap instead of CKSEG0ADDR()
  MIPS: KVM: Use virt_to_phys() to get commpage PFN
  MIPS: Fix definition of KSEGX() for 64-bit
  KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
  kvm: x86: nVMX: maintain internal copy of current VMCS
  KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
  KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
  KVM: arm64: vgic-its: Simplify MAPI error handling
  KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
  KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
  ...
2016-08-02 16:11:27 -04:00
Bjorn Helgaas 9454c23852 Merge branch 'pci/msi-affinity' into next
Conflicts:
	drivers/nvme/host/pci.c
2016-08-01 12:34:01 -05:00
Anshuman Khandual a67ae75802 powerpc/ptrace: Enable support for Performance Monitor registers
This patch enables support for Performance monitor registers related
ELF core note NT_PPC_PMU based ptrace requests through
PTRACE_GETREGSET, PTRACE_SETREGSET calls. This is achieved
through adding one new register sets REGSET_PMU in powerpc
corresponding to the ELF core note sections added in this
regard. It also implements the get, set and active functions
for this new register sets added.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:24 +10:00
Anshuman Khandual cf89d4e1b1 powerpc/ptrace: Enable support for EBB registers
This patch enables support for EBB state registers related
ELF core note NT_PPC_EBB based ptrace requests through
PTRACE_GETREGSET, PTRACE_SETREGSET calls. This is achieved
through adding one new register sets REGSET_EBB in powerpc
corresponding to the ELF core note sections added in this
regard. It also implements the get, set and active functions
for this new register sets added.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:23 +10:00
Anshuman Khandual fa439810cc powerpc/ptrace: Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
This patch enables support for running TAR, PPR, DSCR registers
related ELF core notes NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR based
ptrace requests through PTRACE_GETREGSET, PTRACE_SETREGSET calls.
This is achieved through adding three new register sets REGSET_TAR,
REGSET_PPR, REGSET_DSCR in powerpc corresponding to the ELF core
note sections added in this regad. It implements the get, set and
active functions for all these new register sets added.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:22 +10:00
Anshuman Khandual c45dc9003a powerpc/ptrace: Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
This patch enables support for all three TM checkpointed SPR
states related ELF core note  NT_PPC_TM_CTAR, NT_PPC_TM_CPPR,
NT_PPC_TM_CDSCR based ptrace requests through PTRACE_GETREGSET,
PTRACE_SETREGSET calls. This is achieved through adding three
new register sets REGSET_TM_CTAR, REGSET_TM_CPPR and
REGSET_TM_CDSCR in powerpc corresponding to the ELF core note
sections added. It implements the get, set and active functions
for all these new register sets added.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:22 +10:00
Anshuman Khandual 08e1c01d6a powerpc/ptrace: Enable support for TM SPR state
This patch enables support for TM SPR state related ELF core
note NT_PPC_TM_SPR based ptrace requests through PTRACE_GETREGSET,
PTRACE_SETREGSET calls. This is achieved through adding a register
set REGSET_TM_SPR in powerpc corresponding to the ELF core note
section added. It implements the get, set and active functions for
this new register set added.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:21 +10:00
Anshuman Khandual 9d3918f7c0 powerpc/ptrace: Enable support for NT_PPC_CVSX
This patch enables support for TM checkpointed VSX register
set ELF core note NT_PPC_CVSX based ptrace requests through
PTRACE_GETREGSET, PTRACE_SETREGSET calls. This is achieved
through adding a register set REGSET_CVSX in powerpc
corresponding to the ELF core note section added. It
implements the get, set and active functions for this new
register set added.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:20 +10:00
Anshuman Khandual 8c13f59999 powerpc/ptrace: Enable support for NT_PPC_CVMX
This patch enables support for TM checkpointed VMX register
set ELF core note NT_PPC_CVMX based ptrace requests through
PTRACE_GETREGSET, PTRACE_SETREGSET calls. This is achieved
through adding a register set REGSET_CVMX in powerpc
corresponding to the ELF core note section added. It
implements the get, set and active functions for this new
register set added.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:19 +10:00
Anshuman Khandual 19cbcbf75a powerpc/ptrace: Enable support for NT_PPC_CFPR
This patch enables support for TM checkpointed FPR register
set ELF core note NT_PPC_CFPR based ptrace requests through
PTRACE_GETREGSET, PTRACE_SETREGSET calls. This is achieved
through adding a register set REGSET_CFPR in powerpc
corresponding to the ELF core note section added. It
implements the get, set and active functions for this new
register set added.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:19 +10:00
Anshuman Khandual 25847fb195 powerpc/ptrace: Enable support for NT_PPC_CGPR
This patch enables support for TM checkpointed GPR register
set ELF core note NT_PPC_CGPR based ptrace requests through
PTRACE_GETREGSET, PTRACE_SETREGSET calls. This is achieved
through adding a register set REGSET_CGPR in powerpc
corresponding to the ELF core note section added. It
implements the get, set and active functions for this new
register set added.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:18 +10:00
Anshuman Khandual 04fcadce0e powerpc/ptrace: Adapt gpr32_get, gpr32_set functions for transaction
This patch splits gpr32_get, gpr32_set functions to accommodate
in transaction ptrace requests implemented in patches later in
the series.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:17 +10:00
Anshuman Khandual 94b7d3610e powerpc/ptrace: Enable in transaction NT_PPC_VSX ptrace requests
This patch enables in transaction NT_PPC_VSX ptrace requests. The
function vsr_get which gets the running value of all VSX registers
and the function vsr_set which sets the running value of of all VSX
registers work on the running set of VMX registers whose location
will be different if transaction is active. This patch makes these
functions adapt to situations when the transaction is active.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:17 +10:00
Anshuman Khandual d844e27915 powerpc/ptrace: Enable in transaction NT_PPC_VMX ptrace requests
This patch enables in transaction NT_PPC_VMX ptrace requests. The
function vr_get which gets the running value of all VMX registers
and the function vr_set which sets the running value of of all VMX
registers work on the running set of VMX registers whose location
will be different if transaction is active. This patch makes these
functions adapt to situations when the transaction is active.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:16 +10:00
Anshuman Khandual 1ec8549d44 powerpc/ptrace: Enable in transaction NT_PRFPREG ptrace requests
This patch enables in transaction NT_PRFPREG ptrace requests.
The function fpr_get which gets the running value of all FPR
registers and the function fpr_set which sets the running
value of of all FPR registers work on the running set of FPR
registers whose location will be different if transaction is
active. This patch makes these functions adapt to situations
when the transaction is active.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:15 +10:00
Anshuman Khandual 8d460f6156 powerpc/process: Add the function flush_tmregs_to_thread
This patch creates a function flush_tmregs_to_thread which
will then be used by subsequent patches in this series. The
function checks for self tracing ptrace interface attempts
while in the TM context and logs appropriate warning message.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:15 +10:00
Kevin Hao c12e6f24d4 powerpc: Add option to use jump label for mmu_has_feature()
As we just did for CPU features.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:06 +10:00
Kevin Hao 4db7327194 powerpc: Add option to use jump label for cpu_has_feature()
We do binary patching of asm code using CPU features, which is a
one-time operation, done during early boot. However checks of CPU
features in C code are currently done at run time, even though the set
of CPU features can never change after boot.

We can optimise this by using jump labels to implement cpu_has_feature(),
meaning checks in C code are binary patched into a single nop or branch.

For a C sequence along the lines of:

    if (cpu_has_feature(FOO))
         return 2;

The generated code before is roughly:

    ld      r9,-27640(r2)
    ld      r9,0(r9)
    lwz     r9,32(r9)
    cmpwi   cr7,r9,0
    bge     cr7, 1f
    li      r3,2
    blr
1:  ...

After (true):
    nop
    li      r3,2
    blr

After (false):
    b	1f
    li      r3,2
    blr
1:  ...

mpe: Rename MAX_CPU_FEATURES as we already have a #define with that
name, and define it simply as a constant, rather than doing tricks with
sizeof and NULL pointers. Rename the array to cpu_feature_keys. Use the
kconfig we added to guard it. Add BUILD_BUG_ON() if the feature is not a
compile time constant. Rewrite the change log.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:05 +10:00
Kevin Hao b92a226e52 powerpc: Move cpu_has_feature() to a separate file
We plan to use jump label for cpu_has_feature(). In order to implement
this we need to include the linux/jump_label.h in asm/cputable.h.

Unfortunately if we do that it leads to an include loop. The root of the
problem seems to be that reg.h needs cputable.h (for CPU_FTRs), and then
cputable.h via jump_label.h eventually pulls in hw_irq.h which needs
reg.h (for MSR_EE).

So move cpu_has_feature() to a separate file on its own.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Rename to cpu_has_feature.h and flesh out change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:03 +10:00
Aneesh Kumar K.V b8f1b4f860 powerpc/mm: Convert early cpu/mmu feature check to use the new helpers
This switches early feature checks to use the non static key variant of
the function. In later patches we will be switching cpu_has_feature()
and mmu_has_feature() to use static keys and we can use them only after
static key/jump label is initialized. Any check for feature before jump
label init should be done using this new helper.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:01 +10:00
Aneesh Kumar K.V 5a25b6f527 powerpc/mm: Make MMU_FTR_RADIX a MMU family feature
MMU feature bits are defined such that we use the lower half to
present MMU family features. Remove the strict split of half and
also move Radix to a mmu family feature. Radix introduce a new MMU
model and strictly speaking it is a new MMU family. This also free
up bits which can be used for individual features later.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:14:57 +10:00
Michael Ellerman 9e8066f398 powerpc/64: Do feature patching before MMU init
Up until now we needed to do the MMU init before feature patching,
because part of the MMU init was scanning the device tree and setting
and/or clearing some MMU feature bits.

Now that we have split that MMU feature modification out into routines
called from early_init_devtree() (called earlier) we can now do feature
patching before calling MMU init.

The advantage of this is it means the remainder of the MMU init runs
with the final set of features which will apply for the rest of the life
of the system. This means we don't have to special case anything called
from MMU init to deal with a changing set of feature bits.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:14:56 +10:00
Michael Ellerman c610ec60ed powerpc/mm: Move disable_radix handling into mmu_early_init_devtree()
Move the handling of the disable_radix command line argument into the
newly created mmu_early_init_devtree().

It's an MMU option so it's preferable to have it in an mm related file,
and it also means platforms that don't support radix don't have to carry
the code.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:14:53 +10:00
Michael Ellerman 1a01dc87e0 powerpc/mm: Add mmu_early_init_devtree()
Empty for now, but we'll add to it in the next patch.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:14:53 +10:00
Linus Torvalds bad60e6f25 powerpc updates for 4.8 # 1
Highlights:
  - PowerNV PCI hotplug support.
  - Lots more Power9 support.
  - eBPF JIT support on ppc64le.
  - Lots of cxl updates.
  - Boot code consolidation.
 
 Bug fixes:
  - Fix spin_unlock_wait() from Boqun Feng
  - Fix stack pointer corruption in __tm_recheckpoint() from Michael Neuling
  - Fix multiple bugs in memory_hotplug_max() from Bharata B Rao
  - mm: Ensure "special" zones are empty from Oliver O'Halloran
  - ftrace: Separate the heuristics for checking call sites from Michael Ellerman
  - modules: Never restore r2 for a mprofile-kernel style mcount() call from Michael Ellerman
  - Fix endianness when reading TCEs from Alexey Kardashevskiy
  - start rtasd before PCI probing from Greg Kurz
  - PCI: rpaphp: Fix slot registration for multiple slots under a PHB from Tyrel Datwyler
  - powerpc/mm: Add memory barrier in __hugepte_alloc() from Sukadev Bhattiprolu
 
 Cleanups & fixes:
  - Drop support for MPIC in pseries from Rashmica Gupta
  - Define and use PPC64_ELF_ABI_v2/v1 from Michael Ellerman
  - Remove unused symbols in asm-offsets.c from Rashmica Gupta
  - Fix SRIOV not building without EEH enabled from Russell Currey
  - Remove kretprobe_trampoline_holder. from Thiago Jung Bauermann
  - Reduce log level of PCI I/O space warning from Benjamin Herrenschmidt
  - Add array bounds checking to crash_shutdown_handlers from Suraj Jitindar Singh
  - Avoid -maltivec when using clang integrated assembler from Anton Blanchard
  - Fix array overrun in ppc_rtas() syscall from Andrew Donnellan
  - Fix error return value in cmm_mem_going_offline() from Rasmus Villemoes
  - export cpu_to_core_id() from Mauricio Faria de Oliveira
  - Remove old symbols from defconfigs from Andrew Donnellan
  - Update obsolete comments in setup_32.c about entry conditions from Benjamin Herrenschmidt
  - Add comment explaining the purpose of setup_kdump_trampoline() from Benjamin Herrenschmidt
  - Merge the RELOCATABLE config entries for ppc32 and ppc64 from Kevin Hao
  - Remove RELOCATABLE_PPC32 from Kevin Hao
  - Fix .long's in tlb-radix.c to more meaningful from Balbir Singh
 
 Minor cleanups & fixes:
  - Andrew Donnellan, Anna-Maria Gleixner, Anton Blanchard, Benjamin
    Herrenschmidt, Bharata B Rao, Christophe Leroy, Colin Ian King, Geliang
    Tang, Greg Kurz, Madhavan Srinivasan, Michael Ellerman, Michael Ellerman,
    Stephen Rothwell, Stewart Smith.
 
 Freescale updates from Scott:
  - "Highlights include more 8xx optimizations, device tree updates,
    and MVME7100 support."
 
 PowerNV PCI hotplug from Gavin Shan:
  - PCI: Add pcibios_setup_bridge()
  - Override pcibios_setup_bridge()
  - Remove PCI_RESET_DELAY_US
  - Move pnv_pci_ioda_setup_opal_tce_kill() around
  - Increase PE# capacity
  - Allocate PE# in reverse order
  - Create PEs in pcibios_setup_bridge()
  - Setup PE for root bus
  - Extend PCI bridge resources
  - Make pnv_ioda_deconfigure_pe() visible
  - Dynamically release PE
  - Update bridge windows on PCI plug
  - Delay populating pdn
  - Support PCI slot ID
  - Use PCI slot reset infrastructure
  - Introduce pnv_pci_get_slot_id()
  - Functions to get/set PCI slot state
  - PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  - Print correct PHB type names
 
 Power9 idle support from Shreyas B. Prabhu:
  - set power_save func after the idle states are initialized
  - Use PNV_THREAD_WINKLE macro while requesting for winkle
  - make hypervisor state restore a function
  - Rename idle_power7.S to idle_book3s.S
  - Rename reusable idle functions to hardware agnostic names
  - Make pnv_powersave_common more generic
  - abstraction for saving SPRs before entering deep idle states
  - Add platform support for stop instruction
  - cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES
  - cpuidle/powernv: cleanup cpuidle-powernv.c
  - cpuidle/powernv: Add support for POWER ISA v3 idle states
  - Use deepest stop state when cpu is offlined
 
 Power9 PMU from Madhavan Srinivasan:
  - factor out power8 pmu macros and defines
  - factor out power8 pmu functions
  - factor out power8 __init_pmu code
  - Add power9 event list macros for generic and cache events
  - Power9 PMU support
  - Export Power9 generic and cache events to sysfs
 
 Power9 preliminary interrupt & PCI support from Benjamin Herrenschmidt:
  - Add XICS emulation APIs
  - Move a few exception common handlers to make room
  - Add support for HV virtualization interrupts
  - Add mechanism to force a replay of interrupts
  - Add ICP OPAL backend
  - Discover IODA3 PHBs
  - pci: Remove obsolete SW invalidate
  - opal: Add real mode call wrappers
  - Rename TCE invalidation calls
  - Remove SWINV constants and obsolete TCE code
  - Rework accessing the TCE invalidate register
  - Fallback to OPAL for TCE invalidations
  - Use the device-tree to get available range of M64's
  - Check status of a PHB before using it
  - pci: Don't try to allocate resources that will be reassigned
 
 Other Power9:
  - Send SIGBUS on unaligned copy and paste from Chris Smart
  - Large Decrementer support from Oliver O'Halloran
  - Load Monitor Register Support from Jack Miller
 
 Performance improvements from Anton Blanchard:
  - Avoid load hit store in __giveup_fpu() and __giveup_altivec()
  - Avoid load hit store in setup_sigcontext()
  - Remove assembly versions of strcpy, strcat, strlen and strcmp
  - Align hot loops of some string functions
 
 eBPF JIT from Naveen N. Rao:
  - Fix/enhance 32-bit Load Immediate implementation
  - Optimize 64-bit Immediate loads
  - Introduce rotate immediate instructions
  - A few cleanups
  - Isolate classic BPF JIT specifics into a separate header
  - Implement JIT compiler for extended BPF
 
 Operator Panel driver from Suraj Jitindar Singh:
  - devicetree/bindings: Add binding for operator panel on FSP machines
  - Add inline function to get rc from an ASYNC_COMP opal_msg
  - Add driver for operator panel on FSP machines
 
 Sparse fixes from Daniel Axtens:
  - make some things static
  - Introduce asm-prototypes.h
  - Include headers containing prototypes
  - Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE
  - kvm: Clarify __user annotations
  - Pass endianness to sparse
  - Make ppc_md.{halt, restart} __noreturn
 
 MM fixes & cleanups from Aneesh Kumar K.V:
  - radix: Update LPCR HR bit as per ISA
  - use _raw variant of page table accessors
  - Compile out radix related functions if RADIX_MMU is disabled
  - Clear top 16 bits of va only on older cpus
  - Print formation regarding the the MMU mode
  - hash: Update SDR1 size encoding as documented in ISA 3.0
  - radix: Update PID switch sequence
  - radix: Update machine call back to support new HCALL.
  - radix: Add LPID based tlb flush helpers
  - radix: Add a kernel command line to disable radix
  - Cleanup LPCR defines
 
 Boot code consolidation from Benjamin Herrenschmidt:
  - Move epapr_paravirt_early_init() to early_init_devtree()
  - cell: Don't use flat device-tree after boot
  - ge_imp3a: Don't use the flat device-tree after boot
  - mpc85xx_ds: Don't use the flat device-tree after boot
  - mpc85xx_rdb: Don't use the flat device-tree after boot
  - Don't test for machine type in rtas_initialize()
  - Don't test for machine type in smp_setup_cpu_maps()
  - dt: Add of_device_compatible_match()
  - Factor do_feature_fixup calls
  - Move 64-bit feature fixup earlier
  - Move 64-bit memory reserves to setup_arch()
  - Use a cachable DART
  - Move FW feature probing out of pseries probe()
  - Put exception configuration in a common place
  - Remove early allocation of the SMU command buffer
  - Move MMU backend selection out of platform code
  - pasemi: Remove IOBMAP allocation from platform probe()
  - mm/hash: Don't use machine_is() early during boot
  - Don't test for machine type to detect HEA special case
  - pmac: Remove spurrious machine type test
  - Move hash table ops to a separate structure
  - Ensure that ppc_md is empty before probing for machine type
  - Move 64-bit probe_machine() to later in the boot process
  - Move 32-bit probe() machine to later in the boot process
  - Get rid of ppc_md.init_early()
  - Move the boot time info banner to a separate function
  - Move setting of {i,d}cache_bsize to initialize_cache_info()
  - Move the content of setup_system() to setup_arch()
  - Move cache info inits to a separate function
  - Re-order the call to smp_setup_cpu_maps()
  - Re-order setup_panic()
  - Make a few boot functions __init
  - Merge 32-bit and 64-bit setup_arch()
 
 Other new features:
  - tty/hvc: Use IRQF_SHARED for OPAL hvc consoles from Sam Mendoza-Jonas
  - tty/hvc: Use opal irqchip interface if available from Sam Mendoza-Jonas
  - powerpc: Add module autoloading based on CPU features from Alastair D'Silva
  - crypto: vmx - Convert to CPU feature based module autoloading from Alastair D'Silva
  - Wake up kopald polling thread before waiting for events from Benjamin Herrenschmidt
  - xmon: Dump ISA 2.06 SPRs from Michael Ellerman
  - xmon: Dump ISA 2.07 SPRs from Michael Ellerman
  - Add a parameter to disable 1TB segs from Oliver O'Halloran
  - powerpc/boot: Add OPAL console to epapr wrappers from Oliver O'Halloran
  - Assign fixed PHB number based on device-tree properties from Guilherme G. Piccoli
  - pseries: Add pseries hotplug workqueue from John Allen
  - pseries: Add support for hotplug interrupt source from John Allen
  - pseries: Use kernel hotplug queue for PowerVM hotplug events from John Allen
  - pseries: Move property cloning into its own routine from Nathan Fontenot
  - pseries: Dynamic add entires to associativity lookup array from Nathan Fontenot
  - pseries: Auto-online hotplugged memory from Nathan Fontenot
  - pseries: Remove call to memblock_add() from Nathan Fontenot
 
 cxl:
  - Add set and get private data to context struct from Michael Neuling
  - make base more explicitly non-modular from Paul Gortmaker
  - Use for_each_compatible_node() macro from Wei Yongjun
  - Frederic Barrat
    - Abstract the differences between the PSL and XSL
    - Make vPHB device node match adapter's
  - Philippe Bergheaud
    - Add mechanism for delivering AFU driver specific events
    - Ignore CAPI adapters misplaced in switched slots
    - Refine slice error debug messages
  - Andrew Donnellan
    - static-ify variables to fix sparse warnings
    - PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
    - PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
    - Add cxl_check_and_switch_mode() API to switch bi-modal cards
    - remove dead Kconfig options
    - fix potential NULL dereference in free_adapter()
  - Ian Munsie
    - Update process element after allocating interrupts
    - Add support for CAPP DMA mode
    - Fix allowing bogus AFU descriptors with 0 maximum processes
    - Fix allocating a minimum of 2 pages for the SPA
    - Fix bug where AFU disable operation had no effect
    - Workaround XSL bug that does not clear the RA bit after a reset
    - Fix NULL pointer dereference on kernel contexts with no AFU interrupts
    - powerpc/powernv: Split cxl code out into a separate file
    - Add cxl_slot_is_supported API
    - Enable bus mastering for devices using CAPP DMA mode
    - Move cxl_afu_get / cxl_afu_put to base
    - Allow a default context to be associated with an external pci_dev
    - Do not create vPHB if there are no AFU configuration records
    - powerpc/powernv: Add support for the cxl kernel api on the real phb
    - Add support for using the kernel API with a real PHB
    - Add kernel APIs to get & set the max irqs per context
    - Add preliminary workaround for CX4 interrupt limitation
    - Add support for interrupts on the Mellanox CX4
    - Workaround PE=0 hardware limitation in Mellanox CX4
    - powerpc/powernv: Fix pci-cxl.c build when CONFIG_MODULES=n
 
 selftests:
  - Test unaligned copy and paste from Chris Smart
  - Load Monitor Register Tests from Jack Miller
  - Cyril Bur
    - exec() with suspended transaction
    - Use signed long to read perf_event_paranoid
    - Fix usage message in context_switch
    - Fix generation of vector instructions/types in context_switch
  - Michael Ellerman
    - Use "Delta" rather than "Error" in normal output
    - Import Anton's mmap & futex micro benchmarks
    - Add a test for PROT_SAO
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXnWchAAoJEFHr6jzI4aWAe64P/36Vd9yJLptjkoyZp8/IQtu1
 Cv8buQwGdKuSMzdkcUAOXcC3fe2u70ZWXMKKLfY3koIV1IAiqdWk5/XWRKMP2XmE
 dG0LhSf0uu7uh+mE0WvQnRu46ImeKtQ+mPp4Hbs/s9SxMSeYjruv3vdWWmgUq0cl
 Gac2qJSRtAMmgLuHWMjf7N5mxOTOnKejU4o2i9cJ+YHmWKOdCigv2Ge1UadOQFlC
 E7tRPiUR3asfDfj+e+LVTTdToH6p8pk+mOUzIoZ8jIkQ+IXzi62UDl5+Rw9mqiuX
 1CtqEMUXxo2qwX+d4TcV/QUOp0YKPuIcUZ9NMMS+S3lOyJ4NFt+j2Izk7QJp5kNP
 gKVqB68TjDQsBuDr3P9ynlHbduxTIhZAqopbTrLe0FIg48nUe4n1yHJBVzqaVajX
 rFBJSsSUffBLAARNPSXJJhIgc2C1/qOC8dgMeDMcR2kPirDHaQZ/lY1yEpq1yiqR
 q6e3v5hvIAm4IjbYk0mF7TUxBrPGVE/ExyBINyASRoYxAJ1PyeD/iljZ9vI3asRA
 s+hhxT8H3f7lnqTrmJqMjHgAdGkmag07EdmvFNX4xK4aADSy7Y6g4dw25ffRopo9
 p9Jf9HX+dZv65Y3UjbV/6HuXcaSEBJJLSVWvii65PebqSN0LuHEFvNeIJ6Iblx0B
 AWh/hd0Iin2gdkcG39Mr
 =Z5kM
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights:
   - PowerNV PCI hotplug support.
   - Lots more Power9 support.
   - eBPF JIT support on ppc64le.
   - Lots of cxl updates.
   - Boot code consolidation.

  Bug fixes:
   - Fix spin_unlock_wait() from Boqun Feng
   - Fix stack pointer corruption in __tm_recheckpoint() from Michael
     Neuling
   - Fix multiple bugs in memory_hotplug_max() from Bharata B Rao
   - mm: Ensure "special" zones are empty from Oliver O'Halloran
   - ftrace: Separate the heuristics for checking call sites from
     Michael Ellerman
   - modules: Never restore r2 for a mprofile-kernel style mcount() call
     from Michael Ellerman
   - Fix endianness when reading TCEs from Alexey Kardashevskiy
   - start rtasd before PCI probing from Greg Kurz
   - PCI: rpaphp: Fix slot registration for multiple slots under a PHB
     from Tyrel Datwyler
   - powerpc/mm: Add memory barrier in __hugepte_alloc() from Sukadev
     Bhattiprolu

  Cleanups & fixes:
   - Drop support for MPIC in pseries from Rashmica Gupta
   - Define and use PPC64_ELF_ABI_v2/v1 from Michael Ellerman
   - Remove unused symbols in asm-offsets.c from Rashmica Gupta
   - Fix SRIOV not building without EEH enabled from Russell Currey
   - Remove kretprobe_trampoline_holder from Thiago Jung Bauermann
   - Reduce log level of PCI I/O space warning from Benjamin
     Herrenschmidt
   - Add array bounds checking to crash_shutdown_handlers from Suraj
     Jitindar Singh
   - Avoid -maltivec when using clang integrated assembler from Anton
     Blanchard
   - Fix array overrun in ppc_rtas() syscall from Andrew Donnellan
   - Fix error return value in cmm_mem_going_offline() from Rasmus
     Villemoes
   - export cpu_to_core_id() from Mauricio Faria de Oliveira
   - Remove old symbols from defconfigs from Andrew Donnellan
   - Update obsolete comments in setup_32.c about entry conditions from
     Benjamin Herrenschmidt
   - Add comment explaining the purpose of setup_kdump_trampoline() from
     Benjamin Herrenschmidt
   - Merge the RELOCATABLE config entries for ppc32 and ppc64 from Kevin
     Hao
   - Remove RELOCATABLE_PPC32 from Kevin Hao
   - Fix .long's in tlb-radix.c to more meaningful from Balbir Singh

  Minor cleanups & fixes:
   - Andrew Donnellan, Anna-Maria Gleixner, Anton Blanchard, Benjamin
     Herrenschmidt, Bharata B Rao, Christophe Leroy, Colin Ian King,
     Geliang Tang, Greg Kurz, Madhavan Srinivasan, Michael Ellerman,
     Michael Ellerman, Stephen Rothwell, Stewart Smith.

  Freescale updates from Scott:
   - "Highlights include more 8xx optimizations, device tree updates,
     and MVME7100 support."

  PowerNV PCI hotplug from Gavin Shan:
   - PCI: Add pcibios_setup_bridge()
   - Override pcibios_setup_bridge()
   - Remove PCI_RESET_DELAY_US
   - Move pnv_pci_ioda_setup_opal_tce_kill() around
   - Increase PE# capacity
   - Allocate PE# in reverse order
   - Create PEs in pcibios_setup_bridge()
   - Setup PE for root bus
   - Extend PCI bridge resources
   - Make pnv_ioda_deconfigure_pe() visible
   - Dynamically release PE
   - Update bridge windows on PCI plug
   - Delay populating pdn
   - Support PCI slot ID
   - Use PCI slot reset infrastructure
   - Introduce pnv_pci_get_slot_id()
   - Functions to get/set PCI slot state
   - PCI/hotplug: PowerPC PowerNV PCI hotplug driver
   - Print correct PHB type names

  Power9 idle support from Shreyas B. Prabhu:
   - set power_save func after the idle states are initialized
   - Use PNV_THREAD_WINKLE macro while requesting for winkle
   - make hypervisor state restore a function
   - Rename idle_power7.S to idle_book3s.S
   - Rename reusable idle functions to hardware agnostic names
   - Make pnv_powersave_common more generic
   - abstraction for saving SPRs before entering deep idle states
   - Add platform support for stop instruction
   - cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES
   - cpuidle/powernv: cleanup cpuidle-powernv.c
   - cpuidle/powernv: Add support for POWER ISA v3 idle states
   - Use deepest stop state when cpu is offlined

  Power9 PMU from Madhavan Srinivasan:
   - factor out power8 pmu macros and defines
   - factor out power8 pmu functions
   - factor out power8 __init_pmu code
   - Add power9 event list macros for generic and cache events
   - Power9 PMU support
   - Export Power9 generic and cache events to sysfs

  Power9 preliminary interrupt & PCI support from Benjamin Herrenschmidt:
   - Add XICS emulation APIs
   - Move a few exception common handlers to make room
   - Add support for HV virtualization interrupts
   - Add mechanism to force a replay of interrupts
   - Add ICP OPAL backend
   - Discover IODA3 PHBs
   - pci: Remove obsolete SW invalidate
   - opal: Add real mode call wrappers
   - Rename TCE invalidation calls
   - Remove SWINV constants and obsolete TCE code
   - Rework accessing the TCE invalidate register
   - Fallback to OPAL for TCE invalidations
   - Use the device-tree to get available range of M64's
   - Check status of a PHB before using it
   - pci: Don't try to allocate resources that will be reassigned

  Other Power9:
   - Send SIGBUS on unaligned copy and paste from Chris Smart
   - Large Decrementer support from Oliver O'Halloran
   - Load Monitor Register Support from Jack Miller

  Performance improvements from Anton Blanchard:
   - Avoid load hit store in __giveup_fpu() and __giveup_altivec()
   - Avoid load hit store in setup_sigcontext()
   - Remove assembly versions of strcpy, strcat, strlen and strcmp
   - Align hot loops of some string functions

  eBPF JIT from Naveen N. Rao:
   - Fix/enhance 32-bit Load Immediate implementation
   - Optimize 64-bit Immediate loads
   - Introduce rotate immediate instructions
   - A few cleanups
   - Isolate classic BPF JIT specifics into a separate header
   - Implement JIT compiler for extended BPF

  Operator Panel driver from Suraj Jitindar Singh:
   - devicetree/bindings: Add binding for operator panel on FSP machines
   - Add inline function to get rc from an ASYNC_COMP opal_msg
   - Add driver for operator panel on FSP machines

  Sparse fixes from Daniel Axtens:
   - make some things static
   - Introduce asm-prototypes.h
   - Include headers containing prototypes
   - Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE
   - kvm: Clarify __user annotations
   - Pass endianness to sparse
   - Make ppc_md.{halt, restart} __noreturn

  MM fixes & cleanups from Aneesh Kumar K.V:
   - radix: Update LPCR HR bit as per ISA
   - use _raw variant of page table accessors
   - Compile out radix related functions if RADIX_MMU is disabled
   - Clear top 16 bits of va only on older cpus
   - Print formation regarding the the MMU mode
   - hash: Update SDR1 size encoding as documented in ISA 3.0
   - radix: Update PID switch sequence
   - radix: Update machine call back to support new HCALL.
   - radix: Add LPID based tlb flush helpers
   - radix: Add a kernel command line to disable radix
   - Cleanup LPCR defines

  Boot code consolidation from Benjamin Herrenschmidt:
   - Move epapr_paravirt_early_init() to early_init_devtree()
   - cell: Don't use flat device-tree after boot
   - ge_imp3a: Don't use the flat device-tree after boot
   - mpc85xx_ds: Don't use the flat device-tree after boot
   - mpc85xx_rdb: Don't use the flat device-tree after boot
   - Don't test for machine type in rtas_initialize()
   - Don't test for machine type in smp_setup_cpu_maps()
   - dt: Add of_device_compatible_match()
   - Factor do_feature_fixup calls
   - Move 64-bit feature fixup earlier
   - Move 64-bit memory reserves to setup_arch()
   - Use a cachable DART
   - Move FW feature probing out of pseries probe()
   - Put exception configuration in a common place
   - Remove early allocation of the SMU command buffer
   - Move MMU backend selection out of platform code
   - pasemi: Remove IOBMAP allocation from platform probe()
   - mm/hash: Don't use machine_is() early during boot
   - Don't test for machine type to detect HEA special case
   - pmac: Remove spurrious machine type test
   - Move hash table ops to a separate structure
   - Ensure that ppc_md is empty before probing for machine type
   - Move 64-bit probe_machine() to later in the boot process
   - Move 32-bit probe() machine to later in the boot process
   - Get rid of ppc_md.init_early()
   - Move the boot time info banner to a separate function
   - Move setting of {i,d}cache_bsize to initialize_cache_info()
   - Move the content of setup_system() to setup_arch()
   - Move cache info inits to a separate function
   - Re-order the call to smp_setup_cpu_maps()
   - Re-order setup_panic()
   - Make a few boot functions __init
   - Merge 32-bit and 64-bit setup_arch()

  Other new features:
   - tty/hvc: Use IRQF_SHARED for OPAL hvc consoles from Sam Mendoza-Jonas
   - tty/hvc: Use opal irqchip interface if available from Sam Mendoza-Jonas
   - powerpc: Add module autoloading based on CPU features from Alastair D'Silva
   - crypto: vmx - Convert to CPU feature based module autoloading from Alastair D'Silva
   - Wake up kopald polling thread before waiting for events from Benjamin Herrenschmidt
   - xmon: Dump ISA 2.06 SPRs from Michael Ellerman
   - xmon: Dump ISA 2.07 SPRs from Michael Ellerman
   - Add a parameter to disable 1TB segs from Oliver O'Halloran
   - powerpc/boot: Add OPAL console to epapr wrappers from Oliver O'Halloran
   - Assign fixed PHB number based on device-tree properties from Guilherme G. Piccoli
   - pseries: Add pseries hotplug workqueue from John Allen
   - pseries: Add support for hotplug interrupt source from John Allen
   - pseries: Use kernel hotplug queue for PowerVM hotplug events from John Allen
   - pseries: Move property cloning into its own routine from Nathan Fontenot
   - pseries: Dynamic add entires to associativity lookup array from Nathan Fontenot
   - pseries: Auto-online hotplugged memory from Nathan Fontenot
   - pseries: Remove call to memblock_add() from Nathan Fontenot

  cxl:
   - Add set and get private data to context struct from Michael Neuling
   - make base more explicitly non-modular from Paul Gortmaker
   - Use for_each_compatible_node() macro from Wei Yongjun
   - Frederic Barrat
   - Abstract the differences between the PSL and XSL
   - Make vPHB device node match adapter's
   - Philippe Bergheaud
   - Add mechanism for delivering AFU driver specific events
   - Ignore CAPI adapters misplaced in switched slots
   - Refine slice error debug messages
   - Andrew Donnellan
   - static-ify variables to fix sparse warnings
   - PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
   - PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
   - Add cxl_check_and_switch_mode() API to switch bi-modal cards
   - remove dead Kconfig options
   - fix potential NULL dereference in free_adapter()
   - Ian Munsie
   - Update process element after allocating interrupts
   - Add support for CAPP DMA mode
   - Fix allowing bogus AFU descriptors with 0 maximum processes
   - Fix allocating a minimum of 2 pages for the SPA
   - Fix bug where AFU disable operation had no effect
   - Workaround XSL bug that does not clear the RA bit after a reset
   - Fix NULL pointer dereference on kernel contexts with no AFU interrupts
   - powerpc/powernv: Split cxl code out into a separate file
   - Add cxl_slot_is_supported API
   - Enable bus mastering for devices using CAPP DMA mode
   - Move cxl_afu_get / cxl_afu_put to base
   - Allow a default context to be associated with an external pci_dev
   - Do not create vPHB if there are no AFU configuration records
   - powerpc/powernv: Add support for the cxl kernel api on the real phb
   - Add support for using the kernel API with a real PHB
   - Add kernel APIs to get & set the max irqs per context
   - Add preliminary workaround for CX4 interrupt limitation
   - Add support for interrupts on the Mellanox CX4
   - Workaround PE=0 hardware limitation in Mellanox CX4
   - powerpc/powernv: Fix pci-cxl.c build when CONFIG_MODULES=n

  selftests:
   - Test unaligned copy and paste from Chris Smart
   - Load Monitor Register Tests from Jack Miller
   - Cyril Bur
   - exec() with suspended transaction
   - Use signed long to read perf_event_paranoid
   - Fix usage message in context_switch
   - Fix generation of vector instructions/types in context_switch
   - Michael Ellerman
   - Use "Delta" rather than "Error" in normal output
   - Import Anton's mmap & futex micro benchmarks
   - Add a test for PROT_SAO"

* tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (263 commits)
  powerpc/mm: Parenthesise IS_ENABLED() in if condition
  tty/hvc: Use opal irqchip interface if available
  tty/hvc: Use IRQF_SHARED for OPAL hvc consoles
  selftests/powerpc: exec() with suspended transaction
  powerpc: Improve comment explaining why we modify VRSAVE
  powerpc/mm: Drop unused externs for hpte_init_beat[_v3]()
  powerpc/mm: Rename hpte_init_lpar() and move the fallback to a header
  powerpc/mm: Fix build break when PPC_NATIVE=n
  crypto: vmx - Convert to CPU feature based module autoloading
  powerpc: Add module autoloading based on CPU features
  powerpc/powernv/ioda: Fix endianness when reading TCEs
  powerpc/mm: Add memory barrier in __hugepte_alloc()
  powerpc/modules: Never restore r2 for a mprofile-kernel style mcount() call
  powerpc/ftrace: Separate the heuristics for checking call sites
  powerpc: Merge 32-bit and 64-bit setup_arch()
  powerpc/64: Make a few boot functions __init
  powerpc: Re-order setup_panic()
  powerpc: Re-order the call to smp_setup_cpu_maps()
  powerpc/32: Move cache info inits to a separate function
  powerpc/64: Move the content of setup_system() to setup_arch()
  ...
2016-07-30 21:01:36 -07:00
Michael Ellerman 719dbb2df7 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Highlights include more 8xx optimizations, device tree updates,
and MVME7100 support."
2016-07-30 13:43:19 +10:00
Linus Torvalds 7a1e8b80fb Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

   - TPM core and driver updates/fixes
   - IPv6 security labeling (CALIPSO)
   - Lots of Apparmor fixes
   - Seccomp: remove 2-phase API, close hole where ptrace can change
     syscall #"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits)
  apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
  tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family)
  tpm: Factor out common startup code
  tpm: use devm_add_action_or_reset
  tpm2_i2c_nuvoton: add irq validity check
  tpm: read burstcount from TPM_STS in one 32-bit transaction
  tpm: fix byte-order for the value read by tpm2_get_tpm_pt
  tpm_tis_core: convert max timeouts from msec to jiffies
  apparmor: fix arg_size computation for when setprocattr is null terminated
  apparmor: fix oops, validate buffer size in apparmor_setprocattr()
  apparmor: do not expose kernel stack
  apparmor: fix module parameters can be changed after policy is locked
  apparmor: fix oops in profile_unpack() when policy_db is not present
  apparmor: don't check for vmalloc_addr if kvzalloc() failed
  apparmor: add missing id bounds check on dfa verification
  apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
  apparmor: use list_next_entry instead of list_entry_next
  apparmor: fix refcount race when finding a child profile
  apparmor: fix ref count leak when profile sha1 hash is read
  apparmor: check that xindex is in trans_table bounds
  ...
2016-07-29 17:38:46 -07:00
Linus Torvalds 1b3fc0bef8 pstore subsystem updates for v4.8
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJXloQYAAoJEIly9N/cbcAmGC0QAIpyoiqEuDiJq/XpRg1ux+PC
 Vyr15Pub9yQYwcrWffMH1Zr0GhlFmXb1iP9rp36zdtMhjBEfq7wegvblLVMlzl6G
 7nYt8hJjDh/h8iw1lElgDL2kwUbTym43HoczJNvY/lOmFuUMK8AoDIRYjFTLAKfQ
 S4KA9MFJe3kDh4OUoQVfQNrC2VReLD4uvXk4EUF0wDYoqjVKyU3WBHOMgEmggKTR
 cb+fwhg3Lj4cuMMtZqy8wCqZ/hqhaH8giHC9YbIZQyre3ylncH9xUZyfiqS6nQGc
 eLc03qxqDNsmZvcY6cJgXldLQ3tXM4o96Moakzn2n4sQcW9vh/3oZzDPd7gC8Ei1
 GfIXmRBXFhj5JaeHNGJxL6oCywK+JaqxG8nqD7cEcXTzJiHzjn5kKKSFlr3GmI7w
 47htXv9t07SMgQW0IlBws5yApfeB62dQXmhZc1kMtbonhGdAZCCUg2Nrv34VxrjX
 Dp+LCmD5bg/fBrnAt8f+IIQEd3pElngay+SmSEB9XFUejf3pKw8SvcoDbmE3LD7M
 zGh5bEkptHll2GMInVAt4b4tiC44e7u+0H1Rsi/ttA1cktXZ9hOtFewhXNvfOS2I
 hAMGSngOdpzR5v9Mof5hJAWrr1CLkoh757UoYMsb8u2V9aQw7oVZ5JRMzjZBU6iC
 qXqHm4h5P1bpAeit3DLF
 =fiRI
 -----END PGP SIGNATURE-----

Merge tag 'pstore-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull pstore subsystem updates from Kees Cook:
 "This expands the supported compressors, fixes some bugs, and finally
  adds DT bindings"

* tag 'pstore-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  pstore/ram: add Device Tree bindings
  efi-pstore: implement efivars_pstore_exit()
  pstore: drop file opened reference count
  pstore: add lzo/lz4 compression support
  pstore: Cleanup pstore_dump()
  pstore: Enable compression on normal path (again)
  ramoops: Only unregister when registered
2016-07-26 18:48:23 -07:00
Linus Torvalds bbce2ad2d7 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Here is the crypto update for 4.8:

  API:
   - first part of skcipher low-level conversions
   - add KPP (Key-agreement Protocol Primitives) interface.

  Algorithms:
   - fix IPsec/cryptd reordering issues that affects aesni
   - RSA no longer does explicit leading zero removal
   - add SHA3
   - add DH
   - add ECDH
   - improve DRBG performance by not doing CTR by hand

  Drivers:
   - add x86 AVX2 multibuffer SHA256/512
   - add POWER8 optimised crc32c
   - add xts support to vmx
   - add DH support to qat
   - add RSA support to caam
   - add Layerscape support to caam
   - add SEC1 AEAD support to talitos
   - improve performance by chaining requests in marvell/cesa
   - add support for Araneus Alea I USB RNG
   - add support for Broadcom BCM5301 RNG
   - add support for Amlogic Meson RNG
   - add support Broadcom NSP SoC RNG"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (180 commits)
  crypto: vmx - Fix aes_p8_xts_decrypt build failure
  crypto: vmx - Ignore generated files
  crypto: vmx - Adding support for XTS
  crypto: vmx - Adding asm subroutines for XTS
  crypto: skcipher - add comment for skcipher_alg->base
  crypto: testmgr - Print akcipher algorithm name
  crypto: marvell - Fix wrong flag used for GFP in mv_cesa_dma_add_iv_op
  crypto: nx - off by one bug in nx_of_update_msc()
  crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct
  crypto: scatterwalk - Inline start/map/done
  crypto: scatterwalk - Remove unnecessary BUG in scatterwalk_start
  crypto: scatterwalk - Remove unnecessary advance in scatterwalk_pagedone
  crypto: scatterwalk - Fix test in scatterwalk_done
  crypto: api - Optimise away crypto_yield when hard preemption is on
  crypto: scatterwalk - add no-copy support to copychunks
  crypto: scatterwalk - Remove scatterwalk_bytes_sglen
  crypto: omap - Stop using crypto scatterwalk_bytes_sglen
  crypto: skcipher - Remove top-level givcipher interface
  crypto: user - Remove crypto_lookup_skcipher call
  crypto: cts - Convert to skcipher
  ...
2016-07-26 13:40:17 -07:00
Anton Blanchard dd57023747 powerpc: Improve comment explaining why we modify VRSAVE
The comment explaining why we modify VRSAVE is misleading, glibc
does rely on the behaviour. Update the comment.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-26 14:16:19 +10:00
Kees Cook 74e630a758 Linux 4.7
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXlRXSAAoJEHm+PkMAQRiGG/gH/0Z8O4zWOsrwO+X1mRToRDBH
 joFOjAmCVe83T1VpF5LYNB+9+owL/dEDt6+ZIswnhH7AfQPjs4RqwS4PcuMbCDVO
 +mDm0PmfcKaYcQZrB2Z2OwIzRNnfCTVcsDPhIHwuIHk0m4z/xuGZonD8KoAj0+tO
 3yJF6sbE1KubDVjOb+lmZZSP3cXA0pDXrNhkYhE4Tsr8fiihGjeXSNJ8t2zPLjxo
 W3MPqo0rzDvQsOwoF4TWHHagVaFSJlhLBBgqu33fI7uO3jtfQD2G8wG68JCND1j3
 qbMoBfTLFV/yQmSIJUt0Wv1axaCcwnjpweEB35A/GEeZ0mNB1rDdoBeI1eKEQkc=
 =DGFC
 -----END PGP SIGNATURE-----

Merge tag 'v4.7' into for-linus/pstore

Linux 4.7
2016-07-25 13:50:36 -07:00
Michael Ellerman 31278b17a0 powerpc/modules: Never restore r2 for a mprofile-kernel style mcount() call
In the module loader we process relocations, and for long jumps we
generate trampolines (aka stubs). At the call site for one of these
trampolines we usually need to generate a load instruction to restore
the TOC pointer into r2.

There is one exception however, which is calls to mcount() using the
mprofile-kernel ABI, they handle the TOC inside the stub, and so for
them we do not generate a TOC load.

The bug is in how the code in restore_r2() decides if it needs to
generate the TOC load. It does so by looking for a nop following the
branch, and if it sees a nop, it replaces it with the load. In general
the compiler has no reason to generate a nop following the mcount()
call and so that check works OK.

However if we combine a jump label at the start of a function, with an
early return, such that GCC applies the shrink-wrapping optimisation, we
can then end up with an mcount call followed immediately by a nop.
However the nop is not there for a TOC load, it is for the jump label.

That confuses restore_r2() into replacing the jump label nop with a TOC
load, which in turn confuses ftrace into replacing the mcount call with
a b +8 (fixed in the previous commit). The end result is we jump over
the jump label, which if it was supposed to return means we incorrectly
run the body of the function.

We have seen this in practice with some yet-to-be-merged patches that
use jump labels more extensively.

The fix is relatively simple, in restore_r2() we check for an
mprofile-kernel style mcount() call first, before looking for the
presence of a nop.

Fixes: 153086644f ("powerpc/ftrace: Add support for -mprofile-kernel ftrace ABI")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 20:10:42 +10:00
Michael Ellerman 9d63610951 powerpc/ftrace: Separate the heuristics for checking call sites
In __ftrace_make_nop() (the 64-bit version), we have code to deal with
two ftrace ABIs. There is the original ABI, which looks mostly like a
function call, and then the mprofile-kernel ABI which is just a branch.

The code tries to handle both cases, by looking for the presence of a
load to restore the TOC pointer (PPC_INST_LD_TOC). If we detect the TOC
load, we assume the call site is for an mcount() call using the old ABI.
That means we patch the mcount() call with a b +8, to branch over the
TOC load.

However if the kernel was built with mprofile-kernel, then there will
never be a call site using the original ftrace ABI. If for some reason
we do see a TOC load, then it's there for a good reason, and we should
not jump over it.

So split the code, using the existing CC_USING_MPROFILE_KERNEL. Kernels
built with mprofile-kernel will only look for, and expect, the new ABI,
and similarly for the original ABI.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 20:10:37 +10:00
Benjamin Herrenschmidt b1923caa6e powerpc: Merge 32-bit and 64-bit setup_arch()
There is little enough differences now.

mpe: Add a/p/k/setup.h to contain the prototypes and empty versions of
functions we need, rather than using weak functions. Add a few other
empty versions to avoid as many #ifdefs as possible in the code.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 19:17:46 +10:00
Benjamin Herrenschmidt 009776baa1 powerpc/64: Make a few boot functions __init
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 19:17:25 +10:00
Benjamin Herrenschmidt f7b9ebb79e powerpc: Re-order setup_panic()
Do it right after probe_machine() since it's about testing ppc_md,
and put the test in the common code.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 19:17:23 +10:00
Benjamin Herrenschmidt e39afba3aa powerpc: Re-order the call to smp_setup_cpu_maps()
It makes more sense to do it before intializing xmon() as xmon might
use the info in there. We do want to register the console early
though in case we want some functioning printk's in the cpu map setup.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 19:14:32 +10:00
Benjamin Herrenschmidt 8f212cb26f powerpc/32: Move cache info inits to a separate function
Matches 64-bit. Also move the call to the same spot as ppc64

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 19:14:32 +10:00
Benjamin Herrenschmidt fa745a129c powerpc/64: Move the content of setup_system() to setup_arch()
And kill setup_system().

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 19:14:29 +10:00
Benjamin Herrenschmidt 9df549afea powerpc/64: Move setting of {i,d}cache_bsize to initialize_cache_info()
Also remove the completely osbolete comment. We *do* look in the
device-tree.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 19:08:06 +10:00
Benjamin Herrenschmidt bf1b61fb57 powerpc/64: Move the boot time info banner to a separate function
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 19:08:05 +10:00
Benjamin Herrenschmidt f2d576948d powerpc: Get rid of ppc_md.init_early()
It is now called right after platform probe, so the probe function
can just do the job.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 19:07:26 +10:00
Benjamin Herrenschmidt 5657138404 powerpc: Move 32-bit probe() machine to later in the boot process
This converts all the 32-bit platforms to use the expanded device-tree
which is a pretty mechanical change. Unlike 64-bit, the 32-bit kernel
didn't rely on platform initializations to setup the MMU since it
sets it up entirely before probe_machine() so the move has comparatively
less consequences though it's a bigger patch.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 19:06:42 +10:00
Benjamin Herrenschmidt 406b0b6ae3 powerpc/64: Move 64-bit probe_machine() to later in the boot process
We no long need the machine type that early, so we can move probe_machine()
to after the device-tree has been expanded. This will allow further
consolidation.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 18:59:22 +10:00
Benjamin Herrenschmidt 84b62c72fa powerpc: Ensure that ppc_md is empty before probing for machine type
Anything in there will be overwritten, so it helps catching nasty
bugs if we check that it's indeed full of NULL's before we do so.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 18:59:21 +10:00
Benjamin Herrenschmidt 7025776ed1 powerpc/mm: Move hash table ops to a separate structure
Moving probe_machine() to after mmu init will cause the ppc_md
fields relative to the hash table management to be overwritten.

Since we have essentially disconnected the machine type from
the hash backend ops, finish the job by moving them to a different
structure.

The only callback that didn't quite fix is update_partition_table
since this is not specific to hash, so I moved it to a standalone
variable for now. We can revisit later if needed.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Fix ppc64e build failure in kexec]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 18:59:09 +10:00
Benjamin Herrenschmidt 166dd7d3fb powerpc/64: Move MMU backend selection out of platform code
We move it into early_mmu_init() based on firmware features. For PS3,
we have to move the setting of these into early_init_devtree().

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 18:56:38 +10:00
Benjamin Herrenschmidt d3cbff1b5a powerpc: Put exception configuration in a common place
The various calls to establish exception endianness and AIL are
now done from a single point using already established CPU and FW
feature bits to decide what to do.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 18:56:31 +10:00
Benjamin Herrenschmidt 3808a88985 powerpc: Move FW feature probing out of pseries probe()
We move the function itself to pseries/firmware.c and call it along
with almost all other flat device-tree parsers from early_init_devtree()

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Move #ifdefs into the header by providing pseries_probe_fw_features()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 18:56:13 +10:00
Benjamin Herrenschmidt de4cf3de59 powerpc: Move 64-bit memory reserves to setup_arch()
There is really no need to do them that early, early_setup() runs
before MMU is on, we should do the strict minimum there to get the
MMU going.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 18:54:55 +10:00
Benjamin Herrenschmidt c4bd6cb87c powerpc: Move 64-bit feature fixup earlier
Make it part of early_setup() as we really want the feature fixups
to be applied before we turn on the MMU since they can have an impact
on the various assembly path related to MMU management and interrupts.

This makes 64-bit match what 32-bit does.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 18:54:55 +10:00
Benjamin Herrenschmidt 9402c68461 powerpc: Factor do_feature_fixup calls
32 and 64-bit do a similar set of calls early on, we move it all to
a single common function to make the boot code more readable.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 18:51:42 +10:00
Kevin Hao 27d1149667 powerpc/32: Remove RELOCATABLE_PPC32
It is seldom used in the kernel code and can be easily replaced by
either RELOCATABLE or PPC32. So there is no reason to keep a separate
kernel option for this.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-19 20:17:07 +10:00
Aneesh Kumar K.V b275bfb269 powerpc/mm/radix: Add a kernel command line to disable radix
This patch adds the kernel command line disable_radix which disable
the radix MMU mode even if firmware indicates radix support via
ibm,pa-features device tree node.

This helps in testing different MMU mode easily.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-17 16:42:55 +10:00
Aneesh Kumar K.V accfad7d0a powerpc/mm: Clear top 16 bits of va only on older cpus
As per ISA, we need to do this only for architecture version 2.02 and
earlier. This continued to work even for 2.07. But let's not do this for
anything after 2.02. ISA 3.0 requires these top bits to be not cleared.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-17 16:42:52 +10:00
Benjamin Herrenschmidt 9a1a70ae15 powerpc/pci: Don't try to allocate resources that will be reassigned
When we know we will reassign all resources, trying (and failing)
to allocate them initially is fairly pointless and leads to a lot
of scary messages in the kernel log

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-17 16:42:49 +10:00
Benjamin Herrenschmidt 69c592ed40 powerpc/opal: Add real mode call wrappers
Replace the old generic opal_call_realmode() with proper per-call
wrappers similar to the normal ones and convert callers.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-17 16:42:46 +10:00
Benjamin Herrenschmidt 1d607bb3bd powerpc/irq: Add mechanism to force a replay of interrupts
Calling this function with interrupts soft-disabled will cause
a replay of the external interrupt vector when they are re-enabled.

This will be used by the OPAL XICS backend (and latter by the native
XIVE code) to handle EOI signaling that there are more interrupts to
fetch from the hardware since the hardware won't issue another HW
interrupt in that case.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-17 16:42:44 +10:00
Benjamin Herrenschmidt 9baaef0a22 powerpc/irq: Add support for HV virtualization interrupts
This will be delivering external interrupts from the XIVE to the
Hypervisor. We treat it as a normal external interrupt for the
lazy irq disable code (so it will be replayed as a 0x500) and
route it to do_IRQ.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-17 16:42:44 +10:00
Benjamin Herrenschmidt b88d4bce2b powerpc/book64s: Move a few exception common handlers to make room
This moves the CBE RAS and facility unavailable "common" handlers
down to after the FWNMI page.

This frees up some space in the very demanded spaces before the
relocation-on vectors and before the FWNMI page. They are still
within 64K of __start, so CONFIG_RELOCATABLE should still work.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-17 16:42:34 +10:00
Shreyas B. Prabhu bcef83a00d powerpc/powernv: Add platform support for stop instruction
POWER ISA v3 defines a new idle processor core mechanism. In summary,
 a) new instruction named stop is added. This instruction replaces
	instructions like nap, sleep, rvwinkle.
 b) new per thread SPR named Processor Stop Status and Control Register
	(PSSCR) is added which controls the behavior of stop instruction.

PSSCR layout:
----------------------------------------------------------
| PLS | /// | SD | ESL | EC | PSLL | /// | TR | MTL | RL |
----------------------------------------------------------
0      4     41   42    43   44     48    54   56    60

PSSCR key fields:
	Bits 0:3  - Power-Saving Level Status. This field indicates the lowest
	power-saving state the thread entered since stop instruction was last
	executed.

	Bit 42 - Enable State Loss
	0 - No state is lost irrespective of other fields
	1 - Allows state loss

	Bits 44:47 - Power-Saving Level Limit
	This limits the power-saving level that can be entered into.

	Bits 60:63 - Requested Level
	Used to specify which power-saving level must be entered on executing
	stop instruction

This patch adds support for stop instruction and PSSCR handling.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-15 20:18:41 +10:00
Shreyas B. Prabhu 0dfffb48ce powerpc/powernv: abstraction for saving SPRs before entering deep idle states
Create a function for saving SPRs before entering deep idle states.
This function can be reused for POWER9 deep idle states.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-15 20:18:40 +10:00
Shreyas B. Prabhu 4eae2c9ae5 powerpc/powernv: Make pnv_powersave_common more generic
pnv_powersave_common does common steps needed before entering idle
state and eventually changes MSR to MSR_IDLE and does rfid to
pnv_enter_arch207_idle_mode.

Move the updation of HSTATE_HWTHREAD_STATE to pnv_powersave_common
from pnv_enter_arch207_idle_mode and make it more generic by passing the
rfid address as a function parameter.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-15 20:18:40 +10:00
Shreyas B. Prabhu 5fa6b6bd7a powerpc/powernv: Rename reusable idle functions to hardware agnostic names
Functions like power7_wakeup_loss, power7_wakeup_noloss,
power7_wakeup_tb_loss are used by POWER7 and POWER8 hardware. They can
also be used by POWER9. Hence rename these functions hardware agnostic
names.

Suggested-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-15 20:18:39 +10:00
Shreyas B. Prabhu 83289f909a powerpc/powernv: Rename idle_power7.S to idle_book3s.S
idle_power7.S handles idle entry/exit for POWER7, POWER8 and in next
patch for POWER9. Rename the file to a non-hardware specific
name.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-15 20:18:39 +10:00
Shreyas B. Prabhu 1706567117 powerpc/kvm: make hypervisor state restore a function
In the current code, when the thread wakes up in reset vector, some
of the state restore code and check for whether a thread needs to
branch to kvm is duplicated. Reorder the code such that this
duplication is avoided.

At a higher level this is what the change looks like-

Before this patch -
power7_wakeup_tb_loss:
	restore hypervisor state
	if (thread needed by kvm)
		goto kvm_start_guest
	restore nvgprs, cr, pc
	rfid to process context

power7_wakeup_loss:
	restore nvgprs, cr, pc
	rfid to process context

reset vector:
	if (waking from deep idle states)
		goto power7_wakeup_tb_loss
	else
		if (thread needed by kvm)
			goto kvm_start_guest
		goto power7_wakeup_loss

After this patch -
power7_wakeup_tb_loss:
	restore hypervisor state
	return

power7_restore_hyp_resource():
	if (waking from deep idle states)
		goto power7_wakeup_tb_loss
	return

power7_wakeup_loss:
	restore nvgprs, cr, pc
	rfid to process context

reset vector:
	power7_restore_hyp_resource()
	if (thread needed by kvm)
                goto kvm_start_guest
	goto power7_wakeup_loss

Reviewed-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-15 20:18:38 +10:00
Shreyas B. Prabhu bfd1b7ae5e powerpc/powernv: Use PNV_THREAD_WINKLE macro while requesting for winkle
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-15 20:18:38 +10:00
Michael Neuling 6bcb80143e powerpc/tm: Fix stack pointer corruption in __tm_recheckpoint()
At the start of __tm_recheckpoint() we save the kernel stack pointer
(r1) in SPRG SCRATCH0 (SPRG2) so that we can restore it after the
trecheckpoint.

Unfortunately, the same SPRG is used in the SLB miss handler.  If an
SLB miss is taken between the save and restore of r1 to the SPRG, the
SPRG is changed and hence r1 is also corrupted.  We can end up with
the following crash when we start using r1 again after the restore
from the SPRG:

  Oops: Bad kernel stack pointer, sig: 6 [#1]
  SMP NR_CPUS=2048 NUMA pSeries
  CPU: 658 PID: 143777 Comm: htm_demo Tainted: G            EL   X 4.4.13-0-default #1
  task: c0000b56993a7810 ti: c00000000cfec000 task.ti: c0000b56993bc000
  NIP: c00000000004f188 LR: 00000000100040b8 CTR: 0000000010002570
  REGS: c00000000cfefd40 TRAP: 0300   Tainted: G            EL   X  (4.4.13-0-default)
  MSR: 8000000300001033 <SF,ME,IR,DR,RI,LE>  CR: 02000424  XER: 20000000
  CFAR: c000000000008468 DAR: 00003ffd84e66880 DSISR: 40000000 SOFTE: 0
  PACATMSCRATCH: 00003ffbc865e680
  GPR00: fffffffcfabc4268 00003ffd84e667a0 00000000100d8c38 000000030544bb80
  GPR04: 0000000000000002 00000000100cf200 0000000000000449 00000000100cf100
  GPR08: 000000000000c350 0000000000002569 0000000000002569 00000000100d6c30
  GPR12: 00000000100d6c28 c00000000e6a6b00 00003ffd84660000 0000000000000000
  GPR16: 0000000000000003 0000000000000449 0000000010002570 0000010009684f20
  GPR20: 0000000000800000 00003ffd84e5f110 00003ffd84e5f7a0 00000000100d0f40
  GPR24: 0000000000000000 0000000000000000 0000000000000000 00003ffff0673f50
  GPR28: 00003ffd84e5e960 00000000003d0f00 00003ffd84e667a0 00003ffd84e5e680
  NIP [c00000000004f188] restore_gprs+0x110/0x17c
  LR [00000000100040b8] 0x100040b8
  Call Trace:
  Instruction dump:
  f8a1fff0 e8e700a8 38a00000 7ca10164 e8a1fff8 e821fff0 7c0007dd 7c421378
  7db142a6 7c3242a6 38800002 7c810164 <e9c100e0> e9e100e8 ea0100f0 ea2100f8

We hit this on large memory machines (> 2TB) but it can also be hit on
smaller machines when 1TB segments are disabled.

To hit this, you also need to be virtualised to ensure SLBs are
periodically removed by the hypervisor.

This patches moves the saving of r1 to the SPRG to the region where we
are guaranteed not to take any further SLB misses.

Fixes: 98ae22e15b ("powerpc: Add helper functions for transactional memory context switching")
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-15 15:00:18 +10:00
Michael Ellerman b5f1bf48f2 powerpc fixes for 4.7 #5
- tm: Always reclaim in start_thread() for exec() class syscalls from Cyril Bur
  - tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 from Michael Neuling
  - eeh: Fix wrong argument passed to eeh_rmv_device() from Gavin Shan
  - Initialise pci_io_base as early as possible from Darren Stevens
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXeEmsAAoJEFHr6jzI4aWAwMMQAKs/u9rwB3gpOkNJSHajN1Dd
 kdqDufzLxLDwbWnMfqM1+bcO2EOjPhKbtpbzhG6oeiET8undRRoLsjHS5rZeYK5h
 cviRPEJ/Yz8ZWaIgFGI8+02gXwU0MJhuTY8NPexXmsh4FRdKYwEuCIJShl30lg22
 P7UrJ2SCNM+H/uZyS07B7thiwBeAKSp6VkLTpuW/QDz2j1ra/F22dTh7c0Agdahd
 INAMAnh9nYeuMVYn4XjOOlQ07JnBTuf1/W5Wxlw4i/86rVq+Hy8zh5r1X52oysR5
 lZl63B9q3agKG9cc9lSN2ibTDVerlFMwB2QysX2a6Uy7+y2SB3hS7VS1RTXCh3hg
 /omApGGVW3Hh+E2CuKfFLQySU55NRpLAoTGravGr/KsH4wZP/n/fkrctldCrqm7P
 sTPT52+t+iJQk4fiskRY3yQ7DTTnt3rTC8MJRGqvLuCheolLll4NQaWOF75AJP+7
 WFWtC4QHOTPERMkhqLnZDG2vNuDg1H8chuZ2+PxtIs6G1vuOEun+MTZAYh4u6XWE
 bAIT9rV3xBdE17bzYOQz7lU1y7yNVtP7xkm0HIOAHlU4gUrjQp5u8F3TnPW3/M0m
 8GeaZdrPjhsaNg31YZODAeM8Ddf+N9d2a2VPIr/fzytURhMe0ss3Z/MdMoYRATab
 Lh1o+G3gDo9MVaphoJ3w
 =oEAY
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.7-5' into next

Pull in the fixes we sent during 4.7, we have code we want to merge into
next that depends on some of them.
2016-07-15 14:57:47 +10:00
Daniel Axtens 95ec77c06e powerpc: Make ppc_md.{halt, restart} __noreturn
powernv marks it's halt and restart calls as __noreturn. However,
ppc_md does not have this annotation. Add the annotation to ppc_md,
and then to every halt/restart function that is missing it.

Additionally, I have verified that all of these functions do not
return. Occasionally I have added a spin loop to be sure.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14 21:12:06 +10:00
Suraj Jitindar Singh a7d6392866 powerpc/crash: Rearrange loop condition to avoid out of bounds array access
The array crash_shutdown_handles[] has size CRASH_HANDLER_MAX, thus when
we loop over the elements of the list we check crash_shutdown_handles[i]
&& i < CRASH_HANDLER_MAX. However this means that when we increment i to
CRASH_HANDLER_MAX we will perform an out of bound array access checking
the first condition before exiting on the second condition.

To avoid the out of bounds access, simply reorder the loop conditions.

Fixes: 1d1451655b ("powerpc: Add array bounds checking to crash_shutdown_handlers")
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14 20:26:22 +10:00
Benjamin Herrenschmidt 0f2b3442fb powerpc: Don't test for machine type in smp_setup_cpu_maps()
The subsequent test for RTAS along with the LPAR test are sufficient

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-13 18:15:38 +10:00
Benjamin Herrenschmidt 484cc1ed3c powerpc/rtas: Don't test for machine type in rtas_initialize()
The test is unnecessary, the FW_FEATURE_LPAR is sufficient as there
exist no other LPAR type that has RTAS.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-13 18:15:38 +10:00
Benjamin Herrenschmidt da6a97bf12 powerpc: Move epapr_paravirt_early_init() to early_init_devtree()
The function is called by both 32-bit and 64-bit early setup right
after early_init_devtree(). All it does is run yet another early
DT parser which is precisely what early_init_devtree() is about,
so move it in there.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-11 20:09:40 +10:00
Benjamin Herrenschmidt 63c254a501 powerpc: Add comment explaining the purpose of setup_kdump_trampoline()
Anything in early_setup() needs to be justified to be there, in
this case, we need the trampolines before we can take exceptions
and thus before we turn on the MMU.

Also remove a pretty meaningless and misplaced debug message

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Fix comment formatting]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-11 20:09:40 +10:00
Benjamin Herrenschmidt bd7c93cca3 powerpc: Update obsolete comments in setup_32.c about entry conditions
early_init() is called in-place before kernel relocation and using
whatever MMU setup exists at the point the kernel is entered.

machine_init() is called after relocation and after some initial
mapping of PAGE_OFFSET has been established (typically using BATs
on 6xx/7xx/7xxx processors or some form of bolted TLB on others).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-11 20:09:39 +10:00
Scott Wood 9f595fd8b5 powerpc/8xx: Force VIRT_IMMR_BASE to be a positive number
The asm-offsets mechanism generates signed numbers, even if the
input value is explicitly unsigned.  This causes a problem with
older binutils (e.g. 2.23), which sign-extend a negative number
when @h is applied.  Thus, this instruction:

	cmpli   cr0, r11, VIRT_IMMR_BASE@h

resulted in this:

Error: operand out of range (0xfffffff0 is not between 0x00000000 and
0x0000ffff)

By casting to a larger type, we can force the output to be expressed
as a positive number.

Signed-off-by: Scott Wood <oss@buserror.net>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
2016-07-09 03:26:53 -05:00
Christophe Leroy 62f64b49d0 powerpc/8xx: add CONFIG_PIN_TLB_IMMR
CONFIG_PIN_TLB maps IMMR area and the first 24 Mbytes of memory.
In some circunstances it might be more interesting to not map
IMMR but map 32 Mbytes of memory instead.

Therefore we add config option CONFIG_PIN_TLB_IMMR to select if
IMMR shall be pinned or not, hence whether we pin 24 or 32 Mbytes of RAM

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-07-09 02:02:48 -05:00
Christophe Leroy 4ad274502f powerpc/8xx: Rework CONFIG_PIN_TLB handling
On recent kernels, with some debug options like for instance
CONFIG_LOCKDEP, the BSS requires more than 8M memory, allthough
the kernel code fits in the first 8M.
Today, it is necessary to activate CONFIG_PIN_TLB to get more than 8M
at startup, allthough pinning TLB is not necessary for that.

We could have inconditionaly mapped 16 or 24M bytes at startup
but some old hardware only have 8M and mapping non-existing RAM
would be an issue due to speculative accesses.

With the preceding patch however, the TLB entries are populated on
demand. By setting up the TLB miss handler to handle up to 24M until
the handler is patched for the entire memory space, it is possible
to allow access up to more memory without mapping non-existing RAM.

It is therefore not needed anymore to map memory data at all
at startup. It will be handled by the TLB miss handler.

One might still want to PIN the IMMR and the first 24M of RAM.
It is now possible to do it in the C memory initialisation
functions. In addition, we now know how much memory we have
when we do it, so we are able to adapt the pining to the
real amount of memory available. So boards with less than 24M
can now also benefit from PIN_TLB.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-07-09 02:02:48 -05:00
Christophe Leroy bb7f380849 powerpc/8xx: Don't use page table for linear memory space
Instead of using the first level page table to define mappings for
the linear memory space, we can use direct mapping from the TLB
handling routines. This has several advantages:
* No need to read the tables at each TLB miss
* No issue in 16k pages mode where the 1st level table maps 64 Mbytes

The size of the available linear space is known at system startup.
In order to avoid data access at each TLB miss to know the memory
size, the TLB routine is patched at startup with the proper size

This patch provides a 10%-15% improvment of TLB miss handling for
kernel addresses

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-07-09 02:02:48 -05:00
Christophe Leroy 6264dbb98f powerpc/8xx: unpin all TLBs before flushing
Bootloader may have pinned some TLB entries so the kernel must
unpin them before flushing TLBs with tlbia otherwise pinned TLB
entries won't get flushed

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-07-09 02:02:48 -05:00
Christophe Leroy 4badd43ae4 powerpc/8xx: Map IMMR area with 512k page at a fixed address
Once the linear memory space has been mapped with 8Mb pages, as
seen in the related commit, we get 11 millions DTLB missed during
the reference 600s period. 77% of the misses are on user addresses
and 23% are on kernel addresses (1 fourth for linear address space
and 3 fourth for virtual address space)

Traditionaly, each driver manages one computer board which has its
own components with its own memory maps.
But on embedded chips like the MPC8xx, the SOC has all registers
located in the same IO area.

When looking at ioremaps done during startup, we see that
many drivers are re-mapping small parts of the IMMR for their own use
and all those small pieces gets their own 4k page, amplifying the
number of TLB misses: in our system we get 0xff000000 mapped 31 times
and 0xff003000 mapped 9 times.

Even if each part of IMMR was mapped only once with 4k pages, it would
still be several small mappings towards linear area.

This patch maps the IMMR with a single 512k page.

With this patch applied, the number of DTLB misses during the 10 min
period is reduced to 11.8 millions for a duration of 5.8s, which
represents 2% of the non-idle time hence yet another 10% reduction.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-07-09 02:02:48 -05:00
Christophe Leroy f86ef74ed9 powerpc/8xx: Fix vaddr for IMMR early remap
Memory: 124428K/131072K available (3748K kernel code, 188K rwdata,
648K rodata, 508K init, 290K bss, 6644K reserved)
Kernel virtual memory layout:
  * 0xfffdf000..0xfffff000  : fixmap
  * 0xfde00000..0xfe000000  : consistent mem
  * 0xfddf6000..0xfde00000  : early ioremap
  * 0xc9000000..0xfddf6000  : vmalloc & ioremap
SLUB: HWalign=16, Order=0-3, MinObjects=0, CPUs=1, Nodes=1

Today, IMMR is mapped 1:1 at startup

Mapping IMMR 1:1 is just wrong because it may overlap with another
area. On most mpc8xx boards it is OK as IMMR is set to 0xff000000
but for instance on EP88xC board, IMMR is at 0xfa200000 which
overlaps with VM ioremap area

This patch fixes the virtual address for remapping IMMR with the fixmap
regardless of the value of IMMR.

The size of IMMR area is 256kbytes (CPM at offset 0, security engine
at offset 128k) so a 512k page is enough

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-07-09 02:02:48 -05:00
Christophe Leroy c223c90386 powerpc32: provide VIRT_CPU_ACCOUNTING
This patch provides VIRT_CPU_ACCOUTING to PPC32 architecture.
PPC32 doesn't have the PACA structure, so we use the task_info
structure to store the accounting data.

In order to reuse on PPC32 the PPC64 functions, all u64 data has
been replaced by 'unsigned long' so that it is u32 on PPC32 and
u64 on PPC64

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-07-09 01:43:50 -05:00
Andrew Donnellan fa2cff3f54 powerpc: Fix typo in comment reference to CONFIG_TRACE_IRQFLAGS
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-08 22:10:03 +10:00
Andrew Donnellan 91dc068202 powerpc/eeh: Fix pr_debug()s in eeh_cache.c
eeh_cache.c doesn't build cleanly with -DDEBUG when
CONFIG_PHYS_ADDR_T_64BIT is set, as a couple of pr_debug()s use "%lx" for
resource_size_t parameters.

Use "%pap" instead, as it's the correct format specifier for types deriving
from phys_addr_t.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-08 22:09:50 +10:00
Greg Kurz 8c6a0a1f40 powerpc/pseries: start rtasd before PCI probing
A strange behaviour is observed when comparing PCI hotplug in QEMU, between
x86 and pseries. If you consider the following steps:
- start a VM
- add a PCI device via the QEMU monitor before the rtasd has started (for
  example starting the VM in paused state, or hotplug during FW or boot
  loader)
- resume the VM execution

The x86 kernel detects the PCI device, but the pseries one does not.

This happens because the rtasd kernel worker is currently started under
device_initcall, while PCI probing happens earlier under subsys_initcall.

As a consequence, if we have a pending RTAS event at boot time, a message
is printed and the event is dropped.

This patch moves all the initialization of rtasd to arch_initcall, which is
run before subsys_call: this way, logging_enabled is true when the RTAS
event pops up and it is not lost anymore.

The proc fs bits stay at device_initcall because they cannot be run before
fs_initcall.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-08 19:22:15 +10:00
Guilherme G. Piccoli 63a72284b1 powerpc/pci: Assign fixed PHB number based on device-tree properties
The domain/PHB field of PCI addresses has its value obtained from a
global variable, incremented each time a new domain (represented by
struct pci_controller) is added on the system. The domain addition
process happens during boot or due to PHB hotplug add.

As recent kernels are using predictable naming for network interfaces,
the network stack is more tied to PCI naming. This can be a problem in
hotplug scenarios, because PCI addresses will change if devices are
removed and then re-added. This situation seems unusual, but it can
happen if a user wants to replace a NIC without rebooting the machine,
for example.

This patch changes the way PCI domain values are generated: now, we use
device-tree properties to assign fixed PHB numbers to PCI addresses
when available (meaning pSeries and PowerNV cases). We also use a bitmap
to allow dynamic PHB numbering when device-tree properties are not
used. This bitmap keeps track of used PHB numbers and if a PHB is
released (by hotplug operations for example), it allows the reuse of
this PHB number, avoiding PCI address to change in case of device remove
and re-add soon after. No functional changes were introduced.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Ian Munsie <imunsie@au1.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
[mpe: Drop unnecessary machine_is(pseries) test]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-07 22:06:55 +10:00
Michael Ellerman d468fcafb7 powerpc/pci: Fix build with PCI_IOV=y and EEH=n
Despite attempting to fix this in commit fb36e90736 ("powerpc/pci: Fix
SRIOV not building without EEH enabled"), the build is still broken when
PCI_IOV=y and EEH=n (eg. g5_defconfig with PCI_IOV=y):

  arch/powerpc/kernel/pci_dn.c: In function ‘remove_dev_pci_data’:
  arch/powerpc/kernel/pci_dn.c:230:18: error: unused variable ‘edev’

Incorporate Ben's idea of using __maybe_unused to avoid so many #ifdefs.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-07 16:33:27 +10:00
Oliver O'Halloran 7990102446 powerpc/timer: Large Decrementer support
Power ISAv3 adds a large decrementer (LD) mode which increases the size
of the decrementer register. The size of the enlarged decrementer
register is between 32 and 64 bits with the exact size being dependent
on the implementation. When in LD mode, reads are sign extended to 64
bits and a decrementer exception is raised when the high bit is set (i.e
the value goes below zero). Writes however are truncated to the physical
register width so some care needs to be taken to ensure that the high
bit is not set when reloading the decrementer. This patch adds support
for using the LD inside the host kernel on processors that support it.

When LD mode is supported firmware will supply the ibm,dec-bits property
for CPU nodes to allow the kernel to determine the maximum decrementer
value. Enabling LD mode is a hypervisor privileged operation so the kernel
can only enable it manually when running in hypervisor mode. Guests that
support LD mode can request it using the "ibm,client-architecture-support"
firmware call (not implemented in this patch) or some other platform
specific method. If this property is not supplied then the traditional
decrementer width of 32 bit is assumed and LD mode will not be enabled.

This patch was based on initial work by Jack Miller.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-05 23:58:53 +10:00
Andrew Donnellan a9862c7440 powerpc/rtas: Fix array overrun in ppc_rtas() syscall
If ppc_rtas() is called with args.nargs == 16 and args.nret == 0,
args.rets is set to point to &args.args[16], which is beyond the end of
the args.args array. This results in a minor read overrun of the array
when we check the first return code (which, per PAPR, is a required
output of all RTAS calls) to see if there's been a hardware error.

Change the nargs/nret check to ensure nargs is <= 15, allowing room for
the status code. Users shouldn't be calling with nret == 0, but there's
no real harm if they do, so we don't stop them.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-05 23:49:52 +10:00
Chris Smart ae26b36f80 powerpc: Send SIGBUS on unaligned copy and paste
Calling ISA 3.0 instructions copy, copy_first, paste and paste_last
generates an alignment fault when copying or pasting unaligned
data (128 byte). We catch this and send SIGBUS to the userspace
process that caused it.

We do not emulate these because paste may contain additional metadata
when pasting to a co-processor and paste_last is the synchronisation
point for preceding copy/paste sequences.

Thanks to Michael Neuling <mikey@neuling.org> for his help.

Signed-off-by: Chris Smart <chris@distroguy.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-05 23:49:51 +10:00
Madhavan Srinivasan 393eb79ad3 powerpc/perf: factor out power8 __init_pmu code
Factor out the power8 pmu init functions to share with
power9. Monitor Mode Control Register S(MMCRS) and
Monitor Mode Control Register H(MMCRH) registers are
dropped in Power9. These registers are added to new
function which are included for power8 init.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-05 23:49:47 +10:00
Michael Ellerman b5b1cfc5d4 powerpc/fadump: Fix build error introduced by recent cleanup
We spent so much time bike-shedding the printk() we missed that the next
line was missing a semi-colon. And it seems none of our defconfigs turn
on CONFIG_FA_DUMP.

Fixes: 4a03749f14 ("powerpc/fadump: Trivial fix of spelling mistake, clean up message")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-05 23:49:46 +10:00
Darren Stevens bfa37087aa powerpc: Initialise pci_io_base as early as possible
Commit d6a9996e84 ("powerpc/mm: vmalloc abstraction in preparation for
radix") turned kernel memory and IO addresses from #defined constants to
variables initialised at runtime.

On PA6T (pasemi) systems the setup_arch() machine call initialises the
onboard PCI-e root-ports, and uses pci_io_base to do this, which is now
before its value has been set, resulting in a panic early in boot before
console IO is initialised.

Move the pci_io_base initialisation to the same place as vmalloc ranges
are set (hash__early_init_mmu()/radix__early_init_mmu()) - this is the
earliest possible place we can initialise it.

Fixes: d6a9996e84 ("powerpc/mm: vmalloc abstraction in preparation for radix")
Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Darren Stevens <darren@stevens-zone.net>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Add #ifdef CONFIG_PCI, massage change log slightly]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-30 16:52:29 +10:00
Michael Neuling 190ce8693c powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0
Currently we have 2 segments that are bolted for the kernel linear
mapping (ie 0xc000... addresses). This is 0 to 1TB and also the kernel
stacks. Anything accessed outside of these regions may need to be
faulted in. (In practice machines with TM always have 1T segments)

If a machine has < 2TB of memory we never fault on the kernel linear
mapping as these two segments cover all physical memory. If a machine
has > 2TB of memory, there may be structures outside of these two
segments that need to be faulted in. This faulting can occur when
running as a guest as the hypervisor may remove any SLB that's not
bolted.

When we treclaim and trecheckpoint we have a window where we need to
run with the userspace GPRs. This means that we no longer have a valid
stack pointer in r1. For this window we therefore clear MSR RI to
indicate that any exceptions taken at this point won't be able to be
handled. This means that we can't take segment misses in this RI=0
window.

In this RI=0 region, we currently access the thread_struct for the
process being context switched to or from. This thread_struct access
may cause a segment fault since it's not guaranteed to be covered by
the two bolted segment entries described above.

We've seen this with a crash when running as a guest with > 2TB of
memory on PowerVM:

  Unrecoverable exception 4100 at c00000000004f138
  Oops: Unrecoverable exception, sig: 6 [#1]
  SMP NR_CPUS=2048 NUMA pSeries
  CPU: 1280 PID: 7755 Comm: kworker/1280:1 Tainted: G                 X 4.4.13-46-default #1
  task: c000189001df4210 ti: c000189001d5c000 task.ti: c000189001d5c000
  NIP: c00000000004f138 LR: 0000000010003a24 CTR: 0000000010001b20
  REGS: c000189001d5f730 TRAP: 4100   Tainted: G                 X  (4.4.13-46-default)
  MSR: 8000000100001031 <SF,ME,IR,DR,LE>  CR: 24000048  XER: 00000000
  CFAR: c00000000004ed18 SOFTE: 0
  GPR00: ffffffffc58d7b60 c000189001d5f9b0 00000000100d7d00 000000003a738288
  GPR04: 0000000000002781 0000000000000006 0000000000000000 c0000d1f4d889620
  GPR08: 000000000000c350 00000000000008ab 00000000000008ab 00000000100d7af0
  GPR12: 00000000100d7ae8 00003ffe787e67a0 0000000000000000 0000000000000211
  GPR16: 0000000010001b20 0000000000000000 0000000000800000 00003ffe787df110
  GPR20: 0000000000000001 00000000100d1e10 0000000000000000 00003ffe787df050
  GPR24: 0000000000000003 0000000000010000 0000000000000000 00003fffe79e2e30
  GPR28: 00003fffe79e2e68 00000000003d0f00 00003ffe787e67a0 00003ffe787de680
  NIP [c00000000004f138] restore_gprs+0xd0/0x16c
  LR [0000000010003a24] 0x10003a24
  Call Trace:
  [c000189001d5f9b0] [c000189001d5f9f0] 0xc000189001d5f9f0 (unreliable)
  [c000189001d5fb90] [c00000000001583c] tm_recheckpoint+0x6c/0xa0
  [c000189001d5fbd0] [c000000000015c40] __switch_to+0x2c0/0x350
  [c000189001d5fc30] [c0000000007e647c] __schedule+0x32c/0x9c0
  [c000189001d5fcb0] [c0000000007e6b58] schedule+0x48/0xc0
  [c000189001d5fce0] [c0000000000deabc] worker_thread+0x22c/0x5b0
  [c000189001d5fd80] [c0000000000e7000] kthread+0x110/0x130
  [c000189001d5fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
  Instruction dump:
  7cb103a6 7cc0e3a6 7ca222a6 78a58402 38c00800 7cc62838 08860000 7cc000a6
  38a00006 78c60022 7cc62838 0b060000 <e8c701a0> 7ccff120 e8270078 e8a70098
  ---[ end trace 602126d0a1dedd54 ]---

This fixes this by copying the required data from the thread_struct to
the stack before we clear MSR RI. Then once we clear RI, we only access
the stack, guaranteeing there's no segment miss.

We also tighten the region over which we set RI=0 on the treclaim()
path. This may have a slight performance impact since we're adding an
mtmsr instruction.

Fixes: 090b9284d7 ("powerpc/tm: Clear MSR RI in non-recoverable TM code")
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-29 16:19:25 +10:00
Gavin Shan cca0e542e0 powerpc/eeh: Fix wrong argument passed to eeh_rmv_device()
When calling eeh_rmv_device() in eeh_reset_device() for partial hotplug
case, @rmv_data instead of its address is the proper argument.
Otherwise, the stack frame is corrupted when writing to
@rmv_data (actually its address) in eeh_rmv_device(). It results in
kernel crash as observed.

This fixes the issue by passing @rmv_data, not its address to
eeh_rmv_device() in eeh_reset_device().

Fixes: 67086e32b5 ("powerpc/eeh: powerpc/eeh: Support error recovery for VF PE")
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-28 20:47:49 +10:00
Colin Ian King 4a03749f14 powerpc/fadump: Trivial fix of spelling mistake, clean up message
Fix trivial spelling mistake "rgistration". Also use pr_err()
instead of printk() and unsplit the string to keep it all on one
line.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
[mpe: Keep rc on the same line, splitting it doesn't help]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-28 13:50:47 +10:00
Cyril Bur 8e96a87c54 powerpc/tm: Always reclaim in start_thread() for exec() class syscalls
Userspace can quite legitimately perform an exec() syscall with a
suspended transaction. exec() does not return to the old process, rather
it load a new one and starts that, the expectation therefore is that the
new process starts not in a transaction. Currently exec() is not treated
any differently to any other syscall which creates problems.

Firstly it could allow a new process to start with a suspended
transaction for a binary that no longer exists. This means that the
checkpointed state won't be valid and if the suspended transaction were
ever to be resumed and subsequently aborted (a possibility which is
exceedingly likely as exec()ing will likely doom the transaction) the
new process will jump to invalid state.

Secondly the incorrect attempt to keep the transactional state while
still zeroing state for the new process creates at least two TM Bad
Things. The first triggers on the rfid to return to userspace as
start_thread() has given the new process a 'clean' MSR but the suspend
will still be set in the hardware MSR. The second TM Bad Thing triggers
in __switch_to() as the processor is still transactionally suspended but
__switch_to() wants to zero the TM sprs for the new process.

This is an example of the outcome of calling exec() with a suspended
transaction. Note the first 700 is likely the first TM bad thing
decsribed earlier only the kernel can't report it as we've loaded
userspace registers. c000000000009980 is the rfid in
fast_exception_return()

  Bad kernel stack pointer 3fffcfa1a370 at c000000000009980
  Oops: Bad kernel stack pointer, sig: 6 [#1]
  CPU: 0 PID: 2006 Comm: tm-execed Not tainted
  NIP: c000000000009980 LR: 0000000000000000 CTR: 0000000000000000
  REGS: c00000003ffefd40 TRAP: 0700   Not tainted
  MSR: 8000000300201031 <SF,ME,IR,DR,LE,TM[SE]>  CR: 00000000  XER: 00000000
  CFAR: c0000000000098b4 SOFTE: 0
  PACATMSCRATCH: b00000010000d033
  GPR00: 0000000000000000 00003fffcfa1a370 0000000000000000 0000000000000000
  GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR12: 00003fff966611c0 0000000000000000 0000000000000000 0000000000000000
  NIP [c000000000009980] fast_exception_return+0xb0/0xb8
  LR [0000000000000000]           (null)
  Call Trace:
  Instruction dump:
  f84d0278 e9a100d8 7c7b03a6 e84101a0 7c4ff120 e8410170 7c5a03a6 e8010070
  e8410080 e8610088 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed023b

  Kernel BUG at c000000000043e80 [verbose debug info unavailable]
  Unexpected TM Bad Thing exception at c000000000043e80 (msr 0x201033)
  Oops: Unrecoverable exception, sig: 6 [#2]
  CPU: 0 PID: 2006 Comm: tm-execed Tainted: G      D
  task: c0000000fbea6d80 ti: c00000003ffec000 task.ti: c0000000fb7ec000
  NIP: c000000000043e80 LR: c000000000015a24 CTR: 0000000000000000
  REGS: c00000003ffef7e0 TRAP: 0700   Tainted: G      D
  MSR: 8000000300201033 <SF,ME,IR,DR,RI,LE,TM[SE]>  CR: 28002828  XER: 00000000
  CFAR: c000000000015a20 SOFTE: 0
  PACATMSCRATCH: b00000010000d033
  GPR00: 0000000000000000 c00000003ffefa60 c000000000db5500 c0000000fbead000
  GPR04: 8000000300001033 2222222222222222 2222222222222222 00000000ff160000
  GPR08: 0000000000000000 800000010000d033 c0000000fb7e3ea0 c00000000fe00004
  GPR12: 0000000000002200 c00000000fe00000 0000000000000000 0000000000000000
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000000000000 0000000000000000 c0000000fbea7410 00000000ff160000
  GPR24: c0000000ffe1f600 c0000000fbea8700 c0000000fbea8700 c0000000fbead000
  GPR28: c000000000e20198 c0000000fbea6d80 c0000000fbeab680 c0000000fbea6d80
  NIP [c000000000043e80] tm_restore_sprs+0xc/0x1c
  LR [c000000000015a24] __switch_to+0x1f4/0x420
  Call Trace:
  Instruction dump:
  7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8
  4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020

This fixes CVE-2016-5828.

Fixes: bc2a9408fa ("powerpc: Hook in new transactional memory code")
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-27 20:35:17 +10:00
Benjamin Herrenschmidt cdb1b3424d powerpc/pci: Reduce log level of PCI I/O space warning
If a PHB has no I/O space, there's no need to make it look like
something bad happened, a pr_debug() is plenty enough since this
is the case of all our modern POWER chips.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-24 15:26:31 +10:00
Michael Ellerman 6e914ee629 powerpc: Fix faults caused by radix patching of SLB miss handler
As part of the Radix MMU support we added some feature sections in the
SLB miss handler. These are intended to catch the case that we
incorrectly take an SLB miss when Radix is enabled, and instead of
crashing weirdly they bail out to a well defined exit path and trigger
an oops.

However the way they were written meant the bailout case was enabled by
default until we did CPU feature patching.

On powermacs the early debug prints in setup_system() can cause an SLB
miss, which happens before code patching, and so the SLB miss handler
would incorrectly bailout and crash during boot.

Fix it by inverting the sense of the feature section, so that the code
which is in place at boot is correct for the hash case. Once we
determine we are using Radix - which will never happen on a powermac -
only then do we patch in the bailout case which unconditionally jumps.

Fixes: caca285e5a ("powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code")
Reported-by: Denis Kirjanov <kda@linux-powerpc.org>
Tested-by: Denis Kirjanov <kda@linux-powerpc.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-23 09:58:17 +10:00
Gavin Shan 8cc7581cdb powerpc/pci: Delay populating pdn
The pdn (struct pci_dn) instances are allocated from memblock or
bootmem when creating PCI controller (hoses) in setup_arch(). PCI
hotplug, which will be supported by proceeding patches, releases
PCI device nodes and their corresponding pdn on unplugging event.
The memory chunks for pdn instances allocated from memblock or
bootmem are hard to reused after being released.

This delays creating pdn by pci_devs_phb_init() from setup_arch()
to core_initcall() so that they are allocated from slab. The memory
consumed by pdn can be released to system without problem during
PCI unplugging time. It indicates that pci_dn is unavailable in
setup_arch() and the the fixup on pdn (like AGP's) can't be carried
out that time. We have to do that in pcibios_root_bridge_prepare()
on maple/pasemi/powermac platforms where/when the pdn is available.
pcibios_root_bridge_prepare is called from subsys_initcall() which
is executed after core_initcall() so the code flow does not change.

At the mean while, the EEH device is created when pdn is populated,
meaning pdn and EEH device have same life cycle. In turn, we needn't
call eeh_dev_init() to create EEH device explicitly.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-21 15:30:56 +10:00
Gavin Shan 7415c14c56 powerpc/pci: Update bridge windows on PCI plug
On the PCI plugging event, PCI slot's subordinate devices are
scanned and their (IO and MMIO) resources are assigned. Platform
dependent resources (PE#, IO/MMIO/DMA windows) are allocated or
created on updating windows of the slot's upstream bridge.

This updates the windows of the hot plugged slot's upstream bridge
in pcibios_finish_adding_to_bus() so that the platform resources
(PE#, IO/MMIO/DMA segments) are allocated or created accordingly.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-21 15:30:56 +10:00
Gavin Shan c5fcb29a64 powerpc/pci: Override pcibios_setup_bridge()
This overrides pcibios_setup_bridge() that is called to update PCI
bridge windows when PCI resource assignment is completed, to assign
PE and setup various (resource) mapping for the PE in subsequent
patches.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-21 15:30:52 +10:00
Mauricio Faria de Oliveira f8ab481066 powerpc: export cpu_to_core_id()
Export cpu_to_core_id(). This will be used by the lpfc driver.

This enables topology_core_id() from <linux/topology.h> (defined
to cpu_to_core_id() in arch/powerpc/include/asm/topology.h) to be
used by (non-builtin) modules.

That is arch-neutral, already used by eg, drivers/base/topology.c,
but it is builtin (obj-y in Makefile) thus didn't need the export.

Since the module uses topology_core_id() and this is defined to
cpu_to_core_id(), it needs the export, otherwise:

    ERROR: "cpu_to_core_id" [drivers/scsi/lpfc/lpfc.ko] undefined!

Tested on next-20160601.

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-21 15:30:51 +10:00
Jack Miller bd3ea317fd powerpc: Load Monitor Register Support
This enables new registers, LMRR and LMSER, that can trigger an EBB in
userspace code when a monitored load (via the new ldmx instruction)
loads memory from a monitored space. This facility is controlled by a
new FSCR bit, LM.

This patch disables the FSCR LM control bit on task init and enables
that bit when a load monitor facility unavailable exception is taken
for using it. On context switch, this bit is then used to determine
whether the two relevant registers are saved and restored. This is
done lazily for performance reasons.

Signed-off-by: Jack Miller <jack@codezen.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-21 15:30:50 +10:00
Michael Neuling b57bd2de8c powerpc: Improve FSCR init and context switching
This fixes a few issues with FSCR init and switching.

In commit 152d523e63 ("powerpc: Create context switch helpers
save_sprs() and restore_sprs()") we moved the setting of the FSCR
register from inside an CPU_FTR_ARCH_207S section to inside just a
CPU_FTR_ARCH_DSCR section. Hence we are setting FSCR on POWER6/7 where
the FSCR doesn't exist. This is harmless but we shouldn't do it.

Also, we can simplify the FSCR context switch. We don't need to go
through the calculation involving dscr_inherit. We can just restore
what we saved last time.

We also set an initial value in INIT_THREAD, so that pid 1 which is
cloned from that gets a sane value.

Based on patch by Jack Miller.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-21 15:30:50 +10:00
Madhavan Srinivasan 103b7827d9 powerpc: Fix misleading comment in early_setup_secondary()
Current comment in the early_setup_secondary() for paca->soft_enabled
update is misleading. Comment should say to Mark interrupts "disabled"
instead of "enabled". Fix the typo.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-21 15:30:49 +10:00
Thiago Jung Bauermann 61ed9cfb1b powerpc/kprobes: Remove kretprobe_trampoline_holder.
Fixes the following testsuite failure:

  $ sudo ./perf test -v kallsyms
   1: vmlinux symtab matches kallsyms                          :
  --- start ---
  test child forked, pid 12489
  Using /proc/kcore for kernel object code
  Looking at the vmlinux_path (8 entries long)
  Using /boot/vmlinux for symbols
  0xc00000000003d300: diff name v: .kretprobe_trampoline_holder k: kretprobe_trampoline
  Maps only in vmlinux:
   c00000000086ca38-c000000000879b6c 87ca38 [kernel].text.unlikely
   c000000000879b6c-c000000000bf0000 889b6c [kernel].meminit.text
   c000000000bf0000-c000000000c53264 c00000 [kernel].init.text
   c000000000c53264-d000000004250000 c63264 [kernel].exit.text
   d000000004250000-d000000004450000 0 [libcrc32c]
   d000000004450000-d000000004620000 0 [xfs]
   d000000004620000-d000000004680000 0 [autofs4]
   d000000004680000-d0000000046e0000 0 [x_tables]
   d0000000046e0000-d000000004780000 0 [ip_tables]
   d000000004780000-d0000000047e0000 0 [rng_core]
   d0000000047e0000-ffffffffffffffff 0 [pseries_rng]
  Maps in vmlinux with a different name in kallsyms:
  Maps only in kallsyms:
   d000000000000000-f000000000000000 1000000000010000 [kernel.kallsyms]
   f000000000000000-ffffffffffffffff 3000000000010000 [kernel.kallsyms]
  test child finished with -1
  ---- end ----
  vmlinux symtab matches kallsyms: FAILED!

The problem is that the kretprobe_trampoline symbol looks like this:

  $ eu-readelf -s /boot/vmlinux G kretprobe_trampoline
   2431: c000000001302368     24 NOTYPE  LOCAL  DEFAULT       37 kretprobe_trampoline_holder
   2432: c00000000003d300      8 FUNC    LOCAL  DEFAULT        1 .kretprobe_trampoline_holder
  97543: c00000000003d300      0 NOTYPE  GLOBAL DEFAULT        1 kretprobe_trampoline

Its type is NOTYPE, and its size is 0, and this is a problem because
symbol-elf.c:dso__load_sym skips function symbols that are not STT_FUNC
or STT_GNU_IFUNC (this is determined by elf_sym__is_function). Even
if the type is changed to STT_FUNC, when dso__load_sym calls
symbols__fixup_duplicate, the kretprobe_trampoline symbol is dropped in
favour of .kretprobe_trampoline_holder because the latter has non-zero
size (as determined by choose_best_symbol).

With this patch, all vmlinux symbols match /proc/kallsyms and the
testcase passes.

Commit c1c355ce14 ("x86/kprobes: Get rid of
kretprobe_trampoline_holder()") gets rid of kretprobe_trampoline_holder
altogether on x86. This commit does the same on powerpc. This change
introduces no regressions on the perf and ftracetest testsuite results.

Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-21 15:30:49 +10:00
Mahesh Salgaonkar fd7bacbca4 KVM: PPC: Book3S HV: Fix TB corruption in guest exit path on HMI interrupt
When a guest is assigned to a core it converts the host Timebase (TB)
into guest TB by adding guest timebase offset before entering into
guest. During guest exit it restores the guest TB to host TB. This means
under certain conditions (Guest migration) host TB and guest TB can differ.

When we get an HMI for TB related issues the opal HMI handler would
try fixing errors and restore the correct host TB value. With no guest
running, we don't have any issues. But with guest running on the core
we run into TB corruption issues.

If we get an HMI while in the guest, the current HMI handler invokes opal
hmi handler before forcing guest to exit. The guest exit path subtracts
the guest TB offset from the current TB value which may have already
been restored with host value by opal hmi handler. This leads to incorrect
host and guest TB values.

With split-core, things become more complex. With split-core, TB also gets
split and each subcore gets its own TB register. When a hmi handler fixes
a TB error and restores the TB value, it affects all the TB values of
sibling subcores on the same core. On TB errors all the thread in the core
gets HMI. With existing code, the individual threads call opal hmi handle
independently which can easily throw TB out of sync if we have guest
running on subcores. Hence we will need to co-ordinate with all the
threads before making opal hmi handler call followed by TB resync.

This patch introduces a sibling subcore state structure (shared by all
threads in the core) in paca which holds information about whether sibling
subcores are in Guest mode or host mode. An array in_guest[] of size
MAX_SUBCORE_PER_CORE=4 is used to maintain the state of each subcore.
The subcore id is used as index into in_guest[] array. Only primary
thread entering/exiting the guest is responsible to set/unset its
designated array element.

On TB error, we get HMI interrupt on every thread on the core. Upon HMI,
this patch will now force guest to vacate the core/subcore. Primary
thread from each subcore will then turn off its respective bit
from the above bitmap during the guest exit path just after the
guest->host partition switch is complete.

All other threads that have just exited the guest OR were already in host
will wait until all other subcores clears their respective bit.
Once all the subcores turn off their respective bit, all threads will
will make call to opal hmi handler.

It is not necessary that opal hmi handler would resync the TB value for
every HMI interrupts. It would do so only for the HMI caused due to
TB errors. For rest, it would not touch TB value. Hence to make things
simpler, primary thread would call TB resync explicitly once for each
core immediately after opal hmi handler instead of subtracting guest
offset from TB. TB resync call will restore the TB with host value.
Thus we can be sure about the TB state.

One of the primary threads exiting the guest will take up the
responsibility of calling TB resync. It will use one of the top bits
(bit 63) from subcore state flags bitmap to make the decision. The first
primary thread (among the subcores) that is able to set the bit will
have to call the TB resync. Rest all other threads will wait until TB
resync is complete.  Once TB resync is complete all threads will then
proceed.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-06-20 14:11:25 +10:00
Bjorn Helgaas 3830135857 powerpc/pci: Implement pci_resource_to_user() with pcibios_resource_to_bus()
"User" addresses are shown in /sys/devices/pci.../.../resource and
/proc/bus/pci/devices and used as mmap offsets for /proc/bus/pci/BB/DD.F
files.  For I/O port resources on powerpc, these are PCI bus addresses,
i.e., raw BAR values.

Previously pci_resource_to_user() computed the user address by subtracting
"hose->io_base_virt - _IO_BASE" from the resource start:

  pci_resource_to_user()
    if (IO)
      offset = (unsigned long)hose->io_base_virt - _IO_BASE;
    *start = rsrc->start - offset;

We've already told the PCI core about that "hose->io_base_virt - _IO_BASE"
offset:

  pcibios_setup_phb_resources()
    res = &hose->io_resource;
    offset = pcibios_io_space_offset();
    /* i.e., "offset = hose->io_base_virt - _IO_BASE" */
    pci_add_resource_offset(resources, res, offset);

so pcibios_resource_to_bus() knows how to do that translation.

No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
2016-06-17 14:44:44 -05:00
Yinghai Lu 1e70cdd6e6 powerpc/pci: Remove __pci_mmap_set_pgprot()
The powerpc-specific __pci_mmap_set_pgprot() does two things:

  1) Disables write combining for I/O port space mappings

     This only affects procfs mappings.  The pci_mmap_resource() sysfs path
     only requests write combining for resources with IORESOURCE_PREFETCH
     set, which doesn't include I/O resources.

     The only way to request write combining for I/O port space mappings
     was via the PCIIOC_WRITE_COMBINE ioctl and the proc_bus_pci_mmap()
     path, and we recently changed that path to ignore write combining for
     I/O, so this code in powerpc is no longer needed.

  2) Automatically enables write combining for mappings of prefetchable
     resources, even if not requested by the user

     Both procfs (via PCIIOC_MMAP_IS_MEM and PCIIOC_WRITE_COMBINE ioctls)
     and sysfs (via "resourceN_wc" files, which are created for resources
     with IORESOURCE_PREFETCH) provide ways for the user to map PCI memory
     space with write combining.

     Users that desire write combining should use one of those ways instead
     of relying on powerpc-specific behavior.

Remove the powerpc-specific __pci_mmap_set_pgprot().

The user-visible effect of this change is that powerpc users mapping
prefetchable PCI memory space via procfs without PCIIOC_WRITE_COMBINE or
via sysfs "resourceN" (not "resourceN_wc") will get regular uncacheable
mappings instead of the write combining mappings they used to get.

The new behavior matches the behavior on all other arches that support
write combining mapping.

[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-06-17 14:43:33 -05:00
Gavin Shan a3aa256b72 powerpc/eeh: Fix invalid cached PE primary bus
The PE primary bus cannot be got from its child devices when having
full hotplug in error recovery. The PE primary bus is cached, which
is done in commit <05ba75f84864> ("powerpc/eeh: Fix stale cached primary
bus"). In eeh_reset_device(), the flag (EEH_PE_PRI_BUS) is cleared
before the PCI hot remove. eeh_pe_bus_get() then returns NULL as the
PE primary bus in pnv_eeh_reset() and it crashes the kernel eventually.

This fixes the issue by clearing the flag (EEH_PE_PRI_BUS) before the
PCI hot add. With it, the PowerNV EEH reset backend (pnv_eeh_reset())
can get valid PE primary bus through eeh_pe_bus_get().

Fixes: 67086e32b5 ("powerpc/eeh: powerpc/eeh: Support error recovery for VF PE")
Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-17 19:51:47 +10:00
Russell Currey fb36e90736 powerpc/pci: Fix SRIOV not building without EEH enabled
On Book3E CPUs (and possibly other configs), it is possible to have SRIOV
(CONFIG_PCI_IOV) set without CONFIG_EEH.  The SRIOV code does not check
for this, and if EEH is disabled, pci_dn.c fails to build.

Fix this by gating the EEH-specific code in the SRIOV implementation
behind CONFIG_EEH.

Fixes: 39218cd0 ("powerpc/eeh: EEH device for VF")
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-17 16:52:22 +10:00
Daniel Axtens a9650e9bc5 powerpc/align: Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE
Sparse complains that it doesn't know what REG_BYTE is:

  arch/powerpc/kernel/align.c:313:29: error: undefined identifier 'REG_BYTE'

REG_BYTE is defined differently based on whether we're compiling for
LE, BE32 or BE64. Sparse apparently doesn't provide __BIG_ENDIAN__ or
__LITTLE_ENDIAN__, which means we get no definition.

Rather than check for __BIG_ENDIAN__ and then separately for
__LITTLE_ENDIAN__, just switch the #ifdef to check for __BIG_ENDIAN__
and then #else we define the little endian version. Technically that's
dicey because PDP_ENDIAN is also a possibility, but we already do it in
a lot of places so one more hardly matters.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-16 22:40:19 +10:00
Daniel Axtens 665e87ffe1 powerpc/sparse: Include headers containing prototypes
Sometimes headers that provide prototypes for functions are
accidentally omitted from the files that define the functions.

Fix a couple of times that occurs.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-16 22:40:19 +10:00
Daniel Axtens 42f5b4cacd powerpc: Introduce asm-prototypes.h
Sparse picked up a number of functions that are implemented in C and
then only referred to in asm code.

This introduces asm-prototypes.h, which provides a place for
prototypes of these functions.

This silences some sparse warnings.

Signed-off-by: Daniel Axtens <dja@axtens.net>
[mpe: Add include guards, clean up copyright & GPL text]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-16 22:39:54 +10:00
Daniel Axtens 34852ed551 powerpc/sparse: make some things static
This is just a smattering of things picked up by sparse that should
be made static.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-16 22:23:11 +10:00
Suraj Jitindar Singh 1d1451655b powerpc: Add array bounds checking to crash_shutdown_handlers
The array crash_shutdown_handles is an array of size CRASH_HANDLER_MAX+1
containing up to CRASH_HANDLER_MAX shutdown_handlers. It is assumed to
be NULL terminated, which it is under normal circumstances. Array
accesses in the functions crash_shutdown_unregister() and
default_machine_crash_shutdown() rely on this NULL termination property
when traversing this list and don't protect again out of bounds accesses.
If the NULL terminator were somehow overwritten these functions could
potentially access out of the bounds of the array.

Shrink the array to size CRASH_HANDLER_MAX and implement explicit array
bounds checking when accessing the elements of the
crash_shutdown_handles[] array in crash_shutdown_unregister() and
default_machine_crash_shutdown().

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-16 16:05:47 +10:00
Rashmica Gupta aac6a91fea powerpc/asm: Remove unused symbols in asm-offsets.c
THREAD_DSCR:
  Added in efcac6589a "powerpc: Per process DSCR + some fixes (try#4)"
  Last usage removed in 152d523e63 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"

THREAD_DSCR_INHERIT:
  Added in 714332858b "powerpc: Restore correct DSCR in context switch"
  Last usage removed in 152d523e63 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"

THREAD_TAR:
  Added in 2468dcf641 "powerpc: Add support for context switching the TAR register"
  Last usage removed in 152d523e63 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"

THREAD_BESCR, THREAD_EBBHR and THREAD_EBBRR:
  Added in 9353374b8e "powerpc: Context switch the new EBB SPRs"
  Last usage removed in 152d523e63 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"

THREAD_SIAR, THREAD_SDAR, THREAD_SIER, THREAD_MMCR0, and THREAD_MMCR2:
  Added in 59affcd3e4 "powerpc: Context switch more PMU related SPRs"
  Last usage removed in b11ae95100 "powerpc: Partial revert of "Context switch more PMU related SPRs""

PACA_LOCK_TOKEN:
  Added in 9e368f2915 "KVM: PPC: book3s_hv: Add support for PPC970-family processors"
  Last usage removed in c17b98cf60 "KVM: PPC: Book3S HV: Remove code for PPC970 processors"

HCALL_STAT_SIZE, HCALL_STAT_CALLS, HCALL_STAT_TB and HCALL_STAT_PURR:
  Added in 57852a853b "[POWERPC] powerpc: Instrument Hypervisor Calls"
  Last usage removed in c8cd093a6e "powerpc: tracing: Add hypervisor call tracepoints"

VCPU_EPLC:
  Added in d30f6e4800 "KVM: PPC: booke: category E.HV (GS-mode) support"
  Never used.

CPU_DOWN_FLUSH:
  Added in e7affb1dba "powerpc/cache: add cache flush operation for various e500"
  Never used.

CFG_STAMP_XSEC:
  Added in 14cf11af6c "powerpc: Merge enough to start building in arch/powerpc."
  Last usage removed in 0e469db8f7 "powerpc: Rework VDSO gettimeofday to prevent time going backwards"

KVM_LPCR:
  Added in aa04b4cc5b "KVM: PPC: Allocate RMAs (Real Mode Areas) at boot for use by guests"
  Last usage removed in a0144e2a6b "KVM: PPC: Book3S HV: Store LPCR value for each virtual core"

GPR15, GPR16, GPR17, GPR18, GPR19, GPR20, GPR21, GPR22, GPR23, GPR24,
GPR25, GPR26, GPR27, GPR28, GPR29, GPR30 and GPR31:
  Added in 14cf11af6c "powerpc: Merge enough to start building in arch/powerpc."
  Never used.

VCPU_SHADOW_FSCR:
  Added in 616dff8602 "KVM: PPC: Book3S PR: Handle Facility interrupt and FSCR"
  Never used.

VCPU_SHADOW_SRR1:
  Added in a2d56020d1 "KVM: PPC: Book3S PR: Keep volatile reg values in vcpu rather than shadow_vcpu"
  Never used.

KVM_SPLIT_SIZE:
  Added in b4deba5c41 "KVM: PPC: Book3S HV: Implement dynamicmicro-threading on POWER8"
  Never used.

VCPU_VCPUID:
  Added in de56a948b9 "KVM: PPC: Add support for Book3S processors in hypervisor mode"
  Last usage removed 1b400ba0cd "KVM: PPC: Book3S HV: Improve handling of local vs. global TLB invalidations"

_MQ:
  Added in 14cf11af6c "powerpc: Merge enough to start building in arch/powerpc."
  Never used.

AUDITCONTEXT:
  Added in 14cf11af6c "powerpc: Merge enough to start building in arch/powerpc."
  Last usage removed in 401d1f029b "[PATCH] syscall entry/exit revamp"

CLONE_VM:
  Added in 14cf11af6c "powerpc: Merge enough to start building in arch/powerpc."
  Currently unused.

CLONE_UNTRACED:
  Added in 14cf11af6c "powerpc: Merge enough to start building in arch/powerpc."
  Currently unused.

Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
[mpe: Munge change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-16 15:11:25 +10:00
Kees Cook 1addc57e11 powerpc/ptrace: run seccomp after ptrace
Close the hole where ptrace can change a syscall out from under seccomp.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
2016-06-14 10:54:46 -07:00
Andy Lutomirski 2f275de5d1 seccomp: Add a seccomp_data parameter secure_computing()
Currently, if arch code wants to supply seccomp_data directly to
seccomp (which is generally much faster than having seccomp do it
using the syscall_get_xyz() API), it has to use the two-phase
seccomp hooks. Add it to the easy hooks, too.

Cc: linux-arch@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-06-14 10:54:39 -07:00
Greg Kurz 6e45273eac powerpc/pseries: Fix trivial typo in function name
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-14 16:05:35 +10:00
Michael Ellerman f55d966536 powerpc: Define and use PPC64_ELF_ABI_v2/v1
We're approaching 20 locations where we need to check for ELF ABI v2.
That's fine, except the logic is a bit awkward, because we have to check
that _CALL_ELF is defined and then what its value is.

So check it once in asm/types.h and define PPC64_ELF_ABI_v2 when ELF ABI
v2 is detected.

We also have a few places where what we're really trying to check is
that we are using the 64-bit v1 ABI, ie. function descriptors. So also
add a #define for that, which simplifies several checks.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-14 13:58:27 +10:00
Michael Ellerman 027dfac694 powerpc: Various typo fixes
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-14 13:58:26 +10:00
Christophe Leroy e289086f65 powerpc/32: Get rid of sub_reloc_offset()
sub_reloc_offset() has not been used since commit
917f0af9e5 ("powerpc: Remove arch/ppc and include/asm-ppc") which
removed include/asm-ppc/prom.h.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-14 13:58:26 +10:00
Anton Blanchard d96f234f47 powerpc: Avoid load hit store in setup_sigcontext()
In setup_sigcontext(), we set current->thread.vrsave then use it
straight after. Since current is hidden from the compiler via inline
assembly, it cannot optimise this and we end up with a load hit store.

Fix this by using a temporary.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-14 13:58:25 +10:00
Anton Blanchard 8eb9803723 powerpc: Avoid load hit store in __giveup_fpu() and __giveup_altivec()
In both __giveup_fpu() and __giveup_altivec() we make two modifications
to tsk->thread.regs->msr. gcc decides to do a read/modify/write of
each change, so we end up with a load hit store:

        ld      r9,264(r10)
        rldicl  r9,r9,50,1
        rotldi  r9,r9,14
        std     r9,264(r10)
...
        ld      r9,264(r10)
        rldicl  r9,r9,40,1
        rotldi  r9,r9,24
        std     r9,264(r10)

Fix this by using a temporary.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-14 13:58:25 +10:00
Michael Ellerman 2c2a63e301 powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added
The recent commit 7cc851039d ("powerpc/pseries: Add POWER8NVL support
to ibm,client-architecture-support call") added a new PVR mask & value
to the start of the ibm_architecture_vec[] array.

However it missed the fact that further down in the array, we hard code
the offset of one of the fields, and then at boot use that value to
patch the value in the array. This means every update to the array must
also update the #define, ugh.

This means that on pseries machines we will misreport to firmware the
number of cores we support, by a factor of threads_per_core.

Fix it for now by updating the #define.

Fixes: 7cc851039d ("powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call")
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-08 10:40:05 +10:00
Khem Raj 1e407ee3b2 powerpc/ptrace: Fix out of bounds array access warning
gcc-6 correctly warns about a out of bounds access

arch/powerpc/kernel/ptrace.c:407:24: warning: index 32 denotes an offset greater than size of 'u64[32][1] {aka long long unsigned int[32][1]}' [-Warray-bounds]
        offsetof(struct thread_fp_state, fpr[32][0]));
                        ^

check the end of array instead of beginning of next element to fix this

Signed-off-by: Khem Raj <raj.khem@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-06 10:48:07 +10:00
Arnd Bergmann 169047f447 rtc: powerpc: provide rtc_class_ops directly
The rtc-generic driver provides an architecture specific
wrapper on top of the generic rtc_class_ops abstraction,
and powerpc has another abstraction on top, which is a bit
silly.

This changes the powerpc rtc-generic device to provide its
rtc_class_ops directly, to reduce the number of layers
by one.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2016-06-04 00:23:34 +02:00
Geliang Tang 8cfc8ddc99 pstore: add lzo/lz4 compression support
Like zlib compression in pstore, this patch added lzo and lz4
compression support so that users can have more options and better
compression ratio.

The original code treats the compressed data together with the
uncompressed ECC correction notice by using zlib decompress. The
ECC correction notice is missing in the decompression process. The
treatment also makes lzo and lz4 not working. So I treat them
separately by using pstore_decompress() to treat the compressed
data, and memcpy() to treat the uncompressed ECC correction notice.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-06-02 10:59:31 -07:00
Thomas Huth 7cc851039d powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call
If we do not provide the PVR for POWER8NVL, a guest on this system
currently ends up in PowerISA 2.06 compatibility mode on KVM, since QEMU
does not provide a generic PowerISA 2.07 mode yet. So some new
instructions from POWER8 (like "mtvsrd") get disabled for the guest,
resulting in crashes when using code compiled explicitly for
POWER8 (e.g. with the "-mcpu=power8" option of GCC).

Fixes: ddee09c099 ("powerpc: Add PVR for POWER8NVL processor")
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-01 13:47:34 +10:00
Horia Geantă d54fc90cc9 powerpc: add io{read,write}64 accessors
This will allow device drivers to consistently use io{read,write}XX
also for 64-bit accesses.

Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-05-31 16:41:52 +08:00
Michal Hocko 6904817607 vdso: make arch_setup_additional_pages wait for mmap_sem for write killable
most architectures are relying on mmap_sem for write in their
arch_setup_additional_pages.  If the waiting task gets killed by the oom
killer it would block oom_reaper from asynchronous address space reclaim
and reduce the chances of timely OOM resolving.  Wait for the lock in
the killable mode and return with EINTR if the task got killed while
waiting.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>	[x86 vdso]
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-23 17:04:14 -07:00
Jiri Slaby 5f56a5dfdb exit_thread: remove empty bodies
Define HAVE_EXIT_THREAD for archs which want to do something in
exit_thread. For others, let's define exit_thread as an empty inline.

This is a cleanup before we change the prototype of exit_thread to
accept a task parameter.

[akpm@linux-foundation.org: fix mips]
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Linus Torvalds c04a588029 powerpc updates for 4.7
Highlights:
  - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V
  - Live patching support for ppc64le (also merged via livepatching.git)
 
 Various cleanups & minor fixes from:
  - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
    Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie, Lennart
    Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael
    Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras, Rashmica Gupta,
    Russell Currey, Suraj Jitindar Singh, Thiago Jung Bauermann, Valentin
    Rothberg, Vipin K Parashar.
 
 General:
  - Update LMB associativity index during DLPAR add/remove from Nathan Fontenot
  - Fix branching to OOL handlers in relocatable kernel from Hari Bathini
  - Add support for userspace Power9 copy/paste from Chris Smart
  - Always use STRICT_MM_TYPECHECKS from Michael Ellerman
  - Add mask of possible MMU features from Michael Ellerman
 
 PCI:
  - Enable pass through of NVLink to guests from Alexey Kardashevskiy
  - Cleanups in preparation for powernv PCI hotplug from Gavin Shan
  - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan
  - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan
  - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" from Guilherme G. Piccoli
  - Remove the dependency on EEH struct in DDW mechanism from Guilherme G. Piccoli
 
 selftests:
  - Test cp_abort during context switch from Chris Smart
  - Add several tests for transactional memory support from Rashmica Gupta
 
 perf:
  - Add support for sampling interrupt register state from Anju T
  - Add support for unwinding perf-stackdump from Chandan Kumar
 
 cxl:
  - Configure the PSL for two CAPI ports on POWER8NVL from Philippe Bergheaud
  - Allow initialization on timebase sync failures from Frederic Barrat
  - Increase timeout for detection of AFU mmio hang from Frederic Barrat
  - Handle num_of_processes larger than can fit in the SPA from Ian Munsie
  - Ensure PSL interrupt is configured for contexts with no AFU IRQs from Ian Munsie
  - Add kernel API to allow a context to operate with relocate disabled from Ian Munsie
  - Check periodically the coherent platform function's state from Christophe Lombard
 
 Freescale:
  - Updates from Scott: "Contains 86xx fixes, minor device tree fixes, an erratum
    workaround, and a kconfig dependency fix."
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXPsGzAAoJEFHr6jzI4aWAVoAP/iKdrDe0eYHlVAE9SqnbsiZs
 lgDxdsC8P3fsmP1G9o/HkKhC82zHl/La8Ztz8dtqa+LkSzbfliWP1ztJsI7GsBFo
 tyCKzWnX9Rwvd3meHu/o/SQ29TNLm/PbPyyRqpj5QPbJ8XCXkAXR7ZZZqjvcMsJW
 /AgIr7Cgf53tl9oZzzl/c7CnNHhMq+NBdA71vhWtUx+T97wfJEGyKW6HhZyHDbEU
 iAki7fu77ZpEqC/Fh9swf0dCGBJ+a132NoMVo0AdV7EQLznUYlQpQEqa+1PyHZOP
 /ArOzf2mDg6m3PfCo1eiB07v8PnVZ3llEUbVAJNg3GUxbE4SHrqq/kwm0iElm3p/
 DvFxerCwdX9vmskJX4wDs+pSZRabXYj9XVMptsgFzA4joWrqqb7mBHqaort88YcY
 YSljEt1bHyXmiJ+dBya40qARsWUkCVN7ZgEzdxckq0KI3w7g2tqpqIbO2lClWT6t
 B3GpqQ4jp34+d1M14FB91fIGK7tMvOhSInE0Mv9+tPvRsepXqiiU/SwdAtRlr3m2
 zs/K+4FYcVjJ3Rmpgc+tI38PbZxHe212I35YN6L1LP+4ZfAtzz0NyKdooTIBtkbO
 19pX4WbBjKq8zK+YutrySncBIrbnI6VjW51vtRhgVKZliPFO/6zKagyU6FbxM+E5
 udQES+t3F/9gvtxgxtDe
 =YvyQ
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights:
   - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V
   - Live patching support for ppc64le (also merged via livepatching.git)

  Various cleanups & minor fixes from:
   - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
     Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie,
     Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring,
     Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras,
     Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung
     Bauermann, Valentin Rothberg, Vipin K Parashar.

  General:
   - Update LMB associativity index during DLPAR add/remove from Nathan
     Fontenot
   - Fix branching to OOL handlers in relocatable kernel from Hari Bathini
   - Add support for userspace Power9 copy/paste from Chris Smart
   - Always use STRICT_MM_TYPECHECKS from Michael Ellerman
   - Add mask of possible MMU features from Michael Ellerman

  PCI:
   - Enable pass through of NVLink to guests from Alexey Kardashevskiy
   - Cleanups in preparation for powernv PCI hotplug from Gavin Shan
   - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan
   - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan
   - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
     from Guilherme G Piccoli
   - Remove the dependency on EEH struct in DDW mechanism from Guilherme
     G Piccoli

  selftests:
   - Test cp_abort during context switch from Chris Smart
   - Add several tests for transactional memory support from Rashmica
     Gupta

  perf:
   - Add support for sampling interrupt register state from Anju T
   - Add support for unwinding perf-stackdump from Chandan Kumar

  cxl:
   - Configure the PSL for two CAPI ports on POWER8NVL from Philippe
     Bergheaud
   - Allow initialization on timebase sync failures from Frederic Barrat
   - Increase timeout for detection of AFU mmio hang from Frederic
     Barrat
   - Handle num_of_processes larger than can fit in the SPA from Ian
     Munsie
   - Ensure PSL interrupt is configured for contexts with no AFU IRQs
     from Ian Munsie
   - Add kernel API to allow a context to operate with relocate disabled
     from Ian Munsie
   - Check periodically the coherent platform function's state from
     Christophe Lombard

  Freescale:
   - Updates from Scott: "Contains 86xx fixes, minor device tree fixes,
     an erratum workaround, and a kconfig dependency fix."

* tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (192 commits)
  powerpc/86xx: Fix PCI interrupt map definition
  powerpc/86xx: Move pci1 definition to the include file
  powerpc/fsl: Fix build of the dtb embedded kernel images
  powerpc/fsl: Fix rcpm compatible string
  powerpc/fsl: Remove FSL_SOC dependency from FSL_LBC
  powerpc/fsl-pci: Add a workaround for PCI 5 errata
  powerpc/fsl: Fix SPI compatible on t208xrdb and t1040rdb
  powerpc/powernv/npu: Add PE to PHB's list
  powerpc/powernv: Fix insufficient memory allocation
  powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism
  Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
  powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner()
  powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover()
  powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover()
  powerpc/eeh: Don't report error in eeh_pe_reset_and_recover()
  Revert "powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()"
  powerpc/powernv/npu: Enable NVLink pass through
  powerpc/powernv/npu: Rework TCE Kill handling
  powerpc/powernv/npu: Add set/unset window helpers
  powerpc/powernv/ioda2: Export debug helper pe_level_printk()
  ...
2016-05-20 10:12:41 -07:00
Linus Torvalds 0b86c75db6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching
Pull livepatching updates from Jiri Kosina:

 - remove of our own implementation of architecture-specific relocation
   code and leveraging existing code in the module loader to perform
   arch-dependent work, from Jessica Yu.

   The relevant patches have been acked by Rusty (for module.c) and
   Heiko (for s390).

 - live patching support for ppc64le, which is a joint work of Michael
   Ellerman and Torsten Duwe.  This is coming from topic branch that is
   share between livepatching.git and ppc tree.

 - addition of livepatching documentation from Petr Mladek

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
  livepatch: make object/func-walking helpers more robust
  livepatch: Add some basic livepatch documentation
  powerpc/livepatch: Add live patching support on ppc64le
  powerpc/livepatch: Add livepatch stack to struct thread_info
  powerpc/livepatch: Add livepatch header
  livepatch: Allow architectures to specify an alternate ftrace location
  ftrace: Make ftrace_location_range() global
  livepatch: robustify klp_register_patch() API error checking
  Documentation: livepatch: outline Elf format and requirements for patch modules
  livepatch: reuse module loader code to write relocations
  module: s390: keep mod_arch_specific for livepatch modules
  module: preserve Elf information for livepatch modules
  Elf: add livepatch-specific Elf constants
2016-05-17 17:11:27 -07:00
Linus Torvalds 16bf834805 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (21 commits)
  gitignore: fix wording
  mfd: ab8500-debugfs: fix "between" in printk
  memstick: trivial fix of spelling mistake on management
  cpupowerutils: bench: fix "average"
  treewide: Fix typos in printk
  IB/mlx4: printk fix
  pinctrl: sirf/atlas7: fix printk spelling
  serial: mctrl_gpio: Grammar s/lines GPIOs/line GPIOs/, /sets/set/
  w1: comment spelling s/minmum/minimum/
  Blackfin: comment spelling s/divsor/divisor/
  metag: Fix misspellings in comments.
  ia64: Fix misspellings in comments.
  hexagon: Fix misspellings in comments.
  tools/perf: Fix misspellings in comments.
  cris: Fix misspellings in comments.
  c6x: Fix misspellings in comments.
  blackfin: Fix misspelling of 'register' in comment.
  avr32: Fix misspelling of 'definitions' in comment.
  treewide: Fix typos in printk
  Doc: treewide : Fix typos in DocBook/filesystem.xml
  ...
2016-05-17 17:05:30 -07:00
Guilherme G. Piccoli c2078d9ef6 Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
This reverts commit 89a51df5ab.

The function eeh_add_device_early() is used to perform EEH
initialization in devices added later on the system, like in
hotplug/DLPAR scenarios. Since the commit 89a51df5ab ("powerpc/eeh:
Fix crash in eeh_add_device_early() on Cell") a new check was introduced
in this function - Cell has no EEH capabilities which led to kernel oops
if hotplug was performed, so checking for eeh_enabled() was introduced
to avoid the issue.

However, in architectures that EEH is present like pSeries or PowerNV,
we might reach a case in which no PCI devices are present on boot time
and so EEH is not initialized. Then, if a device is added via DLPAR for
example, eeh_add_device_early() fails because eeh_enabled() is false,
and EEH end up not being enabled at all.

This reverts the aforementioned patch since a new verification was
introduced by the commit d91dafc02f ("powerpc/eeh: Delay probing EEH
device during hotplug") and so the original Cell issue does not happen
anymore.

Cc: stable@vger.kernel.org # v4.1+
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-12 19:52:21 +10:00
Gavin Shan d6d63d720d powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner()
The label "reset" in eeh_pe_change_owner() is used only for once.
No need to keep it and just drop it. No logical changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-12 19:52:20 +10:00
Gavin Shan 2efc771f24 powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover()
The function eeh_pe_reset_and_recover() is used to recover EEH
error when the passthrough device are transferred to guest and
backwards, meaning the device's driver is vfio-pci or none. In
both cases, the handlers triggered by eeh_report_reset() and
eeh_report_resume() shouldn't be called.

This ignores the error handlers from eeh_report_reset() and
eeh_report_resume().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-12 19:52:20 +10:00
Gavin Shan 5a0cdbfd17 powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover()
The function eeh_pe_reset_and_recover() is used to recover EEH
error when the passthrou device are transferred to guest and
backwards. The content in the device's config space will be lost
on PE reset issued in the middle of the recovery. The function
saves/restores it before/after the reset. However, config access
to some adapters like Broadcom BCM5719 at this point will causes
fenced PHB. The config space is always blocked and we save 0xFF's
that are restored at late point. The memory BARs are totally
corrupted, causing another EEH error upon access to one of the
memory BARs.

This restores the config space on those adapters like BCM5719
from the content saved to the EEH device when it's populated,
to resolve above issue.

Fixes: 5cfb20b9 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
Cc: stable@vger.kernel.org #v3.18+
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-12 19:52:20 +10:00
Gavin Shan affeb0f2d3 powerpc/eeh: Don't report error in eeh_pe_reset_and_recover()
The function eeh_pe_reset_and_recover() is used to recover EEH
error when the passthrough device are transferred to guest and
backwards, meaning the device's driver is vfio-pci or none.
When the driver is vfio-pci that provides error_detected() error
handler only, the handler simply stops the guest and it's not
expected behaviour. On the other hand, no error handlers will
be called if we don't have a bound driver.

This ignores the error handler in eeh_pe_reset_and_recover()
that reports the error to device driver to avoid the exceptional
behaviour.

Fixes: 5cfb20b9 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
Cc: stable@vger.kernel.org #v3.18+
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-12 19:52:20 +10:00
Gavin Shan 4a5954ed77 powerpc/pci: Don't scan empty slot
In hotplug case, function pci_add_pci_devices() is called to rescan
the specified PCI bus, which might not have any child devices. Access
to the PCI bus's child device node will cause kernel crash without
exception.

This adds one more check to skip scanning PCI bus that doesn't have
any subordinate devices from device-tree, in order to avoid kernel
crash.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:26 +10:00
Gavin Shan cdddc577d9 powerpc/pci: Export pci_traverse_device_nodes()
This renames traverse_pci_devices() to pci_traverse_device_nodes().
The function traverses all subordinate device nodes of the specified
one. Also, below cleanup applied to the function. No logical changes
introduced.

   * Rename "pre" to "fn".
   * Avoid assignment in if condition reported from checkpatch.pl.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:25 +10:00
Gavin Shan de5a28ac5a powerpc/pci: Introduce pci_remove_device_node_info()
This implements and exports pci_remove_device_node_info(). It's
used to remove the pdn (struct pci_dn) for the indicated device
node. The function is going to be used by PowerNV PCI hotplug
driver.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:25 +10:00
Gavin Shan d8f66f411e powerpc/pci: Export pci_add_device_node_info()
This renames update_dn_pci_info() to pci_add_device_node_info()
with corresponding adjustment on the parameter type and exports it.
The function is used to create pdn (struct pci_dn) for the indicated
device node. Another function add_pdn(), almost wrapper of
pci_add_device_node_info(), to be used in traverse_pci_devices(). No
logical changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:24 +10:00
Gavin Shan 6384d97780 powerpc/pci: Move pci_find_bus_by_node() around
This moves pci_find_bus_by_node() from arch/powerpc/platforms/
pseries/pci_dlpar.c to arch/powerpc/kernel/pci-hotplug.c so that
the function can be used by pSeries and PowerNV platform at the
same time. Also, below cleanup applied. No functional changes
introduced.

   * Remove variable "busdn" in find_bus_among_children()
   * Use PCI_DN() to convert device node to pci_dn

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:24 +10:00
Gavin Shan bd251b893d powerpc/pci: Rename pcibios_{add, remove}_pci_devices()
This renames pcibios_{add,remove}_pci_devices() to avoid conflicts
with names of the weak functions in PCI subsystem, which have the
prefix "pcibios". No logical changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:23 +10:00
Mahesh Salgaonkar 2513767d22 powerpc/powernv: Rename machine_check_pSeries_early() to powernv
The routine machine_check_pSeries_early() is only used on powernv, not
pseries. Hence rename machine_check_pSeries_early() to
machine_check_powernv_early().

Reported-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:11 +10:00
Suraj Jitindar Singh 925e2d1ded powerpc: Update of_remove_property() call sites to remove null checking
After obtaining a property from of_find_property() and before calling
of_remove_property() most code checks to ensure that the property
returned from of_find_property() is not null. The previous patch moved
this check to the start of the function of_remove_property() in order to
avoid the case where this check isn't done and a null value is passed.
This ensures the check is always conducted before taking locks and
attempting to remove the property. Thus it is no longer necessary to
perform a check for null values before invoking of_remove_property().

Update of_remove_property() call sites in order to remove redundant
checking for null property value as check is now performed within the
of_remove_property function().

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[mpe: Unbreak some lines which are just >80 chars for readability]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:04 +10:00
Chris Smart e44c1b15cf powerpc: Remove unnecessary CONFIG_SMP #ifdefs
The code in machine_restart/power_off/halt() includes #ifdefs around
calls to smp_send_stop(), however these are not required as
include/linux/smp.h includes an empty version of this function for
CONFIG_SMP=n builds.

Signed-off-by: Chris Smart <chris@distroguy.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:01 +10:00
Rashmica Gupta c415c9cb8a powerpc: Remove unused remnants from A2 cpu
Support for the A2 cpu was removed in commit fb5a515704 ("powerpc:
Remove platforms/wsp and associated pieces"), and the externs:
__setup_cpu_a2 and __restore_cpu_a2 are still around and unused, so
remove them.

Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:00 +10:00
Valentin Rothberg bb03efe2b7 powerpc/mm/radix: Fix CONFIG_PPC_MMU_STD_64 typo
It's CONFIG_PPC_STD_MMU_64 not ...
     CONFIG_PPC_MMU_STD_64.

Fixes: 11ffc1cfa4c2 ("powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code")
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:53:59 +10:00
Aneesh Kumar K.V 17a3dd2f5f powerpc/mm/radix: Use firmware feature to enable Radix MMU
We use the existing "ibm,pa-features" device-tree property to enable
Radix MMU mode. This means we default to hash mode unless firmware tells
us it's OK to start using Radix mode.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:53:58 +10:00
Aneesh Kumar K.V 43a5c68427 powerpc/mm/radix: Make sure swapper pgdir is properly aligned
With 4K page size radix config our level 1 page table size is 64K and it
should be naturally aligned.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:53:55 +10:00
Aneesh Kumar K.V d6a9996e84 powerpc/mm: vmalloc abstraction in preparation for radix
The vmalloc range differs between hash and radix config. Hence make
VMALLOC_START and related constants a variable which will be runtime
initialized depending on whether hash or radix mode is active.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Fix missing init of ioremap_bot in pgtable_64.c for ppc64e]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:53:53 +10:00
Aneesh Kumar K.V caca285e5a powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code
We also use MMU_FTR_RADIX to branch out from code path specific to
hash.

No functionality change.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:53:45 +10:00
Peter Zijlstra (Intel) e9d867a67f sched: Allow per-cpu kernel threads to run on online && !active
In order to enable symmetric hotplug, we must mirror the online &&
!active state of cpu-down on the cpu-up side.

However, to retain sanity, limit this state to per-cpu kthreads.

Aside from the change to set_cpus_allowed_ptr(), which allow moving
the per-cpu kthreads on, the other critical piece is the cpu selection
for pinned tasks in select_task_rq(). This avoids dropping into
select_fallback_rq().

select_fallback_rq() cannot be allowed to select !active cpus because
its used to migrate user tasks away. And we do not want to move user
tasks onto cpus that are in transition.

Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Jan H. Schönherr <jschoenh@amazon.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160301152303.GV6356@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-06 14:58:22 +02:00
Aneesh Kumar K.V 1a472c9dba powerpc/mm/radix: Add tlbflush routines
Core kernel doesn't track the page size of the VA range that we are
invalidating. Hence we end up flushing TLB for the entire mm here. Later
patches will improve this.

We also don't flush page walk cache separetly instead use RIC=2 when
flushing TLB, because we do a MMU gather flush after freeing page table.

MMU_NO_CONTEXT is updated for hash.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:33:09 +10:00
Aneesh Kumar K.V d2adba3fd1 powerpc/mm: Abstraction for switch_mmu_context()
How we switch MMU context differs between hash and radix. For hash we
need to switch the SLB details and for radix we need to switch the PID
SPR.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:33:04 +10:00
Aneesh Kumar K.V dd1842a2a4 powerpc/mm: Make page table size a variable
Radix and hash MMU models support different page table sizes. Make
the #defines a variable so that existing code can work with variable
sizes.

Slice related code is only used by hash, so use hash constants there. We
will replicate some of the boundary conditions with resepct to TASK_SIZE
using radix values too. Right now we do boundary condition check using
hash constants.

Swapper pgdir size is initialized in asm code. We select the max pgd
size to keep it simple. For now we select hash pgdir. When adding radix
we will switch that to radix pgdir which is 64K.

BUILD_BUG_ON check which is removed is already done in hugepage_init()
using MAYBE_BUILD_BUG_ON().

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:48 +10:00
Aneesh Kumar K.V 72176dd0ad powerpc/mm: Use a helper for finding pte bits mapping I/O area
Use a helper instead of open coding with constants. A later patch will
drop the WIMG bits and use PowerISA 3.0 defines.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:32 +10:00
Masanari Iida c01e01597c treewide: Fix typos in printk
This patch fix spelling typos in printk from various part
of the codes.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-04-28 10:52:28 +02:00
Thiago Jung Bauermann 7132e2d669 ftrace: Match dot symbols when searching functions on ppc64
In the ppc64 big endian ABI, function symbols point to function
descriptors. The symbols which point to the function entry points
have a dot in front of the function name. Consequently, when the
ftrace filter mechanism searches for the symbol corresponding to
an entry point address, it gets the dot symbol.

As a result, ftrace filter users have to be aware of this ABI detail on
ppc64 and prepend a dot to the function name when setting the filter.

The perf probe command insulates the user from this by ignoring the dot
in front of the symbol name when matching function names to symbols,
but the sysfs interface does not. This patch makes the ftrace filter
mechanism do the same when searching symbols.

Fixes the following failure in ftracetest's kprobe_ftrace.tc:

  .../kprobe_ftrace.tc: line 9: echo: write error: Invalid argument

That failure is on this line of kprobe_ftrace.tc:

  echo _do_fork > set_ftrace_filter

This is because there's no _do_fork entry in the functions list:

  # cat available_filter_functions | grep _do_fork
  ._do_fork

This change introduces no regressions on the perf and ftracetest
testsuite results.

Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-27 09:47:29 +10:00
Chris Smart 8a649045e7 powerpc: Add support for userspace P9 copy paste
The copy paste facility introduced in POWER9 provides an optimised
mechanism for a userspace application to copy a cacheline. This is
provided by a pair of instructions, copy and paste, while a third,
cp_abort (copy paste abort), provides a clean up of the state in case of
a failure.

The copy instruction will read a 128 byte cacheline and store it in an
internal buffer. The subsequent paste instruction will store this
internal buffer to memory and set a CR field if the paste succeeds.

Since the state of the copy paste buffer is internal (and not
architecturally visible), in the unlikely event of a context switch, the
state cannot be stored and the paste should therefore fail.

The cp_abort instruction exists to fail and clean up any such
interrupted copy paste sequence and is to be called by the kernel as
part of the context switch. Doing so prevents data from a preceding copy
in one process leaking into the paste of another.

This code enables use of the cp_abort instruction if a supported
processor is detected.

NOTE: this is for userspace only, not in kernel, and does not deal
with KVM guests.

Patch created with much assistance from Michael Neuling
<mikey@neuling.org>

Signed-off-by: Chris Smart <chris@distroguy.com>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-27 09:28:07 +10:00
Andrew Donnellan 2d5217840f powerpc/eeh: fix misleading indentation
Found by smatch.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-27 09:19:37 +10:00
Hari Bathini 057b6d7e62 powerpc/book3s64: Remove __end_handlers marker
The __end_handlers marker was intended to mark down upto code that gets
called from exception prologs. But that hasn't kept pace with code
changes. Case in point, slb_miss_realmode being called from exception
prolog code but isn't below __end_handlers marker. So, __end_handlers
marker is as good as a comment but could be misleading at times if it
isn't in sync with the code, as is the case now. So, let us avoid this
confusion by having a better comment and removing __end_handlers marker
altogether.

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-21 23:32:58 +10:00
Hari Bathini 8ed8ab4004 powerpc/book3s64: Fix branching to OOL handlers in relocatable kernel
Some of the interrupt vectors on 64-bit POWER server processors are only
32 bytes long (8 instructions), which is not enough for the full
first-level interrupt handler. For these we need to branch to an
out-of-line (OOL) handler. But when we are running a relocatable kernel,
interrupt vectors till __end_interrupts marker are copied down to real
address 0x100. So, branching to labels (ie. OOL handlers) outside this
section must be handled differently (see LOAD_HANDLER()), considering
relocatable kernel, which would need at least 4 instructions.

However, branching from interrupt vector means that we corrupt the
CFAR (come-from address register) on POWER7 and later processors as
mentioned in commit 1707dd16. So, EXCEPTION_PROLOG_0 (6 instructions)
that contains the part up to the point where the CFAR is saved in the
PACA should be part of the short interrupt vectors before we branch out
to OOL handlers.

But as mentioned already, there are interrupt vectors on 64-bit POWER
server processors that are only 32 bytes long (like vectors 0x4f00,
0x4f20, etc.), which cannot accomodate the above two cases at the same
time owing to space constraint. Currently, in these interrupt vectors,
we simply branch out to OOL handlers, without using LOAD_HANDLER(),
which leaves us vulnerable when running a relocatable kernel (eg. kdump
case). While this has been the case for sometime now and kdump is used
widely, we were fortunate not to see any problems so far, for three
reasons:

  1. In almost all cases, production kernel (relocatable) is used for
     kdump as well, which would mean that crashed kernel's OOL handler
     would be at the same place where we end up branching to, from short
     interrupt vector of kdump kernel.
  2. Also, OOL handler was unlikely the reason for crash in almost all
     the kdump scenarios, which meant we had a sane OOL handler from
     crashed kernel that we branched to.
  3. On most 64-bit POWER server processors, page size is large enough
     that marking interrupt vector code as executable (see commit
     429d2e83) leads to marking OOL handler code from crashed kernel,
     that sits right below interrupt vector code from kdump kernel, as
     executable as well.

Let us fix this by moving the __end_interrupts marker down past OOL
handlers to make sure that we also copy OOL handlers to real address
0x100 when running a relocatable kernel.

This fix has been tested successfully in kdump scenario, on an LPAR with
4K page size by using different default/production kernel and kdump
kernel.

Also tested by manually corrupting the OOL handlers in the first kernel
and then kdump'ing, and then causing the OOL handlers to fire - mpe.

Fixes: c1fb6816fb ("powerpc: Add relocation on exception vector handlers")
Cc: stable@vger.kernel.org
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-21 23:32:44 +10:00
Michael Ellerman 8404410b29 Merge branch 'topic/livepatch' into next
Merge the support for live patching on ppc64le using mprofile-kernel.
This branch has also been merged into the livepatching tree for v4.7.
2016-04-18 20:45:32 +10:00
Anton Blanchard 4705e02498 powerpc: Update TM user feature bits in scan_features()
We need to update the user TM feature bits (PPC_FEATURE2_HTM and
PPC_FEATURE2_HTM) to mirror what we do with the kernel TM feature
bit.

At the moment, if firmware reports TM is not available we turn off
the kernel TM feature bit but leave the userspace ones on. Userspace
thinks it can execute TM instructions and it dies trying.

This (together with a QEMU patch) fixes PR KVM, which doesn't currently
support TM.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-18 20:10:45 +10:00
Anton Blanchard beff82374b powerpc: Update cpu_user_features2 in scan_features()
scan_features() updates cpu_user_features but not cpu_user_features2.

Amongst other things, cpu_user_features2 contains the user TM feature
bits which we must keep in sync with the kernel TM feature bit.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-18 20:10:45 +10:00
Anton Blanchard 6997e57d69 powerpc: scan_features() updates incorrect bits for REAL_LE
The REAL_LE feature entry in the ibm_pa_feature struct is missing an MMU
feature value, meaning all the remaining elements initialise the wrong
values.

This means instead of checking for byte 5, bit 0, we check for byte 0,
bit 0, and then we incorrectly set the CPU feature bit as well as MMU
feature bit 1 and CPU user feature bits 0 and 2 (5).

Checking byte 0 bit 0 (IBM numbering), means we're looking at the
"Memory Management Unit (MMU)" feature - ie. does the CPU have an MMU.
In practice that bit is set on all platforms which have the property.

This means we set CPU_FTR_REAL_LE always. In practice that seems not to
matter because all the modern cpus which have this property also
implement REAL_LE, and we've never needed to disable it.

We're also incorrectly setting MMU feature bit 1, which is:

  #define MMU_FTR_TYPE_8xx		0x00000002

Luckily the only place that looks for MMU_FTR_TYPE_8xx is in Book3E
code, which can't run on the same cpus as scan_features(). So this also
doesn't matter in practice.

Finally in the CPU user feature mask, we're setting bits 0 and 2. Bit 2
is not currently used, and bit 0 is:

  #define PPC_FEATURE_PPC_LE		0x00000001

Which says the CPU supports the old style "PPC Little Endian" mode.
Again this should be harmless in practice as no 64-bit CPUs implement
that mode.

Fix the code by adding the missing initialisation of the MMU feature.

Also add a comment marking CPU user feature bit 2 (0x4) as reserved. It
would be unsafe to start using it as old kernels incorrectly set it.

Fixes: 44ae3ab335 ("powerpc: Free up some CPU feature bits by moving out MMU-related features")
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
[mpe: Flesh out changelog, add comment reserving 0x4]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-18 20:08:38 +10:00
Jiri Kosina 4d4fb97a62 Merge branch 'topic/livepatch' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux into for-4.7/livepatching-ppc64le
Pull livepatching support for ppc64 architecture from Michael Ellerman.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-04-15 11:42:51 +02:00
Michael Ellerman 85baa09549 powerpc/livepatch: Add live patching support on ppc64le
Add the kconfig logic & assembly support for handling live patched
functions. This depends on DYNAMIC_FTRACE_WITH_REGS, which in turn
depends on the new -mprofile-kernel ftrace ABI, which is only supported
currently on ppc64le.

Live patching is handled by a special ftrace handler. This means it runs
from ftrace_caller(). The live patch handler modifies the NIP so as to
redirect the return from ftrace_caller() to the new patched function.

However there is one particularly tricky case we need to handle.

If a function A calls another function B, and it is known at link time
that they share the same TOC, then A will not save or restore its TOC,
and will call the local entry point of B.

When we live patch B, we replace it with a new function C, which may
not have the same TOC as A. At live patch time it's too late to modify A
to do the TOC save/restore, so the live patching code must interpose
itself between A and C, and do the TOC save/restore that A omitted.

An additionaly complication is that the livepatch code can not create a
stack frame in order to save the TOC. That is because if C takes > 8
arguments, or is varargs, A will have written the arguments for C in
A's stack frame.

To solve this, we introduce a "livepatch stack" which grows upward from
the base of the regular stack, and is used to store the TOC & LR when
calling a live patched function.

When the patched function returns, we retrieve the real LR & TOC from
the livepatch stack, restore them, and pop the livepatch "stack frame".

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
2016-04-14 15:48:06 +10:00
Michael Ellerman 5d31a96e6c powerpc/livepatch: Add livepatch stack to struct thread_info
In order to support live patching we need to maintain an alternate
stack of TOC & LR values. We use the base of the stack for this, and
store the "live patch stack pointer" in struct thread_info.

Unlike the other fields of thread_info, we can not statically initialise
that value, so it must be done at run time.

This patch just adds the code to support that, it is not enabled until
the next patch which actually adds live patch support.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
2016-04-14 15:47:06 +10:00
Daniel Axtens 7f92bc5694 powerpc: sparse: Include headers for __weak symbols
Sometimes when sparse warns about undefined symbols, it isn't
because they should have 'static' added, it's because they're
overriding __weak symbols defined elsewhere, and the header has
been missed.

Fix a few of them by adding appropriate headers.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-12 21:05:19 +10:00
Daniel Axtens 635218c785 powerpc: sparse: static-ify some things
As sparse suggests, these should be made static.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-12 21:05:18 +10:00
Paul Gortmaker c0c523897d powerpc: make kernel/nvram_64.c explicitly non-modular
The Makefile/Kconfig currently controlling compilation of this code is:

obj-$(CONFIG_PPC64)             += setup_64.o sys_ppc32.o \
                                   signal_64.o ptrace32.o \
                                   paca.o nvram_64.o firmware.o

arch/powerpc/platforms/Kconfig.cputype:config PPC64
arch/powerpc/platforms/Kconfig.cputype: bool "64-bit kernel"

...meaning that it currently is not being built as a module by anyone.

Lets remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.

Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.

We don't replace module.h with init.h since the file already has that.

We delete the MODULE_LICENSE tag since that information is already
contained at the top of the file in the comments.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Andrzej Hajda <a.hajda@samsung.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-11 20:30:43 +10:00
Russell Currey 8ee26530bb powerpc/eeh: rename EEH from "extended" to "enhanced" error handling
IBM online documentation for EEH uses "extended error handling" and
"enhanced error handling" to refer to the same thing, in different
places.  The only place mentioning it as "enhanced error handling" in the
kernel is the MAINTAINERS file, and it's "extended" in some documentation.

IBM originally defined EEH as "enhanced error handling", so standardise
all mentions of EEH to use that term.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-11 20:30:42 +10:00
Michael Ellerman b05fac783a powerpc: Remove orphaned asm implementation of abs()
This has been unused since ~2004, remove it.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-11 20:30:41 +10:00
Michael Ellerman 1f4c66e805 powerpc/mm: Remove long disabled SLB code
We have a bunch of SLB related code in the tree which is there to handle
dynamic VSIDs - but currently it's all disabled at compile time. The
comments say "Keep that around for when we re-implement dynamic VSIDs".

But that was over 10 years ago (commit 3c726f8dee ("[PATCH] ppc64:
support 64k pages")). The chance that it would still work unchanged is
minimal, and in the meantime it's confusing to folks browsing/grepping
the code. If we ever want to re-instate it, it's in the git history.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
2016-04-11 20:30:40 +10:00
Oliver O'Halloran 01d7c2a2de powerpc/process: Fix altivec SPR not being saved
In save_sprs() in process.c contains the following test:

	if (cpu_has_feature(cpu_has_feature(CPU_FTR_ALTIVEC)))
		t->vrsave = mfspr(SPRN_VRSAVE);

CPU feature with the mask 0x1 is CPU_FTR_COHERENT_ICACHE so the test
is equivilent to:

	if (cpu_has_feature(CPU_FTR_ALTIVEC) &&
		cpu_has_feature(CPU_FTR_COHERENT_ICACHE))

On CPUs without support for both (i.e G5) this results in vrsave not
being saved between context switches. The vector register save/restore
code doesn't use VRSAVE to determine which registers to save/restore,
but the value of VRSAVE is used to determine if altivec is being used
in several code paths.

Fixes: 152d523e63 ("powerpc: Create context switch helpers save_sprs() and restore_sprs()")
Cc: stable@vger.kernel.org
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-29 12:08:08 +11:00
Alexander Potapenko be7635e728 arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections
KASAN needs to know whether the allocation happens in an IRQ handler.
This lets us strip everything below the IRQ entry point to reduce the
number of unique stack traces needed to be stored.

Move the definition of __irq_entry to <linux/interrupt.h> so that the
users don't need to pull in <linux/ftrace.h>.  Also introduce the
__softirq_entry macro which is similar to __irq_entry, but puts the
corresponding functions to the .softirqentry.text section.

Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-25 16:37:42 -07:00
Linus Torvalds d5e2d00898 powerpc updates for 4.6
Highlights:
  - Restructure Linux PTE on Book3S/64 to Radix format from Paul Mackerras
  - Book3s 64 MMU cleanup in preparation for Radix MMU from Aneesh Kumar K.V
  - Add POWER9 cputable entry from Michael Neuling
  - FPU/Altivec/VSX save/restore optimisations from Cyril Bur
  - Add support for new ftrace ABI on ppc64le from Torsten Duwe
 
 Various cleanups & minor fixes from:
  - Adam Buchbinder, Andrew Donnellan, Balbir Singh, Christophe Leroy, Cyril
    Bur, Luis Henriques, Madhavan Srinivasan, Pan Xinhui, Russell Currey,
    Sukadev Bhattiprolu, Suraj Jitindar Singh.
 
 General:
  - atomics: Allow architectures to define their own __atomic_op_* helpers from
    Boqun Feng
  - Implement atomic{, 64}_*_return_* variants and acquire/release/relaxed
    variants for (cmp)xchg from Boqun Feng
  - Add powernv_defconfig from Jeremy Kerr
  - Fix BUG_ON() reporting in real mode from Balbir Singh
  - Add xmon command to dump OPAL msglog from Andrew Donnellan
  - Add xmon command to dump process/task similar to ps(1) from Douglas Miller
  - Clean up memory hotplug failure paths from David Gibson
 
 pci/eeh:
  - Redesign SR-IOV on PowerNV to give absolute isolation between VFs from Wei
    Yang.
  - EEH Support for SRIOV VFs from Wei Yang and Gavin Shan.
  - PCI/IOV: Rename and export virtfn_{add, remove} from Wei Yang
  - PCI: Add pcibios_bus_add_device() weak function from Wei Yang
  - MAINTAINERS: Update EEH details and maintainership from Russell Currey
 
 cxl:
  - Support added to the CXL driver for running on both bare-metal and
    hypervisor systems, from Christophe Lombard and Frederic Barrat.
  - Ignore probes for virtual afu pci devices from Vaibhav Jain
 
 perf:
  - Export Power8 generic and cache events to sysfs from Sukadev Bhattiprolu
  - hv-24x7: Fix usage with chip events, display change in counter values,
    display domain indices in sysfs, eliminate domain suffix in event names,
    from Sukadev Bhattiprolu
 
 Freescale:
  - Updates from Scott: "Highlights include 8xx optimizations, 32-bit checksum
    optimizations, 86xx consolidation, e5500/e6500 cpu hotplug, more fman and
    other dt bits, and minor fixes/cleanup."
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW69OrAAoJEFHr6jzI4aWAe5EQAJw/hE6WBQc6a7Tj70AnXOqR
 qk/m5pZjuTwQxfBteIvHR1pE5eXdlvtAjcD254LVkFkAbIn19W/h2k0VX/nlee7P
 n/VRHRifjtGmukqHrPYJJ7ua9mNlY7pxh3leGSixBFASnSWqMxNNNziNQtSTcuCs
 TjHiw6NkZ/kzeunA4bAfE4yHVUZjmL74oiS9JbLyaVHqoW4fqWLlh26AKo2yYMZI
 qPicBBG4HBi3FGvoexnKxlJNdcV4HO7LzDjJmCSfUKYCJi+Pw19T5qmhso0q0qVz
 vHg/A8HNeG4Hn83pNVmLeQSAIQRZ3DvTtcLgbjPo+TVwm/hzrRRBWipTeOVbkLW8
 2bcOXT4t7LWUq15EAJ1LYgYZGzcLrfRfUeOcuQ1TWd3+PcfY9pE7FmizsxAAfaVe
 E9j9mpz4XnIqBtWkFHneTIHkQ5OWptyKuZJEaYH0nut4VsP0k8NarkseafGqBPu7
 5eG83gbiQbCVixfOgblV9eocJ29JcwpjPAY4CZSGJimShg909FV7WRgZgJkKWrbK
 dBRco8Jcp4VglGfo2qymv7Uj4KwQoypBREOhiKUvrAsVlDxPfx+bcskhjGu9xGDC
 xs/+nme0/lKa/wg5K4C3mQ1GAlkMWHI0ojhJjsyODbetup5UbkEu03wjAaTdO9dT
 Y6ptGm0rYAJluPNlziFj
 =qkAt
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "This was delayed a day or two by some build-breakage on old toolchains
  which we've now fixed.

  There's two PCI commits both acked by Bjorn.

  There's one commit to mm/hugepage.c which is (co)authored by Kirill.

  Highlights:
   - Restructure Linux PTE on Book3S/64 to Radix format from Paul
     Mackerras
   - Book3s 64 MMU cleanup in preparation for Radix MMU from Aneesh
     Kumar K.V
   - Add POWER9 cputable entry from Michael Neuling
   - FPU/Altivec/VSX save/restore optimisations from Cyril Bur
   - Add support for new ftrace ABI on ppc64le from Torsten Duwe

  Various cleanups & minor fixes from:
   - Adam Buchbinder, Andrew Donnellan, Balbir Singh, Christophe Leroy,
     Cyril Bur, Luis Henriques, Madhavan Srinivasan, Pan Xinhui, Russell
     Currey, Sukadev Bhattiprolu, Suraj Jitindar Singh.

  General:
   - atomics: Allow architectures to define their own __atomic_op_*
     helpers from Boqun Feng
   - Implement atomic{, 64}_*_return_* variants and acquire/release/
     relaxed variants for (cmp)xchg from Boqun Feng
   - Add powernv_defconfig from Jeremy Kerr
   - Fix BUG_ON() reporting in real mode from Balbir Singh
   - Add xmon command to dump OPAL msglog from Andrew Donnellan
   - Add xmon command to dump process/task similar to ps(1) from Douglas
     Miller
   - Clean up memory hotplug failure paths from David Gibson

  pci/eeh:
   - Redesign SR-IOV on PowerNV to give absolute isolation between VFs
     from Wei Yang.
   - EEH Support for SRIOV VFs from Wei Yang and Gavin Shan.
   - PCI/IOV: Rename and export virtfn_{add, remove} from Wei Yang
   - PCI: Add pcibios_bus_add_device() weak function from Wei Yang
   - MAINTAINERS: Update EEH details and maintainership from Russell
     Currey

  cxl:
   - Support added to the CXL driver for running on both bare-metal and
     hypervisor systems, from Christophe Lombard and Frederic Barrat.
   - Ignore probes for virtual afu pci devices from Vaibhav Jain

  perf:
   - Export Power8 generic and cache events to sysfs from Sukadev
     Bhattiprolu
   - hv-24x7: Fix usage with chip events, display change in counter
     values, display domain indices in sysfs, eliminate domain suffix in
     event names, from Sukadev Bhattiprolu

  Freescale:
   - Updates from Scott: "Highlights include 8xx optimizations, 32-bit
     checksum optimizations, 86xx consolidation, e5500/e6500 cpu
     hotplug, more fman and other dt bits, and minor fixes/cleanup"

* tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits)
  powerpc: Fix unrecoverable SLB miss during restore_math()
  powerpc/8xx: Fix do_mtspr_cpu6() build on older compilers
  powerpc/rcpm: Fix build break when SMP=n
  powerpc/book3e-64: Use hardcoded mttmr opcode
  powerpc/fsl/dts: Add "jedec,spi-nor" flash compatible
  powerpc/T104xRDB: add tdm riser card node to device tree
  powerpc32: PAGE_EXEC required for inittext
  powerpc/mpc85xx: Add pcsphy nodes to FManV3 device tree
  powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s)
  powerpc/86xx: Introduce and use common dtsi
  powerpc/86xx: Update device tree
  powerpc/86xx: Move dts files to fsl directory
  powerpc/86xx: Switch to kconfig fragments approach
  powerpc/86xx: Update defconfigs
  powerpc/86xx: Consolidate common platform code
  powerpc32: Remove one insn in mulhdu
  powerpc32: small optimisation in flush_icache_range()
  powerpc: Simplify test in __dma_sync()
  powerpc32: move xxxxx_dcache_range() functions inline
  powerpc32: Remove clear_pages() and define clear_page() inline
  ...
2016-03-19 15:38:41 -07:00
Kees Cook 4cc7ecb7f2 param: convert some "on"/"off" users to strtobool
This changes several users of manual "on"/"off" parsing to use
strtobool.

Some side-effects:
- these uses will now parse y/n/1/0 meaningfully too
- the early_param uses will now bubble up parse errors

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Amitkumar Karwar <akarwar@marvell.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Joe Perches <joe@perches.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nishant Sarmukadam <nishants@marvell.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Steve French <sfrench@samba.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Joonsoo Kim e7df0d88c4 powerpc: query dynamic DEBUG_PAGEALLOC setting
We can disable debug_pagealloc processing even if the code is compiled
with CONFIG_DEBUG_PAGEALLOC.  This patch changes the code to query
whether it is enabled or not in runtime.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Linus Torvalds 10dc374766 One of the largest releases for KVM... Hardly any generic improvement,
but lots of architecture-specific changes.
 
 * ARM:
 - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems
 - PMU support for guests
 - 32bit world switch rewritten in C
 - various optimizations to the vgic save/restore code.
 
 * PPC:
 - enabled KVM-VFIO integration ("VFIO device")
 - optimizations to speed up IPIs between vcpus
 - in-kernel handling of IOMMU hypercalls
 - support for dynamic DMA windows (DDW).
 
 * s390:
 - provide the floating point registers via sync regs;
 - separated instruction vs. data accesses
 - dirty log improvements for huge guests
 - bugfixes and documentation improvements.
 
 * x86:
 - Hyper-V VMBus hypercall userspace exit
 - alternative implementation of lowest-priority interrupts using vector
 hashing (for better VT-d posted interrupt support)
 - fixed guest debugging with nested virtualizations
 - improved interrupt tracking in the in-kernel IOAPIC
 - generic infrastructure for tracking writes to guest memory---currently
 its only use is to speedup the legacy shadow paging (pre-EPT) case, but
 in the future it will be used for virtual GPUs as well
 - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJW5r3BAAoJEL/70l94x66D2pMH/jTSWWwdTUJMctrDjPVzKzG0
 yOzHW5vSLFoFlwEOY2VpslnXzn5TUVmCAfrdmFNmQcSw6hGb3K/xA/ZX/KLwWhyb
 oZpr123ycahga+3q/ht/dFUBCCyWeIVMdsLSFwpobEBzPL0pMgc9joLgdUC6UpWX
 tmN0LoCAeS7spC4TTiTTpw3gZ/L+aB0B6CXhOMjldb9q/2CsgaGyoVvKA199nk9o
 Ngu7ImDt7l/x1VJX4/6E/17VHuwqAdUrrnbqerB/2oJ5ixsZsHMGzxQ3sHCmvyJx
 WG5L00ubB1oAJAs9fBg58Y/MdiWX99XqFhdEfxq4foZEiQuCyxygVvq3JwZTxII=
 =OUZZ
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "One of the largest releases for KVM...  Hardly any generic
  changes, but lots of architecture-specific updates.

  ARM:
   - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems
   - PMU support for guests
   - 32bit world switch rewritten in C
   - various optimizations to the vgic save/restore code.

  PPC:
   - enabled KVM-VFIO integration ("VFIO device")
   - optimizations to speed up IPIs between vcpus
   - in-kernel handling of IOMMU hypercalls
   - support for dynamic DMA windows (DDW).

  s390:
   - provide the floating point registers via sync regs;
   - separated instruction vs.  data accesses
   - dirty log improvements for huge guests
   - bugfixes and documentation improvements.

  x86:
   - Hyper-V VMBus hypercall userspace exit
   - alternative implementation of lowest-priority interrupts using
     vector hashing (for better VT-d posted interrupt support)
   - fixed guest debugging with nested virtualizations
   - improved interrupt tracking in the in-kernel IOAPIC
   - generic infrastructure for tracking writes to guest
     memory - currently its only use is to speedup the legacy shadow
     paging (pre-EPT) case, but in the future it will be used for
     virtual GPUs as well
   - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (217 commits)
  KVM: x86: remove eager_fpu field of struct kvm_vcpu_arch
  KVM: x86: disable MPX if host did not enable MPX XSAVE features
  arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit
  arm64: KVM: vgic-v3: Reset LRs at boot time
  arm64: KVM: vgic-v3: Do not save an LR known to be empty
  arm64: KVM: vgic-v3: Save maintenance interrupt state only if required
  arm64: KVM: vgic-v3: Avoid accessing ICH registers
  KVM: arm/arm64: vgic-v2: Make GICD_SGIR quicker to hit
  KVM: arm/arm64: vgic-v2: Only wipe LRs on vcpu exit
  KVM: arm/arm64: vgic-v2: Reset LRs at boot time
  KVM: arm/arm64: vgic-v2: Do not save an LR known to be empty
  KVM: arm/arm64: vgic-v2: Move GICH_ELRSR saving to its own function
  KVM: arm/arm64: vgic-v2: Save maintenance interrupt state only if required
  KVM: arm/arm64: vgic-v2: Avoid accessing GICH registers
  KVM: s390: allocate only one DMA page per VM
  KVM: s390: enable STFLE interpretation only if enabled for the guest
  KVM: s390: wake up when the VCPU cpu timer expires
  KVM: s390: step the VCPU timer while in enabled wait
  KVM: s390: protect VCPU cpu timer with a seqcount
  KVM: s390: step VCPU cpu timer during kvm_run ioctl
  ...
2016-03-16 09:55:35 -07:00
Cyril Bur 6e669f085d powerpc: Fix unrecoverable SLB miss during restore_math()
Commit 70fe3d9 "powerpc: Restore FPU/VEC/VSX if previously used" introduces a
call to restore_math() late in the syscall return path, after MSR_RI has been
cleared. The MSR_RI flag is used to indicate whether the kernel can take
another exception or not. A cleared MSR_RI flag indicates that the kernel
cannot.

Unfortunately when a machine is under SLB pressure an SLB miss can occur
in restore_math() which (with MSR_RI cleared) leads to an unrecoverable
exception.

  Unrecoverable exception 4100 at c0000000000088d8
  cpu 0x0: Vector: 4100  at [c0000003fa473b20]
      pc: c0000000000088d8: .load_vr_state+0x70/0x110
      lr: c00000000000f710: .restore_math+0x130/0x188
      sp: c0000003fa473da0
     msr: 9000000002003030
    current = 0xc0000007f876f180
    paca    = 0xc00000000fff0000	 softe: 0	 irq_happened: 0x01
      pid   = 1944, comm = K08umountfs
  [link register   ] c00000000000f710 .restore_math+0x130/0x188
  [c0000003fa473da0] c0000003fa473e30 (unreliable)
  [c0000003fa473e30] c000000000007b6c system_call+0x84/0xfc

The clearing of MSR_RI is actually an optimisation to avoid multiple MSR
writes, what must be disabled are interrupts. See comment in entry_64.S:

  /*
   * For performance reasons we clear RI the same time that we
   * clear EE. We only need to clear RI just before we restore r13
   * below, but batching it with EE saves us one expensive mtmsrd call.
   * We have to be careful to restore RI if we branch anywhere from
   * here (eg syscall_exit_work).
   */

At the point of calling restore_math() r13 has not been restored, as such, the
quick fix of turning MSR_RI back on for the call to restore_math() will
eliminate the occurrence of an unrecoverable exception.

We'd like to do a better fix in future.

Fixes: 70fe3d980f ("powerpc: Restore FPU/VEC/VSX if previously used")
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-16 15:23:02 +11:00
Scott Wood 7a25d91214 powerpc/book3e-64: Use hardcoded mttmr opcode
This preserves the ability to build using older binutils (reportedly <=
2.22).

Fixes: 6becef7ea0 ("powerpc/mpc85xx: Add CPU hotplug support for E6500")
Signed-off-by: Scott Wood <oss@buserror.net>
Cc: chenhui.zhao@freescale.com
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-16 15:22:16 +11:00
Linus Torvalds 710d60cbf1 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull cpu hotplug updates from Thomas Gleixner:
 "This is the first part of the ongoing cpu hotplug rework:

   - Initial implementation of the state machine

   - Runs all online and prepare down callbacks on the plugged cpu and
     not on some random processor

   - Replaces busy loop waiting with completions

   - Adds tracepoints so the states can be followed"

More detailed commentary on this work from an earlier email:
 "What's wrong with the current cpu hotplug infrastructure?

   - Asymmetry

     The hotplug notifier mechanism is asymmetric versus the bringup and
     teardown.  This is mostly caused by the notifier mechanism.

   - Largely undocumented dependencies

     While some notifiers use explicitely defined notifier priorities,
     we have quite some notifiers which use numerical priorities to
     express dependencies without any documentation why.

   - Control processor driven

     Most of the bringup/teardown of a cpu is driven by a control
     processor.  While it is understandable, that preperatory steps,
     like idle thread creation, memory allocation for and initialization
     of essential facilities needs to be done before a cpu can boot,
     there is no reason why everything else must run on a control
     processor.  Before this patch series, bringup looks like this:

       Control CPU                     Booting CPU

       do preparatory steps
       kick cpu into life

                                       do low level init

       sync with booting cpu           sync with control cpu

       bring the rest up

   - All or nothing approach

     There is no way to do partial bringups.  That's something which is
     really desired because we waste e.g.  at boot substantial amount of
     time just busy waiting that the cpu comes to life.  That's stupid
     as we could very well do preparatory steps and the initial IPI for
     other cpus and then go back and do the necessary low level
     synchronization with the freshly booted cpu.

   - Minimal debuggability

     Due to the notifier based design, it's impossible to switch between
     two stages of the bringup/teardown back and forth in order to test
     the correctness.  So in many hotplug notifiers the cancel
     mechanisms are either not existant or completely untested.

   - Notifier [un]registering is tedious

     To [un]register notifiers we need to protect against hotplug at
     every callsite.  There is no mechanism that bringup/teardown
     callbacks are issued on the online cpus, so every caller needs to
     do it itself.  That also includes error rollback.

  What's the new design?

     The base of the new design is a symmetric state machine, where both
     the control processor and the booting/dying cpu execute a well
     defined set of states.  Each state is symmetric in the end, except
     for some well defined exceptions, and the bringup/teardown can be
     stopped and reversed at almost all states.

     So the bringup of a cpu will look like this in the future:

       Control CPU                     Booting CPU

       do preparatory steps
       kick cpu into life

                                       do low level init

       sync with booting cpu           sync with control cpu

                                       bring itself up

     The synchronization step does not require the control cpu to wait.
     That mechanism can be done asynchronously via a worker or some
     other mechanism.

     The teardown can be made very similar, so that the dying cpu cleans
     up and brings itself down.  Cleanups which need to be done after
     the cpu is gone, can be scheduled asynchronously as well.

  There is a long way to this, as we need to refactor the notion when a
  cpu is available.  Today we set the cpu online right after it comes
  out of the low level bringup, which is not really correct.

  The proper mechanism is to set it to available, i.e. cpu local
  threads, like softirqd, hotplug thread etc. can be scheduled on that
  cpu, and once it finished all booting steps, it's set to online, so
  general workloads can be scheduled on it.  The reverse happens on
  teardown.  First thing to do is to forbid scheduling of general
  workloads, then teardown all the per cpu resources and finally shut it
  off completely.

  This patch series implements the basic infrastructure for this at the
  core level.  This includes the following:

   - Basic state machine implementation with well defined states, so
     ordering and prioritization can be expressed.

   - Interfaces to [un]register state callbacks

     This invokes the bringup/teardown callback on all online cpus with
     the proper protection in place and [un]installs the callbacks in
     the state machine array.

     For callbacks which have no particular ordering requirement we have
     a dynamic state space, so that drivers don't have to register an
     explicit hotplug state.

     If a callback fails, the code automatically does a rollback to the
     previous state.

   - Sysfs interface to drive the state machine to a particular step.

     This is only partially functional today.  Full functionality and
     therefor testability will be achieved once we converted all
     existing hotplug notifiers over to the new scheme.

   - Run all CPU_ONLINE/DOWN_PREPARE notifiers on the booting/dying
     processor:

       Control CPU                     Booting CPU

       do preparatory steps
       kick cpu into life

                                       do low level init

       sync with booting cpu           sync with control cpu
       wait for boot
                                       bring itself up

                                       Signal completion to control cpu

     In a previous step of this work we've done a full tree mechanical
     conversion of all hotplug notifiers to the new scheme.  The balance
     is a net removal of about 4000 lines of code.

     This is not included in this series, as we decided to take a
     different approach.  Instead of mechanically converting everything
     over, we will do a proper overhaul of the usage sites one by one so
     they nicely fit into the symmetric callback scheme.

     I decided to do that after I looked at the ugliness of some of the
     converted sites and figured out that their hotplug mechanism is
     completely buggered anyway.  So there is no point to do a
     mechanical conversion first as we need to go through the usage
     sites one by one again in order to achieve a full symmetric and
     testable behaviour"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  cpu/hotplug: Document states better
  cpu/hotplug: Fix smpboot thread ordering
  cpu/hotplug: Remove redundant state check
  cpu/hotplug: Plug death reporting race
  rcu: Make CPU_DYING_IDLE an explicit call
  cpu/hotplug: Make wait for dead cpu completion based
  cpu/hotplug: Let upcoming cpu bring itself fully up
  arch/hotplug: Call into idle with a proper state
  cpu/hotplug: Move online calls to hotplugged cpu
  cpu/hotplug: Create hotplug threads
  cpu/hotplug: Split out the state walk into functions
  cpu/hotplug: Unpark smpboot threads from the state machine
  cpu/hotplug: Move scheduler cpu_online notifier to hotplug core
  cpu/hotplug: Implement setup/removal interface
  cpu/hotplug: Make target state writeable
  cpu/hotplug: Add sysfs state interface
  cpu/hotplug: Hand in target state to _cpu_up/down
  cpu/hotplug: Convert the hotplugged cpu work to a state machine
  cpu/hotplug: Convert to a state machine for the control processor
  cpu/hotplug: Add tracepoints
  ...
2016-03-15 13:50:29 -07:00
Michael Ellerman a1b5344620 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Highlights include 8xx optimizations, 32-bit checksum optimizations,
86xx consolidation, e5500/e6500 cpu hotplug, more fman and other dt
bits, and minor fixes/cleanup."
2016-03-14 20:05:14 +11:00
Christophe Leroy 737b01fca3 powerpc32: Remove one insn in mulhdu
Remove one instruction in mulhdu

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:12 -06:00
Christophe Leroy 716fa91d19 powerpc32: small optimisation in flush_icache_range()
Inlining of _dcache_range() functions has shown that the compiler
does the same thing a bit better with one insn less

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:12 -06:00
Christophe Leroy affe587bac powerpc32: move xxxxx_dcache_range() functions inline
flush/clean/invalidate _dcache_range() functions are all very
similar and are quite short. They are mainly used in __dma_sync()
perf_event locate them in the top 3 consumming functions during
heavy ethernet activity

They are good candidate for inlining, as __dma_sync() does
almost nothing but calling them

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:12 -06:00
Christophe Leroy 5736f96d12 powerpc32: Remove clear_pages() and define clear_page() inline
clear_pages() is never used expect by clear_page, and PPC32 is the
only architecture (still) having this function. Neither PPC64 nor
any other architecture has it.

This patch removes clear_pages() and moves clear_page() function
inline (same as PPC64) as it only is a few isns

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:11 -06:00
Christophe Leroy 766d45cbee powerpc/8xx: rewrite flush_instruction_cache() in C
On PPC8xx, flushing instruction cache is performed by writing
in register SPRN_IC_CST. This registers suffers CPU6 ERRATA.
The patch rewrites the fonction in C so that CPU6 ERRATA will
be handled transparently

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:11 -06:00
Christophe Leroy a7761fe489 powerpc/8xx: rewrite set_context() in C
There is no real need to have set_context() in assembly.
Now that we have mtspr() handling CPU6 ERRATA directly, we
can rewrite set_context() in C language for easier maintenance.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:20:11 -06:00
Christophe Leroy 63e9e1c28f powerpc/8xx: remove special handling of CPU6 errata in set_dec()
CPU6 ERRATA is now handled directly in mtspr(), so we can use the
standard set_dec() fonction in all cases.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:03 -06:00
Christophe Leroy a372acfac5 powerpc/8xx: Map linear kernel RAM with 8M pages
On a live running system (VoIP gateway for Air Trafic Control), over
a 10 minutes period (with 277s idle), we get 87 millions DTLB misses
and approximatly 35 secondes are spent in DTLB handler.
This represents 5.8% of the overall time and even 10.8% of the
non-idle time.
Among those 87 millions DTLB misses, 15% are on user addresses and
85% are on kernel addresses. And within the kernel addresses, 93%
are on addresses from the linear address space and only 7% are on
addresses from the virtual address space.

MPC8xx has no BATs but it has 8Mb page size. This patch implements
mapping of kernel RAM using 8Mb pages, on the same model as what is
done on the 40x.

In 4k pages mode, each PGD entry maps a 4Mb area: we map every two
entries to the same 8Mb physical page. In each second entry, we add
4Mb to the page physical address to ease life of the FixupDAR
routine. This is just ignored by HW.

In 16k pages mode, each PGD entry maps a 64Mb area: each PGD entry
will point to the first page of the area. The DTLB handler adds
the 3 bits from EPN to map the correct page.

With this patch applied, we now get only 13 millions TLB misses
during the 10 minutes period. The idle time has increased to 313s
and the overall time spent in DTLB miss handler is 6.3s, which
represents 1% of the overall time and 2.2% of non-idle time.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:01 -06:00
Christophe Leroy 913a6b3d10 powerpc/8xx: Save r3 all the time in DTLB miss handler
We are spending between 40 and 160 cycles with a mean of 65 cycles in
the DTLB handling routine (measured with mftbl) so make it more
simple althought it adds one instruction.
With this modification, we get three registers available at all time,
which will help with following patch.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11 17:18:01 -06:00
Michael Ellerman d8c0282f4d Merge branch 'topic/mprofile-kernel' into next
Merge the ftrace changes to support -mprofile-kernel on ppc64le. This is
a prerequisite for live patching, the support for which will be merged
via the livepatch tree based on this topic branch.
2016-03-11 11:20:15 +11:00
Ingo Molnar 6cbe9e4a22 Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 10:28:27 +01:00
Christophe Leroy 921fff351c powerpc/8xx: CONFIG_DEBUG_PAGEALLOC requires ITLBmiss for kernel addresses
When CONFIG_DEBUG_PAGEALLOC is activated, the initial TLB mapping gets
flushed to track accesses to wrong areas. Therefore, kernel addresses
will also generate ITLB misses.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09 10:44:17 -06:00
Paolo Bonzini ab92f30875 KVM/ARM updates for 4.6
- VHE support so that we can run the kernel at EL2 on ARMv8.1 systems
 - PMU support for guests
 - 32bit world switch rewritten in C
 - Various optimizations to the vgic save/restore code
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW36xjAAoJECPQ0LrRPXpDGQkQAMDppzcTOixT3e8VPdHAX09a
 Z5PO0gyTMVV7Jyz5Ul3pedPJA2GSK9mxOCwqvIFbdxLAR6ZB00juO5FrTHkSdI91
 1XLPj4bKoMWcVvhL/g5A4Glp/pVMW1k/9Yq8zZAtYlsLRlqG5rLOutSadcqHcYaJ
 cTD/pFf7b2oPtkTPyoFml75KgHBT/8uvAvFDOWA66Id2z6T11+PsBT/6XnGDiwKg
 tpGTNzx3kPIKIzOAOHqVW6UBxFOeabebXLT8wUz3VwNn/UbG6gkumMNApMAyF2q1
 zU0nAh8+7Ek6Dr4OFWE6BfW6sgg/l7i1lA8XoAmqG7ZTrSptCc59fvaZJxPruG+Q
 dMsU6QgR77JJjbZTinf9a1jReZ/liZrx2gZXedVKdILrjmDSq0UnGcxjUOEDZOGy
 2/dbrlJhv+LhpcJtuPpxPCfoqbW5L0ynzmuYuXRdRz3lTHiOWIRx5gugrhO+wH4D
 4gvZhbw3XCiYfpYHYhl8A1EH5kanKgdXDocz9yIm7mZm89gngufF/HkeXS3ZU25T
 yThyBGulGjqN4FCdgf1HolkTfFjnfSx4qJovJ58eHga+HNLXRkTecZZcbFy2OOHv
 8Bx0PIlwj4RgSaRLWQUudAhdhKS2g22DKDDljxFwhkMPNghvqkYMJCRDKLu6GBXQ
 4YsLKM+TaShHFjSpx+ao
 =rpvb
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM updates for 4.6

- VHE support so that we can run the kernel at EL2 on ARMv8.1 systems
- PMU support for guests
- 32bit world switch rewritten in C
- Various optimizations to the vgic save/restore code

Conflicts:
	include/uapi/linux/kvm.h
2016-03-09 11:50:42 +01:00
Andrew Donnellan 949e9b827e powerpc/eeh: eeh_pci_enable(): fix checking of post-request state
In eeh_pci_enable(), after making the request to set the new options, we
call eeh_ops->wait_state() to check that the request finished successfully.

At the moment, if eeh_ops->wait_state() returns 0, we return 0 without
checking that it reflects the expected outcome. This can lead to callers
further up the chain incorrectly assuming the slot has been successfully
unfrozen and continuing to attempt recovery.

On powernv, this will occur if pnv_eeh_get_pe_state() or
pnv_eeh_get_phb_state() return 0, which in turn occurs if the relevant OPAL
call returns OPAL_EEH_STOPPED_MMIO_DMA_FREEZE or
OPAL_EEH_PHB_ERROR respectively.

On pseries, this will occur if pseries_eeh_get_state() returns 0, which in
turn occurs if RTAS reports that the PE is in the MMIO Stopped and DMA
Stopped states.

Obviously, none of these cases represent a successful completion of a
request to thaw MMIO or DMA.

Fix the check so that a wait_state() return value of 0 won't be considered
successful for the EEH_OPT_THAW_MMIO or EEH_OPT_THAW_DMA cases.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 11:33:30 +11:00
Gavin Shan b6c7347f2f powerpc/eeh: Remove duplicated check in eeh_dump_pe_log()
When eeh_dump_pe_log() is only called by eeh_slot_error_detail(),
we already have the check that the PE isn't in PCI config blocked
state in eeh_slot_error_detail(). So we needn't the duplicated
check in eeh_dump_pe_log().

This removes the duplicated check in eeh_dump_pe_log(). No logical
changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 10:25:35 +11:00
Gavin Shan eca036ee1b powerpc/eeh: Synchronize recovery in host/guest
When passing through SRIOV VFs to guest, we possibly encounter EEH
error on PF. In this case, the VF PEs are put into frozen state.
The error could be reported to guest before it's captured by the
host. That means the guest could attempt to recover errors on VFs
before host gets chance to recover errors on PFs. The VFs won't be
recovered successfully.

This enforces the recovery order for above case: the recovery on
child PE in guest is hold until the recovery on parent PE in host
is completed.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:28 +11:00
Gavin Shan 3fa7bf7229 powerpc/eeh: Don't remove passed VFs
When we have partial hotplug as part of the error recovery on PF,
the VFs that are bound with vfio-pci driver will experience hotplug.
That's not allowed.

This checks if the VF PE is passed or not. If it does, we leave
the VF without removing it.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:27 +11:00
Gavin Shan 2311cca555 powerpc/eeh: Don't propagate error to guest
When EEH error happened to the parent PE of those PEs that have
been passed through to guest, the error is propagated to guest
domain and the VFIO driver's error handlers are called. It's not
correct as the error in the host domain shouldn't be propagated
to guests and affect them.

This adds one more limitation when calling EEH error handlers.
If the PE has been passed through to guest, the error handlers
won't be called.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:25 +11:00
Wei Yang 67086e32b5 powerpc/eeh: powerpc/eeh: Support error recovery for VF PE
PFs are enumerated on PCI bus, while VFs are created by PF's driver.

In EEH recovery, it has two cases:
1. Device and driver is EEH aware, error handlers are called.
2. Device and driver is not EEH aware, un-plug the device and plug it again
by enumerating it.

The special thing happens on the second case. For a PF, we could use the
original pci core to enumerate the bus, while for VF we need to record the
VFs which aer un-plugged then plug it again.

Also The patch caches the VF index in pci_dn, which can be used to
calculate VF's bus, device and function number. Those information helps to
locate the VF's PCI device instance when doing hotplug during EEH recovery
if necessary.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:23 +11:00
Wei Yang 9312bc5bab powerpc/powernv: Support EEH reset for VF PE
PEs for VFs don't have primary bus. So they have to have their own reset
backend, which is used during EEH recovery. The patch implements the reset
backend for VF's PE by issuing FLR or AF FLR to the VFs, which are contained
in the PE.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:21 +11:00
Wei Yang c29fa27d26 powerpc/eeh: Create PE for VFs
This creates PEs for VFs in the weak function pcibios_bus_add_device().
Those PEs for VFs are identified with newly introduced flag EEH_PE_VF
so that we treat them differently during EEH recovery.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:19 +11:00
Wei Yang 39218cd00e powerpc/eeh: EEH device for VF
VFs and their corresponding pdn are created and released dynamically
when their PF's SRIOV capability is enabled and disabled. This creates
and releases EEH devices for VFs when creating and releasing their pdn
instances, which means EEH devices and pdn instances have same life
cycle. Also, VF's EEH device is identified by (struct eeh_dev::physfn).

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:18 +11:00
Wei Yang 51c0e87e9a powerpc/eeh: Cache normal BARs, not windows or IOV BARs
This restricts the EEH address cache to use only the first 7 BARs. This
makes __eeh_addr_cache_insert_dev() ignore PCI bridge window and IOV BARs.
As the result of this change, eeh_addr_cache_get_dev() will return VFs from
VF's resource addresses instead of parent PFs.

This also removes PCI bridge check as we limit __eeh_addr_cache_insert_dev()
to 7 BARs and this effectively excludes PCI bridges from being cached.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:17 +11:00
Wei Yang 971427f582 powerpc/pci: Remove VFs prior to PF
As commit ac205b7bb7 ("PCI: make sriov work with hotplug remove")
indicates, VFs which is on the same PCI bus as their PF, should be
removed before the PF. Otherwise, we might run into kernel crash
at PCI unplugging time.

This applies the above pattern to powerpc PCI hotplug path.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:15 +11:00
Gavin Shan 4eb0799ff9 powerpc/eeh: Reworked eeh_pe_bus_get()
The original implementation is ugly: unnecessary if statements and
"out" tag. This reworks the function to avoid above weaknesses. No
functional changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:57:46 +11:00
Torsten Duwe 153086644f powerpc/ftrace: Add support for -mprofile-kernel ftrace ABI
The gcc switch -mprofile-kernel defines a new ABI for calling _mcount()
very early in the function with minimal overhead.

Although mprofile-kernel has been available since GCC 3.4, there were
bugs which were only fixed recently. Currently it is known to work in
GCC 4.9, 5 and 6.

Additionally there are two possible code sequences generated by the
flag, the first uses mflr/std/bl and the second is optimised to omit the
std. Currently only gcc 6 has the optimised sequence. This patch
supports both sequences.

Initial work started by Vojtech Pavlik, used with permission.

Key changes:
 - rework _mcount() to work for both the old and new ABIs.
 - implement new versions of ftrace_caller() and ftrace_graph_caller()
   which deal with the new ABI.
 - updates to __ftrace_make_nop() to recognise the new mcount calling
   sequence.
 - updates to __ftrace_make_call() to recognise the nop'ed sequence.
 - implement ftrace_modify_call().
 - updates to the module loader to surpress the toc save in the module
   stub when calling mcount with the new ABI.

Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-07 14:53:55 +11:00
Torsten Duwe 9a7841ae8d powerpc/ftrace: Use $(CC_FLAGS_FTRACE) when disabling ftrace
Rather than open-coding -pg whereever we want to disable ftrace, use the
existing $(CC_FLAGS_FTRACE) variable.

This has the advantage that it will work in future when we use a
different set of flags to enable ftrace.

Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-07 14:53:55 +11:00
Torsten Duwe c96f83856f powerpc/ftrace: Use generic ftrace_modify_all_code()
Convert powerpc's arch_ftrace_update_code() from its own version to use
the generic default functionality (without stop_machine -- our
instructions are properly aligned and the replacements atomic).

With this we gain error checking and the much-needed function_trace_op
handling.

Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-07 14:53:54 +11:00
Michael Ellerman 336a7b5dd8 powerpc/module: Create a special stub for ftrace_caller()
In order to support the new -mprofile-kernel ABI, we need to be able to
call from the module back to ftrace_caller() (in the kernel) without
using the module's r2. That is because the function in this module which
is calling ftrace_caller() may not have setup r2, if it doesn't
otherwise need it (ie. it accesses no globals).

To make that work we add a new stub which is used for calling
ftrace_caller(), which uses the kernel toc instead of the module toc.

Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-07 14:53:54 +11:00
Michael Ellerman f17c4e01e9 powerpc/module: Mark module stubs with a magic value
When a module is loaded, calls out to the kernel go via a stub which is
generated at runtime. One of these stubs is used to call _mcount(),
which is the default target of tracing calls generated by the compiler
with -pg.

If dynamic ftrace is enabled (which it typically is), another stub is
used to call ftrace_caller(), which is the target of tracing calls when
ftrace is actually active.

ftrace then wants to disable the calls to _mcount() at module startup,
and enable/disable the calls to ftrace_caller() when enabling/disabling
tracing - all of these it does by patching the code.

As part of that code patching, the ftrace code wants to confirm that the
branch it is about to modify, is in fact a call to a module stub which
calls _mcount() or ftrace_caller().

Currently it does that by inspecting the instructions and confirming
they are what it expects. Although that works, the code to do it is
pretty intricate because it requires lots of knowledge about the exact
format of the stub.

We can make that process easier by marking the generated stubs with a
magic value, and then looking for that magic value. Altough this is not
as rigorous as the current method, I believe it is sufficient in
practice.

Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-07 14:53:53 +11:00
Michael Ellerman 136cd3450a powerpc/module: Only try to generate the ftrace_caller() stub once
Currently we generate the module stub for ftrace_caller() at the bottom
of apply_relocate_add(). However apply_relocate_add() is potentially
called more than once per module, which means we will try to generate
the ftrace_caller() stub multiple times.

Although the current code deals with that correctly, ie. it only
generates a stub the first time, it would be clearer to only try to
generate the stub once.

Note also on first reading it may appear that we generate a different
stub for each section that requires relocation, but that is not the
case. The code in stub_for_addr() that searches for an existing stub
uses sechdrs[me->arch.stubs_section], ie. the single stub section for
this module.

A cleaner approach is to only generate the ftrace_caller() stub once,
from module_finalize(). Although the original code didn't check to see
if the stub was actually generated correctly, it seems prudent to add a
check, so do that. And an additional benefit is we can clean the ifdefs
up a little.

Finally we must propagate the const'ness of some of the pointers passed
to module_finalize(), but that is also an improvement.

Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-07 14:53:53 +11:00
Michael Ellerman a5cab83cd3 powerpc: Create a helper for getting the kernel toc value
Move the logic to work out the kernel toc pointer into a header. This is
a good cleanup, and also means we can use it elsewhere in future.

Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
2016-03-07 14:53:52 +11:00
Linus Torvalds b8155fe1b2 powerpc fixes for 4.5 #4
- cxl: Fix PSL timebase synchronization detection from Frederic Barrat
  - Fix oops when destroying hw_breakpoint event from Ravi Bangoria
  - Avoid lbarx on e5500 from Scott Wood
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIbBAABAgAGBQJW21wIAAoJEFHr6jzI4aWAiTQP+PWe24xwWmC/9FWBhT5rmHbs
 CO82Wy2X3gjMsQoXGqe48ohfToJAWEt5yXS+H/PKpoTYJ/yFcih2ufRw55Y6xzHs
 M98c4Rt4mAbGwkmwWTBjSc1zP+ReWeMNwRE69j6N6Xl+fDqGg6tMKrcysh6wdDLB
 ZhLeHZbVEGhIlXuxMjGyPrMxSrvcEZMMVrxp8eTycC6+wBHB4w4WzOpFx87WM0fn
 3g1Q5WWzRjW2iCeWa6tJLwIIzEACrFs8nfJU08mJU1n33YYVbpu+2klG+5ppfGAL
 JiuKt9/ZoWSAMwNU7yEwFdgjtnOzywTnt6kAUWFdMKLV7DktJ+sf0ucZF9oBg87e
 ermsXt+Jy591NXgoif5gFMELWouC+Ti9pnqxdOTawYC8TKFm0OGExJFZQMGmzBw1
 9ZpO/2pl0bsuNZyClspuFyVRrNb1sxddisSyKlxW+0Vnz1P432fRZZSz2gHLvClG
 RjQ+nsKZYj6vH0wfA+eG9+/a5iYjbfBKRNSjrO1g8pbc3ELeG5C2kXDePKnkt8+J
 jpvTLHxaRp44wNPcizPrEthrdlKxE8uvTUosRuSMSJu74SK7t8+9kCvgDnHYne51
 f/2Ko7CzAV2SfdKB+nWI2oJxfYpPlxqORZkYGkQZIOD5uQ8pvJ4yCNzy43XqFHFE
 DyJOYeTxVQmfMT6jkrg=
 =ZcDj
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 - cxl: Fix PSL timebase synchronization detection from Frederic Barrat
 - Fix oops when destroying hw_breakpoint event from Ravi Bangoria
 - Avoid lbarx on e5500 from Scott Wood

* tag 'powerpc-4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/fsl-book3e: Avoid lbarx on e5500
  powerpc/hw_breakpoint: Fix oops when destroying hw_breakpoint event
  cxl: Fix PSL timebase synchronization detection
2016-03-06 11:08:06 -08:00
chenhui zhao 6becef7ea0 powerpc/mpc85xx: Add CPU hotplug support for E6500
Support Freescale E6500 core-based platforms, like t4240.
Support disabling/enabling individual CPU thread dynamically.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
2016-03-04 23:58:38 -06:00
chenhui zhao 2f4f1f815b powerpc/mpc85xx: Add hotplug support on E5500 and E500MC cores
Freescale E500MC and E5500 core-based platforms, like P4080, T1040,
support disabling/enabling CPU dynamically.
This patch adds this feature on those platforms.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Tang Yuantian <Yuantian.Tang@feescale.com>
[scottwood: removed unused pr_fmt]
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 23:56:31 -06:00
chenhui zhao d17799f9c1 powerpc/rcpm: add RCPM driver
There is a RCPM (Run Control/Power Management) in Freescale QorIQ
series processors. The device performs tasks associated with device
run control and power management.

The driver implements some features: mask/unmask irq, enter/exit low
power states, freeze time base, etc.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Tang Yuantian <Yuantian.Tang@freescale.com>
[scottwood: remove __KERNEL__ ifdef]
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 23:50:27 -06:00
chenhui zhao e7affb1dba powerpc/cache: add cache flush operation for various e500
Various e500 core have different cache architecture, so they
need different cache flush operations. Therefore, add a callback
function cpu_flush_caches to the struct cpu_spec. The cache flush
operation for the specific kind of e500 is selected at init time.
The callback function will flush all caches inside the current cpu.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Tang Yuantian <Yuantian.Tang@feescale.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04 23:44:51 -06:00
Ravi Bangoria fb822e6076 powerpc/hw_breakpoint: Fix oops when destroying hw_breakpoint event
When destroying a hw_breakpoint event, the kernel oopses as follows:

  Unable to handle kernel paging request for data at address 0x00000c07
  NIP [c0000000000291d0] arch_unregister_hw_breakpoint+0x40/0x60
  LR [c00000000020b6b4] release_bp_slot+0x44/0x80

Call chain:

  hw_breakpoint_event_init()
    bp->destroy = bp_perf_event_destroy;

  do_exit()
    perf_event_exit_task()
      perf_event_exit_task_context()
        WRITE_ONCE(child_ctx->task, TASK_TOMBSTONE);
        perf_event_exit_event()
          free_event()
            _free_event()
              bp_perf_event_destroy() // event->destroy(event);
                release_bp_slot()
                  arch_unregister_hw_breakpoint()

perf_event_exit_task_context() sets child_ctx->task as TASK_TOMBSTONE
which is (void *)-1. arch_unregister_hw_breakpoint() tries to fetch
'thread' attribute of 'task' resulting in oops.

Peterz points out that the code shouldn't be using bp->ctx anyway, but
fixing that will require a decent amount of rework. So for now to fix
the oops, check if bp->ctx->task has been set to (void *)-1, before
dereferencing it. We don't use TASK_TOMBSTONE, because that would
require exporting it and it's supposed to be an internal detail.

Fixes: 63b6da39bb ("perf: Fix perf_event_exit_task() race")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03 22:06:08 +11:00
Aneesh Kumar K.V f64e8084c9 powerpc/mm: Move hash related mmu-*.h headers to book3s/
No code changes.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03 21:19:21 +11:00
Cyril Bur bf6a4d5b75 powerpc: Add the ability to save VSX without giving it up
This patch adds the ability to be able to save the VSX registers to the
thread struct without giving up (disabling the facility) next time the
process returns to userspace.

This patch builds on a previous optimisation for the FPU and VEC registers
in the thread copy path to avoid a possibly pointless reload of VSX state.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02 23:34:50 +11:00
Cyril Bur 6f515d842e powerpc: Add the ability to save Altivec without giving it up
This patch adds the ability to be able to save the VEC registers to the
thread struct without giving up (disabling the facility) next time the
process returns to userspace.

This patch builds on a previous optimisation for the FPU registers in the
thread copy path to avoid a possibly pointless reload of VEC state.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02 23:34:49 +11:00
Cyril Bur 8792468da5 powerpc: Add the ability to save FPU without giving it up
This patch adds the ability to be able to save the FPU registers to the
thread struct without giving up (disabling the facility) next time the
process returns to userspace.

This patch optimises the thread copy path (as a result of a fork() or
clone()) so that the parent thread can return to userspace with hot
registers avoiding a possibly pointless reload of FPU register state.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02 23:34:49 +11:00
Cyril Bur de2a20aa72 powerpc: Prepare for splitting giveup_{fpu, altivec, vsx} in two
This prepares for the decoupling of saving {fpu,altivec,vsx} registers and
marking {fpu,altivec,vsx} as being unused by a thread.

Currently giveup_{fpu,altivec,vsx}() does both however optimisations to
task switching can be made if these two operations are decoupled.
save_all() will permit the saving of registers to thread structs and leave
threads MSR with bits enabled.

This patch introduces no functional change.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02 23:34:48 +11:00
Cyril Bur 70fe3d980f powerpc: Restore FPU/VEC/VSX if previously used
Currently the FPU, VEC and VSX facilities are lazily loaded. This is not
a problem unless a process is using these facilities.

Modern versions of GCC are very good at automatically vectorising code,
new and modernised workloads make use of floating point and vector
facilities, even the kernel makes use of vectorised memcpy.

All this combined greatly increases the cost of a syscall since the
kernel uses the facilities sometimes even in syscall fast-path making it
increasingly common for a thread to take an *_unavailable exception soon
after a syscall, not to mention potentially taking all three.

The obvious overcompensation to this problem is to simply always load
all the facilities on every exit to userspace. Loading up all FPU, VEC
and VSX registers every time can be expensive and if a workload does
avoid using them, it should not be forced to incur this penalty.

An 8bit counter is used to detect if the registers have been used in the
past and the registers are always loaded until the value wraps to back
to zero.

Several versions of the assembly in entry_64.S were tested:

  1. Always calling C.
  2. Performing a common case check and then calling C.
  3. A complex check in asm.

After some benchmarking it was determined that avoiding C in the common
case is a performance benefit (option 2). The full check in asm (option
3) greatly complicated that codepath for a negligible performance gain
and the trade-off was deemed not worth it.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
[mpe: Move load_vec in the struct to fill an existing hole, reword change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

fixup
2016-03-02 23:34:48 +11:00
Cyril Bur d272f6670a powerpc: Explicitly disable math features when copying thread
Currently when threads get scheduled off they always giveup the FPU,
Altivec (VMX) and Vector (VSX) units if they were using them. When they are
scheduled back on a fault is then taken to enable each facility and load
registers. As a result explicitly disabling FPU/VMX/VSX has not been
necessary.

Future changes and optimisations remove this mandatory giveup and fault
which could cause calls such as clone() and fork() to copy threads and run
them later with FPU/VMX/VSX enabled but no registers loaded.

This patch starts the process of having MSR_{FP,VEC,VSX} mean that a
threads registers are hot while not having MSR_{FP,VEC,VSX} means that the
registers must be loaded. This allows for a smarter return to userspace.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02 23:34:47 +11:00
Thomas Gleixner fc6d73d674 arch/hotplug: Call into idle with a proper state
Let the non boot cpus call into idle with the corresponding hotplug state, so
the hotplug core can handle the further bringup. That's a first step to
convert the boot side of the hotplugged cpus to do all the synchronization
with the other side through the state machine. For now it'll only start the
hotplug thread and kick the full bringup of the cpu.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Rik van Riel <riel@redhat.com>
Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: http://lkml.kernel.org/r/20160226182341.614102639@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-01 20:36:57 +01:00
Adam Buchbinder 446957ba51 powerpc: Fix misspellings in comments.
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01 19:27:20 +11:00
Ingo Molnar 39a1142dbb Linux 4.5-rc6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJW0yM6AAoJEHm+PkMAQRiGeUwIAJRTHFPJTFpJcJjeZEV4/EL1
 7Pl0WSHs/CWBkXIevAg2HgkECSQ9NI9FAUFvoGxCldDpFAnL1U2QV8+Ur2qhiXMG
 5v0jILJuiw57qT/NfhEudZolerlRoHILmB3JRTb+DUV4GHZuWpTkJfUSI9j5aTEl
 w83XUgtK4bKeIyFbHdWQk6xqfzfFBSuEITuSXreOMwkFfMmeScE0WXOPLBZWyhPa
 v0rARJLYgM+vmRAnJjnG8unH+SgnqiNcn2oOFpevKwmpVcOjcEmeuxh/HdeZf7HM
 /R8F86OwdmXsO+z8dQxfcucLg+I9YmKfFr8b6hopu1sRztss2+Uk6H1j2J7IFIg=
 =tvkh
 -----END PGP SIGNATURE-----

Merge tag 'v4.5-rc6' into locking/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-29 09:55:22 +01:00
Suresh E. Warrier e17769eb8c KVM: PPC: Book3S HV: Send IPI to host core to wake VCPU
This patch adds support to real-mode KVM to search for a core
running in the host partition and send it an IPI message with
VCPU to be woken. This avoids having to switch to the host
partition to complete an H_IPI hypercall when the VCPU which
is the target of the the H_IPI is not loaded (is not running
in the guest).

The patch also includes the support in the IPI handler running
in the host to do the wakeup by calling kvmppc_xics_ipi_action
for the PPC_MSG_RM_HOST_ACTION message.

When a guest is being destroyed, we need to ensure that there
are no pending IPIs waiting to wake up a VCPU before we free
the VCPUs of the guest. This is accomplished by:
- Forces a PPC_MSG_CALL_FUNCTION IPI to be completed by all CPUs
  before freeing any VCPUs in kvm_arch_destroy_vm().
- Any PPC_MSG_RM_HOST_ACTION messages must be executed first
  before any other PPC_MSG_CALL_FUNCTION messages.

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2016-02-29 16:25:06 +11:00
Suresh Warrier 31639c77e0 powerpc/smp: Add smp_muxed_ipi_set_message
smp_muxed_ipi_message_pass() invokes smp_ops->cause_ipi, which
uses an ioremapped address to access registers on the XICS
interrupt controller to cause the IPI. Because of this real
mode callers cannot call smp_muxed_ipi_message_pass() for IPI
messaging.

This patch creates a separate function smp_muxed_ipi_set_message
just to set the IPI message without the cause_ipi routine.
After calling this function to set the IPI message, real
mode callers must cause the IPI by writing to the XICS registers
directly.

As part of this, we also change smp_muxed_ipi_message_pass
to call smp_muxed_ipi_set_message to set the message instead
of doing it directly inside the routine.

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2016-02-29 16:25:06 +11:00
Suresh Warrier bd7f561f76 powerpc/smp: Support more IPI messages
This patch increases the number of demuxed messages for a
controller with a single ipi to 8 for 64-bit systems.

This is required because we want to use the IPI mechanism
to send messages from a CPU running in KVM real mode in a
guest to a CPU in the host to take some action. Currently,
we only support 4 messages and all 4 are already taken.

Define a fifth message PPC_MSG_RM_HOST_ACTION for this
purpose.

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2016-02-29 16:25:06 +11:00
Daniel Cashman 5ef11c35ce mm: ASLR: use get_random_long()
Replace calls to get_random_int() followed by a cast to (unsigned long)
with calls to get_random_long().  Also address shifting bug which, in
case of x86 removed entropy mask for mmap_rnd_bits values > 31 bits.

Signed-off-by: Daniel Cashman <dcashman@android.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nick Kralevich <nnk@google.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-27 10:28:52 -08:00
Linus Torvalds 9aca90a7ca powerpc fixes for 4.5 #3
- eeh: Fix partial hotplug criterion from Gavin Shan
  - mm: Clear the invalid slot information correctly from Aneesh Kumar K.V
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWzXquAAoJEFHr6jzI4aWADHsP/2lbwqz/vS3Ep4zlySHNvStL
 /DrRN2TN35THZ59FPRxgEfeqPxTCXtbpD6zEXwD0gf6m39I2zArhaQMOHXMtVPvV
 p0nAtwR0PX/PxlQTJDpHlg074vVAD7s3iuvad6oNQObLcXhoZ7wYtbStZ9Ithm4R
 YfqZTelzsw+GfMuTYnvAQf5aoRYztUpy7OheaJbbDmSZgMFwF896ZPJnaG9rAOPE
 xcSsRaSfHiUR2NE2ua1K5yya+1ilZqrZhib7QxXgzGuxoVa2AAiPR7Hpx2kX1Wm+
 z0DqPXISzRbVf9zyLgWD3TpJ4OMHI/CYVW+t/Gx/yWCMfNcfavUrh0vPdHRVEPZu
 zxmIUoI6yv7jQ6bcfdzR5s0Mr5pYWlUj5MZg2r8aGqloYcLPk5DiENg+c0QmKI05
 kQPCBoQz2ezzJWAt1BYshkc+mlimv3ODaNWFP34Nc6kcDaSO6a0rhVOecvKuR6dv
 UBNpeh5np1rKq1wX0ri0yAmnm//yXqe+bK0I8Ctipi0++e73sVJGzfFdVvXwEhhW
 h+v1BkdgW8WK/xlH+JCPiXd5dfXrUeFI0D65Kgpb7IbFc9hcXDmp2Dv7+8zx/Wcl
 L2NpuucSDxi+LHkE10QiypgLWSKjn9OSi8PLocqABNXG8uHxIp54jRfyViBNALXF
 XlPveqTgpt7On3aa0qVh
 =bk3U
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 - eeh: Fix partial hotplug criterion from Gavin Shan
 - mm: Clear the invalid slot information correctly from Aneesh Kumar K.V

* tag 'powerpc-4.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm/hash: Clear the invalid slot information correctly
  powerpc/eeh: Fix partial hotplug criterion
2016-02-25 19:41:53 -08:00
Michael Ellerman 2527083cb8 powerpc fixes for 4.5 #3
- eeh: Fix partial hotplug criterion from Gavin Shan
  - mm: Clear the invalid slot information correctly from Aneesh Kumar K.V
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWzXquAAoJEFHr6jzI4aWADHsP/2lbwqz/vS3Ep4zlySHNvStL
 /DrRN2TN35THZ59FPRxgEfeqPxTCXtbpD6zEXwD0gf6m39I2zArhaQMOHXMtVPvV
 p0nAtwR0PX/PxlQTJDpHlg074vVAD7s3iuvad6oNQObLcXhoZ7wYtbStZ9Ithm4R
 YfqZTelzsw+GfMuTYnvAQf5aoRYztUpy7OheaJbbDmSZgMFwF896ZPJnaG9rAOPE
 xcSsRaSfHiUR2NE2ua1K5yya+1ilZqrZhib7QxXgzGuxoVa2AAiPR7Hpx2kX1Wm+
 z0DqPXISzRbVf9zyLgWD3TpJ4OMHI/CYVW+t/Gx/yWCMfNcfavUrh0vPdHRVEPZu
 zxmIUoI6yv7jQ6bcfdzR5s0Mr5pYWlUj5MZg2r8aGqloYcLPk5DiENg+c0QmKI05
 kQPCBoQz2ezzJWAt1BYshkc+mlimv3ODaNWFP34Nc6kcDaSO6a0rhVOecvKuR6dv
 UBNpeh5np1rKq1wX0ri0yAmnm//yXqe+bK0I8Ctipi0++e73sVJGzfFdVvXwEhhW
 h+v1BkdgW8WK/xlH+JCPiXd5dfXrUeFI0D65Kgpb7IbFc9hcXDmp2Dv7+8zx/Wcl
 L2NpuucSDxi+LHkE10QiypgLWSKjn9OSi8PLocqABNXG8uHxIp54jRfyViBNALXF
 XlPveqTgpt7On3aa0qVh
 =bk3U
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.5-4' into next

Pull in our current fixes from 4.5, in particular the "Fix Multi hit
ERAT" bug is causing folks some grief when testing next.
2016-02-25 21:52:58 +11:00
Balbir Singh a4c3f909b4 powerpc: Fix BUG_ON() reporting in real mode
I ran into this issue while debugging an early boot problem. The system
hit a BUG_ON() but report bug failed to print the line number and file
name. The reason being that the system was running in real mode and
report_bug() searches for addresses in the PAGE_OFFSET+ region.

Suggested-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-24 20:09:04 +11:00
Michael Neuling c3ab300ea5 powerpc: Add POWER9 cputable entry
Add a cputable entry for POWER9.  More code is required to actually
boot and run on a POWER9 but this gets the base piece in which we can
start building on.

Copies over from POWER8 except for:
- Adds a new CPU_FTR_ARCH_300 bit to start hanging new architecture
   features from (in subsequent patches).
- Advertises new user features bits PPC_FEATURE2_ARCH_3_00 &
  HAS_IEEE128 when on POWER9.
- Drops CPU_FTR_SUBCORE.
- Drops PMU code and machine check.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-22 20:47:48 +11:00
Michael Neuling 15b1624b78 powerpc: Use defines for __init_tlb_power[78]
Use defines for literals __init_tlb_power[78] rather than hand coding
them.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-22 20:47:47 +11:00
Gavin Shan f6bf0fa14c powerpc/eeh: Fix partial hotplug criterion
During error recovery, the device could be removed as part of the
partial hotplug. The criterion used to come with partial hotplug
is: if the device driver provides error_detected(), slot_reset()
and resume() callbacks, it's immune from hotplug. Otherwise,
it's going to experience partial hotplug during EEH recovery. But
the criterion isn't correct enough: mlx4_core driver for Mellanox
adapters provides error_detected(), slot_reset() callbacks, but
resume() isn't there. Those Mellanox adapters won't be to involved
in the partial hotplug.

This fixes the criterion to a practical one: adpater with driver
that provides error_detected(), slot_reset() will be immune from
partial hotplug. resume() isn't mandatory.

Fixes: f2da4ccf ("powerpc/eeh: More relaxed hotplug criterion")
Cc: stable@vger.kernel.org #v4.4+
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-22 19:25:55 +11:00
Linus Torvalds e6a1c1e9dd powerpc fixes for 4.5 #2
- Fix build error on 32-bit with checkpoint restart from Aneesh Kumar
  - Fix dedotify for binutils >= 2.26 from Andreas Schwab
  - Don't trace hcalls on offline CPUs from Denis Kirjanov
  - eeh: Fix stale cached primary bus from Gavin Shan
  - eeh: Fix stale PE primary bus from Gavin Shan
  - mm: Fix Multi hit ERAT cause by recent THP update from Aneesh Kumar K.V
  - ioda: Set "read" permission when "write" is set from Alexey Kardashevskiy
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWx8l5AAoJEFHr6jzI4aWAeY0P/AomeQCRieoBMKJi36WX4+gU
 Cm1iBgM593VEM/KFsYtedm5+4QaCmPE+1tVm4/u0wbLeEQ8TqNLSZLniB9USE0hb
 9655gGQyFE95BZa8WfbqHOI7+BK+TkUOWGY0CfyqPVrknzSN2MCDHjUaNo1wge6l
 zmIYIkKhaQAinFSFovOdjQ63rYdk6CxsfgbP1Gl2aX0cmzWW1n07AvZLqNmLFJ+4
 L3uBXPcrEKY/nfkRi+FutoTb86ggt9J9dqCfJHHfWKn60qwhpKwiva84k3jI/BOu
 yBTFeNzlobXt0ceDSWx1ITXzKmJQokWGC5+Lo+0mDb4veAbhLgHlXdx7NUcZIB6+
 YGYGSOkeKCnbnInIOGLz45LlevJFviI94y0YY4tt++Csq/IjnBhDeTkGx7zcqRLG
 v5hl7AhykHd3Me5iRuyRRVoVyk6+318OZW450Oxxj7EtkzpSeLfHCMKxk5w1Nxuk
 tenWQeApdkTVr91m5VJNFOsrmtFDLJv51C8duiUFWzc195ejSMYDR86K+qBeaebs
 39obXHVYRnCrn9TzODIR9SnEd7dHImekQ4a3G3F54mLJ4qqUN089TDqBGY2GNuT8
 j8QVBttp3SWuZSvtvDJLxoFt2QTKxcuiMQ4FX/OAS4qWRjSR8v2WTCyBZt68l7er
 kpUnIelJSuIDVLdNuFlf
 =7Yzi
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 - Fix build error on 32-bit with checkpoint restart from Aneesh Kumar
 - Fix dedotify for binutils >= 2.26 from Andreas Schwab
 - Don't trace hcalls on offline CPUs from Denis Kirjanov
 - eeh: Fix stale cached primary bus from Gavin Shan
 - eeh: Fix stale PE primary bus from Gavin Shan
 - mm: Fix Multi hit ERAT cause by recent THP update from Aneesh Kumar K.V
 - ioda: Set "read" permission when "write" is set from Alexey Kardashevskiy

* tag 'powerpc-4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/ioda: Set "read" permission when "write" is set
  powerpc/mm: Fix Multi hit ERAT cause by recent THP update
  powerpc/powernv: Fix stale PE primary bus
  powerpc/eeh: Fix stale cached primary bus
  powerpc/pseries: Don't trace hcalls on offline CPUs
  powerpc: Fix dedotify for binutils >= 2.26
  powerpc/book3s_32: Fix build error with checkpoint restart
2016-02-20 09:22:11 -08:00
Balbir Singh 94e3d92359 powerpc: Fix kgdb on little endian ppc64le
I spent some time trying to use kgdb and debugged my inability to
resume from kgdb_handle_breakpoint(). NIP is not incremented
and that leads to a loop in the debugger.

I've tested this lightly on a virtual instance with KDB enabled.
After the patch, I am able to get the "go" command to work as
expected.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-18 00:03:26 +11:00
Gavin Shan 05ba75f848 powerpc/eeh: Fix stale cached primary bus
When PE is created, its primary bus is cached to pe->bus. At later
point, the cached primary bus is returned from eeh_pe_bus_get().
However, we could get stale cached primary bus and run into kernel
crash in one case: full hotplug as part of fenced PHB error recovery
releases all PCI busses under the PHB at unplugging time and recreate
them at plugging time. pe->bus is still dereferencing the PCI bus
that was released.

This adds another PE flag (EEH_PE_PRI_BUS) to represent the validity
of pe->bus. pe->bus is updated when its first child EEH device is
online and the flag is set. Before unplugging in full hotplug for
error recovery, the flag is cleared.

Fixes: 8cdb2833 ("powerpc/eeh: Trace PCI bus from PE")
Cc: stable@vger.kernel.org #v3.11+
Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-15 21:10:04 +11:00
Andrey Ryabinin 06bea3dbfe locking/lockdep: Eliminate lockdep_init()
Lockdep is initialized at compile time now.  Get rid of lockdep_init().

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Krinkin <krinkin.m.u@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: mm-commits@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-09 12:03:25 +01:00
Andrew Donnellan 31f6a4ada1 powerpc/eeh: fix incorrect function name in comment
The comment block above pcibios_set_pcie_reset_state() incorrectly refers
to pcibios_set_pcie_slot_reset(). Fix the comment accordingly.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-08 22:34:59 +11:00
Andreas Schwab f15838e9ca powerpc: Fix dedotify for binutils >= 2.26
Since binutils 2.26 BFD is doing suffix merging on STRTAB sections.  But
dedotify modifies the symbol names in place, which can also modify
unrelated symbols with a name that matches a suffix of a dotted name.  To
remove the leading dot of a symbol name we can just increment the pointer
into the STRTAB section instead.

Backport to all stables to avoid breakage when people update their
binutils - mpe.

Cc: stable@vger.kernel.org
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-08 22:01:46 +11:00
Linus Torvalds ec1cc55d6f powerpc fixes for 4.5
- Wire up copy_file_range() syscall from Chandan Rajendra
  - Simplify module TOC handling from Alan Modra
  - Remove newly added extra definition of pmd_dirty from Stephen Rothwell
  - Allow user space to map rtas_rmo_buf from Vasant Hegde
  - Fix PE location code from Gavin Shan
  - Remove PPMU_HAS_SSLOT flag for Power8 from Madhavan Srinivasan
  - Fixup _HPAGE_CHG_MASK from Aneesh Kumar K.V
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWqy1MAAoJEFHr6jzI4aWAlcsP/1I1WbD3Ek8pL/ljTxD9bfxb
 DxF/HklYphzJDEvupgjDrmJO0RuMHrItAqTqsFbpWfCgn6OtIL/QHgZK3Aebtgjq
 7u6V6SqjfYO7vWnmknvzcG+wDrPb3FrXyrFDE/Stz8IIh9OYrO9HzamqPxfhovh1
 RQzD5eh3FWS9gzKDTiwh5w/lqwgP9Mv0b7BJEUvkQWv9Y9ZG4ZQeQwelUqTD2MKx
 UIVYHjHXiuYYiMP5u59V/VFULq5C7s+DqCENTwfVERfN75p3K/JnO0x/87uiz+U+
 0Y5owkK7sTr/Ozo9rMF5mqd+JNUAutkiD/+xDBivnZlxM/cnGtPpc+D/g7+CT0ar
 oh0GDtCEQeEzyoFHsizSAr1FvXfo7NelhzY9CIoi7KHwCBtZDOIhUndkEfsKnYea
 oZSf86F5KqSw8vTOrrKT5gZLYu5ro513vQHg0vw+tNHIWppsIeW/Pbr9e0o7I6bV
 px3EmKkuUJfSNBNyDscWdUetRWilZsGW+Gg47mlf8Dck091exJ6o1n7HU8Y83KP+
 7QDGwT5AQAZ47Z1N1DyNY5V+9SiYYSrgWi9hQTCtQXKjgd0Cia4zTDaEEMGotfQM
 7DoR6r9tdCphc1oIiUJHhdSgbnR7Yq8804Bc8LSy7gkv9ZjcPvbirgDDLnIJ9zib
 yCt6l6sRkDKZqvlV4wqN
 =KZ85
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 - Wire up copy_file_range() syscall from Chandan Rajendra
 - Simplify module TOC handling from Alan Modra
 - Remove newly added extra definition of pmd_dirty from Stephen Rothwell
 - Allow user space to map rtas_rmo_buf from Vasant Hegde
 - Fix PE location code from Gavin Shan
 - Remove PPMU_HAS_SSLOT flag for Power8 from Madhavan Srinivasan
 - Fixup _HPAGE_CHG_MASK from Aneesh Kumar K.V

* tag 'powerpc-4.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm: Fixup _HPAGE_CHG_MASK
  powerpc/perf: Remove PPMU_HAS_SSLOT flag for Power8
  powerpc/eeh: Fix PE location code
  powerpc/mm: Allow user space to map rtas_rmo_buf
  powerpc: Remove newly added extra definition of pmd_dirty
  powerpc: Simplify module TOC handling
  powerpc: Wire up copy_file_range() syscall
2016-01-29 16:10:16 -08:00
Gavin Shan 7e56f62776 powerpc/eeh: Fix PE location code
In eeh_pe_loc_get(), the PE location code is retrieved from the
"ibm,loc-code" property of the device node for the bridge of the
PE's primary bus. It's not correct because the property indicates
the parent PE's location code.

This reads the correct PE location code from "ibm,io-base-loc-code"
or "ibm,slot-location-code" property of PE parent bus's device node.

Cc: stable@vger.kernel.org # v3.16+
Fixes: 357b2f3dd9 ("powerpc/eeh: Dump PE location code")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-27 11:37:37 +11:00
Linus Torvalds eae21770b4 Merge branch 'akpm' (patches from Andrew)
Merge third patch-bomb from Andrew Morton:
 "I'm pretty much done for -rc1 now:

   - the rest of MM, basically

   - lib/ updates

   - checkpatch, epoll, hfs, fatfs, ptrace, coredump, exit

   - cpu_mask simplifications

   - kexec, rapidio, MAINTAINERS etc, etc.

   - more dma-mapping cleanups/simplifications from hch"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (109 commits)
  MAINTAINERS: add/fix git URLs for various subsystems
  mm: memcontrol: add "sock" to cgroup2 memory.stat
  mm: memcontrol: basic memory statistics in cgroup2 memory controller
  mm: memcontrol: do not uncharge old page in page cache replacement
  Documentation: cgroup: add memory.swap.{current,max} description
  mm: free swap cache aggressively if memcg swap is full
  mm: vmscan: do not scan anon pages if memcg swap limit is hit
  swap.h: move memcg related stuff to the end of the file
  mm: memcontrol: replace mem_cgroup_lruvec_online with mem_cgroup_online
  mm: vmscan: pass memcg to get_scan_count()
  mm: memcontrol: charge swap to cgroup2
  mm: memcontrol: clean up alloc, online, offline, free functions
  mm: memcontrol: flatten struct cg_proto
  mm: memcontrol: rein in the CONFIG space madness
  net: drop tcp_memcontrol.c
  mm: memcontrol: introduce CONFIG_MEMCG_LEGACY_KMEM
  mm: memcontrol: allow to disable kmem accounting for cgroup2
  mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
  mm: memcontrol: move kmem accounting code to CONFIG_MEMCG
  mm: memcontrol: separate kmem code from legacy tcp accounting code
  ...
2016-01-21 12:32:08 -08:00
Linus Torvalds d43421565b PCI changes for the v4.5 merge window:
Enumeration
     Simplify config space size computation (Bjorn Helgaas)
     Avoid iterating through ROM outside the resource window (Edward O'Callaghan)
     Support PCIe devices with short cfg_size (Jason S. McMullan)
     Add Netronome vendor and device IDs (Jason S. McMullan)
     Limit config space size for Netronome NFP6000 family (Jason S. McMullan)
     Add Netronome NFP4000 PF device ID (Simon Horman)
     Limit config space size for Netronome NFP4000 (Simon Horman)
     Print warnings for all invalid expansion ROM headers (Vladis Dronov)
 
   Resource management
     Fix minimum allocation address overwrite (Christoph Biedl)
 
   PCI device hotplug
     acpiphp_ibm: Fix null dereferences on null ibm_slot (Colin Ian King)
     pciehp: Always protect pciehp_disable_slot() with hotplug mutex (Guenter Roeck)
     shpchp: Constify hpc_ops structure (Julia Lawall)
     ibmphp: Remove unneeded NULL test (Julia Lawall)
 
   Power management
     Make ASPM sysfs link_state_store() consistent with link_state_show() (Andy Lutomirski)
 
   Virtualization
     Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183 (Tim Sander)
 
   MSI
     Remove empty pci_msi_init_pci_dev() (Bjorn Helgaas)
     Mark PCIe/PCI (MSI) IRQ cascade handlers as IRQF_NO_THREAD (Grygorii Strashko)
     Initialize MSI capability for all architectures (Guilherme G. Piccoli)
     Relax msi_domain_alloc() to support parentless MSI irqdomains (Liu Jiang)
 
   ARM Versatile host bridge driver
     Remove unused pci_sys_data structures (Lorenzo Pieralisi)
 
   Broadcom iProc host bridge driver
     Hide CONFIG_PCIE_IPROC (Arnd Bergmann)
     Do not use 0x in front of %pap (Dmitry V. Krivenok)
     Update iProc PCIe device tree binding (Ray Jui)
     Add PAXC interface support (Ray Jui)
     Add iProc PCIe MSI device tree binding (Ray Jui)
     Add iProc PCIe MSI support (Ray Jui)
 
   Freescale i.MX6 host bridge driver
     Use gpio_set_value_cansleep() (Fabio Estevam)
     Add support for active-low reset GPIO (Petr Štetiar)
 
   HiSilicon host bridge driver
     Add support for HiSilicon Hip06 PCIe host controllers (Gabriele Paoloni)
 
   Intel VMD host bridge driver
     Export irq_domain_set_info() for module use (Keith Busch)
     x86/PCI: Allow DMA ops specific to a PCI domain (Keith Busch)
     Use 32 bit PCI domain numbers (Keith Busch)
     Add driver for Intel Volume Management Device (VMD) (Keith Busch)
 
   Qualcomm host bridge driver
     Document PCIe devicetree bindings (Stanimir Varbanov)
     Add Qualcomm PCIe controller driver (Stanimir Varbanov)
     dts: apq8064: add PCIe devicetree node (Stanimir Varbanov)
     dts: ifc6410: enable PCIe DT node for this board (Stanimir Varbanov)
 
   Renesas R-Car host bridge driver
     Add support for R-Car H3 to pcie-rcar (Harunobu Kurokawa)
     Allow DT to override default window settings (Phil Edworthy)
     Convert to DT resource parsing API (Phil Edworthy)
     Revert "PCI: rcar: Build pcie-rcar.c only on ARM" (Phil Edworthy)
     Remove unused pci_sys_data struct from pcie-rcar (Phil Edworthy)
     Add runtime PM support to pcie-rcar (Phil Edworthy)
     Add Gen2 PHY setup to pcie-rcar (Phil Edworthy)
     Add gen2 fallback compatibility string for pci-rcar-gen2 (Simon Horman)
     Add gen2 fallback compatibility string for pcie-rcar (Simon Horman)
 
   Synopsys DesignWare host bridge driver
     Simplify control flow (Bjorn Helgaas)
     Make config accessor override checking symmetric (Bjorn Helgaas)
     Ensure ATU is enabled before IO/conf space accesses (Stanimir Varbanov)
 
   Miscellaneous
     Add of_pci_get_host_bridge_resources() stub (Arnd Bergmann)
     Check for PCI_HEADER_TYPE_BRIDGE equality, not bitmask (Bjorn Helgaas)
     Fix all whitespace issues (Bogicevic Sasa)
     x86/PCI: Simplify pci_bios_{read,write} (Geliang Tang)
     Use to_pci_dev() instead of open-coding it (Geliang Tang)
     Use kobj_to_dev() instead of open-coding it (Geliang Tang)
     Use list_for_each_entry() to simplify code (Geliang Tang)
     Fix typos in <linux/msi.h> (Thomas Petazzoni)
     x86/PCI: Clarify AMD Fam10h config access restrictions comment (Tomasz Nowicki)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWoQIuAAoJEFmIoMA60/r8ckYP/0ZrkANeN1SB5cQVi2k7aceq
 kQb1Hk6ifxohJvgpJ/iwmVCHoApyeBfUBfrC+fUpIC2f7ncPsE5HNyjqpAWzFzj2
 sYWwY029yjBQ9g4mPhkvjBXfha+lNtLthWc+Xxcat5pdcyG63Dg4SfJKWm2ZYnbN
 0GJzyRZXIwAMnNf0KIr61Aqru0nXeHvi5wblyJ08UZ7AcNzCtB0wKLmE3S6SeZVF
 f2fry35zcGu+TFvQ1hAYemfl3XyDBJ87nPiKzJAwYSaKcWPFWt+72PBDPO6X9squ
 6prm4nmAgeG2Oo4Zu0fbkDlB2bEsWUc14/xT0i5Wfs35vcwzF+S1zirJAtVqoNir
 NgC7fSbEHbsS7FZOz0rBOBIvIkbb6NdfLFuZqUFv0X1M5bRFywjo8lZRfAYoGJzK
 Mmus0uKbklx5m6RT5adf9+Plev1YJT6XZW9XrDpGnxrwRyPjHmyvuTWsYkumxY7Q
 CE5Wr3p7q2I2+MtrQVv2D9Nzsb+4zQ6BgHrd2vwR/IxTsfdXLU7+B691wkUDX8No
 UKFxBd0FiVCn+srG96u7lWQvdoUqoNCogTZSVzGR5gFBv3zAN9gi8HS7NbV558Mg
 Io3Xw+6dcbG33uvWdU6jHEDLMQsohZcp05Q5esCgRQNV4cGJbPxBDtOZEO/ezvW4
 FAI7lfgYTFiQK3NzE3Ng
 =z9mQ
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "PCI changes for the v4.5 merge window:

  Enumeration:
   - Simplify config space size computation (Bjorn Helgaas)
   - Avoid iterating through ROM outside the resource window (Edward O'Callaghan)
   - Support PCIe devices with short cfg_size (Jason S. McMullan)
   - Add Netronome vendor and device IDs (Jason S. McMullan)
   - Limit config space size for Netronome NFP6000 family (Jason S. McMullan)
   - Add Netronome NFP4000 PF device ID (Simon Horman)
   - Limit config space size for Netronome NFP4000 (Simon Horman)
   - Print warnings for all invalid expansion ROM headers (Vladis Dronov)

  Resource management:
   - Fix minimum allocation address overwrite (Christoph Biedl)

  PCI device hotplug:
   - acpiphp_ibm: Fix null dereferences on null ibm_slot (Colin Ian King)
   - pciehp: Always protect pciehp_disable_slot() with hotplug mutex (Guenter Roeck)
   - shpchp: Constify hpc_ops structure (Julia Lawall)
   - ibmphp: Remove unneeded NULL test (Julia Lawall)

  Power management:
   - Make ASPM sysfs link_state_store() consistent with link_state_show() (Andy Lutomirski)

  Virtualization
   - Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183 (Tim Sander)

  MSI:
   - Remove empty pci_msi_init_pci_dev() (Bjorn Helgaas)
   - Mark PCIe/PCI (MSI) IRQ cascade handlers as IRQF_NO_THREAD (Grygorii Strashko)
   - Initialize MSI capability for all architectures (Guilherme G. Piccoli)
   - Relax msi_domain_alloc() to support parentless MSI irqdomains (Liu Jiang)

  ARM Versatile host bridge driver:
   - Remove unused pci_sys_data structures (Lorenzo Pieralisi)

  Broadcom iProc host bridge driver:
   - Hide CONFIG_PCIE_IPROC (Arnd Bergmann)
   - Do not use 0x in front of %pap (Dmitry V. Krivenok)
   - Update iProc PCIe device tree binding (Ray Jui)
   - Add PAXC interface support (Ray Jui)
   - Add iProc PCIe MSI device tree binding (Ray Jui)
   - Add iProc PCIe MSI support (Ray Jui)

  Freescale i.MX6 host bridge driver:
   - Use gpio_set_value_cansleep() (Fabio Estevam)
   - Add support for active-low reset GPIO (Petr Štetiar)

  HiSilicon host bridge driver:
   - Add support for HiSilicon Hip06 PCIe host controllers (Gabriele Paoloni)

  Intel VMD host bridge driver:
   - Export irq_domain_set_info() for module use (Keith Busch)
   - x86/PCI: Allow DMA ops specific to a PCI domain (Keith Busch)
   - Use 32 bit PCI domain numbers (Keith Busch)
   - Add driver for Intel Volume Management Device (VMD) (Keith Busch)

  Qualcomm host bridge driver:
   - Document PCIe devicetree bindings (Stanimir Varbanov)
   - Add Qualcomm PCIe controller driver (Stanimir Varbanov)
   - dts: apq8064: add PCIe devicetree node (Stanimir Varbanov)
   - dts: ifc6410: enable PCIe DT node for this board (Stanimir Varbanov)

  Renesas R-Car host bridge driver:
   - Add support for R-Car H3 to pcie-rcar (Harunobu Kurokawa)
   - Allow DT to override default window settings (Phil Edworthy)
   - Convert to DT resource parsing API (Phil Edworthy)
   - Revert "PCI: rcar: Build pcie-rcar.c only on ARM" (Phil Edworthy)
   - Remove unused pci_sys_data struct from pcie-rcar (Phil Edworthy)
   - Add runtime PM support to pcie-rcar (Phil Edworthy)
   - Add Gen2 PHY setup to pcie-rcar (Phil Edworthy)
   - Add gen2 fallback compatibility string for pci-rcar-gen2 (Simon Horman)
   - Add gen2 fallback compatibility string for pcie-rcar (Simon Horman)

  Synopsys DesignWare host bridge driver:
   - Simplify control flow (Bjorn Helgaas)
   - Make config accessor override checking symmetric (Bjorn Helgaas)
   - Ensure ATU is enabled before IO/conf space accesses (Stanimir Varbanov)

  Miscellaneous:
   - Add of_pci_get_host_bridge_resources() stub (Arnd Bergmann)
   - Check for PCI_HEADER_TYPE_BRIDGE equality, not bitmask (Bjorn Helgaas)
   - Fix all whitespace issues (Bogicevic Sasa)
   - x86/PCI: Simplify pci_bios_{read,write} (Geliang Tang)
   - Use to_pci_dev() instead of open-coding it (Geliang Tang)
   - Use kobj_to_dev() instead of open-coding it (Geliang Tang)
   - Use list_for_each_entry() to simplify code (Geliang Tang)
   - Fix typos in <linux/msi.h> (Thomas Petazzoni)
   - x86/PCI: Clarify AMD Fam10h config access restrictions comment (Tomasz Nowicki)"

* tag 'pci-v4.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (58 commits)
  PCI: Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183
  PCI: Limit config space size for Netronome NFP4000
  PCI: Add Netronome NFP4000 PF device ID
  x86/PCI: Add driver for Intel Volume Management Device (VMD)
  PCI/AER: Use 32 bit PCI domain numbers
  x86/PCI: Allow DMA ops specific to a PCI domain
  irqdomain: Export irq_domain_set_info() for module use
  PCI: host: Add of_pci_get_host_bridge_resources() stub
  genirq/MSI: Relax msi_domain_alloc() to support parentless MSI irqdomains
  PCI: rcar: Add Gen2 PHY setup to pcie-rcar
  PCI: rcar: Add runtime PM support to pcie-rcar
  PCI: designware: Make config accessor override checking symmetric
  PCI: ibmphp: Remove unneeded NULL test
  ARM: dts: ifc6410: enable PCIe DT node for this board
  ARM: dts: apq8064: add PCIe devicetree node
  PCI: hotplug: Use list_for_each_entry() to simplify code
  PCI: rcar: Remove unused pci_sys_data struct from pcie-rcar
  PCI: hisi: Add support for HiSilicon Hip06 PCIe host controllers
  PCI: Avoid iterating through memory outside the resource window
  PCI: acpiphp_ibm: Fix null dereferences on null ibm_slot
  ...
2016-01-21 11:52:16 -08:00
Alan Modra c153693d7e powerpc: Simplify module TOC handling
PowerPC64 uses the symbol .TOC. much as other targets use
_GLOBAL_OFFSET_TABLE_. It identifies the value of the GOT pointer (or in
powerpc parlance, the TOC pointer). Global offset tables are generally
local to an executable or shared library, or in the kernel, module. Thus
it does not make sense for a module to resolve a relocation against
.TOC. to the kernel's .TOC. value. A module has its own .TOC., and
indeed the powerpc64 module relocation processing ignores the kernel
value of .TOC. and instead calculates a module-local value.

This patch removes code involved in exporting the kernel .TOC., tweaks
modpost to ignore an undefined .TOC., and the module loader to twiddle
the section symbol so that .TOC. isn't seen as undefined.

Note that if the kernel was compiled with -msingle-pic-base then ELFv2
would not have function global entry code setting up r2. In that case
the module call stubs would need to be modified to set up r2 using the
kernel .TOC. value, requiring some of this code to be reinstated.

mpe: Furthermore a change in binutils master (not yet released) causes
the current way we handle the TOC to no longer work when building with
MODVERSIONS=y and RELOCATABLE=n. The symptom is that modules can not be
loaded due to there being no version found for TOC.

Cc: stable@vger.kernel.org # 3.16+
Signed-off-by: Alan Modra <amodra@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-21 14:10:56 +11:00
Daniel Axtens bf76f73c5f powerpc: enable UBSAN support
This hooks up UBSAN support for PowerPC.

So far it's found some interesting cases where we don't properly sanitise
input to shifts, including one in our futex handling.  Nothing critical,
but interesting and worth fixing.

[valentinrothberg@gmail.com: arch/powerpc/Kconfig: fix typo in select statement]
Signed-off-by: Daniel Axtens <dja@axtens.net>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Rasmus Villemoes a051216427 powerpc/fadump: rename cpu_online_mask member of struct fadump_crash_info_header
The four cpumasks cpu_{possible,online,present,active}_bits are exposed
readonly via the corresponding const variables cpu_xyz_mask.  But they are
also accessible for arbitrary writing via the exposed functions
set_cpu_xyz.  There's quite a bit of code throughout the kernel which
iterates over or otherwise accesses these bitmaps, and having the access
go via the cpu_xyz_mask variables is nowadays [1] simply a useless
indirection.

It may be that any problem in CS can be solved by an extra level of
indirection, but that doesn't mean every extra indirection solves a
problem.  In this case, it even necessitates some minor ugliness (see
4/6).

Patch 1/6 is new in v2, and fixes a build failure on ppc by renaming a
struct member, to avoid problems when the identifier cpu_online_mask
becomes a macro later in the series.  The next four patches eliminate the
cpu_xyz_mask variables by simply exposing the actual bitmaps, after
renaming them to discourage direct access - that still happens through
cpu_xyz_mask, which are now simply macros with the same type and value as
they used to have.

After that, there's no longer any reason to have the setter functions be
out-of-line: The boolean parameter is almost always a literal true or
false, so by making them static inlines they will usually compile to one
or two instructions.

For a defconfig build on x86_64, bloat-o-meter says we save ~3000 bytes.
We also save a little stack (stackdelta says 127 functions have a 16 byte
smaller stack frame, while two grow by that amount).  Mostly because, when
iterating over the mask, gcc typically loads the value of cpu_xyz_mask
into a callee-saved register and from there into %rdi before each
find_next_bit call - now it can just load the appropriate immediate
address into %rdi before each call.

[1] See Rusty's kind explanation
http://thread.gmane.org/gmane.linux.kernel/2047078/focus=2047722 for
some historic context.

This patch (of 6):

As preparation for eliminating the indirect access to the various global
cpu_*_bits bitmaps via the pointer variables cpu_*_mask, rename the
cpu_online_mask member of struct fadump_crash_info_header to simply
online_mask, thus allowing cpu_online_mask to become a macro.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Linus Torvalds f689b742f2 powerpc updates for 4.5
- Ground work for the new Power9 MMU from Aneesh Kumar K.V
  - Optimise FP/VMX/VSX context switching from Anton Blanchard
 
  - Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica Gupta,
    Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling, Andrew Donnellan
  - Allow wrapper to work on non-english system from Laurent Vivier
  - Add rN aliases to the pt_regs_offset table from Rashmica Gupta
  - Fix module autoload for rackmeter & axonram drivers from Luis de Bethencourt
  - Include KVM guest test in all interrupt vectors from Paul Mackerras
  - Fix DSCR inheritance over fork() from Anton Blanchard
  - Make value-returning atomics & {cmp}xchg* & their atomic_ versions fully ordered from Boqun Feng
  - Print MSR TM bits in oops messages from Michael Neuling
  - Add TM signal return & invalid stack selftests from Michael Neuling
  - Limit EPOW reset event warnings from Vipin K Parashar
  - Remove the Cell QPACE code from Rashmica Gupta
  - Append linux_banner to exception information in xmon from Rashmica Gupta
  - Add selftest to check if VSRs are corrupted from Rashmica Gupta
  - Remove broken GregorianDay() from Daniel Axtens
  - Import Anton's context_switch2 benchmark into selftests from Michael Ellerman
  - Add selftest script to test HMI functionality from Daniel Axtens
  - Remove obsolete OPAL v2 support from Stewart Smith
  - Make enter_rtas() private from Michael Ellerman
  - PPR exception cleanups from Michael Ellerman
  - Add page soft dirty tracking from Laurent Dufour
  - Add support for Nvlink NPUs from Alistair Popple
  - Add support for kexec on 476fpe from Alistair Popple
  - Enable kernel CPU dlpar from sysfs from Nathan Fontenot
  - Copy only required pieces of the mm_context_t to the paca from Michael Neuling
  - Add a kmsg_dumper that flushes OPAL console output on panic from Russell Currey
  - Implement save_stack_trace_regs() to enable kprobe stack tracing from Steven Rostedt
  - Add HWCAP bits for Power9 from Michael Ellerman
  - Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V
  - Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins
  - scripts/recordmcount.pl: support data in text section on powerpc from Ulrich Weigand
  - Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand
 
  - cxl: Fix possible idr warning when contexts are released from Vaibhav Jain
  - cxl: use correct operator when writing pcie config space values from Andrew Donnellan
  - cxl: Fix DSI misses when the context owning task exits from Vaibhav Jain
  - cxl: fix build for GCC 4.6.x from Brian Norris
  - cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris
  - cxl: Enable PCI device ID for future IBM CXL adapter from Uma Krishnan
 
  - Freescale updates from Scott: Highlights include moving QE code out of
    arch/powerpc (to be shared with arm), device tree updates, and minor fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWmIxeAAoJEFHr6jzI4aWAA+cQAIXAw4WfVWJ2V4ZK+1eKfB57
 fdXG71PuXG+WYIWy71ly8keLHdzzD1NQ2OUB64bUVRq202nRgVc15ZYKRJ/FE/sP
 SkxaQ2AG/2kI2EflWshOi0Lu9qaZ+LMHJnszIqE/9lnGSB2kUI/cwsSXgziiMKXR
 XNci9v14SdDd40YV/6BSZXoxApwyq9cUbZ7rnzFLmz4hrFuKmB/L3LABDF8QcpH7
 sGt/YaHGOtqP0UX7h5KQTFLGe1OPvK6NWixSXeZKQ71ED6cho1iKUEOtBA9EZeIN
 QM5JdHFWgX8MMRA0OHAgidkSiqO38BXjmjkVYWoIbYz7Zax3ThmrDHB4IpFwWnk3
 l7WBykEXY7KEqpZzbh0GFGehZWzVZvLnNgDdvpmpk/GkPzeYKomBj7ZZfm3H1yGD
 BTHPwuWCTX+/K75yEVNO8aJO12wBg7DRl4IEwBgqhwU8ga4FvUOCJkm+SCxA1Dnn
 qlpS7qPwTXNIEfKMJcxp5X0KiwDY1EoOotd4glTN0jbeY5GEYcxe+7RQ302GrYxP
 zcc8EGLn8h6BtQvV3ypNHF5l6QeTW/0ZlO9c236tIuUQ5gQU39SQci7jQKsYjSzv
 BB1XdLHkbtIvYDkmbnr1elbeJCDbrWL9rAXRUTRyfuCzaFWTfZmfVNe8c8qwDMLk
 TUxMR/38aI7bLcIQjwj9
 =R5bX
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Core:
   - Ground work for the new Power9 MMU from Aneesh Kumar K.V
   - Optimise FP/VMX/VSX context switching from Anton Blanchard

  Misc:
   - Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica
     Gupta, Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling,
     Andrew Donnellan
   - Allow wrapper to work on non-english system from Laurent Vivier
   - Add rN aliases to the pt_regs_offset table from Rashmica Gupta
   - Fix module autoload for rackmeter & axonram drivers from Luis de
     Bethencourt
   - Include KVM guest test in all interrupt vectors from Paul Mackerras
   - Fix DSCR inheritance over fork() from Anton Blanchard
   - Make value-returning atomics & {cmp}xchg* & their atomic_ versions
     fully ordered from Boqun Feng
   - Print MSR TM bits in oops messages from Michael Neuling
   - Add TM signal return & invalid stack selftests from Michael Neuling
   - Limit EPOW reset event warnings from Vipin K Parashar
   - Remove the Cell QPACE code from Rashmica Gupta
   - Append linux_banner to exception information in xmon from Rashmica
     Gupta
   - Add selftest to check if VSRs are corrupted from Rashmica Gupta
   - Remove broken GregorianDay() from Daniel Axtens
   - Import Anton's context_switch2 benchmark into selftests from
     Michael Ellerman
   - Add selftest script to test HMI functionality from Daniel Axtens
   - Remove obsolete OPAL v2 support from Stewart Smith
   - Make enter_rtas() private from Michael Ellerman
   - PPR exception cleanups from Michael Ellerman
   - Add page soft dirty tracking from Laurent Dufour
   - Add support for Nvlink NPUs from Alistair Popple
   - Add support for kexec on 476fpe from Alistair Popple
   - Enable kernel CPU dlpar from sysfs from Nathan Fontenot
   - Copy only required pieces of the mm_context_t to the paca from
     Michael Neuling
   - Add a kmsg_dumper that flushes OPAL console output on panic from
     Russell Currey
   - Implement save_stack_trace_regs() to enable kprobe stack tracing
     from Steven Rostedt
   - Add HWCAP bits for Power9 from Michael Ellerman
   - Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V
   - Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins
   - scripts/recordmcount.pl: support data in text section on powerpc
     from Ulrich Weigand
   - Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand

  cxl:
   - cxl: Fix possible idr warning when contexts are released from
     Vaibhav Jain
   - cxl: use correct operator when writing pcie config space values
     from Andrew Donnellan
   - cxl: Fix DSI misses when the context owning task exits from Vaibhav
     Jain
   - cxl: fix build for GCC 4.6.x from Brian Norris
   - cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris
   - cxl: Enable PCI device ID for future IBM CXL adapter from Uma
     Krishnan

  Freescale:
   - Freescale updates from Scott: Highlights include moving QE code out
     of arch/powerpc (to be shared with arm), device tree updates, and
     minor fixes"

* tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (149 commits)
  powerpc/module: Handle R_PPC64_ENTRY relocations
  scripts/recordmcount.pl: support data in text section on powerpc
  powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
  powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
  powerpc/mm: Fix _PAGE_PTE breaking swapoff
  cxl: Enable PCI device ID for future IBM CXL adapter
  cxl: use -Werror only with CONFIG_PPC_WERROR
  cxl: fix build for GCC 4.6.x
  powerpc: Add HWCAP bits for Power9
  powerpc/powernv: Reserve PE#0 on NPU
  powerpc/powernv: Change NPU PE# assignment
  powerpc/powernv: Fix update of NVLink DMA mask
  powerpc/powernv: Remove misleading comment in pci.c
  powerpc: Implement save_stack_trace_regs() to enable kprobe stack tracing
  powerpc: Fix build break due to paca mm_context_t changes
  cxl: Fix DSI misses when the context owning task exits
  MAINTAINERS: Update Scott Wood's e-mail address
  powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery()
  powerpc: Fix style of self-test config prompts
  powerpc/powernv: Only delay opal_rtc_read() retry when necessary
  ...
2016-01-15 13:18:47 -08:00
Linus Torvalds 0f0836b7eb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching
Pull livepatching updates from Jiri Kosina:

 - RO/NX attribute fixes for patch module relocations from Josh
   Poimboeuf.  As part of this effort, module.c has been cleaned up as
   well and livepatching is piggy-backing on this cleanup.  Rusty is OK
   with this whole lot going through livepatching tree.

 - symbol disambiguation support from Chris J Arges.  That series is
   also

        Reviewed-by: Miroslav Benes <mbenes@suse.cz>

   but this came in only after I've alredy pushed out.  Didn't want to
   rebase because of that, hence I am mentioning it here.

 - symbol lookup fix from Miroslav Benes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
  livepatch: Cleanup module page permission changes
  module: keep percpu symbols in module's symtab
  module: clean up RO/NX handling.
  module: use a structure to encapsulate layout.
  gcov: use within_module() helper.
  module: Use the same logic for setting and unsetting RO/NX
  livepatch: function,sympos scheme in livepatch sysfs directory
  livepatch: add sympos as disambiguator field to klp_reloc
  livepatch: add old_sympos as disambiguator field to klp_func
2016-01-14 16:38:02 -08:00
Ulrich Weigand a61674bdfc powerpc/module: Handle R_PPC64_ENTRY relocations
GCC 6 will include changes to generated code with -mcmodel=large,
which is used to build kernel modules on powerpc64le.  This was
necessary because the large model is supposed to allow arbitrary
sizes and locations of the code and data sections, but the ELFv2
global entry point prolog still made the unconditional assumption
that the TOC associated with any particular function can be found
within 2 GB of the function entry point:

func:
	addis r2,r12,(.TOC.-func)@ha
	addi  r2,r2,(.TOC.-func)@l
	.localentry func, .-func

To remove this assumption, GCC will now generate instead this global
entry point prolog sequence when using -mcmodel=large:

	.quad .TOC.-func
func:
	.reloc ., R_PPC64_ENTRY
	ld    r2, -8(r12)
	add   r2, r2, r12
	.localentry func, .-func

The new .reloc triggers an optimization in the linker that will
replace this new prolog with the original code (see above) if the
linker determines that the distance between .TOC. and func is in
range after all.

Since this new relocation is now present in module object files,
the kernel module loader is required to handle them too.  This
patch adds support for the new relocation and implements the
same optimization done by the GNU linker.

Cc: stable@vger.kernel.org
Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-13 12:37:05 +11:00
Steven Rostedt 35de3b1aa1 powerpc: Implement save_stack_trace_regs() to enable kprobe stack tracing
It has come to my attention that kprobe event stack tracing does not
work on powerpc. You can see with the following:

  # cd /sys/kernel/debug/tracing
  # echo stacktrace > trace_options
  # echo 'p kfree' > kprobe_events
  # echo 1 > events/kprobes/enable

Will print the following warning:
  save_stack_trace_regs() not implemented yet.

Although save_stack_trace() (which normal event stack traces use) is
implemented, save_stack_trace_regs() which kprobe events use is not.
This is a cheap attempt to implement that function.

Note, This may have issues if a task tries to get a stack trace from
another task with its regs, because it just passes in "current" to
save_context_stack(). But this does solve the issue with stack tracing
kprobe events.

Reported-by: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-11 14:27:28 +11:00
Al Viro 7e935c7ca1 Merge branch 'memdup_user_nul' into work.misc 2016-01-04 10:25:34 -05:00
Michael Neuling 2fc251a8dd powerpc: Copy only required pieces of the mm_context_t to the paca
Currently we copy the whole mm_context_t to the paca but only access a
few bits of it.  This is wasteful of space paca and also takes quite
some time in the hot path of context switching.

This patch pulls in only the required bits from the mm_context_t to
the paca and on context switch, copies only those.

Benchmarking this (On top of Anton's recent MSR context switching
changes [1]) using processes and yield shows an improvement of almost
3% on POWER8:

  http://ozlabs.org/~anton/junkcode/context_switch2.c
  ./context_switch2 --test=yield --process 0 0

1. https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135700.html

Signed-off-by: Michael Neuling <mikey@neuling.org>
[mpe: Rename paca fields to be mm_ctx_foo rather than context_foo]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-27 19:12:14 +11:00
Michael Neuling c395465da6 powerpc: Add function to copy mm_context_t to the paca
This adds a function to copy the mm->context to the paca.  This is
only a basic conversion for now but will be used more extensively in
the next patch.

This also adds #ifdef CONFIG_PPC_BOOK3S around this code since it's
not used elsewhere.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-19 22:13:12 +11:00
Daniel Axtens 1b855e167b powerpc: Add missing calls to va_end()
cppcheck picked up that there were a couple of missing va_end()
calls in functions using va_start().

Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 23:23:22 +11:00
Alistair Popple 4450022b49 powerpc/476fpe: Add support for kexec
PPC476FPE has a different PVR from previous PPC476 processors. The
kexec code checks the PVR in order to correctly setup the MMU. When
the initial support for 476FPE processors was added the corresponding
change in the kexec code was missed. This patch simply adds the check
and solves the following bug on kexec:

kexec: Starting new kernel
Bye!
Unable to handle kernel paging request for instruction fetch
Faulting instruction address: 0xee9a50f8
cpu 0x0: Vector: 400 (Instruction Access) at [ee9d7d20]
    pc: ee9a50f8
    lr: ee9a50e4
    sp: ee9d7dd0
    msr: 21020
    current = 0xee40f000
    pid   = 960, comm = kexec
enter ? for help
[link register   ] ee9a50e4
[ee9d7dd0] c0013748 default_machine_kexec+0x58/0x70 (unreliable)
[ee9d7df0] c0012f04 machine_kexec+0x34/0x40
[ee9d7e00] c00aa1ec kernel_kexec+0x9c/0xb0
[ee9d7e20] c005d704 SyS_reboot+0x1f4/0x220
[ee9d7f40] c000db68 ret_from_syscall+0x0/0x3c

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:00 +11:00
Michael Ellerman 2613265cb5 powerpc/kernel: Combine vec/loc for STD_EXCEPTION_PSERIES
The STD_EXCEPTION_PSERIES macro takes both a vector number, and a
location (memory address). However both are always identical, so combine
them to save repeating ourselves.

This does mean an exception handler must always exist at the location in
memory that matches its vector number. But that's OK because this is the
"STD" macro (standard), which does exactly that. We have other macros
for the other cases, eg. STD_EXCEPTION_PSERIES_OOL (out of line).

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:58 +11:00
Michael Ellerman d8725ce86c powerpc/kernel: Open code SET_DEFAULT_THREAD_PPR
This is only used in one location, open code it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:57 +11:00
Michael Ellerman d030a4b5eb powerpc/kernel: Open code HMT_MEDIUM_LOW_HAS_PPR
HMT_MEDIUM_LOW_HAS_PPR is only used in once place, open code it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:57 +11:00
Michael Ellerman d6265aeaf8 powerpc/kernel: Drop HMT_MEDIUM_PPR_DISCARD
HMT_MEDIUM_PPR_DISCARD is a macro which is present at the start of most
of our first level exception handlers. It conditionally executes a
HMT_MEDIUM instruction, which sets the processor priority to medium.

On on modern systems, ie. Power7 and later, it is nop'ed out at boot.
All it does is make the exception vectors more cramped, and consume 4
bytes of icache.

On old systems it has the effect of boosting the processor priority at
the start of exception processing. If we were previously in the idle
loop for example, we may be at low or very low priority. This is
desirable as we want to process the exception as fast as possible.

However looking closely at the generated code, we see that in all cases
we execute another HMT_MEDIUM just four instructions later. With code
patching applied, the final code on an old (Power6) system will look
like, eg:

  c000000000000300 <data_access_pSeries>:
  c000000000000300:	7c 42 13 78	mr	r2,r2		<-
  c000000000000304:	7d b2 43 a6	mtsprg	2,r13
  c000000000000308:	7d b1 42 a6	mfsprg	r13,1
  c00000000000030c:	f9 2d 00 80	std	r9,128(r13)
  c000000000000310:	60 00 00 00	nop
  c000000000000314:	7c 42 13 78	mr	r2,r2		<-

So I suggest that the added code complexity of HMT_MEDIUM_PPR_DISCARD is
not justified by the benefit of boosting the processor priority for the
duration of four instructions, and therefore we drop it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:57 +11:00
Michael Ellerman cd5cdeb6c8 powerpc/rtas: Make enter_rtas() private
There are no longer any users of enter_rtas() outside of rtas.c, so make
it "private", by moving the declaration inside rtas.c. Hopefully this
will encourage people to use one of the wrappers which takes the sharp
edges off the RTAS calling sequence.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:56 +11:00
Michael Ellerman 4456f45246 powerpc/rtas: Use rtas_call_unlocked() in call_rtas_display_status()
Although call_rtas_display_status() does actually want to use the
regular RTAS locking, it doesn't want the extra logic that is in
rtas_call(), so currently it open codes the logic.

Instead we can use rtas_call_unlocked(), after taking the RTAS lock.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:56 +11:00
Michael Ellerman 209eb4e5cb powerpc/rtas: Add rtas_call_unlocked()
Most users of RTAS (Run-Time Abstraction Services) use rtas_call(),
which deals with locking as well as endian handling.

However we have two users outside of rtas.c that can't use rtas_call()
because they have different locking requirements.

The hotplug CPU code can't take the RTAS lock because the CPU would go
offline with the lock held and no other CPUs would be able to call RTAS
until the CPU came back online.

The xmon code doesn't want to take the lock because it would risk dead
locking when we are trying to recover from a crash.

Both sites required multiple patches when we added little endian
support, proving that programmers can't do endian right.

Although that ship has sailed, we can still clean the code up by
providing an unlocked version of rtas_call() which avoids the need to
open code the logic elsewhere.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:55 +11:00
Daniel Axtens 00b912b0c8 powerpc: Remove broken GregorianDay()
GregorianDay() is supposed to calculate the day of the week
(tm->tm_wday) for a given day/month/year. In that calcuation it
indexed into an array called MonthOffset using tm->tm_mon-1. However
tm_mon is zero-based, not one-based, so this is off-by-one. It also
means that every January, GregoiranDay() will access element -1 of
the MonthOffset array.

It also doesn't appear to be a correct algorithm either: see in
contrast kernel/time/timeconv.c's time_to_tm function.

It's been broken forever, which suggests no-one in userland uses
this. It looks like no-one in the kernel uses tm->tm_wday either
(see e.g. drivers/rtc/rtc-ds1305.c:319).

tm->tm_wday is conventionally set to -1 when not available in
hardware so we can simply set it to -1 and drop the function.
(There are over a dozen other drivers in drivers/rtc that do
this.)

Found using UBSAN.

Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrew Morton <akpm@linux-foundation.org> # as an example of what UBSan finds.
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: rtc-linux@googlegroups.com
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-16 12:54:04 +11:00
Michael Ellerman 1901d8bb45 powerpc fixes for 4.4 #2
- tm: Block signal return from setting invalid MSR state from Michael Neuling
  - tm: Check for already reclaimed tasks from Michael Neuling
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWVul+AAoJEFHr6jzI4aWAUZAQAK2s+E4WTtcExXSG1bqq05o6
 5wCLtIq6M91h5HDpgBfN7S5OXpRd72ZeIVdzC0HkFeLBF3y7NSHSEezUw4g/GGfz
 K2xGV1CCXC3Rb3qyHSdyi6+c1AnLVRPBVzVxPVmlXigrXeFiQ4613YW9rzf9b8fs
 oktUciwW9aHbrIv7g8f82gpuk9jwwhp/sF+1H/7fGOozT4CFsKo4wj4HOOCBwH4y
 ODEjs6Z+9Uwb6Kfvi/rn3k4XA1wC36WFq3ORI6KrmK/ZB1eR0Kwf0IELYpMj8cOX
 q5ZtCH7t68f9vmEK2B34AUijf/amm+2vLwvF6xAuZJFPUPZtgMBdRcqkLalbtPAO
 8hlyPPgoZcgR/Of+lEYxUobcL0SMNufXwmfwRO35ktkm9Z9Ee96C8NNbpybBSDXL
 YRa6is5MeO4GL8Gbcc0TA50hGjok7o3acGE6HSAReyzf0guQ4xqcif7+6lfWZPkk
 P3aM02ajp2qoqyjhT/Ei6JlMptAiuQY+HvELFneqn5s9nDbv6cGuYZNNap0c1fK+
 74W0p7MiZh7+IF5HpyUIeYV836inXMDIoKzjA6H3OWitk/1lbcrbF34Qpz9zAWZn
 YF3w786ZzzQLw0jcALaqZejm58MLGIakO4MNDB0/ZBh0nKKfEV8WvPOjev78OAp1
 +pDrJh0iQzrujN6OMKQ0
 =IL8A
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.4-3' into next

Merge the two TM fixes we merged in 4.4. We are about to merge selftests
for these, and without the fixes the selftests will oops.

powerpc fixes for 4.4 #2

 - tm: Block signal return from setting invalid MSR state from Michael Neuling
 - tm: Check for already reclaimed tasks from Michael Neuling
2015-12-14 20:40:32 +11:00
Michael Neuling 801c0b2c4d powerpc: Print MSR TM bits in oops messages
Print MSR TM bits in oops messages.  This appends them to the end
like this:

    MSR: 8000000502823031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[TE]>

You get the TM[] only if at least one TM MSR bit is set.  Inside the
TM[], E means Enabled (bit 32), S means Suspended (bit 33), and T
means Transactional (bit 34)

If no bits are set, you get no TM[] output.

Include rework of printbits() to handle this case.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-14 20:40:26 +11:00
Aneesh Kumar K.V 106713a145 powerpc/mm: Remove the dependency on pte bit position in asm code
We should not expect pte bit position in asm code. Simply
by moving part of that to C

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-14 15:19:10 +11:00
Bjorn Helgaas 800e07b609 Merge branches 'pci/aspm', 'pci/hotplug', 'pci/misc' and 'pci/msi' into next
* pci/aspm:
  PCI/ASPM: Make sysfs link_state_store() consistent with link_state_show()

* pci/hotplug:
  PCI: pciehp: Always protect pciehp_disable_slot() with hotplug mutex

* pci/misc:
  x86/PCI: Simplify pci_bios_{read,write}
  PCI: Simplify config space size computation
  PCI: Limit config space size for Netronome NFP6000 family
  PCI: Add Netronome vendor and device IDs
  PCI: Support PCIe devices with short cfg_size
  x86/PCI: Clarify AMD Fam10h config access restrictions comment
  PCI: Print warnings for all invalid expansion ROM headers
  PCI: Check for PCI_HEADER_TYPE_BRIDGE equality, not bitmask

* pci/msi:
  PCI/MSI: Remove empty pci_msi_init_pci_dev()
  PCI/MSI: Initialize MSI capability for all architectures
2015-12-10 19:40:14 -06:00
Bjorn Helgaas 93de690176 PCI: Check for PCI_HEADER_TYPE_BRIDGE equality, not bitmask
Bit 7 of the "Header Type" register indicates a multi-function device when
set.  Bits 0-6 contain encoded values, where 0x1 indicates a PCI-PCI
bridge.  It is incorrect to test this as though it were a mask.

For example, while the PCI 3.0 spec only defines values 0x0, 0x1, and 0x2,
it's conceivable that a future spec could define 0x3 to mean something
else; then tests for "(hdr_type & 0x7f) & PCI_HEADER_TYPE_BRIDGE" would
incorrectly succeed for this new 0x3 header type.

Test bits 0-6 of the Header Type for equality with PCI_HEADER_TYPE_BRIDGE.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-12-10 19:38:06 -06:00
Anton Blanchard db1231dcdb powerpc: Fix DSCR inheritance over fork()
Two DSCR tests have a hack in them:

	/*
	 * XXX: Force a context switch out so that DSCR
	 * current value is copied into the thread struct
	 * which is required for the child to inherit the
	 * changed value.
	 */
	sleep(1);

We should not be working around this in the testcase, it is a kernel bug.
Fix it by copying the current DSCR to the child, instead of what we
had in the thread struct at last context switch.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-10 21:11:13 +11:00
Anton Blanchard 20dbe67062 powerpc: Call restore_sprs() before _switch()
commit 152d523e63 ("powerpc: Create context switch helpers save_sprs()
and restore_sprs()") moved the restore of SPRs after the call to _switch().

There is an issue with this approach - new tasks do not return through
_switch(), they are set up by copy_thread() to directly return through
ret_from_fork() or ret_from_kernel_thread(). This means restore_sprs() is
not getting called for new tasks.

Fix this by moving restore_sprs() before _switch().

Fixes: 152d523e63 ("powerpc: Create context switch helpers save_sprs() and restore_sprs()")
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-10 21:10:55 +11:00
Anton Blanchard d64d02ce4e powerpc: Call check_if_tm_restore_required() in enable_kernel_*()
Commit a0e72cf12b ("powerpc: Create msr_check_and_{set,clear}()")
removed a call to check_if_tm_restore_required() in the
enable_kernel_*() functions. Add them back in.

Fixes: a0e72cf12b ("powerpc: Create msr_check_and_{set,clear}()")
Reported-by: Rashmica Gupta <rashmicy@gmail.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-10 20:10:53 +11:00
Al Viro b808b1d632 don't open-code generic_file_llseek_size()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-09 13:00:45 -05:00
Andrew Donnellan dc9c41bd9e Revert "powerpc/eeh: Don't unfreeze PHB PE after reset"
This reverts commit 527d10ef3a.

The reverted commit breaks cxlflash devices following an EEH reset (and
possibly other cxl devices, however this has not been tested).

The reverted commit changed the behaviour of eeh_reset_device() so that PHB
PEs are not unfrozen following the completion of the reset. This should not
be problematic, as no device resources should have been associated with the
PHB PE.

However, when attempting to load the cxlflash driver after a reset, the
driver attempts to read Vital Product Data through a call to
pci_read_vpd() (which is called on the physical cxl device, not on the
virtual AFU device). pci_read_vpd() in turn attempts to read from the cxl
device's config space. This fails, as the PE it's trying to read from is
still frozen. In turn, the driver gets an -ENODEV and fails to initialise.

It appears this issue only affects some parts of the VPD area, as "lspci
-vvv", which only reads a subset of the VPD bytes, is not broken by the
original patch.

At this stage, we don't fully understand why we're trying to read a frozen
PE, and we don't know how this affects other cxl devices. It is possible
that there is an underlying bug in the cxl driver or the powerpc CAPI
support code, or alternatively a bug in the PCI resource allocation/mapping
code that is incorrectly mapping resources to PE#0.

As such, this fix is incomplete, however it is necessary to prevent a
serious regression in CAPI support.

In the meantime, revert the commit, especially as it was intended to be a
non-functional change.

Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-09 14:05:10 +11:00
Rusty Russell 7523e4dc50 module: use a structure to encapsulate layout.
Makes it easier to handle init vs core cleanly, though the change is
fairly invasive across random architectures.

It simplifies the rbtree code immediately, however, while keeping the
core data together in the same cachline (now iff the rbtree code is
enabled).

Acked-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-12-04 22:46:25 +01:00
Anton Blanchard d1e1cf2e38 powerpc: clean up asm/switch_to.h
Remove a bunch of unnecessary fallback functions and group
things in a more logical way.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-02 19:34:41 +11:00
Anton Blanchard f3d885ccba powerpc: Rearrange __switch_to()
Most of __switch_to() is housekeeping, TLB batching, timekeeping etc.
Move these away from the more complex and critical context switching
code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-02 19:34:41 +11:00
Anton Blanchard 579e633e76 powerpc: create flush_all_to_thread()
Create a single function that flushes everything (FP, VMX, VSX, SPE).
Doing this all at once means we only do one MSR write.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-02 19:34:40 +11:00
Anton Blanchard c208505900 powerpc: create giveup_all()
Create a single function that gives everything up (FP, VMX, VSX, SPE).
Doing this all at once means we only do one MSR write.

A context switch microbenchmark using yield():

http://ozlabs.org/~anton/junkcode/context_switch2.c

./context_switch2 --test=yield --fp --altivec --vector 0 0

shows an improvement of 3% on POWER8.

Signed-off-by: Anton Blanchard <anton@samba.org>
[mpe: giveup_all() needs to be EXPORT_SYMBOL'ed]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-02 19:34:26 +11:00
Anton Blanchard 1f2e25b2d5 powerpc: Remove fp_enable() and vec_enable(), use msr_check_and_{set, clear}()
More consolidation of our MSR available bit handling.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:26 +11:00
Anton Blanchard 3eb5d5888d powerpc: Add ppc_strict_facility_enable boot option
Add a boot option that strictly manages the MSR unavailable bits.
This catches kernel uses of FP/Altivec/SPE that would otherwise
corrupt user state.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:26 +11:00
Anton Blanchard dc4fbba11e powerpc: Create disable_kernel_{fp,altivec,vsx,spe}()
The enable_kernel_*() functions leave the relevant MSR bits enabled
until we exit the kernel sometime later. Create disable versions
that wrap the kernel use of FP, Altivec VSX or SPE.

While we don't want to disable it normally for performance reasons
(MSR writes are slow), it will be used for a debug boot option that
does this and catches bad uses in other areas of the kernel.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:25 +11:00
Anton Blanchard a0e72cf12b powerpc: Create msr_check_and_{set,clear}()
Create helper functions to set and clear MSR bits after first
checking if they are already set. Grouping them will make it
easy to avoid the MSR writes in a subsequent optimisation.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:25 +11:00
Anton Blanchard a7d623d4d0 powerpc: Move part of giveup_vsx into c
Move the MSR modification into c. Removing it from the assembly
function will allow us to avoid costly MSR writes by batching them
up.

Check the FP and VMX bits before calling the relevant giveup_*()
function. This makes giveup_vsx() and flush_vsx_to_thread() perform
more like their sister functions, and allows us to use
flush_vsx_to_thread() in the signal code.

Move the check_if_tm_restore_required() check in.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:25 +11:00
Anton Blanchard 98da581e08 powerpc: Move part of giveup_fpu,altivec,spe into c
Move the MSR modification into new c functions. Removing it from
the low level functions will allow us to avoid costly MSR writes
by batching them up.

Move the check_if_tm_restore_required() check into these new functions.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:25 +11:00
Anton Blanchard b51b1153d0 powerpc: Remove NULL task struct pointer checks in FP and vector code
We used to allow giveup_*() to be called with a NULL task struct
pointer. Now those cases are handled in the caller we can remove
the checks. We can also remove giveup_altivec_notask() which is also
unused.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:25 +11:00
Anton Blanchard 611b0e5c19 powerpc: Create mtmsrd_isync()
mtmsrd_isync() will do an mtmsrd followed by an isync on older
processors. On newer processors we avoid the isync via a feature fixup.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:25 +11:00
Anton Blanchard b86fd2bd03 powerpc: Simplify TM restore checks
Instead of having multiple giveup_*_maybe_transactional() functions,
separate out the TM check into a new function called
check_if_tm_restore_required().

This will make it easier to optimise the giveup_*() functions in a
subsequent patch.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:24 +11:00
Anton Blanchard af1bbc3dd3 powerpc: Remove UP only lazy floating point and vector optimisations
The UP only lazy floating point and vector optimisations were written
back when SMP was not common, and neither glibc nor gcc used vector
instructions. Now SMP is very common, glibc aggressively uses vector
instructions and gcc autovectorises.

We want to add new optimisations that apply to both UP and SMP, but
in preparation for that remove these UP only optimisations.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:24 +11:00
Anton Blanchard 68bfa962bf powerpc: Remove redundant mflr in _switch
No need to execute mflr twice.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:24 +11:00
Anton Blanchard 152d523e63 powerpc: Create context switch helpers save_sprs() and restore_sprs()
Move all our context switch SPR save and restore code into two
helpers. We do a few optimisations:

- Group all mfsprs and all mtsprs. In many cases an mtspr sets a
scoreboarding bit that an mfspr waits on, so the current practise of
mfspr A; mtspr A; mfpsr B; mtspr B is the worst scheduling we can
do.

- SPR writes are slow, so check that the value is changing before
writing it.

A context switch microbenchmark using yield():

http://ozlabs.org/~anton/junkcode/context_switch2.c

./context_switch2 --test=yield 0 0

shows an improvement of almost 10% on POWER8.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:24 +11:00
Anton Blanchard af72ab646a powerpc: Don't disable MSR bits in do_load_up_transact_*() functions
Similar to the non TM load_up_*() functions, don't disable the MSR
bits on the way out.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:24 +11:00
Anton Blanchard 07e45c120c powerpc: Don't disable kernel FP/VMX/VSX MSR bits on context switch
Writing the MSR is slow, so we want to avoid it whenever possible.

A subsequent patch will add a debug option that strictly manages the
FP/VMX/VSX unavailable bits. For now just remove it, matching what
we do in other areas of the kernel (eg enable_kernel_altivec()).

A context switch microbenchmark using yield():

http://ozlabs.org/~anton/junkcode/context_switch2.c

./context_switch2 --test=yield --fp 0 0

shows an improvement of almost 3% on POWER8.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:24 +11:00
Paul Mackerras 31a40e2b05 powerpc/64: Include KVM guest test in all interrupt vectors
Currently, if HV KVM is configured but PR KVM isn't, we don't include
a test to see whether we were interrupted in KVM guest context for the
set of interrupts which get delivered directly to the guest by hardware
if they occur in the guest.  This includes things like program
interrupts.

However, the recent bug where userspace could set the MSR for a VCPU
to have an illegal value in the TS field, and thus cause a TM Bad Thing
type of program interrupt on the hrfid that enters the guest, showed that
we can never be completely sure that these interrupts can never occur
in the guest entry/exit code.  If one of these interrupts does happen
and we have HV KVM configured but not PR KVM, then we end up trying to
run the handler in the host with the MMU set to the guest MMU context,
which generally ends badly.

Thus, for robustness it is better to have the test in every interrupt
vector, so that if some way is found to trigger some interrupt in the
guest entry/exit path, we can handle it without immediately crashing
the host.

This means that the distinction between KVMTEST and KVMTEST_PR goes
away.  Thus we delete KVMTEST_PR and associated macros and use KVMTEST
everywhere that we previously used either KVMTEST_PR or KVMTEST.  It
also means that SOFTEN_TEST_HV_201 becomes the same as SOFTEN_TEST_PR,
so we deleted SOFTEN_TEST_HV_201 and use SOFTEN_TEST_PR instead.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01 13:52:23 +11:00
Rashmica Gupta 343c3327c1 powerpc: Add rN aliases to the pt_regs_offset table.
It is common practice with powerpc to use 'rN' to refer to register 'N'. However
when using the pt_regs_offset table we have to use 'gprN'.

So add aliases such that both 'rN' and 'gprN' can be used.

For example, we can currently do:
  $ su -
  $ cd /sys/kernel/debug/tracing
  $ echo "p:probe/sys_fchownat sys_fchownat %gpr3:s32 +0(%gpr4):string %gpr5:s32 %gpr6:s32 %gpr7:s32" > kprobe_events
  $ echo 1 > events/probe/sys_fchownat/enable
  $ touch /tmp/foo
  $ chown root /tmp/foo
  $ echo 0 > events/enable
  $ cat trace
    chown-2925  [014] d...    76.160657: sys_fchownat: (SyS_fchownat+0x8/0x1a0) arg1=-100 arg2="/tmp/foo" arg3=0 arg4=-1 arg5=0

Instead we'd like to be able to use:
 $ echo "p:probe/sys_fchownat sys_fchownat %r3:s32 +0(%r4):string %r5:s32 %r6:s32 %r7:s32" > kprobe_events

Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-26 22:11:17 +11:00
Rashmica Gupta f43194e458 powerpc: Standardise on NR_syscalls rather than __NR_syscalls.
Most architectures use NR_syscalls as the #define for the number of syscalls.

We use __NR_syscalls, and then define NR_syscalls as __NR_syscalls.

__NR_syscalls is not used outside arch code, whereas NR_syscalls is. So as
NR_syscalls must be defined and __NR_syscalls does not, replace __NR_syscalls
with NR_syscalls.

Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-26 22:11:17 +11:00
Rashmica Gupta cdfc8ed690 powerpc: Remove unused function trace_syscall()
This function has been unused since commit 14cf11af6c ("powerpc: Merge enough
to start building in arch/powerpc."), so remove it.

Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-26 22:11:16 +11:00
Guilherme G. Piccoli e80e7edc55 PCI/MSI: Initialize MSI capability for all architectures
1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't
support MSI") moved dev->msi_cap and dev->msix_cap initialization from the
pci_init_capabilities() path (used on all architectures) to the
pci_setup_device() path (not used on Open Firmware architectures).

This broke MSI or MSI-X on Open Firmware machines.  4d9aac397a
("powerpc/PCI: Disable MSI/MSI-X interrupts at PCI probe time in OF case")
fixed it for PowerPC but not for SPARC.

Set up MSI and MSI-X (initialize msi_cap and msix_cap and disable MSI and
MSI-X) in pci_init_capabilities() so all architectures do it the same way.

This reverts 4d9aac397a since this patch fixes the problem generically
for both PowerPC and SPARC.

[bhelgaas: changelog, make pci_msi_setup_pci_dev() static]
Fixes: 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-11-24 17:45:18 -06:00
Michael Neuling 7f821fc9c7 powerpc/tm: Check for already reclaimed tasks
Currently we can hit a scenario where we'll tm_reclaim() twice.  This
results in a TM bad thing exception because the second reclaim occurs
when not in suspend mode.

The scenario in which this can happen is the following.  We attempt to
deliver a signal to userspace.  To do this we need obtain the stack
pointer to write the signal context.  To get this stack pointer we
must tm_reclaim() in case we need to use the checkpointed stack
pointer (see get_tm_stackpointer()).  Normally we'd then return
directly to userspace to deliver the signal without going through
__switch_to().

Unfortunatley, if at this point we get an error (such as a bad
userspace stack pointer), we need to exit the process.  The exit will
result in a __switch_to().  __switch_to() will attempt to save the
process state which results in another tm_reclaim().  This
tm_reclaim() now causes a TM Bad Thing exception as this state has
already been saved and the processor is no longer in TM suspend mode.
Whee!

This patch checks the state of the MSR to ensure we are TM suspended
before we attempt the tm_reclaim().  If we've already saved the state
away, we should no longer be in TM suspend mode.  This has the
additional advantage of checking for a potential TM Bad Thing
exception.

Found using syscall fuzzer.

Fixes: fb09692e71 ("powerpc: Add reclaim and recheckpoint functions for context switching transactional memory processes")
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-23 20:18:03 +11:00
Michael Neuling d2b9d2a5ad powerpc/tm: Block signal return setting invalid MSR state
Currently we allow both the MSR T and S bits to be set by userspace on
a signal return.  Unfortunately this is a reserved configuration and
will cause a TM Bad Thing exception if attempted (via rfid).

This patch checks for this case in both the 32 and 64 bit signals
code.  If both T and S are set, we mark the context as invalid.

Found using a syscall fuzzer.

Fixes: 2b0a576d15 ("powerpc: Add new transactional memory state to the signal context")
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-23 20:06:31 +11:00
Linus Torvalds 2f4bf528ec powerpc updates for 4.4
- Kconfig: remove BE-only platforms from LE kernel build from Boqun Feng
  - Refresh ps3_defconfig from Geoff Levand
  - Emit GNU & SysV hashes for the vdso from Michael Ellerman
  - Define an enum for the bolted SLB indexes from Anshuman Khandual
  - Use a local to avoid multiple calls to get_slb_shadow() from Michael Ellerman
  - Add gettimeofday() benchmark from Michael Neuling
  - Avoid link stack corruption in __get_datapage() from Michael Neuling
  - Add virt_to_pfn and use this instead of opencoding from Aneesh Kumar K.V
  - Add ppc64le_defconfig from Michael Ellerman
  - pseries: extract of_helpers module from Andy Shevchenko
  - Correct string length in pseries_of_derive_parent() from Nathan Fontenot
  - Free the MSI bitmap if it was slab allocated from Denis Kirjanov
  - Shorten irq_chip name for the SIU from Christophe Leroy
  - Wait 1s for secondaries to enter OPAL during kexec from Samuel Mendoza-Jonas
  - Fix _ALIGN_* errors due to type difference. from Aneesh Kumar K.V
  - powerpc/pseries/hvcserver: don't memset pi_buff if it is null from Colin Ian King
  - Disable hugepd for 64K page size. from Aneesh Kumar K.V
  - Differentiate between hugetlb and THP during page walk from Aneesh Kumar K.V
  - Make PCI non-optional for pseries from Michael Ellerman
  - Individual System V IPC system calls from Sam bobroff
  - Add selftest of unmuxed IPC calls from Michael Ellerman
  - discard .exit.data at runtime from Stephen Rothwell
  - Delete old orphaned PrPMC 280/2800 DTS and boot file. from Paul Gortmaker
  - Use of_get_next_parent to simplify code from Christophe Jaillet
  - Paginate some xmon output from Sam bobroff
  - Add some more elements to the xmon PACA dump from Michael Ellerman
  - Allow the tm-syscall selftest to build with old headers from Michael Ellerman
  - Run EBB selftests only on POWER8 from Denis Kirjanov
  - Drop CONFIG_TUNE_CELL in favour of CONFIG_CELL_CPU from Michael Ellerman
  - Avoid reference to potentially freed memory in prom.c from Christophe Jaillet
  - Quieten boot wrapper output with run_cmd from Geoff Levand
  - EEH fixes and cleanups from Gavin Shan
  - Fix recursive fenced PHB on Broadcom shiner adapter from Gavin Shan
  - Use of_get_next_parent() in of_get_ibm_chip_id() from Michael Ellerman
  - Fix section mismatch warning in msi_bitmap_alloc() from Denis Kirjanov
  - Fix ps3-lpm white space from Rudhresh Kumar J
  - Fix ps3-vuart null dereference from Colin King
  - nvram: Add missing kfree in error path from Christophe Jaillet
  - nvram: Fix function name in some errors messages. from Christophe Jaillet
  - drivers/macintosh: adb: fix misleading Kconfig help text from Aaro Koskinen
  - agp/uninorth: fix a memleak in create_gatt_table from Denis Kirjanov
  - cxl: Free virtual PHB when removing from Andrew Donnellan
  - scripts/kconfig/Makefile: Allow KBUILD_DEFCONFIG to be a target from Michael Ellerman
  - scripts/kconfig/Makefile: Fix KBUILD_DEFCONFIG check when building with O= from Michael Ellerman
 
  - Freescale updates from Scott: Highlights include 64-bit book3e kexec/kdump
    support, a rework of the qoriq clock driver, device tree changes including
    qoriq fman nodes, support for a new 85xx board, and some fixes.
 
  - MPC5xxx updates from Anatolij: Highlights include a driver for MPC512x
    LocalPlus Bus FIFO with its device tree binding documentation, mpc512x
    device tree updates and some minor fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWPEZgAAoJEFHr6jzI4aWANjYQAKX2Q/95hqKfCuF5FBcUmtMC
 Pu/Nff027MVzxZ2ApDcvvLGps5Nz2bn3nIhc9zjkXc5E8DuL6X3Yl8ce7qyNcc3g
 cJJ8RvtUo6J1OMWetXFehtPYniAAwKMhZYKnj0+WnLr2SyH/Vhl3ehDkFbGyPtuH
 r+2E7krFjfVgU+bzciIFnOaDekFuFN/pXWMb6e6zQyBJe9N8ZIp96uouGCebKVd0
 VDLItzdaKErT8JFfbymMPvZm3V0rMVx4WWu3kAbQX8LrD5a18NF1zrjAOHRXc61n
 kkk8/DPuNOon1PbXXyiS5BcFyZRe+KE3VBnoW5sOMqMIRg5WdO1oU3e2pEfXMO8+
 leXYwFLXiKzUZuOgQG2QiUhrzD2yC1o6/TJWATv0dSl9AwrecgPX+Vj6X357slAf
 A9E3eMy5tgnpndBWZmvZS3W7YDKH+NkeZ+Q40+NErAlqr++ErrTcKVndk5vWlYTT
 7mMZeTXagX66al/k5ATKqwB7iUSpnYHSAa9fcUYPSM2FnXsDxPyeJGkBbcoOmkGj
 QrpgNYOvJaUJd076goZCV39v0c1xpfV9/9kyVch8HUadf6JcjpVZwYnbGw2qlJjh
 ZanuBG2VOeSwaKQqXiRBSBetnpAg8CVpFjDmX9wOBfSek2wxEJqDX/vQExdbIDQQ
 pUs7vnUxLzhmW/x+ygOI
 =YwcM
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Kconfig: remove BE-only platforms from LE kernel build from Boqun
   Feng
 - Refresh ps3_defconfig from Geoff Levand
 - Emit GNU & SysV hashes for the vdso from Michael Ellerman
 - Define an enum for the bolted SLB indexes from Anshuman Khandual
 - Use a local to avoid multiple calls to get_slb_shadow() from Michael
   Ellerman
 - Add gettimeofday() benchmark from Michael Neuling
 - Avoid link stack corruption in __get_datapage() from Michael Neuling
 - Add virt_to_pfn and use this instead of opencoding from Aneesh Kumar
   K.V
 - Add ppc64le_defconfig from Michael Ellerman
 - pseries: extract of_helpers module from Andy Shevchenko
 - Correct string length in pseries_of_derive_parent() from Nathan
   Fontenot
 - Free the MSI bitmap if it was slab allocated from Denis Kirjanov
 - Shorten irq_chip name for the SIU from Christophe Leroy
 - Wait 1s for secondaries to enter OPAL during kexec from Samuel
   Mendoza-Jonas
 - Fix _ALIGN_* errors due to type difference, from Aneesh Kumar K.V
 - powerpc/pseries/hvcserver: don't memset pi_buff if it is null from
   Colin Ian King
 - Disable hugepd for 64K page size, from Aneesh Kumar K.V
 - Differentiate between hugetlb and THP during page walk from Aneesh
   Kumar K.V
 - Make PCI non-optional for pseries from Michael Ellerman
 - Individual System V IPC system calls from Sam bobroff
 - Add selftest of unmuxed IPC calls from Michael Ellerman
 - discard .exit.data at runtime from Stephen Rothwell
 - Delete old orphaned PrPMC 280/2800 DTS and boot file, from Paul
   Gortmaker
 - Use of_get_next_parent to simplify code from Christophe Jaillet
 - Paginate some xmon output from Sam bobroff
 - Add some more elements to the xmon PACA dump from Michael Ellerman
 - Allow the tm-syscall selftest to build with old headers from Michael
   Ellerman
 - Run EBB selftests only on POWER8 from Denis Kirjanov
 - Drop CONFIG_TUNE_CELL in favour of CONFIG_CELL_CPU from Michael
   Ellerman
 - Avoid reference to potentially freed memory in prom.c from Christophe
   Jaillet
 - Quieten boot wrapper output with run_cmd from Geoff Levand
 - EEH fixes and cleanups from Gavin Shan
 - Fix recursive fenced PHB on Broadcom shiner adapter from Gavin Shan
 - Use of_get_next_parent() in of_get_ibm_chip_id() from Michael
   Ellerman
 - Fix section mismatch warning in msi_bitmap_alloc() from Denis
   Kirjanov
 - Fix ps3-lpm white space from Rudhresh Kumar J
 - Fix ps3-vuart null dereference from Colin King
 - nvram: Add missing kfree in error path from Christophe Jaillet
 - nvram: Fix function name in some errors messages, from Christophe
   Jaillet
 - drivers/macintosh: adb: fix misleading Kconfig help text from Aaro
   Koskinen
 - agp/uninorth: fix a memleak in create_gatt_table from Denis Kirjanov
 - cxl: Free virtual PHB when removing from Andrew Donnellan
 - scripts/kconfig/Makefile: Allow KBUILD_DEFCONFIG to be a target from
   Michael Ellerman
 - scripts/kconfig/Makefile: Fix KBUILD_DEFCONFIG check when building
   with O= from Michael Ellerman
 - Freescale updates from Scott: Highlights include 64-bit book3e
   kexec/kdump support, a rework of the qoriq clock driver, device tree
   changes including qoriq fman nodes, support for a new 85xx board, and
   some fixes.
 - MPC5xxx updates from Anatolij: Highlights include a driver for
   MPC512x LocalPlus Bus FIFO with its device tree binding
   documentation, mpc512x device tree updates and some minor fixes.

* tag 'powerpc-4.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (106 commits)
  powerpc/msi: Fix section mismatch warning in msi_bitmap_alloc()
  powerpc/prom: Use of_get_next_parent() in of_get_ibm_chip_id()
  powerpc/pseries: Correct string length in pseries_of_derive_parent()
  powerpc/e6500: hw tablewalk: make sure we invalidate and write to the same tlb entry
  powerpc/mpc85xx: Add FSL QorIQ DPAA FMan support to the SoC device tree(s)
  powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan
  powerpc/fsl: Add #clock-cells and clockgen label to clockgen nodes
  powerpc: handle error case in cpm_muram_alloc()
  powerpc: mpic: use IRQCHIP_SKIP_SET_WAKE instead of redundant mpic_irq_set_wake
  powerpc/book3e-64: Enable kexec
  powerpc/book3e-64/kexec: Set "r4 = 0" when entering spinloop
  powerpc/booke: Only use VIRT_PHYS_OFFSET on booke32
  powerpc/book3e-64/kexec: Enable SMP release
  powerpc/book3e-64/kexec: create an identity TLB mapping
  powerpc/book3e-64: Don't limit paca to 256 MiB
  powerpc/book3e/kdump: Enable crash_kexec_wait_realmode
  powerpc/book3e: support CONFIG_RELOCATABLE
  powerpc/booke64: Fix args to copy_and_flush
  powerpc/book3e-64: rename interrupt_end_book3e with __end_interrupts
  powerpc/e6500: kexec: Handle hardware threads
  ...
2015-11-05 23:38:43 -08:00
Linus Torvalds 1873499e13 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem update from James Morris:
 "This is mostly maintenance updates across the subsystem, with a
  notable update for TPM 2.0, and addition of Jarkko Sakkinen as a
  maintainer of that"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (40 commits)
  apparmor: clarify CRYPTO dependency
  selinux: Use a kmem_cache for allocation struct file_security_struct
  selinux: ioctl_has_perm should be static
  selinux: use sprintf return value
  selinux: use kstrdup() in security_get_bools()
  selinux: use kmemdup in security_sid_to_context_core()
  selinux: remove pointless cast in selinux_inode_setsecurity()
  selinux: introduce security_context_str_to_sid
  selinux: do not check open perm on ftruncate call
  selinux: change CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE default
  KEYS: Merge the type-specific data with the payload data
  KEYS: Provide a script to extract a module signature
  KEYS: Provide a script to extract the sys cert list from a vmlinux file
  keys: Be more consistent in selection of union members used
  certs: add .gitignore to stop git nagging about x509_certificate_list
  KEYS: use kvfree() in add_key
  Smack: limited capability for changing process label
  TPM: remove unnecessary little endian conversion
  vTPM: support little endian guests
  char: Drop owner assignment from i2c_driver
  ...
2015-11-05 15:32:38 -08:00
Michael Ellerman 3b0e21ec3b Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Highlights include 64-bit book3e kexec/kdump support, a rework of the
qoriq clock driver, device tree changes including qoriq fman nodes,
support for a new 85xx board, and some fixes.

Note that there is a trivial merge conflict with the clock tree's next
branch, in the clock Makefile."
2015-11-02 13:59:48 +11:00
Benjamin Herrenschmidt 977bf062bb powerpc/dma: dma_set_coherent_mask() should not be GPL only
When turning this from inline to an exported function I was a bit
over-eager and made it GPL only. This prevents the use of pretty much
all non-GPL PCI driver which is a bit over the top. Let's bring it
back in line with other architecture.

Fixes: 817820b022 ("powerpc/iommu: Support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-28 14:20:50 +09:00
Michael Ellerman 16c1d60626 powerpc/prom: Use of_get_next_parent() in of_get_ibm_chip_id()
Use of_get_next_parent() to simplifiy the logic in of_get_ibm_chip_id().

Original-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-28 12:08:32 +09:00
Tiejun Chen 96eea6426f powerpc/book3e-64: Enable kexec
Allow KEXEC for book3e, and bypass or convert non-book3e stuff
in kexec code.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[scottwood@freescale.com: move code to minimize diff, and cleanup]
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-27 18:13:30 -05:00
Scott Wood ae73e4ccbc powerpc/book3e-64/kexec: Set "r4 = 0" when entering spinloop
book3e_secondary_core_init will only create a TLB entry if r4 = 0,
so do so.

Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-27 18:13:30 -05:00
Scott Wood 567cf94dc7 powerpc/book3e-64/kexec: Enable SMP release
The SMP release mechanism for FSL book3e is different from when booting
with normal hardware.  In theory we could simulate the normal spin
table mechanism, but not at the addresses U-Boot put in the device tree
-- so there'd need to be even more communication between the kernel and
kexec to set that up.  Instead, kexec-tools will set a boolean property
linux,booted-from-kexec in the /chosen node.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: devicetree@vger.kernel.org
2015-10-27 18:13:29 -05:00
Tiejun Chen cf904e3088 powerpc/book3e-64/kexec: create an identity TLB mapping
book3e has no real MMU mode so we have to create an identity TLB
mapping to make sure we can access the real physical address.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[scottwood: cleanup, and split off some changes]
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-27 18:13:28 -05:00
Scott Wood ecc4999f68 powerpc/book3e-64: Don't limit paca to 256 MiB
This limit only makes sense on book3s, and on book3e it can cause
problems with kdump if we don't have any memory under 256 MiB.

Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-27 18:13:28 -05:00
Scott Wood eeaab663a0 powerpc/book3e/kdump: Enable crash_kexec_wait_realmode
While book3e doesn't have "real mode", we still want to wait for
all the non-crash cpus to complete their shutdown.

Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-27 18:13:27 -05:00
Tiejun Chen 1cb6e06492 powerpc/book3e: support CONFIG_RELOCATABLE
book3e is different with book3s since 3s includes the exception
vectors code in head_64.S as it relies on absolute addressing
which is only possible within this compilation unit. So we have
to get that label address with got.

And when boot a relocated kernel, we should reset ipvr properly again
after .relocate.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[scottwood: cleanup and ifdef removal]
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-27 18:13:27 -05:00
Tiejun Chen 835c031c98 powerpc/booke64: Fix args to copy_and_flush
Convert r4/r5, not r6, to a virtual address when calling
copy_and_flush.  Otherwise, r3 is already virtual, and copy_to_flush
tries to access r3+r6, PAGE_OFFSET gets added twice.

This isn't normally seen because on book3e we normally enter with
the kernel at zero and thus skip copy_to_flush -- but it will be
needed for kexec support.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[scottwood: split patch and rewrote changelog]
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-27 18:13:26 -05:00
Tiejun Chen 68d1014019 powerpc/book3e-64: rename interrupt_end_book3e with __end_interrupts
Rename 'interrupt_end_book3e' to '__end_interrupts' so that the symbol
can be used by both book3s and book3e.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[scottwood: edit changelog]
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-27 18:13:26 -05:00
Scott Wood f34b3e19fd powerpc/e6500: kexec: Handle hardware threads
The new kernel will be expecting secondary threads to be disabled,
not spinning.

Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-27 18:13:25 -05:00
Scott Wood d9e1831a42 powerpc/85xx: Load all early TLB entries at once
Use an AS=1 trampoline TLB entry to allow all normal TLB1 entries to
be loaded at once.  This avoids the need to keep the translation that
code is executing from in the same TLB entry in the final TLB
configuration as during early boot, which in turn is helpful for
relocatable kernels (e.g. kdump) where the kernel is not running from
what would be the first TLB entry.

On e6500, we limit map_mem_in_cams() to the primary hwthread of a
core (the boot cpu is always considered primary, as a kdump kernel
can be entered on any cpu).  Each TLB only needs to be set up once,
and when we do, we don't want another thread to be running when we
create a temporary trampoline TLB1 entry.

Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-22 22:50:46 -05:00
Vasant Hegde 8832317f66 powerpc/rtas: Validate rtas.entry before calling enter_rtas()
Currently we do not validate rtas.entry before calling enter_rtas(). This
leads to a kernel oops when user space calls rtas system call on a powernv
platform (see below). This patch adds code to validate rtas.entry before
making enter_rtas() call.

  Oops: Exception in kernel mode, sig: 4 [#1]
  SMP NR_CPUS=1024 NUMA PowerNV
  task: c000000004294b80 ti: c0000007e1a78000 task.ti: c0000007e1a78000
  NIP: 0000000000000000 LR: 0000000000009c14 CTR: c000000000423140
  REGS: c0000007e1a7b920 TRAP: 0e40   Not tainted  (3.18.17-340.el7_1.pkvm3_1_0.2400.1.ppc64le)
  MSR: 1000000000081000 <HV,ME>  CR: 00000000  XER: 00000000
  CFAR: c000000000009c0c SOFTE: 0
  NIP [0000000000000000]           (null)
  LR [0000000000009c14] 0x9c14
  Call Trace:
  [c0000007e1a7bba0] [c00000000041a7f4] avc_has_perm_noaudit+0x54/0x110 (unreliable)
  [c0000007e1a7bd80] [c00000000002ddc0] ppc_rtas+0x150/0x2d0
  [c0000007e1a7be30] [c000000000009358] syscall_exit+0x0/0x98

Cc: stable@vger.kernel.org # v3.2+
Fixes: 55190f8878 ("powerpc: Add skeleton PowerNV platform")
Reported-by: NAGESWARA R. SASTRY <nasastry@in.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[mpe: Reword change log, trim oops, and add stable + fixes]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-22 11:03:25 +11:00
Gavin Shan 872ee2d652 powerpc/eeh: More relaxed condition for enabled IO path
When one or both of the below two flags are marked in the PE state, the
PE's IO path is regarded as enabled: EEH_STATE_MMIO_ACTIVE or
EEH_STATE_MMIO_ENABLED.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-21 20:41:43 +11:00
Gavin Shan 8234fcedf1 powerpc/eeh: Force reset on fenced PHB
On fenced PHB, the error handlers in the drivers of its subordinate
devices could return PCI_ERS_RESULT_CAN_RECOVER, indicating no reset
will be issued during the recovery. It's conflicting with the fact
that fenced PHB won't be recovered without reset.

This limits the return value from the error handlers in the drivers
of the fenced PHB's subordinate devices to PCI_ERS_RESULT_NEED_NONE
or PCI_ERS_RESULT_NEED_RESET, to ensure reset will be issued during
recovery.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-21 20:41:43 +11:00
Gavin Shan f2da4ccf8b powerpc/eeh: More relaxed hotplug criterion
Currently, we rely on the existence of struct pci_driver::err_handler
to decide if the corresponding PCI device should be unplugged during
EEH recovery (partially hotplug case). However that check is not
sufficient. Some device drivers implement only some of the EEH error
handlers to collect diag-data. That means the driver still expects a
hotplug to recover from the EEH error.

This makes the hotplug criterion more relaxed: if the device driver
doesn't provide all necessary EEH error handlers, it will experience
hotplug during EEH recovery.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
[mpe: Minor change log rewording]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-21 20:39:07 +11:00
Gavin Shan 527d10ef3a powerpc/eeh: Don't unfreeze PHB PE after reset
On PowerNV platform, the PE is kept in frozen state until the PE
reset is completed to avoid recursive EEH error caused by MMIO
access during the period of EEH reset. The PE's frozen state is
cleared after BARs of PCI device included in the PE are restored
and enabled. However, we needn't clear the frozen state for PHB PE
explicitly at this point as there is no real PE for PHB PE. As the
PHB PE is always binding with PE#0, we actually clear PE#0, which
is wrong. It doesn't incur any problem though.

This checks if the PE is PHB PE and doesn't clear the frozen state
if it is.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-21 20:06:57 +11:00
Christophe Jaillet 1856f50c66 powerpc/prom: Avoid reference to potentially freed memory
of_get_property() is used inside the loop, but then the reference to the
node is dropped before dereferencing the prop pointer, which could by then
point to junk if the node has been freed.

Instead use of_property_read_u32() to actually read the property
value before dropping the reference.

of_property_read_u32() requires at least one cell (u32) to be present,
which is stricter than the old logic which would happily dereference a
property of any size. However we believe all device trees in the wild
have at least one cell.

Skiboot may produce memory nodes with more than one cell, but that is
OK, of_property_read_u32() will return the first one.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
[mpe: Expand change log with device tree details]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-21 15:31:25 +11:00
Hon Ching \(Vicky\) Lo 9e5d4af458 vTPM: get the buffer allocated for event log instead of the actual log
The OS should ask Power Firmware (PFW) for the size of the buffer
allocated for the event log, instead of the size of the actual
event log.  It then passes the buffer adddress and size to PFW in
the handover process, into which PFW copies the log.

Signed-off-by: Hon Ching(Vicky) Lo <honclo@linux.vnet.ibm.com>
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
2015-10-19 01:01:23 +02:00
Hon Ching \(Vicky\) Lo b4ed0469d0 vTPM: reformat event log to be byte-aligned
The event log generated by OpenFirmware in PowerPC is 4-byte aligned.
This patch reformats the log to be byte-aligned for the Linux client.

Signed-off-by: Hon Ching(Vicky) Lo <honclo@linux.vnet.ibm.com>
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
2015-10-19 01:01:23 +02:00
Hon Ching \(Vicky\) Lo 2f82e98265 vTPM: fix searching for the right vTPM node in device tree
Replace all occurrences of '/ibm,vtpm' with '/vdevice/vtpm',
as only the latter is guanranteed to be available for the client OS.
The '/ibm,vtpm' node should only be used by Open Firmware, which
is susceptible to changes.

Signed-off-by: Hon Ching(Vicky) Lo <honclo@linux.vnet.ibm.com>
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
2015-10-19 01:01:22 +02:00
Stephen Rothwell 4c8123181d powerpc: discard .exit.data at runtime
.exit.text is discarded at run time and there are some references from
that to .exit.data, so we need to discard .exit.data at run time as well.

Fixes these errors:

`.exit.data' referenced in section `.exit.text' of drivers/built-in.o: defined in discarded section `.exit.data' of drivers/built-in.o
`.exit.data' referenced in section `.exit.text' of drivers/built-in.o: defined in discarded section `.exit.data' of drivers/built-in.o

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-15 20:31:59 +11:00
Gavin Shan 54f9a64a36 powerpc/eeh: atomic_dec_if_positive() to update passthru count
No need to have two atomic opertions (update and fetch/check) when
decreasing PE's number of passed devices as one atomic operation
is enough.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-15 20:31:58 +11:00
Andrew Donnellan 6b8b252f40 powerpc/pci: export pcibios_free_controller()
Export pcibios_free_controller(), so it can be used by the cxl module to
free virtual PHBs.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-15 20:31:57 +11:00
Aneesh Kumar K.V 891121e6c0 powerpc/mm: Differentiate between hugetlb and THP during page walk
We need to properly identify whether a hugepage is an explicit or
a transparent hugepage in follow_huge_addr(). We used to depend
on hugepage shift argument to do that. But in some case that can
result in wrong results. For ex:

On finding a transparent hugepage we set hugepage shift to PMD_SHIFT.
But we can end up clearing the thp pte, via pmdp_huge_get_and_clear.
We do prevent reusing the pfn page via the usage of
kick_all_cpus_sync(). But that happens after we updated the pte to 0.
Hence in follow_huge_addr() we can find hugepage shift set, but transparent
huge page check fail for a thp pte.

NOTE: We fixed a variant of this race against thp split in commit
691e95fd73
("powerpc/mm/thp: Make page table walk safe against thp split/collapse")

Without this patch, we may hit the BUG_ON(flags & FOLL_GET) in
follow_page_mask occasionally.

In the long term, we may want to switch ppc64 64k page size config to
enable CONFIG_ARCH_WANT_GENERAL_HUGETLB

Reported-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-12 15:30:09 +11:00
Christophe Jaillet b6080db4f4 powerpc/nvram: Fix function name in some errors messages.
'nvram_create_os_partition' should be 'nvram_create_partition'.
Use __func__ to have it right, as done elsewhere in this file.

Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-02 22:55:05 +10:00
Christophe Jaillet 7d52318717 powerpc/nvram: Add missing kfree in error path
If 'nvram_write_header' fails, then 'new_part' should be freed, otherwise,
there is a memory leak.

Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-02 22:54:55 +10:00
Michael Neuling c974809a26 powerpc/vdso: Avoid link stack corruption in __get_datapage()
powerpc has a link register (lr) used for calling functions. We "bl
<func>" to call a function, and "blr" to return back to the call site.

The lr is only a single register, so if we call another function from
inside this function (ie. nested calls), software must save away the
lr on the software stack before calling the new function. Before
returning (ie. before the "blr"), the lr is restored by software from
the software stack.

This makes branch prediction quite difficult for the processor as it
will only know the branch target just before the "blr".

To help with this, modern powerpc processors keep a (non-architected)
hardware stack of lr called a "link stack". When a "bl <func>" is
run, the lr is pushed onto this stack. When a "blr" is called, the
branch predictor pops the lr value from the top of the link stack, and
uses it to predict the branch target. Hence the processor pipeline
knows a lot earlier the branch target.

This works great but there are some cases where you call "bl" but
without a matching "blr". Once such case is when trying to determine
the program counter (which can't be read directly). Here you "bl+4;
mflr" to get the program counter. If you do this, the link stack will
get out of sync with reality, causing the branch predictor to
mis-predict subsequent function returns.

To avoid this, modern micro-architectures have a special case of bl.
Using the form "bcl 20,31,+4", ensures the processor doesn't push to
the link stack.

The 32 and 64 bit variants of __get_datapage() use a "bl; mflr" to
determine the loaded address of the VDSO. The current versions of
these attempt to use this special bl variant.

Unfortunately they use +8 rather than the required +4. Hence the
current code results in the link stack getting out of sync with
reality and hence the resulting performance degradation.

This patch moves it to bcl+4 by moving __kernel_datapage_offset out of
__get_datapage().

With this patch, running a gettimeofday() (which uses
__get_datapage()) microbenchmark we get a decent bump in performance
on POWER7/8.

For the benchmark in tools/testing/selftests/powerpc/benchmarks/gettimeofday.c
  POWER8:
    64bit gets ~4% improvement
    32bit gets ~9% improvement
  POWER7:
    64bit gets ~7% improvement

Signed-off-by: Michael Neuling <mikey@neuling.org>
Reported-by: Aaron Sawdey <sawdey@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-01 16:52:02 +10:00
Michael Ellerman 787b393c9f powerpc/vdso: Emit GNU & SysV hashes
Andy Lutomirski says:

  Some dynamic loaders may be slightly faster if a GNU hash is
  available.

  This is unlikely to have any measurable effect on the time it takes
  to resolve vdso symbols (since there are so few of them).  In some
  contexts, it can be a win for a different reason: if every DSO has a
  GNU hash section, then libc can avoid calculating SysV hashes at
  all. Both musl and glibc appear to have this optimization.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-01 16:52:00 +10:00
Linus Torvalds 966966a630 PCI updates for v4.3:
Resource management
     - Revert pci_read_bridge_bases() unification (Bjorn Helgaas)
     - Clear IORESOURCE_UNSET when clipping a bridge window (Bjorn Helgaas)
 
   MSI
     - Fix MSI IRQ domains for VFs on virtual buses (Alex Williamson)
 
   Renesas R-Car host bridge driver
     - Add R8A7794 support (Sergei Shtylyov)
 
   Miscellaneous
     - Fix devfn for VPD access through function 0 (Alex Williamson)
     - Use function 0 VPD only for identical functions (Alex Williamson)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWBUq2AAoJEFmIoMA60/r8wbwP/0D/+fKEPYJlB6hx1wLHpVk3
 K//vEwH0RgA3v2X53QUoHg94gTYhSZLKX0zdAFshbphE0HCZ6AO3UO+/ZJ3cui6J
 PYvKOnhby2ErNotZqrs3DQIM8rGgl0ZVgoFQrAWEvwiHRHI/r2ArK/oR4PiBjxJT
 StYuJoTkZIlJyHXza6tvHDcWi+Jc8t8r0HC4Vs32BlaVBQM0SH3CMxHfhJw/Q9xP
 WHFif1sH0N+p7WDyHH71C1T8POOgXY73BsD2AC0se3lRYZ9SVkOVy9ECGUucx8F6
 LDAuFelwRvW2Dr9kh38+5f8Xp155E+eZ6zRWW9/JlrUKVEtHhOFhtrRfDNKHuDCt
 B9ETrxDiSUFAdQ2weye9BK6aXK0CHF6YP3PCbvK77qFUUsN8csFSKktanKrFAbML
 CdjkVkEoeLHw+aXzyDg0pSBRZMQ24dTQDh7YqOFZGuEjCLPXOEQ8nitf0IzBB0KI
 4QetT/QK3bKkgtVKTwPP+s9f4g+fA/oiwJ21ZTV9hi/9upywTa/umCUvH9Fmb8Fp
 VZeTzugSht0+ioXpaF/6/KO0Ccp/t5uAHYeuBBMqiHX7ks8DdnfPCwbWNRKkg25O
 Qy7Y8VnnOtesRCAqBq5y/hHlLUluMkjYpEblYFiD6HBWcjUh6xE6LlIO4mwnjqWI
 zjB3w7+0GOrvS7dBSx0N
 =f4c3
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "These are fixes for things we merged for v4.3 (VPD, MSI, and bridge
  window management), and a new Renesas R8A7794 SoC device ID.

  Details:

  Resource management:
   - Revert pci_read_bridge_bases() unification (Bjorn Helgaas)
   - Clear IORESOURCE_UNSET when clipping a bridge window (Bjorn
     Helgaas)

  MSI:
   - Fix MSI IRQ domains for VFs on virtual buses (Alex Williamson)

  Renesas R-Car host bridge driver:
   - Add R8A7794 support (Sergei Shtylyov)

  Miscellaneous:
   - Fix devfn for VPD access through function 0 (Alex Williamson)
   - Use function 0 VPD only for identical functions (Alex Williamson)"

* tag 'pci-v4.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: rcar: Add R8A7794 support
  PCI: Use function 0 VPD for identical functions, regular VPD for others
  PCI: Fix devfn for VPD access through function 0
  PCI/MSI: Fix MSI IRQ domains for VFs on virtual buses
  PCI: Clear IORESOURCE_UNSET when clipping a bridge window
  PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code"
2015-09-25 11:16:53 -07:00
Linus Torvalds fadb97b089 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "This is a rather large update post rc1 due to the final steps of
  cleanups and API changes which had to wait for the preparatory patches
  to hit your tree.

   - Regression fixes for ARM GIC irqchips

   - Regression fixes and lockdep anotations for renesas irq chips

   - The leftovers of the cleanup and preparatory patches which have
     been ignored by maintainers

   - Final conversions of the newly merged users of obsolete APIs

   - Final removal of obsolete APIs

   - Final removal of ARM artifacts which had been introduced during the
     conversion of ARM to the generic interrupt code.

   - Final split of the irq_data into chip specific and common data to
     reflect the needs of hierarchical irq domains.

   - Treewide removal of the first argument of interrupt flow handlers,
     i.e. the irq number, which is not used by the majority of handlers
     and simple to retrieve from the other argument the irq descriptor.

   - A few comment updates and build warning fixes"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  arm64: Remove ununsed set_irq_flags
  ARM: Remove ununsed set_irq_flags
  sh: Kill off set_irq_flags usage
  irqchip: Kill off set_irq_flags usage
  gpu/drm: Kill off set_irq_flags usage
  genirq: Remove irq argument from irq flow handlers
  genirq: Move field 'msi_desc' from irq_data into irq_common_data
  genirq: Move field 'affinity' from irq_data into irq_common_data
  genirq: Move field 'handler_data' from irq_data into irq_common_data
  genirq: Move field 'node' from irq_data into irq_common_data
  irqchip/gic-v3: Use IRQD_FORWARDED_TO_VCPU flag
  irqchip/gic: Use IRQD_FORWARDED_TO_VCPU flag
  genirq: Provide IRQD_FORWARDED_TO_VCPU status flag
  genirq: Simplify irq_data_to_desc()
  genirq: Remove __irq_set_handler_locked()
  pinctrl/pistachio: Use irq_set_handler_locked
  gpio: vf610: Use irq_set_handler_locked
  powerpc/mpc8xx: Use irq_set_handler_locked()
  powerpc/ipic: Use irq_set_handler_locked()
  powerpc/cpm2: Use irq_set_handler_locked()
  ...
2015-09-18 08:11:42 -07:00
Linus Torvalds f240bdd2a5 powerpc fixes for 4.3
- Fix 32-bit TCE table init in kdump kernel from Nish
  - Fix kdump with non-power-of-2 crashkernel= from Nish
  - Abort cxl_pci_enable_device_hook() if PCI channel is offline from Andrew
  - Fix to release DRC when configure_connector() fails from Bharata
  - Wire up sys_userfaultfd()
  - Fix race condition in tearing down MSI interrupts from Paul
  - Fix unbalanced pci_dev_get() in cxl_probe() from Daniel
  - Fix cxl build failure due to -Wunused-variable gcc behaviour change from Ian
  - Tell the toolchain to use ABI v2 when building an LE boot wrapper from Benh
  - Fix THP to recompute hash value after a failed update from Aneesh
  - 32-bit memcpy/memset: only use dcbz once cache is enabled from Christophe
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV+h5EAAoJEFHr6jzI4aWADboP/3jc6vUtOmFZ1GBGwXrftOc0
 PGtkTtYnGbaNsp+ZBl39Y+CFsSfhFVaDgylm/8G5NMKZaSBVdcFJwfU1w6Ymn49R
 nZkAT5PC9KgT5RTuRTZ3DO/Y2RC9vg2T9pXjEn8NGYcV8GgUkc3dZAn48S3AFgnV
 4jQI5sbxvwU12XkCUn+DkETh13g3gLYtRxwehBu/S/ovED5iNHKJwnXRzxyAA969
 dARNriSeyLBVMLamJ+rJB1S5hVTZTMbughFVVFbgriyIGuC/C1g9b9GN86dCGS6w
 T6VrKveK/iVCLUB16KV8+inbfvUrXItOxhGJWPHw9uAJGLZTz5G+yLHRRPX8onyC
 pgDesJDDpP/P7sAnKto3tF1Vzi7lwVtVPC1dT1Fc9VAWJGPYC/d16EKGIpNqqlnc
 mAIJ7wcI5c/HxvqXR2rdRV6fMer+aY7utwMsh4o/gDs7ArQUcuCrOKSW0jvHGmyr
 MumARXnUGDwPnGD8IfYI2vDOvwisv9g6XACwsM+pi499SfowaiLuK3utsMccagGZ
 INFtaqS7gpcj8TTu3kymw1TbKW/tqG9T81RRZ0rFH1q3aSfvwQ9QvsXctSeqOl/n
 lkxR3Mk0CT0oupXNKV6pjqsRwLroA0AF5tGxuw4ap1Rt4i8G7WHaPgRp7SPZxtkR
 qVBryfK5bKqYtWp4ztVK
 =Dgn5
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix 32-bit TCE table init in kdump kernel from Nish

 - Fix kdump with non-power-of-2 crashkernel= from Nish

 - Abort cxl_pci_enable_device_hook() if PCI channel is offline from
   Andrew

 - Fix to release DRC when configure_connector() fails from Bharata

 - Wire up sys_userfaultfd()

 - Fix race condition in tearing down MSI interrupts from Paul

 - Fix unbalanced pci_dev_get() in cxl_probe() from Daniel

 - Fix cxl build failure due to -Wunused-variable gcc behaviour change
   from Ian

 - Tell the toolchain to use ABI v2 when building an LE boot wrapper
   from Benh

 - Fix THP to recompute hash value after a failed update from Aneesh

 - 32-bit memcpy/memset: only use dcbz once cache is enabled from
   Christophe

* tag 'powerpc-4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc32: memset: only use dcbz once cache is enabled
  powerpc32: memcpy: only use dcbz once cache is enabled
  powerpc/mm: Recompute hash value after a failed update
  powerpc/boot: Specify ABI v2 when building an LE boot wrapper
  cxl: Fix build failure due to -Wunused-variable behaviour change
  cxl: Fix unbalanced pci_dev_get in cxl_probe
  powerpc/MSI: Fix race condition in tearing down MSI interrupts
  powerpc: Wire up sys_userfaultfd()
  powerpc/pseries: Release DRC when configure_connector fails
  cxl: abort cxl_pci_enable_device_hook() if PCI channel is offline
  powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel=
  powerpc/powernv/pci-ioda: fix 32-bit TCE table init in kdump kernel
2015-09-18 08:01:06 -07:00
LEROY Christophe 400c47d81c powerpc32: memset: only use dcbz once cache is enabled
memset() uses instruction dcbz to speed up clearing by not wasting time
loading cache line with data that will be overwritten.
Some platform like mpc52xx do no have cache active at startup and
can therefore not use memset(). Allthough no part of the code
explicitly uses memset(), GCC may make calls to it.

This patch modifies memset() such that at startup, memset()
unconditionally skip the optimised bloc that uses dcbz instruction.

Once the initial MMU is set up, in machine_init() we patch memset()
by replacing this inconditional jump by a NOP

Tested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-17 10:36:53 +10:00
LEROY Christophe 1cd03890ea powerpc32: memcpy: only use dcbz once cache is enabled
memcpy() uses instruction dcbz to speed up copy by not wasting time
loading cache line with data that will be overwritten.
Some platform like mpc52xx do no have cache active at startup and
can therefore not use memcpy(). Allthough no part of the code
explicitly uses memcpy(), GCC makes calls to it.

This patch modifies memcpy() such that at startup, memcpy()
unconditionally jumps to generic_memcpy() which doesn't use
the dcbz instruction.

Once the initial MMU is set up, in machine_init() we patch memcpy()
by replacing this inconditional jump by a NOP

Reported-by: Michal Sojka <sojkam1@fel.cvut.cz>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-17 10:36:44 +10:00
Bjorn Helgaas 237865f195 PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code"
Revert dff22d2054 ("PCI: Call pci_read_bridge_bases() from core instead
of arch code").

Reading PCI bridge windows is not arch-specific in itself, but there is PCI
core code that doesn't work correctly if we read them too early.  For
example, Hannes found this case on an ARM Freescale i.mx6 board:

  pci_bus 0000:00: root bus resource [mem 0x01000000-0x01efffff]
  pci 0000:00:00.0: PCI bridge to [bus 01-ff]
  pci 0000:00:00.0: BAR 8: no space for [mem size 0x01000000] (mem window)
  pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x00200000]
  pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x00004000]
  pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00000100]

The 00:00.0 mem window needs to be at least 3MB: the 01:00.0 device needs
0x204100 of space, and mem windows are megabyte-aligned.

Bus sizing can increase a bridge window size, but never *decrease* it (see
d65245c329 ("PCI: don't shrink bridge resources")).  Prior to
dff22d2054, ARM didn't read bridge windows at all, so the "original size"
was zero, and we assigned a 3MB window.

After dff22d2054, we read the bridge windows before sizing the bus.  The
firmware programmed a 16MB window (size 0x01000000) in 00:00.0, and since
we never decrease the size, we kept 16MB even though we only needed 3MB.
But 16MB doesn't fit in the host bridge aperture, so we failed to assign
space for the window and the downstream devices.

I think this is a defect in the PCI core: we shouldn't rely on the firmware
to assign sensible windows.

Ray reported a similar problem, also on ARM, with Broadcom iProc.

Issues like this are too hard to fix right now, so revert dff22d2054.

Reported-by: Hannes <oe5hpm@gmail.com>
Reported-by: Ray Jui <rjui@broadcom.com>
Link: http://lkml.kernel.org/r/CAAa04yFQEUJm7Jj1qMT57-LG7ZGtnhNDBe=PpSRa70Mj+XhW-A@mail.gmail.com
Link: http://lkml.kernel.org/r/55F75BB8.4070405@broadcom.com
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2015-09-15 13:18:04 -05:00
Jiang Liu da92b4eb7e powerpc, irq: Use access helper irq_data_get_affinity_mask()
Use access helper irq_data_get_affinity_mask() so we can move the
affinity mask to irq_common_data.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1433145945-789-25-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-15 17:06:28 +02:00
Linus Torvalds 519f526d39 ARM:
- Full debug support for arm64
 - Active state switching for timer interrupts
 - Lazy FP/SIMD save/restore for arm64
 - Generic ARMv8 target
 
 PPC:
 - Book3S: A few bug fixes
 - Book3S: Allow micro-threading on POWER8
 
 x86:
 - Compiler warnings
 
 Generic:
 - Adaptive polling for guest halt
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJV7qd/AAoJEL/70l94x66DDBcH/2OLomKHjDOGXqJ/dpkqf4UU
 FYI1pVjs2zP4z3L7RYV/DeuEsD6XaWzS7EXQOS3mcb9d8GWahPrdofeVmpmhg/8y
 jmkuUEFHl2Ut6imk8qDlG3m42c86Mk8/1k38l1bp8S3lL0/Q7IyADyYAlHdwzpOx
 yEyOAE4VU4n+VyQH5dbnzc12QRTeHfRQc/dI3eQq238gf37SF/1qzOzeLIdbEa+N
 DCzqQ8SExbctiRaLzCY5Ogan+unZBQbFfhrDrUSryywrzo/8WRFVmbjuf5O5Ucxa
 +UTLMvmm1YgxvBvWhlcmA+HSzSVeWNvaHQ9illgE5+74G5CzaD2ukurmoz/+r+A=
 =XtrL
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm updates from Paolo Bonzini:
 "ARM:
   - Full debug support for arm64
   - Active state switching for timer interrupts
   - Lazy FP/SIMD save/restore for arm64
   - Generic ARMv8 target

  PPC:
   - Book3S: A few bug fixes
   - Book3S: Allow micro-threading on POWER8

  x86:
   - Compiler warnings

  Generic:
   - Adaptive polling for guest halt"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (49 commits)
  kvm: irqchip: fix memory leak
  kvm: move new trace event outside #ifdef CONFIG_KVM_ASYNC_PF
  KVM: trace kvm_halt_poll_ns grow/shrink
  KVM: dynamic halt-polling
  KVM: make halt_poll_ns per-vCPU
  Silence compiler warning in arch/x86/kvm/emulate.c
  kvm: compile process_smi_save_seg_64() only for x86_64
  KVM: x86: avoid uninitialized variable warning
  KVM: PPC: Book3S: Fix typo in top comment about locking
  KVM: PPC: Book3S: Fix size of the PSPB register
  KVM: PPC: Book3S HV: Exit on H_DOORBELL if HOST_IPI is set
  KVM: PPC: Book3S HV: Fix race in starting secondary threads
  KVM: PPC: Book3S: correct width in XER handling
  KVM: PPC: Book3S HV: Fix preempted vcore stolen time calculation
  KVM: PPC: Book3S HV: Fix preempted vcore list locking
  KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD
  KVM: PPC: Book3S HV: Fix bug in dirty page tracking
  KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE
  KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8
  KVM: PPC: Book3S HV: Make use of unused threads when running guests
  ...
2015-09-10 16:42:49 -07:00
Linus Torvalds 12f03ee606 libnvdimm for 4.3:
1/ Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
    mechanism for adding device-driver-discovered memory regions to the
    kernel's direct map.  This facility is used by the pmem driver to
    enable pfn_to_page() operations on the page frames returned by DAX
    ('direct_access' in 'struct block_device_operations'). For now, the
    'memmap' allocation for these "device" pages comes from "System
    RAM".  Support for allocating the memmap from device memory will
    arrive in a later kernel.
 
 2/ Introduce memremap() to replace usages of ioremap_cache() and
    ioremap_wt().  memremap() drops the __iomem annotation for these
    mappings to memory that do not have i/o side effects.  The
    replacement of ioremap_cache() with memremap() is limited to the
    pmem driver to ease merging the api change in v4.3.  Completion of
    the conversion is targeted for v4.4.
 
 3/ Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
    driver, update the VFS DAX implementation and PMEM api to provide
    persistence guarantees for kernel operations on a DAX mapping.
 
 4/ Convert the ACPI NFIT 'BLK' driver to map the block apertures as
    cacheable to improve performance.
 
 5/ Miscellaneous updates and fixes to libnvdimm including support
    for issuing "address range scrub" commands, clarifying the optimal
    'sector size' of pmem devices, a clarification of the usage of the
    ACPI '_STA' (status) property for DIMM devices, and other minor
    fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV6Nx7AAoJEB7SkWpmfYgCWyYQAI5ju6Gvw27RNFtPovHcZUf5
 JGnxXejI6/AqeTQ+IulgprxtEUCrXOHjCDA5dkjr1qvsoqK1qxug+vJHOZLgeW0R
 OwDtmdW4Qrgeqm+CPoxETkorJ8wDOc8mol81kTiMgeV3UqbYeeHIiTAmwe7VzZ0C
 nNdCRDm5g8dHCjTKcvK3rvozgyoNoWeBiHkPe76EbnxDICxCB5dak7XsVKNMIVFQ
 NuYlnw6IYN7+rMHgpgpRux38NtIW8VlYPWTmHExejc2mlioWMNBG/bmtwLyJ6M3e
 zliz4/cnonTMUaizZaVozyinTa65m7wcnpjK+vlyGV2deDZPJpDRvSOtB0lH30bR
 1gy+qrKzuGKpaN6thOISxFLLjmEeYwzYd7SvC9n118r32qShz+opN9XX0WmWSFlA
 sajE1ehm4M7s5pkMoa/dRnAyR8RUPu4RNINdQ/Z9jFfAOx+Q26rLdQXwf9+uqbEb
 bIeSQwOteK5vYYCstvpAcHSMlJAglzIX5UfZBvtEIJN7rlb0VhmGWfxAnTu+ktG1
 o9cqAt+J4146xHaFwj5duTsyKhWb8BL9+xqbKPNpXEp+PbLsrnE/+WkDLFD67jxz
 dgIoK60mGnVXp+16I2uMqYYDgAyO5zUdmM4OygOMnZNa1mxesjbDJC6Wat1Wsndn
 slsw6DkrWT60CRE42nbK
 =o57/
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "This update has successfully completed a 0day-kbuild run and has
  appeared in a linux-next release.  The changes outside of the typical
  drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the
  removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and
  the introduction of ZONE_DEVICE + devm_memremap_pages().

  Summary:

   - Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
     mechanism for adding device-driver-discovered memory regions to the
     kernel's direct map.

     This facility is used by the pmem driver to enable pfn_to_page()
     operations on the page frames returned by DAX ('direct_access' in
     'struct block_device_operations').

     For now, the 'memmap' allocation for these "device" pages comes
     from "System RAM".  Support for allocating the memmap from device
     memory will arrive in a later kernel.

   - Introduce memremap() to replace usages of ioremap_cache() and
     ioremap_wt().  memremap() drops the __iomem annotation for these
     mappings to memory that do not have i/o side effects.  The
     replacement of ioremap_cache() with memremap() is limited to the
     pmem driver to ease merging the api change in v4.3.

     Completion of the conversion is targeted for v4.4.

   - Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
     driver, update the VFS DAX implementation and PMEM api to provide
     persistence guarantees for kernel operations on a DAX mapping.

   - Convert the ACPI NFIT 'BLK' driver to map the block apertures as
     cacheable to improve performance.

   - Miscellaneous updates and fixes to libnvdimm including support for
     issuing "address range scrub" commands, clarifying the optimal
     'sector size' of pmem devices, a clarification of the usage of the
     ACPI '_STA' (status) property for DIMM devices, and other minor
     fixes"

* tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits)
  libnvdimm, pmem: direct map legacy pmem by default
  libnvdimm, pmem: 'struct page' for pmem
  libnvdimm, pfn: 'struct page' provider infrastructure
  x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
  add devm_memremap_pages
  mm: ZONE_DEVICE for "device memory"
  mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
  dax: drop size parameter to ->direct_access()
  nd_blk: change aperture mapping from WC to WB
  nvdimm: change to use generic kvfree()
  pmem, dax: have direct_access use __pmem annotation
  dax: update I/O path to do proper PMEM flushing
  pmem: add copy_from_iter_pmem() and clear_pmem()
  pmem, x86: clean up conditional pmem includes
  pmem: remove layer when calling arch_has_wmb_pmem()
  pmem, x86: move x86 PMEM API to new pmem.h header
  libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
  pmem: switch to devm_ allocations
  devres: add devm_memremap
  libnvdimm, btt: write and validate parent_uuid
  ...
2015-09-08 14:35:59 -07:00
Linus Torvalds ff474e8ca8 powerpc updates for 4.3
- Support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask from Benjamin Herrenschmidt
  - EEH fixes for SRIOV from Gavin
  - Introduce rtas_get_sensor_fast() for IRQ handlers from Thomas Huth
  - Use hardware RNG for arch_get_random_seed_* not arch_get_random_* from Paul Mackerras
  - Seccomp filter support from Michael Ellerman
  - opal_cec_reboot2() handling for HMIs & machine checks from Mahesh Salgaonkar
  - Add powerpc timebase as a trace clock source from Naveen N. Rao
  - Misc cleanups in the xmon, signal & SLB code from Anshuman Khandual
  - Add an inline function to update POWER8 HID0 from Gautham R. Shenoy
  - Fix pte_pagesize_index() crash on 4K w/64K hash from Michael Ellerman
  - Drop support for 64K local store on 4K kernels from Michael Ellerman
  - move dma_get_required_mask() from pnv_phb to pci_controller_ops from Andrew Donnellan
  - Initialize distance lookup table from drconf path from Nikunj A Dadhania
  - Enable RTC class support from Vaibhav Jain
  - Disable automatically blocked PCI config from Gavin Shan
  - Add LEDs driver for PowerNV platform from Vasant Hegde
  - Fix endianness issues in the HVSI driver from Laurent Dufour
  - Kexec endian fixes from Samuel Mendoza-Jonas
  - Fix corrupted pdn list from Gavin Shan
  - Fix fenced PHB caused by eeh_slot_error_detail() from Gavin Shan
 
  - Freescale updates from Scott: Highlights include 32-bit memcpy/memset
    optimizations, checksum optimizations, 85xx config fragments and updates,
    device tree updates, e6500 fixes for non-SMP, and misc cleanup and minor
    fixes.
 
  - A ton of cxl updates & fixes:
   - Add explicit precision specifiers from Rasmus Villemoes
   - use more common format specifier from Rasmus Villemoes
   - Destroy cxl_adapter_idr on module_exit from Johannes Thumshirn
   - Destroy afu->contexts_idr on release of an afu from Johannes Thumshirn
   - Compile with -Werror from Daniel Axtens
   - EEH support from Daniel Axtens
   - Plug irq_bitmap getting leaked in cxl_context from Vaibhav Jain
   - Add alternate MMIO error handling from Ian Munsie
   - Allow release of contexts which have been OPENED but not STARTED from Andrew Donnellan
   - Remove use of macro DEFINE_PCI_DEVICE_TABLE from Vaishali Thakkar
   - Release irqs if memory allocation fails from Vaibhav Jain
   - Remove racy attempt to force EEH invocation in reset from Daniel Axtens
   - Fix + cleanup error paths in cxl_dev_context_init from Ian Munsie
   - Fix force unmapping mmaps of contexts allocated through the kernel api from Ian Munsie
   - Set up and enable PSL Timebase from Philippe Bergheaud
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV5+GzAAoJEFHr6jzI4aWA0iAP/jcd0kNaNBzLgcDKKygKdgz4
 xn4EWu81vfMfZYWesb0ATrjlH0hLsRxSXoFUqUMhtJTa5kNAoCIaz/M8WBALS50h
 aT+i7br4WEU2j2FcaMyP3iAZx/2hl+2utODJSHPRWPkec1fUDBfEyBf++e520RWM
 HUQGIGZXh8yq7KMA96Pwhsvls9vOB8hS2UdU/NS8ff3J5jFvXC1/WmF2qfzJBS1V
 8iHyz26Jl8+dJ+et7iC2oD5XQAjIH1oJgOyPVPBzAQttfi8RjuVzRA30TfPBAUwI
 lC9nlmPy6bCe4kiQYWVB1z7GegHyW/9vkeuMj/u8mZbqpaayMEMZmd2C3iNDXNHx
 i2NSvdln539t4qWYsV2v6lVCfa/ayDHD73Wackj5Dk394tzXnpCPhxNzc2yKEd5v
 h7vwYc9jBhsbfSCSogaM+gSHJ1APgCidggHJMYYNA2nN2u6V62RpsMB7zp/1+Q2v
 yqYdD8oYF4Dm21x/ujaNFrlizROD46WS0UqdJ3yP6HAqRYIpRXtibmpECJgt1n5h
 HjADEci4hQ2UQxdMdp/Q5KZnPTJebBtrZrmkW5r6cZBUaTB5TVkFaEWN44CT/Loh
 tMNeA3qOBN06CaQS2WL3UUUWpbZq9fSbWuUZ5lWZDb5AOyRxe5eWVYNLkiyIXozY
 L24l1bYdBhXahnjoS/kc
 =n9+X
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask
   from Benjamin Herrenschmidt

 - EEH fixes for SRIOV from Gavin

 - introduce rtas_get_sensor_fast() for IRQ handlers from Thomas Huth

 - use hardware RNG for arch_get_random_seed_* not arch_get_random_*
   from Paul Mackerras

 - seccomp filter support from Michael Ellerman

 - opal_cec_reboot2() handling for HMIs & machine checks from Mahesh
   Salgaonkar

 - add powerpc timebase as a trace clock source from Naveen N.  Rao

 - misc cleanups in the xmon, signal & SLB code from Anshuman Khandual

 - add an inline function to update POWER8 HID0 from Gautham R.  Shenoy

 - fix pte_pagesize_index() crash on 4K w/64K hash from Michael Ellerman

 - drop support for 64K local store on 4K kernels from Michael Ellerman

 - move dma_get_required_mask() from pnv_phb to pci_controller_ops from
   Andrew Donnellan

 - initialize distance lookup table from drconf path from Nikunj A
   Dadhania

 - enable RTC class support from Vaibhav Jain

 - disable automatically blocked PCI config from Gavin Shan

 - add LEDs driver for PowerNV platform from Vasant Hegde

 - fix endianness issues in the HVSI driver from Laurent Dufour

 - kexec endian fixes from Samuel Mendoza-Jonas

 - fix corrupted pdn list from Gavin Shan

 - fix fenced PHB caused by eeh_slot_error_detail() from Gavin Shan

 - Freescale updates from Scott: Highlights include 32-bit memcpy/memset
   optimizations, checksum optimizations, 85xx config fragments and
   updates, device tree updates, e6500 fixes for non-SMP, and misc
   cleanup and minor fixes.

 - a ton of cxl updates & fixes:
    - add explicit precision specifiers from Rasmus Villemoes
    - use more common format specifier from Rasmus Villemoes
    - destroy cxl_adapter_idr on module_exit from Johannes Thumshirn
    - destroy afu->contexts_idr on release of an afu from Johannes
      Thumshirn
    - compile with -Werror from Daniel Axtens
    - EEH support from Daniel Axtens
    - plug irq_bitmap getting leaked in cxl_context from Vaibhav Jain
    - add alternate MMIO error handling from Ian Munsie
    - allow release of contexts which have been OPENED but not STARTED
      from Andrew Donnellan
    - remove use of macro DEFINE_PCI_DEVICE_TABLE from Vaishali Thakkar
    - release irqs if memory allocation fails from Vaibhav Jain
    - remove racy attempt to force EEH invocation in reset from Daniel
      Axtens
    - fix + cleanup error paths in cxl_dev_context_init from Ian Munsie
    - fix force unmapping mmaps of contexts allocated through the kernel
      api from Ian Munsie
    - set up and enable PSL Timebase from Philippe Bergheaud

* tag 'powerpc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (140 commits)
  cxl: Set up and enable PSL Timebase
  cxl: Fix force unmapping mmaps of contexts allocated through the kernel api
  cxl: Fix + cleanup error paths in cxl_dev_context_init
  powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail()
  powerpc/pseries: Cleanup on pci_dn_reconfig_notifier()
  powerpc/pseries: Fix corrupted pdn list
  powerpc/powernv: Enable LEDS support
  powerpc/iommu: Set default DMA offset in dma_dev_setup
  cxl: Remove racy attempt to force EEH invocation in reset
  cxl: Release irqs if memory allocation fails
  cxl: Remove use of macro DEFINE_PCI_DEVICE_TABLE
  powerpc/powernv: Fix mis-merge of OPAL support for LEDS driver
  powerpc/powernv: Reset HILE before kexec_sequence()
  powerpc/kexec: Reset secondary cpu endianness before kexec
  powerpc/hvsi: Fix endianness issues in the HVSI driver
  leds/powernv: Add driver for PowerNV platform
  powerpc/powernv: Create LED platform device
  powerpc/powernv: Add OPAL interfaces for accessing and modifying system LED states
  powerpc/powernv: Fix the log message when disabling VF
  cxl: Allow release of contexts which have been OPENED but not STARTED
  ...
2015-09-03 16:41:38 -07:00
Linus Torvalds ca520cab25 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking and atomic updates from Ingo Molnar:
 "Main changes in this cycle are:

   - Extend atomic primitives with coherent logic op primitives
     (atomic_{or,and,xor}()) and deprecate the old partial APIs
     (atomic_{set,clear}_mask())

     The old ops were incoherent with incompatible signatures across
     architectures and with incomplete support.  Now every architecture
     supports the primitives consistently (by Peter Zijlstra)

   - Generic support for 'relaxed atomics':

       - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
       - atomic_read_acquire()
       - atomic_set_release()

     This came out of porting qwrlock code to arm64 (by Will Deacon)

   - Clean up the fragile static_key APIs that were causing repeat bugs,
     by introducing a new one:

       DEFINE_STATIC_KEY_TRUE(name);
       DEFINE_STATIC_KEY_FALSE(name);

     which define a key of different types with an initial true/false
     value.

     Then allow:

       static_branch_likely()
       static_branch_unlikely()

     to take a key of either type and emit the right instruction for the
     case.  To be able to know the 'type' of the static key we encode it
     in the jump entry (by Peter Zijlstra)

   - Static key self-tests (by Jason Baron)

   - qrwlock optimizations (by Waiman Long)

   - small futex enhancements (by Davidlohr Bueso)

   - ... and misc other changes"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
  jump_label/x86: Work around asm build bug on older/backported GCCs
  locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
  locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
  locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
  locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
  locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
  locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
  locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
  locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
  locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
  locking/static_keys: Make verify_keys() static
  jump label, locking/static_keys: Update docs
  locking/static_keys: Provide a selftest
  jump_label: Provide a self-test
  s390/uaccess, locking/static_keys: employ static_branch_likely()
  x86, tsc, locking/static_keys: Employ static_branch_likely()
  locking/static_keys: Add selftest
  locking/static_keys: Add a new static_key interface
  locking/static_keys: Rework update logic
  locking/static_keys: Add static_key_{en,dis}able() helpers
  ...
2015-09-03 15:46:07 -07:00
Linus Torvalds 5e359bf221 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "Rather large, but nothing exiting:

   - new range check for settimeofday() to prevent that boot time
     becomes negative.
   - fix for file time rounding
   - a few simplifications of the hrtimer code
   - fix for the proc/timerlist code so the output of clock realtime
     timers is accurate
   - more y2038 work
   - tree wide conversion of clockevent drivers to the new callbacks"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (88 commits)
  hrtimer: Handle failure of tick_init_highres() gracefully
  hrtimer: Unconfuse switch_hrtimer_base() a bit
  hrtimer: Simplify get_target_base() by returning current base
  hrtimer: Drop return code of hrtimer_switch_to_hres()
  time: Introduce timespec64_to_jiffies()/jiffies_to_timespec64()
  time: Introduce current_kernel_time64()
  time: Introduce struct itimerspec64
  time: Add the common weak version of update_persistent_clock()
  time: Always make sure wall_to_monotonic isn't positive
  time: Fix nanosecond file time rounding in timespec_trunc()
  timer_list: Add the base offset so remaining nsecs are accurate for non monotonic timers
  cris/time: Migrate to new 'set-state' interface
  kernel: broadcast-hrtimer: Migrate to new 'set-state' interface
  xtensa/time: Migrate to new 'set-state' interface
  unicore/time: Migrate to new 'set-state' interface
  um/time: Migrate to new 'set-state' interface
  sparc/time: Migrate to new 'set-state' interface
  sh/localtimer: Migrate to new 'set-state' interface
  score/time: Migrate to new 'set-state' interface
  s390/time: Migrate to new 'set-state' interface
  ...
2015-09-01 14:04:50 -07:00
Linus Torvalds d4c90396ed Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Here is the crypto update for 4.3:

  API:

   - the AEAD interface transition is now complete.
   - add top-level skcipher interface.

  Drivers:

   - x86-64 acceleration for chacha20/poly1305.
   - add sunxi-ss Allwinner Security System crypto accelerator.
   - add RSA algorithm to qat driver.
   - add SRIOV support to qat driver.
   - add LS1021A support to caam.
   - add i.MX6 support to caam"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (163 commits)
  crypto: algif_aead - fix for multiple operations on AF_ALG sockets
  crypto: qat - enable legacy VFs
  MPI: Fix mpi_read_buffer
  crypto: qat - silence a static checker warning
  crypto: vmx - Fixing opcode issue
  crypto: caam - Use the preferred style for memory allocations
  crypto: caam - Propagate the real error code in caam_probe
  crypto: caam - Fix the error handling in caam_probe
  crypto: caam - fix writing to JQCR_MS when using service interface
  crypto: hash - Add AHASH_REQUEST_ON_STACK
  crypto: testmgr - Use new skcipher interface
  crypto: skcipher - Add top-level skcipher interface
  crypto: cmac - allow usage in FIPS mode
  crypto: sahara - Use dmam_alloc_coherent
  crypto: caam - Add support for LS1021A
  crypto: qat - Don't move data inside output buffer
  crypto: vmx - Fixing GHASH Key issue on little endian
  crypto: vmx - Fixing AES-CTR counter bug
  crypto: null - Add missing Kconfig tristate for NULL2
  crypto: nx - Add forward declaration for struct crypto_aead
  ...
2015-08-31 17:38:39 -07:00
Linus Torvalds 26f8b7edc9 PCI changes for the v4.3 merge window:
Enumeration
     Allocate ATS struct during enumeration (Bjorn Helgaas)
     Embed ATS info directly into struct pci_dev (Bjorn Helgaas)
     Reduce size of ATS structure elements (Bjorn Helgaas)
     Stop caching ATS Invalidate Queue Depth (Bjorn Helgaas)
     iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth (Bjorn Helgaas)
     Move MPS configuration check to pci_configure_device() (Bjorn Helgaas)
     Set MPS to match upstream bridge (Keith Busch)
     ARM/PCI: Set MPS before pci_bus_add_devices() (Murali Karicheri)
     Add pci_scan_root_bus_msi() (Lorenzo Pieralisi)
     ARM/PCI, designware, xilinx: Use pci_scan_root_bus_msi() (Lorenzo Pieralisi)
 
   Resource management
     Call pci_read_bridge_bases() from core instead of arch code (Lorenzo Pieralisi)
 
   PCI device hotplug
     pciehp: Remove unused interrupt events (Bjorn Helgaas)
     pciehp: Remove ignored MRL sensor interrupt events (Bjorn Helgaas)
     pciehp: Handle invalid data when reading from non-existent devices (Jarod Wilson)
     pciehp: Simplify pcie_poll_cmd() (Yijing Wang)
     Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot (Yijing Wang)
     Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem (Yijing Wang)
     Hold pci_slot_mutex while searching bus->slots list (Yijing Wang)
 
   Power management
     Disable async suspend/resume for JMicron multi-function SATA/AHCI (Zhang Rui)
 
   Virtualization
     Add ACS quirks for Intel I219-LM/V (Alex Williamson)
     Restore ACS configuration as part of pci_restore_state() (Alexander Duyck)
 
   MSI
     Add pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu)
     x86: Implement pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu)
     Add helpers to manage pci_dev->irq and pci_dev->irq_managed (Jiang Liu)
     Free legacy IRQ when enabling MSI/MSI-X (Jiang Liu)
     ARM/PCI: Remove msi_controller from struct pci_sys_data (Lorenzo Pieralisi)
     Remove unused pcibios_msi_controller() hook (Lorenzo Pieralisi)
 
   Generic host bridge driver
     Remove dependency on ARM-specific struct hw_pci (Jayachandran C)
     Build setup-irq.o for arm64 (Jayachandran C)
     Add arm64 support (Jayachandran C)
 
   APM X-Gene host bridge driver
     Add APM X-Gene PCIe 64-bit prefetchable window (Duc Dang)
     Add support for a 64-bit prefetchable memory window (Duc Dang)
     Drop owner assignment from platform_driver (Krzysztof Kozlowski)
 
   Broadcom iProc host bridge driver
     Allow BCMA bus driver to be built as module (Hauke Mehrtens)
     Delete unnecessary checks before phy calls (Markus Elfring)
     Add arm64 support (Ray Jui)
 
   Synopsys DesignWare host bridge driver
     Don't complain missing *config* reg space if va_cfg0 is set (Murali Karicheri)
 
   TI DRA7xx host bridge driver
     Disable pm_runtime on get_sync failure (Kishon Vijay Abraham I)
     Add PM support (Kishon Vijay Abraham I)
     Clear MSE bit during suspend so clocks will idle (Kishon Vijay Abraham I)
     Add support to make GPIO drive PERST# line (Kishon Vijay Abraham I)
 
   Xilinx AXI host bridge driver
     Check for MSI interrupt flag before handling as INTx (Russell Joyce)
 
   Miscellaneous
     Fix Intersil/Techwell TW686[4589] AV capture class code (Krzysztof Hałasa)
     Use PCI_CLASS_SERIAL_USB instead of bare number (Bjorn Helgaas)
     Fix generic NCR 53c810 class code quirk (Bjorn Helgaas)
     Fix TI816X class code quirk (Bjorn Helgaas)
     Remove unused "pci_probe" flags (Bjorn Helgaas)
     Host bridge driver code simplifications (Fabio Estevam)
     Add dev_flags bit to access VPD through function 0 (Mark Rustad)
     Add VPD function 0 quirk for Intel Ethernet devices (Mark Rustad)
     Kill off set_irq_flags() usage (Rob Herring)
     Remove Intel Cherrytrail D3 delays (Srinidhi Kasagar)
     Clean up pci_find_capability() (Wei Yang)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV5FE/AAoJEFmIoMA60/r8I2QP/R9b9MrvH2i9tN98/lTDl7g3
 czE58ZM1d4kMYtW3Pm/DrYI6y6RprAaB4ZEp5rHxlFLqBPZEQwWodA19NkjECcb6
 g5qKWOdIWA4T6Jaab6a/yCmAFa0jni7iAmmTYqca9o3Xj7tFovxDxqPSYkh+rer0
 v+1sAr/4HXSiN339KR6teEF3VZqLFp6ewMydQlVS+R7kAOHHYQDqoo9WF6JnIoL5
 PO3Kbmr1WN3fZY3s98yLq1x6XmLrLlmGdJI+2r+KewO4r/05CL6wTVP/oTMi+Eti
 dueseeISlOTcTAUhk87Vap23uJPeB/rJbYoFdCr7+0AkZGe/U/E2dpZm2wyMcCvq
 OrATuFymgzIuJm5uUPsdH4lzsX97U9BcDccracfC38rYnP5u3bqHCjw8HJzANR7p
 VYbFBzc5ZCCUYtQAjyrKt2820AvTFo+Bu+z75IsJO8LQQgv/zGtQQ8grIQeAjH+l
 sAe3xOTwzZnq6Obl4qb/GElHmIGUbQ1X4Dx1mliiijKMKkhYHOA0iFnB/OBILmEZ
 wHzKU8chWcI9lip0aaX8q9i/qovdVUt2+rdo/N40l7YY66x4jkNgQQXZX+FSKk6H
 stTvEBQgK28EKCHDxMsgzTGIqllSyk4DnRMA7ij1hRWqdUbGk7wOPTvm9QSwNDWe
 SokuWzAQD9YeMRGdsYjZ
 =DX1r
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "PCI changes for the v4.3 merge window:

  Enumeration:
   - Allocate ATS struct during enumeration (Bjorn Helgaas)
   - Embed ATS info directly into struct pci_dev (Bjorn Helgaas)
   - Reduce size of ATS structure elements (Bjorn Helgaas)
   - Stop caching ATS Invalidate Queue Depth (Bjorn Helgaas)
   - iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth (Bjorn Helgaas)
   - Move MPS configuration check to pci_configure_device() (Bjorn Helgaas)
   - Set MPS to match upstream bridge (Keith Busch)
   - ARM/PCI: Set MPS before pci_bus_add_devices() (Murali Karicheri)
   - Add pci_scan_root_bus_msi() (Lorenzo Pieralisi)
   - ARM/PCI, designware, xilinx: Use pci_scan_root_bus_msi() (Lorenzo Pieralisi)

  Resource management:
   - Call pci_read_bridge_bases() from core instead of arch code (Lorenzo Pieralisi)

  PCI device hotplug:
   - pciehp: Remove unused interrupt events (Bjorn Helgaas)
   - pciehp: Remove ignored MRL sensor interrupt events (Bjorn Helgaas)
   - pciehp: Handle invalid data when reading from non-existent devices (Jarod Wilson)
   - pciehp: Simplify pcie_poll_cmd() (Yijing Wang)
   - Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot (Yijing Wang)
   - Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem (Yijing Wang)
   - Hold pci_slot_mutex while searching bus->slots list (Yijing Wang)

  Power management:
   - Disable async suspend/resume for JMicron multi-function SATA/AHCI (Zhang Rui)

  Virtualization:
   - Add ACS quirks for Intel I219-LM/V (Alex Williamson)
   - Restore ACS configuration as part of pci_restore_state() (Alexander Duyck)

  MSI:
   - Add pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu)
   - x86: Implement pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu)
   - Add helpers to manage pci_dev->irq and pci_dev->irq_managed (Jiang Liu)
   - Free legacy IRQ when enabling MSI/MSI-X (Jiang Liu)
   - ARM/PCI: Remove msi_controller from struct pci_sys_data (Lorenzo Pieralisi)
   - Remove unused pcibios_msi_controller() hook (Lorenzo Pieralisi)

  Generic host bridge driver:
   - Remove dependency on ARM-specific struct hw_pci (Jayachandran C)
   - Build setup-irq.o for arm64 (Jayachandran C)
   - Add arm64 support (Jayachandran C)

  APM X-Gene host bridge driver:
   - Add APM X-Gene PCIe 64-bit prefetchable window (Duc Dang)
   - Add support for a 64-bit prefetchable memory window (Duc Dang)
   - Drop owner assignment from platform_driver (Krzysztof Kozlowski)

  Broadcom iProc host bridge driver:
   - Allow BCMA bus driver to be built as module (Hauke Mehrtens)
   - Delete unnecessary checks before phy calls (Markus Elfring)
   - Add arm64 support (Ray Jui)

  Synopsys DesignWare host bridge driver:
   - Don't complain missing *config* reg space if va_cfg0 is set (Murali Karicheri)

  TI DRA7xx host bridge driver:
   - Disable pm_runtime on get_sync failure (Kishon Vijay Abraham I)
   - Add PM support (Kishon Vijay Abraham I)
   - Clear MSE bit during suspend so clocks will idle (Kishon Vijay Abraham I)
   - Add support to make GPIO drive PERST# line (Kishon Vijay Abraham I)

  Xilinx AXI host bridge driver:
   - Check for MSI interrupt flag before handling as INTx (Russell Joyce)

  Miscellaneous:
   - Fix Intersil/Techwell TW686[4589] AV capture class code (Krzysztof Hałasa)
   - Use PCI_CLASS_SERIAL_USB instead of bare number (Bjorn Helgaas)
   - Fix generic NCR 53c810 class code quirk (Bjorn Helgaas)
   - Fix TI816X class code quirk (Bjorn Helgaas)
   - Remove unused "pci_probe" flags (Bjorn Helgaas)
   - Host bridge driver code simplifications (Fabio Estevam)
   - Add dev_flags bit to access VPD through function 0 (Mark Rustad)
   - Add VPD function 0 quirk for Intel Ethernet devices (Mark Rustad)
   - Kill off set_irq_flags() usage (Rob Herring)
   - Remove Intel Cherrytrail D3 delays (Srinidhi Kasagar)
   - Clean up pci_find_capability() (Wei Yang)"

* tag 'pci-v4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (72 commits)
  PCI: Disable async suspend/resume for JMicron multi-function SATA/AHCI
  PCI: Set MPS to match upstream bridge
  PCI: Move MPS configuration check to pci_configure_device()
  PCI: Drop references acquired by of_parse_phandle()
  PCI/MSI: Remove unused pcibios_msi_controller() hook
  ARM/PCI: Remove msi_controller from struct pci_sys_data
  ARM/PCI, designware, xilinx: Use pci_scan_root_bus_msi()
  PCI: Add pci_scan_root_bus_msi()
  ARM/PCI: Replace panic with WARN messages on failures
  PCI: generic: Add arm64 support
  PCI: Build setup-irq.o for arm64
  PCI: generic: Remove dependency on ARM-specific struct hw_pci
  PCI: imx6: Simplify a trivial if-return sequence
  PCI: spear: Use BUG_ON() instead of condition followed by BUG()
  PCI: dra7xx: Remove unneeded use of IS_ERR_VALUE()
  PCI: Remove pci_ats_enabled()
  PCI: Stop caching ATS Invalidate Queue Depth
  PCI: Move ATS declarations to linux/pci.h so they're all together
  PCI: Clean up ATS error handling
  PCI: Use pci_physfn() rather than looking up physfn by hand
  ...
2015-08-31 17:14:39 -07:00
Gavin Shan 259800135c powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail()
The config space of some PCI devices can't be accessed when their
PEs are in frozen state. Otherwise, fenced PHB might be seen.
Those PEs are identified with flag EEH_PE_CFG_RESTRICTED, meaing
EEH_PE_CFG_BLOCKED is set automatically when the PE is put to
frozen state (EEH_PE_ISOLATED). eeh_slot_error_detail() restores
PCI device BARs with eeh_pe_restore_bars(), which then calls
eeh_ops->restore_config() to reinitialize the PCI device in
(OPAL) firmware. eeh_ops->restore_config() produces PCI config
access that causes fenced PHB. The problem was reported on below
adapter:

   0001:01:00.0 0200: 14e4:168e (rev 10)
   0001:01:00.0 Ethernet controller: Broadcom Corporation \
                NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)

This fixes the issue by skipping eeh_pe_restore_bars() in
eeh_slot_error_detail() when EEH_PE_CFG_BLOCKED is set for the PE.

Fixes: b6541db1 ("powerpc/eeh: Block PCI config access upon frozen PE")
Cc: stable@vger.kernel.org # v4.0+
Reported-by: Manvanthara B. Puttashankar <mputtash@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-28 13:26:31 +10:00
Michael Ellerman 9698351565 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Highlights include 32-bit memcpy/memset optimizations, checksum
optimizations, 85xx config fragments and updates, device tree updates,
e6500 fixes for non-SMP, and misc cleanup and minor fixes."
2015-08-27 20:13:12 +10:00
Guilherme G. Piccoli 4d9aac397a powerpc/PCI: Disable MSI/MSI-X interrupts at PCI probe time in OF case
Since commit 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if
kernel doesn't support MSI"), the setup of dev->msi_cap/msix_cap and the
disable of MSI/MSI-X interrupts isn't being done at PCI probe time, as
the logic responsible for this was moved in the aforementioned commit
from pci_device_add() to pci_setup_device(). The latter function is not
reachable on PowerPC pseries platform during Open Firmware PCI probing
time.

This exhibits as drivers not being able to enable MSI, eg:

  bnx2x 0000:01:00.0: no msix capability found

This patch calls pci_msi_setup_pci_dev() explicitly to disable MSI/MSI-X
during PCI probe time on pSeries platform.

Fixes: 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI")
[mpe: Flesh out change log and clarify comment]
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-26 21:40:50 +10:00
Paul Mackerras b4deba5c41 KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8
This builds on the ability to run more than one vcore on a physical
core by using the micro-threading (split-core) modes of the POWER8
chip.  Previously, only vcores from the same VM could be run together,
and (on POWER8) only if they had just one thread per core.  With the
ability to split the core on guest entry and unsplit it on guest exit,
we can run up to 8 vcpu threads from up to 4 different VMs, and we can
run multiple vcores with 2 or 4 vcpus per vcore.

Dynamic micro-threading is only available if the static configuration
of the cores is whole-core mode (unsplit), and only on POWER8.

To manage this, we introduce a new kvm_split_mode struct which is
shared across all of the subcores in the core, with a pointer in the
paca on each thread.  In addition we extend the core_info struct to
have information on each subcore.  When deciding whether to add a
vcore to the set already on the core, we now have two possibilities:
(a) piggyback the vcore onto an existing subcore, or (b) start a new
subcore.

Currently, when any vcpu needs to exit the guest and switch to host
virtual mode, we interrupt all the threads in all subcores and switch
the core back to whole-core mode.  It may be possible in future to
allow some of the subcores to keep executing in the guest while
subcore 0 switches to the host, but that is not implemented in this
patch.

This adds a module parameter called dynamic_mt_modes which controls
which micro-threading (split-core) modes the code will consider, as a
bitmap.  In other words, if it is 0, no micro-threading mode is
considered; if it is 2, only 2-way micro-threading is considered; if
it is 4, only 4-way, and if it is 6, both 2-way and 4-way
micro-threading mode will be considered.  The default is 6.

With this, we now have secondary threads which are the primary thread
for their subcore and therefore need to do the MMU switch.  These
threads will need to be started even if they have no vcpu to run, so
we use the vcore pointer in the PACA rather than the vcpu pointer to
trigger them.

It is now possible for thread 0 to find that an exit has been
requested before it gets to switch the subcore state to the guest.  In
that case we haven't added the guest's timebase offset to the
timebase, so we need to be careful not to subtract the offset in the
guest exit path.  In fact we just skip the whole path that switches
back to host context, since we haven't switched to the guest context.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-08-22 11:16:17 +02:00
Paul Mackerras ec25716508 KVM: PPC: Book3S HV: Make use of unused threads when running guests
When running a virtual core of a guest that is configured with fewer
threads per core than the physical cores have, the extra physical
threads are currently unused.  This makes it possible to use them to
run one or more other virtual cores from the same guest when certain
conditions are met.  This applies on POWER7, and on POWER8 to guests
with one thread per virtual core.  (It doesn't apply to POWER8 guests
with multiple threads per vcore because they require a 1-1 virtual to
physical thread mapping in order to be able to use msgsndp and the
TIR.)

The idea is that we maintain a list of preempted vcores for each
physical cpu (i.e. each core, since the host runs single-threaded).
Then, when a vcore is about to run, it checks to see if there are
any vcores on the list for its physical cpu that could be
piggybacked onto this vcore's execution.  If so, those additional
vcores are put into state VCORE_PIGGYBACK and their runnable VCPU
threads are started as well as the original vcore, which is called
the master vcore.

After the vcores have exited the guest, the extra ones are put back
onto the preempted list if any of their VCPUs are still runnable and
not idle.

This means that vcpu->arch.ptid is no longer necessarily the same as
the physical thread that the vcpu runs on.  In order to make it easier
for code that wants to send an IPI to know which CPU to target, we
now store that in a new field in struct vcpu_arch, called thread_cpu.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-08-22 11:16:17 +02:00
Samuel Mendoza-Jonas ffebf5f391 powerpc/kexec: Reset secondary cpu endianness before kexec
If the target kernel does not inlcude the FIXUP_ENDIAN check, coming
from a different-endian kernel will cause the target kernel to panic.
All ppc64 kernels can handle starting in big-endian mode, so return to
big-endian before branching into the target kernel.

This mainly affects pseries as secondaries on powernv are returned to
OPAL.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-20 18:19:08 +10:00
Hari Bathini 74943dab6b powerpc/nvram: print no error when pstore backend is not nvram
Pstore only supports one backend at a time. The preferred
pstore backend is set by passing the pstore.backend=<name>
argument to the kernel at boot time. Currently, while trying
to register with pstore, nvram throws an error message even
when "pstore.backend != nvram", which is unnecessary. This
patch removes the error message in case "pstore.backend != nvram".

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-19 16:14:21 +10:00
Andrzej Hajda fc9e9cbf4e powerpc/nvram: use kmemdup rather than duplicating its implementation
The patch was generated using fixed coccinelle semantic patch
scripts/coccinelle/api/memdup.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2014320

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-18 19:34:43 +10:00
Gavin Shan 39bfd715b4 powerpc/eeh: Disable automatically blocked PCI config
pcibios_set_pcie_reset_state() could be called to complete
reset request when passing through PCI device, flag
EEH_PE_ISOLATED is set before saving the PCI config sapce.
On some Broadcom adapters, EEH_PE_CFG_BLOCKED is automatically
set when the flag EEH_PE_ISOLATED is marked. It caused bogus
data saved from the PCI config space, which will be restored
to the PCI adapter after the reset. Eventually, the hardware
can't work with corrupted data in PCI config space.

The patch fixes the issue with eeh_pe_state_mark_no_cfg(), which
doesn't set EEH_PE_CFG_BLOCKED when seeing EEH_PE_ISOLATED on the
PE, in order to avoid the bogus data saved and restored to the PCI
config space.

Reported-by: Rajanikanth H. Adaveeshaiah <rajanikanth.ha@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-18 19:34:42 +10:00
Andrew Donnellan 53522982fc powerpc/powernv: move dma_get_required_mask from pnv_phb to pci_controller_ops
Simplify the dma_get_required_mask call chain by moving it from pnv_phb to
pci_controller_ops, similar to commit 763d2d8df1 ("powerpc/powernv:
Move dma_set_mask from pnv_phb to pci_controller_ops").

Previous call chain:

  0) call dma_get_required_mask() (kernel/dma.c)
  1) call ppc_md.dma_get_required_mask, if it exists. On powernv, that
     points to pnv_dma_get_required_mask() (platforms/powernv/setup.c)
  2) device is PCI, therefore call pnv_pci_dma_get_required_mask()
     (platforms/powernv/pci.c)
  3) call phb->dma_get_required_mask if it exists
  4) it only exists in the ioda case, where it points to
       pnv_pci_ioda_dma_get_required_mask() (platforms/powernv/pci-ioda.c)

New call chain:

  0) call dma_get_required_mask() (kernel/dma.c)
  1) device is PCI, therefore call pci_controller_ops.dma_get_required_mask
     if it exists
  2) in the ioda case, that points to pnv_pci_ioda_dma_get_required_mask()
     (platforms/powernv/pci-ioda.c)

In the p5ioc2 case, the call chain remains the same -
dma_get_required_mask() does not find either a ppc_md call or
pci_controller_ops call, so it calls __dma_get_required_mask().

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-18 19:32:11 +10:00
Kevin Hao e5e55cc08c powerpc/e6500: remove the stale TCD_LOCK macro
Since we moved the "lock" to be the first element of
struct tlb_core_data in commit 82d86de25b ("powerpc/e6500: Make TLB
lock recursive"), this macro is not used by any code. Just delete it.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-08-17 18:53:42 -05:00
Bjorn Helgaas 1f408d5743 Merge branches 'pci/hotplug', 'pci/iommu', 'pci/irq' and 'pci/virtualization' into next
* pci/hotplug:
  PCI: pciehp: Remove ignored MRL sensor interrupt events
  PCI: pciehp: Remove unused interrupt events
  PCI: pciehp: Handle invalid data when reading from non-existent devices
  PCI: Hold pci_slot_mutex while searching bus->slots list
  PCI: Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem
  PCI: pciehp: Simplify pcie_poll_cmd()
  PCI: Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot

* pci/iommu:
  PCI: Remove pci_ats_enabled()
  PCI: Stop caching ATS Invalidate Queue Depth
  PCI: Move ATS declarations to linux/pci.h so they're all together
  PCI: Clean up ATS error handling
  PCI: Use pci_physfn() rather than looking up physfn by hand
  PCI: Inline the ATS setup code into pci_ats_init()
  PCI: Rationalize pci_ats_queue_depth() error checking
  PCI: Reduce size of ATS structure elements
  PCI: Embed ATS info directly into struct pci_dev
  PCI: Allocate ATS struct during enumeration
  iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth

* pci/irq:
  PCI: Kill off set_irq_flags() usage

* pci/virtualization:
  PCI: Add ACS quirks for Intel I219-LM/V
2015-08-14 08:16:29 -05:00
Daniel Axtens e642d11bdb powerpc/eeh: Probe after unbalanced kref check
In the complete hotplug case, EEH PEs are supposed to be released
and set to NULL. Normally, this is done by eeh_remove_device(),
which is called from pcibios_release_device().

However, if something is holding a kref to the device, it will not
be released, and the PE will remain. eeh_add_device_late() has
a check for this which will explictly destroy the PE in this case.

This check in eeh_add_device_late() occurs after a call to
eeh_ops->probe(). On PowerNV, probe is a pointer to pnv_eeh_probe(),
which will exit without probing if there is an existing PE.

This means that on PowerNV, devices with outstanding krefs will not
be rediscovered by EEH correctly after a complete hotplug. This is
affecting CXL (CAPI) devices in the field.

Put the probe after the kref check so that the PE is destroyed
and affected devices are correctly rediscovered by EEH.

Fixes: d91dafc02f ("powerpc/eeh: Delay probing EEH device during hotplug")
Cc: stable@vger.kernel.org
Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14 21:31:49 +10:00
Ingo Molnar f52609fdab Merge branch 'locking/arch-atomic' into locking/core, because it's ready for upstream
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-12 11:44:30 +02:00
Anshuman Khandual 9afac93343 powerpc/prom: Use DRCONF flags while processing detected LMBs
Replace hard coded values with existing DRCONF flags while procesing
detected LMBs from the device tree. Does not change any functionality.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-12 15:05:47 +10:00
Anshuman Khandual 9c61f7a0ad powerpc/prom: Simplify the logic to fetch SLB size
The code to fetch the SLB size from the device tree wants to first look
for "slb-size" and then if that's not found "ibm,slb-size".

We can simplify the code by looking for the properties and then if we
find one of them we set mmu_slb_size.

We also change the function name from check_cpu_slb_size() to
init_mmu_slb_size() as the function doesn't check anything, it only
initialises mmu_slb_size.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
[mpe: Rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-12 15:05:46 +10:00
Dan Williams 92b19ff50e cleanup IORESOURCE_CACHEABLE vs ioremap()
Quoting Arnd:
    I was thinking the opposite approach and basically removing all uses
    of IORESOURCE_CACHEABLE from the kernel. There are only a handful of
    them.and we can probably replace them all with hardcoded
    ioremap_cached() calls in the cases they are actually useful.

All existing usages of IORESOURCE_CACHEABLE call ioremap() instead of
ioremap_nocache() if the resource is cacheable, however ioremap() is
uncached by default. Clearly none of the existing usages care about the
cacheability. Particularly devm_ioremap_resource() never worked as
advertised since it always fell back to plain ioremap().

Clean this up as the new direction we want is to convert
ioremap_<type>() usages to memremap(..., flags).

Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-10 23:07:06 -04:00
Viresh Kumar 37a13e78e0 powerpc/time: Migrate to new 'set-state' interface
Migrate powerpc driver to the new 'set-state' interface provided by
clockevents core, the earlier 'set-mode' interface is marked obsolete
now.

This also enables us to implement callbacks for new states of clockevent
devices, for example: ONESHOT_STOPPED.

We weren't doing anything in ->set_mode(ONSHOT) and so
set_state_oneshot() isn't implemented.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-08-10 11:41:02 +02:00
Scott Wood c60232029a powerpc/fsl: Force coherent memory on e500mc derivatives
In CoreNet systems it is not allowed to mix M and non-M mappings to the
same memory, and coherent DMA accesses are considered to be M mappings
for this purpose.  Ignoring this has been observed to cause hard
lockups in non-SMP kernels on e6500.

Furthermore, e6500 implements the LRAT (logical to real address table)
which allows KVM guests to control the WIMGE bits.  This means that
KVM cannot force the M bit on the way it usually does, so the guest had
better set it itself.

Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-08-07 23:00:01 -05:00
Amanieu d'Antras 3c00cb5e68 signal: fix information leak in copy_siginfo_from_user32
This function can leak kernel stack data when the user siginfo_t has a
positive si_code value.  The top 16 bits of si_code descibe which fields
in the siginfo_t union are active, but they are treated inconsistently
between copy_siginfo_from_user32, copy_siginfo_to_user32 and
copy_siginfo_to_user.

copy_siginfo_from_user32 is called from rt_sigqueueinfo and
rt_tgsigqueueinfo in which the user has full control overthe top 16 bits
of si_code.

This fixes the following information leaks:
x86:   8 bytes leaked when sending a signal from a 32-bit process to
       itself. This leak grows to 16 bytes if the process uses x32.
       (si_code = __SI_CHLD)
x86:   100 bytes leaked when sending a signal from a 32-bit process to
       a 64-bit process. (si_code = -1)
sparc: 4 bytes leaked when sending a signal from a 32-bit process to a
       64-bit process. (si_code = any)

parsic and s390 have similar bugs, but they are not vulnerable because
rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code
to a different process.  These bugs are also fixed for consistency.

Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-07 04:39:40 +03:00
Naveen N. Rao 197165d449 powerpc/ftrace: add powerpc timebase as a trace clock source
Add a new powerpc-specific trace clock using the timebase register,
similar to x86-tsc. This gives us
- a fast, monotonic, hardware clock source for trace entries, and
- a clock that can be used to correlate events across cpus as well as across
  hypervisor and guests.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-06 16:36:23 +10:00
Joe Perches a825ac078b powerpc: Remove redundant breaks
break; break; isn't useful.

Remove one.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-06 15:10:20 +10:00
Kevin Hao ae2a84b407 powerpc: pci: use %pR for printing struct resource
Use %pR to simplify the debug code. This also make the debug info more
readable.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
[mpe: Unsplit multi-line printk strings]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-06 15:10:19 +10:00
Peter Zijlstra 76b235c6bc jump_label: Rename JUMP_LABEL_{EN,DIS}ABLE to JUMP_LABEL_{JMP,NOP}
Since we've already stepped away from ENABLE is a JMP and DISABLE is a
NOP with the branch_default bits, and are going to make it even worse,
rename it to make it all clearer.

This way we don't mix multiple levels of logic attributes, but have a
plain 'physical' name for what the current instruction patching status
of a jump label is.

This is a first step in removing the naming confusion that has led to
a stream of avoidable bugs such as:

  a833581e37 ("x86, perf: Fix static_key bug in load_mm_cr4()")

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
[ Beefed up the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-03 11:34:12 +02:00
Yijing Wang 017ffe64e8 PCI: Hold pci_slot_mutex while searching bus->slots list
Previously, pci_setup_device() and similar functions searched the
pci_bus->slots list without any locking.  It was possible for another
thread to update the list while we searched it.

Add pci_dev_assign_slot() to search the list while holding pci_slot_mutex.

[bhelgaas: changelog, fold in CONFIG_SYSFS fix]
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-07-30 16:19:53 -05:00
Michael Ellerman 2449acc534 powerpc/kernel: Enable seccomp filter
This commit enables seccomp filter on powerpc, now that we have all the
necessary pieces in place.

To support seccomp's desire to modify the syscall return value under
some circumstances, we use a different ABI to the ptrace ABI. That is we
use r3 as the syscall return value, and orig_gpr3 is the first syscall
parameter.

This means the seccomp code, or a ptracer via SECCOMP_RET_TRACE, will
see -ENOSYS preloaded in r3. This is identical to the behaviour on x86,
and allows seccomp or the ptracer to either leave the -ENOSYS or change
it to something else, as well as rejecting or not the syscall by
modifying r0.

If seccomp does not reject the syscall, we restore the register state to
match what ptrace and audit expect, ie. r3 is the first syscall
parameter again. We do this restore using orig_gpr3, which may have been
modified by seccomp, which allows seccomp to modify the first syscall
paramater and allow the syscall to proceed.

We need to #ifdef the the additional handling of r3 for seccomp, so move
it all out of line.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Kees Cook <keescook@chromium.org>
2015-07-30 14:34:44 +10:00
Michael Ellerman 1b60bab04e powerpc/kernel: Add SIG_SYS support for compat tasks
SIG_SYS was added in commit a0727e8ce5 "signal, x86: add SIGSYS info
and make it synchronous."

Because we use the asm-generic struct siginfo, we got support for
SIG_SYS for free as part of that commit.

However there was no compat handling added for powerpc. That means we've
been advertising the existence of signfo._sifields._sigsys to compat
tasks, but not actually filling in the fields correctly.

Luckily it looks like no one has noticed, presumably because the only
user of SIGSYS in the kernel is seccomp filter, which we don't support
yet.

So before we enable seccomp filter, add compat handling for SIGSYS.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Kees Cook <keescook@chromium.org>
2015-07-29 11:56:13 +10:00
Michael Ellerman d38374142b powerpc/kernel: Change the do_syscall_trace_enter() API
The API for calling do_syscall_trace_enter() is currently sensible
enough, it just returns the (modified) syscall number.

However once we enable seccomp filter it will get more complicated. When
seccomp filter runs, the seccomp kernel code (via SECCOMP_RET_ERRNO), or
a ptracer (via SECCOMP_RET_TRACE), may reject the syscall and *may* or may
*not* set a return value in r3.

That means the assembler that calls do_syscall_trace_enter() can not
blindly return ENOSYS, it needs to only return ENOSYS if a return value
has not already been set.

There is no way to implement that logic with the current API. So change
the do_syscall_trace_enter() API to make it deal with the return code
juggling, and the assembler can then just return whatever return code it
is given.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Kees Cook <keescook@chromium.org>
2015-07-29 11:56:11 +10:00
Michael Ellerman c3525940cc powerpc/kernel: Switch to using MAX_ERRNO
Currently on powerpc we have our own #define for the highest (negative)
errno value, called _LAST_ERRNO. This is defined to be 516, for reasons
which are not clear.

The generic code, and x86, use MAX_ERRNO, which is defined to be 4095.

In particular seccomp uses MAX_ERRNO to restrict the value that a
seccomp filter can return.

Currently with the mismatch between _LAST_ERRNO and MAX_ERRNO, a seccomp
tracer wanting to return 600, expecting it to be seen as an error, would
instead find on powerpc that userspace sees a successful syscall with a
return value of 600.

To avoid this inconsistency, switch powerpc to use MAX_ERRNO.

We are somewhat confident that generic syscalls that can return a
non-error value above negative MAX_ERRNO have already been updated to
use force_successful_syscall_return().

I have also checked all the powerpc specific syscalls, and believe that
none of them expect to return a non-error value between -MAX_ERRNO and
-516. So this change should be safe ...

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Kees Cook <keescook@chromium.org>
2015-07-29 11:56:11 +10:00
Peter Zijlstra de9e432cb5 atomic: Collapse all atomic_{set,clear}_mask definitions
Move the now generic definitions of atomic_{set,clear}_mask() into
linux/atomic.h to avoid endless and pointless repetition.

Also, provide an atomic_andnot() wrapper for those few archs that can
implement that.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-27 14:06:24 +02:00
Lorenzo Pieralisi dff22d2054 PCI: Call pci_read_bridge_bases() from core instead of arch code
When we scan a PCI bus, we read PCI-PCI bridge window registers with
pci_read_bridge_bases() so we can validate the resource hierarchy.  Most
architectures call pci_read_bridge_bases() from pcibios_fixup_bus(), but
PCI-PCI bridges are not arch-specific, so this doesn't need to be in
arch-specific code.

Call pci_read_bridge_bases() directly from the PCI core instead of from
arch code.

For alpha and mips, we now call pci_read_bridge_bases() always; previously
we only called it if PCI_PROBE_ONLY was set.

[bhelgaas: changelog]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: James E.J. Bottomley <jejb@parisc-linux.org>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: David Howells <dhowells@redhat.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: Tony Luck <tony.luck@intel.com>
CC: David S. Miller <davem@davemloft.net>
CC: Ingo Molnar <mingo@redhat.com>
CC: Guenter Roeck <linux@roeck-us.net>
CC: Michal Simek <monstr@monstr.eu>
CC: Chris Zankel <chris@zankel.net>
2015-07-23 10:13:29 -05:00
Thomas Huth 1c2cb59444 powerpc/rtas: Introduce rtas_get_sensor_fast() for IRQ handlers
The EPOW interrupt handler uses rtas_get_sensor(), which in turn
uses rtas_busy_delay() to wait for RTAS becoming ready in case it
is necessary. But rtas_busy_delay() is annotated with might_sleep()
and thus may not be used by interrupts handlers like the EPOW handler!
This leads to the following BUG when CONFIG_DEBUG_ATOMIC_SLEEP is
enabled:

 BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:496
 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc2-thuth #6
 Call Trace:
 [c00000007ffe7b90] [c000000000807670] dump_stack+0xa0/0xdc (unreliable)
 [c00000007ffe7bc0] [c0000000000e1f14] ___might_sleep+0x134/0x180
 [c00000007ffe7c20] [c00000000002aec0] rtas_busy_delay+0x30/0xd0
 [c00000007ffe7c50] [c00000000002bde4] rtas_get_sensor+0x74/0xe0
 [c00000007ffe7ce0] [c000000000083264] ras_epow_interrupt+0x44/0x450
 [c00000007ffe7d90] [c000000000120260] handle_irq_event_percpu+0xa0/0x300
 [c00000007ffe7e70] [c000000000120524] handle_irq_event+0x64/0xc0
 [c00000007ffe7eb0] [c000000000124dbc] handle_fasteoi_irq+0xec/0x260
 [c00000007ffe7ef0] [c00000000011f4f0] generic_handle_irq+0x50/0x80
 [c00000007ffe7f20] [c000000000010f3c] __do_irq+0x8c/0x200
 [c00000007ffe7f90] [c0000000000236cc] call_do_irq+0x14/0x24
 [c00000007e6f39e0] [c000000000011144] do_IRQ+0x94/0x110
 [c00000007e6f3a30] [c000000000002594] hardware_interrupt_common+0x114/0x180

Fix this issue by introducing a new rtas_get_sensor_fast() function
that does not use rtas_busy_delay() - and thus can only be used for
sensors that do not cause a BUSY condition - known as "fast" sensors.

The EPOW sensor is defined to be "fast" in sPAPR - mpe.

Fixes: 587f83e8dd ("powerpc/pseries: Use rtas_get_sensor in RAS code")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-23 19:43:11 +10:00
Thomas Huth 9ef03193a9 powerpc/rtas: Replace magic values with defines
rtas.h already has some nice #defines for RTAS return status
codes - let's use them instead of hard-coded "magic" values!

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-23 16:27:05 +10:00
Anshuman Khandual 2476c09f39 powerpc/signal: Add helper function to fetch quad word aligned pointer
This patch adds one helper function 'sigcontext_vmx_regs' which computes
quad word aligned pointer for 'vmx_reserve' array element in sigcontext
structure making the code more readable.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
[mpe: Reword comment and fix build for CONFIG_ALTIVEC=n]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-21 11:38:29 +10:00
Anshuman Khandual 829023df86 powerpc/tm: Drop tm_orig_msr from thread_struct
Currently tm_orig_msr is getting used during process context switch only.
Then there is ckpt_regs which saves the checkpointed userspace context
The MSR slot contained in ckpt_regs structure can be used during process
context switch instead of tm_orig_msr, thus allowing us to drop it from
thread_struct structure. This patch does that change.

Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-16 16:02:37 +10:00
Leonidas Da Silva Barbosa 72cd7b44bc powerpc: Uncomment and make enable_kernel_vsx() routine available
enable_kernel_vsx() function was commented since anything was using
it. However, vmx-crypto driver uses VSX instructions which are
only available if VSX is enable. Otherwise it rises an exception oops.

This patch uncomment enable_kernel_vsx() routine and makes it available.

Signed-off-by: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-07-14 14:56:48 +08:00
Michael Ellerman e8a4fd0afe powerpc: Add macros for the ibm_architecture_vec[] lengths
The encoding of the lengths in the ibm_architecture_vec array is
"interesting" to say the least. It's non-obvious how the number of bytes
we provide relates to the length value.

In fact we already got it wrong once, see 11e9ed43ca "Fix up
ibm_architecture_vec definition".

So add some macros to make it (hopefully) clearer. These at least have
the property that the integer present in the code is equal to the number
of bytes that follows it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-07-13 15:46:04 +10:00
Benjamin Herrenschmidt 817820b022 powerpc/iommu: Support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask
This patch adds the ability to the DMA direct ops to fallback to the IOMMU
ops for coherent alloc/free if the coherent mask of the device isn't
suitable for accessing the direct DMA space and the device also happens
to have an active IOMMU table.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-13 10:10:55 +10:00
Shreyas B. Prabhu b32aadc1a8 powerpc/powernv: Fix race in updating core_idle_state
core_idle_state is maintained for each core. It uses 0-7 bits to track
whether a thread in the core has entered fastsleep or winkle. 8th bit is
used as a lock bit.
The lock bit is set in these 2 scenarios-
 - The thread is first in subcore to wakeup from sleep/winkle.
 - If its the last thread in the core about to enter sleep/winkle

While the lock bit is set, if any other thread in the core wakes up, it
loops until the lock bit is cleared before proceeding in the wakeup
path. This helps prevent race conditions w.r.t fastsleep workaround and
prevents threads from switching to process context before core/subcore
resources are restored.

But, in the path to sleep/winkle entry, we currently don't check for
lock-bit. This exposes us to following race when running with subcore
on-

First thread in the subcorea		Another thread in the same
waking up		   		core entering sleep/winkle

lwarx   r15,0,r14
ori     r15,r15,PNV_CORE_IDLE_LOCK_BIT
stwcx.  r15,0,r14
[Code to restore subcore state]

						lwarx   r15,0,r14
						[clear thread bit]
						stwcx.  r15,0,r14

andi.   r15,r15,PNV_CORE_IDLE_THREAD_BITS
stw     r15,0(r14)

Here, after the thread entering sleep clears its thread bit in
core_idle_state, the value is overwritten by the thread waking up.
In such cases when the core enters fastsleep, code mistakes an idle
thread as running. Because of this, the first thread waking up from
fastsleep which is supposed to resync timebase skips it. So we can
end up having a core with stale timebase value.

This patch fixes the above race by looping on the lock bit even while
entering the idle states.

Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Fixes: 7b54e9f213f76 'powernv/powerpc: Add winkle support for offline cpus'
Cc: stable@vger.kernel.org # 3.19+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-07 10:16:52 +10:00
Daniel Axtens 27ea2c420c powerpc: Set the correct kernel taint on machine check errors.
This means the 'M' flag will work properly when the kernel prints a backtrace.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-06 20:24:35 +10:00
Linus Torvalds 2d4407079c Replace module_init with equivalent device_initcall in non modules.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVkO5XAAoJEOvOhAQsB9HWe4cQAJcsmSXIDN2O6oxvgH8Wilof
 EIEMvT13uwBdsjQdYUY6A6B3iUV9wzEEgoosg/JRgpz5/b1FTDMIO4arUPD3Lcak
 5bmyVO2qAT+yaLAWSgn6I8DMplXrKiEuK+TkH/mW3p9TdvElLdG3Vg6UI407hSWv
 W0QbVwkNtv8XmzshV9F2YdmflT8j1PgYxIu/tEkVOWn37DNW+Fp2OVBrdTIYp3AJ
 X6bYZPEcQDCrWWW/s2GmIDrNgryiebasns+CAgGY21262jAYaRcFOJmR47AsTqW7
 DSZXIlLc/gJca++hfxqV15RZ4NRHxrebCypTsPtZUV7ZiYHI726eeUZzxsp/9itu
 mvhmi+aQUTTUP3dDhiv05f4syAKEb4zslT6SLwcna6oi09M97HfCeQsHqhcFq/MG
 KnS2JJoJQToQtJvMUXMQzp5hyHjNlOclIvCxEiL32EZU54PeJOKasy/mptNGEctk
 TxACWvoXBQglRaVN+1wIjjr0BaHJSuJa9CUnIfM4WZdSHiMQMx00XLTkZcTnSM6R
 12pE54vVolrXswGPJhy4W/Sf1yPSW1tkWSVBbkKLyCIrlAWJtu68rXhvwhG/nz6E
 3g6QrDEQGlk6bzUH4CJCEqXLPRN1bNS0XjdkEFh60Lury3Ns5yHKZXPW5vCQ5csr
 FQNUyBs595CWbJNfbn1n
 =0BDx
 -----END PGP SIGNATURE-----

Merge tag 'module_init-device_initcall-v4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux

Pull module_init replacement part one from Paul Gortmaker:
 "Replace module_init with equivalent device_initcall in non modules.

  This series of commits converts non-modular code that is using the
  module_init() call to hook itself into the system to instead use
  device_initcall().

  The conversion is a runtime no-op, since module_init actually becomes
  __initcall in the non-modular case, and that in turn gets mapped onto
  device_initcall.  A couple files show a larger negative diffstat,
  representing ones that had a module_exit function that we remove here
  vs previously relying on the linker to dispose of it.

  We make this conversion now, so that we can relocate module_init from
  init.h into module.h in the future.

  The files changed here are just limited to those that would otherwise
  have to add module.h to obviously non-modular code, in order to avoid
  a compile fail, as testing has shown"

* tag 'module_init-device_initcall-v4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
  MIPS: don't use module_init in non-modular cobalt/mtd.c file
  drivers/leds: don't use module_init in non-modular leds-cobalt-raq.c
  cris: don't use module_init for non-modular core eeprom.c code
  tty/metag_da: Avoid module_init/module_exit in non-modular code
  drivers/clk: don't use module_init in clk-nomadik.c which is non-modular
  xtensa: don't use module_init for non-modular core network.c code
  sh: don't use module_init in non-modular psw.c code
  mn10300: don't use module_init in non-modular flash.c code
  parisc64: don't use module_init for non-modular core perf code
  parisc: don't use module_init for non-modular core pdc_cons code
  cris: don't use module_init for non-modular core intmem.c code
  ia64: don't use module_init in non-modular sim/simscsi.c code
  ia64: don't use module_init for non-modular core kernel/mca.c code
  arm: don't use module_init in non-modular mach-vexpress/spc.c code
  powerpc: don't use module_init in non-modular 83xx suspend code
  powerpc: use device_initcall for registering rtc devices
  x86: don't use module_init in non-modular devicetree.c code
  x86: don't use module_init in non-modular intel_mid_vrtc.c
2015-07-02 10:30:48 -07:00
Linus Torvalds 4da3064d17 Devicetree changes for v4.2
A whole lot of bug fixes. Nothing stands out here except the ability to
 enable CONFIG_OF on every architecture, and an import of a newer version
 of dtc.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVlAkwAAoJEMWQL496c2LNNYMP/23EdDPyRneoaIynd0nNk9SO
 UfhOSJdSo7vMmT9Rea2eBHdn3leJrx9m9JXvIrBwGdcDxMNsS4mS1k9Bj63aqEVn
 kK+IrI1Jbx7F6/AlBh3u4nHixIjoTc3IWlFdxUTBKQ2ATYKmCVhVCsf6UyfSxAj+
 xPL6bmALegEZ2kJzK+qhk6K0j7GeQDnk1SAS3xMvTpJH76Ac2F+Gi9u7J68GqXAS
 d7WBCAjijkqskfAdeP13XasvSdU7ZCOnDjClwJd83ZQGmtp77T8PWF0lzLlnC8Ho
 sMwDhoWHnCtFP0U1hnhUF1pXhhn8W9NlxymtYbxR1tJcku0fSiYlibZ6jnzTRc2m
 TsqzaWDR3U/VX4t5wH5FtXM1Cum/eAfV6HX9fGXeYYP7Einl7Kg6yXYjIY+b7HG9
 R3znQ2TKoYPsUr/WWXrZK52ZTesTe+LG98WYH1YhNbZ5riev9fLZxI2zMl/h83/Z
 LrF0g0MLQobHuBCUSIXSUot6RTQgLzFWHtnSrNOUycMwlRNZHYOY3DSvzLYLw+hJ
 XwV9p2k3DV/l/XnQJPy3y/MA+7jEudzlq7HukmtYVhh9rOy3y+Sq3GMGAiUFjAqj
 YDxBrrIpoPWNp/OJJX2yhnTvnNaV/BjhCB1CiJooFCjHz78I5daqBXO155hn9msY
 7To1PHvyEngabBpdN/MZ
 =tm5y
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux

Pull devicetree updates from Grant Likely:
 "A whole lot of bug fixes.

  Nothing stands out here except the ability to enable CONFIG_OF on
  every architecture, and an import of a newer version of dtc"

* tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux: (22 commits)
  of/irq: Rename "intc_desc" to "of_intc_desc" to fix OF on sh
  of/irq: Fix pSeries boot failure
  Documentation: DT: Fix a typo in the filename "lantiq,<chip>-pinumx.txt"
  of: define of_find_node_by_phandle for !CONFIG_OF
  of/address: use atomic allocation in pci_register_io_range()
  of: Add vendor prefix for Zodiac Inflight Innovations
  dt/fdt: add empty versions of early_init_dt_*_memory_arch
  of: clean-up unnecessary libfdt include paths
  of: make unittest select OF_EARLY_FLATTREE instead of depend on it
  of: make CONFIG_OF user selectable
  MIPS: prepare for user enabling of CONFIG_OF
  of/fdt: fix argument name and add comments of unflatten_dt_node()
  of: return NUMA_NO_NODE from fallback of_node_to_nid()
  tps6507x.txt: Remove executable permission
  of/overlay: Grammar s/an negative/a negative/
  of/fdt: Make fdt blob input parameters of unflatten functions const
  of: add helper function to retrive match data
  of: Grammar s/property exist/property exists/
  of: Move OF flags to be visible even when !CONFIG_OF
  scripts/dtc: Update to upstream version 9d3649bd3be245c9
  ...
2015-07-01 19:40:18 -07:00
Grant Likely becfc3c86d Merge remote-tracking branch 'robh/for-next' into devicetree/next 2015-06-30 14:28:52 +01:00
Akinobu Mita 5935877af4 powerpc: use for_each_sg()
This replaces the plain loop over the sglist array with for_each_sg()
macro which consists of sg_next() function calls.  Since powerpc does
select ARCH_HAS_SG_CHAIN, it is necessary to use for_each_sg() in order
to loop over each sg element.  This also help find problems with drivers
that do not properly initialize their sg tables when CONFIG_DEBUG_SG is
enabled.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:38 -07:00
Linus Torvalds e3d8238d7f arm64 updates for 4.2, mostly refactoring/clean-up:
- CPU ops and PSCI (Power State Coordination Interface) refactoring
   following the merging of the arm64 ACPI support, together with
   handling of Trusted (secure) OS instances
 
 - Using fixmap for permanent FDT mapping, removing the initial dtb
   placement requirements (within 512MB from the start of the kernel
   image). This required moving the FDT self reservation out of the
   memreserve processing
 
 - Idmap (1:1 mapping used for MMU on/off) handling clean-up
 
 - Removing flush_cache_all() - not safe on ARM unless the MMU is off.
   Last stages of CPU power down/up are handled by firmware already
 
 - "Alternatives" (run-time code patching) refactoring and support for
   immediate branch patching, GICv3 CPU interface access
 
 - User faults handling clean-up
 
 And some fixes:
 
 - Fix for VDSO building with broken ELF toolchains
 
 - Fixing another case of init_mm.pgd usage for user mappings (during
   ASID roll-over broadcasting)
 
 - Fix for FPSIMD reloading after CPU hotplug
 
 - Fix for missing syscall trace exit
 
 - Workaround for .inst asm bug
 
 - Compat fix for switching the user tls tpidr_el0 register
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVitZgAAoJEGvWsS0AyF7x+ToP/0Yci5bNsYVwVay8N1rK6WHh
 SGzDMzyxcSBjQpz2IhhTJ28eTAEH+a+HWQms+0tBehjqxqkvjuzBN0okDkc/z8NB
 7Z0BV2aLkQcMwTbjgIh5jm25ZpGmvmvbWPD5oBwgmgQ4v4i1OLRKgx7+YQ+z9rWZ
 zC70d0UwyWjs2oxmjd2ZrAkps7v/qozEFhcRHxLzCn8Mbw+3FcTQsqMbfnoWGnH0
 YuGfHQQqBY4/HC7uAslMCy7tXeJXqb+NsgrnAovjfEbHGDjsg0KNl06K++LHwE37
 A5Noa3M0AQEPYqx/sg0Ec8RNUUEMB4RA2DCaibp7XlVGncXOwFfiyk/M5uVrYXIO
 ku5QF0ytUfZKzrMq3yQKbEDuCPOFTqjjdVpkeXKFdW66zYTohKVc3vUBV5xHZ5uO
 7Kr8H0ZnhAv3OxPyKdEwAuQ5sJdWwQSvZyGClxMUO4eC/UzD0Mwwf1Y8WYtiAXx+
 NSTeBKw/m33W3/KhNuNH1+qGEOKhuXuKX7AcYA84Rab8ytxYWcurHCG2bmhzpEse
 2DZtNMybrP/HMQPyJlYgGac8B3QbsAIAkkU1f+dJTAv9otuBDhscaDQyZ9Y6WVht
 /k8zJiZeMEuGAmwgTkzLmWs/8pQq42nW4J4eQdXPZAwp4ghCIypPWfaZASAwee6/
 p+es3v8P4k9wkv2TFZMh
 =YeGl
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "Mostly refactoring/clean-up:

   - CPU ops and PSCI (Power State Coordination Interface) refactoring
     following the merging of the arm64 ACPI support, together with
     handling of Trusted (secure) OS instances

   - Using fixmap for permanent FDT mapping, removing the initial dtb
     placement requirements (within 512MB from the start of the kernel
     image).  This required moving the FDT self reservation out of the
     memreserve processing

   - Idmap (1:1 mapping used for MMU on/off) handling clean-up

   - Removing flush_cache_all() - not safe on ARM unless the MMU is off.
     Last stages of CPU power down/up are handled by firmware already

   - "Alternatives" (run-time code patching) refactoring and support for
     immediate branch patching, GICv3 CPU interface access

   - User faults handling clean-up

  And some fixes:

   - Fix for VDSO building with broken ELF toolchains

   - Fix another case of init_mm.pgd usage for user mappings (during
     ASID roll-over broadcasting)

   - Fix for FPSIMD reloading after CPU hotplug

   - Fix for missing syscall trace exit

   - Workaround for .inst asm bug

   - Compat fix for switching the user tls tpidr_el0 register"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (42 commits)
  arm64: use private ratelimit state along with show_unhandled_signals
  arm64: show unhandled SP/PC alignment faults
  arm64: vdso: work-around broken ELF toolchains in Makefile
  arm64: kernel: rename __cpu_suspend to keep it aligned with arm
  arm64: compat: print compat_sp instead of sp
  arm64: mm: Fix freeing of the wrong memmap entries with !SPARSEMEM_VMEMMAP
  arm64: entry: fix context tracking for el0_sp_pc
  arm64: defconfig: enable memtest
  arm64: mm: remove reference to tlb.S from comment block
  arm64: Do not attempt to use init_mm in reset_context()
  arm64: KVM: Switch vgic save/restore to alternative_insn
  arm64: alternative: Introduce feature for GICv3 CPU interface
  arm64: psci: fix !CONFIG_HOTPLUG_CPU build warning
  arm64: fix bug for reloading FPSIMD state after CPU hotplug.
  arm64: kernel thread don't need to save fpsimd context.
  arm64: fix missing syscall trace exit
  arm64: alternative: Work around .inst assembler bugs
  arm64: alternative: Merge alternative-asm.h into alternative.h
  arm64: alternative: Allow immediate branch as alternative instruction
  arm64: Rework alternate sequence for ARM erratum 845719
  ...
2015-06-24 10:02:15 -07:00
Linus Torvalds 08d183e3c1 powerpc updates for 4.2
- Disable the 32-bit vdso when building LE, so we can build with a 64-bit only
    toolchain.
  - EEH fixes from Gavin & Richard.
  - Enable the sys_kcmp syscall from Laurent.
  - Sysfs control for fastsleep workaround from Shreyas.
  - Expose OPAL events as an irq chip by Alistair.
  - MSI ops moved to pci_controller_ops by Daniel.
  - Fix for kernel to userspace backtraces for perf from Anton.
  - Merge pseries and pseries_le defconfigs from Cyril.
  - CXL in-kernel API from Mikey.
  - OPAL prd driver from Jeremy.
  - Fix for DSCR handling & tests from Anshuman.
  - Powernv flash mtd driver from Cyril.
  - Dynamic DMA Window support on powernv from Alexey.
  - LLVM clang fixes & workarounds from Anton.
  - Reworked version of the patch to abort syscalls when transactional.
  - Fix the swap encoding to support 4TB, from Aneesh.
  - Various fixes as usual.
  - Freescale updates from Scott: Highlights include more 8xx optimizations, an
    e6500 hugetlb optimization, QMan device tree nodes, t1024/t1023 support, and
    various fixes and cleanup.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJViSZqAAoJEFHr6jzI4aWAA7kQAKq3+pejfo2rY7alpKJyeVao
 vlaIEaDNOTh+ctcmu3MFF9Jy6fai8gNZziRXU5JRmE5RW4GVBN4KZiqXRbkVjdBK
 uG9sCX7Y58VRsS2vnGBYLsamfTMgjaXeDvgunQHVLiechJnrDr0RHEK90F3LSi73
 Axp6l8XIG63a3zFZmkhzANMCme2lm5+MWmGlSjUUNi5F+viQUgJc5iiO8xrVUgM5
 RpNlV2NJSqFiU+gMQWJ226V85UIniouq4j+qtyUcu8/m9BberyolXVU0GPlPFdsx
 r/Qh9uCJyZaUdSB5hzomQZj50IsSz6J6nEuJTeGRoVZOmeI8Dnc2xU9fxQF5fC8H
 lUJw10WPoNOggQZTeSUKn7wTXw3i4p3KsWNUczaW68VJdhqZUVaSp0+I6mnDSqzs
 9iGC+VffLYNa1OHq7mGRFrgDdLBCHes31aZ3CxlQsmyNpAPCwMzsD4TUfVnvOG6E
 oJOeaQ4mZM9PvqxEYJfoIL+vgRxmQ8sdIBtNY4in+C7J6eFnZNFO9xmPnJZuVU31
 PGtx60kjFCOVMXvqn34WkRNbgqGWI91IK0KcRwFO2LXVio1uY77TWL52kNK2IMsp
 Az+VDDvqnT3+BoV1yz0P6SrXAkwTpvFk2y+IdmEiUUN7zZFL5ZSA2epej9AzHTAK
 WID2bc5yVtIL6p6x5ICH
 =d9Wh
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux

Pull powerpc updates from Michael Ellerman:

 - disable the 32-bit vdso when building LE, so we can build with a
   64-bit only toolchain.

 - EEH fixes from Gavin & Richard.

 - enable the sys_kcmp syscall from Laurent.

 - sysfs control for fastsleep workaround from Shreyas.

 - expose OPAL events as an irq chip by Alistair.

 - MSI ops moved to pci_controller_ops by Daniel.

 - fix for kernel to userspace backtraces for perf from Anton.

 - merge pseries and pseries_le defconfigs from Cyril.

 - CXL in-kernel API from Mikey.

 - OPAL prd driver from Jeremy.

 - fix for DSCR handling & tests from Anshuman.

 - Powernv flash mtd driver from Cyril.

 - dynamic DMA Window support on powernv from Alexey.

 - LLVM clang fixes & workarounds from Anton.

 - reworked version of the patch to abort syscalls when transactional.

 - fix the swap encoding to support 4TB, from Aneesh.

 - various fixes as usual.

 - Freescale updates from Scott: Highlights include more 8xx
   optimizations, an e6500 hugetlb optimization, QMan device tree nodes,
   t1024/t1023 support, and various fixes and cleanup.

* tag 'powerpc-4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (180 commits)
  cxl: Fix typo in debug print
  cxl: Add CXL_KERNEL_API config option
  powerpc/powernv: Fix wrong IOMMU table in pnv_ioda_setup_bus_dma()
  powerpc/mm: Change the swap encoding in pte.
  powerpc/mm: PTE_RPN_MAX is not used, remove the same
  powerpc/tm: Abort syscalls in active transactions
  powerpc/iommu/ioda2: Enable compile with IOV=on and IOMMU_API=off
  powerpc/include: Add opal-prd to installed uapi headers
  powerpc/powernv: fix construction of opal PRD messages
  powerpc/powernv: Increase opal-irqchip initcall priority
  powerpc: Make doorbell check preemption safe
  powerpc/powernv: pnv_init_idle_states() should only run on powernv
  macintosh/nvram: Remove as unused
  powerpc: Don't use gcc specific options on clang
  powerpc: Don't use -mno-strict-align on clang
  powerpc: Only use -mtraceback=no, -mno-string and -msoft-float if toolchain supports it
  powerpc: Only use -mabi=altivec if toolchain supports it
  powerpc: Fix duplicate const clang warning in user access code
  vfio: powerpc/spapr: Support Dynamic DMA windows
  vfio: powerpc/spapr: Register memory and define IOMMU v2
  ...
2015-06-24 08:46:32 -07:00
Linus Torvalds d8133356e9 PCI changes for the v4.2 merge window:
Enumeration
     - Move pci_ari_enabled() to global header (Alex Williamson)
     - Account for ARI in _PRT lookups (Alex Williamson)
     - Remove unused pci_scan_bus_parented() (Yijing Wang)
 
   Resource management
     - Use host bridge _CRS info on systems with >32 bit addressing (Bjorn Helgaas)
     - Use host bridge _CRS info on Foxconn K8M890-8237A (Bjorn Helgaas)
     - Fix pci_address_to_pio() conversion of CPU address to I/O port (Zhichang Yuan)
     - Add pci_bus_addr_t (Yinghai Lu)
 
   PCI device hotplug
     - Wait for pciehp command completion where necessary (Alex Williamson)
     - Drop pointless ACPI-based "slot detection" check (Rafael J. Wysocki)
     - Check ignore_hotplug for all downstream devices (Rafael J. Wysocki)
     - Propagate the "ignore hotplug" setting to parent (Rafael J. Wysocki)
     - Inline pciehp "handle event" functions into the ISR (Bjorn Helgaas)
     - Clean up pciehp debug logging (Bjorn Helgaas)
 
   Power management
     - Remove redundant PCIe port type checking (Yijing Wang)
     - Add dev->has_secondary_link to track downstream PCIe links (Yijing Wang)
     - Use dev->has_secondary_link to find downstream links for ASPM (Yijing Wang)
     - Drop __pci_disable_link_state() useless "force" parameter (Bjorn Helgaas)
     - Simplify Clock Power Management setting (Bjorn Helgaas)
 
   Virtualization
     - Add ACS quirks for Intel 9-series PCH root ports (Alex Williamson)
     - Add function 1 DMA alias quirk for Marvell 9120 (Sakari Ailus)
 
   MSI
     - Disable MSI at enumeration even if kernel doesn't support MSI (Michael S. Tsirkin)
     - Remove unused pci_msi_off() (Bjorn Helgaas)
     - Rename msi_set_enable(), msix_clear_and_set_ctrl() (Michael S.  Tsirkin)
     - Export pci_msi_set_enable(), pci_msix_clear_and_set_ctrl() (Michael S. Tsirkin)
     - Drop pci_msi_off() calls during probe (Michael S. Tsirkin)
 
   APM X-Gene host bridge driver
     - Add APM X-Gene v1 PCIe MSI/MSIX termination driver (Duc Dang)
     - Add APM X-Gene PCIe MSI DTS nodes (Duc Dang)
     - Disable Configuration Request Retry Status for v1 silicon (Duc Dang)
     - Allow config access to Root Port even when link is down (Duc Dang)
 
   Broadcom iProc host bridge driver
     - Allow override of device tree IRQ mapping function (Hauke Mehrtens)
     - Add BCMA PCIe driver (Hauke Mehrtens)
     - Directly add PCI resources (Hauke Mehrtens)
     - Free resource list after registration (Hauke Mehrtens)
 
   Freescale i.MX6 host bridge driver
     - Add speed change timeout message (Troy Kisky)
     - Rename imx6_pcie_start_link() to imx6_pcie_establish_link() (Bjorn Helgaas)
 
   Freescale Layerscape host bridge driver
     - Use dw_pcie_link_up() consistently (Bjorn Helgaas)
     - Factor out ls_pcie_establish_link() (Bjorn Helgaas)
 
   Marvell MVEBU host bridge driver
     - Remove mvebu_pcie_scan_bus() (Yijing Wang)
 
   NVIDIA Tegra host bridge driver
     - Remove tegra_pcie_scan_bus() (Yijing Wang)
 
   Synopsys DesignWare host bridge driver
     - Consolidate outbound iATU programming functions (Jisheng Zhang)
     - Use iATU0 for cfg and IO, iATU1 for MEM (Jisheng Zhang)
     - Add support for x8 links (Zhou Wang)
     - Wait for link to come up with consistent style (Bjorn Helgaas)
     - Use pci_scan_root_bus() for simplicity (Yijing Wang)
 
   TI DRA7xx host bridge driver
     - Use dw_pcie_link_up() consistently (Bjorn Helgaas)
 
   Miscellaneous
     - Include <linux/pci.h>, not <asm/pci.h> (Bjorn Helgaas)
     - Remove unnecessary #includes of <asm/pci.h> (Bjorn Helgaas)
     - Remove unused pcibios_select_root() (again) (Bjorn Helgaas)
     - Remove unused pci_dma_burst_advice() (Bjorn Helgaas)
     - xen/pcifront: Don't use deprecated function pci_scan_bus_parented() (Arnd Bergmann)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJViCSWAAoJEFmIoMA60/r8zX8P/1DPNnk+8xSQe3dYjnG8VW3P
 GPxeCqLMkjiF3ffxcLDzsgrHMjZEb8Co67WePs0k5V0lbZevoIwUo48+oO9B5jhc
 H5DuPZHyTHeOvaZv4GUY5vq/1DBh4JXmJc2V/BkaJ6qhXckF+SCam9C+s0p4950o
 QX/ifOjg/VHzmhaiL7wymJOzuniZmIttl+y+nzkl3AUJ+T6ZtQbUhz+8GZ3lj7Ma
 F+7JHhvm9K8Ljajxb6BLWTw4xgHA6ZN5PtYEx+Sl9QBYSsGfL7LnqyYD3KhJ7KV5
 4AHNJGEVhzNwSuyh+VQx1tNm7OHOqkAaTsYdCVUZRow+6CPd8P75QOMtpl+SmPJB
 RV1BAO75OTGqKg0B9IDg855y4Nh+4/dKoZlBPzpp7+cKw3ylaRAsNnaZ9ik5D62v
 RR06CFgWGHwDXSObgbRm4v0HwfAIHWWJzrPqAZmElh2dzb1Lv1I3AbB1SClCN6sl
 fnAu6CAwA47A5GT8xW3L0oQXdcSmdNUdNzZrsfDnOBIQWMsF+zBFKr6sTABVgyxp
 /WEJaNlvx8Zlq0bZlhGDdsGSbFNFzhX4avWZtXhvdcqFzH0KaVghYSayYvJE9Haq
 oakWqS+GZ3x40j+rdrgLg98AWRVraE1MvV1A7N9TIGjuuKqqbZfSP8kvX3QRQQhO
 Z2+X5hMM0s/tdYtADYu/
 =Qw+j
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "PCI changes for the v4.2 merge window:

  Enumeration
    - Move pci_ari_enabled() to global header (Alex Williamson)
    - Account for ARI in _PRT lookups (Alex Williamson)
    - Remove unused pci_scan_bus_parented() (Yijing Wang)

  Resource management
    - Use host bridge _CRS info on systems with >32 bit addressing (Bjorn Helgaas)
    - Use host bridge _CRS info on Foxconn K8M890-8237A (Bjorn Helgaas)
    - Fix pci_address_to_pio() conversion of CPU address to I/O port (Zhichang Yuan)
    - Add pci_bus_addr_t (Yinghai Lu)

  PCI device hotplug
    - Wait for pciehp command completion where necessary (Alex Williamson)
    - Drop pointless ACPI-based "slot detection" check (Rafael J. Wysocki)
    - Check ignore_hotplug for all downstream devices (Rafael J. Wysocki)
    - Propagate the "ignore hotplug" setting to parent (Rafael J. Wysocki)
    - Inline pciehp "handle event" functions into the ISR (Bjorn Helgaas)
    - Clean up pciehp debug logging (Bjorn Helgaas)

  Power management
    - Remove redundant PCIe port type checking (Yijing Wang)
    - Add dev->has_secondary_link to track downstream PCIe links (Yijing Wang)
    - Use dev->has_secondary_link to find downstream links for ASPM (Yijing Wang)
    - Drop __pci_disable_link_state() useless "force" parameter (Bjorn Helgaas)
    - Simplify Clock Power Management setting (Bjorn Helgaas)

  Virtualization
    - Add ACS quirks for Intel 9-series PCH root ports (Alex Williamson)
    - Add function 1 DMA alias quirk for Marvell 9120 (Sakari Ailus)

  MSI
    - Disable MSI at enumeration even if kernel doesn't support MSI (Michael S. Tsirkin)
    - Remove unused pci_msi_off() (Bjorn Helgaas)
    - Rename msi_set_enable(), msix_clear_and_set_ctrl() (Michael S.  Tsirkin)
    - Export pci_msi_set_enable(), pci_msix_clear_and_set_ctrl() (Michael S. Tsirkin)
    - Drop pci_msi_off() calls during probe (Michael S. Tsirkin)

  APM X-Gene host bridge driver
    - Add APM X-Gene v1 PCIe MSI/MSIX termination driver (Duc Dang)
    - Add APM X-Gene PCIe MSI DTS nodes (Duc Dang)
    - Disable Configuration Request Retry Status for v1 silicon (Duc Dang)
    - Allow config access to Root Port even when link is down (Duc Dang)

  Broadcom iProc host bridge driver
    - Allow override of device tree IRQ mapping function (Hauke Mehrtens)
    - Add BCMA PCIe driver (Hauke Mehrtens)
    - Directly add PCI resources (Hauke Mehrtens)
    - Free resource list after registration (Hauke Mehrtens)

  Freescale i.MX6 host bridge driver
    - Add speed change timeout message (Troy Kisky)
    - Rename imx6_pcie_start_link() to imx6_pcie_establish_link() (Bjorn Helgaas)

  Freescale Layerscape host bridge driver
    - Use dw_pcie_link_up() consistently (Bjorn Helgaas)
    - Factor out ls_pcie_establish_link() (Bjorn Helgaas)

  Marvell MVEBU host bridge driver
    - Remove mvebu_pcie_scan_bus() (Yijing Wang)

  NVIDIA Tegra host bridge driver
    - Remove tegra_pcie_scan_bus() (Yijing Wang)

  Synopsys DesignWare host bridge driver
    - Consolidate outbound iATU programming functions (Jisheng Zhang)
    - Use iATU0 for cfg and IO, iATU1 for MEM (Jisheng Zhang)
    - Add support for x8 links (Zhou Wang)
    - Wait for link to come up with consistent style (Bjorn Helgaas)
    - Use pci_scan_root_bus() for simplicity (Yijing Wang)

  TI DRA7xx host bridge driver
    - Use dw_pcie_link_up() consistently (Bjorn Helgaas)

  Miscellaneous
    - Include <linux/pci.h>, not <asm/pci.h> (Bjorn Helgaas)
    - Remove unnecessary #includes of <asm/pci.h> (Bjorn Helgaas)
    - Remove unused pcibios_select_root() (again) (Bjorn Helgaas)
    - Remove unused pci_dma_burst_advice() (Bjorn Helgaas)
    - xen/pcifront: Don't use deprecated function pci_scan_bus_parented() (Arnd Bergmann)"

* tag 'pci-v4.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (58 commits)
  PCI: pciehp: Inline the "handle event" functions into the ISR
  PCI: pciehp: Rename queue_interrupt_event() to pciehp_queue_interrupt_event()
  PCI: pciehp: Make queue_interrupt_event() void
  PCI: xgene: Allow config access to Root Port even when link is down
  PCI: xgene: Disable Configuration Request Retry Status for v1 silicon
  PCI: pciehp: Clean up debug logging
  x86/PCI: Use host bridge _CRS info on systems with >32 bit addressing
  PCI: imx6: Add #define PCIE_RC_LCSR
  PCI: imx6: Use "u32", not "uint32_t"
  PCI: Remove unused pci_scan_bus_parented()
  xen/pcifront: Don't use deprecated function pci_scan_bus_parented()
  PCI: imx6: Add speed change timeout message
  PCI/ASPM: Simplify Clock Power Management setting
  PCI: designware: Wait for link to come up with consistent style
  PCI: layerscape: Factor out ls_pcie_establish_link()
  PCI: layerscape: Use dw_pcie_link_up() consistently
  PCI: dra7xx: Use dw_pcie_link_up() consistently
  x86/PCI: Use host bridge _CRS info on Foxconn K8M890-8237A
  PCI: pciehp: Wait for hotplug command completion where necessary
  PCI: Remove unused pci_dma_burst_advice()
  ...
2015-06-23 13:41:24 -07:00
Linus Torvalds 44d21c3f3a Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 "Here is the crypto update for 4.2:

  API:

   - Convert RNG interface to new style.

   - New AEAD interface with one SG list for AD and plain/cipher text.
     All external AEAD users have been converted.

   - New asymmetric key interface (akcipher).

  Algorithms:

   - Chacha20, Poly1305 and RFC7539 support.

   - New RSA implementation.

   - Jitter RNG.

   - DRBG is now seeded with both /dev/random and Jitter RNG.  If kernel
     pool isn't ready then DRBG will be reseeded when it is.

   - DRBG is now the default crypto API RNG, replacing krng.

   - 842 compression (previously part of powerpc nx driver).

  Drivers:

   - Accelerated SHA-512 for arm64.

   - New Marvell CESA driver that supports DMA and more algorithms.

   - Updated powerpc nx 842 support.

   - Added support for SEC1 hardware to talitos"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (292 commits)
  crypto: marvell/cesa - remove COMPILE_TEST dependency
  crypto: algif_aead - Temporarily disable all AEAD algorithms
  crypto: af_alg - Forbid the use internal algorithms
  crypto: echainiv - Only hold RNG during initialisation
  crypto: seqiv - Add compatibility support without RNG
  crypto: eseqiv - Offer normal cipher functionality without RNG
  crypto: chainiv - Offer normal cipher functionality without RNG
  crypto: user - Add CRYPTO_MSG_DELRNG
  crypto: user - Move cryptouser.h to uapi
  crypto: rng - Do not free default RNG when it becomes unused
  crypto: skcipher - Allow givencrypt to be NULL
  crypto: sahara - propagate the error on clk_disable_unprepare() failure
  crypto: rsa - fix invalid select for AKCIPHER
  crypto: picoxcell - Update to the current clk API
  crypto: nx - Check for bogus firmware properties
  crypto: marvell/cesa - add DT bindings documentation
  crypto: marvell/cesa - add support for Kirkwood and Dove SoCs
  crypto: marvell/cesa - add support for Orion SoCs
  crypto: marvell/cesa - add allhwsupport module parameter
  crypto: marvell/cesa - add support for all armada SoCs
  ...
2015-06-22 21:04:48 -07:00
Herbert Xu c0b59fafe3 Merge branch 'mvebu/drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Merge the mvebu/drivers branch of the arm-soc tree which contains
just a single patch bfa1ce5f38 ("bus:
mvebu-mbus: add mv_mbus_dram_info_nooverlap()") that happens to be
a prerequisite of the new marvell/cesa crypto driver.
2015-06-19 22:07:07 +08:00
Michael Ellerman 6096f88451 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Highlights include more 8xx optimizations, an e6500 hugetlb optimization,
QMan device tree nodes, t1024/t1023 support, and various fixes and
cleanup."
2015-06-19 17:23:48 +10:00
Sam bobroff b4b56f9eca powerpc/tm: Abort syscalls in active transactions
This patch changes the syscall handler to doom (tabort) active
transactions when a syscall is made and return very early without
performing the syscall and keeping side effects to a minimum (no CPU
accounting or system call tracing is performed). Also included is a
new HWCAP2 bit, PPC_FEATURE2_HTM_NOSC, to indicate this
behaviour to userspace.

Currently, the system call instruction automatically suspends an
active transaction which causes side effects to persist when an active
transaction fails.

This does change the kernel's behaviour, but in a way that was
documented as unsupported.  It doesn't reduce functionality as
syscalls will still be performed after tsuspend; it just requires that
the transaction be explicitly suspended.  It also provides a
consistent interface and makes the behaviour of user code
substantially the same across powerpc and platforms that do not
support suspended transactions (e.g. x86 and s390).

Performance measurements using
http://ozlabs.org/~anton/junkcode/null_syscall.c indicate the cost of
a normal (non-aborted) system call increases by about 0.25%.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-19 17:10:28 +10:00
Paul Gortmaker 8f6b9512ce powerpc: use device_initcall for registering rtc devices
Currently these two RTC devices are in core platform code
where it is not possible for them to be modular.  It will
never be modular, so using module_init as an alias for
__initcall can be somewhat misleading.

Fix this up now, so that we can relocate module_init from
init.h into module.h in the future.  If we don't do this, we'd
have to add module.h to obviously non-modular code, and that
would be a worse thing.

Note that direct use of __initcall is discouraged, vs. one
of the priority categorized subgroups.  As __initcall gets
mapped onto device_initcall, our use of device_initcall
directly in this change means that the runtime impact is
zero -- they will remain at level 6 in initcall ordering.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Geoff Levand <geoff@infradead.org>
Acked-by: Geoff Levand <geoff@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2015-06-16 14:12:29 -04:00
Alexey Kardashevskiy 15b244a88e powerpc/mmu: Add userspace-to-physical addresses translation cache
We are adding support for DMA memory pre-registration to be used in
conjunction with VFIO. The idea is that the userspace which is going to
run a guest may want to pre-register a user space memory region so
it all gets pinned once and never goes away. Having this done,
a hypervisor will not have to pin/unpin pages on every DMA map/unmap
request. This is going to help with multiple pinning of the same memory.

Another use of it is in-kernel real mode (mmu off) acceleration of
DMA requests where real time translation of guest physical to host
physical addresses is non-trivial and may fail as linux ptes may be
temporarily invalid. Also, having cached host physical addresses
(compared to just pinning at the start and then walking the page table
again on every H_PUT_TCE), we can be sure that the addresses which we put
into TCE table are the ones we already pinned.

This adds a list of memory regions to mm_context_t. Each region consists
of a header and a list of physical addresses. This adds API to:
1. register/unregister memory regions;
2. do final cleanup (which puts all pre-registered pages);
3. do userspace to physical address translation;
4. manage usage counters; multiple registration of the same memory
is allowed (once per container).

This implements 2 counters per registered memory region:
- @mapped: incremented on every DMA mapping; decremented on unmapping;
initialized to 1 when a region is just registered; once it becomes zero,
no more mappings allowe;
- @used: incremented on every "register" ioctl; decremented on
"unregister"; unregistration is allowed for DMA mapped regions unless
it is the very last reference. For the very last reference this checks
that the region is still mapped and returns -EBUSY so the userspace
gets to know that memory is still pinned and unregistration needs to
be retried; @used remains 1.

Host physical addresses are stored in vmalloc'ed array. In order to
access these in the real mode (mmu off), there is a real_vmalloc_addr()
helper. In-kernel acceleration patchset will move it from KVM to MMU code.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-11 15:16:54 +10:00
Alexey Kardashevskiy 05c6cfb9dc powerpc/iommu/powernv: Release replaced TCE
At the moment writing new TCE value to the IOMMU table fails with EBUSY
if there is a valid entry already. However PAPR specification allows
the guest to write new TCE value without clearing it first.

Another problem this patch is addressing is the use of pool locks for
external IOMMU users such as VFIO. The pool locks are to protect
DMA page allocator rather than entries and since the host kernel does
not control what pages are in use, there is no point in pool locks and
exchange()+put_page(oldtce) is sufficient to avoid possible races.

This adds an exchange() callback to iommu_table_ops which does the same
thing as set() plus it returns replaced TCE and DMA direction so
the caller can release the pages afterwards. The exchange() receives
a physical address unlike set() which receives linear mapping address;
and returns a physical address as the clear() does.

This implements exchange() for P5IOC2/IODA/IODA2. This adds a requirement
for a platform to have exchange() implemented in order to support VFIO.

This replaces iommu_tce_build() and iommu_clear_tce() with
a single iommu_tce_xchg().

This makes sure that TCE permission bits are not set in TCE passed to
IOMMU API as those are to be calculated by platform code from
DMA direction.

This moves SetPageDirty() to the IOMMU code to make it work for both
VFIO ioctl interface in in-kernel TCE acceleration (when it becomes
available later).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for the vfio related changes]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-11 15:16:49 +10:00
Alexey Kardashevskiy b82c75bfbe powerpc/iommu: Fix IOMMU ownership control functions
This adds missing locks in iommu_take_ownership()/
iommu_release_ownership().

This marks all pages busy in iommu_table::it_map in order to catch
errors if there is an attempt to use this table while ownership over it
is taken.

This only clears TCE content if there is no page marked busy in it_map.
Clearing must be done outside of the table locks as iommu_clear_tce()
called from iommu_clear_tces_and_put_pages() does this.

In order to use bitmap_empty(), the existing code clears bit#0 which
is set even in an empty table if it is bus-mapped at 0 as
iommu_init_table() reserves page#0 to prevent buggy drivers
from crashing when allocated page is bus-mapped at zero
(which is correct). This restores the bit in the case of failure
to bring the it_map to the state it was in when we called
iommu_take_ownership().

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-11 15:16:47 +10:00
Alexey Kardashevskiy f87a88642e vfio: powerpc/spapr/iommu/powernv/ioda2: Rework IOMMU ownership control
This adds tce_iommu_take_ownership() and tce_iommu_release_ownership
which call in a loop iommu_take_ownership()/iommu_release_ownership()
for every table on the group. As there is just one now, no change in
behaviour is expected.

At the moment the iommu_table struct has a set_bypass() which enables/
disables DMA bypass on IODA2 PHB. This is exposed to POWERPC IOMMU code
which calls this callback when external IOMMU users such as VFIO are
about to get over a PHB.

The set_bypass() callback is not really an iommu_table function but
IOMMU/PE function. This introduces a iommu_table_group_ops struct and
adds take_ownership()/release_ownership() callbacks to it which are
called when an external user takes/releases control over the IOMMU.

This replaces set_bypass() with ownership callbacks as it is not
necessarily just bypass enabling, it can be something else/more
so let's give it more generic name.

The callbacks is implemented for IODA2 only. Other platforms (P5IOC2,
IODA1) will use the old iommu_take_ownership/iommu_release_ownership API.
The following patches will replace iommu_take_ownership/
iommu_release_ownership calls in IODA2 with full IOMMU table release/
create.

As we here and touching bypass control, this removes
pnv_pci_ioda2_setup_bypass_pe() as it does not do much
more compared to pnv_pci_ioda2_set_bypass. This moves tce_bypass_base
initialization to pnv_pci_ioda2_setup_dma_pe.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for the vfio related changes]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-11 15:16:47 +10:00
Alexey Kardashevskiy 0eaf4defc7 powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group
So far one TCE table could only be used by one IOMMU group. However
IODA2 hardware allows programming the same TCE table address to
multiple PE allowing sharing tables.

This replaces a single pointer to a group in a iommu_table struct
with a linked list of groups which provides the way of invalidating
TCE cache for every PE when an actual TCE table is updated. This adds
pnv_pci_link_table_and_group() and pnv_pci_unlink_table_and_group()
helpers to manage the list. However without VFIO, it is still going
to be a single IOMMU group per iommu_table.

This changes iommu_add_device() to add a device to a first group
from the group list of a table as it is only called from the platform
init code or PCI bus notifier and at these moments there is only
one group per table.

This does not change TCE invalidation code to loop through all
attached groups in order to simplify this patch and because
it is not really needed in most cases. IODA2 is fixed in a later
patch.

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for the vfio related changes]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-11 15:16:15 +10:00