linux

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	ec3604c7a5	Writeback error handling fixes for v4.14 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJZrTy3AAoJEAAOaEEZVoIVaucP/ApBAj2S5wzvlV1u6l8E6ae7 ZeEEZfcWwzRYlKjZAkTWqj9XvGpDGO5gLq4wsZK2edFAq++/MJF8ZVtN4tdZ1kUZ DUvRodtVOrT08Kp9wZXGT7JOFrf6U/6gMcR6p0MuWnHndeKYvlpcFi9NPT4EC9/z Zm9V7gtlPdSOha7eaSjUS0+vLERkxqXLBW3Av9QUOBP/lbI3lqIroGKeHDYnVdya 2P/k5EcRRJMyJP6TqyYxmmJl+UWjJFMLvnlUDBslHnD/u3mIUhw3JLHYBjn5dZRE Xjq56IDPoXDUvzlBhtn/Uqyx+/wtwsNsylpmKv6K5G1JfdeuSsPVsCey+A1cqV64 LpE5896wf9TmnmI9LNyh6vDn925xPSGBiF45UEp5f9aO7jXeY0MaEZ8g+ENqFIDK v4gtZdS9FhYHV+/l4qEwYMKrqSbwKEs1r1FT+f4wnABby1ojfdA57ZPlp5PV2Vjp szTp88Zkb7cMvZwEnWwxWofcJNmgS7uNahvnQF3IJ4ITsioEkuyYR3K4ZQMaaaV9 wCp6G0FhXZaK3OI7o9WiDwaO2elp9Hxc8bnqKpiBbHZkY0NLh7/++5VxpeNbTHFy AGijQiiKGNNyYqNj93wq9jpVdMNjB0pXrHRxfav8v7MtQ+WfbEoAENF4T7hN7iXn UuF6eSWEC5O1UCRUk1A+ =LLY3 -----END PGP SIGNATURE----- Merge tag 'wberr-v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull writeback error handling updates from Jeff Layton: "This pile continues the work from last cycle on better tracking writeback errors. In v4.13 we added some basic errseq_t infrastructure and converted a few filesystems to use it. This set continues refining that infrastructure, adds documentation, and converts most of the other filesystems to use it. The main exception at this point is the NFS client" * tag 'wberr-v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: ecryptfs: convert to file_write_and_wait in ->fsync mm: remove optimizations based on i_size in mapping writeback waits fs: convert a pile of fsync routines to errseq_t based reporting gfs2: convert to errseq_t based writeback error reporting for fsync fs: convert sync_file_range to use errseq_t based error-tracking mm: add file_fdatawait_range and file_write_and_wait fuse: convert to errseq_t based error tracking for fsync mm: consolidate dax / non-dax checks for writeback Documentation: add some docs for errseq_t errseq: rename __errseq_set to errseq_set	2017-09-06 14:11:03 -07:00
Linus Torvalds	5f82e71a00	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - Add 'cross-release' support to lockdep, which allows APIs like completions, where it's not the 'owner' who releases the lock, to be tracked. It's all activated automatically under CONFIG_PROVE_LOCKING=y. - Clean up (restructure) the x86 atomics op implementation to be more readable, in preparation of KASAN annotations. (Dmitry Vyukov) - Fix static keys (Paolo Bonzini) - Add killable versions of down_read() et al (Kirill Tkhai) - Rework and fix jump_label locking (Marc Zyngier, Paolo Bonzini) - Rework (and fix) tlb_flush_pending() barriers (Peter Zijlstra) - Remove smp_mb__before_spinlock() and convert its usages, introduce smp_mb__after_spinlock() (Peter Zijlstra) * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits) locking/lockdep/selftests: Fix mixed read-write ABBA tests sched/completion: Avoid unnecessary stack allocation for COMPLETION_INITIALIZER_ONSTACK() acpi/nfit: Fix COMPLETION_INITIALIZER_ONSTACK() abuse locking/pvqspinlock: Relax cmpxchg's to improve performance on some architectures smp: Avoid using two cache lines for struct call_single_data locking/lockdep: Untangle xhlock history save/restore from task independence locking/refcounts, x86/asm: Disable CONFIG_ARCH_HAS_REFCOUNT for the time being futex: Remove duplicated code and fix undefined behaviour Documentation/locking/atomic: Finish the document... locking/lockdep: Fix workqueue crossrelease annotation workqueue/lockdep: 'Fix' flush_work() annotation locking/lockdep/selftests: Add mixed read-write ABBA tests mm, locking/barriers: Clarify tlb_flush_pending() barriers locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS truly non-interactive locking/lockdep: Explicitly initialize wq_barrier::done::map locking/lockdep: Rename CONFIG_LOCKDEP_COMPLETE to CONFIG_LOCKDEP_COMPLETIONS locking/lockdep: Reword title of LOCKDEP_CROSSRELEASE config locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE part of CONFIG_PROVE_LOCKING locking/refcounts, x86/asm: Implement fast refcount overflow protection locking/lockdep: Fix the rollback and overwrite detection logic in crossrelease ...	2017-09-04 11:52:29 -07:00
Linus Torvalds	9657752cb5	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Kernel side changes: - Add branch type profiling/tracing support. (Jin Yao) - Add the PERF_SAMPLE_PHYS_ADDR ABI to allow the tracing/profiling of physical memory addresses, where the PMU supports it. (Kan Liang) - Export some PMU capability details in the new /sys/bus/event_source/devices/cpu/caps/ sysfs directory. (Andi Kleen) - Aux data fixes and updates (Will Deacon) - kprobes fixes and updates (Masami Hiramatsu) - AMD uncore PMU driver fixes and updates (Janakarajan Natarajan) On the tooling side, here's a (limited!) list of highlights - there were many other changes that I could not list, see the shortlog and git history for details: UI improvements: - Implement a visual marker for fused x86 instructions in the annotate TUI browser, available now in 'perf report', more work needed to have it available as well in 'perf top' (Jin Yao) Further explanation from one of Jin's patches: │ ┌──cmpl $0x0,argp_program_version_hook 81.93 │ ├──je 20 │ │ lock cmpxchg %esi,0x38a9a4(%rip) │ │↓ jne 29 │ │↓ jmp 43 11.47 │20:└─→cmpxch %esi,0x38a999(%rip) That means the cmpl+je is a fused instruction pair and they should be considered together. - Record the branch type and then show statistics and info about in callchain entries (Jin Yao) Example from one of Jin's patches: # perf record -g -j any,save_type # perf report --branch-history --stdio --no-children 38.50% div.c:45 [.] main div \| ---main div.c:42 (RET CROSS_2M cycles:2) compute_flag div.c:28 (cycles:2) compute_flag div.c:27 (RET CROSS_2M cycles:1) rand rand.c:28 (cycles:1) rand rand.c:28 (RET CROSS_2M cycles:1) __random random.c:298 (cycles:1) __random random.c:297 (COND_BWD CROSS_2M cycles:1) __random random.c:295 (cycles:1) __random random.c:295 (COND_BWD CROSS_2M cycles:1) __random random.c:295 (cycles:1) __random random.c:295 (RET CROSS_2M cycles:9) namespaces support: - Add initial support for namespaces, using setns to access files in namespaces, grabbing their build-ids, etc. (Krister Johansen) perf trace enhancements: - Beautify pkey_{alloc,free,mprotect} arguments in 'perf trace' (Arnaldo Carvalho de Melo) - Add initial 'clone' syscall args beautifier in 'perf trace' (Arnaldo Carvalho de Melo) - Ignore 'fd' and 'offset' args for MAP_ANONYMOUS in 'perf trace' (Arnaldo Carvalho de Melo) - Beautifiers for the 'cmd' arg of several ioctl types, including: sound, DRM, KVM, vhost virtio and perf_events. (Arnaldo Carvalho de Melo) - Add PERF_SAMPLE_CALLCHAIN and PERF_RECORD_MMAP[2] to 'perf data' CTF conversion, allowing CTF trace visualization tools to show callchains and to resolve symbols (Geneviève Bastien) - Beautify the fcntl syscall, which is an interesting one in the sense that infrastructure had to be put in place to change the formatters of some arguments according to the value in a previous one, i.e. cmd dictates how arg and the syscall return will be formatted. (Arnaldo Carvalho de Melo perf stat enhancements: - Use group read for event groups in 'perf stat', reducing overhead when groups are defined in the event specification, i.e. when using {} to enclose a list of events, asking them to be read at the same time, e.g.: "perf stat -e '{cycles,instructions}'" (Jiri Olsa) pipe mode improvements: - Process tracing data in 'perf annotate' pipe mode (David Carrillo-Cisneros) - Add header record types to pipe-mode, now this command: $ perf record -o - -e cycles sleep 1 \| perf report --stdio --header Will show the same as in non-pipe mode, i.e. involving a perf.data file (David Carrillo-Cisneros) Vendor specific hardware event support updates/enhancements: - Update POWER9 vendor events tables (Sukadev Bhattiprolu) - Add POWER9 PMU events Sukadev (Bhattiprolu) - Support additional POWER8+ PVR in PMU mapfile (Shriya) - Add Skylake server uncore JSON vendor events (Andi Kleen) - Support exporting Intel PT data to sqlite3 with python perf scripts, this is in addition to the postgresql support that was already there (Adrian Hunter)" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (253 commits) perf symbols: Fix plt entry calculation for ARM and AARCH64 perf probe: Fix kprobe blacklist checking condition perf/x86: Fix caps/ for !Intel perf/core, x86: Add PERF_SAMPLE_PHYS_ADDR perf/core, pt, bts: Get rid of itrace_started perf trace beauty: Beautify pkey_{alloc,free,mprotect} arguments tools headers: Sync cpu features kernel ABI headers with tooling headers perf tools: Pass full path of FEATURES_DUMP perf tools: Robustify detection of clang binary tools lib: Allow external definition of CC, AR and LD perf tools: Allow external definition of flex and bison binary names tools build tests: Don't hardcode gcc name perf report: Group stat values on global event id perf values: Zero value buffers perf values: Fix allocation check perf values: Fix thread index bug perf report: Add dump_read function perf record: Set read_format for inherit_stat perf c2c: Fix remote HITM detection for Skylake perf tools: Fix static build with newer toolchains ...	2017-09-04 08:39:02 -07:00
Linus Torvalds	0081a0ce80	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnad: "The main RCU related changes in this cycle were: - Removal of spin_unlock_wait() - SRCU updates - RCU torture-test updates - RCU Documentation updates - Extend the sys_membarrier() ABI with the MEMBARRIER_CMD_PRIVATE_EXPEDITED variant - Miscellaneous RCU fixes - CPU-hotplug fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits) arch: Remove spin_unlock_wait() arch-specific definitions locking: Remove spin_unlock_wait() generic definitions drivers/ata: Replace spin_unlock_wait() with lock/unlock pair ipc: Replace spin_unlock_wait() with lock/unlock pair exit: Replace spin_unlock_wait() with lock/unlock pair completion: Replace spin_unlock_wait() with lock/unlock pair doc: Set down RCU's scheduling-clock-interrupt needs doc: No longer allowed to use rcu_dereference on non-pointers doc: Add RCU files to docbook-generation files doc: Update memory-barriers.txt for read-to-write dependencies doc: Update RCU documentation membarrier: Provide expedited private command rcu: Remove exports from rcu_idle_exit() and rcu_idle_enter() rcu: Add warning to rcu_idle_enter() for irqs enabled rcu: Make rcu_idle_enter() rely on callers disabling irqs rcu: Add assertions verifying blocked-tasks list rcu/tracing: Set disable_rcu_irq_enter on rcu_eqs_exit() rcu: Add TPS() protection for _rcu_barrier_trace strings rcu: Use idle versions of swait to make idle-hack clear swait: Add idle variants which don't contribute to load average ...	2017-09-04 08:13:52 -07:00
Cédric Le Goater	265601f034	powerpc/xive: Fix section __init warning xive_spapr_init() is called from a __init routine and calls __init routines. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-04 19:38:07 +10:00
Paul Mackerras	4716e488ab	powerpc: Fix kernel crash in emulation of vector loads and stores Commit `350779a29f` ("powerpc: Handle most loads and stores in instruction emulation code", 2017-08-30) changed the register usage in get_vr and put_vr with the aim of leaving the register number in r3 untouched on return. Unfortunately, r6 was not a good choice, as the callers as of `350779a29f` store a MSR value in r6. Then, in commit `c22435a5f3` ("powerpc: Emulate FP/vector/VSX loads/stores correctly when regs not live", 2017-08-30), the saving and restoring of the MSR got moved into get_vr and put_vr. Either way, the effect is that we put a value in MSR that only has the 0x3f8 bits non-zero, meaning that we are switching to 32-bit mode. That leads to a crash like this: Unable to handle kernel paging request for instruction fetch Faulting instruction address: 0x0007bea0 Oops: Kernel access of bad area, sig: 11 [#12] LE SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: vmx_crypto binfmt_misc ip_tables x_tables autofs4 crc32c_vpmsum CPU: 6 PID: 32659 Comm: trashy_testcase Tainted: G D 4.13.0-rc2-00313-gf3026f57e6ed-dirty #23 task: c000000f1bb9e780 task.stack: c000000f1ba98000 NIP: 000000000007bea0 LR: c00000000007b054 CTR: c00000000007be70 REGS: c000000f1ba9b960 TRAP: 0400 Tainted: G D (4.13.0-rc2-00313-gf3026f57e6ed-dirty) MSR: 10000000400010a1 <HV,ME,IR,LE> CR: 48000228 XER: 00000000 CFAR: c00000000007be74 SOFTE: 1 GPR00: c00000000007b054 c000000f1ba9bbe0 c000000000e6e000 000000000000001d GPR04: c000000f1ba9bc00 c00000000007be70 00000000000000e8 9000000002009033 GPR08: 0000000002000000 100000000282f033 000000000b0a0900 0000000000001009 GPR12: 0000000000000000 c00000000fd42100 0706050303020100 a5a5a5a5a5a5a5a5 GPR16: 2e2e2e2e2e2de70c 2e2e2e2e2e2e2e2d 0000000000ff00ff 0606040202020000 GPR20: 000000000000005b ffffffffffffffff 0000000003020100 0000000000000000 GPR24: c000000f1ab90020 c000000f1ba9bc00 0000000000000001 0000000000000001 GPR28: c000000f1ba9bc90 c000000f1ba9bea0 000000000b0a0908 0000000000000001 NIP [000000000007bea0] 0x7bea0 LR [c00000000007b054] emulate_loadstore+0x1044/0x1280 Call Trace: [c000000f1ba9bbe0] [c000000000076b80] analyse_instr+0x60/0x34f0 (unreliable) [c000000f1ba9bc70] [c00000000007b7ec] emulate_step+0x23c/0x544 [c000000f1ba9bce0] [c000000000053424] arch_uprobe_skip_sstep+0x24/0x40 [c000000f1ba9bd00] [c00000000024b2f8] uprobe_notify_resume+0x598/0xba0 [c000000f1ba9be00] [c00000000001c284] do_notify_resume+0xd4/0xf0 [c000000f1ba9be30] [c00000000000bd44] ret_from_except_lite+0x70/0x74 Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX ---[ end trace a7ae7a7f3e0256b5 ]--- To fix this, we just revert to using r3 as before, since the callers don't rely on r3 being left unmodified. Fortunately, this can't be triggered by a misaligned load or store, because vector loads and stores truncate misaligned addresses rather than taking an alignment interrupt. It can be triggered using uprobes. Fixes: `350779a29f` ("powerpc: Handle most loads and stores in instruction emulation code") Reported-by: Anton Blanchard <anton@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Tested-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-04 19:38:07 +10:00
Ingo Molnar	edc2988c54	Merge branch 'linus' into locking/core, to fix up conflicts Conflicts: mm/page_alloc.c Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-09-04 11:01:18 +02:00
Cédric Le Goater	5f121292f0	powerpc/xive: improve debugging macros Having the CPU identifier in the debug logs is helpful when tracking issues. Also add some more logging and fix a compile issue in xive_do_source_eoi(). Signed-off-by: Cédric Le Goater <clg@kaod.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-02 21:02:38 +10:00
Cédric Le Goater	ac5e5a5402	powerpc/xive: add XIVE Exploitation Mode to CAS On POWER9, the Client Architecture Support (CAS) negotiation process determines whether the guest operates in XIVE Legacy compatibility or in XIVE exploitation mode. Now that we have initial guest support for the XIVE interrupt controller, let's inform the hypervisor what we can do. The platform advertises the XIVE Exploitation Mode support using the property "ibm,arch-vec-5-platform-support-vec-5", byte 23 bits 0-1 : - 0b00 XIVE legacy mode Only - 0b01 XIVE exploitation mode Only - 0b10 XIVE legacy or exploitation mode The OS asks for XIVE Exploitation Mode support using the property "ibm,architecture-vec-5", byte 23 bits 0-1: - 0b00 XIVE legacy mode Only - 0b01 XIVE exploitation mode Only Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-02 21:02:38 +10:00
Cédric Le Goater	bed81ee181	powerpc/xive: introduce H_INT_ESB hcall The H_INT_ESB hcall() is used to issue a load or store to the ESB page instead of using the MMIO pages. This can be used as a workaround on some HW issues. The OS knows that this hcall should be used on an interrupt source when the ESB hcall flag is set to 1 in the hcall H_INT_GET_SOURCE_INFO. To maintain the frontier between the xive frontend and backend, we introduce a new xive operation 'esb_rw' to be used in the routines doing memory accesses on the ESBs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-02 21:02:37 +10:00
Cédric Le Goater	c58a14a9cc	powerpc/xive: add the HW IRQ number under xive_irq_data It will be required later by the H_INT_ESB hcall. Signed-off-by: Cédric Le Goater <clg@kaod.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-02 21:02:37 +10:00
Cédric Le Goater	99f122573e	powerpc/xive: introduce xive_esb_write() Some source support MMIO stores on the ESB page to perform EOI. Let's introduce a specific routine for this case even if this should be the only use of it. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-02 21:02:36 +10:00
Cédric Le Goater	59fc2724e4	powerpc/xive: rename xive_poke_esb() in xive_esb_read() xive_poke_esb() is performing a load/read so it is better named as xive_esb_read() as we will need to introduce a xive_esb_write() routine. Also use the XIVE_ESB_LOAD_EOI offset when EOI'ing LSI interrupts. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-02 21:02:36 +10:00
Cédric Le Goater	eac1e731b5	powerpc/xive: guest exploitation of the XIVE interrupt controller This is the framework for using XIVE in a PowerVM guest. The support is very similar to the native one in a much simpler form. Each source is associated with an Event State Buffer (ESB). This is a two bit state machine which is used to trigger events. The bits are named "P" (pending) and "Q" (queued) and can be controlled by MMIO. The Guest OS registers event (or notifications) queues on which the HW will post event data for a target to notify. Instead of OPAL calls, a set of Hypervisors call are used to configure the interrupt sources and the event/notification queues of the guest: - H_INT_GET_SOURCE_INFO used to obtain the address of the MMIO page of the Event State Buffer (PQ bits) entry associated with the source. - H_INT_SET_SOURCE_CONFIG assigns a source to a "target". - H_INT_GET_SOURCE_CONFIG determines to which "target" and "priority" is assigned to a source - H_INT_GET_QUEUE_INFO returns the address of the notification management page associated with the specified "target" and "priority". - H_INT_SET_QUEUE_CONFIG sets or resets the event queue for a given "target" and "priority". It is also used to set the notification config associated with the queue, only unconditional notification for the moment. Reset is performed with a queue size of 0 and queueing is disabled in that case. - H_INT_GET_QUEUE_CONFIG returns the queue settings for a given "target" and "priority". - H_INT_RESET resets all of the partition's interrupt exploitation structures to their initial state, losing all configuration set via the hcalls H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG. - H_INT_SYNC issue a synchronisation on a source to make sure sure all notifications have reached their queue. As for XICS, the XIVE interface for the guest is described in the device tree under the "interrupt-controller" node. A couple of new properties are specific to XIVE : - "reg" contains the base address and size of the thread interrupt managnement areas (TIMA), also called rings, for the User level and for the Guest OS level. Only the Guest OS level is taken into account today. - "ibm,xive-eq-sizes" the size of the event queues. One cell per size supported, contains log2 of size, in ascending order. - "ibm,xive-lisn-ranges" the interrupt numbers ranges assigned to the guest. These are allocated using a simple bitmap. and also : - "/ibm,plat-res-int-priorities" contains a list of priorities that the hypervisor has reserved for its own use. Tested with a QEMU XIVE model for pseries and with the Power hypervisor. Signed-off-by: Cédric Le Goater <clg@kaod.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-02 21:02:35 +10:00
Cédric Le Goater	994ea2f419	powerpc/xive: introduce a common routine xive_queue_page_alloc() This routine will be used in the spapr backend. Also introduce a short xive_alloc_order() helper. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-02 21:02:34 +10:00
David S. Miller	6026e043d0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Three cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 17:42:05 -07:00
Michael Ellerman	3b79b26101	powerpc/sstep: Avoid used uninitialized error Older compilers think val may be used uninitialized: arch/powerpc/lib/sstep.c: In function 'emulate_loadstore': arch/powerpc/lib/sstep.c:2758:23: error: 'val' may be used uninitialized in this function We know better, but initialise val to 0 to avoid breaking the build. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-02 08:04:26 +10:00
Markus Elfring	fdbb9457b4	axonram: Return directly after a failed kzalloc() in axon_ram_probe() * Return directly after a call of the function "kzalloc" failed at the beginning. * Delete a repeated check for the local variable "bank" which became unnecessary with this refactoring. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:56 +10:00
Markus Elfring	a1bddf3991	axonram: Improve a size determination in axon_ram_probe() Replace the specification of a data structure by a pointer dereference as the parameter for the operator "sizeof" to make the corresponding size determination a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:56 +10:00
Markus Elfring	c86a93971e	axonram: Delete an error message for a failed memory allocation in axon_ram_probe() Omit an extra message for a memory allocation failure in this function. This issue was detected by using the Coccinelle software. Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:55 +10:00
Alistair Popple	bab9f954aa	powerpc/powernv/npu: Move tlb flush before launching ATSD The nest MMU tlb flush needs to happen before the GPU translation shootdown is launched to avoid the GPU refilling its tlb with stale nmmu translations prior to the nmmu flush completing. Fixes: `1ab66d1fba` ("powerpc/powernv: Introduce address translation services for Nvlink2") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:55 +10:00
Julia Lawall	8a7aef2cb3	powerpc/iommu: Use permission-specific DEVICE_ATTR variants Use DEVICE_ATTR_RW for read-write attributes. This simplifies the source code, improves readbility, and reduces the chance of inconsistencies. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:54 +10:00
Markus Elfring	6ab41161b4	powerpc/eeh: Delete an error out of memory message at init time Omit an extra message for a memory allocation failure in eeh_dev_init(). This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> [mpe: Do not drop the message that can happen at runtime and lead to an event not being handled] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:53 +10:00
Markus Elfring	aae85e3c20	powerpc/mm: Use seq_putc() in two functions Two single characters (line breaks) should be put into a sequence. Thus use the corresponding function "seq_putc". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:52 +10:00
Haren Myneni	146e9f1b65	crypto/nx: Add P9 NX specific error codes for 842 engine This patch adds changes for checking P9 specific 842 engine error codes. These errros are reported in coprocessor status block (CSB) for failures. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:50 +10:00
Christophe Leroy	ad1b0122bd	powerpc/32: remove a NOP from memset() memset() is patched after initialisation to activate the optimised part which uses cache instructions. Today we have a 'b 2f' to skip the optimised patch, which then gets replaced by a NOP, implying a useless cycle consumption. As we have a 'bne 2f' just before, we could use that instruction for the live patching, hence removing the need to have a dedicated 'b 2f' to be replaced by a NOP. This patch changes the 'bne 2f' by a 'b 2f'. During init, that 'b 2f' is then replaced by 'bne 2f' Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:46 +10:00
Christophe Leroy	7bf6057b96	powerpc/32: optimise memset() There is no need to extend the set value to an int when the length is lower than 4 as in that case we only do byte stores. We can therefore immediately branch to the part handling it. By separating it from the normal case, we are able to eliminate a few actions on the destination pointer. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:46 +10:00
Christophe Leroy	c0622167e3	powerpc: fix location of two EXPORT_SYMBOL Commit `9445aa1a30` ("ppc: move exports to definitions") added EXPORT_SYMBOL() for memset() and flush_hash_pages() in the middle of the functions. This patch moves them at the end of the two functions. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:45 +10:00
Christophe Leroy	da74f65920	powerpc/32: add memset16() Commit `694fc88ce2` ("powerpc/string: Implement optimized memset variants") added memset16(), memset32() and memset64() for the 64 bits PPC. On 32 bits, memset64() is not relevant, and as shown below, the generic version of memset32() gives a good code, so only memset16() is candidate for an optimised version. 000009c0 <memset32>: 9c0: 2c 05 00 00 cmpwi r5,0 9c4: 39 23 ff fc addi r9,r3,-4 9c8: 4d 82 00 20 beqlr 9cc: 7c a9 03 a6 mtctr r5 9d0: 94 89 00 04 stwu r4,4(r9) 9d4: 42 00 ff fc bdnz 9d0 <memset32+0x10> 9d8: 4e 80 00 20 blr The last part of memset() handling the not 4-bytes multiples operates on bytes, making it unsuitable for handling word without modification. As it would increase memset() complexity, it is better to implement memset16() from scratch. In addition it has the advantage of allowing a more optimised memset16() than what we would have by using the memset() function. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:45 +10:00
Paul Mackerras	45f62159f3	powerpc: Wrap register number correctly for string load/store instructions Michael Ellerman reported that emulate_loadstore() was trying to access element 32 of regs->gpr[], which doesn't exist, when emulating a string store instruction. This is because the string load and store instructions (lswi, lswx, stswi and stswx) are defined to wrap around from register 31 to register 0 if the number of bytes being loaded or stored is sufficiently large. This wrapping was not implemented in the emulation code. To fix it, we mask the register number after incrementing it. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Fixes: `c9f6f4ed95` ("powerpc: Implement emulation of string loads and stores") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:44 +10:00
Paul Mackerras	d2b65ac652	powerpc: Emulate load/store floating point as integer word instructions This adds emulation for the lfiwax, lfiwzx and stfiwx instructions. This necessitated adding a new flag to indicate whether a floating point or an integer conversion was needed for LOAD_FP and STORE_FP, so this moves the size field in op->type up 4 bits. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:44 +10:00
Paul Mackerras	31bfdb036f	powerpc: Use instruction emulation infrastructure to handle alignment faults This replaces almost all of the instruction emulation code in fix_alignment() with calls to analyse_instr(), emulate_loadstore() and emulate_dcbz(). The only emulation code left is the SPE emulation code; analyse_instr() etc. do not handle SPE instructions at present. One result of this is that we can now handle alignment faults on all the new VSX load and store instructions that were added in POWER9. VSX loads/stores will take alignment faults for unaligned accesses to cache-inhibited memory. Another effect is that we no longer rely on the DAR and DSISR values set by the processor. With this, we now need to include the instruction emulation code unconditionally. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:43 +10:00
Paul Mackerras	a53d5182e2	powerpc: Separate out load/store emulation into its own function This moves the parts of emulate_step() that deal with emulating load and store instructions into a new function called emulate_loadstore(). This is to make it possible to reuse this code in the alignment handler. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:42:41 +10:00
Paul Mackerras	d955189ae4	powerpc: Handle opposite-endian processes in emulation code This adds code to the load and store emulation code to byte-swap the data appropriately when the process being emulated is set to the opposite endianness to that of the kernel. This also enables the emulation for the multiple-register loads and stores (lmw, stmw, lswi, stswi, lswx, stswx) to work for little-endian. In little-endian mode, the partial word at the end of a transfer for lsw/stsw (when the byte count is not a multiple of 4) is loaded/stored at the least-significant end of the register. Additionally, this fixes a bug in the previous code in that it could call read_mem/write_mem with a byte count that was not 1, 2, 4 or 8. Note that this only works correctly on processors with "true" little-endian mode, such as IBM POWER processors from POWER6 on, not the so-called "PowerPC" little-endian mode that uses address swizzling as implemented on the old 32-bit 603, 604, 740/750, 74xx CPUs. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:39:55 +10:00
Paul Mackerras	b9da9c8a48	powerpc: Set regs->dar if memory access fails in emulate_step() This adds code to the instruction emulation code to set regs->dar to the address of any memory access that fails. This address is not necessarily the same as the effective address of the instruction, because if the memory access is unaligned, it might cross a page boundary and fault on the second page. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:39:54 +10:00
Paul Mackerras	b2543f7b20	powerpc: Emulate the dcbz instruction This adds code to analyse_instr() and emulate_step() to understand the dcbz (data cache block zero) instruction. The emulate_dcbz() function is made public so it can be used by the alignment handler in future. (The apparently unnecessary cropping of the address to 32 bits is there because it will be needed in that situation.) Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:39:54 +10:00
Paul Mackerras	1f41fb7904	powerpc: Emulate load/store floating double pair instructions This adds lfdp[x] and stfdp[x] to the set of instructions that analyse_instr() and emulate_step() understand. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:39:53 +10:00
Paul Mackerras	e61ccc7b0c	powerpc: Emulate vector element load/store instructions This adds code to analyse_instr() and emulate_step() to handle the vector element loads and stores: lvebx, lvehx, lvewx, stvebx, stvehx, stvewx. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:39:53 +10:00
Paul Mackerras	c22435a5f3	powerpc: Emulate FP/vector/VSX loads/stores correctly when regs not live At present, the analyse_instr/emulate_step code checks for the relevant MSR_FP/VEC/VSX bit being set when a FP/VMX/VSX load or store is decoded, but doesn't recheck the bit before reading or writing the relevant FP/VMX/VSX register in emulate_step(). Since we don't have preemption disabled, it is possible that we get preempted between checking the MSR bit and doing the register access. If that happened, then the registers would have been saved to the thread_struct for the current process. Accesses to the CPU registers would then potentially read stale values, or write values that would never be seen by the user process. Another way that the registers can become non-live is if a page fault occurs when accessing user memory, and the page fault code calls a copy routine that wants to use the VMX or VSX registers. To fix this, the code for all the FP/VMX/VSX loads gets restructured so that it forms an image in a local variable of the desired register contents, then disables preemption, checks the MSR bit and either sets the CPU register or writes the value to the thread struct. Similarly, the code for stores checks the MSR bit, copies either the CPU register or the thread struct to a local variable, then reenables preemption and then copies the register image to memory. If the instruction being emulated is in the kernel, then we must not use the register values in the thread_struct. In this case, if the relevant MSR enable bit is not set, then emulate_step refuses to emulate the instruction. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:39:52 +10:00
Paul Mackerras	e0a0986b44	powerpc: Make load/store emulation use larger memory accesses At the moment, emulation of loads and stores of up to 8 bytes to unaligned addresses on a little-endian system uses a sequence of single-byte loads or stores to memory. This is rather inefficient, and the code is hard to follow because it has many ifdefs. In addition, the Power ISA has requirements on how unaligned accesses are performed, which are not met by doing all accesses as sequences of single-byte accesses. Emulation of VSX loads and stores uses __copy_{to,from}_user, which means the emulation code has no control on the size of accesses. To simplify this, we add new copy_mem_in() and copy_mem_out() functions for accessing memory. These use a sequence of the largest possible aligned accesses, up to 8 bytes (or 4 on 32-bit systems), to copy memory between a local buffer and user memory. We then rewrite {read,write}_mem_unaligned and the VSX load/store emulation using these new functions. These new functions also simplify the code in do_fp_load() and do_fp_store() for the unaligned cases. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:39:51 +10:00
Paul Mackerras	958465ee54	powerpc: Add emulation for the addpcis instruction The addpcis instruction puts the sum of the next instruction address plus a constant into a register. Since the result depends on the address of the instruction, it will give an incorrect result if it is single-stepped out of line, which is what the *probes subsystem will currently do if a probe is placed on an addpcis instruction. This fixes the problem by adding emulation of it to analyse_instr(). Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:39:51 +10:00
Paul Mackerras	5762e08344	powerpc: Don't update CR0 in emulation of popcnt, prty, bpermd instructions The architecture shows the least-significant bit of the instruction word as reserved for the popcnt[bwd], prty[wd] and bpermd instructions, that is, these instructions never update CR0. Therefore this changes the emulation of these instructions to skip the CR0 update. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:39:50 +10:00
Paul Mackerras	f1bbb99f41	powerpc: Fix emulation of the isel instruction The case added for the isel instruction was added inside a switch statement which uses the 10-bit minor opcode field in the 0x7fe bits of the instruction word. However, for the isel instruction, the minor opcode field is only the 0x3e bits, and the 0x7c0 bits are used for the "BC" field, which indicates which CR bit to use to select the result. Therefore, for the isel emulation to work correctly when BC != 0, we need to match on ((instr >> 1) & 0x1f) == 15). To do this, we pull the isel case out of the switch statement and put it in an if statement of its own. Fixes: `e27f71e5ff` ("powerpc/lib/sstep: Add isel instruction emulation") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:39:49 +10:00
Paul Mackerras	d120cdbce6	powerpc/64: Fix update forms of loads and stores to write 64-bit EA When a 64-bit processor is executing in 32-bit mode, the update forms of load and store instructions are required by the architecture to write the full 64-bit effective address into the RA register, though only the bottom 32 bits are used to address memory. Currently, the instruction emulation code writes the truncated address to the RA register. This fixes it by keeping the full 64-bit EA in the instruction_op structure, truncating the address in emulate_step() where it is used to address memory, rather than in the address computations in analyse_instr(). Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:39:49 +10:00
Paul Mackerras	350779a29f	powerpc: Handle most loads and stores in instruction emulation code This extends the instruction emulation infrastructure in sstep.c to handle all the load and store instructions defined in the Power ISA v3.0, except for the atomic memory operations, ldmx (which was never implemented), lfdp/stfdp, and the vector element load/stores. The instructions added are: Integer loads and stores: lbarx, lharx, lqarx, stbcx., sthcx., stqcx., lq, stq. VSX loads and stores: lxsiwzx, lxsiwax, stxsiwx, lxvx, lxvl, lxvll, lxvdsx, lxvwsx, stxvx, stxvl, stxvll, lxsspx, lxsdx, stxsspx, stxsdx, lxvw4x, lxsibzx, lxvh8x, lxsihzx, lxvb16x, stxvw4x, stxsibx, stxvh8x, stxsihx, stxvb16x, lxsd, lxssp, lxv, stxsd, stxssp, stxv. These instructions are handled both in the analyse_instr phase and in the emulate_step phase. The code for lxvd2ux and stxvd2ux has been taken out, as those instructions were never implemented in any processor and have been taken out of the architecture, and their opcodes have been reused for other instructions in POWER9 (lxvb16x and stxvb16x). The emulation for the VSX loads and stores uses helper functions which don't access registers or memory directly, which can hopefully be reused by KVM later. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:39:48 +10:00
Paul Mackerras	ee0a54d797	powerpc: Don't check MSR FP/VMX/VSX enable bits in analyse_instr() This removes the checks for the FP/VMX/VSX enable bits in the MSR from analyse_instr() and adds them to emulate_step() instead. The reason for this is that we may want to use analyse_instr() in a situation where the FP/VMX/VSX register values are stored in the current thread_struct and the FP/VMX/VSX enable bits in the MSR image in the pt_regs are zero. Since analyse_instr() doesn't make any changes to register state, it is reasonable for it to indicate what the effect of an instruction would be even though the relevant enable bit is off. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:39:48 +10:00
Paul Mackerras	3cdfcbfd32	powerpc: Change analyse_instr so it doesn't modify regs The analyse_instr function currently doesn't just work out what an instruction does, it also executes those instructions whose effect is only to update CPU registers that are stored in struct pt_regs. This is undesirable because optprobes uses analyse_instr to work out if an instruction could be successfully emulated in future. This changes analyse_instr so it doesn't modify regs; instead it stores information in the instruction_op structure to indicate what registers (GPRs, CR, XER, LR) would be set and what value they would be set to. A companion function called emulate_update_regs() can then use that information to update a pt_regs struct appropriately. As a minor cleanup, this replaces inline asm using the cntlzw and cntlzd instructions with calls to __builtin_clz() and __builtin_clzl(). Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-09-01 16:39:27 +10:00
nixiaoming	43f6b0cfb2	KVM: PPC: Book3S HV: Fix memory leak in kvm_vm_ioctl_get_htab_fd We do ctx = kzalloc(sizeof(*ctx), GFP_KERNEL) and then later on call anon_inode_getfd(), but if that fails we don't free ctx, so that memory gets leaked. To fix it, this adds kfree(ctx) in the failure path. Signed-off-by: nixiaoming <nixiaoming@huawei.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2017-09-01 10:17:58 +10:00
Jérôme Glisse	fb1522e099	KVM: update to new mmu_notifier semantic v2 Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Changed since v1 (Linus Torvalds) - remove now useless kvm_arch_mmu_notifier_invalidate_page() Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Tested-by: Mike Galbraith <efault@gmx.de> Tested-by: Adam Borowski <kilobyte@angband.pl> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: kvm@vger.kernel.org Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-31 16:13:00 -07:00
Jérôme Glisse	d1d5762e47	powerpc/powernv: update to new mmu_notifier semantic Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and now are bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Alistair Popple <alistair@popple.id.au> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-31 16:12:59 -07:00
Paul Mackerras	93b2d3cf37	powerpc: Correct instruction code for xxlor instruction The instruction code for xxlor that commit `0016a4cf55` ("powerpc: Emulate most Book I instructions in emulate_step()", 2010-06-15) added is actually the code for xxlnor. It is used in get_vsr() and put_vsr() and the effect of the error is that if emulate_step is used to emulate a VSX load or store from any register other than vsr0, the bitwise complement of the correct value will be loaded or stored. This corrects the error. Fixes: `0016a4cf55` ("powerpc: Emulate most Book I instructions in emulate_step()") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 22:07:19 +10:00
Michael Ellerman	f9effe9250	powerpc: Fix DAR reporting when alignment handler faults Anton noticed that if we fault part way through emulating an unaligned instruction, we don't update the DAR to reflect that. The DAR value is eventually reported back to userspace as the address in the SEGV signal, and if userspace is using that value to demand fault then it can be confused by us not setting the value correctly. This patch is ugly as hell, but is intended to be the minimal fix and back ports easily. Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Paul Mackerras <paulus@ozlabs.org>	2017-08-31 22:06:57 +10:00
John Allen	afb5519fdb	powerpc/pseries: Don't attempt to acquire drc during memory hot add for assigned lmbs Check if an LMB is assigned before attempting to call dlpar_acquire_drc in order to avoid any unnecessary rtas calls. This substantially reduces the running time of memory hot add on lpars with large amounts of memory. [mpe: We need to explicitly set rc to 0 in the success case, otherwise the compiler might think we use rc without initialising it.] Fixes: `c21f515c74` ("powerpc/pseries: Make the acquire/release of the drc for memory a seperate step") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 20:02:23 +10:00
Arvind Yadav	7def9a2418	powerpc/4xx: Constify cpm_suspend_ops struct platform_suspend_ops are not supposed to change at runtime. Functions suspend_set_ops working with const platform_suspend_ops. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 19:56:33 +10:00
Oliver O'Halloran	96d91431d6	powerpc/smp: Add Power9 scheduler topology In previous generations of Power processors each core had a private L2 cache. The Power 9 processor has a slightly different design where the L2 cache is shared among pairs of cores rather than being completely private. Making the scheduler aware of this cache sharing allows the scheduler to make better migration decisions. For example, if two CPU heavy tasks share a core then one task can be migrated to the paired core to improve throughput. Under the existing three level topology the task could be migrated to any core on the same chip, while with the new topology it would be preferentially migrated to the paired core so it remains cache-hot. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 18:16:08 +10:00
Oliver O'Halloran	2a636a56d2	powerpc/smp: Add cpu_l2_cache_map We want to add an extra level to the CPU scheduler topology to account for cores which share a cache. To do this we need to build a cpumask for each CPU that indicates which CPUs share this cache to use as an input to the scheduler. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:56 +10:00
Oliver O'Halloran	df52f67140	powerpc/smp: Rework CPU topology construction The CPU scheduler topology is constructed from a number of per-cpu cpumasks which describe which sets of logical CPUs are related in some fashion. Current code that handles constructing these masks when CPUs are hot(un)plugged can be simplified a bit by exploiting the fact that the scheduler requires higher levels of the toplogy (e.g package level groupings) to be supersets of the lower levels (e.g. threas in a core). This patch reworks the cpumask construction to be simpler and easier to extend with extra topology levels. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [mpe: Fix CONFIG_HOTPLUG_CPU=n build] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:51 +10:00
Oliver O'Halloran	e3d8b67e2c	powerpc/smp: Use cpu_to_chip_id() to find core siblings When building the CPU scheduler topology the kernel uses the ibm,chipid property from the devicetree to group logical CPUs. Currently the DT search for this property is open-coded in smp.c and this functionality is a duplication of what's in cpu_to_chip_id() already. This patch removes the existing search in favor of that. It's worth mentioning that the semantics of the search are different in cpu_to_chip_id(). When there is no ibm,chipid in the CPUs node it will also search /cpus and / for the property, but this should not effect the output topology. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:50 +10:00
Hannes Reinecke	866bfc75f4	powerpc: conditionally compile platform-specific serial drivers mpsc.c and mpc52xx-psc.c are platform-specific serial drivers, and should be compiled for the respective platforms only. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:48 +10:00
Tobin C. Harding	eb039161da	powerpc/asm: Convert .llong directives to .8byte .llong is an undocumented PPC specific directive. The generic equivalent is .quad, but even better (because it's self describing) is .8byte. Convert all .llong directives to .8byte. Signed-off-by: Tobin C. Harding <me@tobin.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:47 +10:00
Balbir Singh	5b593949f8	powerpc/configs: Enable THP and 64K for ppc64(le)_defconfig Enable 64K page size and THP. I use ppc64le_defconfig when I need a single config across guest and host, but having 4K page size as default is not what I expect. I could move these over to server.config and merge if ppc64_defconfig is meant for systems that use 4k pages by default. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:47 +10:00
Balbir Singh	539df7fcb3	powerpc/configs: Enable function trace by default Most (all?) distros turn these on, so it makes sense to enable them for testing coverage, and they're also useful for developers. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> [mpe: Reword change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:45 +10:00
Balbir Singh	d1e1b351f5	powerpc/xmon: Add ISA v3.0 SPRs to SPR dump Add support for printing the PIDR/TIDR for ISA 300 and PSSCR and PTCR in ISA 3.0 hypervisor mode. SPRN_PSSCR_PR is the privileged mode access and is used when we are not in hypervisor mode. Signed-off-by: Balbir Singh <bsingharora@gmail.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:45 +10:00
Balbir Singh	64d66aa051	powerpc/xmon: Add AMR, UAMOR, AMOR, IAMR to SPR dump This patch adds support to xmon for dumping the AMR, UAMOR, AMOR and IAMR SPRs based on their supported ISA revisions. Signed-off-by: Balbir Singh <bsingharora@gmail.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:44 +10:00
Balbir Singh	cf9159c36c	powerpc/xmon: Dump all 64 bits of HDEC ISA 3.0 defines hypervisor decrementer to be 64 bits in length. This patch extends the print format for to be 64 bits. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:43 +10:00
Masahiro Yamada	7f2462acb6	powerpc: Squash lines for simple wrapper functions Remove unneeded variables and assignments. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:42 +10:00
Michael Ellerman	6deb6b474b	powerpc/mm/radix: Prettify mapped memory range print out When we map memory at boot we print out the ranges of real addresses that we mapped and the page size that was used. Currently it's a bit ugly: Mapped range 0x0 - 0x2000000000 with 0x40000000 Mapped range 0x200000000000 - 0x202000000000 with 0x40000000 Pad the addresses so they line up, and print the page size using actual units, eg: Mapped 0x0000000000000000-0x0000000001200000 with 64.0 KiB pages Mapped 0x0000000001200000-0x0000000040000000 with 2.00 MiB pages Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:42 +10:00
Michael Ellerman	bd350f7121	powerpc/mm/radix: Add pr_fmt() to pgtable-radix.c Make the printks look a bit nicer by adding a prefix. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:41 +10:00
Bryant G. Ly	f9df74dfce	powerpc/kernel: Change retrieval of pci_dn For a PCI device it's pci_dn can be retrieved from pdev->dev.archdata.firmware_data, PCI_DN(devnode), or parent's list. Thus, we should just use the existing function pci_get_pdn_by_devfn to get the pci_dn. Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reviewed-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:40 +10:00
Sukadev Bhattiprolu	2392c8c8c0	powerpc/powernv/vas: Define copy/paste interfaces Define interfaces (wrappers) to the 'copy' and 'paste' instructions (which are new in PowerISA 3.0). These are intended to be used to by NX driver(s) to submit Coprocessor Request Blocks (CRBs) to the NX hardware engines. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:38 +10:00
Sukadev Bhattiprolu	5239af679a	powerpc/powernv/vas: Define vas_tx_win_open() Define an interface to open a VAS send window. This interface is intended to be used the Nest Accelerator (NX) driver(s) to open a send window and use it to submit compression/encryption requests to a VAS receive window. The receive window, identified by the [vasid, cop] parameters, must already be open in VAS (i.e connected to an NX engine). Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:37 +10:00
Sukadev Bhattiprolu	98271d4198	powerpc/powernv/vas: Define vas_win_close() interface Define the vas_win_close() interface which should be used to close a send or receive windows. While the hardware configurations required to open send and receive windows differ, the configuration to close a window is the same for both. So we use a single interface to close the window. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:37 +10:00
Sukadev Bhattiprolu	62c4eda4fa	powerpc/powernv/vas: Define vas_rx_win_open() interface Define the vas_rx_win_open() interface. This interface is intended to be used by the Nest Accelerator (NX) driver(s) to setup receive windows for one or more NX engines (which implement compression & encryption algorithms in the hardware). Follow-on patches will provide an interface to close the window and to open a send window that kernel subsystems can use to access the NX engines. The interface to open a receive window is expected to be invoked for each instance of VAS in the system. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:36 +10:00
Sukadev Bhattiprolu	bbfe59f8a7	powerpc/powernv/vas: Define helpers to alloc/free windows Define helpers to allocate/free VAS window objects. These will be used in follow-on patches when opening/closing windows. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:35 +10:00
Sukadev Bhattiprolu	b25b33ac18	powerpc/powernv/vas: Define helpers to init window context Define helpers to initialize window context registers of the VAS hardware. These will be used in follow-on patches when opening/closing VAS windows. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:33 +10:00
Sukadev Bhattiprolu	180fe15a82	powerpc/powernv/vas: Define helpers to access MMIO regions Define some helper functions to access the MMIO regions. We use these in follow-on patches to read/write VAS hardware registers. They are also used to later issue 'paste' instructions to submit requests to the NX hardware engines. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:30 +10:00
Sukadev Bhattiprolu	4dea2d1a92	powerpc/powernv/vas: Define vas_init() and vas_exit() Implement vas_init() and vas_exit() functions for a new VAS module. This VAS module is essentially a library for other device drivers and kernel users of the NX coprocessors like NX-842 and NX-GZIP. In the future this will be extended to add support for user space to access the NX coprocessors. VAS is currently only supported with 64K page size. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:26 +10:00
Sukadev Bhattiprolu	b6622a339e	powerpc/powernv: Move GET_FIELD/SET_FIELD to vas.h Move the GET_FIELD and SET_FIELD macros to vas.h as VAS and other users of VAS, including NX-842 can use those macros. There is a lot of related code between the VAS/NX kernel drivers and skiboot. For consistency, switch the order of parameters in SET_FIELD to match the order in skiboot. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Reviewed-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:20 +10:00
Sukadev Bhattiprolu	967689141e	powerpc/powernv/vas: Define macros, register fields and structures Define macros for the VAS hardware registers and bit-fields as well as couple of data structures needed by the VAS driver. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> [mpe: Fixup include guard to use _ASM_POWERPC_VAS_H] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:17 +10:00
Balbir Singh	c47a94031e	powerpc/xmon: Fix display of SPRs Convert 0.16x to 0.16lx. Otherwise we lose the top 8 nibbles and effectively print only the last 32 bits. Fixes: `1846193b17` ("powerpc/xmon: Dump ISA 2.06 SPRs") Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:13 +10:00
Alexey Kardashevskiy	f1e08232ed	powerpc/pci: Remove OF node back pointer from pci_dn The check_req() helper uses pci_get_pdn() to get an OF node pointer. pci_get_pdn() returns a pci_dn pointer which either: 1) from the OF node returned by pci_device_to_OF_node(); 2) from the parent child_list where entries don't have OF node pointers. Since check_req() does not care about 2), it can call pci_device_to_OF_node() directly, hence the change. The find_pe_dn() helper uses embedded pci_dn to get an OF node which is also stored in edev->pdev so let's take a shortcut and call pci_device_to_OF_node() directly. With these 2 changes, we can finally get rid of the OF node back pointer. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:12 +10:00
Alexey Kardashevskiy	14db3d52d3	powerpc/eeh: Reduce use of pci_dn::node The pci_dn struct caches a OF device node pointer in order to access the "ibm,loc-code" property when EEH is recovering. However, when this happens in eeh_dev_check_failure(), we also have a pci_dev pointer which should have a valid pointer to the device node when pci_dn has one (both pointers are not NULL for physical functions and are NULL for virtual functions). This changes pci_remove_device_node_info() to look for a parent of the node being removed, just like pci_add_device_node_info() does when it references the parent node. This is the first step to get rid of pci_dn::node. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:10 +10:00
Alexey Kardashevskiy	405b33a76d	powerpc/eeh: Remove unnecessary config_addr from eeh_dev The eeh_dev struct hold a config space address of an associated node and the very same address is also stored in the pci_dn struct which is always present during the eeh_dev lifetime. This uses bus:devfn directly from pci_dn instead of cached and packed config_addr. Since config_addr is made from device's bus:dev.fn, there is no point in keeping it in the debugfs either so remove that too. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:09 +10:00
Alexey Kardashevskiy	69672bd748	powerpc/eeh: Remove unnecessary pointer to phb from eeh_dev The eeh_dev struct already holds a pointer to pci_dn which it does not exist without and pci_dn itself holds the very same pointer so just use it. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:09 +10:00
Alexey Kardashevskiy	8bae6a2319	powerpc/eeh: Reduce to one the number of places where edev is allocated arch/powerpc/kernel/eeh_dev.c:57 is the only legit place where edev is allocated; other 2 places allocate it on stack and in the heap for a very short period of time to use eeh_pe_get() as takes edev. This changes eeh_pe_get() to receive required parameters explicitly. This removes unnecessary temporary allocation of edev. This uses the "pe_no" name instead of the "pe_config_addr" name as it actually is a PE number and not a config space address as it seemed. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:08 +10:00
Alexey Kardashevskiy	5f600b17d1	powerpc/pci: Remove unused parameter from add_one_dev_pci_data() pdev is always NULL, remove it. To make checkpatch.pl happy, this also removes the "out of memory" message. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:07 +10:00
Arvind Yadav	9e2b70fbbc	powerpc/512x: Constify clk_div_tables clk_div_tables are not supposed to change at runtime. mpc512x_clk_divtable function working with const clk_div_table. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:06 +10:00
Dan Carpenter	8d046759f6	powerpc/44x: Fix mask and shift to zero bug My static checker complains that 0x00001800 >> 13 is zero. Looking at the context, it seems like a copy and paste bug from the line below and probably 0x3 << 13 or 0x00006000 was intended. Fixes: `2af59f7d5c` ("[POWERPC] 4xx: Add 405GPr and 405EP support in boot wrapper") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:06 +10:00
Dan Carpenter	c65540453e	powerpc/83xx: Use sizeof correct type when ioremapping There is a cut and paste error here so we use sizeof(struct mpc83xx_pmc) to remap the memory for "clock_regs". That sizeof() is 20 bytes and we only need to remap 12 bytes. It presumably doesn't affect run time too much... I changed them to both use sizeof(*variable_name) because that's the preferred kernel style these days. Fixes: `d49747bdfb` ("powerpc/mpc83xx: Power Management support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> [mpe: It will map at least one page anyway, but still a good cleanup] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:05 +10:00
Nicholas Piggin	b96672dd84	powerpc: Machine check interrupt is a non-maskable interrupt Use nmi_enter similarly to system reset interrupts. This uses NMI printk NMI buffers and turns off various debugging facilities that helps avoid tripping on ourselves or other CPUs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:04 +10:00
Nicholas Piggin	6fcd6baa90	powerpc/powernv: Use kernel crash path for machine checks There are quite a few machine check exceptions that can be caused by kernel bugs. To make debugging easier, use the kernel crash path in cases of synchronous machine checks that occur in kernel mode, if that would not result in the machine going straight to panic or crash dump. There is a downside here that die()ing the process in kernel mode can still leave the system unstable. panic_on_oops will always force the system to fail-stop, so systems where that behaviour is important will still do the right thing. As a test, when triggering an i-side 0111b error (ifetch from foreign address) in kernel mode process context on POWER9, the kernel currently dies quickly like this: Severe Machine check interrupt [Not recovered] NIP [ffff000000000000]: 0xffff000000000000 Initiator: CPU Error type: Real address [Instruction fetch (foreign)] [ 127.426651616,0] OPAL: Reboot requested due to Platform error. Effective[ 127.426693712,3] OPAL: Reboot requested due to Platform error. address: ffff000000000000 opal: Reboot type 1 not supported Kernel panic - not syncing: PowerNV Unrecovered Machine Check CPU: 56 PID: 4425 Comm: syscall Tainted: G M 4.12.0-rc1-13857-ga4700a261072-dirty #35 Call Trace: [ 128.017988928,4] IPMI: BUG: Dropping ESEL on the floor due to buggy/mising code in OPAL for this BMC Rebooting in 10 seconds.. Trying to free IRQ 496 from IRQ context! After this patch, the process is killed and the kernel continues with this message, which gives enough information to identify the offending branch (i.e., with CFAR): Severe Machine check interrupt [Not recovered] NIP [ffff000000000000]: 0xffff000000000000 Initiator: CPU Error type: Real address [Instruction fetch (foreign)] Effective address: ffff000000000000 Oops: Machine check, sig: 7 [#1] SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 ... CPU: 22 PID: 4436 Comm: syscall Tainted: G M 4.12.0-rc1-13857-ga4700a261072-dirty #36 task: c000000932300000 task.stack: c000000932380000 NIP: ffff000000000000 LR: 00000000217706a4 CTR: ffff000000000000 REGS: c00000000fc8fd80 TRAP: 0200 Tainted: G M (4.12.0-rc1-13857-ga4700a261072-dirty) MSR: 90000000001c1003 <SF,HV,ME,RI,LE> CR: 24000484 XER: 20000000 CFAR: c000000000004c80 DAR: 0000000021770a90 DSISR: 0a000000 SOFTE: 1 GPR00: 0000000000001ebe 00007fffce4818b0 0000000021797f00 0000000000000000 GPR04: 00007fff8007ac24 0000000044000484 0000000000004000 00007fff801405e8 GPR08: 900000000280f033 0000000024000484 0000000000000000 0000000000000030 GPR12: 9000000000001003 00007fff801bc370 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR28: 00007fff801b0000 0000000000000000 00000000217707a0 00007fffce481918 NIP [ffff000000000000] 0xffff000000000000 LR [00000000217706a4] 0x217706a4 Call Trace: Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:04 +10:00
Nicholas Piggin	b746e3e01e	powerpc/powernv: Flush console before platform error reboot Unrecovered MCE and HMI errors are sent through a special restart OPAL call to log the platform error. The downside is that they don't go through normal Linux crash paths, so they don't give much information to the Linux console. Change this by providing a special crash function which does some of the console flushing from the panic() path before calling firmware to reboot. The downside of this is a little more code to execute before reaching the firmware reboot. However in practice, it's critical to get the Linux console messages output in order to debug a problem. So this is a desirable tradeoff. Note on the implementation: It is difficult to plumb a custom reboot handler into the panic path, because panic does a little bit too much work. For example, it will try to delay with the timebase, but that may be corrupted in some cases resulting in a hang without reaching the platform reboot. Another problem is that panic can invoke the crash dump code which is not what we want in the case of a hardware platform error. Long-term the best solution will be to rework the panic path so it can be suitable for this kind of panic, but for now we just duplicate a bit of the code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:03 +10:00
Nicholas Piggin	4388c9b3a6	powerpc: Do not send system reset request through the oops path A system reset is a request to crash / debug the system rather than necessarily caused by encountering a BUG. So there is no need to serialize all CPUs behind the die lock, adding taints to all subsequent traces beyond the first, breaking console locks, etc. The system reset is NMI context which has its own printk buffers to prevent output being interleaved. Then it's better to have all secondaries print out their debug as quickly as possible and the primary will flush out all printk buffers during panic(). So remove the 0x100 path from die, and move it into system_reset. Name the crash/dump reasons "System Reset". This gives "not tained" traces when crashing an untainted kernel. It also gives the panic reason as "System Reset" as opposed to "Fatal exception in interrupt" (or "die oops" for fadump). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:02 +10:00
Nicholas Piggin	bded070643	powerpc/pseries/le: Work around a firmware quirk Some PowerVM firmware when delivering a system reset interrupt to a little endian OS will mess up SRR registers. They are byteswapped, and SRR1 is incorrect. An example from a crash: NIP: 14dd0900000000c0 MSR: 1000000200000080 It's possible to detect this pattern in SRR1 (that would never happen in normal operation), and at least fix the NIP. After this patch, the same interrupt reports NIP properly: NIP [c00000000009dd14] plpar_hcall_norets+0x1c/0x28 Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:02 +10:00
Nicholas Piggin	a3b2cb30f2	powerpc: Do not call ppc_md.panic in fadump panic notifier If fadump is not registered, and no other crash or debug handlers are registered, the powerpc panic handler stops the guest before the generic panic code can push out debug information to the console. Currently, system reset injection causes the guest to silently stop. Stop calling ppc_md.panic in the panic notifier. crash_fadump already does rtas_os_term() to terminate the guest if fadump is registered. Remove ppc_md.panic. Move fadump panic notifier into fadump code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:01 +10:00
Nicholas Piggin	70412c55d4	powerpc/64: Fix watchdog configuration regressions This fixes a couple more bits of fallout from the new hard lockup watchdog patch. It restores the required hw_nmi_get_sample_period() function for the perf watchdog, and removes some function declarations on 64e that are only defined for 64s. This fixes the 64e build when the hardlockup detector is enabled. It restores the default behaviour of disabling the perf watchdog, and also fixes disabling the 64s watchdog when running as a guest. Fixes: `2104180a53` ("powerpc/64s: implement arch-specific hardlockup watchdog") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:26:00 +10:00
Nicholas Piggin	b68b1d7487	powerpc/64s/radix: Do not allocate SLB shadow structures These are unused in radix mode. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:25:59 +10:00
Nicholas Piggin	d55071905e	powerpc/64s/radix: Remove bolted-SLB address limit for per-cpu stacks Radix MMU does not take SLB or TLB interrupts when accessing kernel linear address. Remove this restriction for radix mode. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:25:59 +10:00
Nicholas Piggin	76b42e28be	powerpc/powernv: powernv platform is not constrained by RMA Remove incorrect comment about real mode address restrictions on powernv (bare metal), and unnecessary clamping to ppc64_rma_size. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-31 14:25:58 +10:00
Paul Mackerras	4dafecde44	Merge remote-tracking branch 'remotes/powerpc/topic/ppc-kvm' into kvm-ppc-next This merges in the 'ppc-kvm' topic branch from the powerpc tree in order to bring in some fixes which touch both powerpc and KVM code. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2017-08-31 12:37:03 +10:00
Paul Mackerras	e3bfed1df3	KVM: PPC: Book3S HV: Report storage key support to userspace This adds information about storage keys to the struct returned by the KVM_PPC_GET_SMMU_INFO ioctl. The new fields replace a pad field, which was zeroed by previous kernel versions. Thus userspace that knows about the new fields will see zeroes when running on an older kernel, indicating that storage keys are not supported. The size of the structure has not changed. The number of keys is hard-coded for the CPUs supported by HV KVM, which is just POWER7, POWER8 and POWER9. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2017-08-31 12:36:44 +10:00
Paul Mackerras	a4faf2e77a	KVM: PPC: Book3S HV: Fix case where HDEC is treated as 32-bit on POWER9 Commit `2f2724630f` ("KVM: PPC: Book3S HV: Cope with host using large decrementer mode", 2017-05-22) added code to treat the hypervisor decrementer (HDEC) as a 64-bit value on POWER9 rather than 32-bit. Unfortunately, that commit missed one place where HDEC is treated as a 32-bit value. This fixes it. This bug should not have any user-visible consequences that I can think of, beyond an occasional unnecessary exit to the host kernel. If the hypervisor decrementer has gone negative, then the bottom 32 bits will be negative for about 4 seconds after that, so as long as we get out of the guest within those 4 seconds we won't conclude that the HDEC interrupt is spurious. Reported-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Fixes: `2f2724630f` ("KVM: PPC: Book3S HV: Cope with host using large decrementer mode") Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2017-08-31 12:36:44 +10:00
Andreas Schwab	0bfa33c7f7	KVM: PPC: Book3S HV: Fix invalid use of register expression binutils >= 2.26 now warns about misuse of register expressions in assembler operands that are actually literals. In this instance r0 is being used where a literal 0 should be used. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> [mpe: Split into separate KVM patch, tweak change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2017-08-31 12:36:44 +10:00
Nicholas Piggin	eaac112eac	KVM: PPC: Book3S HV: Fix H_REGISTER_VPA VPA size validation KVM currently validates the size of the VPA registered by the client against sizeof(struct lppaca), however we align (and therefore size) that struct to 1kB to avoid crossing a 4kB boundary in the client. PAPR calls for sizes >= 640 bytes to be accepted. Hard code this with a comment. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2017-08-31 12:36:44 +10:00
Ram Pai	d182b8fd60	KVM: PPC: Book3S HV: Fix setting of storage key in H_ENTER In handling a H_ENTER hypercall, the code in kvmppc_do_h_enter clobbers the high-order two bits of the storage key, which is stored in a split field in the second doubleword of the HPTE. Any storage key number above 7 hence fails to operate correctly. This makes sure we preserve all the bits of the storage key. Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2017-08-31 12:36:44 +10:00
Dan Carpenter	50a1a25987	KVM: PPC: e500mc: Fix a NULL dereference We should set "err = -ENOMEM;", otherwise it means we're returning ERR_PTR(0) which is NULL. It results in a NULL pointer dereference in the caller. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2017-08-31 12:36:44 +10:00
Dan Carpenter	73e77c0982	KVM: PPC: e500: Fix some NULL dereferences on error There are some error paths in kvmppc_core_vcpu_create_e500() where we forget to set the error code. It means that we return ERR_PTR(0) which is NULL and it results in a NULL pointer dereference in the caller. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2017-08-31 12:36:44 +10:00
Paul Mackerras	edd03602d9	KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list Al Viro pointed out that while one thread of a process is executing in kvm_vm_ioctl_create_spapr_tce(), another thread could guess the file descriptor returned by anon_inode_getfd() and close() it before the first thread has added it to the kvm->arch.spapr_tce_tables list. That highlights a more general problem: there is no mutual exclusion between writers to the spapr_tce_tables list, leading to the possibility of the list becoming corrupted, which could cause a host kernel crash. To fix the mutual exclusion problem, we add a mutex_lock/unlock pair around the list_del_rce in kvm_spapr_tce_release(). Also, this moves the call to anon_inode_getfd() inside the region protected by the kvm->lock mutex, after we have done the check for a duplicate LIOBN. This means that if another thread does guess the file descriptor and closes it, its call to kvm_spapr_tce_release() will not do any harm because it will have to wait until the first thread has released kvm->lock. With this, there are no failure points in kvm_vm_ioctl_create_spapr_tce() after the call to anon_inode_getfd(). The other things that the second thread could do with the guessed file descriptor are to mmap it or to pass it as a parameter to a KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl on a KVM device fd. An mmap call won't cause any harm because kvm_spapr_tce_mmap() and kvm_spapr_tce_fault() don't access the spapr_tce_tables list or the kvmppc_spapr_tce_table.list field, and the fields that they do use have been properly initialized by the time of the anon_inode_getfd() call. The KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl calls kvm_spapr_tce_attach_iommu_group(), which scans the spapr_tce_tables list looking for the kvmppc_spapr_tce_table struct corresponding to the fd given as the parameter. Either it will find the new entry or it won't; if it doesn't, it just returns an error, and if it does, it will function normally. So, in each case there is no harmful effect. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2017-08-30 14:59:31 +10:00
Kan Liang	fc7ce9c74c	perf/core, x86: Add PERF_SAMPLE_PHYS_ADDR For understanding how the workload maps to memory channels and hardware behavior, it's very important to collect address maps with physical addresses. For example, 3D XPoint access can only be found by filtering the physical address. Add a new sample type for physical address. perf already has a facility to collect data virtual address. This patch introduces a function to convert the virtual address to physical address. The function is quite generic and can be extended to any architecture as long as a virtual address is provided. - For kernel direct mapping addresses, virt_to_phys is used to convert the virtual addresses to physical address. - For user virtual addresses, __get_user_pages_fast is used to walk the pages tables for user physical address. - This does not work for vmalloc addresses right now. These are not resolved, but code to do that could be added. The new sample type requires collecting the virtual address. The virtual address will not be output unless SAMPLE_ADDR is applied. For security, the physical address can only be exposed to root or privileged user. Tested-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: mpe@ellerman.id.au Link: http://lkml.kernel.org/r/1503967969-48278-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-08-29 15:09:25 +02:00
Nicholas Piggin	72b0d51d97	powerpc/64s: idle POWER9 can execute stop in virtual mode The hardware can execute stop in any context, and KVM does not require real mode because siblings do not share MMU state. This saves a switch to real-mode when going idle. Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-29 21:42:14 +10:00
Nicholas Piggin	65dbbe812f	powerpc/64s: Drop no longer used IDLE_STATE_ENTER_SEQ There are no longer any callers of IDLE_STATE_ENTER_SEQ, all callers use IDLE_STATE_ENTER_SEQ_NORET. So drop the former. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Split out of larger patch, write change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-29 21:41:44 +10:00
Nicholas Piggin	56ee52408e	powerpc/64s: POWER9 can execute stop without a sync sequence We don't need to use IDLE_STATE_ENTER_SEQ_NORET on Power9. Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-29 21:39:07 +10:00
Nicholas Piggin	aafc8a8300	powerpc/64s: Move IDLE_STATE_ENTER_SEQ[_NORET] into idle_book3s.S This macro is only used in idle_book3s.S, move it in there and add a more descriptive comment. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Split out of larger patch and write change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-29 21:38:47 +10:00
Michael Ellerman	82b7fcc005	Merge branch 'topic/ppc-kvm' into next Merge Nicks commit to rework the KVM thread management, shared with the KVM tree via the ppc-kvm topic branch.	2017-08-29 21:26:30 +10:00
Nicholas Piggin	94a04bc25a	KVM: PPC: Book3S HV: POWER9 does not require secondary thread management POWER9 CPUs have independent MMU contexts per thread, so KVM does not need to quiesce secondary threads, so the hwthread_req/hwthread_state protocol does not have to be used. So patch it away on POWER9, and patch away the branch from the Linux idle wakeup to kvm_start_guest that is never used. Add a warning and error out of kvmppc_grab_hwthread in case it is ever called on POWER9. This avoids a hwsync in the idle wakeup path on POWER9. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> [mpe: Use WARN(...) instead of WARN_ON()/pr_err(...)] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-29 14:48:59 +10:00
Matt Weber	a4e89ffb59	powerpc/e6500: Update machine check for L1D cache err This patch updates the machine check handler of Linux kernel to handle the e6500 architecture case. In e6500 core, L1 Data Cache Write Shadow Mode (DCWS) register is not implemented but L1 data cache always runs in write shadow mode. So, on L1 data cache parity errors, hardware will automatically invalidate the data cache but will still log a machine check interrupt. Signed-off-by: Ronak Desai <ronak.desai@rockwellcollins.com> Signed-off-by: Matthew Weber <matthew.weber@rockwellcollins.com> Signed-off-by: Scott Wood <oss@buserror.net>	2017-08-28 23:15:32 -05:00
Michael Ellerman	12c15a7e70	powerpc/configs/6xx: Drop removed CONFIG_USB_LED In commit `a335aaf312` ("usb: misc: remove outdated USB LED driver") CONFIG_USB_LED was removed, so drop it from our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:25 +10:00
Michael Ellerman	b1f9a827e4	powerpc/configs/6xx: Drop no longer selectable CONFIG_BT_HCIUART_LL Since commit `76c4969fec` ("Bluetooth: hci_uart: fix kconfig dependency") we can no longer select CONFIG_BT_HCIUART_LL. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:24 +10:00
Michael Ellerman	8f67600f21	powerpc/configs/c2k: Switch CONFIG_GEN_RTC from =m to =y In commit `835ea93e9d` ("char/genrtc: remove powerpc support"), CONFIG_GEN_RTC switch from tristate to bool, update the defconfig to match. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:24 +10:00
Michael Ellerman	2d8b1ca3d9	powerpc/configs/6xx: Switch CONFIG_USB_EHCI_FSL to =m In commit `ca07e1c1e4` ("drivers:usb:fsl:Make fsl ehci drv an independent driver module"), CONFIG_USB_EHCI_FSL was switched from built-in to modular. Update the defconfig. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:23 +10:00
Michael Ellerman	dcb5956154	powerpc/configs/6xx: Drop no longer needed CONFIG_BT_HCIUART_H4 Since commit `943cc59219` ("Bluetooth: bpa10x: Use h4_recv_buf helper for frame reassembly") we no longer need to set CONFIG_BT_HCIUART_H4 in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:23 +10:00
Michael Ellerman	c29f9b31bb	powerpc/configs/6xx: Drop no longer needed CONFIG_NETFILTER_XT_MATCH_SOCKET Since commit `8db4c5be88` ("netfilter: move socket lookup infrastructure to nf_socket_ipv{4,6}.c") we no longer need to set CONFIG_NETFILTER_XT_MATCH_SOCKET in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:22 +10:00
Michael Ellerman	6946d5e1fa	powerpc/configs/6xx: Reinstate CONFIG_CPU_FREQ_STAT In commit `1aefc75b24` ("cpufreq: stats: Make the stats code non-modular"), the CPU_FREQ_STAT code was made non-modular. Our defconfig still said =m though, which meant we no longer got the code at all. Switch the defconfig to =y. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:21 +10:00
Michael Ellerman	886a3bacb7	powerpc/configs/6xx: Drop no longer needed CONFIG_NF_CONNTRACK_PROC_COMPAT Since commit `adf0516845` ("netfilter: remove ip_conntrack* sysctl compat code") we no longer need to set CONFIG_NF_CONNTRACK_PROC_COMPAT in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:20 +10:00
Michael Ellerman	1d66e404b9	powerpc/configs/6xx: Drop removed CONFIG_BLK_DEV_HD In commit `8e14be53f4` ("remove the obsolete hd driver") the CONFIG_BLK_DEV_HD symbol was removed, so drop it from the defconfig. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:20 +10:00
Michael Ellerman	533141ae0e	powerpc/configs/6xx: Clean up duplicate CONFIG_EXT4 values We had two values for CONFIG_EXT4, =m and =y, just use =y. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:19 +10:00
Michael Ellerman	f623a54e83	powerpc/configs/6xx: Drop no longer needed CONFIG_TIMER_STATS Since commit `dfb4357da6` ("time: Remove CONFIG_TIMER_STATS") we no longer need to set CONFIG_TIMER_STATS in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:19 +10:00
Michael Ellerman	f963937a16	powerpc/configs/6xx: Turn CONFIG_DRM_RADEON back on In commit `d92d9c3a14` ("drm: hide legacy drivers with CONFIG_DRM_LEGACY") CONFIG_DRM_RADEON was moved behind CONFIG_DRM_LEGACY meaning it stopped being enabled by ppc6xx_defconfig. Although no one has noticed, given this is basically a legacy platform, it seems anyone who is using it probably still wants this driver. So turn it back on for now. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:18 +10:00
Michael Ellerman	22220b16df	powerpc/configs/mpc5200: Drop no longer needed CONFIG_FB Since commit `a03fdcb186` ("drm: Add top level Kconfig option for DRM fbdev emulation") we no longer need to set CONFIG_FB in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:18 +10:00
Michael Ellerman	edff694087	powerpc/configs: Update for CONFIG_INPUT_MOUSEDEV=n In commit `73d8ef7600` ("Input: mousedev - stop offering PS/2 to userspace by default") the symbol INPUT_MOUSEDEV went from being 'default y' to 'default n' (implied). That means we no longer need to explicitly disable it in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:17 +10:00
Michael Ellerman	f6fe7a1583	powerpc/configs: Drop removed CONFIG_LOGFS In commit `1d0fd57a50` ("logfs: remove from tree"), logfs was removed from the tree, so we can drop it from our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:17 +10:00
Michael Ellerman	dc3f5d2168	powerpc/configs: Turn CONFIG_R128 back in pmac32_defconfig In commit `d92d9c3a14` ("drm: hide legacy drivers with CONFIG_DRM_LEGACY") CONFIG_R128 was moved behind CONFIG_DRM_LEGACY meaning it stopped being enabled by pmac32_defconfig. Although no one has noticed, given this is basically a legacy platform, it seems anyone who is using it probably still wants this driver. So turn it back on for now. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:16 +10:00
Michael Ellerman	641f656db0	powerpc/configs: Drop no longer needed CONFIG_LIBCRC32C Since commit `300ae14946` ("netfilter: select LIBCRC32C together with SCTP conntrack") we no longer need to set CONFIG_LIBCRC32C in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:16 +10:00
Michael Ellerman	ab306d1811	powerpc/configs: Drop unnecessary CONFIG_EDAC from ppc64e There are no EDAC drivers for ppc64e. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:15 +10:00
Michael Ellerman	5593edb176	powerpc/configs: Drop no longer needed CONFIG_SCSI Since commit `67f6d66559` ("powerpc: convert amigaone_defconfig to use libata PATA drivers") we no longer need to set CONFIG_SCSI in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:15 +10:00
Michael Ellerman	72ac99688d	powerpc/configs: Drop no longer needed CONFIG_IPV6 Since commit `de551f2eb2` ("net: Build IPv6 into kernel by default") we no longer need to set CONFIG_IPV6 in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:14 +10:00
Michael Ellerman	aad004a945	powerpc/configs: Add CONFIG_RAS now required for CONFIG_EDAC In commit `e3c4ff6d8c` ("EDAC: Remove EDAC_MM_EDAC") CONFIG_EDAC grew a dependency on CONFIG_RAS. Some of our defconfigs don't have the latter, which means we lose CONFIG_EDAC, so add CONFIG_RAS to fix that. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:14 +10:00
Michael Ellerman	f3a45e560f	powerpc/configs: Drop no longer needed CONFIG_AUDITSYSCALL Since commit `cb74ed278f` ("audit: always enable syscall auditing when supported and audit is enabled") we no longer need to set CONFIG_AUDITSYSCALL in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:13 +10:00
Michael Ellerman	599f959fcf	powerpc/configs: Drop CONFIG_SERIAL_TXX9_* from cell/ppc64 In commit `bf4981a006` ("powerpc: Remove the celleb support") we dropped the celleb support, which made these symbols unselectable because we no longer select HAS_TX99_SERIAL. So drop them from the defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:13 +10:00
Michael Ellerman	e9cb40a7d6	powerpc/configs: Drop MEMORY_HOTREMOVE from ppc64/cell xxxx In commit `577ec789a7` ("powerpc/cell: Drop select of MEMORY_HOTPLUG") we removed the last traces of any dependency between Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:12 +10:00
Michael Ellerman	5a73b1a18a	powerpc/configs: Drop unnecessary CONFIG_POWERNV_OP_PANEL In commit `43a1dd9b5f` ("powerpc/powernv: Add driver for operator panel on FSP machines") we added CONFIG_POWERNV_OP_PANEL=m to the powernv defconfig, but it's default m so that's no necessary. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:11 +10:00
Michael Ellerman	8b2ee33d88	powerpc/configs: Drop no longer needed PCI_MSI on powernv In commit `a311e738b6` ("powerpc/powernv: Make PCI non-optional") we made PCI (and therefore PCI_MSI) non-optional on powernv, so it doesn't need to be in the defconfig anymore. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:11 +10:00
Michael Ellerman	4ca2ddfc47	powerpc/configs: Drop no longer needed CONFIG_SMP for pseries/ppc64/powernv In commit `40e275653e` ("powerpc/powernv: Always enable SMP when building powernv") and `270e2dc9b8` ("powerpc/pseries: Always enable SMP when building pseries") we forced CONFIG_SMP on for some configs. Therefore we don't need to set it in those configs anymore. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:10 +10:00
Michael Ellerman	b9bc4bbc7c	powerpc/configs: Drop unnecessary CONFIG_UPROBE_EVENT In commit `6b0b755142` ("perf/core: Rename CONFIG_[UK]PROBE_EVENT to CONFIG_[UK]PROBE_EVENTS") it was renamed to CONFIG_UPROBE_EVENTS. Additionally it's default y, and we have the prerequisites enabled, so we don't need it in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:10 +10:00
Michael Ellerman	93a4a4ffeb	powerpc/configs: Drop unnecessary CONFIG_NUMA_BALANCING_DEFAULT_ENABLED In commit `9654f95a08` ("powerpc: Enable NUMA balancing in pseries[_le]_defconfig") we added CONFIG_NUMA_BALANCING_DEFAULT_ENABLED to our defconfigs. But it's already enabled by default, so drop it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:09 +10:00
Michael Ellerman	76869cb569	powerpc/configs: Drop no longer needed CONFIG_DEVPTS_MULTIPLE_INSTANCES Since commit `eedf265aa0` ("devpts: Make each mount of devpts an independent filesystem.") we no longer need to set CONFIG_DEVPTS_MULTIPLE_INSTANCES in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:09 +10:00
Michael Ellerman	cd67a20f77	powerpc/configs: Drop no longer needed CONFIG_CRYPTO_GCM Since commit `00b9cfa3ff` ("mac80111: Add GCMP and GCMP-256 ciphers") we no longer need to set CONFIG_CRYPTO_GCM in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:08 +10:00
Michael Ellerman	f0297310e5	powerpc/configs: Drop no longer needed CONFIG_CRYPTO_NULL in g5 / c2k Since commit `3491244c62` ("crypto: echainiv - Set Kconfig default to m") we no longer need to set CONFIG_CRYPTO_NULL in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:08 +10:00
Michael Ellerman	60ac25ba05	powerpc/configs: Drop no longer needed CONFIG_CRYPTO_NULL Since commit `00b9cfa3ff` ("mac80111: Add GCMP and GCMP-256 ciphers") we no longer need to set CONFIG_CRYPTO_NULL in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:07 +10:00
Michael Ellerman	7cf5775cd0	powerpc/configs: Drop no longer needed CONFIG_CRYPTO_SHA256 Since commit `826775bbf3` ("crypto: drbg - Add select on sha256") we no longer need to set CONFIG_CRYPTO_SHA256 in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:07 +10:00
Michael Ellerman	813f41352a	powerpc/configs: Drop no longer needed CONFIG_CRYPTO_ECB Since commit `12cb3a1c41` ("crypto: xts - Add ECB dependency") we no longer need to set CONFIG_CRYPTO_ECB in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:06 +10:00
Michael Ellerman	8f56939009	powerpc/configs: Drop no longer needed CONFIG_CRYPTO_HMAC Since commit `401e4238f3` ("crypto: rng - Make DRBG the default RNG") we no longer need to set CONFIG_CRYPTO_HMAC in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:05 +10:00
Michael Ellerman	8f3e6bdf8c	powerpc/configs: Drop no longer needed CONFIG_CRYPTO_DEV_VMX_ENCRYPT Since commit `ccf5c442a1` ("crypto: vmx - Convert to CPU feature based module autoloading") we no longer need to set CONFIG_CRYPTO_DEV_VMX_ENCRYPT in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:05 +10:00
Michael Ellerman	b8465a6ae5	powerpc/configs: Update for CONFIG_NF_CT_PROTO_(SCTP\|UDPLITE)=y In commit `a85406afeb` ("netfilter: conntrack: built-in support for SCTP"), NF_CT_PROTO_SCTP switched from tristate to bool and became default y. Similarly in commit `9b91c96c5d` ("netfilter: conntrack: built-in support for UDPlite"), NF_CT_PROTO_UDPLITE switched from tristate to bool and became default y. We had a few configs which set them to =m, which is no longer valid. We don't need to change them to =y because both symbols are default y and are enabled automatically based on the other symbols in the affected defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:04 +10:00
Michael Ellerman	360426fb8c	powerpc/configs: Update for CONFIG_FIXED_PHY being selected by CONFIG_OF_MDIO In commit `a5e4bd9913` ("of_mdio: select fixed phy support unconditionally"), CONFIG_OF_MDIO began selecting CONFIG_FIXED_PHY. That means we no longer need to set it some of our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:04 +10:00
Michael Ellerman	e5d2f4b275	powerpc/configs: Update for CONFIG_DEBUG_FS being selected via CONFIG_RCU_TRACE In commit `961518259b` ("rcu: Enable RCU tracepoints by default to aid in debugging"), CONFIG_RCU_TRACE was made default y (if CONFIG_TREE_RCU=y, which it is for some of our configs). That in turn causes CONFIG_TREE_RCU_TRACE to be enabled, which selects CONFIG_DEBUG_FS. The end result is that CONFIG_DEBUG_FS is forced on, meaning we don't have to enable it in some of our configs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:03 +10:00
Michael Ellerman	05cb48b00a	powerpc/configs: Drop no longer needed CONFIG_DEVKMEM Since commit `e334cd69fa` ("Move CONFIG_DEVKMEM default to n") we no longer need to set CONFIG_DEVKMEM in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:03 +10:00
Michael Ellerman	5c26bdfae9	powerpc/configs: Drop no longer needed CONFIG_FHANDLE Since commit `f76be61755` ("Make CONFIG_FHANDLE default y") we no longer need to set CONFIG_FHANDLE in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:02 +10:00
Michael Ellerman	f9065c83cc	powerpc/configs: Explicitly drop CONFIG_INPUT_MOUSEDEV In commit `73d8ef7600` ("Input: mousedev - stop offering PS/2 to userspace by default") (Jan 2017), CONFIG_INPUT_MOUSEDEV was switched from default y to default n, with the explanation: Evdev interface has been available for many years and by now everyone is switched to using it, so let's stop offering /dev/input/mouseN and /dev/psaux by default. We had a number of configs which had it enabled, but going by the above explanation probably don't need it enabled anymore. So drop the last remnants of it from our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:02 +10:00
Michael Ellerman	5ee5d80833	powerpc/configs: Drop unneeded CONFIG_CRYPTO_ANSI_CPRNG Since commit `401e4238f3` ("crypto: rng - Make DRBG the default RNG") we no longer need to set CONFIG_CRYPTO_ANSI_CPRNG in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:01 +10:00
Michael Ellerman	980b4503b9	powerpc/configs: Update for symbol movement only Update defconfigs for symbols that have moved around, without their value changing. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:00 +10:00
Michael Ellerman	a6036100ed	powerpc/oops: Line up NIP & MSR with other rows This is purely cosmetic, but does look nicer IMHO: Before: task: c000000001453400 task.stack: c000000001c6c000 NIP: c000000000a0fbfc LR: c000000000a0fbf4 CTR: c000000000ba6220 REGS: c0000001fffef820 TRAP: 0300 Not tainted (4.13.0-rc6-gcc-6.3.1-00234-g423af27f7d81) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 88088242 XER: 00000000 CFAR: c0000000000b3488 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 0 After: task: c000000001453400 task.stack: c000000001c6c000 NIP: c000000000a0fbfc LR: c000000000a0fbf4 CTR: c000000000ba6220 REGS: c0000001fffef820 TRAP: 0300 Not tainted (4.13.0-rc6-gcc-6.3.1-00234-g423af27f7d81-dirty) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 88088242 XER: 00000000 CFAR: c0000000000b34a4 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 0 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:10:00 +10:00
Michael Ellerman	f6fc73fb96	powerpc/oops: Print CR/XER on same line as MSR Somehow we missed this when the pr_cont() changes went in. Fix CR/XER to go on the same line as MSR, as they have historically, eg: MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI> CR: 4804408a XER: 20000000 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:09:59 +10:00
Michael Ellerman	1c56cd8ee9	powerpc/oops: Use IS_ENABLED() for oops markers Just because it looks less gross. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:09:59 +10:00
Michael Ellerman	2e82ca3c39	powerpc/oops: Print the kernel's endian in the oops Although the MSR tells you what endian you're in it's possible that isn't the same endian the kernel was built for, and if that happens you're usually having a very bad day. So print a marker to make it 100% clear which endian the kernel was built for. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:09:58 +10:00
Michael Ellerman	72c0d9ee4a	powerpc/oops: Fix the oops markers to use pr_cont() When we oops we print a few markers for significant config options such as PREEMPT, SMP etc. Currently these appear on separate lines because we're not using pr_cont() properly. Fix it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:09:58 +10:00
LABBE Corentin	6538ac3084	powerpc/powernv: Fix build error in opal-imc.c when NUMA=n When building a random powerpc kernel I hit this build error: arch/powerpc/platforms/powernv/opal-imc.c:130:13: error : assignment discards « const » qualifier from pointer target type [-Werror=discarded-qualifiers] l_cpumask = cpumask_of_node(nid); ^ This happens because when CONFIG_NUMA=n cpumask_of_node() returns a const pointer. This patch simply adds const to l_cpumask to fix this issue. Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> [mpe: Flesh out change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-28 22:09:57 +10:00
Linus Torvalds	67a3b5cb33	Bugfixes for x86, PPC and s390. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJZoEmaAAoJEL/70l94x66DmnMH/17uzxBe3UksLBKWC5grWhRq GVlHVI+XH7jPub1hfqKkj09nnJ0OJAiO87vX9A/CCobtxLDk0UB02U2qv+jbFbmN mSkAovY8Rn4YR73SqU+XTYajnnwmYsEiPuHVUDbMaKY3yBLW/BYtSqCuAHSm3NrS UQO8DvQAY7+W7/gA9QY7aaK/sc8N6oAwE4DHsxTYKR70Eax4SjjMLWYQY7oSutTx U8XpguF5CwP8iYbsF++WkNYxe85piheWIpUIKg+3pYxKgpDNBST8ROmxmuvSdAh6 1hkXy2qxpw+YYM6JkHRb7kBpuUAGqzYNrEF/c2Wfor+gufsyoq8LQSq5pB+d/5I= =M40T -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull Paolo Bonzini: "Bugfixes for x86, PPC and s390" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce() KVM, pkeys: do not use PKRU value in vcpu->arch.guest_fpu.state KVM: x86: simplify handling of PKRU KVM: x86: block guest protection keys unless the host has them enabled KVM: PPC: Book3S HV: Add missing barriers to XIVE code and document them KVM: PPC: Book3S HV: Workaround POWER9 DD1.0 bug causing IPB bit loss KVM: PPC: Book3S HV: Use msgsync with hypervisor doorbells on POWER9 KVM: s390: sthyi: fix specification exception detection KVM: s390: sthyi: fix sthyi inline assembly	2017-08-25 17:46:23 -07:00
Linus Torvalds	42e6d5e5ee	powerpc fixes for 4.13 #8 Just one fix, to add a barrier in the switch_mm() code to make sure the mm cpumask update is ordered vs the MMU starting to load translations. As far as we know no one's actually hit the bug, but that's just luck. Thanks to: Benjamin Herrenschmidt, Nicholas Piggin. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJZoAZDAAoJEFHr6jzI4aWAi3AQAJq4boEBqdmL042oNK4PWW0M uGfehNmtzCw9Hp8bPfzOf8NypJ51Kw7eDQELaeSaazKW+gffUCBeEsKGS7kmHvc+ x1tHxkXxI7PXuNIRojJg9y7rlKXdRym5SecvPSo1cm/c46RRWOlNGZaIwiHyrXSh eBjyP5EHu1HXpRxkcUh+//PQp2b+7SmgUYzSf0hA9UCtzSZSJr19DuY8uhetI9Ws AfjkO1uvb2KETqBVegGBpAruZzQtxqdtffd2HToSaCHUnAKma2iqUZqkqBNjL6OQ gSXWpXVInng/7ktrrfEgSiwlHns7pgHkxYHS8thDZqQpIt3GNsUg2UwpHGf6oL7V L+GtRp36LM91Ueq6KdlU7bJkmoiJ798Hnp3FOjpkqo+j/MGuCQDDDK4Ge1popehJ a17K7lE/FKGqNaFINc1Q6hnXg4MPyawAOLDlV839Ap5+ISPS6WcHaa1AgKjdQNkH fIkZZsYT531FIf853AjUGFw8frSlVfrHmIx9/HJOhEa1KHQhBqGRV1sWYEjuN6IB av+tQDlleG5aT641qhHlA/hN5DGrGZXLp8e6cFRufF+CSsRayL27u0Qw9pP9VZ3S bgfdnmZZyP23+bzaq/m/bjhRiOf0snSQPxIKe56KmNCJ8buTrGWDw4IuiPKB7Y6V 06vBFn7ZUP5aeHIZkS62 =IClj -----END PGP SIGNATURE----- Merge tag 'powerpc-4.13-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: "Just one fix, to add a barrier in the switch_mm() code to make sure the mm cpumask update is ordered vs the MMU starting to load translations. As far as we know no one's actually hit the bug, but that's just luck. Thanks to Benjamin Herrenschmidt, Nicholas Piggin" * tag 'powerpc-4.13-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Ensure cpumask update is ordered	2017-08-25 17:32:35 -07:00
Jiri Slaby	30d6e0a419	futex: Remove duplicated code and fix undefined behaviour There is code duplicated over all architecture's headers for futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr, and comparison of the result. Remove this duplication and leave up to the arches only the needed assembly which is now in arch_futex_atomic_op_inuser. This effectively distributes the Will Deacon's arm64 fix for undefined behaviour reported by UBSAN to all architectures. The fix was done in commit `5f16a046f8` (arm64: futex: Fix undefined behaviour with FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump. And as suggested by Thomas, check for negative oparg too, because it was also reported to cause undefined behaviour report. Note that s390 removed access_ok check in `d12a29703` ("s390/uaccess: remove pointless access_ok() checks") as access_ok there returns true. We introduce it back to the helper for the sake of simplicity (it gets optimized away anyway). Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390] Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile] Reviewed-by: Darren Hart (VMware) <dvhart@infradead.org> Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64] Cc: linux-mips@linux-mips.org Cc: Rich Felker <dalias@libc.org> Cc: linux-ia64@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: peterz@infradead.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: sparclinux@vger.kernel.org Cc: Jonas Bonn <jonas@southpole.se> Cc: linux-s390@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: linux-hexagon@vger.kernel.org Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: linux-snps-arc@lists.infradead.org Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-xtensa@linux-xtensa.org Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: openrisc@lists.librecores.org Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Stafford Horne <shorne@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Richard Henderson <rth@twiddle.net> Cc: Chris Zankel <chris@zankel.net> Cc: Michal Simek <monstr@monstr.eu> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-parisc@vger.kernel.org Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: linux-alpha@vger.kernel.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: "David S. Miller" <davem@davemloft.net> Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz	2017-08-25 22:49:59 +02:00
Paul Mackerras	47c5310a8d	KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce() Nixiaoming pointed out that there is a memory leak in kvm_vm_ioctl_create_spapr_tce() if the call to anon_inode_getfd() fails; the memory allocated for the kvmppc_spapr_tce_table struct is not freed, and nor are the pages allocated for the iommu tables. In addition, we have already incremented the process's count of locked memory pages, and this doesn't get restored on error. David Hildenbrand pointed out that there is a race in that the function checks early on that there is not already an entry in the stt->iommu_tables list with the same LIOBN, but an entry with the same LIOBN could get added between then and when the new entry is added to the list. This fixes all three problems. To simplify things, we now call anon_inode_getfd() before placing the new entry in the list. The check for an existing entry is done while holding the kvm->lock mutex, immediately before adding the new entry to the list. Finally, on failure we now call kvmppc_account_memlimit to decrement the process's count of locked memory pages. Reported-by: Nixiaoming <nixiaoming@huawei.com> Reported-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-08-25 11:08:57 +02:00
Ingo Molnar	10c9850cb2	Merge branch 'linus' into locking/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-08-25 11:04:51 +02:00
Rashmica Gupta	9d5171a8f2	powerpc/powernv: Enable removal of memory for in memory tracing The hardware trace macro feature requires access to a chunk of real memory. This patch provides a debugfs interface to do this. By writing an integer containing the size of memory to be unplugged into /sys/kernel/debug/powerpc/memtrace/enable, the code will attempt to remove that much memory from the end of each NUMA node. This patch also adds additional debugsfs files for each node that allows the tracer to interact with the removed memory, as well as a trace file that allows userspace to read the generated trace. Note that this patch does not invoke the hardware trace macro, it only allows memory to be removed during runtime for the trace macro to utilise. Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com> [mpe: Minor formatting etc fixups] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-24 22:14:38 +10:00
Benjamin Herrenschmidt	bb9b52bd51	KVM: PPC: Book3S HV: Add missing barriers to XIVE code and document them This adds missing memory barriers to order updates/tests of the virtual CPPR and MFRR, thus fixing a lost IPI problem. While at it also document all barriers in this file. This fixes a bug causing guest IPIs to occasionally get lost. The symptom then is hangs or stalls in the guest. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2017-08-24 20:02:01 +10:00
Benjamin Herrenschmidt	2c4fb78f78	KVM: PPC: Book3S HV: Workaround POWER9 DD1.0 bug causing IPB bit loss This adds a workaround for a bug in POWER9 DD1 chips where changing the CPPR (Current Processor Priority Register) can cause bits in the IPB (Interrupt Pending Buffer) to get lost. Thankfully it only happens when manually manipulating CPPR which is quite rare. When it does happen it can cause interrupts to be delayed or lost. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2017-08-24 20:01:39 +10:00
Nicholas Piggin	bd0fdb191c	KVM: PPC: Book3S HV: Use msgsync with hypervisor doorbells on POWER9 When msgsnd is used for IPIs to other cores, msgsync must be executed by the target to order stores performed on the source before its msgsnd (provided the source executes the appropriate sync). Fixes: `1704a81cce` ("KVM: PPC: Book3S HV: Use msgsnd for IPIs to other cores on POWER9") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2017-08-24 20:01:39 +10:00
Naveen N. Rao	2dea1d9c38	powerpc/uprobes: Implement arch_uretprobe_is_alive() This helper is used to detect if a uprobe'd function has returned through a setjmp/longjmp, rather than branching to the LR that was updated previously by us. This fixes a SIGSEGV that gets generated when programs use setjmp/longjmp with uretprobes. We use the arm64 model (arch/arm64/kernel/probes/uprobes.c: arch_uretprobe_is_alive()) for detecting when stack frames have been removed from under us. Reference: https://marc.info/?l=linux-kernel&m=143748610330073 commit `7b868e4802` ("uprobes/x86: Reimplement arch_uretprobe_is_alive()") commit `db087ef69a` ("uprobes/x86: Make arch_uretprobe_is_alive(RP_CHECK_CALL) more clever") Tested with the test program from: https://sourceware.org/git/gitweb.cgi?p=systemtap.git;a=blob;f=testsuite/systemtap.base/bz5274.c;hb=HEAD And this script: $ cat test.sh #!/bin/bash perf probe -x ./bz5274 -a bz5274_main_return=main%return perf probe -x ./bz5274 -a bz5274_funca_return=funca%return perf probe -x ./bz5274 -a bz5274_funcb_return=funcb%return perf probe -x ./bz5274 -a bz5274_funcc_return=funcc%return perf probe -x ./bz5274 -a bz5274_funcd_return=funcd%return perf record -e 'probe_bz5274:*' -aR ./bz5274 Reported-by: Gustavo Luiz Duarte <gduarte@redhat.com> Reported-by: zsun@redhat.com Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-24 16:19:21 +10:00
Naveen N. Rao	ec4189c4e8	powerpc/kprobes: Don't save/restore DAR/DSISR to/from pt_regs for optprobes We don't save/restore these across a trap, or with KPROBES_ON_FTRACE. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-24 16:19:01 +10:00
Cédric Le Goater	a9dadc1c51	powerpc/xive: Fix the size of the cpumask used in xive_find_target_in_mask() When called from xive_irq_startup(), the size of the cpumask can be larger than nr_cpu_ids. This can result in a WARN_ON such as: WARNING: CPU: 10 PID: 1 at ../arch/powerpc/sysdev/xive/common.c:476 xive_find_target_in_mask+0x110/0x2f0 ... NIP [c00000000008a310] xive_find_target_in_mask+0x110/0x2f0 LR [c00000000008a2e4] xive_find_target_in_mask+0xe4/0x2f0 Call Trace: xive_find_target_in_mask+0x74/0x2f0 (unreliable) xive_pick_irq_target.isra.1+0x200/0x230 xive_irq_startup+0x60/0x180 irq_startup+0x70/0xd0 __setup_irq+0x7bc/0x880 request_threaded_irq+0x14c/0x2c0 request_event_sources_irqs+0x100/0x180 __machine_initcall_pseries_init_ras_IRQ+0x104/0x134 do_one_initcall+0x68/0x1d0 kernel_init_freeable+0x290/0x374 kernel_init+0x24/0x170 ret_from_kernel_thread+0x5c/0x74 This happens because we're being called with our affinity mask set to irq_default_affinity. That in turn was populated using cpumask_setall(), which sets NR_CPUs worth of bits, not nr_cpu_ids worth. Finally cpumask_weight() will return > nr_cpu_ids when passed a mask which has > nr_cpu_ids bits set. Fix it by limiting the value returned by cpumask_weight(). Signed-off-by: Cédric Le Goater <clg@kaod.org> [mpe: Add change log details on actual cause] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-24 15:20:18 +10:00
Christoph Hellwig	74d46992e0	block: replace bi_bdev with a gendisk pointer and partitions index This way we don't need a block_device structure to submit I/O. The block_device has different life time rules from the gendisk and request_queue and is usually only available when the block device node is open. Other callers need to explicitly create one (e.g. the lightnvm passthrough code, or the new nvme multipathing code). For the actual I/O path all that we need is the gendisk, which exists once per block device. But given that the block layer also does partition remapping we additionally need a partition index, which is used for said remapping in generic_make_request. Note that all the block drivers generally want request_queue or sometimes the gendisk, so this removes a layer of indirection all over the stack. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-08-23 12:49:55 -06:00
Nicholas Piggin	d1d0d5ffb3	powerpc/64: Optimise set/clear of CTRL[RUN] (runlatch) On modern CPUs the CTRL register is read-only except bit 63 which is the run latch control. This means it can be updated with a mtspr rather than mfspr/mtspr. To accomodate older CPUs (Cell at least), where there are other bits in the register, we still do a read/modify/write on pre 2.06 CPUs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Update change log to mention 2.06 workaround] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 23:48:38 +10:00
Nicholas Piggin	7b76a1f5ed	powerpc/64s: Remove spurious IRQ reason in IRQ replay HVI interrupts have always used 0x500, so remove the dead branch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 23:17:29 +10:00
Nicholas Piggin	ccd5eb837c	powerpc/64: Remove redundant instruction in interrupt replay Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 23:17:16 +10:00
Nicholas Piggin	e6c1203d5c	powerpc/64s: Use the HV handler for external IRQ replay in HV mode on POWER9 POWER9 host external interrupts use the h_virt_irq_common handler, so use that to replay them rather than using the hardware_interrupt_common handler. Both call do_IRQ, but using the correct handler reduces i-cache footprint. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 23:15:23 +10:00
Nicholas Piggin	d6f73fc69b	powerpc/64s: Merge HV and non-HV paths for doorbell IRQ replay This results in smaller code, and fewer branches. This relies on the fact that both the 0xe80 and 0xa00 handlers call the same upper level code, namely doorbell_exception(). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Mention we rely on the implementation of the 0xe80/0xa00 handlers] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 23:13:27 +10:00
Nicholas Piggin	6f881eaeb5	powerpc/64: Cleanup __check_irq_replay() Move the clearing of irq_happened bits into the condition where they were found to be set. This reduces instruction count slightly, and reduces stores into irq_happened. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 23:13:15 +10:00
Nicholas Piggin	c05f0be888	powerpc/64s: masked_interrupt() returns to kernel so avoid restoring r13 Places in the kernel where r13 is not the PACA pointer must have maskable interrupts disabled, so r13 does not have to be restored when returning from a soft-masked interrupt. We should never have interrupts soft disabled when we're in user space. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 23:11:28 +10:00
Nicholas Piggin	6e9a2f6eba	powerpc/64s: Optimise clearing of MSR_EE in masked_[H]interrupt() MSR_EE is always enabled in SRR1 for masked interrupts, so we can use xor to clear it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 23:06:48 +10:00
Nicholas Piggin	e0c827c09c	powerpc/64s: Avoid a branch in masked_[H]interrupt() Interrupts which do not require EE to be cleared can all be tested with a single bitwise test. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 23:02:48 +10:00
Benjamin Herrenschmidt	3a2df3798d	powerpc/mm: Make switch_mm_irqs_off() out of line It's too big to be inline, there is no reason to keep it that way. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Rework to incorporate the comment changes via fixes branch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 22:48:51 +10:00
Benjamin Herrenschmidt	a619e59c07	powerpc/mm: Optimize detection of thread local mm's Instead of comparing the whole CPU mask every time, let's keep a counter of how many bits are set in the mask. Thus testing for a local mm only requires testing if that counter is 1 and the current CPU bit is set in the mask. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 22:28:38 +10:00
Benjamin Herrenschmidt	b426e4bd77	powerpc/mm: Use mm_is_thread_local() instread of open-coding We open-code testing for the mm being local to the current CPU in a few places. Use our existing helper instead. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 22:27:45 +10:00
Benjamin Herrenschmidt	058ccc3465	powerpc/mm: Avoid double irq save/restore in activate_mm It calls switch_mm() which already does the irq save/restore these days. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 22:27:44 +10:00
Benjamin Herrenschmidt	43ed84a891	powerpc/mm: Move pgdir setting into a helper Makes switch_mm_irqs_off() a bit more readable Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 22:27:42 +10:00
Michael Ellerman	3e23a12bca	powerpc/64s: Fix replay interrupt return label name In __replay_interrupt() we take the address of a local label so we can return to it later. However the assembler turns the local label into a symbol with a name like ".L1^B42" - where "^B" is literally "\002". This does not make for pleasant stack traces. Fix it by giving the label a sensible name. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 22:27:05 +10:00
Rob Herring	215ee763f8	powerpc: pseries: remove dlpar_attach_node dependency on full path In preparation to stop storing the full node path in full_name, remove the dependency on full_name from dlpar_attach_node(). Callers of dlpar_attach_node() already have the parent device_node, so just pass the parent node into dlpar_attach_node instead of the path. This avoids doing a lookup of the parent node by the path. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 22:27:04 +10:00
Rob Herring	b7c670d673	powerpc: Convert to using %pOF instead of full_name Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Anatolij Gustschin <agust@denx.de> Cc: Scott Wood <oss@buserror.net> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linuxppc-dev@lists.ozlabs.org Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 22:27:04 +10:00
Michael Ellerman	bcf21e3a97	powerpc/vio: Use device_type to detect family Currently in the vio.c code we use a comparision against the parent device node's full path to decide if the device is a PFO or VIO family device. Both the ibm,platform-facilities and vdevice nodes are defined by PAPR, and must have a matching device_type. So instead of using the path we can instead compare the device_type. I've checked Qemu and kvmtool both do this correctly, and all the PowerVM systems I have access to do also. So it seems to be safe. This removes the dependency on full_name, which is being removed upstream. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-23 22:27:03 +10:00
Michael Ellerman	15c659ff9d	Merge branch 'fixes' into next There's a non-trivial dependency between some commits we want to put in next and the KVM prefetch work around that went into fixes. So merge fixes into next.	2017-08-23 22:20:10 +10:00
David S. Miller	e2a7c34fb2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-08-21 17:06:42 -07:00
Ingo Molnar	94edf6f3c2	Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Removal of spin_unlock_wait() - SRCU updates - Torture-test updates - Documentation updates - Miscellaneous fixes - CPU-hotplug fixes - Miscellaneous non-RCU fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-08-21 09:45:19 +02:00
Nicholas Piggin	92e5aae457	kernel/watchdog: fix Kconfig constraints for perf hardlockup watchdog Commit `05a4a95279` ("kernel/watchdog: split up config options") lost the perf-based hardlockup detector's dependency on PERF_EVENTS, which can result in broken builds with some powerpc configurations. Restore the dependency. Add it in for x86 too, despite x86 always selecting PERF_EVENTS it seems reasonable to make the dependency explicit. Link: http://lkml.kernel.org/r/20170810114452.6673-1-npiggin@gmail.com Fixes: `05a4a95279` ("kernel/watchdog: split up config options") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-18 15:32:01 -07:00
Linus Torvalds	039a8e3847	powerpc fixes for 4.13 #7 A bug in the VSX register saving that could cause userspace FP/VMX register corruption. Never seen to happen (that we know of), was found by code inspection, but still tagged for stable given the consequences. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJZlt/PAAoJEFHr6jzI4aWAHW8P/iKMN08HDpvG49xRlhzxlVxA blHw5HPBcnBO4wiVO5Imi5uryj59erPmD49fbvnoS8iYeJjLw6VitRV4UCOm0Ssx 1oprF4xkaahtd4TJmbdf8WIz5H5NlYhpw5CXV1PRdP0X9r64MLIgZihYCdCWqGzN GLVTHtbIuNewJTN8IbLkmEdhAThtriBwFVUxrNOQ3JLqUPWAWO5qKeKgaPeeBLlw LJ4Q7TbdHgXZa/QEW2RmB+IfFcew4WVf/7+4ZgHjMhtSsIGrgqEE2680SGNY/rNY kXqR/g0l/GbQUjWFDDlveeJY5W0RenD/L4VodjS0ODMop9ZB+0LcRV0qUR7vDvmU 7xi571hBPpskVsPH1sUahwJtaWAYkf5JqROLVryAYDBNcqnlohk4HdWltigT7asi GyHvJ32MWzmCDc0obEskQtqNlbIyRDy2qoTtl8O4YnWWf5nASKGyiRJlvLyLlADK 2g64jP7vppfRM5cskZ3JkQ1NH3ORzVgwfGUNWaAZW4jCE8XeqXQnPtGsWvqqQJOd 1UmxNHjtjc0TlMmxNp0EjGjpRXj8eerzDXv0ecRH/xqo376Ies99uGo/vs+htjrp HQ+wjOZxRk/jB0KBOoNbTkM6FY04I/ZIvIwJWboyEHzqctwj1pA63I+bzRbzBxww TLP2TDPIKiwTVbHcb6XP =jIDJ -----END PGP SIGNATURE----- Merge tag 'powerpc-4.13-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "A bug in the VSX register saving that could cause userspace FP/VMX register corruption. Never seen to happen (that we know of), was found by code inspection, but still tagged for stable given the consequences" * tag 'powerpc-4.13-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc: Fix VSX enabling/flushing to also test MSR_FP and MSR_VEC	2017-08-18 11:11:03 -07:00
Benjamin Herrenschmidt	1a92a80ad3	powerpc/mm: Ensure cpumask update is ordered There is no guarantee that the various isync's involved with the context switch will order the update of the CPU mask with the first TLB entry for the new context being loaded by the HW. Be safe here and add a memory barrier to order any subsequent load/store which may bring entries into the TLB. The corresponding barrier on the other side already exists as pte updates use pte_xchg() which uses __cmpxchg_u64 which has a sync after the atomic operation. Cc: stable@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Add comments in the code] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-18 13:07:16 +10:00
Paul E. McKenney	952111d7db	arch: Remove spin_unlock_wait() arch-specific definitions There is no agreed-upon definition of spin_unlock_wait()'s semantics, and it appears that all callers could do just as well with a lock/unlock pair. This commit therefore removes the underlying arch-specific arch_spin_unlock_wait() for all architectures providing them. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: <linux-arch@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Boqun Feng <boqun.feng@gmail.com>	2017-08-17 08:08:59 -07:00
Aneesh Kumar K.V	0f4bc0932e	powerpc/mm/cxl: Add the fault handling cpu to mm cpumask We use mm cpumask for serializing against lockless page table walk. Anybody who is doing a lockless page table walk is expected to disable irq and only cpus in mm cpumask is expected do the lockless walk. This ensure that a THP split can send IPI to only cpus in the mm cpumask, to make sure there are no parallel lockless page table walk. Add the CAPI fault handling cpu to the mm cpumask so that we can do the lockless page table walk while inserting hash page table entries. Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-17 23:31:52 +10:00
Aneesh Kumar K.V	fa4531f753	powerpc/mm: Don't send IPI to all cpus on THP updates Now that we made sure that lockless walk of linux page table is mostly limitted to current task(current->mm->pgdir) we can update the THP update sequence to only send IPI to CPUs on which this task has run. This helps in reducing the IPI overload on systems with large number of CPUs. WRT kvm even though kvm is walking page table with vpc->arch.pgdir, it is done only on secondary CPUs and in that case we have primary CPU added to task's mm cpumask. Sending an IPI to primary will force the secondary to do a vm exit and hence this mm cpumask usage is safe here. WRT CAPI, we still end up walking linux page table with capi context MM. For now the pte lookup serialization sends an IPI to all CPUs in CPI is in use. We can further improve this by adding the CAPI interrupt handling CPU to task mm cpumask. That will be done in a later patch. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-17 23:31:13 +10:00
Michael Ellerman	8434f0892e	Merge branch 'topic/ppc-kvm' into next Bring in the commit to rename find_linux_pte_or_hugepte() which touches arch and KVM code, and might need to be merged with the kvmppc tree to avoid conflicts.	2017-08-17 23:14:17 +10:00
Aneesh Kumar K.V	94171b19c3	powerpc/mm: Rename find_linux_pte_or_hugepte() Add newer helpers to make the function usage simpler. It is always recommended to use find_current_mm_pte() for walking the page table. If we cannot use find_current_mm_pte(), it should be documented why the said usage of __find_linux_pte() is safe against a parallel THP split. For now we have KVM code using __find_linux_pte(). This is because kvm code ends up calling __find_linux_pte() in real mode with MSR_EE=0 but with PACA soft_enabled = 1. We may want to fix that later and make sure we keep the MSR_EE and PACA soft_enabled in sync. When we do that we can switch kvm to use find_linux_pte(). Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-17 23:13:46 +10:00
Naveen N. Rao	6acdc9a6ba	powerpc/bpf: Use memset32() to pre-fill traps in BPF page(s) Use the newly introduced memset32() to pre-fill BPF page(s) with trap instructions. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-17 23:04:47 +10:00
Naveen N. Rao	694fc88ce2	powerpc/string: Implement optimized memset variants Based on Matthew Wilcox's patches for other architectures. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-17 23:04:35 +10:00
Madhavan Srinivasan	711bd207a2	powerpc/perf: Fix usage of nest_imc_refc nest_imc_refc is a reference count struct, used to track number of active perf sessions using the nest units. Currently the code accesses nest_imc_refc using node_id, which is incorrect, the array is indexed by node number. Meaning in the case of sparse node ids we index off the end of the array. Fix it to use get_nest_pmu_ref() which uses the existing per-cpu variable local_nest_imc_refc. Fixes: `885dcd709b` ('powerpc/perf: Add nest IMC PMU support') Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> [mpe: Tweak change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-17 21:56:36 +10:00
Bhumika Goyal	8bfa42ab84	powerpc: Add const to bin_attribute structures Declare bin_attribute structures as const as they are only passed as an argument to the function sysfs_create_bin_file. This argument is of type const, so declare the structure as const. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-17 21:56:26 +10:00
Benjamin Herrenschmidt	96c79b6bd7	powerpc: Remove more redundant VSX save/tests __giveup_vsx/save_vsx are completely equivalent to testing MSR_FP and MSR_VEC and calling the corresponding giveup/save function so just remove the spurious VSX cases. Also add WARN_ONs checking that we never have VSX enabled without the two other. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-16 22:35:04 +10:00
Benjamin Herrenschmidt	dc801081f2	powerpc: Remove redundant clear of MSR_VSX in __giveup_vsx() __giveup_fpu() already does it and we cannot have MSR_VSX set without having MSR_FP also set. This also adds a warning to check we indeed do Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-16 22:35:03 +10:00
Benjamin Herrenschmidt	746874d31c	powerpc: Remove redundant FP/Altivec giveup code __giveup_vsx() already calls those two functions. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-16 22:34:42 +10:00
Benjamin Herrenschmidt	6a303833b5	powerpc: Fix missing newline before { Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-16 22:34:34 +10:00
Dou Liyang	7baebe54a6	powerpc/topology: Remove the unused parent_node() macro Commit `a7be6e5a7f` ("mm: drop useless local parameters of __register_one_node()") removes the last user of parent_node(). The parent_node() macro in POWERPC platform is unnecessary. Remove it for cleanup. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-16 21:12:41 +10:00
Benjamin Herrenschmidt	5a69aec945	powerpc: Fix VSX enabling/flushing to also test MSR_FP and MSR_VEC VSX uses a combination of the old vector registers, the old FP registers and new "second halves" of the FP registers. Thus when we need to see the VSX state in the thread struct (flush_vsx_to_thread()) or when we'll use the VSX in the kernel (enable_kernel_vsx()) we need to ensure they are all flushed into the thread struct if either of them is individually enabled. Unfortunately we only tested if the whole VSX was enabled, not if they were individually enabled. Fixes: `72cd7b44bc` ("powerpc: Uncomment and make enable_kernel_vsx() routine available") Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-16 19:35:54 +10:00
Aneesh Kumar K.V	4ae279c2c9	powerpc/mm/hugetlb: Allow runtime allocation of 16G. Now that we have GIGANTIC_PAGE enabled on powerpc, use this for 16G hugepages with hash translation mode. Depending on the total system memory we have, we may be able to allocate 16G hugepages runtime. This also remove the hugetlb setup difference between hash/radix translation mode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-16 14:56:13 +10:00
Aneesh Kumar K.V	79cc38ded1	powerpc/mm/hugetlb: Add support for reserving gigantic huge pages via kernel command line With commit `aa888a7497` ("hugetlb: support larger than MAX_ORDER") we added support for allocating gigantic hugepages via kernel command line. Switch ppc64 arch specific code to use that. W.r.t FSL support, we now limit our allocation range using BOOTMEM_ALLOC_ACCESSIBLE. We use the kernel command line to do reservation of hugetlb pages on powernv platforms. On pseries hash mmu mode the supported gigantic huge page size is 16GB and that can only be allocated with hypervisor assist. For pseries the command line option doesn't do the allocation. Instead pseries does gigantic hugepage allocation based on hypervisor hint that is specified via "ibm,expected#pages" property of the memory node. Cc: Scott Wood <oss@buserror.net> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-16 14:56:12 +10:00
David S. Miller	463910e2df	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-08-15 20:23:23 -07:00
Christophe Leroy	ca8afd4046	powerpc/hugetlb: fix page rights verification in gup_hugepte() gup_hugepte() checks if pages are present and readable, and when 'write' is set, also checks if the pages are writable. Initially this was done by checking if _PAGE_PRESENT and _PAGE_READ were set. In addition, _PAGE_WRITE was verified for write accesses. The problem is that we have to handle the three following cases: 1/ The target defines __PAGE_READ and __PAGE_WRITE 2/ The target defines __PAGE_RW 3/ The target defines __PAGE_RO In case 1/, this is obvious In case 2/, __PAGE_READ is defined as 0 and __PAGE_WRITE as __PAGE_RW so it works as well. But in case 3, __PAGE_RW is defined as 0, which means __PAGE_WRITE is 0 and then the test returns true (page writable) in all cases. A first correction was attempted in commit `6b8cb66a6a` ("powerpc: Fix usage of _PAGE_RO in hugepage"), but that fix is wrong: instead of checking that the page is writable when write is requested, it checks that the page is NOT writable when write is NOT requested. This patch adds a new pte_read() helper to check whether a page is readable or not. This avoids handling all possible cases in gup_hugepte(). Then gup_hugepte() is modified to use pte_present(), pte_read() and pte_write() instead of the raw flags. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 22:55:58 +10:00
Christophe Leroy	4cfac2f9c7	powerpc/mm: Simplify __set_fixmap() __set_fixmap() uses __fix_to_virt() then does the boundary checks by it self. Instead, we can use fix_to_virt() which does the verification at build time. For this, we need to use it inline so that GCC can see the real value of idx at buildtime. In the meantime, we remove the 'fixmaps' variable. This variable is set but has never been used from the beginning (commit `2c419bdeca` ("[POWERPC] Port fixmap from x86 and use for kmap_atomic")) Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 22:55:58 +10:00
Christophe Leroy	86b19520e7	powerpc/mm: declare some local functions static get_pteptr() and __mapin_ram_chunk() are only used locally, so define them static Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 22:55:57 +10:00
Christophe Leroy	95902e6c88	powerpc/mm: Implement STRICT_KERNEL_RWX on PPC32 This patch implements STRICT_KERNEL_RWX on PPC32. As for CONFIG_DEBUG_PAGEALLOC, it deactivates BAT and LTLB mappings in order to allow page protection setup at the level of each page. As BAT/LTLB mappings are deactivated, there might be a performance impact. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 22:55:57 +10:00
Christophe Leroy	3184cc4b6f	powerpc/mm: Fix kernel RAM protection after freeing unused memory on PPC32 As seen below, allthough the init sections have been freed, the associated memory area is still marked as executable in the page tables. ~ dmesg [ 5.860093] Freeing unused kernel memory: 592K (c0570000 - c0604000) ~ cat /sys/kernel/debug/kernel_page_tables ---[ Start of kernel VM ]--- 0xc0000000-0xc0497fff 4704K rw X present dirty accessed shared 0xc0498000-0xc056ffff 864K rw present dirty accessed shared 0xc0570000-0xc059ffff 192K rw X present dirty accessed shared 0xc05a0000-0xc7ffffff 125312K rw present dirty accessed shared ---[ vmalloc() Area ]--- This patch fixes that. The implementation is done by reusing the change_page_attr() function implemented for CONFIG_DEBUG_PAGEALLOC Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 22:55:56 +10:00
Christophe Leroy	e611939fc8	powerpc/mm: Ensure change_page_attr() doesn't invalidate pinned TLBs __change_page_attr() uses flush_tlb_page(). flush_tlb_page() uses tlbie instruction, which also invalidates pinned TLBs, which is not what we expect. This patch modifies the implementation to use flush_tlb_kernel_range() instead. This will make use of tlbia which will preserve pinned TLBs. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 22:55:56 +10:00
Christophe Leroy	2ef973a9af	powerpc/8xx: Reduce DTLB miss handler by one insn This reduces the DTLB miss handler hot path (user address path) by one instruction by preserving r10. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 22:55:55 +10:00
Christophe Leroy	346bcc4d33	powerpc/8xx: mark init functions with __init setup_initial_memory_limit() is only called during init. mmu_patch_cmp_limit() is only called from 8xx_mmu.c Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 22:55:54 +10:00
Christophe Leroy	87be3e2d31	powerpc/8xx: Do not allow Pinned TLBs with STRICT_KERNEL_RWX or DEBUG_PAGEALLOC Pinning TLBs bypasses STRICT_KERNEL_RWX or DEBUG_PAGEALLOC protections so it should only be allowed when those are not selected Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 22:55:54 +10:00
Christophe Leroy	a3059b0ca0	powerpc/8xx: Make pinning of ITLBs optional As stated in a comment in head_8xx.S, today we "Always pin the first 8 MB ITLB to prevent ITLB misses while mucking around with SRR0/SRR1 in asm". This issue has just been cleared by the preceding patch, therefore we can make this pinning optional (on by default) and independent of DATA pinning. This patch also makes pinning of IMMR independent of pinning of DATA. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 22:55:53 +10:00
Christophe Leroy	0eb0d2e77d	powerpc/32: Avoid risk of unrecoverable TLBmiss inside entry_32.S By default, the 8xx pins an ITLB on the first 8M of memory in order to avoid any ITLB miss on kernel code. However, with some debug functions like DEBUG_PAGEALLOC and DEBUG_RODATA, pinning TLBs is contradictory. In order to avoid any ITLB miss in a critical section without pinning TLBs, we have to ensure that there is no page boundary crossed between the setup of a new value in SRR0/SRR1 and the associated RFI. The functions modifying srr0/srr1 are all located in setup_32.S. They are spread over almost 4kbytes. The patch forces a 12 bits (4kbytes) alignment for those functions. This garanties that the functions remain in a single 4k page. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 22:55:53 +10:00
Christophe Leroy	c8a127092e	powerpc/8xx: Remove macro that checks kernel address The macro to check if an address is a kernel address or not is not used anymore in DTLBmiss handler. It is used in ITLB miss handler and in DTLB error handler. DTLB error handler is not a hot path, it doesn't need such optimisation. In order to simplify a following patch which will rework ITLB miss handler, we remove the macros and reintroduce them inside the handler. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 22:55:52 +10:00
Christophe Leroy	eef784bbe7	powerpc/8xx: Ensures RAM mapped with LTLB is seen as block mapped on 8xx. On the 8xx, the RAM mapped with LTLBs must be seen as block mapped, just like areas mapped with BATs on standard PPC32. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 22:55:52 +10:00
Julia Lawall	36992606ee	powerpc/chrp: Store the intended structure Normally the values in the resource field and the argument to ARRAY_SIZE in the num_resources are the same. In this case, the value in the reousrce field is the same as the one in the previous platform_device structure, and appears to be a copy-paste error. Replace the value in the resource field with the argument to the local call to ARRAY_SIZE. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 21:13:13 +10:00
Andreas Schwab	6c80d3164e	powerpc/l2cr_6xx: Fix invalid use of register expressions This fixes another invalid use of register expressions. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 21:04:32 +10:00
Michael Ellerman	63b85621d9	powerpc/iommu: Avoid undefined right shift in iommu_range_alloc() In iommu_range_alloc() we generate a mask by right shifting ~0, however if the specified alignment is 0 then we right shift by 64, which is undefined. UBSAN tells us so: UBSAN: Undefined behaviour in ../arch/powerpc/kernel/iommu.c:193:35 shift exponent 64 is too large for 64-bit type 'long unsigned int' We can avoid it by instead generating the mask with: align_mask = (1ull << align_order) - 1; That will also generate an undefined shift if align_order is 64 or greater, but that shouldn't be a problem for a while. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 20:30:58 +10:00
Anju T	7efbae9089	powerpc/perf/imc: Fix nest events on muti socket system In a multi node system with discontiguous node ids, nest event values are not showing up properly. eg. lscpu output: NUMA node0 CPU(s): 0-15 NUMA node8 CPU(s): 16-31 Nest event values on such systems can be counted on CPUs <= 15: $./perf stat -e 'nest_powerbus0_imc/PM_PB_CYC/' -C 0-14 -I 1000 sleep 1000 # time counts unit events 1.000294577 30,17,24,42,880 nest_powerbus0_imc/PM_PB_CYC/ But not on CPUs >= 16: $./perf stat -e 'nest_powerbus0_imc/PM_PB_CYC/' -C 16-28 -I 1000 sleep 1000 # time counts unit events 1.000049902 <not supported> nest_powerbus0_imc/PM_PB_CYC/ This is because, when fetching the reference count, the node id (which may be sparse) is used as the array index, not the node number (which is 0 based and contiguous). Fix it by using the node number as the array index. $./perf stat -e 'nest_powerbus0_imc/PM_PB_CYC/' -C 16-28 -I 1000 sleep 1000 # time counts unit events 1.000241961 26,12,35,28,704 nest_powerbus0_imc/PM_PB_CYC/ Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> [mpe: Change log tweaks for clarity and brevity] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 20:27:28 +10:00
Michael Ellerman	5b6c133e08	powerpc/mm/nohash: Move definition of PGALLOC_GFP to fix build errors In some obscure Book3E configs (randconfig) we can end up missing a definition for PGALLOC_GFP in pgtable_64.c. Fix it by moving the definition to asm/pgalloc.h. Fixes: `de3b87611d` ("powerpc/mm/book(e)(3s)/64: Add page table accounting") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 20:02:56 +10:00
Naveen N. Rao	e12d94f806	powerpc/xmon: Exclude all of xmon from ftrace Exclude core xmon files from ftrace (along with an xmon xive helper outside of xmon/) to minimize impact of ftrace while within xmon. Before: /sys/kernel/debug/tracing# grep -ci xmon available_filter_functions 26 After: /sys/kernel/debug/tracing# grep -ci xmon available_filter_functions 0 Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> [mpe: Use $(subst ..) on KBUILD_CFLAGS rather than CFLAGS_REMOVE_xxx] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-15 15:30:17 +10:00
Breno Leitao	ed49f7fd64	powerpc/xmon: Disable tracing when entering xmon If tracing is enabled and you get into xmon, the tracing buffer continues to be updated, causing possible loss of data and unnecessary tracing information coming from xmon functions. This patch simple disables tracing when entering xmon, and re-enables it if the kernel is resumed (with 'x'). Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-14 21:57:57 +10:00
Breno Leitao	4125d012ff	powerpc/xmon: Dump ftrace buffers for the current CPU only Current xmon 'dt' command dumps the tracing buffer for all the CPUs, which makes it very hard to read due to the fact that most of powerpc machines currently have many CPUs. Other than that, the CPU lines are interleaved in the ftrace log. This new option just dumps the ftrace buffer for the current CPU. Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-14 21:57:56 +10:00
Dan Carpenter	b3376dcc6c	powerpc/perf: Fix double unlock in imc_common_cpuhp_mem_free() This function is not called with the nest_init_lock held, and it also unlocks the nest_init_lock immediately below, so it's fairly clear that this is a typo and should be locking the lock. Fixes: `885dcd709b` ("powerpc/perf: Add nest IMC PMU support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-14 21:57:55 +10:00
Christophe Leroy	ab2675d6ac	powerpc/8xx: Fix two CONFIG_8xx left behind Commit `968159c003` ("powerpc/8xx: Getting rid of remaining use of CONFIG_8xx") removed all but 2 references to 8xx in Kconfigs. This patch removes the two remaining ones. Fixes: `968159c003` ("powerpc/8xx: Getting rid of remaining use of CONFIG_8xx") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-14 21:57:33 +10:00
Linus Torvalds	8001a975f9	powerpc fixes for 4.13 #6 All fixes for code that went in this cycle. - A revert of an optimisation to the syscall exit path, which could lead to an oops on either older machines or machines with > 1T of memory. - Disable some deep idle states if the firmware configuration for them fails. - Re-enable HARD/SOFT lockup detectors in defconfigs after a Kconfig change. - Six fairly small patches fixing bugs in our new watchdog code. Thanks to: Gautham R. Shenoy, Nicholas Piggin. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJZjZBeAAoJEFHr6jzI4aWA0RoP/jreL470i8sjPvm7EuF0+jWb D3gaXx55jPo65CLxDH/qMElKk1VxAgCpWHqVN9e/wd9qKxE+xwQ890Enlx4wQRQe I7AkQQoATsf05medbvihTcNRxMcOvLPCi4GP64W/aegaTGliKyFWAu7w/ZsSzGi+ jaskMrBJAR2X4OBfbUi0UEE/KsvDkUje2qJj1NiFyhFFXLyizy4dQknYXVCFD2LR IJah0ZaZM0scOd7YDfsfybBbBDnwi4VqdHTLA5NFBB+sz55+bO9lVOYWpnUYLouH sdZjpjtCuIS+aYOLHupCDXzW2ntv2AVDe4pckTpXrJbUYypXlg8lVx6kfbJ2QETr NrNHljcmmqgqDSglUrDdEpkuQT6Y6hJMroUvo8h20Y+8WXNaKFA5VhhGLkQuBcE4 v/osXs1aiiBURilWTDS5Jo1f5FTZT8ZkNr1/n5jC7q+Y/qkVFglr32hjuomwWxOz SmnwOOcIdJnykr3/pUCvOAffApLfzWFrZeW1MpV1NVlodrubKU939wYZT0yU86b3 ZSs7pK07eZhfdepanG3nCDahhl2qxMyuEjglSfX8WDvVB2MdoxhoUeLwbyMV9qIZ 5A7Anozod4tOrU+qxtPp3uQ230nbQ9JW9yIQM/tdr88k8swpo3NO4O5kfcmH8FVN CBG5pCRNXIZhtJiYVAiv =Vy1K -----END PGP SIGNATURE----- Merge tag 'powerpc-4.13-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "All fixes for code that went in this cycle. - a revert of an optimisation to the syscall exit path, which could lead to an oops on either older machines or machines with > 1TB of memory - disable some deep idle states if the firmware configuration for them fails - re-enable HARD/SOFT lockup detectors in defconfigs after a Kconfig change - six fairly small patches fixing bugs in our new watchdog code Thanks to: Gautham R Shenoy, Nicholas Piggin" * tag 'powerpc-4.13-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/watchdog: add locking around init/exit functions powerpc/watchdog: Fix marking of stuck CPUs powerpc/watchdog: Fix final-check recovered case powerpc/watchdog: Moderate touch_nmi_watchdog overhead powerpc/watchdog: Improve watchdog lock primitive powerpc: NMI IPI improve lock primitive powerpc/configs: Re-enable HARD/SOFT lockup detectors powerpc/powernv/idle: Disable LOSE_FULL_CONTEXT states when stop-api fails Revert "powerpc/64: Avoid restore_math call if possible in syscall exit"	2017-08-11 08:56:01 -07:00
Michael Ellerman	df4c798318	powerpc/xive: Fix section mismatch warnings Both xive_core_init() and xive_native_init() are called from and call __init routines, so they should also be __init. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:41:02 +10:00
Michael Ellerman	7559952e1f	powerpc/mm: Fix section mismatch warning in early_check_vec5() early_check_vec5() is called from and calls __init routines, so should also be __init. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:40:51 +10:00
Christophe Leroy	0e9645df58	powerpc/8xx: Remove cpu dependent macro instructions from head_8xx head_8xx is dedicated to 8xx so no need to use macros that depends on the CPU Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:32:21 +10:00
Christophe Leroy	4915349b10	powerpc/8xx: Use symbolic names for DSISR bits in DSI Use symbolic names for DSISR bits in DSI Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:32:20 +10:00
Christophe Leroy	3ee87674e0	powerpc/8xx: Use symbolic PVR value For the 8xx, PVR values defined in arch/powerpc/include/asm/reg.h are nowhere used. Remove all defines and add PVR_8xx Use it in arch/powerpc/kernel/cputable.c Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:32:18 +10:00
Christophe Leroy	04f61b660f	powerpc/8xx: remove CONFIG_8xx Two config options exist to define powerpc MPC8xx: * CONFIG_PPC_8xx * CONFIG_8xx arch/powerpc/platforms/Kconfig.cputype has contained the following comment about CONFIG_8xx item for some years: "# this is temp to handle compat with arch=ppc" There is no more users of CONFIG_8xx, so remove it. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:32:17 +10:00
Christophe Leroy	968159c003	powerpc/8xx: Getting rid of remaining use of CONFIG_8xx Two config options exist to define powerpc MPC8xx: * CONFIG_PPC_8xx * CONFIG_8xx arch/powerpc/platforms/Kconfig.cputype has contained the following comment about CONFIG_8xx item for some years: "# this is temp to handle compat with arch=ppc" arch/powerpc is now the only place with remaining use of CONFIG_8xx: get rid of them. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:32:12 +10:00
Christophe Leroy	e959986694	powerpc/kconfig: Simplify PCI_QSPAN selection 4xx, CPM2 and 8xx cannot be selected at the same time, so no need to test 8xx && !4xx && !CPM2. Testing 8xx is enough. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:32:11 +10:00
Christophe Leroy	72e4b2cdf0	powerpc/time: refactor MFTB() to limit number of ifdefs The 8xx cannot access the TBL and TBU registers using mfspr/mtspr It must be accessed using mftb/mftbu Due to this, there is a number of places with #ifdef CONFIG_8xx This patch defines new macros MFTBL(x) and MFTBU(x) on the same model as MFTB(x) and tries to make use of them as much as possible. In arch/powerpc/include/asm/timex.h, we also remove the ifdef for the asm() operands as the compiler doesn't mind unused operands Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:32:09 +10:00
Christophe Leroy	de41ef6e4d	powerpc/8xx: Move mpc8xx_pic.c from sysdev to platform/8xx mpc8xx_pic.c is dedicated to the 8xx, so move it to platform/8xx Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:32:07 +10:00
Christophe Leroy	ea16e83aec	powerpc/cpm1: link to CONFIG_CPM1 instead of CONFIG_8xx To remain consistent with what is done with CPM2, let's link CPM1 related parts to CONFIG_CPM1 instead of CONFIG_8xx When something depends on both CPM1 and CPM2 we associate it with CONFIG_CPM Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:32:05 +10:00
Christophe Leroy	fbbcc3bb13	powerpc/8xx: Remove SoftwareEmulation() Since commit `aa42c69c67` ("[POWERPC] Add support for FP emulation for the e300c2 core"), program_check_exception() can be called for math emulation. In that case, 'reason' is 0. On the 8xx, there is a Software Emulation interrupt which is called for all unimplemented and illegal instructions. This interrupt calls SoftwareEmulation() which does almost the same as program_check_exception() called with reason = 0. The Software Emulation interrupt sets all reason bits to 0, it is therefore possible to call program_check_exception() directly from the interrupt handler. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:32:04 +10:00
Christophe Leroy	f70b1e8d17	powerpc/8xx: Move 8xx machine check handlers into platforms/8xx In the same spirit as what was done for 4xx and 44x, move the 8xx machine check into platforms/8xx Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:32:02 +10:00
Christophe Leroy	0e23e7b32b	powerpc/8xx: Simplify CONFIG_8xx checks in Makefile The entire 8xx directory is omitted if CONFIG_8xx is not enabled, so within the 8xx/Makefile CONFIG_8xx is always y. So convert obj-$(CONFIG_8xx) to the more obvious obj-y. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:32:02 +10:00
Michael Ellerman	d30a5a5262	powerpc/traps: Use SRR1 defines for program check reasons Currently we open code the reason codes for program checks. Instead use the existing SRR1 defines. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:31:32 +10:00
Michael Ellerman	ccd3cd3613	powerpc/mce: Move 64-bit machine check code into mce.c We already have mce.c which is built for 64bit and contains other parts of the machine check code, so move these bits in there too. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:31:31 +10:00
Michael Ellerman	7f3f819e94	powerpc/traps: machine_check_generic() is only used on 32-bit Make it clear that the fallback version of machine_check_generic() is only used on 32-bit configs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:31:31 +10:00
Michael Ellerman	42bff234be	powerpc/traps: Inline get_mc_reason() get_mc_reason() no longer provides (if it ever really did) any meaningful abstraction, so remove it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:31:30 +10:00
Michael Ellerman	0d0935b367	powerpc/4xx: Move machine_check_4xx() into platforms/4xx Now that we have 4xx platform directory we can move the 4xx machine check handler in there. Again we drop get_mc_reason() and replace it with regs->dsisr directly (which is actually SPRN_ESR). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:31:30 +10:00
Michael Ellerman	bfa9a2eb92	powerpc/4xx: Create 4xx pseudo-platform in platforms/4xx We have a lot of code in sysdev for supporting 4xx, ie. either 40x or 44x. Instead it would be cleaner if it was all in platforms/4xx. This is slightly odd in that we don't actually define any machines in the 4xx platform, as is usual for a platform directory. But still it seems like a better result to have all this related code in a directory by itself. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:31:30 +10:00
Michael Ellerman	5b4e28577b	powerpc/44x: Move 44x machine check handlers into platforms/44x We have several 44x machine check handlers defined in traps.c. It would be preferable if they were split out with the platforms that use them. Do that. In the process, drop get_mc_reason() and instead just open code the lookup of reason from regs->dsisr. This avoids a pointless layer of abstraction. We know to use regs->dsisr because 44x enables BOOKE which enables PPC_ADV_DEBUG_REGS, and FSL_BOOKE is not enabled on 44x builds. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:31:24 +10:00
Michael Ellerman	8ab8fc688a	powerpc/44x: Simplify CONFIG_44x checks in Makefile The entire 44x directory is omitted if CONFIG_44x is not enabled, so within the 44x/Makefile CONFIG_44x is always y. So convert obj-$(CONFIG_44x) to the more obvious obj-y. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:29:03 +10:00
Michael Ellerman	13fef7f9da	powerpc/47x: Guard 47x cputable entries with CONFIG_PPC_47x Currently we build the 47x cputable entries even when CONFIG_PPC_47x is disabled. That means a kernel built without CONFIG_PPC_47x will claim to support a 47x CPU and start booting, only to break somewhere later because it doesn't have 47x support compiled in. So guard the 47x cputable entries with CONFIG_PPC_47x. Note that this is inside the #ifdef CONFIG_44x section, because 47x depends on 44x. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 23:28:59 +10:00
Shilpasri G Bhat	bf9571550f	powerpc/powernv: Add support to clear sensor groups data Adds support for clearing different sensor groups. OCC inband sensor groups like CSM, Profiler, Job Scheduler can be cleared using this driver. The min/max of all sensors belonging to these sensor groups will be cleared. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:40:05 +10:00
Shilpasri G Bhat	8e84b2d1f0	powerpc/powernv: Add support to set power-shifting-ratio This patch adds support to set power-shifting-ratio which hints the firmware how to distribute/throttle power between different entities in a system (e.g CPU v/s GPU). This ratio is used by OCC for power capping algorithm. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:40:01 +10:00
Shilpasri G Bhat	cb8b340de2	powerpc/powernv: Add support for powercap framework Adds a generic powercap framework to change the system powercap inband through OPAL-OCC command/response interface. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:39:53 +10:00
Nicholas Piggin	a70a0b9f44	powerpc/pseries: Don't print failure message in energy driver This driver currently reports the H_BEST_ENERGY hypervisor call is unsupported (even when booting in a non-virtualised environment). This is not something the administrator can do much with, and not significant for debugging. Remove it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:29 +10:00
Matt Brown	e27f71e5ff	powerpc/lib/sstep: Add isel instruction emulation This adds emulation for the isel instruction. Tested for correctness against the isel instruction and its extended mnemonics (lt, gt, eq) on ppc64le. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> Reviewed-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:29 +10:00
Matt Brown	2c979c489f	powerpc/lib/sstep: Add prty instruction emulation This adds emulation for the prtyw and prtyd instructions. Tested for logical correctness against the prtyw and prtyd instructions on ppc64le. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> Reviewed-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:28 +10:00
Matt Brown	f312793dda	powerpc/lib/sstep: Add bpermd instruction emulation This adds emulation for the bpermd instruction. Tested for correctness against the bpermd instruction on ppc64le. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> Reviewed-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:27 +10:00
Matt Brown	dcbd19b48d	powerpc/lib/sstep: Add popcnt instruction emulation This adds emulations for the popcntb, popcntw, and popcntd instructions. Tested for correctness against the popcnt{b,w,d} instructions on ppc64le. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> Reviewed-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:27 +10:00
Matt Brown	02c0f62a60	powerpc/lib/sstep: Add cmpb instruction emulation This patch adds emulation of the cmpb instruction, enabling xmon to emulate this instruction. Tested for correctness against the cmpb asm instruction on ppc64le. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> Reviewed-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:26 +10:00
Madhavan Srinivasan	93fc5ca9a0	powerpc/perf: Cleanup of PM_BR_CMPL vs. PM_BRU_CMPL in Power9 event list Fixes: `34922527a2` ("powerpc/perf: Add power9 event list macros for generic and cache events") Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:06 +10:00
Madhavan Srinivasan	91e0bd1e62	powerpc/perf: Add PM_LD_MISS_L1 and PM_BR_2PATH to power9 event list Add couple of more events (PM_LD_MISS_L1 and PM_BR_2PATH) to power9 event list and power9_event_alternatives array (these events can be counted in more than one PMC). Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:05 +10:00
Madhavan Srinivasan	70a7e72099	powerpc/perf: Factor out PPMU_ONLY_COUNT_RUN check code from power8 There are some hardware events on Power systems which only count when the processor is not idle, and there are some fixed-function counters which count such events. For example, the "run cycles" event counts cycles when the processor is not idle. If the user asks to count cycles, we can use "run cycles" if this is a per-task event, since the processor is running when the task is running, by definition. We can't use "run cycles" if the user asks for "cycles" on a system-wide counter. Currently in power8 this check is done using PPMU_ONLY_COUNT_RUN flag in power8_get_alternatives() function. Based on the flag, events are switched if needed. This function should also be enabled in power9, so factor out the code to isa207_get_alternatives(). Fixes: `efe881afdd` ('powerpc/perf: Factor out event_alternative function') Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:05 +10:00
Madhavan Srinivasan	7aa345d842	powerpc/perf: Update default sdar_mode value for power9 Commit `20dd4c624d` ('powerpc/perf: Fix SDAR_MODE value for continous sampling on Power9') set the default sdar_mode value in MMCRA[SDAR_MODE] to be used as 0b01 (Update on TLB miss). And this value is set if sdar_mode from event is zero, or we are in continous sampling mode in power9 dd1. But it is preferred to have the sdar_mode value for power9 as 0b10 (Update on dcache miss) for better sampling updates instead of 0b01 (Update on TLB miss). From Anton: Using a bandwidth test case with a 1MB footprint, I profiled cycles and chose TLB updates of the SDAR: $ perf record -d -e r000400000000001E:u ./bw2001 1M ^ SDAR TLB $ perf report -D \| grep PERF_RECORD_SAMPLE \| sed 's/.addr: //' \| sort -u \| wc -l 4 I get 4 unique addresses. If I ran with dcache misses: $ perf record -d -e r000800000000001E:u ./bw2001 1M ^ SDAR dcache miss $ perf report -D\|grep PERF_RECORD_SAMPLE\| sed 's/.addr: //'\|sort -u \| wc -l 5217 I get 5217 unique addresses. No surprises here, but it does show why TLB misses is the wrong event to default to - we get very little useful information out of it. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:04 +10:00
Ivan Mikhaylov	754f030908	powerpc/44x/fsp2: Enable eMMC arasan for fsp2 platform Add mmc0 changes for enabling arasan emmc and change defconfig appropriately. Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:04 +10:00
Suraj Jitindar Singh	7cd2a8695e	powerpc/mm: Properly invalidate when setting process table base The host process table base is stored in the partition table by calling the function native_register_process_table(). Currently this just sets the entry in memory and is missing a subsequent cache invalidation instruction. Any update to the partition table should be followed by a cache invalidation instruction specifying invalidation of the caching of any partition table entries (RIC = 2, PRS = 0). We already have a function to update the partition table with the required cache invalidation instructions - mmu_partition_table_set_entry(). Update the native_register_process_table() function to call mmu_partition_table_set_entry(), this ensures all appropriate invalidation will be performed. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Use a local for patb0 to clean it up slightly] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:03 +10:00
Benjamin Herrenschmidt	cffb717ceb	powerpc/xive: Ensure active irqd when setting affinity Ensure irqd is active before attempting to set affinity. This should make the set affinity code more robust. For instance, this prevents these messages seen on a 4.12 based kernel when taking cpus offline: [ 123.053037264,3] XIVE[ IC 00 ] ISN 2 lead to invalid IVE ! [ 77.885859] xive: Error -6 reconfiguring irq 17 [ 77.885862] IRQ17: set affinity failed(-6). That particular case has been fixed in 4.13-rc1 by commit `91f26cb4cd` ("genirq/cpuhotplug: Do not migrated shutdown irqs"). Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:03 +10:00
Nicholas Piggin	04019bf8eb	powerpc: Add irq accounting for watchdog interrupts This adds an irq counter for the watchdog soft-NMI. This interrupt only fires when interrupts are soft-disabled, so it will not increment much even when the watchdog is running. However it's useful for debugging and sanity checking. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:02 +10:00
Nicholas Piggin	ca41ad4377	powerpc: Add irq accounting for system reset interrupts Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:01 +10:00
Nicholas Piggin	75eb767e4c	powerpc: Fix powerpc-specific watchdog build configuration The powerpc kernel/watchdog.o should be built when HARDLOCKUP_DETECTOR and HAVE_HARDLOCKUP_DETECTOR_ARCH are both selected. If only the former is selected, then the generic perf watchdog has been selected. To simplify this check, introduce a new Kconfig symbol PPC_WATCHDOG that depends on both. This Kconfig option means the powerpc specific watchdog is enabled. Without this patch, Book3E will attempt to build the powerpc watchdog. Fixes: `2104180a53` ("powerpc/64s: implement arch-specific hardlockup watchdog") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:01 +10:00
Nicholas Piggin	f886f0f6e0	powerpc/64s: Fix mce accounting for powernv On 64-bit Book3s, when we're in HV mode, we have already counted the machine check exception in machine_check_early(). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Use IS_ENABLED() rather than an #ifdef] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:00 +10:00
Nathan Fontenot	1a367063ca	powerpc/pseries: Check memory device state before onlining/offlining When DLPAR adding or removing memory we need to check the device offline status before trying to online/offline the memory. This is needed because calls to device_online() and device_offline() will return non-zero for memory that is already online and offline respectively. This update resolves two scenarios. First, for a kernel built with auto-online memory enabled (CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=y), memory will be onlined as part of calls to add_memory(). After adding the memory the pseries DLPAR code tries to online it and fails since the memory is already online. The DLPAR code then tries to remove the memory which produces the oops message below because the memory is not offline. The second scenario occurs when removing memory that is already offline, i.e. marking memory offline (via sysfs) and then trying to remove that memory. This doesn't work because offlining the already offline memory does not succeed and the DLPAR code then fails the DLPAR remove operation. The fix for both scenarios is to check the device.offline status before making the calls to device_online() or device_offline(). kernel BUG at mm/memory_hotplug.c:1936! ... NIP [c0000000002ca428] .remove_memory+0xb8/0xc0 LR [c0000000002ca3cc] .remove_memory+0x5c/0xc0 Call Trace: .remove_memory+0x5c/0xc0 (unreliable) .dlpar_add_lmb+0x384/0x400 .dlpar_memory+0x5dc/0xca0 .handle_dlpar_errorlog+0x74/0xe0 .pseries_hp_work_fn+0x2c/0x90 .process_one_work+0x17c/0x460 .worker_thread+0x88/0x500 .kthread+0x15c/0x1a0 .ret_from_kernel_thread+0x58/0xc0 Fixes: `943db62c31` ("powerpc/pseries: Revert 'Auto-online hotplugged memory'") Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> [mpe: Use bool, add explicit rc=0 case, change log typos & formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:30:00 +10:00
Andreas Schwab	8a583c0a8d	powerpc: Fix invalid use of register expressions binutils >= 2.26 now warns about misuse of register expressions in assembler operands that are actually literals, for example: arch/powerpc/kernel/entry_64.S:535: Warning: invalid register expression In practice these are almost all uses of r0 that should just be a literal 0. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> [mpe: Mention r0 is almost always the culprit, fold in purgatory change] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-10 22:29:41 +10:00
Peter Zijlstra	a9668cd6ee	locking: Remove smp_mb__before_spinlock() Now that there are no users of smp_mb__before_spinlock() left, remove it entirely. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-08-10 12:29:03 +02:00
Peter Zijlstra	d89e588ca4	locking: Introduce smp_mb__after_spinlock() Since its inception, our understanding of ACQUIRE, esp. as applied to spinlocks, has changed somewhat. Also, I wonder if, with a simple change, we cannot make it provide more. The problem with the comment is that the STORE done by spin_lock isn't itself ordered by the ACQUIRE, and therefore a later LOAD can pass over it and cross with any prior STORE, rendering the default WMB insufficient (pointed out by Alan). Now, this is only really a problem on PowerPC and ARM64, both of which already defined smp_mb__before_spinlock() as a smp_mb(). At the same time, we can get a much stronger construct if we place that same barrier _inside_ the spin_lock(). In that case we upgrade the RCpc spinlock to an RCsc. That would make all schedule() calls fully transitive against one another. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-08-10 12:29:02 +02:00
Daniel Borkmann	20dbf5ccbb	bpf, ppc64: implement jiting of BPF_J{LT, LE, SLT, SLE} This work implements jiting of BPF_J{LT,LE,SLT,SLE} instructions with BPF_X/BPF_K variants for the ppc64 eBPF JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-09 16:53:57 -07:00
Nicholas Piggin	96ea91e7b6	powerpc/watchdog: add locking around init/exit functions When CPUs start and stop the watchdog, they manipulate shared data that is normally protected by the lock. Other CPUs can be running concurrently at this time, so it's a good idea to use locking here to be on the safe side. Remove the barrier which is undocumented and didn't do anything. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-09 23:45:33 +10:00
Nicholas Piggin	87607a30be	powerpc/watchdog: Fix marking of stuck CPUs When the SMP detector finds other CPUs stuck, it iterates over them and marks them as stuck. This pulls them out of the pending mask and allows the detector to continue with remaining good CPUs (if nmi_watchdog=panic is not enabled). The code to dothat was buggy because when setting a CPU stuck, if the pending mask became empty, it resets it to keep the watchdog running. However the iterator will continue to run over the new pending mask and mark remaining good CPUs sas stuck. Fix this by doing it with cpumask bitwise operations. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-09 23:45:32 +10:00
Nicholas Piggin	8e23692175	powerpc/watchdog: Fix final-check recovered case When the watchdog decides to panic, it takes the lock and double checks everything (to avoid races with the CPU being unstuck or panic()ed by something else). The exit label was misplaced and would result in all-CPUs backtrace and watchdog panic even in the case that the condition was found to be resolved. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-09 23:45:31 +10:00
Nicholas Piggin	26c5c6e129	powerpc/watchdog: Moderate touch_nmi_watchdog overhead Some code can go into a tight loop calling touch_nmi_watchdog (e.g., stop_machine CPU hotplug code). This can cause contention on watchdog locks particularly if all CPUs with watchdog enabled are spinning in the loops. Avoid this storm of activity by running the watchdog timer callback from this path if we have exceeded the timer period since it was last run. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-09 23:45:29 +10:00
Nicholas Piggin	d8e2a40535	powerpc/watchdog: Improve watchdog lock primitive - Hard-disable interrupts before taking the lock, which prevents soft-NMI re-entrancy and therefore can prevent deadlocks. - Use raw_ variants of local_irq_disable to avoid irq debugging. - When the lock is contended, spin at low SMT priority, using loads only, and with interrupts enabled (where possible). Some stalls have been noticed at high loads that go away with improved locking. There should not be so much locking contention in the first place (which is addressed in a subsequent patch), but locking should still be improved. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-09 23:45:28 +10:00
Nicholas Piggin	0459ddfdb3	powerpc: NMI IPI improve lock primitive When the NMI IPI lock is contended, spin at low SMT priority, using loads only, and with interrupts enabled (where possible). This improves behaviour under high contention (e.g., a system crash when a number of CPUs are trying to enter the debugger). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-09 23:45:26 +10:00
Michael Ellerman	7310d5c8c5	powerpc/configs: Re-enable HARD/SOFT lockup detectors In commit `05a4a95279` ("kernel/watchdog: split up config options"), CONFIG_LOCKUP_DETECTOR was split into two separate config options, CONFIG_HARDLOCKUP_DETECTOR and CONFIG_SOFTLOCKUP_DETECTOR. Our defconfigs still have CONFIG_LOCKUP_DETECTOR=y, but that is no longer user selectable, and we don't mention the new options, so we end up with none of them enabled. So update the defconfigs to turn on the new SOFT and HARD options, the end result being the same as what we had previously. Fixes: `05a4a95279` ("kernel/watchdog: split up config options") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-09 23:45:13 +10:00
Gautham R. Shenoy	785a12afdb	powerpc/powernv/idle: Disable LOSE_FULL_CONTEXT states when stop-api fails Currently, we use the opal call opal_slw_set_reg() to inform the Sleep-Winkle Engine (SLW) to restore the contents of some of the Hypervisor state on wakeup from deep idle states that lose full hypervisor context (characterized by the flag OPAL_PM_LOSE_FULL_CONTEXT). However, the current code has a bug in that if opal_slw_set_reg() fails, we don't disable the use of these deep states (winkle on POWER8, stop4 onwards on POWER9). This patch fixes this bug by ensuring that if programing the sleep-winkle engine to restore the hypervisor states in pnv_save_sprs_for_deep_states() fails, then we exclude such states by clearing the OPAL_PM_LOSE_FULL_CONTEXT flag from supported_cpuidle_states. As a result POWER8 will be prevented from using winkle for CPU-Hotplug, and POWER9 will put the offlined CPUs to the default stop state when available. Further, we ensure in the initialization of the cpuidle-powernv driver to only include those states whose flags are present in supported_cpuidle_states, thereby skipping OPAL_PM_LOSE_FULL_CONTEXT states when they have been disabled due to stop-api failure. Fixes: `1e1601b38e` ("powerpc/powernv/idle: Restore SPRs for deep idle states via stop API.") Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-08 20:21:23 +10:00
Michael Ellerman	21a0e8c14b	powerpc/mm/hash64: Make vmalloc 56T on hash On 64-bit book3s, with the hash MMU, we currently define the kernel virtual space (vmalloc, ioremap etc.), to be 16T in size. This is a leftover from pre v3.7 when our user VM was also 16T. Of that 16T we split it 50/50, with half used for PCI IO and ioremap and the other 8T for vmalloc. We never bothered to make it any bigger because 8T of vmalloc ought to be enough for anybody. But it turns out that's not true, the per cpu allocator wants large amounts of vmalloc space, not to make large allocations, but to allow a large stride between allocations, because we use pcpu_embed_first_chunk(). With a bit of juggling we can increase the entire kernel virtual space to 64T. The only real complication is the check of the address in the SLB miss handler, see the comment in the code. Although we could continue to split virtual space 50/50 as we do now, no one seems to be running out of PCI IO or ioremap space. So instead keep that as 8T, and use the remaining 56T for vmalloc. In future we should be able to increase the kernel virtual space to 512T, the code already supports that, it just needs testing on older hardware. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2017-08-08 19:37:05 +10:00
Michael Ellerman	b5048de04b	powerpc/mm/slb: Move comment next to the code it's referring to There is a comment in slb_allocate() referring to the load of paca->vmalloc_sllp, but it's several lines prior in the assembly. We're about to change this code, and we want to add another comment, so move the comment immediately prior to the instruction it's talking about. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-08 19:37:04 +10:00
Michael Ellerman	63ee9b2ff9	powerpc/mm/book3s64: Make KERN_IO_START a variable Currently KERN_IO_START is defined as: #define KERN_IO_START (KERN_VIRT_START + (KERN_VIRT_SIZE >> 1)) Although it looks like a constant, both the components are actually variables, to allow us to have a different value between Radix and Hash with a single kernel. However that still requires both Radix and Hash to place the kernel IO region at the same location relative to the start and end of the kernel virtual region (namely 1/2 way through it), and we'd like to change that. So split KERN_IO_START out into its own variable, and initialise it for Radix and Hash. In the medium term we should be able to reconsolidate this, by doing a more involved rearrangement of the location of the regions. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-08 19:37:04 +10:00
Matt Brown	e66ca3db59	powerpc/powernv: Use darn instruction for get_random_seed() on Power9 This adds powernv_get_random_darn() which utilises the darn instruction, introduced in ISA v3.0/POWER9. The darn instruction can potentially return an error, which is supported by the get_random_seed() API, in normal usage if we see an error we just return that to the caller. However when detecting whether darn is functional at boot we try up to 10 times, before deciding that darn doesn't work and failing the registration of get_random_seed(). That way an intermittent failure at boot doesn't deprive the system of randomness until the next reboot. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> [mpe: Move init into a function, tweak change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-08 19:37:03 +10:00
Christophe Leroy	64d0a506fb	powerpc/32: Fix boot failure on non 6xx platforms Commit `d300627c6a` ("powerpc/6xx: Handle DABR match before calling do_page_fault") breaks non 6xx platforms. Failed to execute /init (error -14) Starting init: /bin/sh exists but couldn't execute it (error -14) Kernel panic - not syncing: No working init found. Try passing init= ... CPU: 0 PID: 1 Comm: init Not tainted 4.13.0-rc3-s3k-dev-00143-g7aa62e972a56 #56 Call Trace: panic+0x108/0x250 (unreliable) rootfs_mount+0x0/0x58 ret_from_kernel_thread+0x5c/0x64 Rebooting in 180 seconds.. This is because in handle_page_fault(), the call to do_page_fault() has been mistakenly enclosed inside an #ifdef CONFIG_6xx Fixes: `d300627c6a` ("powerpc/6xx: Handle DABR match before calling do_page_fault") Brown-paper-bag-to-be-worn-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-08 19:35:34 +10:00
Longpeng(Mike)	199b5763d3	KVM: add spinlock optimization framework If a vcpu exits due to request a user mode spinlock, then the spinlock-holder may be preempted in user mode or kernel mode. (Note that not all architectures trap spin loops in user mode, only AMD x86 and ARM/ARM64 currently do). But if a vcpu exits in kernel mode, then the holder must be preempted in kernel mode, so we should choose a vcpu in kernel mode as a more likely candidate for the lock holder. This introduces kvm_arch_vcpu_in_kernel() to decide whether the vcpu is in kernel-mode when it's preempted. kvm_vcpu_on_spin's new argument says the same of the spinning VCPU. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-08-08 10:57:43 +02:00
Frederic Barrat	2552910084	powerpc/powernv: Enable PCI peer-to-peer P9 has support for PCI peer-to-peer, enabling a device to write in the MMIO space of another device directly, without interrupting the CPU. This patch adds support for it on powernv, by adding a new API to be called by drivers. The pnv_pci_set_p2p(...) call configures an 'initiator', i.e the device which will issue the MMIO operation, and a 'target', i.e. the device on the receiving side. P9 really only supports MMIO stores for the time being but that's expected to change in the future, so the API allows to define both load and store operations. /* PCI p2p descriptor / #define OPAL_PCI_P2P_ENABLE 0x1 #define OPAL_PCI_P2P_LOAD 0x2 #define OPAL_PCI_P2P_STORE 0x4 int pnv_pci_set_p2p(struct pci_dev initiator, struct pci_dev *target, u64 desc) It uses a new OPAL call, as the configuration magic is done on the PHBs by skiboot. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> [mpe: Drop unrelated OPAL calls, s/uint64_t/u64/, minor formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-08 11:27:30 +10:00
Michael Ellerman	44a12806d0	Revert "powerpc/64: Avoid restore_math call if possible in syscall exit" This reverts commit `bc4f65e4cf`. As reported by Andreas, this commit is causing unrecoverable SLB misses in the system call exit path: Unrecoverable exception 4100 at c00000000000a1ec Oops: Unrecoverable exception, sig: 6 [#1] SMP NR_CPUS=2 PowerMac ... CPU: 0 PID: 18626 Comm: rm Not tainted 4.13.0-rc3 #1 task: c00000018335e080 task.stack: c000000139e50000 NIP: c00000000000a1ec LR: c00000000000a118 CTR: 0000000000000000 REGS: c000000139e53bb0 TRAP: 4100 Not tainted (4.13.0-rc3) MSR: 9000000000001030 <SF,HV,ME,IR,DR> CR: 24000044 XER: 20000000 SOFTE: 1 GPR00: 0000000000000000 c000000139e53e30 c000000000abb500 fffffffffffffffe GPR04: c0000001eb866298 0000000000000000 0000000000000000 c00000018335e080 GPR08: 900000000000d032 0000000000000000 0000000000000002 fffffffffffff001 GPR12: c000000139e50000 c00000000ffff000 00003fffa8c0dca0 00003fffa8c0dc88 GPR16: 0000000010000000 0000000000000001 00003fffa8c0eaa0 0000000000000000 GPR20: 00003fffa8c27528 00003fffa8c27b00 0000000000000000 0000000000000000 GPR24: 00003fffa8c0d918 00003ffff1b3efa0 00003fffa8c26d68 0000000000000000 GPR28: 00003fffa8c249e8 00003fffa8c263d0 00003fffa8c27550 00003ffff1b3ef10 NIP [c00000000000a1ec] system_call_exit+0xc0/0x21c LR [c00000000000a118] system_call+0x58/0x6c Call Trace: [c000000139e53e30] [c00000000000a118] system_call+0x58/0x6c (unreliable) Instruction dump: 64a51000 7c6300d0 f8a101a0 4bffff9c 3c000000 60000006 780007c6 64000000 60000000 7c004039 4082001c e8ed0170 <88070b78> 88c70b79 7c003214 2c200000 This is caused by us trying to load THREAD_LOAD_FP with MSR_RI=0, and taking an SLB miss on the thread struct. Reported-by: Andreas Schwab <schwab@linux-m68k.org> Diagnosed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-07 21:36:56 +10:00
Linus Torvalds	841fe95303	powerpc fixes for 4.13 #5 Fixes for recently merged code: - a fix for the _PAGE_DEVMAP support, which was breaking KVM on Power9 radix - avoid a (harmless) lockdep warning in the early SMP code - return failure for some uses of dma_set_mask() rather than falling back to 32-bits - fix stack setup in watchdog soft_nmi_common() to use emergency stack - fix of_irq_to_resource() error check in of_fsl_spi_probe() Two fixes going to stable: - fix saving of Transactional Memory SPRs in core dump - fix __check_irq_replay missing decrementer interrupt And two misc: - fix 64-bit boot wrapper build with non-biarch compiler - work around a POWER9 PMU hang after state-loss idle Thanks to: Alistair Popple, Aneesh Kumar K.V, Cyril Bur, Gustavo Romero, Jose Ricardo Ziviani, Laurent Vivier, Nicholas Piggin, Oliver O'Halloran, Sergei Shtylyov, Suraj Jitindar Singh, Thomas Gleixner. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJZhEH/AAoJEFHr6jzI4aWA8GoP+wdtaBYZIElFMnPay/BY8lSQ MODF6FlaexdNuVFo5In2BnMAs12bIXDO69wMziqc+uZIA+vI3K7lqsi+8p1R38Ts FG5OFLzX/gx1ykoeDe0mC8GPUHGVBdMr6+rHzt/9M7PrQWApCoBJLakYIElOrN5B EiAxunwJQISKNQYYmyVhjMVFq8ydMHSgtR7JVDHTh1AoStaO9M5hoJegD4Dnytaf F0/0Y5U+NGGWwPIyDrjB4wE9L8NCk0JfvpBhjORMkFSpAOe4VhxZrAGDyaCzVKVh YkxSirVKuVwEmvV2prmq4SAJi61EMUEPx2WL7hRN3ol/dF+2lU9nl2lyL3yRnGPY 9mtNHur07qoMqyXI8750ayTzbMG75aakJF2be9G+zwNc3Sw8dqJvs2PwpSVdnima jWmWnRikws4qmEOEBV6nxtWf8qYACHevyK7YzIHDU8SVXLjAkX/u6LZGHAfTQ8Xo jYcISpiA2ko8YnMR7yy9FZqs4KGqeOotMamgJRGUcSY/9Xn8WDIrtQ4tIuJtGjzW XFOFafZOzyt6s1dNz01C4hzP60J+/CxPNRxx6zWnbijY6puADu/Mv1g+ai3nYdKA e/CYt6O8tzooFHCHtVLUPgWA82heDwNtjmlnSikWTdT07KLRhP+0w9Qq8gBzztKT 7B/vPMjCMPQaJrrnj6GX =Rdy8 -----END PGP SIGNATURE----- Merge tag 'powerpc-4.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Fixes for recently merged code: - a fix for the _PAGE_DEVMAP support, which was breaking KVM on Power9 radix - avoid a (harmless) lockdep warning in the early SMP code - return failure for some uses of dma_set_mask() rather than falling back to 32-bits - fix stack setup in watchdog soft_nmi_common() to use emergency stack - fix of_irq_to_resource() error check in of_fsl_spi_probe() Two fixes going to stable: - fix saving of Transactional Memory SPRs in core dump - fix __check_irq_replay missing decrementer interrupt And two misc: - fix 64-bit boot wrapper build with non-biarch compiler - work around a POWER9 PMU hang after state-loss idle Thanks to: Alistair Popple, Aneesh Kumar K.V, Cyril Bur, Gustavo Romero, Jose Ricardo Ziviani, Laurent Vivier, Nicholas Piggin, Oliver O'Halloran, Sergei Shtylyov, Suraj Jitindar Singh, Thomas Gleixner" * tag 'powerpc-4.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64: Fix __check_irq_replay missing decrementer interrupt powerpc/perf: POWER9 PMU stops after idle workaround powerpc/83xx/mpc832x_rdb: fix of_irq_to_resource() error check powerpc/64s: Fix stack setup in watchdog soft_nmi_common() powerpc/powernv/pci: Return failure for some uses of dma_set_mask() powerpc/boot: Fix 64-bit boot wrapper build with non-biarch compiler powerpc/smp: Call smp_ops->setup_cpu() directly on the boot CPU powerpc/tm: Fix saving of TM SPRs in core dump powerpc/mm: Fix pmd/pte_devmap() on non-leaf entries	2017-08-04 09:56:54 -07:00
Nicholas Piggin	3db40c312c	powerpc/64: Fix __check_irq_replay missing decrementer interrupt If the decrementer wraps again and de-asserts the decrementer exception while hard-disabled, __check_irq_replay() has a test to notice the wrap when interrupts are re-enabled. The decrementer check must be done when clearing the PACA_IRQ_HARD_DIS flag, not when the PACA_IRQ_DEC flag is tested. Previously this worked because the decrementer interrupt was always the first one checked after clearing the hard disable flag, but HMI check was moved ahead of that, which introduced this bug. This can cause a missed decrementer interrupt if we soft-disable interrupts then take an HMI which is recorded in irq_happened, then hard-disable interrupts for > 4s to wrap the decrementer. Fixes: `e0e0d6b739` ("powerpc/64: Replay hypervisor maintenance interrupt first") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-04 12:55:49 +10:00
Nicholas Piggin	09539f9b12	powerpc/perf: POWER9 PMU stops after idle workaround POWER9 DD2 PMU can stop after a state-loss idle in some conditions. A solution is to set then clear MMCRA[60] after wake from state-loss idle. MMCRA[60] is a non-architected bit, see the user manual for details. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-04 12:52:26 +10:00
Benjamin Herrenschmidt	6ff4d3e966	powerpc: Remove old unused icswx based coprocessor support We have a whole pile of unused code to maintain the ACOP register, allocate coprocessor PIDs and handle ACOP faults. This mechanism was used for the HFI adapter on POWER7 which is dead and gone and whose driver never went upstream. It was used on some A2 core based stuff that also never saw the light of day. Take out all that code. There is still some POWER8 coprocessor code that uses icswx but it's kernel only and thus doesn't use any of that infrastructure. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:52 +10:00
Benjamin Herrenschmidt	8f5ca0b319	powerpc/mm: Cleanup check for stack expansion When hitting below a VM_GROWSDOWN vma (typically growing the stack), we check whether it's a valid stack-growing instruction and we check the distance to GPR1. This is largely open coded with lots of comments, so move it out to a helper. While at it, make store_update_sp a boolean. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:51 +10:00
Benjamin Herrenschmidt	f43bb27ebf	powerpc/mm: Don't lose "major" fault indication on retry If the first iteration returns VM_FAULT_MAJOR but the second one doesn't, we fail to account the fault as a major fault. This fixes it and brings the code in line with x86. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:51 +10:00
Benjamin Herrenschmidt	bd0d63f809	powerpc/mm: Move page fault VMA access checks to a helper Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:51 +10:00
Benjamin Herrenschmidt	d2e0d2c51a	powerpc/mm: Set fault flags earlier Move out the code that sets FAULT_FLAG_WRITE so the block that check access permissions can be extracted. While at it also set FAULT_FLAG_INSTRUCTION which will be used for protection keys. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:50 +10:00
Benjamin Herrenschmidt	b15021d994	powerpc/mm: Add a bunch of (un)likely annotations to do_page_fault Mostly for the failure cases Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:50 +10:00
Benjamin Herrenschmidt	11ccdd33d6	powerpc/mm: Move/simplify faulthandler_disabled() and !mm check Do the check before we re-enable interrupts and clean the code up a bit. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:49 +10:00
Benjamin Herrenschmidt	2865d08dd9	powerpc/mm: Move the DSISR_PROTFAULT sanity check This has a page of comment explaining what's going on right in the middle of do_page_fault() which makes things a bit hard to follow. Move it to a helper instead. Also do the test earlier as there's no point waiting until after we found the VMA. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:49 +10:00
Benjamin Herrenschmidt	04aafdc601	powerpc/mm: Cosmetic fix to page fault accounting No need to break those lines, they aren't that long Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:48 +10:00
Benjamin Herrenschmidt	3da026480a	powerpc/mm: Move CMO accounting out of do_page_fault into a helper It makes do_page_fault() more readable. No functional change. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:48 +10:00
Benjamin Herrenschmidt	b5c8f0fd59	powerpc/mm: Rework mm_fault_error() First, handle the normal retry failure in do_page_fault itself, since it's a simple return statement. That allows us to remove the "continue" special return code from mm_fault_error(). Once that's done, we can have an implementation much closer to x86 where we only call mm_fault_error() if VM_FAULT_ERROR is set and directly return. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:47 +10:00
Benjamin Herrenschmidt	c3350602e8	powerpc/mm: Make bad_area* helper functions Instead of goto labels, instead call those functions and return. This gets us closer to x86 and allows us to shring do_page_fault() even more. The main difference with x86 is that those function return a value which we then return from do_page_fault(). That value is our return value from do_page_fault() which we use to generate kernel faults. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:47 +10:00
Benjamin Herrenschmidt	d3ca587404	powerpc/mm: Fix reporting of kernel execute faults We currently test for is_exec and DSISR_PROTFAULT but that doesn't make sense as this is the wrong error bit to test for an execute permission failure. In fact, we had code that would return early if we had an exec fault in kernel mode so I think that was just dead code anyway. Finally the location of that test is awkward and prevents further simplifications. So instead move that test into a helper along with the existing early test for kernel exec faults and out of range accesses, and put it all in a "bad_kernel_fault()" helper. While at it test the correct error bits. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:46 +10:00
Benjamin Herrenschmidt	65d47fd4a3	powerpc/mm: Simplify returns from __do_page_fault Now that we moved the exception state handling to a wrapper, we can just directly return rather than "goto bail" Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:46 +10:00
Benjamin Herrenschmidt	bb4be50e61	powerpc/mm: Move debugger check to notify_page_fault() unclutters the main path Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:46 +10:00
Benjamin Herrenschmidt	f3d96e698e	powerpc/mm: Overhaul handling of bad page faults A bad page fault is when the HW signals an error such as a bad copy/paste, an AMO error, or some other type of error that will not be fixed by updating the PTE. Use a helper page_fault_is_bad() to check for bad page faults thus removing the per-processor family open-coding in __do_page_fault() and trigger a SIGBUS rather than a SIGSEGV which is more appropriate. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:45 +10:00
Benjamin Herrenschmidt	e6c8290a89	powerpc/mm: Move error_code checks for bad faults earlier There's no point looking for the VMA etc.. when we already know we are going to fail. This adds some code to set "code" for the si_code but that will be gone in subsequent patches. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:44 +10:00
Benjamin Herrenschmidt	41b464e5e5	powerpc/mm: Move out definition of CPU specific is_write bits Define a common page_fault_is_write() helper and use it Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:44 +10:00
Benjamin Herrenschmidt	b4c001dc44	powerpc/mm: Use symbolic constants for filtering SRR1 bits on ISIs This uses the newly defined constants for this rather than open-coded numbers. There is a side effect on 64-bit which is to pass through some of the new P9 bits which we didn't before. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:44 +10:00
Benjamin Herrenschmidt	398a719d34	powerpc/mm: Update bits used to skip hash_page We test a number of bits from DSISR/SRR1 before deciding to call hash_page(). If any of these is set, we go directly to do_page_fault() as the bit indicate a fault that needs to be handled there (no hashing needed). This updates the current open-coded masks to use the new DSISR definitions. This does change the masks actually used in two ways: - We used to test various bits that were defined as "always 0" in the architecture and could be repurposed for something else. From now on, we just ignore such bits. - We were missing some new bits defined on P9 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:43 +10:00
Benjamin Herrenschmidt	870cfe77a9	powerpc/mm: Update definitions of DSISR bits This updates the definitions for the various DSISR bits to match both some historical stuff and to match new bits on POWER9. In addition, we define some masks corresponding to the "bad" faults on Book3S, and some masks corresponding to the bits that match between DSISR and SRR1 for a DSI and an ISI. This comes with a small code update to change the definition of DSISR_PGDIRFAULT which becomes DSISR_PRTABLE_FAULT to match architecture 3.0B Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:43 +10:00
Benjamin Herrenschmidt	d300627c6a	powerpc/6xx: Handle DABR match before calling do_page_fault On legacy 6xx 32-bit procesors, we checked for the DABR match bit in DSISR from do_page_fault(), in the middle of a pile of ifdef's because all other CPU types do it in assembly prior to calling do_page_fault. Fix that. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Add #ifdef CONFIG_6xx] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-03 16:06:26 +10:00
Sergei Shtylyov	f29bb7861a	powerpc/83xx/mpc832x_rdb: fix of_irq_to_resource() error check of_irq_to_resource() has recently been fixed to return negative error #'s along with 0 in case of failure, however the Freescale MPC832x RDB board code still only regards 0 as a failure indication -- fix it up. Fixes: `7a4228bbff` ("of: irq: use of_irq_get() in of_irq_to_resource()") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-02 19:26:52 +10:00
Benjamin Herrenschmidt	c433ec0455	powerpc/mm: Pre-filter SRR1 bits before do_page_fault() By filtering the relevant SRR1 bits in the assembly rather than in do_page_fault() itself, we avoid a conditional branch (since we already come from different path for data and instruction faults). This will allow more simplifications later Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-02 13:11:07 +10:00
Benjamin Herrenschmidt	7afad422ac	powerpc/mm: Move exception_enter/exit to a do_page_fault wrapper This will allow simplifying the returns from do_page_fault Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-02 13:11:07 +10:00
Benjamin Herrenschmidt	424de9c6e3	powerpc/mm/radix: Avoid flushing the PWC on every flush_tlb_range We do that because it's used by THP pmd collapsing, so use instead a dedicated flush function. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-02 13:11:06 +10:00
Benjamin Herrenschmidt	a46cc7a90f	powerpc/mm/radix: Improve TLB/PWC flushes At the moment we have to rather sub-optimal flushing behaviours: - flush_tlb_mm() will flush the PWC which is unnecessary (for example when doing a fork) - A large unmap will call flush_tlb_pwc() multiple times causing us to perform that fairly expensive operation repeatedly. This happens often in batches of 3 on every new process. So we change flush_tlb_mm() to only flush the TLB, and we use the existing "need_flush_all" flag in struct mmu_gather to indicate that the PWC needs flushing. Unfortunately, flush_tlb_range() still needs to do a full flush for now as it's used by the THP collapsing. We will fix that later. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-02 13:11:06 +10:00
Benjamin Herrenschmidt	5ce5fe14ed	powerpc/mm/radix: Improve _tlbiel_pid to be usable for PWC flushes The PWC flush only needs a single set call, just like the full (RIC=2) flush. This will allow us to get rid of the dedicated _tlbiel_pwc() Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-02 13:11:05 +10:00
Jeff Layton	3b49c9a1e9	fs: convert a pile of fsync routines to errseq_t based reporting This patch converts most of the in-kernel filesystems that do writeback out of the pagecache to report errors using the errseq_t-based infrastructure that was recently added. This allows them to report errors once for each open file description. Most filesystems have a fairly straightforward fsync operation. They call filemap_write_and_wait_range to write back all of the data and wait on it, and then (sometimes) sync out the metadata. For those filesystems this is a straightforward conversion from calling filemap_write_and_wait_range in their fsync operation to calling file_write_and_wait_range. Acked-by: Jan Kara <jack@suse.cz> Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Jeff Layton <jlayton@redhat.com>	2017-08-01 08:39:29 -04:00
Victor Aoqui	75f327c6b7	powerpc/kernel: Avoid preemption check in iommu_range_alloc() Replace the __this_cpu_read() with raw_cpu_read() in iommu_range_alloc(). Otherwise we get a warning about using __this_cpu_read() in preemptible code: BUG: using __this_cpu_read() in preemptible caller is iommu_range_alloc+0xa8/0x3d0 Preemption doesn't need to be disabled since according to the comment any CPU can safely use any IOMMU pool. Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-01 21:49:23 +10:00
Gautham R. Shenoy	24be85a23d	powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug Currently we use the stop-api provided by the firmware to program the SLW engine to restore the values of hypervisor resources that get lost on deeper idle states (such as winkle). Since the deep states were only used for CPU-Hotplug on POWER8 systems, we would program the LPCR to have the PECE1 bit since Hotplugged CPUs shouldn't be spuriously woken up by decrementer. On POWER9, some of the deep platform idle states such as stop4 can be used in cpuidle as well. In this case, we want the CPU in stop4 to be woken up by the decrementer when some timer on the CPU expires. In this patch, we program the stop-api for LPCR with PECE1 bit cleared only when we are offlining the CPU and set it back once the CPU is online. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-01 21:01:28 +10:00
Gautham R. Shenoy	e1c1cfed54	powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle The stop4 idle state on POWER9 is a deep idle state which loses hypervisor resources, but whose latency is low enough that it can be exposed via cpuidle. Until now, the deep idle states which lose hypervisor resources (eg: winkle) were only exposed via CPU-Hotplug. Hence currently on wakeup from such states, barring a few SPRs which need to be restored to their older value, rest of the SPRS are reinitialized to their values corresponding to that at boot time. When stop4 is used in the context of cpuidle, we want these additional SPRs to be restored to their older value, to ensure that the context on the CPU coming back from idle is same as it was before going idle. In this patch, we define a SPR save area in PACA (since we have used up the volatile register space in the stack) and on POWER9, we restore SPRN_PID, SPRN_LDBAR, SPRN_FSCR, SPRN_HFSCR, SPRN_MMCRA, SPRN_MMCR1, SPRN_MMCR2 to the values they had before entering stop. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-08-01 21:01:20 +10:00
Nicholas Piggin	cc491f1d35	powerpc/64s: Fix stack setup in watchdog soft_nmi_common() The watchdog soft-NMI exception stack setup loads a stack pointer twice, which is an obvious error. It ends up using the system reset interrupt (true-NMI) stack, which is also a bug because the watchdog could be preempted by a system reset interrupt that overwrites the NMI stack. Change the soft-NMI to use the "emergency stack". The current kernel stack is not used, because of the longer-term goal to prevent asynchronous stack access using soft-disable. Fixes: `2104180a53` ("powerpc/64s: implement arch-specific hardlockup watchdog") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-07-31 20:22:37 +10:00
Michael Ellerman	bb272221e9	Linux v4.13-rc1 -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJZapWhAAoJEHm+PkMAQRiGKb0IAJM6b7SbWaw69Og7+qiFB+zZ xp29iXqbE9fPISC6a5BRQV1ONjeDM6opGixGHqGC8Hla6k2IYz25VDNoF8wd0MXN cz/Ih20vd3C5afxXGe5cTT8lsPAlV0mWXxForlu6j8jPeL62FPfq6RhEkw7AcrYL yfYy3k3qSdOrrvBdII0WAAUi46UfIs+we9BQgbsMbkHOiqV2K0MOrzKE84Xbgepq RAy2xg6P4b4+hTx8xTrYc1MXwpnqjRc0oJ08gdmiwW3AOOU7LxYFn7zDkLPWi9Rr g4x6r4YhBTGxT4wNvovLIiqd9QFs//dMCuPWYwEtTICG48umIqqq24beQ0mvCdg= =08Ic -----END PGP SIGNATURE----- Merge tag 'v4.13-rc1' into fixes The fixes branch is based off a random pre-rc1 commit, because we had some fixes that needed to go in before rc1 was released. However we now need to fix some code that went in after that point, but before rc1, so merge rc1 to get that code into fixes so we can fix it!	2017-07-31 20:20:29 +10:00
Rui Teng	23493c1219	powerpc/mm: Fix check of multiple 16G pages from device tree The offset of hugepage block will not be 16G, if the expected page is more than one. Calculate the totol size instead of the hardcode value. Fixes: `4792adbac9` ("powerpc: Don't use a 16G page if beyond mem= limits") Signed-off-by: Rui Teng <rui.teng@linux.vnet.ibm.com> Tested-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-07-31 16:56:58 +10:00
Michael Ellerman	3603c52f02	powerpc/configs: Add a powernv_be_defconfig Although pretty much everyone using powernv is running little endian, we should still test we can build for big endian. So add a powernv_be_defconfig, which is autogenerated by flipping the endian symbol in powernv_defconfig. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Cyril Bur <cyrilbur@gmail.com>	2017-07-31 16:56:37 +10:00
Linus Torvalds	8155469341	s390: SRCU fix. PPC: host crash fixes. x86: bugfixes, including making nested posted interrupts really work. Generic: tweaks to kvm_stat and to uevents -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJZe2EYAAoJEL/70l94x66D4JYH/AnvioKWTsplhUKt4Y4JlpJX EXYjQd/CIZ+MHNNUH+U+XEj6tKQymKrz4TeZSs1o0nyxCeyparR3gK27OYVpPspN GkPSit3hyRgW9r5uXp6pZCJuFCAMpMZ6z4sKbT1FxDhnWnpWayV9w8KA+yQT/UUX dNQ9JJPUxApcM4NCaj2OCQ8K1koNIDCc52+jATf0iK/Heiaf6UGqCcHXUIy5I5wM OWk05Qm32VBAYb6P6FfoyGdLMNAAkJtr1fyOJDkxX730CYgwpjIP0zifnJ1bt8V2 YRnjvPO5QciDHbZ8VynwAkKi0ZAd8psjwXh0KbyahPL/2/sA2xCztMH25qweriI= =fsfr -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Paolo Bonzini: "s390: - SRCU fix PPC: - host crash fixes x86: - bugfixes, including making nested posted interrupts really work Generic: - tweaks to kvm_stat and to uevents" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: LAPIC: Fix reentrancy issues with preempt notifiers tools/kvm_stat: add '-f help' to get the available event list tools/kvm_stat: use variables instead of hard paths in help output KVM: nVMX: Fix loss of L2's NMI blocking state KVM: nVMX: Fix posted intr delivery when vcpu is in guest mode x86: irq: Define a global vector for nested posted interrupts KVM: x86: do mask out upper bits of PAE CR3 KVM: make pid available for uevents without debugfs KVM: s390: take srcu lock when getting/setting storage keys KVM: VMX: remove unused field KVM: PPC: Book3S HV: Fix host crash on changing HPT size KVM: PPC: Book3S HV: Enable TM before accessing TM registers	2017-07-28 13:36:56 -07:00

... 5 6 7 8 9 ...

17448 Commits