linux

Commit Graph

Author	SHA1	Message	Date
Naveen N. Rao	a9f8553e93	powerpc/kprobes: Pause function_graph tracing during jprobes handling This fixes a crash when function_graph and jprobes are used together. This is essentially commit `237d28db03` ("ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing"), but for powerpc. Jprobes breaks function_graph tracing since the jprobe hook needs to use jprobe_return(), which never returns back to the hook, but instead to the original jprobe'd function. The solution is to momentarily pause function_graph tracing before invoking the jprobe hook and re-enable it when returning back to the original jprobe'd function. Fixes: `6794c78243` ("powerpc64: port of the function graph tracer") Cc: stable@vger.kernel.org # v2.6.30+ Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-16 19:49:43 +10:00
Alexey Kardashevskiy	a093c92dc7	powerpc/debug: Add missing warn flag to WARN_ON's non-builtin path When trapped on WARN_ON(), report_bug() is expected to return BUG_TRAP_TYPE_WARN so the caller will increment NIP by 4 and continue. The __builtin_constant_p() path of the PPC's WARN_ON() calls (indirectly) __WARN_FLAGS() which has BUGFLAG_WARNING set, however the other branch does not which makes report_bug() report a bug rather than a warning. Fixes: `f26dee1510` ("debug: Avoid setting BUGFLAG_WARNING twice") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-16 16:10:37 +10:00
Benjamin Herrenschmidt	25642705b2	powerpc/xive: Fix offset for store EOI MMIOs Architecturally we should apply a 0x400 offset for these. Not doing it will break future HW implementations. The offset of 0 is supposed to remain for "triggers" though not all sources support both trigger and store EOI, and in P9 specifically, some sources will treat 0 as a store EOI. But future chips will not. So this makes us use the properly architected offset which should work always. Fixes: `243e25112d` ("powerpc/xive: Native exploitation of the XIVE interrupt controller") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-15 23:29:39 +10:00
Murilo Opsfelder Araujo	42bed04255	drivers/watchdog/Kconfig: Update CONFIG_WATCHDOG_RTAS dependencies drivers/watchdog/wdrtas.c uses symbols defined in arch/powerpc/kernel/rtas.c, which are exported iff CONFIG_PPC_RTAS is selected. Building wdrtas.c without setting CONFIG_PPC_RTAS throws the following errors: ERROR: ".rtas_token" [drivers/watchdog/wdrtas.ko] undefined! ERROR: "rtas_data_buf" [drivers/watchdog/wdrtas.ko] undefined! ERROR: "rtas_data_buf_lock" [drivers/watchdog/wdrtas.ko] undefined! ERROR: ".rtas_get_sensor" [drivers/watchdog/wdrtas.ko] undefined! ERROR: ".rtas_call" [drivers/watchdog/wdrtas.ko] undefined! This was identified during a randconfig build where CONFIG_WATCHDOG_RTAS=m and CONFIG_PPC_RTAS was not set. Logs are here: http://kisskb.ellerman.id.au/kisskb/buildresult/12982152/ This patch fixes the issue by updating CONFIG_WATCHDOG_RTAS to depend on just CONFIG_PPC_RTAS, removing COMPILE_TEST entirely. Signed-off-by: Murilo Opsfelder Araujo <mopsfelder@gmail.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-15 16:37:40 +10:00
Nicholas Piggin	07d2a628bc	powerpc/64s: Avoid cpabort in context switch when possible The ISA v3.0B copy-paste facility only requires cpabort when switching to a process that has foreign real addresses mapped (direct access to accelerators), to clear a potential copy buffer filled by a previous thread. There is no accelerator driver implemented yet, so cpabort can be removed. It can be be re-added when a driver is implemented. POWER9 DD1 requires the copy buffer to always be cleared on context switch, but if accelerators are not in use, then an unpaired copy from a dummy region is sufficient to clear data out of the copy buffer. This increases context switch performance by about 5% on POWER9. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-15 16:34:39 +10:00
Nicholas Piggin	9145effd62	powerpc/64: Drop explicit hwsync in context switch The sync (aka. hwsync, aka. heavyweight sync) in the context switch code to prevent MMIO access being reordered from the point of view of a single process if it gets migrated to a different CPU is not required because there is an hwsync performed earlier in the context switch path. Comment this so it's clear enough if anything changes on the scheduler or the powerpc sides. Remove the hwsync from _switch. This improves context switch performance by 2-3% on POWER8. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-15 16:34:39 +10:00
Nicholas Piggin	837e72f78a	powerpc/64: Drop reservation-clearing ldarx in context switch There is no need to explicitly break the reservation in _switch, because we are guaranteed that the context switch path will include a larx/stcx. Comment the guarantee and remove the reservation clear from _switch. This is worth 1-2% in context switch performance. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-15 16:34:39 +10:00
Nicholas Piggin	e4c0fc5f72	powerpc/64s: Leave interrupts hard enabled in context switch for radix Commit 4387e9ff25 ("[POWERPC] Fix PMU + soft interrupt disable bug") hard disabled interrupts over the low level context switch, because the SLB management can't cope with a PMU interrupt accesing the stack in that window. Radix based kernel mapping does not use the SLB so it does not require interrupts hard disabled here. This is worth 1-2% in context switch performance on POWER9. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-15 16:34:39 +10:00
Nicholas Piggin	bc4f65e4cf	powerpc/64: Avoid restore_math call if possible in syscall exit The syscall exit code that branches to restore_math is quite heavy on Book3S, consisting of 2 mtmsr instructions. Threads that don't use both FP and vector can get caught here if the kernel ever uses FP or vector. Lazy-FP/vec context switching also trips this case. So check for lazy FP and vector before switching RI for restore_math. Move most of this case out of line. For threads that do want to restore math registers, the MSR switches are still suboptimal. Future direction may be to use a soft-RI bit to avoid MSR switches in kernel (similar to soft-EE), but for now at least the no-restore POWER9 context switch rate increases by about 5% due to sched_yield(2) return performance. I haven't constructed a test to measure the syscall cost. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-15 16:34:39 +10:00
Nicholas Piggin	acd7d8cef0	powerpc/64s: Optimize hypercall/syscall entry After `bc3551257a` ("powerpc/64: Allow for relocation-on interrupts from guest to host"), a getppid() system call goes from 307 cycles to 358 cycles (+17%) on POWER8. This is due significantly to the scratch SPR used by the hypercall check. It turns out there are a some volatile registers common to both system call and hypercall (in particular, r12, cr0, ctr), which can be used to avoid the SPR and some other overheads. This brings getppid to 320 cycles (+4%). Testing hcall entry performance by running "sc 1" in guest userspace before this patch is 854 cycles, afterwards is 826. Also a small win there. POWER9 syscall is improved by about the same amount, hcall not tested. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-15 16:34:39 +10:00
Michael Ellerman	9abcc981de	powerpc/mm/radix: Only add X for pages overlapping kernel text Currently we map the whole linear mapping with PAGE_KERNEL_X. Instead we should check if the page overlaps the kernel text and only then add PAGE_KERNEL_X. Note that we still use 1G pages if they're available, so this will typically still result in a 1G executable page at KERNELBASE. So this fix is primarily useful for catching stray branches to high linear mapping addresses. Without this patch, we can execute at 1G in xmon using: 0:mon> m c000000040000000 c000000040000000 00 l c000000040000000 00000000 01006038 c000000040000004 00000000 2000804e c000000040000008 00000000 x 0:mon> di c000000040000000 c000000040000000 38600001 li r3,1 c000000040000004 4e800020 blr 0:mon> p c000000040000000 return value is 0x1 After we get a 400 as expected: 0:mon> p c000000040000000 *** 400 exception occurred Fixes: `2bfd65e45e` ("powerpc/mm/radix: Add radix callbacks for early init routines") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com>	2017-06-15 16:34:39 +10:00
Michael Ellerman	0edc2ca9cc	Revert "powerpc: Handle simultaneous interrupts at once" This reverts commit `45cb08f479`. For some reason this is causing IRQ problems on Freescale Book3E machines, eg on my p5020ds: irq 25: nobody cared (try booting with the "irqpoll" option) CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc3-gcc-6.3.1-00037-g45cb08f4791c #624 Call Trace: [c0000000fffdbb10] [c00000000049962c] .dump_stack+0xa8/0xe8 (unreliable) [c0000000fffdbba0] [c0000000000babf4] .__report_bad_irq+0x54/0x140 [c0000000fffdbc40] [c0000000000bb11c] .note_interrupt+0x324/0x380 [c0000000fffdbd00] [c0000000000b7110] .handle_irq_event_percpu+0x68/0x88 [c0000000fffdbd90] [c0000000000b718c] .handle_irq_event+0x5c/0xa8 [c0000000fffdbe10] [c0000000000bc01c] .handle_fasteoi_irq+0xe4/0x298 [c0000000fffdbe90] [c0000000000b59c4] .generic_handle_irq+0x50/0x74 [c0000000fffdbf10] [c0000000000075d8] .__do_irq+0x74/0x1f0 [c0000000fffdbf90] [c0000000000189f8] .call_do_irq+0x14/0x24 [c0000000f7173060] [c0000000000077e4] .do_IRQ+0x90/0x120 [c0000000f7173100] [c00000000001d93c] exc_0x500_common+0xfc/0x100 --- interrupt: 501 at .prepare_to_wait_event+0xc/0x14c LR = .fsl_elbc_run_command+0xc8/0x23c [c0000000f71734d0] [c00000000065f418] .nand_reset+0xb8/0x168 [c0000000f7173560] [c00000000065fec4] .nand_scan_ident+0x2b0/0x1638 [c0000000f7173650] [c000000000666cd8] .fsl_elbc_nand_probe+0x34c/0x5f0 ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [c0000000f7173750] [c0000000005a3c60] .platform_drv_probe+0x64/0xb0 [c0000000f71737d0] [c0000000005a12e0] .really_probe+0x290/0x334 [c0000000f7173870] [c0000000005a14a0] .__driver_attach+0x11c/0x120 [c0000000f7173900] [c00000000059e6a0] .bus_for_each_dev+0x98/0xfc [c0000000f71739a0] [c0000000005a0b3c] .driver_attach+0x34/0x4c [c0000000f7173a20] [c0000000005a04b0] .bus_add_driver+0x1ac/0x2e0 [c0000000f7173ac0] [c0000000005a2170] .driver_register+0x94/0x160 [c0000000f7173b40] [c0000000005a3be0] .__platform_driver_register+0x60/0x7c [c0000000f7173bc0] [c000000000d6aab4] .fsl_elbc_nand_driver_init+0x24/0x38 [c0000000f7173c30] [c000000000001934] .do_one_initcall+0x68/0x1b8 [c0000000f7173d00] [c000000000d210f8] .kernel_init_freeable+0x260/0x338 [c0000000f7173db0] [c0000000000021b0] .kernel_init+0x20/0xe70 [c0000000f7173e30] [c0000000000009bc] .ret_from_kernel_thread+0x58/0x9c handlers: [<c000000000ed85c8>] .fsl_lbc_ctrl_irq Disabling IRQ #25 Ben also had concerns with the implementation being potentially slow on some PICs, so revert it for now. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-15 16:20:46 +10:00
Alistair Popple	377aa6b0ef	powerpc/npu-dma: Remove spurious WARN_ON when a PCI device has no of_node Commit `4c3b89effc` ("powerpc/powernv: Add sanity checks to pnv_pci_get_{gpu\|npu}_dev") introduced explicit warnings in pnv_pci_get_npu_dev() when a PCIe device has no associated device-tree node. However not all PCIe devices have an of_node and pnv_pci_get_npu_dev() gets indirectly called at least once for every PCIe device in the system. This results in spurious WARN_ON()'s so remove it. The same situation should not exist for pnv_pci_get_gpu_dev() as any NPU based PCIe device requires a device-tree node. Fixes: `4c3b89effc` ("powerpc/powernv: Add sanity checks to pnv_pci_get_{gpu\|npu}_dev") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-14 15:23:19 +10:00
Michael Ellerman	c6ee9619e2	powerpc/book3s64: Move PPC_DT_CPU_FTRs and enable it by default The PPC_DT_CPU_FTRs is a bit misplaced in menuconfig, it shows up with other general kernel options. It's really more at home in the "Platform Support" section, so move it there. Also enable it by default, for Book3s 64. It does mostly nothing unless the device tree properties are found, and we will want it enabled eventually in distro kernels, so turn it on to start getting more testing. Fixes: `5a61ef74f2` ("powerpc/64s: Support new device tree binding for discovering CPU features") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-08 20:42:57 +10:00
Aneesh Kumar K.V	92d9dfda8b	powerpc/mm/4k: Limit 4k page size config to 64TB virtual address space Supporting 512TB requires us to do a order 3 allocation for level 1 page table (pgd). This results in page allocation failures with certain workloads. For now limit 4k linux page size config to 64TB. Fixes: `f6eedbba7a` ("powerpc/mm/hash: Increase VA range to 128TB") Reported-by: Hugh Dickins <hughd@google.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-08 20:42:56 +10:00
Frederic Barrat	cec422c11c	cxl: Fix error path on bad ioctl Fix error path if we can't copy user structure on CXL_IOCTL_START_WORK ioctl. We shouldn't unlock the context status mutex as it was not locked (yet). Fixes: `0712dc7e73` ("cxl: Fix issues when unmapping contexts") Cc: stable@vger.kernel.org # v3.19+ Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-08 20:42:31 +10:00
Madhavan Srinivasan	8c218578fc	powerpc/perf: Fix Power9 test_adder fields Commit `8d911904f3` ('powerpc/perf: Add restrictions to PMC5 in power9 DD1') was added to restrict the use of PMC5 in Power9 DD1. Intention was to disable the use of PMC5 using raw event code. But instead of updating the power9_isa207_pmu structure (used on DD1), the commit incorrectly updated the power9_pmu structure. Fix it. Fixes: `8d911904f3` ("powerpc/perf: Add restrictions to PMC5 in power9 DD1") Reported-by: Shriya <shriyak@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Tested-by: Shriya <shriyak@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-06 21:21:19 +10:00
Michael Ellerman	ba4a648f12	powerpc/numa: Fix percpu allocations to be NUMA aware In commit `8c27226119` ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we switched to the generic implementation of cpu_to_node(), which uses a percpu variable to hold the NUMA node for each CPU. Unfortunately we neglected to notice that we use cpu_to_node() in the allocation of our percpu areas, leading to a chicken and egg problem. In practice what happens is when we are setting up the percpu areas, cpu_to_node() reports that all CPUs are on node 0, so we allocate all percpu areas on node 0. This is visible in the dmesg output, as all pcpu allocs being in group 0: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31 pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39 pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47 To fix it we need an early_cpu_to_node() which can run prior to percpu being setup. We already have the numa_cpu_lookup_table we can use, so just plumb it in. With the patch dmesg output shows two groups, 0 and 1: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31 pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39 pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47 We can also check the data_offset in the paca of various CPUs, with the fix we see: CPU 0: data_offset = 0x0ffe8b0000 CPU 24: data_offset = 0x1ffe5b0000 And we can see from dmesg that CPU 24 has an allocation on node 1: node 0: [mem 0x0000000000000000-0x0000000fffffffff] node 1: [mem 0x0000001000000000-0x0000001fffffffff] Cc: stable@vger.kernel.org # v3.16+ Fixes: `8c27226119` ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-06 21:19:46 +10:00
Nicholas Piggin	90df4bfb4d	powerpc/64s: Machine check handle ifetch from foreign real address for POWER9 The i-side 0111b machine check, which is "Instruction Fetch to foreign address space", was missed by `7b9f71f974` ("powerpc/64s: POWER9 machine check handler"). The POWER9 processor core considers host real addresses with a nonzero value in RA(8:12) as foreign address space, accessible only by the copy and paste instructions. The copy and paste instruction pair can be used to invoke the Nest accelerators via the Virtual Accelerator Switchboard (VAS). It is an error for any regular load/store or ifetch to go to a foreign addresses. When relocation is on, this causes an MMU exception. When relocation is off, a machine check exception. It is possible to trigger this machine check by branching to a foreign address with MSR[IR]=0. Fixes: `7b9f71f974` ("powerpc/64s: POWER9 machine check handler") Reported-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-06 21:17:15 +10:00
Dan Carpenter	58d876fa71	cxl: Unlock on error in probe We should unlock if get_cxl_adapter() fails. Fixes: `594ff7d067` ("cxl: Support to flash a new image on the adapter from a guest") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-06 19:23:52 +10:00
Vaibhav Jain	b3aa20ba2b	cxl: Avoid double free_irq() for psl,slice interrupts During an eeh call to cxl_remove can result in double free_irq of psl,slice interrupts. This can happen if perst_reloads_same_image == 1 and call to cxl_configure_adapter() fails during slot_reset callback. In such a case we see a kernel oops with following back-trace: Oops: Kernel access of bad area, sig: 11 [#1] Call Trace: free_irq+0x88/0xd0 (unreliable) cxl_unmap_irq+0x20/0x40 [cxl] cxl_native_release_psl_irq+0x78/0xd8 [cxl] pci_deconfigure_afu+0xac/0x110 [cxl] cxl_remove+0x104/0x210 [cxl] pci_device_remove+0x6c/0x110 device_release_driver_internal+0x204/0x2e0 pci_stop_bus_device+0xa0/0xd0 pci_stop_and_remove_bus_device+0x28/0x40 pci_hp_remove_devices+0xb0/0x150 pci_hp_remove_devices+0x68/0x150 eeh_handle_normal_event+0x140/0x580 eeh_handle_event+0x174/0x360 eeh_event_handler+0x1e8/0x1f0 This patch fixes the issue of double free_irq by checking that variables that hold the virqs (err_hwirq, serr_hwirq, psl_virq) are not '0' before un-mapping and resetting these variables to '0' when they are un-mapped. Cc: stable@vger.kernel.org Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-06 19:09:27 +10:00
Breno Leitao	7f22ced437	powerpc/kernel: Initialize load_tm on task creation Currently tsk->thread.load_tm is not initialized in the task creation and can contain garbage on a new task. This is an undesired behaviour, since it affects the timing to enable and disable the transactional memory laziness (disabling and enabling the MSR TM bit, which affects TM reclaim and recheckpoint in the scheduling process). Fixes: `5d176f751e` ("powerpc: tm: Enable transactional memory (TM) lazily for userspace") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-06 19:09:22 +10:00
Christophe Leroy	4386c096c2	powerpc/mm: Rename map_page() to map_kernel_page() on 32-bit These two functions implement the same semantics, so unify their naming so we can share code that calls them. The longer name is more descriptive so use it. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-05 19:59:03 +10:00
Balbir Singh	d2485644c7	powerpc/mm/hugetlb: Add support for page accounting Add __GFP_ACCOUNT to __hugepte_alloc() Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-05 19:03:12 +10:00
Balbir Singh	abd667be15	powerpc/mm/book(e)(3s)/32: Add page table accounting Add support in pte_alloc_one() and pgd_alloc() by passing __GFP_ACCOUNT in the flags Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-05 19:03:11 +10:00
Balbir Singh	de3b87611d	powerpc/mm/book(e)(3s)/64: Add page table accounting Introduce a helper pgtable_gfp_flags() which just returns the current gfp flags and adds __GFP_ACCOUNT to account for page table allocation. The generic helper is added to include/asm/pgalloc.h and has two variants - WARNING ugly bits ahead 1. If the header is included from a module, no check for mm == &init_mm is done, since init_mm is not exported 2. For kernel includes, the check is done and required see (`3e79ec7` arch: x86: charge page tables to kmemcg) The fundamental assumption is that no module should be doing pgd/pud/pmd and pte alloc's on behalf of init_mm directly. NOTE: This adds an overhead to pmd/pud/pgd allocations similar to x86. The other alternative was to implement pmd_alloc_kernel/pud_alloc_kernel and pgd_alloc_kernel with their offset variants. For 4k page size, pte_alloc_one no longer calls pte_alloc_one_kernel. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-05 19:03:10 +10:00
Balbir Singh	c5cee6421c	powerpc/mm/hash: Do a local flush if possible when no batch is active Currently in hpte_need_flush() if there is no batch pending we always do a global TLB flush, which is inefficient if the mm has never run on another thread. Instead do the same check that __flush_tlb_pending() does and check if a local flush is sufficient when batch->active is false. Instead of open-coding it we use mm_is_thread_local(). Signed-off-by: Balbir Singh <bsingharora@gmail.com> [mpe: Don't use a local, just inline mm_is_thread_local()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-05 19:02:55 +10:00
Yang Li	64d09f5ecb	MAINTAINERS: Update my email address from freescale to nxp Signed-off-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-05 16:58:13 +10:00
Yang Li	c67ec70101	MAINTAINERS: Update entry for Freescale SoC drivers Add myself as the maintainer for drivers/fsl/soc/ and fix the scope for device tree bindings. Signed-off-by: Li Yang <leoyang.li@nxp.com> Acked-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-05 16:58:10 +10:00
Nicholas Piggin	b27ce77685	selftests/powerpc: context_switch use private futexes with threads This reduces overhead of mutex locking and increases context switch rate significantly (which helps to measure and profile the context switch path). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-05 16:55:01 +10:00
Colin Ian King	b802ab46ba	powerpc: Fix some spelling mistakes Collation of some spelling fixes from Colin. Attemping -> Attempting intialized -> initialized missmanaged -> mismanaged Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-05 16:50:15 +10:00
Breno Leitao	1195892c09	powerpc/kernel: Fix FP and vector register restoration Currently tsk->thread->load_vec and load_fp are not initialized during task creation, which can lead to garbage values in these variables (non-zero values). These variables will be checked later in restore_math() to validate if the FP and vector registers are being utilized. Since these values might be non-zero, the restore_math() will continue to save the FP and vectors even if they were never utilized by the userspace application. load_fp and load_vec counters will then overflow (they wrap at 255) and the FP and Altivec will be finally disabled, but before that condition is reached (counter overflow) several context switches will have restored FP and vector registers without need, causing a performance degradation. Fixes: `70fe3d980f` ("powerpc: Restore FPU/VEC/VSX if previously used") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Gustavo Romero <gusbromero@gmail.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-05 15:55:30 +10:00
Matt Brown	f718d426d7	powerpc/lib/xor_vmx: Ensure no altivec code executes before enable_kernel_altivec() The xor_vmx.c file is used for the RAID5 xor operations. In these functions altivec is enabled to run the operation and then disabled. The code uses enable_kernel_altivec() around the core of the algorithm, however the whole file is built with -maltivec, so the compiler is within its rights to generate altivec code anywhere. This has been seen at least once in the wild: 0:mon> di $xor_altivec_2 c0000000000b97d0 3c4c01d9 addis r2,r12,473 c0000000000b97d4 3842db30 addi r2,r2,-9424 c0000000000b97d8 7c0802a6 mflr r0 c0000000000b97dc f8010010 std r0,16(r1) c0000000000b97e0 60000000 nop c0000000000b97e4 7c0802a6 mflr r0 c0000000000b97e8 faa1ffa8 std r21,-88(r1) ... c0000000000b981c f821ff41 stdu r1,-192(r1) c0000000000b9820 7f8101ce stvx v28,r1,r0 <-- POP c0000000000b9824 38000030 li r0,48 c0000000000b9828 7fa101ce stvx v29,r1,r0 ... c0000000000b984c 4bf6a06d bl c0000000000238b8 # enable_kernel_altivec This patch splits the non-altivec code into xor_vmx_glue.c which calls the altivec functions in xor_vmx.c. By compiling xor_vmx_glue.c without -maltivec we can guarantee that altivec instruction will not be executed outside of the enable/disable block. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> [mpe: Rework change log and include disassembly] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 20:17:52 +10:00
Hari Bathini	48a316e350	powerpc/fadump: Set an upper limit for boot memory size By default, 5% of system RAM is reserved for preserving boot memory. Alternatively, a user can specify the amount of memory to reserve. See Documentation/powerpc/firmware-assisted-dump.txt for details. In addition to the memory reserved for preserving boot memory, some more memory is reserved, to save HPTE region, CPU state data and ELF core headers. Memory Reservation during first kernel looks like below: Low memory Top of memory 0 boot memory size \| \| \| \|<--Reserved dump area -->\| V V \| Permanent Reservation V +-----------+----------/ /----------+---+----+-----------+----+ \| \| \|CPU\|HPTE\| DUMP \|ELF \| +-----------+----------/ /----------+---+----+-----------+----+ \| ^ \| \| \ / ------------------------------------------- Boot memory content gets transferred to reserved area by firmware at the time of crash This implicitly means that the sum of the sizes of boot memory, CPU state data, HPTE region, DUMP preserving area and ELF core headers can't be greater than the total memory size. But currently, a user is allowed to specify any value as boot memory size. So, the above rule is violated when a boot memory size around 50% of the total available memory is specified. As the kernel is not handling this currently, it may lead to undefined behavior. Fix it by setting an upper limit for boot memory size to 25% of the total available memory. Also, instead of using memblock_end_of_DRAM(), which doesn't take the holes, if any, in the memory layout into account, use memblock_phys_mem_size() to calculate the percentage of total available memory. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 20:16:50 +10:00
Hari Bathini	e7467dc694	powerpc/fadump: Update comment about offset where fadump is reserved With commit `f6e6bedb77` ("powerpc/fadump: Reserve memory at an offset closer to bottom of RAM"), memory for fadump is no longer reserved at the top of RAM. But there are still a few places which say so. Change them appropriately. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 20:16:49 +10:00
Hari Bathini	81d9eca502	powerpc/fadump: Add a warning when 'fadump_reserve_mem=' is used With commit `11550dc0a0` ("powerpc/fadump: reuse crashkernel parameter for fadump memory reservation"), 'fadump_reserve_mem=' parameter is deprecated in favor of 'crashkernel=' parameter. Add a warning if 'fadump_reserve_mem=' is still used. Fixes: `11550dc0a0` ("powerpc/fadump: reuse crashkernel parameter for fadump memory reservation") Suggested-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> [mpe: Unsplit long printk strings] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 20:16:35 +10:00
Michal Suchanek	98b8cd7f75	powerpc/fadump: Return error when fadump registration fails - log an error message when registration fails and no error code listed in the switch is returned - translate the hv error code to posix error code and return it from fw_register - return the posix error code from fw_register to the process writing to sysfs - return EEXIST on re-registration - return success on deregistration when fadump is not registered - return ENODEV when no memory is reserved for fadump Signed-off-by: Michal Suchanek <msuchanek@suse.de> Tested-by: Hari Bathini <hbathini@linux.vnet.ibm.com> [mpe: Use pr_err() to shrink the error print] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 19:23:57 +10:00
Christophe Leroy	f782ddf297	powerpc: Remove __ilog2()s and use generic ones With the __ilog2() function as defined in arch/powerpc/include/asm/bitops.h, GCC will not optimise the code in case of constant parameter. The generic ilog2() function in include/linux/log2.h is written to handle the case of the constant parameter. This patch discards the three __ilog2() functions and defines __ilog2() as ilog2() For non constant calls, the generated code is doing the same: int test__ilog2(unsigned long x) { return __ilog2(x); } int test__ilog2_u32(u32 n) { return __ilog2_u32(n); } int test__ilog2_u64(u64 n) { return __ilog2_u64(n); } On PPC32 before the patch: 00000000 <test__ilog2>: 0: 7c 63 00 34 cntlzw r3,r3 4: 20 63 00 1f subfic r3,r3,31 8: 4e 80 00 20 blr 0000000c <test__ilog2_u32>: c: 7c 63 00 34 cntlzw r3,r3 10: 20 63 00 1f subfic r3,r3,31 14: 4e 80 00 20 blr On PPC32 after the patch: 00000000 <test__ilog2>: 0: 7c 63 00 34 cntlzw r3,r3 4: 20 63 00 1f subfic r3,r3,31 8: 4e 80 00 20 blr 0000000c <test__ilog2_u32>: c: 7c 63 00 34 cntlzw r3,r3 10: 20 63 00 1f subfic r3,r3,31 14: 4e 80 00 20 blr On PPC64 before the patch: 0000000000000000 <.test__ilog2>: 0: 7c 63 00 74 cntlzd r3,r3 4: 20 63 00 3f subfic r3,r3,63 8: 7c 63 07 b4 extsw r3,r3 c: 4e 80 00 20 blr 0000000000000010 <.test__ilog2_u32>: 10: 7c 63 00 34 cntlzw r3,r3 14: 20 63 00 1f subfic r3,r3,31 18: 7c 63 07 b4 extsw r3,r3 1c: 4e 80 00 20 blr 0000000000000020 <.test__ilog2_u64>: 20: 7c 63 00 74 cntlzd r3,r3 24: 20 63 00 3f subfic r3,r3,63 28: 7c 63 07 b4 extsw r3,r3 2c: 4e 80 00 20 blr On PPC64 after the patch: 0000000000000000 <.test__ilog2>: 0: 7c 63 00 74 cntlzd r3,r3 4: 20 63 00 3f subfic r3,r3,63 8: 7c 63 07 b4 extsw r3,r3 c: 4e 80 00 20 blr 0000000000000010 <.test__ilog2_u32>: 10: 7c 63 00 34 cntlzw r3,r3 14: 20 63 00 1f subfic r3,r3,31 18: 7c 63 07 b4 extsw r3,r3 1c: 4e 80 00 20 blr 0000000000000020 <.test__ilog2_u64>: 20: 7c 63 00 74 cntlzd r3,r3 24: 20 63 00 3f subfic r3,r3,63 28: 7c 63 07 b4 extsw r3,r3 2c: 4e 80 00 20 blr Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 19:23:56 +10:00
Christophe Leroy	22ef33b368	powerpc: Replace ffz() by equivalent generic function With the ffz() function as defined in arch/powerpc/include/asm/bitops.h GCC will not optimise the code in case of constant parameter. This patch replaces ffz() by the generic function. The generic ffz(x) expects to never be called with ~x == 0 as written in the comment in include/asm-generic/bitops/ffz.h The only user of ffz() within arch/powerpc/ is platforms/512x/mpc5121_ads_cpld.c, which checks if x is not 0xff For non constant calls, the generated code is doing the same: unsigned long testffz(unsigned long x) { return ffz(x); } On PPC32, before the patch: 00000018 <testffz>: 18: 7c 63 18 f9 not. r3,r3 1c: 40 82 00 0c bne 28 <testffz+0x10> 20: 38 60 00 20 li r3,32 24: 4e 80 00 20 blr 28: 7d 23 00 d0 neg r9,r3 2c: 7d 23 18 38 and r3,r9,r3 30: 7c 63 00 34 cntlzw r3,r3 34: 20 63 00 1f subfic r3,r3,31 38: 4e 80 00 20 blr On PPC32, after the patch: 00000018 <testffz>: 18: 39 23 00 01 addi r9,r3,1 1c: 7d 23 18 78 andc r3,r9,r3 20: 7c 63 00 34 cntlzw r3,r3 24: 20 63 00 1f subfic r3,r3,31 28: 4e 80 00 20 blr On PPC64, before the patch: 0000000000000030 <.testffz>: 30: 7c 60 18 f9 not. r0,r3 34: 38 60 00 40 li r3,64 38: 4d 82 00 20 beqlr 3c: 7c 60 00 d0 neg r3,r0 40: 7c 63 00 38 and r3,r3,r0 44: 7c 63 00 74 cntlzd r3,r3 48: 20 63 00 3f subfic r3,r3,63 4c: 7c 63 07 b4 extsw r3,r3 50: 4e 80 00 20 blr On PPC64, after the patch: 0000000000000030 <.testffz>: 30: 38 03 00 01 addi r0,r3,1 34: 7c 03 18 78 andc r3,r0,r3 38: 7c 63 00 74 cntlzd r3,r3 3c: 20 63 00 3f subfic r3,r3,63 40: 4e 80 00 20 blr Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 19:23:55 +10:00
Christophe Leroy	2fcff790dc	powerpc: Use builtin functions for fls()/__fls()/fls64() With the fls() functions as defined in arch/powerpc/include/asm/bitops.h GCC will not optimise the code in case of constant parameter. This patch replaces __fls() by the builtin function, and modifies fls() and fls64() to use builtins instead of inline assembly For non constant calls, the generated code is doing the same: int testfls(unsigned int x) { return fls(x); } unsigned long test__fls(unsigned long x) { return __fls(x); } int testfls64(__u64 x) { return fls64(x); } On PPC32, before the patch: 00000064 <testfls>: 64: 7c 63 00 34 cntlzw r3,r3 68: 20 63 00 20 subfic r3,r3,32 6c: 4e 80 00 20 blr 00000070 <test__fls>: 70: 7c 63 00 34 cntlzw r3,r3 74: 20 63 00 1f subfic r3,r3,31 78: 4e 80 00 20 blr 0000007c <testfls64>: 7c: 2c 03 00 00 cmpwi r3,0 80: 40 82 00 10 bne 90 <testfls64+0x14> 84: 7c 83 00 34 cntlzw r3,r4 88: 20 63 00 20 subfic r3,r3,32 8c: 4e 80 00 20 blr 90: 7c 63 00 34 cntlzw r3,r3 94: 20 63 00 40 subfic r3,r3,64 98: 4e 80 00 20 blr On PPC32, after the patch: 00000054 <testfls>: 54: 7c 63 00 34 cntlzw r3,r3 58: 20 63 00 20 subfic r3,r3,32 5c: 4e 80 00 20 blr 00000060 <test__fls>: 60: 7c 63 00 34 cntlzw r3,r3 64: 20 63 00 1f subfic r3,r3,31 68: 4e 80 00 20 blr 0000006c <testfls64>: 6c: 2c 03 00 00 cmpwi r3,0 70: 41 82 00 10 beq 80 <testfls64+0x14> 74: 7c 63 00 34 cntlzw r3,r3 78: 20 63 00 40 subfic r3,r3,64 7c: 4e 80 00 20 blr 80: 7c 83 00 34 cntlzw r3,r4 84: 20 63 00 40 subfic r3,r3,32 88: 4e 80 00 20 blr On PPC64, before the patch: 00000000000000a0 <.testfls>: a0: 7c 63 00 34 cntlzw r3,r3 a4: 20 63 00 20 subfic r3,r3,32 a8: 7c 63 07 b4 extsw r3,r3 ac: 4e 80 00 20 blr 00000000000000b0 <.test__fls>: b0: 7c 63 00 74 cntlzd r3,r3 b4: 20 63 00 3f subfic r3,r3,63 b8: 7c 63 07 b4 extsw r3,r3 bc: 4e 80 00 20 blr 00000000000000c0 <.testfls64>: c0: 7c 63 00 74 cntlzd r3,r3 c4: 20 63 00 40 subfic r3,r3,64 c8: 7c 63 07 b4 extsw r3,r3 cc: 4e 80 00 20 blr On PPC64, after the patch: 0000000000000090 <.testfls>: 90: 7c 63 00 34 cntlzw r3,r3 94: 20 63 00 20 subfic r3,r3,32 98: 7c 63 07 b4 extsw r3,r3 9c: 4e 80 00 20 blr 00000000000000a0 <.test__fls>: a0: 7c 63 00 74 cntlzd r3,r3 a4: 20 63 00 3f subfic r3,r3,63 a8: 4e 80 00 20 blr ac: 60 00 00 00 nop 00000000000000b0 <.testfls64>: b0: 7c 63 00 74 cntlzd r3,r3 b4: 20 63 00 40 subfic r3,r3,64 b8: 7c 63 07 b4 extsw r3,r3 bc: 4e 80 00 20 blr Those builtins have been in GCC since at least 3.4.6 (see https://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Other-Builtins.html ) Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 19:23:55 +10:00
Christophe Leroy	f83647d642	powerpc: Discard ffs()/__ffs() function and use builtin functions instead With the ffs() function as defined in arch/powerpc/include/asm/bitops.h GCC will not optimise the code in case of constant parameter, as shown by the small exemple below. int ffs_test(void) { return 4 << ffs(31); } c0012334 <ffs_test>: c0012334: 39 20 00 01 li r9,1 c0012338: 38 60 00 04 li r3,4 c001233c: 7d 29 00 34 cntlzw r9,r9 c0012340: 21 29 00 20 subfic r9,r9,32 c0012344: 7c 63 48 30 slw r3,r3,r9 c0012348: 4e 80 00 20 blr With this patch, the same function will compile as follows: c0012334 <ffs_test>: c0012334: 38 60 00 08 li r3,8 c0012338: 4e 80 00 20 blr The same happens with __ffs() For non constant calls, the generated code is doing the same, allthought it is slightly different on 64 bits for ffs(): unsigned long test__ffs(unsigned long x) { return __ffs(x); } int testffs(int x) { return ffs(x); } On PPC32, before the patch: 0000003c <test__ffs>: 3c: 7d 23 00 d0 neg r9,r3 40: 7d 23 18 38 and r3,r9,r3 44: 7c 63 00 34 cntlzw r3,r3 48: 20 63 00 1f subfic r3,r3,31 4c: 4e 80 00 20 blr 00000050 <testffs>: 50: 7d 23 00 d0 neg r9,r3 54: 7d 23 18 38 and r3,r9,r3 58: 7c 63 00 34 cntlzw r3,r3 5c: 20 63 00 20 subfic r3,r3,32 60: 4e 80 00 20 blr On PPC32, after the patch: 0000002c <test__ffs>: 2c: 7d 23 00 d0 neg r9,r3 30: 7d 23 18 38 and r3,r9,r3 34: 7c 63 00 34 cntlzw r3,r3 38: 20 63 00 1f subfic r3,r3,31 3c: 4e 80 00 20 blr 00000040 <testffs>: 40: 7d 23 00 d0 neg r9,r3 44: 7d 23 18 38 and r3,r9,r3 48: 7c 63 00 34 cntlzw r3,r3 4c: 20 63 00 20 subfic r3,r3,32 50: 4e 80 00 20 blr On PPC64, before the patch: 0000000000000060 <.test__ffs>: 60: 7c 03 00 d0 neg r0,r3 64: 7c 03 18 38 and r3,r0,r3 68: 7c 63 00 74 cntlzd r3,r3 6c: 20 63 00 3f subfic r3,r3,63 70: 7c 63 07 b4 extsw r3,r3 74: 4e 80 00 20 blr 0000000000000080 <.testffs>: 80: 7c 03 00 d0 neg r0,r3 84: 7c 03 18 38 and r3,r0,r3 88: 7c 63 00 74 cntlzd r3,r3 8c: 20 63 00 40 subfic r3,r3,64 90: 7c 63 07 b4 extsw r3,r3 94: 4e 80 00 20 blr On PPC64, after the patch: 0000000000000050 <.test__ffs>: 50: 7c 03 00 d0 neg r0,r3 54: 7c 03 18 38 and r3,r0,r3 58: 7c 63 00 74 cntlzd r3,r3 5c: 20 63 00 3f subfic r3,r3,63 60: 4e 80 00 20 blr 0000000000000070 <.testffs>: 70: 7c 03 00 d0 neg r0,r3 74: 7c 03 18 38 and r3,r0,r3 78: 7c 63 00 34 cntlzw r3,r3 7c: 20 63 00 20 subfic r3,r3,32 80: 7c 63 07 b4 extsw r3,r3 84: 4e 80 00 20 blr (ffs() operates on an int so cntlzw is equivalent to cntlzd) In addition, when reading the generated vmlinux, we can observe that with the builtin functions, GCC sometimes efficiently spreads the instructions within the generated functions while the inline assembly force them to remain grouped together. __builtin_ffs() is already used in arch/powerpc/include/asm/page_32.h Those builtins have been in GCC since at least 3.4.6 (see https://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Other-Builtins.html ) Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 19:23:54 +10:00
Christophe Leroy	45cb08f479	powerpc: Handle simultaneous interrupts at once It often happens to have simultaneous interrupts, for instance when having double Ethernet attachment. With the current implementation, we suffer the cost of kernel entry/exit for each interrupt. This patch introduces a loop in __do_irq() to handle all interrupts at once before returning. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 19:20:44 +10:00
Christophe Leroy	3c29b60388	powerpc/8xx: fix mpc8xx_get_irq() return on no irq IRQ 0 is a valid HW interrupt. So get_irq() shall return 0 when there is no irq, instead of returning irq_linear_revmap(... ,0) Fixes: `f2a0bd3753` ("[POWERPC] 8xx: powerpc port of core CPM PIC") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 19:20:44 +10:00
Christophe Leroy	362957c27e	powerpc/40x: Clear MSR_DR in one insn instead of two Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 19:20:43 +10:00
Christophe Leroy	92aa2fe039	powerpc/mm: The 8xx doesn't call do_page_fault() for breakpoints The 8xx has a dedicated exception for breakpoints, that directly calls do_break() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 19:20:12 +10:00
Christophe Leroy	da929f6af4	powerpc/mm: Evaluate user_mode(regs) only once in do_page_fault() Analysis of the assembly code shows that when using user_mode(regs), at least the 'andi.' is redone all the time, and also the 'lwz ,132(r31)' most of the time. With the new form, the 'is_user' is mapped to cr4, then all further use of is_user results in just things like 'beq cr4,218 <do_page_fault+0x218>' Without the patch: 50: 81 1e 00 84 lwz r8,132(r30) 54: 71 09 40 00 andi. r9,r8,16384 58: 40 82 00 0c bne 64 <do_page_fault+0x64> 84: 81 3e 00 84 lwz r9,132(r30) 8c: 71 2a 40 00 andi. r10,r9,16384 90: 41 a2 01 64 beq 1f4 <do_page_fault+0x1f4> d4: 81 3e 00 84 lwz r9,132(r30) dc: 71 28 40 00 andi. r8,r9,16384 e0: 41 82 02 08 beq 2e8 <do_page_fault+0x2e8> 108: 81 3e 00 84 lwz r9,132(r30) 110: 71 28 40 00 andi. r8,r9,16384 118: 41 82 02 28 beq 340 <do_page_fault+0x340> 1e4: 81 3e 00 84 lwz r9,132(r30) 1e8: 71 2a 40 00 andi. r10,r9,16384 1ec: 40 82 01 68 bne 354 <do_page_fault+0x354> 228: 81 3e 00 84 lwz r9,132(r30) 22c: 71 28 40 00 andi. r8,r9,16384 230: 41 82 ff c4 beq 1f4 <do_page_fault+0x1f4> 288: 71 2a 40 00 andi. r10,r9,16384 294: 41 a2 fe 60 beq f4 <do_page_fault+0xf4> 50c: 81 3e 00 84 lwz r9,132(r30) 514: 71 2a 40 00 andi. r10,r9,16384 518: 40 a2 fc e0 bne 1f8 <do_page_fault+0x1f8> 534: 81 3e 00 84 lwz r9,132(r30) 53c: 71 2a 40 00 andi. r10,r9,16384 540: 41 82 fc b8 beq 1f8 <do_page_fault+0x1f8> This patch creates a local var called 'is_user' which contains the result of user_mode(regs) With the patch: 20: 81 03 00 84 lwz r8,132(r3) 48: 55 09 97 fe rlwinm r9,r8,18,31,31 58: 2e 09 00 00 cmpwi cr4,r9,0 5c: 40 92 00 0c bne cr4,68 <do_page_fault+0x68> 88: 41 b2 01 90 beq cr4,218 <do_page_fault+0x218> d4: 40 92 01 d0 bne cr4,2a4 <do_page_fault+0x2a4> 120: 41 b2 00 f8 beq cr4,218 <do_page_fault+0x218> 138: 41 b2 ff a0 beq cr4,d8 <do_page_fault+0xd8> 1d4: 40 92 00 e0 bne cr4,2b4 <do_page_fault+0x2b4> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 19:19:45 +10:00
Christophe Leroy	97a011e69b	powerpc/mm: Remove a redundant test in do_page_fault() The result of (trap == 0x400) is already in is_exec. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 19:18:34 +10:00
Christophe Leroy	e8de85ca32	powerpc/mm: Only call store_updates_sp() on stores in do_page_fault() Function store_updates_sp() checks whether the faulting instruction is a store updating r1. Therefore we can limit its calls to store exceptions. This patch is an improvement of commit `a7a9dcd882` ("powerpc: Avoid taking a data miss on every userspace instruction miss") With the same microbenchmark app, run with 500 as argument, on an MPC885 we get: Before this patch: 152000 DTLB misses After this patch: 147000 DTLB misses Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 19:10:24 +10:00
Christophe Leroy	9affa9e228	powerpc/mm: Remove __this_fixmap_does_not_exist() This function has not been used since commit `9494a1e842` ("powerpc: use generic fixmap.h) Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 19:09:53 +10:00
Balbir Singh	e63739b168	powerpc/mm/ptdump: Dump the first entry of the linear mapping as well The check in hpte_find() should be < and not <= for PAGE_OFFSET Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-06-02 19:09:52 +10:00

1 2 3 4 5 ...

678295 Commits All Branches Search

678295 Commits

All Branches