linux

Commit Graph

Author	SHA1	Message	Date
Michael Neuling	cf13435b73	powerpc/tm: Fix userspace r13 corruption When we treclaim we store the userspace checkpointed r13 to a scratch SPR and then later save the scratch SPR to the user thread struct. Unfortunately, this doesn't work as accessing the user thread struct can take an SLB fault and the SLB fault handler will write the same scratch SPRG that now contains the userspace r13. To fix this, we store r13 to the kernel stack (which can't fault) before we access the user thread struct. Found by running P8 guest + powervm + disable_1tb_segments + TM. Seen as a random userspace segfault with r13 looking like a kernel address. Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-25 22:51:08 +10:00
Michael Bringmann	8604895a34	powerpc/pseries: Fix unitialized timer reset on migration After migration of a powerpc LPAR, the kernel executes code to update the system state to reflect new platform characteristics. Such changes include modifications to device tree properties provided to the system by PHYP. Property notifications received by the post_mobility_fixup() code are passed along to the kernel in general through a call to of_update_property() which in turn passes such events back to all modules through entries like the '.notifier_call' function within the NUMA module. When the NUMA module updates its state, it resets its event timer. If this occurs after a previous call to stop_topology_update() or on a system without VPHN enabled, the code runs into an unitialized timer structure and crashes. This patch adds a safety check along this path toward the problem code. An example crash log is as follows. ibmvscsi 30000081: Re-enabling adapter! ------------[ cut here ]------------ kernel BUG at kernel/time/timer.c:958! Oops: Exception in kernel mode, sig: 5 [#1] LE SMP NR_CPUS=2048 NUMA pSeries Modules linked in: nfsv3 nfs_acl nfs tcp_diag udp_diag inet_diag lockd unix_diag af_packet_diag netlink_diag grace fscache sunrpc xts vmx_crypto pseries_rng sg binfmt_misc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod CPU: 11 PID: 3067 Comm: drmgr Not tainted 4.17.0+ #179 ... NIP mod_timer+0x4c/0x400 LR reset_topology_timer+0x40/0x60 Call Trace: 0xc0000003f9407830 (unreliable) reset_topology_timer+0x40/0x60 dt_update_callback+0x100/0x120 notifier_call_chain+0x90/0x100 __blocking_notifier_call_chain+0x60/0x90 of_property_notify+0x90/0xd0 of_update_property+0x104/0x150 update_dt_property+0xdc/0x1f0 pseries_devicetree_update+0x2d0/0x510 post_mobility_fixup+0x7c/0xf0 migration_store+0xa4/0xc0 kobj_attr_store+0x30/0x60 sysfs_kf_write+0x64/0xa0 kernfs_fop_write+0x16c/0x240 __vfs_write+0x40/0x200 vfs_write+0xc8/0x240 ksys_write+0x5c/0x100 system_call+0x58/0x6c Fixes: `5d88aa85c0` ("powerpc/pseries: Update CPU maps when device tree is updated") Cc: stable@vger.kernel.org # v3.10+ Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-24 21:05:38 +10:00
Thiago Jung Bauermann	c716a25b9b	powerpc/pkeys: Fix reading of ibm, processor-storage-keys property scan_pkey_feature() uses of_property_read_u32_array() to read the ibm,processor-storage-keys property and calls be32_to_cpu() on the value it gets. The problem is that of_property_read_u32_array() already returns the value converted to the CPU byte order. The value of pkeys_total ends up more or less sane because there's a min() call in pkey_initialize() which reduces pkeys_total to 32. So in practice the kernel ignores the fact that the hypervisor reserved one key for itself (the device tree advertises 31 keys in my test VM). This is wrong, but the effect in practice is that when a process tries to allocate the 32nd key, it gets an -EINVAL error instead of -ENOSPC which would indicate that there aren't any keys available Fixes: `cf43d3b264` ("powerpc: Enable pkey subsystem") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-20 22:49:46 +10:00
Christophe Leroy	85682a7e3b	powerpc: fix csum_ipv6_magic() on little endian platforms On little endian platforms, csum_ipv6_magic() keeps len and proto in CPU byte order. This generates a bad results leading to ICMPv6 packets from other hosts being dropped by powerpc64le platforms. In order to fix this, len and proto should be converted to network byte order ie bigendian byte order. However checksumming 0x12345678 and 0x56341278 provide the exact same result so it is enough to rotate the sum of len and proto by 1 byte. PPC32 only support bigendian so the fix is needed for PPC64 only Fixes: `e9c4943a10` ("powerpc: Implement csum_ipv6_magic in assembly") Reported-by: Jianlin Shi <jishi@redhat.com> Reported-by: Xin Long <lucien.xin@gmail.com> Cc: <stable@vger.kernel.org> # 4.18+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Tested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-20 21:12:28 +10:00
Alexey Kardashevskiy	7233b8cab3	powerpc/powernv/ioda2: Reduce upper limit for DMA window size (again) mpe: This was fixed originally in commit `d3d4ffaae4` ("powerpc/powernv/ioda2: Reduce upper limit for DMA window size"), but contrary to what the merge commit says was inadvertently lost by me in commit `ce57c6610c` ("Merge branch 'topic/ppc-kvm' into next") which brought in changes that moved the code to a new file. So reapply it to the new file. Original commit message follows: We use PHB in mode1 which uses bit 59 to select a correct DMA window. However there is mode2 which uses bits 59:55 and allows up to 32 DMA windows per a PE. Even though documentation does not clearly specify that, it seems that the actual hardware does not support bits 59:55 even in mode1, in other words we can create a window as big as 1<<58 but DMA simply won't work. This reduces the upper limit from 59 to 55 bits to let the userspace know about the hardware limits. Fixes: `ce57c6610c` ("Merge branch 'topic/ppc-kvm' into next") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-20 14:31:03 +10:00
Hari Bathini	0823c68b05	powerpc/fadump: re-register firmware-assisted dump if already registered Firmware-Assisted Dump (FADump) needs to be registered again after any memory hot add/remove operation to update the crash memory ranges. But currently, the kernel returns '-EEXIST' if we try to register without uregistering it first. This could expose the system to racing issues while unregistering and registering FADump from userspace during udev events. Spare the userspace of this and let it be taken care of in the kernel space for a simpler interface. Since this change, running 'echo 1 > /sys/kernel/fadump_registered' would result in re-regisering (unregistering and registering) FADump, if it was already registered. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:08:12 +10:00
Suraj Jitindar Singh	74422e2b19	powerpc/pseries: Remove VLA from lparcfg_write() In lparcfg_write we hard code kbuf_sz and then use this as the variable length of kbuf creating a variable length array. Since we're hard coding the length anyway just define the array using this as the length and remove the need for kbuf_sz, thus removing the variable length array. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:08:12 +10:00
Suraj Jitindar Singh	ab91239942	powerpc/prom: Remove VLA in prom_check_platform_support() In prom_check_platform_support() we retrieve and parse the "ibm,arch-vec-5-platform-support" property of the chosen node. Currently we use a variable length array however to avoid this use an array of constant length 8. This property is used to indicate the supported options of vector 5 bytes 23-26 of the ibm,architecture.vec node. Each of these options is a pair of bytes, thus for 4 options we have a max length of 8 bytes. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:08:12 +10:00
Anton Blanchard	e00d93ac9a	powerpc: Fix duplicate const clang warning in user access code This re-applies commit `b91c1e3e7a` ("powerpc: Fix duplicate const clang warning in user access code") (Jun 2015) which was undone in commits: `f2ca809059` ("powerpc/sparse: Constify the address pointer in __get_user_nosleep()") (Feb 2017) `d466f6c5ca` ("powerpc/sparse: Constify the address pointer in __get_user_nocheck()") (Feb 2017) `f84ed59a61` ("powerpc/sparse: Constify the address pointer in __get_user_check()") (Feb 2017) We see a large number of duplicate const errors in the user access code when building with llvm/clang: include/linux/pagemap.h:576:8: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier] ret = __get_user(c, uaddr); The problem is we are doing const __typeof__(*(ptr)), which will hit the warning if ptr is marked const. Removing const does not seem to have any effect on GCC code generation. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:08:12 +10:00
Joel Stanley	ee9d21b3b3	powerpc/boot: Ensure _zimage_start is a weak symbol When building with clang crt0's _zimage_start is not marked weak, which breaks the build when linking the kernel image: $ objdump -t arch/powerpc/boot/crt0.o \|grep _zimage_start$ 0000000000000058 g .text 0000000000000000 _zimage_start ld: arch/powerpc/boot/wrapper.a(crt0.o): in function '_zimage_start': (.text+0x58): multiple definition of '_zimage_start'; arch/powerpc/boot/pseries-head.o:(.text+0x0): first defined here Clang requires the .weak directive to appear after the symbol is declared. The binutils manual says: This directive sets the weak attribute on the comma separated list of symbol names. If the symbols do not already exist, they will be created. So it appears this is different with clang. The only reference I could see for this was an OpenBSD mailing list post[1]. Changing it to be after the declaration fixes building with Clang, and still works with GCC. $ objdump -t arch/powerpc/boot/crt0.o \|grep _zimage_start$ 0000000000000058 w .text 0000000000000000 _zimage_start Reported to clang as https://bugs.llvm.org/show_bug.cgi?id=38921 [1] https://groups.google.com/forum/#!topic/fa.openbsd.tech/PAgKKen2YCY Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:08:12 +10:00
Joel Stanley	cbc39809a3	powerpc/configs: Update skiroot defconfig Disable new features from recent releases, and clean out some other unused options: - Enable EXPERT, so we can disable some things - Disable non-powerpc BPF decoders - Disable TASKSTATS - Disable unused syscalls - Set more things to be modules - Turn off unused network vendors - PPC_OF_BOOT_TRAMPOLINE and FB_OF are unused on powernv - Drop unused Radeon and Matrox GPU drivers - IPV6 support landed in petitboot - Bringup related command line powersave=off dropped, switch to quiet Set CONFIG_I2C_CHARDEV=y as the module is not loaded automatically, and without this i2cget etc. will fail in the skiroot environment. This defconfig gets us build coverage of KERNEL_XZ, which was broken in the 4.19 merge window for powerpc. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:08:12 +10:00
Nathan Fontenot	85a88cabad	powerpc/pseries: Disable CPU hotplug across migrations When performing partition migrations all present CPUs must be online as all present CPUs must make the H_JOIN call as part of the migration process. Once all present CPUs make the H_JOIN call, one CPU is returned to make the rtas call to perform the migration to the destination system. During testing of migration and changing the SMT state we have found instances where CPUs are offlined, as part of the SMT state change, before they make the H_JOIN call. This results in a hung system where every CPU is either in H_JOIN or offline. To prevent this this patch disables CPU hotplug during the migration process. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:08:12 +10:00
Nathan Fontenot	fd12527a1d	powerpc/pseries: Remove unneeded uses of dlpar work queue There are three instances in which dlpar hotplug events are invoked; handling a hotplug interrupt (in a kvm guest), handling a dlpar request through sysfs, and updating LMB affinity when handling a PRRN event. Only in the case of handling a hotplug interrupt do we have to put the work on a workqueue, the other cases can handle the dlpar request directly. This patch exports the handle_dlpar_errorlog() function so that dlpar hotplug events can be handled directly and updates the two instances mentioned above to use the direct invocation. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:08:12 +10:00
Nathan Fontenot	cd24e457fd	powerpc/pseries: Remove prrn_work workqueue When a PRRN event is received we are already running in a worker thread. Instead of spawning off another worker thread on the prrn_work workqueue to handle the PRRN event we can just call the PRRN handler routine directly. With this update we can also pass the scope variable for the PRRN event directly to the handler instead of it being a global variable. This patch fixes the following oops mnessage we are seeing in PRRN testing: Oops: Bad kernel stack pointer, sig: 6 [#1] SMP NR_CPUS=2048 NUMA pSeries Modules linked in: nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache binfmt_misc reiserfs vfat fat rpadlpar_io(X) rpaphp(X) tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag af_packet xfs libcrc32c dm_service_time ibmveth(X) ses enclosure scsi_transport_sas rtc_generic btrfs xor raid6_pq sd_mod ibmvscsi(X) scsi_transport_srp ipr(X) libata sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4 Supported: Yes, External 54 CPU: 7 PID: 18967 Comm: kworker/u96:0 Tainted: G X 4.4.126-94.22-default #1 Workqueue: pseries hotplug workque pseries_hp_work_fn task: c000000775367790 ti: c00000001ebd4000 task.ti: c00000070d140000 NIP: 0000000000000000 LR: 000000001fb3d050 CTR: 0000000000000000 REGS: c00000001ebd7d40 TRAP: 0700 Tainted: G X (4.4.126-94.22-default) MSR: 8000000102081000 <41,VEC,ME5 CR: 28000002 XER: 20040018 4 CFAR: 000000001fb3d084 40 419 1 3 GPR00: 000000000000000040000000000010007 000000001ffff400 000000041fffe200 GPR04: 000000000000008050000000000000000 000000001fb15fa8 0000000500000500 GPR08: 000000000001f40040000000000000001 0000000000000000 000005:5200040002 GPR12: 00000000000000005c000000007a05400 c0000000000e89f8 000000001ed9f668 GPR16: 000000001fbeff944000000001fbeff94 000000001fb545e4 0000006000000060 GPR20: ffffffffffffffff4ffffffffffffffff 0000000000000000 0000000000000000 GPR24: 00000000000000005400000001fb3c000 0000000000000000 000000001fb1b040 GPR28: 000000001fb240004000000001fb440d8 0000000000000008 0000000000000000 NIP [0000000000000000] 5 (null) LR [000000001fb3d050] 031fb3d050 Call Trace: 4 Instruction dump: 4 5:47 12 2 XXXXXXXX XXXXXXXX XXXXX4XX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXX5XX XXXXXXXX 60000000 60000000 60000000 60000000 ---[ end trace aa5627b04a7d9d6b ]--- 3NMI watchdog: BUG: soft lockup - CPU#27 stuck for 23s! [kworker/27:0:13903] Modules linked in: nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache binfmt_misc reiserfs vfat fat rpadlpar_io(X) rpaphp(X) tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag af_packet xfs libcrc32c dm_service_time ibmveth(X) ses enclosure scsi_transport_sas rtc_generic btrfs xor raid6_pq sd_mod ibmvscsi(X) scsi_transport_srp ipr(X) libata sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4 Supported: Yes, External CPU: 27 PID: 13903 Comm: kworker/27:0 Tainted: G D X 4.4.126-94.22-default #1 Workqueue: events prrn_work_fn task: c000000747cfa390 ti: c00000074712c000 task.ti: c00000074712c000 NIP: c0000000008002a8 LR: c000000000090770 CTR: 000000000032e088 REGS: c00000074712f7b0 TRAP: 0901 Tainted: G D X (4.4.126-94.22-default) MSR: 8000000100009033 <SF,EE,ME,IR,DR,RI,LE> CR: 22482044 XER: 20040000 CFAR: c0000000008002c4 SOFTE: 1 GPR00: c000000000090770 c00000074712fa30 c000000000f09800 c000000000fa1928 6:02 GPR04: c000000775f5e000 fffffffffffffffe 0000000000000001 c000000000f42db8 GPR08: 0000000000000001 0000000080000007 0000000000000000 0000000000000000 GPR12: 8006210083180000 c000000007a14400 NIP [c0000000008002a8] _raw_spin_lock+0x68/0xd0 LR [c000000000090770] mobility_rtas_call+0x50/0x100 Call Trace: 59 5 [c00000074712fa60] [c000000000090770] mobility_rtas_call+0x50/0x100 [c00000074712faf0] [c000000000090b08] pseries_devicetree_update+0xf8/0x530 [c00000074712fc20] [c000000000031ba4] prrn_work_fn+0x34/0x50 [c00000074712fc40] [c0000000000e0390] process_one_work+0x1a0/0x4e0 [c00000074712fcd0] [c0000000000e0870] worker_thread+0x1a0/0x6105:57 2 [c00000074712fd80] [c0000000000e8b18] kthread+0x128/0x150 [c00000074712fe30] [c0000000000096f8] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 2c090000 40c20010 7d40192d 40c2fff0 7c2004ac 2fa90000 40de0018 5:540030 3 e8010010 ebe1fff8 7c0803a6 4e800020 <7c210b78> e92d0000 89290009 792affe3 Signed-off-by: John Allen <jallen@linux.ibm.com> Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:08:12 +10:00
Nathan Fontenot	063b8b1251	powerpc/pseries/memory-hotplug: Only update DT once per memory DLPAR request The updates to powerpc numa and memory hotplug code now use the in-kernel LMB array instead of the device tree. This change allows the pseries memory DLPAR code to only update the device tree once after successfully handling a DLPAR request. Prior to the in-kernel LMB array, the numa code looked up the affinity for memory being added in the device tree, the code now looks this up in the LMB array. This change means the memory hotplug code can just update the affinity for an LMB in the LMB array instead of updating the device tree. This also provides a savings in kernel memory. When updating the device tree old properties are never free'ed since there is no usecount on properties. This behavior leads to a new copy of the property being allocated every time a LMB is added or removed (i.e. a request to add 100 LMBs creates 100 new copies of the property). With this update only a single new property is created when a DLPAR request completes successfully. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:08:12 +10:00
Nicholas Piggin	6977f95e63	powerpc: avoid -mno-sched-epilog on GCC 4.9 and newer Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:01:56 +10:00
Nicholas Piggin	2a056f58fd	powerpc: consolidate -mno-sched-epilog into FTRACE flags Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:01:56 +10:00
Nicholas Piggin	f2910f0e68	powerpc: remove old GCC version checks GCC 4.6 is the minimum supported now. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:01:56 +10:00
Nicholas Piggin	89ca4e126a	powerpc/64s/hash: Add a SLB preload cache When switching processes, currently all user SLBEs are cleared, and a few (exec_base, pc, and stack) are preloaded. In trivial testing with small apps, this tends to miss the heap and low 256MB segments, and it will also miss commonly accessed segments on large memory workloads. Add a simple round-robin preload cache that just inserts the last SLB miss into the head of the cache and preloads those at context switch time. Every 256 context switches, the oldest entry is removed from the cache to shrink the cache and require fewer slbmte if they are unused. Much more could go into this, including into the SLB entry reclaim side to track some LRU information etc, which would require a study of large memory workloads. But this is a simple thing we can do now that is an obvious win for common workloads. With the full series, process switching speed on the context_switch benchmark on POWER9/hash (with kernel speculation security masures disabled) increases from 140K/s to 178K/s (27%). POWER8 does not change much (within 1%), it's unclear why it does not see a big gain like POWER9. Booting to busybox init with 256MB segments has SLB misses go down from 945 to 69, and with 1T segments 900 to 21. These could almost all be eliminated by preloading a bit more carefully with ELF binary loading. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:01:56 +10:00
Nicholas Piggin	2e1626744e	powerpc/64s/hash: provide arch_setup_exec hooks for hash slice setup This will be used by the SLB code in the next patch, but for now this sets the slb_addr_limit to the correct size for 32-bit tasks. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:01:56 +10:00
Nicholas Piggin	e83cbf7fb7	powerpc/64s: xmon do not dump hash fields when using radix mode Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:01:56 +10:00
Nicholas Piggin	655deecf67	powerpc/64s/hash: SLB allocation status bitmaps Add 32-entry bitmaps to track the allocation status of the first 32 SLB entries, and whether they are user or kernel entries. These are used to allocate free SLB entries first, before resorting to the round robin allocator. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:01:56 +10:00
Nicholas Piggin	8fed04d0f6	powerpc/64s/hash: remove user SLB data from the paca User SLB mappig data is copied into the PACA from the mm->context so it can be accessed by the SLB miss handlers. After the C conversion, SLB miss handlers now run with relocation on, and user SLB misses are able to take recursive kernel SLB misses, so the user SLB mapping data can be removed from the paca and accessed directly. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 22:01:46 +10:00
Nicholas Piggin	5e46e29e6a	powerpc/64s/hash: convert SLB miss handlers to C This patch moves SLB miss handlers completely to C, using the standard exception handler macros to set up the stack and branch to C. This can be done because the segment containing the kernel stack is always bolted, so accessing it with relocation on will not cause an SLB exception. Arbitrary kernel memory may not be accessed when handling kernel space SLB misses, so care should be taken there. However user SLB misses can access any kernel memory, which can be used to move some fields out of the paca (in later patches). User SLB misses could quite easily reconcile IRQs and set up a first class kernel environment and exit via ret_from_except, however that doesn't seem to be necessary at the moment, so we only do that if a bad fault is encountered. [ Credit to Aneesh for bug fixes, error checks, and improvements to bad address handling, etc ] Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Since RFC: - Added MSR[RI] handling - Fixed up a register loss bug exposed by irq tracing (Aneesh) - Reject misses outside the defined kernel regions (Aneesh) - Added several more sanity checks and error handling (Aneesh), we may look at consolidating these tests and tightenig up the code but for a first pass we decided it's better to check carefully. Since v1: - Fixed SLB cache corruption (Aneesh) - Fixed untidy SLBE allocation "leak" in get_vsid error case - Now survives some stress testing on real hardware Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:59:41 +10:00
Nicholas Piggin	82d8f4c22f	powerpc/64s/hash: Use POWER9 SLBIA IH=3 variant in switch_slb POWER9 introduces SLBIA IH=3, which invalidates all SLB entries and associated lookaside information that have a class value of 1, which Linux assigns to user addresses. This matches what switch_slb wants, and allows a simple fast implementation that avoids the slb_cache complexity. As a side-effect, the POWER5 < DD2.1 SLB invalidation workaround is also avoided on POWER9. Process context switching rate is improved about 2.2% for a small process that hits the slb cache which is the best case for the current code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:59:44 +10:00
Nicholas Piggin	5141c182d7	powerpc/64s/hash: Use POWER6 SLBIA IH=1 variant in switch_slb The SLBIA IH=1 hint will remove all non-zero SLBEs, but only invalidate ERAT entries associated with a class value of 1, for processors that support the hint (e.g., POWER6 and newer), which Linux assigns to user addresses. This prevents kernel ERAT entries from being invalidated when context switchig (if the thread faulted in more than 8 user SLBEs). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:59:41 +10:00
Nicholas Piggin	85376e2a17	powerpc/64s/hash: remove the vmalloc segment from the bolted SLB Remove the vmalloc segment from bolted SLBEs. This is not required to be bolted, and seems like it was added to help pre-load the SLB on context switch. However there are now other segments like the vmemmap segment and non-zero node memory that often take misses after a context switch, so it is better to solve this in a more general way. A subsequent change will track free SLB entries and uses those rather than round-robin overwrite valid entries, which makes it far less likely for kernel SLBEs to be evicted after they are installed. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:59:41 +10:00
Nicholas Piggin	8b92887ced	powerpc/64s/hash: move POWER5 < DD2.1 slbie workaround where it is needed The POWER5 < DD2.1 issue is that slbie needs to be issued more than once. It came in with this change: ChangeSet@1.1608, 2004-04-29 07:12:31-07:00, david@gibson.dropbear.id.au [PATCH] POWER5 erratum workaround Early POWER5 revisions (<DD2.1) have a problem requiring slbie instructions to be repeated under some circumstances. The patch below adds a workaround (patch made by Anton Blanchard). (aka. 3e4520f7605243abf66a7ccd3d2e49e48e8c0483 in the full history tree) The extra slbie in switch_slb is done even for the case where slbia is called (slb_flush_and_rebolt). I don't believe that is required because there are other slb_flush_and_rebolt callers which do not issue the workaround slbie, which would be broken if it was required. It also seems to be fine inside the isync with the first slbie, as it is in the kernel stack switch code. So move this workaround to where it is required. This is not much of an optimisation because this is the fast path, but it makes the code more understandable and neater. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Retain slbie_data initialisation to avoid compiler warning] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:59:41 +10:00
Nicholas Piggin	505ea82eab	powerpc/64s/hash: avoid the POWER5 < DD2.1 slb invalidate workaround on POWER8/9 I only have POWER8/9 to test, so just remove it for those. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:59:41 +10:00
Nicholas Piggin	09b4438db1	powerpc/64s/hash: Fix stab_rr off by one initialization This causes SLB alloation to start 1 beyond the start of the SLB. There is no real problem because after it wraps it stats behaving properly, it's just surprisig to see when looking at SLB traces. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:59:41 +10:00
Mahesh Salgaonkar	db7d31ac04	powernv/pseries: consolidate code for mce early handling. Now that other platforms also implements real mode mce handler, lets consolidate the code by sharing existing powernv machine check early code. Rename machine_check_powernv_early to machine_check_common_early and reuse the code. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:59:41 +10:00
Mahesh Salgaonkar	c6d15258cd	powerpc/pseries: Dump the SLB contents on SLB MCE errors. If we get a machine check exceptions due to SLB errors then dump the current SLB contents which will be very much helpful in debugging the root cause of SLB errors. Introduce an exclusive buffer per cpu to hold faulty SLB entries. In real mode mce handler saves the old SLB contents into this buffer accessible through paca and print it out later in virtual mode. With this patch the console will log SLB contents like below on SLB MCE errors: [ 507.297236] SLB contents of cpu 0x1 [ 507.297237] Last SLB entry inserted at slot 16 [ 507.297238] 00 c000000008000000 400ea1b217000500 [ 507.297239] 1T ESID= c00000 VSID= ea1b217 LLP:100 [ 507.297240] 01 d000000008000000 400d43642f000510 [ 507.297242] 1T ESID= d00000 VSID= d43642f LLP:110 [ 507.297243] 11 f000000008000000 400a86c85f000500 [ 507.297244] 1T ESID= f00000 VSID= a86c85f LLP:100 [ 507.297245] 12 00007f0008000000 4008119624000d90 [ 507.297246] 1T ESID= 7f VSID= 8119624 LLP:110 [ 507.297247] 13 0000000018000000 00092885f5150d90 [ 507.297247] 256M ESID= 1 VSID= 92885f5150 LLP:110 [ 507.297248] 14 0000010008000000 4009e7cb50000d90 [ 507.297249] 1T ESID= 1 VSID= 9e7cb50 LLP:110 [ 507.297250] 15 d000000008000000 400d43642f000510 [ 507.297251] 1T ESID= d00000 VSID= d43642f LLP:110 [ 507.297252] 16 d000000008000000 400d43642f000510 [ 507.297253] 1T ESID= d00000 VSID= d43642f LLP:110 [ 507.297253] ---------------------------------- [ 507.297254] SLB cache ptr value = 3 [ 507.297254] Valid SLB cache entries: [ 507.297255] 00 EA[0-35]= 7f000 [ 507.297256] 01 EA[0-35]= 1 [ 507.297257] 02 EA[0-35]= 1000 [ 507.297257] Rest of SLB cache entries: [ 507.297258] 03 EA[0-35]= 7f000 [ 507.297258] 04 EA[0-35]= 1 [ 507.297259] 05 EA[0-35]= 1000 [ 507.297260] 06 EA[0-35]= 12 [ 507.297260] 07 EA[0-35]= 7f000 Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:59:41 +10:00
Mahesh Salgaonkar	8f0b80561f	powerpc/pseries: Display machine check error details. Extract the MCE error details from RTAS extended log and display it to console. With this patch you should now see mce logs like below: [ 142.371818] Severe Machine check interrupt [Recovered] [ 142.371822] NIP [d00000000ca301b8]: init_module+0x1b8/0x338 [bork_kernel] [ 142.371822] Initiator: CPU [ 142.371823] Error type: SLB [Multihit] [ 142.371824] Effective address: d00000000ca70000 Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:59:41 +10:00
Mahesh Salgaonkar	a43c159042	powerpc/pseries: Flush SLB contents on SLB MCE errors. On pseries, as of today system crashes if we get a machine check exceptions due to SLB errors. These are soft errors and can be fixed by flushing the SLBs so the kernel can continue to function instead of system crash. We do this in real mode before turning on MMU. Otherwise we would run into nested machine checks. This patch now fetches the rtas error log in real mode and flushes the SLBs on SLB/ERAT errors. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michal Suchanek <msuchanek@suse.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:59:22 +10:00
Mahesh Salgaonkar	04fce21c9d	powerpc/pseries: Define MCE error event section. On pseries, the machine check error details are part of RTAS extended event log passed under Machine check exception section. This patch adds the definition of rtas MCE event section and related helper functions. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:58:09 +10:00
Breno Leitao	984ecdd68d	powerpc/iommu: Avoid derefence before pointer check The tbl pointer is being derefenced by IOMMU_PAGE_SIZE prior the check if it is not NULL. Just moving the dereference code to after the check, where there will be guarantee that 'tbl' will not be NULL. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:58:09 +10:00
Breno Leitao	8ac9e5bfd8	powerpc/xive: Use xive_cpu->chip_id instead of looking it up again Function xive_native_get_ipi() might use chip_id without it being initialized, if the CPU node is not found, as reported by smatch: error: uninitialized symbol 'chip_id' As suggested by Cédric, we can use xc->chip_id instead of consulting the device tree for chip id, which is safe since xive_prepare_cpu() should have initialized ->chip_id by the time xive_native_get_ipi() is called. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> [mpe: Tweak change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:58:09 +10:00
Rashmica Gupta	3f7daf3d75	powerpc/memtrace: Remove memory in chunks When hot-removing memory release_mem_region_adjustable() splits iomem resources if they are not the exact size of the memory being hot-deleted. Adding this memory back to the kernel adds a new resource. Eg a node has memory 0x0 - 0xfffffffff. Hot-removing 1GB from 0xf40000000 results in the single resource 0x0-0xfffffffff being split into two resources: 0x0-0xf3fffffff and 0xf80000000-0xfffffffff. When we hot-add the memory back we now have three resources: 0x0-0xf3fffffff, 0xf40000000-0xf7fffffff, and 0xf80000000-0xfffffffff. This is an issue if we try to remove some memory that overlaps resources. Eg when trying to remove 2GB at address 0xf40000000, release_mem_region_adjustable() fails as it expects the chunk of memory to be within the boundaries of a single resource. We then get the warning: "Unable to release resource" and attempting to use memtrace again gives us this error: "bash: echo: write error: Resource temporarily unavailable" This patch makes memtrace remove memory in chunks that are always the same size from an address that is always equal to end_of_memory - n*size, for some n. So hotremoving and hotadding memory of different sizes will now not attempt to remove memory that spans multiple resources. Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-19 21:58:02 +10:00
Michael Neuling	51c3c62b58	powerpc: Avoid code patching freed init sections This stops us from doing code patching in init sections after they've been freed. In this chain: kvm_guest_init() -> kvm_use_magic_page() -> fault_in_pages_readable() -> __get_user() -> __get_user_nocheck() -> barrier_nospec(); We have a code patching location at barrier_nospec() and kvm_guest_init() is an init function. This whole chain gets inlined, so when we free the init section (hence kvm_guest_init()), this code goes away and hence should no longer be patched. We seen this as userspace memory corruption when using a memory checker while doing partition migration testing on powervm (this starts the code patching post migration via /sys/kernel/mobility/migration). In theory, it could also happen when using /sys/kernel/debug/powerpc/barrier_nospec. Cc: stable@vger.kernel.org # 4.13+ Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-18 22:42:54 +10:00
Anton Blanchard	be54c1216f	powerpc/64: Remove static branch hints from memset() Static branch hints override dynamic branch prediction on recent POWER CPUs. We should only use them when we are overwhelmingly sure of the direction. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-17 21:17:25 +10:00
Laurent Dufour	ba2dd8a26b	powerpc/pseries/mm: call H_BLOCK_REMOVE This hypervisor's call allows to remove up to 8 ptes with only call to tlbie. The virtual pages must be all within the same naturally aligned 8 pages virtual address block and have the same page and segment size encodings. Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-17 21:17:25 +10:00
Laurent Dufour	0effa488dc	powerpc/pseries/mm: factorize PTE slot computation This part of code will be called also when dealing with H_BLOCK_REMOVE. Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-17 21:17:25 +10:00
Laurent Dufour	5600fbe340	powerpc/pseries/mm: Introducing FW_FEATURE_BLOCK_REMOVE This feature tells if the hcall H_BLOCK_REMOVE is available. Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-17 21:17:25 +10:00
Breno Leitao	96695563ce	powerpc/tm: Fix HTM documentation This patch simply fix part of the documentation on the HTM code. This fixes reference to old fields that were renamed in commit `000ec280e3` ("powerpc: tm: Rename transct_(*) to ck(\1)_state") It also documents better the flow after commit `eb5c3f1c86` ("powerpc: Always save/restore checkpointed regs during treclaim/trecheckpoint"), where tm_recheckpoint can recheckpoint what is in ck{fp,vr}_state blindly. Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-17 21:17:25 +10:00
Joel Stanley	b0dc0f8618	powerpc/powernv: Don't select the cpufreq governors Deciding wich govenors should be built into the kernel can be left to users to configure. Fixes: `81f359027a` ("cpufreq: powernv: Select CPUFreq related Kconfig options for powernv") Signed-off-by: Joel Stanley <joel@jms.id.au> [mpe: Update powernv/ppc64 defconfigs to enable them by default] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-17 21:16:26 +10:00
Michael Neuling	f14040bca8	KVM: PPC: Book3S HV: Fix guest r11 corruption with POWER9 TM workarounds When we come into the softpatch handler (0x1500), we use r11 to store the HSRR0 for later use by the denorm handler. We also use the softpatch handler for the TM workarounds for POWER9. Unfortunately, in kvmppc_interrupt_hv we later store r11 out to the vcpu assuming it's still what we got from userspace. This causes r11 to be corrupted in the VCPU and hence when we restore the guest, we get a corrupted r11. We've seen this when running TM tests inside guests on P9. This fixes the problem by only touching r11 in the denorm case. Fixes: `4bb3c7a020` ("KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9") Cc: <stable@vger.kernel.org> # 4.17+ Test-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-09-17 16:23:28 +10:00
Alan Modra	56d20861c0	powerpc/vdso: Correct call frame information Call Frame Information is used by gdb for back-traces and inserting breakpoints on function return for the "finish" command. This failed when inside __kernel_clock_gettime. More concerning than difficulty debugging is that CFI is also used by stack frame unwinding code to implement exceptions. If you have an app that needs to handle asynchronous exceptions for some reason, and you are unlucky enough to get one inside the VDSO time functions, your app will crash. What's wrong: There is control flow in __kernel_clock_gettime that reaches label 99 without saving lr in r12. CFI info however is interpreted by the unwinder without reference to control flow: It's a simple matter of "Execute all the CFI opcodes up to the current address". That means the unwinder thinks r12 contains the return address at label 99. Disabuse it of that notion by resetting CFI for the return address at label 99. Note that the ".cfi_restore lr" could have gone anywhere from the "mtlr r12" a few instructions earlier to the instruction at label 99. I put the CFI as late as possible, because in general that's best practice (and if possible grouped with other CFI in order to reduce the number of CFI opcodes executed when unwinding). Using r12 as the return address is perfectly fine after the "mtlr r12" since r12 on that code path still contains the return address. __get_datapage also has a CFI error. That function temporarily saves lr in r0, and reflects that fact with ".cfi_register lr,r0". A later use of r0 means the CFI at that point isn't correct, as r0 no longer contains the return address. Fix that too. Signed-off-by: Alan Modra <amodra@gmail.com> Tested-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-09-14 13:47:31 +10:00
Michael Neuling	dd9a8c5a87	powerpc/tm: Fix HFSCR bit for no suspend case Currently on P9N DD2.1 we end up taking infinite TM facility unavailable exceptions on the first TM usage by userspace. In the special case of TM no suspend (P9N DD2.1), Linux is told TM is off via CPU dt-ftrs but told to (partially) use it via OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED. So HFSCR[TM] will be off from dt-ftrs but we need to turn it on for the no suspend case. This patch fixes this by enabling HFSCR TM in this case. Cc: stable@vger.kernel.org # 4.15+ Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-09-14 13:47:31 +10:00
Radim Krčmář	ed2ef29100	KVM: s390: Fixes for 4.19 - Fallout from the hugetlbfs support: pfmf interpretion and locking - VSIE: fix keywrapping for nested guests -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJbj40sAAoJEBF7vIC1phx8MIYQAK6TtogzCUok4nvRJZGl34Ac HvJP2OTSNcJO8MA/DkmXk6LNVgrjgLqc4Y0MCMqaz9EzM1FVM0A5cQ4Tiiwk6dlG 395Q5SbkrmVIpmxG7dSQbrj3HlMTUCz7jtAUrDS57zaWYdKhqX+AUuW45u+TPfAo DL00wS+WJxiTWB06cr0gHpHcXyctn5hK0cYUZQokMn2a1pAjLrS4TEpvoGOcu2d6 lULY6uYWCwCnma8eieiC8ssLzB8opDPedLrewBnaZFziEZZrPybYvT8uMffNfygA tj7og1/+iqnUmyAG20Fb8oM0MMcjRWhLGHVFpv1W1ph7624oDUb3Tzd7rV8bzTMC NoqHeIv+oQyhRJCsuPTe2jUcpKc/eJzA8o3ZUdu3LeDBXxNzNOIh08iRHvyFC9iM 91/YkyYcDW2cukxqYjIwPf+y/dVHRqNAmcs9+hvu8AiNeUJPGUYsmlTBABEg0V9H gubV7m/Gl5Yx95UyrlQ4UkuvkOzmtwFYsnFKE0KnqT99bbFFf2na3CZyYBJFBVOj knSl3lS9W5LLrZ3s2VaJ/4/bPc4oGjW1ADEamQCYa4K3XQoMrnqGdL0VVuALJ2dZ RVIz2DP+P6HBCoRWD0cOA0Q+MvP5hl6TrGDdpCbza3ASSF1f/eSASvHs4P4JQPqY dWQ3uIByc3wDXuErkcT5 =kgjR -----END PGP SIGNATURE----- Merge tag 'kvm-s390-master-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux KVM: s390: Fixes for 4.19 - Fallout from the hugetlbfs support: pfmf interpretion and locking - VSIE: fix keywrapping for nested guests	2018-09-07 18:30:47 +02:00
Radim Krčmář	732b53146a	PPC KVM fixes for 4.19 Two small fixes for KVM on POWER machines; one fixes a bug where pages might not get marked dirty, causing guest memory corruption on migration, and the other fixes a bug causing reads from guest memory to use the wrong guest real address for very large HPT guests (>256G of memory), leading to failures in instruction emulation. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABCAAGBQJbfVFcAAoJEJ2a6ncsY3GfwAcH/i4BDNm5bSXLbCZv1Zqc9iWM ZqCNSlx9fuR5z+Bl3FWvm14CqfG7JFMd1pVXVD3AEGN6nv0mtLPotmoaw+BUWXIP aD3BRIBSfOVHj90CiWJ1pqZGzE49vAKrjUGocuqHhBiqGjYmnnE7QKgD+lQ13SND LWDV3XaQgoO9+NZdqtV6hsWMmKCmXWIHykkG9H+EVkD+341e2EBQf6r83qibAGz4 U5SHkr/3JqL8oC7RJixT8CS/dV5qCgmuL8Vs5NYDTUnc6DmKhdes2s7OiugK7nHg twKe8K0aRVowmTA8yIwEN22OeH1FAUmYDClkgHozHFWyD2+u7O9kLrAYZxEN9Q4= =61nR -----END PGP SIGNATURE----- Merge tag 'kvm-ppc-fixes-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc PPC KVM fixes for 4.19 Two small fixes for KVM on POWER machines; one fixes a bug where pages might not get marked dirty, causing guest memory corruption on migration, and the other fixes a bug causing reads from guest memory to use the wrong guest real address for very large HPT guests (>256G of memory), leading to failures in instruction emulation.	2018-09-04 21:12:46 +02:00
Ard Biesheuvel	ff69279a44	powerpc: disable support for relative ksymtab references The newly added code that emits ksymtab entries as pairs of 32-bit relative references interacts poorly with the way powerpc lays out its address space: when a module exports a per-CPU variable, the primary module region covering the ksymtab entry -and thus the 32-bit relative reference- is too far away from the actual per-CPU variable's base address (to which the per-CPU offsets are applied to obtain the respective address of each CPU's copy), resulting in corruption when the module loader attempts to resolve symbol references of modules that are loaded on top and link to the exported per-CPU symbol. So let's disable this feature on powerpc. Even though it implements CONFIG_RELOCATABLE, it does not implement CONFIG_RANDOMIZE_BASE and so KASLR kernels (which are the main target of the feature) do not exist on powerpc anyway. Reported-by: Andreas Schwab <schwab@linux-m68k.org> Suggested-by: Nicholas Piggin <nicholas.piggin@gmail.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-08-29 16:12:07 -07:00
Linus Torvalds	aba16dc5cf	Merge branch 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax Pull IDA updates from Matthew Wilcox: "A better IDA API: id = ida_alloc(ida, GFP_xxx); ida_free(ida, id); rather than the cumbersome ida_simple_get(), ida_simple_remove(). The new IDA API is similar to ida_simple_get() but better named. The internal restructuring of the IDA code removes the bitmap preallocation nonsense. I hope the net -200 lines of code is convincing" * 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax: (29 commits) ida: Change ida_get_new_above to return the id ida: Remove old API test_ida: check_ida_destroy and check_ida_alloc test_ida: Convert check_ida_conv to new API test_ida: Move ida_check_max test_ida: Move ida_check_leaf idr-test: Convert ida_check_nomem to new API ida: Start new test_ida module target/iscsi: Allocate session IDs from an IDA iscsi target: fix session creation failure handling drm/vmwgfx: Convert to new IDA API dmaengine: Convert to new IDA API ppc: Convert vas ID allocation to new IDA API media: Convert entity ID allocation to new IDA API ppc: Convert mmu context allocation to new IDA API Convert net_namespace to new IDA API cb710: Convert to new IDA API rsxx: Convert to new IDA API osd: Convert to new IDA API sd: Convert to new IDA API ...	2018-08-26 11:48:42 -07:00
Linus Torvalds	1bc276775d	Kbuild updates for v4.19 (2nd) - add build_{menu,n,g,x}config targets for compile-testing Kconfig - fix and improve recursive dependency detection in Kconfig - fix parallel building of menuconfig/nconfig - fix syntax error in clang-version.sh - suppress distracting log from syncconfig - remove obsolete "rpm" target - remove VMLINUX_SYMBOL(_STR) macro entirely - fix microblaze build with CONFIG_DYNAMIC_FTRACE - move compiler test for dead code/data elimination to Kconfig - rename well-known LDFLAGS variable to KBUILD_LDFLAGS - misc fixes and cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJbgYhCAAoJED2LAQed4NsGErAP/jt7gt76+N0PZmADBZqyVR/H 4k286g3OiT7DIcdvwqE5BRvu+zNOamDujnnXw63/jwu2RjrkLX/JnhzTbC0IZleZ KeO4bU4ZH0WFa0Ny9pp0LAnzbXGMnQjDXygcUd5BFoEd5JSLKW2PISEEjRh6b5B7 swJRdgySFaMrUBRNf13FwH5EvX/D0xZQe/wFhFCOv6L4gJZFMmpGUIepgTjTUmxZ wcNN6xxXg+ulLHVcPdPQ9EYssNHN5xNys02+IdIrhhXuNHji/TFm4dGYuU+dDGeE Eu4O6Qs7pg0PFGrZ5gLxXDJEp75W+uaTNOqV+jcjq8MRxJuWxyy2biUeelKRT/KH 0iv4ZQJVOMOhl8fZgLtQaXHyQ++5uwd6kvPPf+XFdkogGAIXK0wKWLoALFEOXwb6 z1BBnFx09LrKPGt0ZlKX624OEczedv/UAFiSh3Ic2S3PFEpq4oHrEGhTnyKRobPv OEcF3RqKjmAdK7PLy4kVpTLhkutkWWhw6Giy9qXUkXYJWonJR7NTQ1mIan2LoGZC sGi+qKae/8xgO2Nerx59tZpkiHYTMfYeAo8frzWurOxm3YzEfaxNNGPl+IMW7VKz cNPzQZ5tMUy4i4PAhk/gIWibnUTPfjDbWsZSMtIbO0GFcao56EvllwD8/awuy7lO QkaAeZHFcF+qgU3muaYK =Vsb2 -----END PGP SIGNATURE----- Merge tag 'kbuild-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - add build_{menu,n,g,x}config targets for compile-testing Kconfig - fix and improve recursive dependency detection in Kconfig - fix parallel building of menuconfig/nconfig - fix syntax error in clang-version.sh - suppress distracting log from syncconfig - remove obsolete "rpm" target - remove VMLINUX_SYMBOL(_STR) macro entirely - fix microblaze build with CONFIG_DYNAMIC_FTRACE - move compiler test for dead code/data elimination to Kconfig - rename well-known LDFLAGS variable to KBUILD_LDFLAGS - misc fixes and cleanups * tag 'kbuild-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: rename LDFLAGS to KBUILD_LDFLAGS kbuild: pass LDFLAGS to recordmcount.pl kbuild: test dead code/data elimination support in Kconfig initramfs: move gen_initramfs_list.sh from scripts/ to usr/ vmlinux.lds.h: remove stale <linux/export.h> include export.h: remove VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR() Coccinelle: remove pci_alloc_consistent semantic to detect in zalloc-simple.cocci kbuild: make sorting initramfs contents independent of locale kbuild: remove "rpm" target, which is alias of "rpm-pkg" kbuild: Fix LOADLIBES rename in Documentation/kbuild/makefiles.txt kconfig: suppress "configuration written to .config" for syncconfig kconfig: fix "Can't open ..." in parallel build kbuild: Add a space after `!` to prevent parsing as file pattern scripts: modpost: check memory allocation results kconfig: improve the recursive dependency report kconfig: report recursive dependency involving 'imply' kconfig: error out when seeing recursive dependency kconfig: add build-only configurator targets scripts/dtc: consolidate include path options in Makefile	2018-08-25 13:40:38 -07:00
Linus Torvalds	aa5b1054ba	powerpc fixes for 4.19 #2 - An implementation for the newly added hv_ops->flush() for the OPAL hvc console driver backends, I forgot to apply this after merging the hvc driver changes before the merge window. - Enable all PCI bridges at boot on powernv, to avoid races when multiple children of a bridge try to enable it simultaneously. This is a workaround until the PCI core can be enhanced to fix the races. - A fix to query PowerVM for the correct system topology at boot before initialising sched domains, seen in some configurations to cause broken scheduling etc. - A fix for pte_access_permitted() on "nohash" platforms. - Two commits to fix SIGBUS when using remap_pfn_range() seen on Power9 due to a workaround when using the nest MMU (GPUs, accelerators). - Another fix to the VFIO code used by KVM, the previous fix had some bugs which caused guests to not start in some configurations. - A handful of other minor fixes. Thanks to: Aneesh Kumar K.V, Benjamin Herrenschmidt, Christophe Leroy, Hari Bathini, Luke Dashjr, Mahesh Salgaonkar, Nicholas Piggin, Paul Mackerras, Srikar Dronamraju. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAlt//10THG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgLUGD/9y/MPrs+V2lNFhxP+l1jO4Ro0v8DPI vARjqq06WfppgQgSS1dvWfzLaMFbGe8wPRfL0T1xMOXCZg8Ts/HrgxVBVFYcv6/Q xBaU5bKztg6HKQbwwO+8B/gdTA3hO7yFVux6JGwsGO5Ebl8Q3UDOdbvYX6XTj3H8 rdiicle9LaI7qodC8bxlBo1Be0YKEW0O/ag179sxXzozzvoPIyFpeX6FL1sAft++ XlQS1MHu4hErSM2rbmyoFCm+SmyRt3CD0NTVmNd2cgw5XexPOBFlnsdgpaK1jJFc CYu1chP83E91ol1/8NAPkcmPWvP6MGyoqOl75RghooY2D1IJ2GtLKoz2Dvc53ay2 ZlIpMyc2CYIa55Mj18tOV/NbGbh0Lf0Ta++BxqxbcCDt5fq0VxHkoDPqBgWh/tdp Po7oQc7U2VTKKC3wiLr//nSHpgtSTAWtucDt7oT7GdP8+EMxUZ8teBFIfTkwfuD2 plroEmcYuRD3beI4FAG/iCp5POOCsnHLkKVDl7tyQPl3Yvu8hvyLY9gBS9RN4Unt /z94YFJtz7UD+VP7jDaPiQS26y0WEJfW8ml0tNxMZVBdksZLPIPN0/EU/04V0jWf GxynzKMhElxC67tK/Eb43EGiLEZAlbEnJxoOtiCnL1MK+OIJPjg6e7o6qlsmaa+Y zkXVpxLtkq0OoQ== =1AUa -----END PGP SIGNATURE----- Merge tag 'powerpc-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - An implementation for the newly added hv_ops->flush() for the OPAL hvc console driver backends, I forgot to apply this after merging the hvc driver changes before the merge window. - Enable all PCI bridges at boot on powernv, to avoid races when multiple children of a bridge try to enable it simultaneously. This is a workaround until the PCI core can be enhanced to fix the races. - A fix to query PowerVM for the correct system topology at boot before initialising sched domains, seen in some configurations to cause broken scheduling etc. - A fix for pte_access_permitted() on "nohash" platforms. - Two commits to fix SIGBUS when using remap_pfn_range() seen on Power9 due to a workaround when using the nest MMU (GPUs, accelerators). - Another fix to the VFIO code used by KVM, the previous fix had some bugs which caused guests to not start in some configurations. - A handful of other minor fixes. Thanks to: Aneesh Kumar K.V, Benjamin Herrenschmidt, Christophe Leroy, Hari Bathini, Luke Dashjr, Mahesh Salgaonkar, Nicholas Piggin, Paul Mackerras, Srikar Dronamraju. * tag 'powerpc-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mce: Fix SLB rebolting during MCE recovery path. KVM: PPC: Book3S: Fix guest DMA when guest partially backed by THP pages powerpc/mm/radix: Only need the Nest MMU workaround for R -> RW transition powerpc/mm/books3s: Add new pte bit to mark pte temporarily invalid. powerpc/nohash: fix pte_access_permitted() powerpc/topology: Get topology for shared processors at boot powerpc64/ftrace: Include ftrace.h needed for enable/disable calls powerpc/powernv/pci: Work around races in PCI bridge enabling powerpc/fadump: cleanup crash memory ranges support powerpc/powernv: provide a console flush operation for opal hvc driver powerpc/traps: Avoid rate limit messages from show unhandled signals powerpc/64s: Fix PACA_IRQ_HARD_DIS accounting in idle_power4()	2018-08-24 09:34:23 -07:00
Finn Thain	3cc97bea60	treewide: correct "differenciate" and "instanciate" typos Also add these typos to spelling.txt so checkpatch.pl will look for them. Link: http://lkml.kernel.org/r/88af06b9de34d870cb0afc46cfd24e0458be2575.1529471371.git.fthain@telegraphics.com.au Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-08-23 18:48:43 -07:00
Masahiro Yamada	d503ac531a	kbuild: rename LDFLAGS to KBUILD_LDFLAGS Commit `a0f97e06a4` ("kbuild: enable 'make CFLAGS=...' to add additional options to CC") renamed CFLAGS to KBUILD_CFLAGS. Commit `222d394d30` ("kbuild: enable 'make AFLAGS=...' to add additional options to AS") renamed AFLAGS to KBUILD_AFLAGS. Commit `06c5040cdb` ("kbuild: enable 'make CPPFLAGS=...' to add additional options to CPP") renamed CPPFLAGS to KBUILD_CPPFLAGS. For some reason, LDFLAGS was not renamed. Using a well-known variable like LDFLAGS may result in accidental override of the variable. Kbuild generally uses KBUILD_ prefixed variables for the internally appended options, so here is one more conversion to sanitize the naming convention. I did not touch Makefiles under tools/ since the tools build system is a different world. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Palmer Dabbelt <palmer@sifive.com>	2018-08-24 08:22:08 +09:00
Mahesh Salgaonkar	0f52b3a00c	powerpc/mce: Fix SLB rebolting during MCE recovery path. The commit `e7e8184747` ("powerpc/64s: move machine check SLB flushing to mm/slb.c") introduced a bug in reloading bolted SLB entries. Unused bolted entries are stored with .esid=0 in the slb_shadow area, and that value is now used directly as the RB input to slbmte, which means the RB[52:63] index field is set to 0, which causes SLB entry 0 to be cleared. Fix this by storing the index bits in the unused bolted entries, which directs the slbmte to the right place. The SLB shadow area is also used by the hypervisor, but PAPR is okay with that, from LoPAPR v1.1, 14.11.1.3 SLB Shadow Buffer: Note: SLB is filled sequentially starting at index 0 from the shadow buffer ignoring the contents of RB field bits 52-63 Fixes: `e7e8184747` ("powerpc/64s: move machine check SLB flushing to mm/slb.c") Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-23 23:40:10 +10:00
Paul Mackerras	8cfbdbdc24	KVM: PPC: Book3S: Fix guest DMA when guest partially backed by THP pages Commit `76fa4975f3` ("KVM: PPC: Check if IOMMU page is contained in the pinned physical page", 2018-07-17) added some checks to ensure that guest DMA mappings don't attempt to map more than the guest is entitled to access. However, errors in the logic mean that legitimate guest requests to map pages for DMA are being denied in some situations. Specifically, if the first page of the range passed to mm_iommu_get() is mapped with a normal page, and subsequent pages are mapped with transparent huge pages, we end up with mem->pageshift == 0. That means that the page size checks in mm_iommu_ua_to_hpa() and mm_iommu_up_to_hpa_rm() will always fail for every page in that region, and thus the guest can never map any memory in that region for DMA, typically leading to a flood of error messages like this: qemu-system-ppc64: VFIO_MAP_DMA: -22 qemu-system-ppc64: vfio_dma_map(0x10005f47780, 0x800000000000000, 0x10000, 0x7fff63ff0000) = -22 (Invalid argument) The logic errors in mm_iommu_get() are: (a) use of 'ua' not 'ua + (i << PAGE_SHIFT)' in the find_linux_pte() call (meaning that find_linux_pte() returns the pte for the first address in the range, not the address we are currently up to); (b) use of 'pageshift' as the variable to receive the hugepage shift returned by find_linux_pte() - for a normal page this gets set to 0, leading to us setting mem->pageshift to 0 when we conclude that the pte returned by find_linux_pte() didn't match the page we were looking at; (c) comparing 'compshift', which is a page order, i.e. log base 2 of the number of pages, with 'pageshift', which is a log base 2 of the number of bytes. To fix these problems, this patch introduces 'cur_ua' to hold the current user address and uses that in the find_linux_pte() call; introduces 'pteshift' to hold the hugepage shift found by find_linux_pte(); and compares 'pteshift' with 'compshift + PAGE_SHIFT' rather than 'compshift'. The patch also moves the local_irq_restore to the point after the PTE pointer returned by find_linux_pte() has been dereferenced because otherwise the PTE could change underneath us, and adds a check to avoid doing the find_linux_pte() call once mem->pageshift has been reduced to PAGE_SHIFT, as an optimization. Fixes: `76fa4975f3` ("KVM: PPC: Check if IOMMU page is contained in the pinned physical page") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-23 23:40:10 +10:00
Aneesh Kumar K.V	f08d08f3db	powerpc/mm/radix: Only need the Nest MMU workaround for R -> RW transition The Nest MMU workaround is only needed for RW upgrades. Avoid doing that for other PTE updates. We also avoid clearing the PTE while marking it invalid. This is because other page table walkers will find this PTE none and can result in unexpected behaviour due to that. Instead we clear _PAGE_PRESENT and set the software PTE bit _PAGE_INVALID. pte_present() is already updated to check for both bits. This makes sure page table walkers will find the PTE present and things like pte_pfn(pte) returns the right value. Based on an original patch from Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-23 21:56:48 +10:00
Aneesh Kumar K.V	bd0dbb73e0	powerpc/mm/books3s: Add new pte bit to mark pte temporarily invalid. When splitting a huge pmd pte, we need to mark the pmd entry invalid. We can do that by clearing _PAGE_PRESENT bit. But then that will be taken as a swap pte. In order to differentiate between the two use a software pte bit when invalidating. For regular pte, due to `bd5050e38a` ("powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang") we need to mark the pte entry invalid when relaxing access permission. Instead of marking pte_none which can result in different page table walk routines possibly skipping this pte entry, invalidate it but still keep it marked present. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-23 12:16:01 +10:00
Christophe Leroy	810e9f86f3	powerpc/nohash: fix pte_access_permitted() Commit `5769beaf18` ("powerpc/mm: Add proper pte access check helper for other platforms") replaced generic pte_access_permitted() by an arch specific one. The generic one is defined as (pte_present(pte) && (!(write) \|\| pte_write(pte))) The arch specific one is open coded checking that _PAGE_USER and _PAGE_WRITE (_PAGE_RW) flags are set, but lacking to check that _PAGE_RO and _PAGE_PRIVILEGED are unset, leading to a useless test on targets like the 8xx which defines _PAGE_RW and _PAGE_USER as 0. Commit `5fa5b16be5` ("powerpc/mm/hugetlb: Use pte_access_permitted for hugetlb access check") replaced some tests performed with pte helpers by a call to pte_access_permitted(), leading to the same issue. This patch rewrites powerpc/nohash pte_access_permitted() using pte helpers. Fixes: `5769beaf18` ("powerpc/mm: Add proper pte access check helper for other platforms") Fixes: `5fa5b16be5` ("powerpc/mm/hugetlb: Use pte_access_permitted for hugetlb access check") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-23 12:15:58 +10:00
Ard Biesheuvel	271ca78877	arch: enable relative relocations for arm64, power and x86 Patch series "add support for relative references in special sections", v10. This adds support for emitting special sections such as initcall arrays, PCI fixups and tracepoints as relative references rather than absolute references. This reduces the size by 50% on 64-bit architectures, but more importantly, it removes the need for carrying relocation metadata for these sections in relocatable kernels (e.g., for KASLR) that needs to be fixed up at boot time. On arm64, this reduces the vmlinux footprint of such a reference by 8x (8 byte absolute reference + 24 byte RELA entry vs 4 byte relative reference) Patch #3 was sent out before as a single patch. This series supersedes the previous submission. This version makes relative ksymtab entries dependent on the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS rather than trying to infer from kbuild test robot replies for which architectures it should be blacklisted. Patch #1 introduces the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS, and sets it for the main architectures that are expected to benefit the most from this feature, i.e., 64-bit architectures or ones that use runtime relocations. Patch #2 add support for #define'ing __DISABLE_EXPORTS to get rid of ksymtab/kcrctab sections in decompressor and EFI stub objects when rebuilding existing C files to run in a different context. Patches #4 - #6 implement relative references for initcalls, PCI fixups and tracepoints, respectively, all of which produce sections with order ~1000 entries on an arm64 defconfig kernel with tracing enabled. This means we save about 28 KB of vmlinux space for each of these patches. [From the v7 series blurb, which included the jump_label patches as well]: For the arm64 kernel, all patches combined reduce the memory footprint of vmlinux by about 1.3 MB (using a config copied from Ubuntu that has KASLR enabled), of which ~1 MB is the size reduction of the RELA section in .init, and the remaining 300 KB is reduction of .text/.data. This patch (of 6): Before updating certain subsystems to use place relative 32-bit relocations in special sections, to save space and reduce the number of absolute relocations that need to be processed at runtime by relocatable kernels, introduce the Kconfig symbol and define it for some architectures that should be able to support and benefit from it. Link: http://lkml.kernel.org/r/20180704083651.24360-2-ard.biesheuvel@linaro.org Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Petr Mladek <pmladek@suse.com> Cc: James Morris <jmorris@namei.org> Cc: Nicolas Pitre <nico@linaro.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>, Cc: James Morris <james.morris@microsoft.com> Cc: Jessica Yu <jeyu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-08-22 10:52:47 -07:00
Matthew Wilcox	cd38049e48	ppc: Convert vas ID allocation to new IDA API Removes a custom spinlock and simplifies the code. Also fix an error where we could allocate one ID too many. Signed-off-by: Matthew Wilcox <willy@infradead.org>	2018-08-21 23:54:19 -04:00
Matthew Wilcox	b3fa64170e	ppc: Convert mmu context allocation to new IDA API ida_alloc_range is the perfect fit for this use case. Eliminates a custom spinlock, a call to ida_pre_get and a local check for the allocated ID exceeding a maximum. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>	2018-08-21 23:54:18 -04:00
Srikar Dronamraju	2ea6263068	powerpc/topology: Get topology for shared processors at boot On a shared LPAR, Phyp will not update the CPU associativity at boot time. Just after the boot system does recognize itself as a shared LPAR and trigger a request for correct CPU associativity. But by then the scheduler would have already created/destroyed its sched domains. This causes - Broken load balance across Nodes causing islands of cores. - Performance degradation esp if the system is lightly loaded - dmesg to wrongly report all CPUs to be in Node 0. - Messages in dmesg saying borken topology. - With commit `051f3ca02e` ("sched/topology: Introduce NUMA identity node sched domain"), can cause rcu stalls at boot up. The sched_domains_numa_masks table which is used to generate cpumasks is only created at boot time just before creating sched domains and never updated. Hence, its better to get the topology correct before the sched domains are created. For example on 64 core Power 8 shared LPAR, dmesg reports Brought up 512 CPUs Node 0 CPUs: 0-511 Node 1 CPUs: Node 2 CPUs: Node 3 CPUs: Node 4 CPUs: Node 5 CPUs: Node 6 CPUs: Node 7 CPUs: Node 8 CPUs: Node 9 CPUs: Node 10 CPUs: Node 11 CPUs: ... BUG: arch topology borken the DIE domain not a subset of the NUMA domain BUG: arch topology borken the DIE domain not a subset of the NUMA domain numactl/lscpu output will still be correct with cores spreading across all nodes: Socket(s): 64 NUMA node(s): 12 Model: 2.0 (pvr 004d 0200) Model name: POWER8 (architected), altivec supported Hypervisor vendor: pHyp Virtualization type: para L1d cache: 64K L1i cache: 32K NUMA node0 CPU(s): 0-7,32-39,64-71,96-103,176-183,272-279,368-375,464-471 NUMA node1 CPU(s): 8-15,40-47,72-79,104-111,184-191,280-287,376-383,472-479 NUMA node2 CPU(s): 16-23,48-55,80-87,112-119,192-199,288-295,384-391,480-487 NUMA node3 CPU(s): 24-31,56-63,88-95,120-127,200-207,296-303,392-399,488-495 NUMA node4 CPU(s): 208-215,304-311,400-407,496-503 NUMA node5 CPU(s): 168-175,264-271,360-367,456-463 NUMA node6 CPU(s): 128-135,224-231,320-327,416-423 NUMA node7 CPU(s): 136-143,232-239,328-335,424-431 NUMA node8 CPU(s): 216-223,312-319,408-415,504-511 NUMA node9 CPU(s): 144-151,240-247,336-343,432-439 NUMA node10 CPU(s): 152-159,248-255,344-351,440-447 NUMA node11 CPU(s): 160-167,256-263,352-359,448-455 Currently on this LPAR, the scheduler detects 2 levels of Numa and created numa sched domains for all CPUs, but it finds a single DIE domain consisting of all CPUs. Hence it deletes all numa sched domains. To address this, detect the shared processor and update topology soon after CPUs are setup so that correct topology is updated just before scheduler creates sched domain. With the fix, dmesg reports: numa: Node 0 CPUs: 0-7 32-39 64-71 96-103 176-183 272-279 368-375 464-471 numa: Node 1 CPUs: 8-15 40-47 72-79 104-111 184-191 280-287 376-383 472-479 numa: Node 2 CPUs: 16-23 48-55 80-87 112-119 192-199 288-295 384-391 480-487 numa: Node 3 CPUs: 24-31 56-63 88-95 120-127 200-207 296-303 392-399 488-495 numa: Node 4 CPUs: 208-215 304-311 400-407 496-503 numa: Node 5 CPUs: 168-175 264-271 360-367 456-463 numa: Node 6 CPUs: 128-135 224-231 320-327 416-423 numa: Node 7 CPUs: 136-143 232-239 328-335 424-431 numa: Node 8 CPUs: 216-223 312-319 408-415 504-511 numa: Node 9 CPUs: 144-151 240-247 336-343 432-439 numa: Node 10 CPUs: 152-159 248-255 344-351 440-447 numa: Node 11 CPUs: 160-167 256-263 352-359 448-455 and lscpu also reports: Socket(s): 64 NUMA node(s): 12 Model: 2.0 (pvr 004d 0200) Model name: POWER8 (architected), altivec supported Hypervisor vendor: pHyp Virtualization type: para L1d cache: 64K L1i cache: 32K NUMA node0 CPU(s): 0-7,32-39,64-71,96-103,176-183,272-279,368-375,464-471 NUMA node1 CPU(s): 8-15,40-47,72-79,104-111,184-191,280-287,376-383,472-479 NUMA node2 CPU(s): 16-23,48-55,80-87,112-119,192-199,288-295,384-391,480-487 NUMA node3 CPU(s): 24-31,56-63,88-95,120-127,200-207,296-303,392-399,488-495 NUMA node4 CPU(s): 208-215,304-311,400-407,496-503 NUMA node5 CPU(s): 168-175,264-271,360-367,456-463 NUMA node6 CPU(s): 128-135,224-231,320-327,416-423 NUMA node7 CPU(s): 136-143,232-239,328-335,424-431 NUMA node8 CPU(s): 216-223,312-319,408-415,504-511 NUMA node9 CPU(s): 144-151,240-247,336-343,432-439 NUMA node10 CPU(s): 152-159,248-255,344-351,440-447 NUMA node11 CPU(s): 160-167,256-263,352-359,448-455 Reported-by: Manjunatha H R <manjuhr1@in.ibm.com> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> [mpe: Trim / format change log] Tested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-21 16:01:59 +10:00
Luke Dashjr	d6ee76d3d3	powerpc64/ftrace: Include ftrace.h needed for enable/disable calls this_cpu_disable_ftrace and this_cpu_enable_ftrace are inlines in ftrace.h Without it included, the build fails. Fixes: `a4bc64d305` ("powerpc64/ftrace: Disable ftrace during kvm entry/exit") Cc: stable@vger.kernel.org # v4.18+ Signed-off-by: Luke Dashjr <luke-jr+git@utopios.org> Acked-by: Naveen N. Rao <naveen.n.rao at linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-21 15:58:34 +10:00
Benjamin Herrenschmidt	db2173198b	powerpc/powernv/pci: Work around races in PCI bridge enabling The generic code is racy when multiple children of a PCI bridge try to enable it simultaneously. This leads to drivers trying to access a device through a not-yet-enabled bridge, and this EEH errors under various circumstances when using parallel driver probing. There is work going on to fix that properly in the PCI core but it will take some time. x86 gets away with it because (outside of hotplug), the BIOS enables all the bridges at boot time. This patch does the same thing on powernv by enabling all bridges that have child devices at boot time, thus avoiding subsequent races. It's suitable for backporting to stable and distros, while the proper PCI fix will probably be significantly more invasive. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-20 20:29:17 +10:00
Hari Bathini	a58183138c	powerpc/fadump: cleanup crash memory ranges support Commit `1bd6a1c4b8` ("powerpc/fadump: handle crash memory ranges array index overflow") changed crash memory ranges to a dynamic array that is reallocated on-demand with krealloc(). The relevant header for this call was not included. The kernel compiles though. But be cautious and add the header anyway. Also, memory allocation logic in fadump_add_crash_memory() takes care of memory allocation for crash memory ranges in all scenarios. Drop unnecessary memory allocation in fadump_setup_crash_memory_ranges(). Fixes: `1bd6a1c4b8` ("powerpc/fadump: handle crash memory ranges array index overflow") Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-20 20:19:54 +10:00
Nicholas Piggin	95b861a76c	powerpc/powernv: provide a console flush operation for opal hvc driver Provide the flush hv_op for the opal hvc driver. This will flush the firmware console buffers without spinning with interrupts disabled. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-20 20:19:54 +10:00
Michael Ellerman	997dd26cb3	powerpc/traps: Avoid rate limit messages from show unhandled signals In the recent commit to add an explicit ratelimit state when showing unhandled signals, commit `35a52a10c3` ("powerpc/traps: Use an explicit ratelimit state for show_signal_msg()"), I put the check of show_unhandled_signals and the ratelimit state before the call to unhandled_signal() so as to avoid unnecessarily calling the latter when show_unhandled_signals is false. However that causes us to check the ratelimit state on every call, so if we take a lot of handled signals that has the effect of making the ratelimit code print warnings that callbacks have been suppressed when they haven't. So rearrange the code so that we check show_unhandled_signals first, then call unhandled_signal() and finally check the ratelimit state. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>	2018-08-20 20:19:46 +10:00
Paul Mackerras	46dec40fb7	KVM: PPC: Book3S HV: Don't truncate HPTE index in xlate function This fixes a bug which causes guest virtual addresses to get translated to guest real addresses incorrectly when the guest is using the HPT MMU and has more than 256GB of RAM, or more specifically has a HPT larger than 2GB. This has showed up in testing as a failure of the host to emulate doorbell instructions correctly on POWER9 for HPT guests with more than 256GB of RAM. The bug is that the HPTE index in kvmppc_mmu_book3s_64_hv_xlate() is stored as an int, and in forming the HPTE address, the index gets shifted left 4 bits as an int before being signed-extended to 64 bits. The simple fix is to make the variable a long int, matching the return type of kvmppc_hv_find_lock_hpte(), which is what calculates the index. Fixes: `697d3899dc` ("KVM: PPC: Implement MMIO emulation support for Book3S HV guests") Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-08-20 16:05:45 +10:00
Linus Torvalds	e61cf2e3a5	Minor code cleanups for PPC. For x86 this brings in PCID emulation and CR3 caching for shadow page tables, nested VMX live migration, nested VMCS shadowing, an optimized IPI hypercall, and some optimizations. ARM will come next week. There is a semantic conflict because tip also added an .init_platform callback to kvm.c. Please keep the initializer from this branch, and add a call to kvmclock_init (added by tip) inside kvm_init_platform (added here). Also, there is a backmerge from 4.18-rc6. This is because of a refactoring that conflicted with a relatively late bugfix and resulted in a particularly hellish conflict. Because the conflict was only due to unfortunate timing of the bugfix, I backmerged and rebased the refactoring rather than force the resolution on you. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJbdwNFAAoJEL/70l94x66DiPEH/1cAGZWGd85Y3yRu1dmTmqiz kZy0V+WTQ5kyJF4ZsZKKOp+xK7Qxh5e9kLdTo70uPZCHwLu9IaGKN9+dL9Jar3DR yLPX5bMsL8UUed9g9mlhdaNOquWi7d7BseCOnIyRTolb+cqnM5h3sle0gqXloVrS UQb4QogDz8+86czqR8tNfazjQRKW/D2HEGD5NDNVY1qtpY+leCDAn9/u6hUT5c6z EtufgyDh35UN+UQH0e2605gt3nN3nw3FiQJFwFF1bKeQ7k5ByWkuGQI68XtFVhs+ 2WfqL3ftERkKzUOy/WoSJX/C9owvhMcpAuHDGOIlFwguNGroZivOMVnACG1AI3I= =9Mgw -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull first set of KVM updates from Paolo Bonzini: "PPC: - minor code cleanups x86: - PCID emulation and CR3 caching for shadow page tables - nested VMX live migration - nested VMCS shadowing - optimized IPI hypercall - some optimizations ARM will come next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (85 commits) kvm: x86: Set highest physical address bits in non-present/reserved SPTEs KVM/x86: Use CC_SET()/CC_OUT in arch/x86/kvm/vmx.c KVM: X86: Implement PV IPIs in linux guest KVM: X86: Add kvm hypervisor init time platform setup callback KVM: X86: Implement "send IPI" hypercall KVM/x86: Move X86_CR4_OSXSAVE check into kvm_valid_sregs() KVM: x86: Skip pae_root shadow allocation if tdp enabled KVM/MMU: Combine flushing remote tlb in mmu_set_spte() KVM: vmx: skip VMWRITE of HOST_{FS,GS}_BASE when possible KVM: vmx: skip VMWRITE of HOST_{FS,GS}_SEL when possible KVM: vmx: always initialize HOST_{FS,GS}_BASE to zero during setup KVM: vmx: move struct host_state usage to struct loaded_vmcs KVM: vmx: compute need to reload FS/GS/LDT on demand KVM: nVMX: remove a misleading comment regarding vmcs02 fields KVM: vmx: rename __vmx_load_host_state() and vmx_save_host_state() KVM: vmx: add dedicated utility to access guest's kernel_gs_base KVM: vmx: track host_state.loaded using a loaded_vmcs pointer KVM: vmx: refactor segmentation code in vmx_save_host_state() kvm: nVMX: Fix fault priority for VMX operations kvm: nVMX: Fix fault vector for VMX operation at CPL > 0 ...	2018-08-19 10:38:36 -07:00
Linus Torvalds	6ada4e2826	Merge branch 'akpm' (patches from Andrew) Merge updates from Andrew Morton: - a few misc things - a few Y2038 fixes - ntfs fixes - arch/sh tweaks - ocfs2 updates - most of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (111 commits) mm/hmm.c: remove unused variables align_start and align_end fs/userfaultfd.c: remove redundant pointer uwq mm, vmacache: hash addresses based on pmd mm/list_lru: introduce list_lru_shrink_walk_irq() mm/list_lru.c: pass struct list_lru_node* as an argument to __list_lru_walk_one() mm/list_lru.c: move locking from __list_lru_walk_one() to its caller mm/list_lru.c: use list_lru_walk_one() in list_lru_walk_node() mm, swap: make CONFIG_THP_SWAP depend on CONFIG_SWAP mm/sparse: delete old sparse_init and enable new one mm/sparse: add new sparse_init_nid() and sparse_init() mm/sparse: move buffer init/fini to the common place mm/sparse: use the new sparse buffer functions in non-vmemmap mm/sparse: abstract sparse buffer allocations mm/hugetlb.c: don't zero 1GiB bootmem pages mm, page_alloc: double zone's batchsize mm/oom_kill.c: document oom_lock mm/hugetlb: remove gigantic page support for HIGHMEM mm, oom: remove sleep from under oom_lock kernel/dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous() mm/cma: remove unsupported gfp_mask parameter from cma_alloc() ...	2018-08-17 16:49:31 -07:00
Marek Szyprowski	6518202970	mm/cma: remove unsupported gfp_mask parameter from cma_alloc() cma_alloc() doesn't really support gfp flags other than __GFP_NOWARN, so convert gfp_mask parameter to boolean no_warn parameter. This will help to avoid giving false feeling that this function supports standard gfp flags and callers can pass __GFP_ZERO to get zeroed buffer, what has already been an issue: see commit `dd65a941f6` ("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag"). Link: http://lkml.kernel.org/r/20180709122019eucas1p2340da484acfcc932537e6014f4fd2c29~-sqTPJKij2939229392eucas1p2j@eucas1p2.samsung.com Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Michał Nazarewicz <mina86@mina86.com> Acked-by: Laura Abbott <labbott@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Joonsoo Kim <js1304@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-08-17 16:20:32 -07:00
Souptick Joarder	50a7ca3c6f	mm: convert return type of handle_mm_fault() caller to vm_fault_t Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Ref-> commit `1c8f422059` ("mm: change return type to vm_fault_t") In this patch all the caller of handle_mm_fault() are changed to return vm_fault_t type. Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Tony Luck <tony.luck@intel.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: James Hogan <jhogan@kernel.org> Cc: Ley Foon Tan <lftan@altera.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: David S. Miller <davem@davemloft.net> Cc: Richard Weinberger <richard@nod.at> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-08-17 16:20:28 -07:00
Linus Torvalds	5e2d059b52	powerpc updates for 4.19 Notable changes: - A fix for a bug in our page table fragment allocator, where a page table page could be freed and reallocated for something else while still in use, leading to memory corruption etc. The fix reuses pt_mm in struct page (x86 only) for a powerpc only refcount. - Fixes to our pkey support. Several are user-visible changes, but bring us in to line with x86 behaviour and/or fix outright bugs. Thanks to Florian Weimer for reporting many of these. - A series to improve the hvc driver & related OPAL console code, which have been seen to cause hardlockups at times. The hvc driver changes in particular have been in linux-next for ~month. - Increase our MAX_PHYSMEM_BITS to 128TB when SPARSEMEM_VMEMMAP=y. - Remove Power8 DD1 and Power9 DD1 support, neither chip should be in use anywhere other than as a paper weight. - An optimised memcmp implementation using Power7-or-later VMX instructions - Support for barrier_nospec on some NXP CPUs. - Support for flushing the count cache on context switch on some IBM CPUs (controlled by firmware), as a Spectre v2 mitigation. - A series to enhance the information we print on unhandled signals to bring it into line with other arches, including showing the offending VMA and dumping the instructions around the fault. Thanks to: Aaro Koskinen, Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Alexey Spirkov, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Arnd Bergmann, Bartosz Golaszewski, Benjamin Herrenschmidt, Bharat Bhushan, Bjoern Noetel, Boqun Feng, Breno Leitao, Bryant G. Ly, Camelia Groza, Christophe Leroy, Christoph Hellwig, Cyril Bur, Dan Carpenter, Daniel Klamt, Darren Stevens, Dave Young, David Gibson, Diana Craciun, Finn Thain, Florian Weimer, Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Geoff Levand, Guenter Roeck, Gustavo Romero, Haren Myneni, Hari Bathini, Joel Stanley, Jonathan Neuschäfer, Kees Cook, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Mathieu Malaterre, Mauro S. M. Rodrigues, Michael Hanselmann, Michael Neuling, Michael Schmitz, Mukesh Ojha, Murilo Opsfelder Araujo, Nicholas Piggin, Parth Y Shah, Paul Mackerras, Paul Menzel, Ram Pai, Randy Dunlap, Rashmica Gupta, Reza Arbab, Rodrigo R. Galvao, Russell Currey, Sam Bobroff, Scott Wood, Shilpasri G Bhat, Simon Guo, Souptick Joarder, Stan Johnson, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain, Vasant Hegde, Venkat Rao B, zhong jiang. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAlt2O6cTHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgC7hD/4+cj796Df7GsVsIMxzQm7SS9dklIdO JuKj2Nr5HRzTH59jWlXukLG9mfTNCFgFJB4gEpK1ArDOTcHTCI9RRsLZTZ/kum66 7Pd+7T40dLYXB5uecuUs0vMXa2fI3syKh1VLzACSXv3Dh9BBIKQBwW/aD2eww4YI 1fS5LnXZ2PSxfr6KNAC6ogZnuaiD0sHXOYrtGHq+S/TFC7+Z6ySa6+AnPS+hPVoo /rHDE1Khr66aj7uk+PP2IgUrCFj6Sbj6hTVlS/iAuwbMjUl9ty6712PmvX9x6wMZ 13hJQI+g6Ci+lqLKqmqVUpXGSr6y4NJGPS/Hko4IivBTJApI+qV/tF2H9nxU+6X0 0RqzsMHPHy13n2torA1gC7ttzOuXPI4hTvm6JWMSsfmfjTxLANJng3Dq3ejh6Bqw 76EMowpDLexwpy7/glPpqNdsP4ySf2Qm8yq3mR7qpL4m3zJVRGs11x+s5DW8NKBL Fl5SqZvd01abH+sHwv6NLaLkEtayUyohxvyqu2RU3zu5M5vi7DhqstybTPjKPGu0 icSPh7b2y10WpOUpC6lxpdi8Me8qH47mVc/trZ+SpgBrsuEmtJhGKszEnzRCOqos o2IhYHQv3lQv86kpaAFQlg/RO+Lv+Lo5qbJ209V+hfU5nYzXpEulZs4dx1fbA+ze fK8GEh+u0L4uJg== =PzRz -----END PGP SIGNATURE----- Merge tag 'powerpc-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Notable changes: - A fix for a bug in our page table fragment allocator, where a page table page could be freed and reallocated for something else while still in use, leading to memory corruption etc. The fix reuses pt_mm in struct page (x86 only) for a powerpc only refcount. - Fixes to our pkey support. Several are user-visible changes, but bring us in to line with x86 behaviour and/or fix outright bugs. Thanks to Florian Weimer for reporting many of these. - A series to improve the hvc driver & related OPAL console code, which have been seen to cause hardlockups at times. The hvc driver changes in particular have been in linux-next for ~month. - Increase our MAX_PHYSMEM_BITS to 128TB when SPARSEMEM_VMEMMAP=y. - Remove Power8 DD1 and Power9 DD1 support, neither chip should be in use anywhere other than as a paper weight. - An optimised memcmp implementation using Power7-or-later VMX instructions - Support for barrier_nospec on some NXP CPUs. - Support for flushing the count cache on context switch on some IBM CPUs (controlled by firmware), as a Spectre v2 mitigation. - A series to enhance the information we print on unhandled signals to bring it into line with other arches, including showing the offending VMA and dumping the instructions around the fault. Thanks to: Aaro Koskinen, Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Alexey Spirkov, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Arnd Bergmann, Bartosz Golaszewski, Benjamin Herrenschmidt, Bharat Bhushan, Bjoern Noetel, Boqun Feng, Breno Leitao, Bryant G. Ly, Camelia Groza, Christophe Leroy, Christoph Hellwig, Cyril Bur, Dan Carpenter, Daniel Klamt, Darren Stevens, Dave Young, David Gibson, Diana Craciun, Finn Thain, Florian Weimer, Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Geoff Levand, Guenter Roeck, Gustavo Romero, Haren Myneni, Hari Bathini, Joel Stanley, Jonathan Neuschäfer, Kees Cook, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Mathieu Malaterre, Mauro S. M. Rodrigues, Michael Hanselmann, Michael Neuling, Michael Schmitz, Mukesh Ojha, Murilo Opsfelder Araujo, Nicholas Piggin, Parth Y Shah, Paul Mackerras, Paul Menzel, Ram Pai, Randy Dunlap, Rashmica Gupta, Reza Arbab, Rodrigo R. Galvao, Russell Currey, Sam Bobroff, Scott Wood, Shilpasri G Bhat, Simon Guo, Souptick Joarder, Stan Johnson, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain, Vasant Hegde, Venkat Rao, zhong jiang" * tag 'powerpc-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (234 commits) powerpc/mm/book3s/radix: Add mapping statistics powerpc/uaccess: Enable get_user(u64, *p) on 32-bit powerpc/mm/hash: Remove unnecessary do { } while(0) loop powerpc/64s: move machine check SLB flushing to mm/slb.c powerpc/powernv/idle: Fix build error powerpc/mm/tlbflush: update the mmu_gather page size while iterating address range powerpc/mm: remove warning about ‘type’ being set powerpc/32: Include setup.h header file to fix warnings powerpc: Move `path` variable inside DEBUG_PROM powerpc/powermac: Make some functions static powerpc/powermac: Remove variable x that's never read cxl: remove a dead branch powerpc/powermac: Add missing include of header pmac.h powerpc/kexec: Use common error handling code in setup_new_fdt() powerpc/xmon: Add address lookup for percpu symbols powerpc/mm: remove huge_pte_offset_and_shift() prototype powerpc/lib: Use patch_site to patch copy_32 functions once cache is enabled powerpc/pseries: Fix endianness while restoring of r3 in MCE handler. powerpc/fadump: merge adjacent memory ranges to reduce PT_LOAD segements powerpc/fadump: handle crash memory ranges array index overflow ...	2018-08-17 11:32:50 -07:00
Linus Torvalds	4e31843f68	pci-v4.19-changes -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlt1f9AUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vxbdhAArnhRvkwOk4m4/LCuKF6HpmlxbBNC TjnBCenNf+lFXzWskfDFGFl/Wif4UzGbRTSCNQrwMzj3Ww3f/6R2QIq9rEJvyNC4 VdxQnaBEZSUgN87q5UGqgdjMTo3zFvlFH6fpb5XDiQ5IX/QZeXeYqoB64w+HvKPU M+IsoOvnA5gb7pMcpchrGUnSfS1e6AqQbbTt6tZflore6YCEA4cH5OnpGx8qiZIp ut+CMBvQjQB01fHeBc/wGrVte4NwXdONrXqpUb4sHF7HqRNfEh0QVyPhvebBi+k1 kquqoBQfPFTqgcab31VOcQhg70dEx+1qGm5/YBAwmhCpHR/g2gioFXoROsr+iUOe BtF6LZr+Y8cySuhJnkCrJBqWvvBaKbJLg0KMbI+7p4o9MZpod2u7LS5LFrlRDyKW 3nz3o+b1+v3tCCKVKIhKo0ljolgkweQtR1f6KIHvq93wBODHVQnAOt9NlPfHVyks ryGBnOhMjoU5hvfexgIWFk9Ph9MEVQSffkI+TeFPO/tyGBfGfQyGtESiXuEaMQaH FGdZHX2RLkY3pWHOtWeMzRHzOnr2XjpDFcAqL3HBGPdJ30K3Umv3WOgoFe2SaocG 0gaddPjKSwwM4Sa/VP+O5cjGuzi7QnczSDdpYjxIGZzBav32hqx4/rsnLw7bHH8y XkEme7cYJc8MGsA= =2Dmn -----END PGP SIGNATURE----- Merge tag 'pci-v4.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: - Decode AER errors with names similar to "lspci" (Tyler Baicar) - Expose AER statistics in sysfs (Rajat Jain) - Clear AER status bits selectively based on the type of recovery (Oza Pawandeep) - Honor "pcie_ports=native" even if HEST sets FIRMWARE_FIRST (Alexandru Gagniuc) - Don't clear AER status bits if we're using the "Firmware-First" strategy where firmware owns the registers (Alexandru Gagniuc) - Use sysfs_match_string() to simplify ASPM sysfs parsing (Andy Shevchenko) - Remove unnecessary includes of <linux/pci-aspm.h> (Bjorn Helgaas) - Defer DPC event handling to work queue (Keith Busch) - Use threaded IRQ for DPC bottom half (Keith Busch) - Print AER status while handling DPC events (Keith Busch) - Work around IDT switch ACS Source Validation erratum (James Puthukattukaran) - Emit diagnostics for all cases of PCIe Link downtraining (Links operating slower than they're capable of) (Alexandru Gagniuc) - Skip VFs when configuring Max Payload Size (Myron Stowe) - Reduce Root Port Max Payload Size if necessary when hot-adding a device below it (Myron Stowe) - Simplify SHPC existence/permission checks (Bjorn Helgaas) - Remove hotplug sample skeleton driver (Lukas Wunner) - Convert pciehp to threaded IRQ handling (Lukas Wunner) - Improve pciehp tolerance of missed events and initially unstable links (Lukas Wunner) - Clear spurious pciehp events on resume (Lukas Wunner) - Add pciehp runtime PM support, including for Thunderbolt controllers (Lukas Wunner) - Support interrupts from pciehp bridges in D3hot (Lukas Wunner) - Mark fall-through switch cases before enabling -Wimplicit-fallthrough (Gustavo A. R. Silva) - Move DMA-debug PCI init from arch code to PCI core (Christoph Hellwig) - Fix pci_request_irq() usage of IRQF_ONESHOT when no handler is supplied (Heiner Kallweit) - Unify PCI and DMA direction #defines (Shunyong Yang) - Add PCI_DEVICE_DATA() macro (Andy Shevchenko) - Check for VPD completion before checking for timeout (Bert Kenward) - Limit Netronome NFP5000 config space size to work around erratum (Jakub Kicinski) - Set IRQCHIP_ONESHOT_SAFE for PCI MSI irqchips (Heiner Kallweit) - Document ACPI description of PCI host bridges (Bjorn Helgaas) - Add "pci=disable_acs_redir=" parameter to disable ACS redirection for peer-to-peer DMA support (we don't have the peer-to-peer support yet; this is just one piece) (Logan Gunthorpe) - Clean up devm_of_pci_get_host_bridge_resources() resource allocation (Jan Kiszka) - Fixup resizable BARs after suspend/resume (Christian König) - Make "pci=earlydump" generic (Sinan Kaya) - Fix ROM BAR access routines to stay in bounds and check for signature correctly (Rex Zhu) - Add DMA alias quirk for Microsemi Switchtec NTB (Doug Meyer) - Expand documentation for pci_add_dma_alias() (Logan Gunthorpe) - To avoid bus errors, enable PASID only if entire path supports End-End TLP prefixes (Sinan Kaya) - Unify slot and bus reset functions and remove hotplug knowledge from callers (Sinan Kaya) - Add Function-Level Reset quirks for Intel and Samsung NVMe devices to fix guest reboot issues (Alex Williamson) - Add function 1 DMA alias quirk for Marvell 88SS9183 PCIe SSD Controller (Bjorn Helgaas) - Remove Xilinx AXI-PCIe host bridge arch dependency (Palmer Dabbelt) - Remove Aardvark outbound window configuration (Evan Wang) - Fix Aardvark bridge window sizing issue (Zachary Zhang) - Convert Aardvark to use pci_host_probe() to reduce code duplication (Thomas Petazzoni) - Correct the Cadence cdns_pcie_writel() signature (Alan Douglas) - Add Cadence support for optional generic PHYs (Alan Douglas) - Add Cadence power management ops (Alan Douglas) - Remove redundant variable from Cadence driver (Colin Ian King) - Add Kirin MSI support (Xiaowei Song) - Drop unnecessary root_bus_nr setting from exynos, imx6, keystone, armada8k, artpec6, designware-plat, histb, qcom, spear13xx (Shawn Guo) - Move link notification settings from DesignWare core to individual drivers (Gustavo Pimentel) - Add endpoint library MSI-X interfaces (Gustavo Pimentel) - Correct signature of endpoint library IRQ interfaces (Gustavo Pimentel) - Add DesignWare endpoint library MSI-X callbacks (Gustavo Pimentel) - Add endpoint library MSI-X test support (Gustavo Pimentel) - Remove unnecessary GFP_ATOMIC from Hyper-V "new child" allocation (Jia-Ju Bai) - Add more devices to Broadcom PAXC quirk (Ray Jui) - Work around corrupted Broadcom PAXC config space to enable SMMU and GICv3 ITS (Ray Jui) - Disable MSI parsing to work around broken Broadcom PAXC logic in some devices (Ray Jui) - Hide unconfigured functions to work around a Broadcom PAXC defect (Ray Jui) - Lower iproc log level to reduce console output during boot (Ray Jui) - Fix mobiveil iomem/phys_addr_t type usage (Lorenzo Pieralisi) - Fix mobiveil missing include file (Lorenzo Pieralisi) - Add mobiveil Kconfig/Makefile support (Lorenzo Pieralisi) - Fix mvebu I/O space remapping issues (Thomas Petazzoni) - Use generic pci_host_bridge in mvebu instead of ARM-specific API (Thomas Petazzoni) - Whitelist VMD devices with fast interrupt handlers to avoid sharing vectors with slow handlers (Keith Busch) * tag 'pci-v4.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (153 commits) PCI/AER: Don't clear AER bits if error handling is Firmware-First PCI: Limit config space size for Netronome NFP5000 PCI/MSI: Set IRQCHIP_ONESHOT_SAFE for PCI-MSI irqchips PCI/VPD: Check for VPD access completion before checking for timeout PCI: Add PCI_DEVICE_DATA() macro to fully describe device ID entry PCI: Match Root Port's MPS to endpoint's MPSS as necessary PCI: Skip MPS logic for Virtual Functions (VFs) PCI: Add function 1 DMA alias quirk for Marvell 88SS9183 PCI: Check for PCIe Link downtraining PCI: Add ACS Redirect disable quirk for Intel Sunrise Point PCI: Add device-specific ACS Redirect disable infrastructure PCI: Convert device-specific ACS quirks from NULL termination to ARRAY_SIZE PCI: Add "pci=disable_acs_redir=" parameter for peer-to-peer support PCI: Allow specifying devices using a base bus and path of devfns PCI: Make specifying PCI devices in kernel parameters reusable PCI: Hide ACS quirk declarations inside PCI core PCI: Delay after FLR of Intel DC P3700 NVMe PCI: Disable Samsung SM961/PM961 NVMe before FLR PCI: Export pcie_has_flr() PCI: mvebu: Drop bogus comment above mvebu_pcie_map_registers() ...	2018-08-16 09:21:54 -07:00
Linus Torvalds	dafa5f6577	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Fix dcache flushing crash in skcipher. - Add hash finup self-tests. - Reschedule during speed tests. Algorithms: - Remove insecure vmac and replace it with vmac64. - Add public key verification for DH/ECDH. Drivers: - Decrease priority of sha-mb on x86. - Improve NEON latency/throughput on ARM64. - Add md5/sha384/sha512/des/3des to inside-secure. - Support eip197d in inside-secure. - Only register algorithms supported by the host in virtio. - Add cts and remove incompatible cts1 from ccree. - Add hisilicon SEC security accelerator driver. - Replace msm hwrng driver with qcom pseudo rng driver. Misc: - Centralize CRC polynomials" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (121 commits) crypto: arm64/ghash-ce - implement 4-way aggregation crypto: arm64/ghash-ce - replace NEON yield check with block limit crypto: hisilicon - sec_send_request() can be static lib/mpi: remove redundant variable esign crypto: arm64/aes-ce-gcm - don't reload key schedule if avoidable crypto: arm64/aes-ce-gcm - implement 2-way aggregation crypto: arm64/aes-ce-gcm - operate on two input blocks at a time crypto: dh - make crypto_dh_encode_key() make robust crypto: dh - fix calculating encoded key size crypto: ccp - Check for NULL PSP pointer at module unload crypto: arm/chacha20 - always use vrev for 16-bit rotates crypto: ccree - allow bigger than sector XTS op crypto: ccree - zero all of request ctx before use crypto: ccree - remove cipher ivgen left overs crypto: ccree - drop useless type flag during reg crypto: ablkcipher - fix crash flushing dcache in error path crypto: blkcipher - fix crash flushing dcache in error path crypto: skcipher - fix crash flushing dcache in error path crypto: skcipher - remove unnecessary setting of walk->nbytes crypto: scatterwalk - remove scatterwalk_samebuf() ...	2018-08-15 16:01:47 -07:00
Linus Torvalds	9a76aba02a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Highlights: - Gustavo A. R. Silva keeps working on the implicit switch fallthru changes. - Support 802.11ax High-Efficiency wireless in cfg80211 et al, From Luca Coelho. - Re-enable ASPM in r8169, from Kai-Heng Feng. - Add virtual XFRM interfaces, which avoids all of the limitations of existing IPSEC tunnels. From Steffen Klassert. - Convert GRO over to use a hash table, so that when we have many flows active we don't traverse a long list during accumluation. - Many new self tests for routing, TC, tunnels, etc. Too many contributors to mention them all, but I'm really happy to keep seeing this stuff. - Hardware timestamping support for dpaa_eth/fsl-fman from Yangbo Lu. - Lots of cleanups and fixes in L2TP code from Guillaume Nault. - Add IPSEC offload support to netdevsim, from Shannon Nelson. - Add support for slotting with non-uniform distribution to netem packet scheduler, from Yousuk Seung. - Add UDP GSO support to mlx5e, from Boris Pismenny. - Support offloading of Team LAG in NFP, from John Hurley. - Allow to configure TX queue selection based upon RX queue, from Amritha Nambiar. - Support ethtool ring size configuration in aquantia, from Anton Mikaev. - Support DSCP and flowlabel per-transport in SCTP, from Xin Long. - Support list based batching and stack traversal of SKBs, this is very exciting work. From Edward Cree. - Busyloop optimizations in vhost_net, from Toshiaki Makita. - Introduce the ETF qdisc, which allows time based transmissions. IGB can offload this in hardware. From Vinicius Costa Gomes. - Add parameter support to devlink, from Moshe Shemesh. - Several multiplication and division optimizations for BPF JIT in nfp driver, from Jiong Wang. - Lots of prepatory work to make more of the packet scheduler layer lockless, when possible, from Vlad Buslov. - Add ACK filter and NAT awareness to sch_cake packet scheduler, from Toke Høiland-Jørgensen. - Support regions and region snapshots in devlink, from Alex Vesker. - Allow to attach XDP programs to both HW and SW at the same time on a given device, with initial support in nfp. From Jakub Kicinski. - Add TLS RX offload and support in mlx5, from Ilya Lesokhin. - Use PHYLIB in r8169 driver, from Heiner Kallweit. - All sorts of changes to support Spectrum 2 in mlxsw driver, from Ido Schimmel. - PTP support in mv88e6xxx DSA driver, from Andrew Lunn. - Make TCP_USER_TIMEOUT socket option more accurate, from Jon Maxwell. - Support for templates in packet scheduler classifier, from Jiri Pirko. - IPV6 support in RDS, from Ka-Cheong Poon. - Native tproxy support in nf_tables, from Máté Eckl. - Maintain IP fragment queue in an rbtree, but optimize properly for in-order frags. From Peter Oskolkov. - Improvde handling of ACKs on hole repairs, from Yuchung Cheng" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1996 commits) bpf: test: fix spelling mistake "REUSEEPORT" -> "REUSEPORT" hv/netvsc: Fix NULL dereference at single queue mode fallback net: filter: mark expected switch fall-through xen-netfront: fix warn message as irq device name has '/' cxgb4: Add new T5 PCI device ids 0x50af and 0x50b0 net: dsa: mv88e6xxx: missing unlock on error path rds: fix building with IPV6=m inet/connection_sock: prefer _THIS_IP_ to current_text_addr net: dsa: mv88e6xxx: bitwise vs logical bug net: sock_diag: Fix spectre v1 gadget in __sock_diag_cmd() ieee802154: hwsim: using right kind of iteration net: hns3: Add vlan filter setting by ethtool command -K net: hns3: Set tx ring' tc info when netdev is up net: hns3: Remove tx ring BD len register in hns3_enet net: hns3: Fix desc num set to default when setting channel net: hns3: Fix for phy link issue when using marvell phy driver net: hns3: Fix for information of phydev lost problem when down/up net: hns3: Fix for command format parsing error in hclge_is_all_function_id_zero net: hns3: Add support for serdes loopback selftest bnxt_en: take coredump_record structure off stack ...	2018-08-15 15:04:25 -07:00
Linus Torvalds	fa1b5d09d0	Consolidation of Kconfig files by Christoph Hellwig. Move the source statements of arch-independent Kconfig files instead of duplicating the includes in every arch/$(SRCARCH)/Kconfig. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJbdFsfAAoJED2LAQed4NsGxHsP/1tmA57OOOj8oGxO2OXhXVbr Q0MZqCoV4bqMvK/hgCQdl9f+tp0m+j12x4xDLdVf4OqnTXMbqvPDu3uQVKvaj/k1 gHhsFA1tFgSbuJ8InltUsrPEQqbceeJsj50xHVAKijqI6LYeRPPSU7aE9obn+OzH n2nd5sLKvMI/dqdJvW6i5KPydqTH3r3iA7D+ne/XQj0s0EMXvXUPmDT1+ijTnM4a yfm6W5p7L/c3Ugf1Pz5PfnPl4BxBwZMfW5ie/UO8j5C6Rl0iPaOGuuHurocaaJb3 MefR/7NEAR3G8MhJyL2+70jbbwhjpqR2b5ooz1vpuulPHxjeU45BY60XIBWq1afR ewsc12MMCYB695ieYWoHdaWgxD/jhffyRuajfpkXKIZEMgDxS03sMhdULXENVMx1 M0ZQ01g/NLWt9ti9DY3eTKB3ymOhnBa1sa77nGGUHkITq4DQKwPX1J9FP/HT6RNt uOvzeH5kGzc7tqOlZAO0kHbwhQG1uqGcd78IYd4lgf/XfkSgDERTWjnJmnQbwr9m 3PFuST2u8eyO+8Lh1MK76TXOEkXsHMdFugPmb6SlgtMEPKGVLDPlsj52o/LFtgzl eygfMiBFr2+ttkZ6IpNcpmQ4IztmDpz6XoMk3PqDAfUTUSYpCnq1gAEuff/eisCM Odva1ZZaeQ7WpxhsP8rr =gsQJ -----END PGP SIGNATURE----- Merge tag 'kconfig-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kconfig consolidation from Masahiro Yamada: "Consolidation of Kconfig files by Christoph Hellwig. Move the source statements of arch-independent Kconfig files instead of duplicating the includes in every arch/$(SRCARCH)/Kconfig" * tag 'kconfig-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: add a Memory Management options" menu kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt kconfig: use a menu in arch/Kconfig to reduce clutter kconfig: include kernel/Kconfig.preempt from init/Kconfig Kconfig: consolidate the "Kernel hacking" menu kconfig: include common Kconfig files from top-level Kconfig kconfig: remove duplicate SWAP symbol defintions um: create a proper drivers Kconfig um: cleanup Kconfig files um: stop abusing KBUILD_KCONFIG	2018-08-15 13:05:12 -07:00
Bjorn Helgaas	a40f72db8a	Merge branch 'pci/misc' - Mark fall-through switch cases before enabling -Wimplicit-fallthrough (Gustavo A. R. Silva) - Move DMA-debug PCI init from arch code to PCI core (Christoph Hellwig) - Fix pci_request_irq() usage of IRQF_ONESHOT when no handler is supplied (Heiner Kallweit) - Unify PCI and DMA direction #defines (Shunyong Yang) - Add PCI_DEVICE_DATA() macro (Andy Shevchenko) - Check for VPD completion before checking for timeout (Bert Kenward) - Limit Netronome NFP5000 config space size to work around erratum (Jakub Kicinski) * pci/misc: PCI: Limit config space size for Netronome NFP5000 PCI/VPD: Check for VPD access completion before checking for timeout PCI: Add PCI_DEVICE_DATA() macro to fully describe device ID entry PCI: Unify PCI and normal DMA direction definitions PCI: Use IRQF_ONESHOT if pci_request_irq() called with no handler PCI: Call dma_debug_add_bus() for pci_bus_type from PCI core PCI: Mark fall-through switch cases before enabling -Wimplicit-fallthrough # Conflicts: # drivers/pci/hotplug/pciehp_ctrl.c	2018-08-15 14:58:54 -05:00
Linus Torvalds	e026bcc561	Kbuild updates for v4.19 - verify depmod is installed before modules_install - support build salt in case build ids must be unique between builds - allow users to specify additional host compiler flags via HOSTFLAGS, and rename internal variables to KBUILD_HOSTFLAGS - update buildtar script to drop vax support, add arm64 support - update builddeb script for better debarch support - document the pit-fall of if_changed usage - fix parallel build of UML with O= option - make 'samples' target depend on headers_install to fix build errors - remove deprecated host-progs variable - add a new coccinelle script for refcount_t vs atomic_t check - improve double-test coccinelle script - misc cleanups and fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJbdFZ0AAoJED2LAQed4NsGcHYP/23txxk3GRP7O4UkfPw9Rtky MHiXTgcoy2vbG+l12BgzWX+qFii8XTUe3dQtK4HnGQFUIBtEBV/hpZPJtxfgGSev Zou5cv1kr5rNzTkCn//TG3O6/WIkTBCe2hahDCtmGDI3kd/cPK4dHbU/q6KpaqIJ qzZYBXIvCeu2GM8idQoCRrwdMpgu1pBz1gz2sDje1yHH2toI7T6cXHRLQDgx+HPq LIP7W9GUsoDdXjecvPD51LiW89E6BUxETBh5Ft9r9uzwB5ylQQMcw6Qyu2DiYDUX PPsHCMiolYV+Ttcy+vj/67KOvKmEaFotssck+RD/xDCF17zKhRkup+YM8kPLHTVZ TcAUZadbnT6U/s2W6GFwvVbN/P7cc3aif+aNCC/Pl23yagp3pydlSCocYxQgiVR7 /rx48haYDEgu/MJ1X0dOpSO0ErY7zu2OoAlNerW+D9QizwbP+WtZO/CJH8SxQRuN dQ1xmyNrie+ODgi9tbc4eBrsb+1rioX927TP5MbJcfXt5CTsxDmIqop5XwyYIoQN ZWWlzC8Ii3P2trAVpBgM2IEbngSxwr6T9Wbf1ScJnPKr/o1rq+pBk49cYstTz3kQ OwJ8gPwUrkW4R+hlD7L6mL/WcrKzZBQS0Ij1QW2kVSEhRrsKo99psE1/rGehnHu9 KGB0LYYCqGSOHR4zOjg0 =VjfG -----END PGP SIGNATURE----- Merge tag 'kbuild-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - verify depmod is installed before modules_install - support build salt in case build ids must be unique between builds - allow users to specify additional host compiler flags via HOSTFLAGS, and rename internal variables to KBUILD_HOSTFLAGS - update buildtar script to drop vax support, add arm64 support - update builddeb script for better debarch support - document the pit-fall of if_changed usage - fix parallel build of UML with O= option - make 'samples' target depend on headers_install to fix build errors - remove deprecated host-progs variable - add a new coccinelle script for refcount_t vs atomic_t check - improve double-test coccinelle script - misc cleanups and fixes * tag 'kbuild-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (41 commits) coccicheck: return proper error code on fail Coccinelle: doubletest: reduce side effect false positives kbuild: remove deprecated host-progs variable kbuild: make samples really depend on headers_install um: clean up archheaders recipe kbuild: add %asm-generic to no-dot-config-targets um: fix parallel building with O= option scripts: Add Python 3 support to tracing/draw_functrace.py builddeb: Add automatic support for sh{3,4}{,eb} architectures builddeb: Add automatic support for riscv* architectures builddeb: Add automatic support for m68k architecture builddeb: Add automatic support for or1k architecture builddeb: Add automatic support for sparc64 architecture builddeb: Add automatic support for mips{,64}r6{,el} architectures builddeb: Add automatic support for mips64el architecture builddeb: Add automatic support for ppc64 and powerpcspe architectures builddeb: Introduce functions to simplify kconfig tests in set_debarch builddeb: Drop check for 32-bit s390 builddeb: Change architecture detection fallback to use dpkg-architecture builddeb: Skip architecture detection when KBUILD_DEBARCH is set ...	2018-08-15 12:09:03 -07:00
Paul Mackerras	c066fafc59	KVM: PPC: Book3S HV: Use correct pagesize in kvm_unmap_radix() Since commit `e641a31783` ("KVM: PPC: Book3S HV: Unify dirty page map between HPT and radix", 2017-10-26), kvm_unmap_radix() computes the number of PAGE_SIZEd pages being unmapped and passes it to kvmppc_update_dirty_map(), which expects to be passed the page size instead. Consequently it will only mark one system page dirty even when a large page (for example a THP page) is being unmapped. The consequence of this is that part of the THP page might not get copied during live migration, resulting in memory corruption for the guest. This fixes it by computing and passing the page size in kvm_unmap_radix(). Cc: stable@vger.kernel.org # v4.15+ Fixes: `e641a31783` (KVM: PPC: Book3S HV: Unify dirty page map between HPT and radix) Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-08-15 14:39:27 +10:00
Nicholas Piggin	993ff6d9df	powerpc/64s: Fix PACA_IRQ_HARD_DIS accounting in idle_power4() When idle_power4() hard disables interrupts then finds a soft pending interrupt, it returns with interrupts hard disabled but without PACA_IRQ_HARD_DIS set. Commit `9b81c0211c` ("powerpc/64s: make PACA_IRQ_HARD_DIS track MSR[EE] closely") added a warning for that condition (since disabled). Fix this by adding the PACA_IRQ_HARD_DIS for that case. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-14 15:36:02 +10:00
Linus Torvalds	8603596a32	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf update from Thomas Gleixner: "The perf crowd presents: Kernel updates: - Removal of jprobes - Cleanup and consolidatation the handling of kprobes - Cleanup and consolidation of hardware breakpoints - The usual pile of fixes and updates to PMUs and event descriptors Tooling updates: - Updates and improvements all over the place. Nothing outstanding, just the (good) boring incremental grump work" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits) perf trace: Do not require --no-syscalls to suppress strace like output perf bpf: Include uapi/linux/bpf.h from the 'perf trace' script's bpf.h perf tools: Allow overriding MAX_NR_CPUS at compile time perf bpf: Show better message when failing to load an object perf list: Unify metric group description format with PMU event description perf vendor events arm64: Update ThunderX2 implementation defined pmu core events perf cs-etm: Generate branch sample for CS_ETM_TRACE_ON packet perf cs-etm: Generate branch sample when receiving a CS_ETM_TRACE_ON packet perf cs-etm: Support dummy address value for CS_ETM_TRACE_ON packet perf cs-etm: Fix start tracing packet handling perf build: Fix installation directory for eBPF perf c2c report: Fix crash for empty browser perf tests: Fix indexing when invoking subtests perf trace: Beautify the AF_INET & AF_INET6 'socket' syscall 'protocol' args perf trace beauty: Add beautifiers for 'socket''s 'protocol' arg perf trace beauty: Do not print NULL strarray entries perf beauty: Add a generator for IPPROTO_ socket's protocol constants tools include uapi: Grab a copy of linux/in.h perf tests: Fix complex event name parsing perf evlist: Fix error out while applying initial delay and LBR ...	2018-08-13 12:55:49 -07:00
Linus Torvalds	de5d1b39ea	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking/atomics update from Thomas Gleixner: "The locking, atomics and memory model brains delivered: - A larger update to the atomics code which reworks the ordering barriers, consolidates the atomic primitives, provides the new atomic64_fetch_add_unless() primitive and cleans up the include hell. - Simplify cmpxchg() instrumentation and add instrumentation for xchg() and cmpxchg_double(). - Updates to the memory model and documentation" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits) locking/atomics: Rework ordering barriers locking/atomics: Instrument cmpxchg_double() locking/atomics: Instrument xchg() locking/atomics: Simplify cmpxchg() instrumentation locking/atomics/x86: Reduce arch_cmpxchg64() instrumentation tools/memory-model: Rename litmus tests to comply to norm7 tools/memory-model/Documentation: Fix typo, smb->smp sched/Documentation: Update wake_up() & co. memory-barrier guarantees locking/spinlock, sched/core: Clarify requirements for smp_mb__after_spinlock() sched/core: Use smp_mb() in wake_woken_function() tools/memory-model: Add informal LKMM documentation to MAINTAINERS locking/atomics/Documentation: Describe atomic_set() as a write operation tools/memory-model: Make scripts executable tools/memory-model: Remove ACCESS_ONCE() from model tools/memory-model: Remove ACCESS_ONCE() from recipes locking/memory-barriers.txt/kokr: Update Korean translation to fix broken DMA vs. MMIO ordering example MAINTAINERS: Add Daniel Lustig as an LKMM reviewer tools/memory-model: Fix ISA2+pooncelock+pooncelock+pombonce name tools/memory-model: Add litmus test for full multicopy atomicity locking/refcount: Always allow checked forms ...	2018-08-13 12:23:39 -07:00
Linus Torvalds	f7951c33f0	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Thomas Gleixner: - Cleanup and improvement of NUMA balancing - Refactoring and improvements to the PELT (Per Entity Load Tracking) code - Watchdog simplification and related cleanups - The usual pile of small incremental fixes and improvements * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) watchdog: Reduce message verbosity stop_machine: Reflow cpu_stop_queue_two_works() sched/numa: Move task_numa_placement() closer to numa_migrate_preferred() sched/numa: Use group_weights to identify if migration degrades locality sched/numa: Update the scan period without holding the numa_group lock sched/numa: Remove numa_has_capacity() sched/numa: Modify migrate_swap() to accept additional parameters sched/numa: Remove unused task_capacity from 'struct numa_stats' sched/numa: Skip nodes that are at 'hoplimit' sched/debug: Reverse the order of printing faults sched/numa: Use task faults only if numa_group is not yet set up sched/numa: Set preferred_node based on best_cpu sched/numa: Simplify load_too_imbalanced() sched/numa: Evaluate move once per node sched/numa: Remove redundant field sched/debug: Show the sum wait time of a task group sched/fair: Remove #ifdefs from scale_rt_capacity() sched/core: Remove get_cpu() from sched_fork() sched/cpufreq: Clarify sugov_get_util() sched/sysctl: Remove unused sched_time_avg_ms sysctl ...	2018-08-13 11:25:07 -07:00
Aneesh Kumar K.V	a2dc009afa	powerpc/mm/book3s/radix: Add mapping statistics Add statistics that show how memory is mapped within the kernel linear mapping. This is similar to commit `37cd944c8d` ("s390/pgtable: add mapping statistics") We don't do this with Hash translation mode. Hash uses one size (mmu_linear_psize) to map the kernel linear mapping and we print the linear psize during boot as below. "Page orders: linear mapping = 24, virtual = 16, io = 16, vmemmap = 24" A sample output looks like: DirectMap4k: 0 kB DirectMap64k: 18432 kB DirectMap2M: 1030144 kB DirectMap1G: 11534336 kB Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-13 16:35:05 +10:00
Michael Ellerman	241b5f7ffc	Merge branch 'next' of https://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next Merge some updates from Scott: "This contains an 8xx compilation fix, and a dpaa device tree fix."	2018-08-13 16:06:23 +10:00
Michael Ellerman	b3124ec2f9	Merge branch 'fixes' into next Merge our fixes branch from the 4.18 cycle to resolve some minor conflicts.	2018-08-13 15:59:06 +10:00
Michael Ellerman	f7a6947cd4	powerpc/uaccess: Enable get_user(u64, *p) on 32-bit Currently if you build a 32-bit powerpc kernel and use get_user() to load a u64 value it will fail to build with eg: kernel/rseq.o: In function `rseq_get_rseq_cs': kernel/rseq.c:123: undefined reference to `__get_user_bad' This is hitting the check in __get_user_size() that makes sure the size we're copying doesn't exceed the size of the destination: #define __get_user_size(x, ptr, size, retval) do { retval = 0; __chk_user_ptr(ptr); if (size > sizeof(x)) (x) = __get_user_bad(); Which doesn't immediately make sense because the size of the destination is u64, but it's not really, because __get_user_check() etc. internally create an unsigned long and copy into that: #define __get_user_check(x, ptr, size) ({ long __gu_err = -EFAULT; unsigned long __gu_val = 0; The problem being that on 32-bit unsigned long is not big enough to hold a u64. We can fix this with a trick from hpa in the x86 code, we statically check the type of x and set the type of __gu_val to either unsigned long or unsigned long long. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:26:10 +10:00
Aneesh Kumar K.V	f405b510c9	powerpc/mm/hash: Remove unnecessary do { } while(0) loop Avoid coverity false warnings like: *** CID 187347: Control flow issues (UNREACHABLE) /arch/powerpc/mm/hash_native_64.c: 819 in native_flush_hash_range() 813 slot += hidx & _PTEIDX_GROUP_IX; 814 hptep = htab_address + slot; 815 want_v = hpte_encode_avpn(vpn, psize, ssize); 816 hpte_v = hpte_get_old_v(hptep); 817 818 if (!HPTE_V_COMPARE(hpte_v, want_v) \|\| !(hpte_v & HPTE_V_VALID)) >>> CID 187347: Control flow issues (UNREACHABLE) Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:25:57 +10:00
Nicholas Piggin	e7e8184747	powerpc/64s: move machine check SLB flushing to mm/slb.c The machine check code that flushes and restores bolted segments in real mode belongs in mm/slb.c. This will also be used by pseries machine check and idle code in future changes. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:39 +10:00
Aneesh Kumar K.V	ae24ce5e12	powerpc/powernv/idle: Fix build error Fix the below build error using strlcpy instead of strncpy In function 'pnv_parse_cpuidle_dt', inlined from 'pnv_init_idle_states' at arch/powerpc/platforms/powernv/idle.c:840:7, inlined from '__machine_initcall_powernv_pnv_init_idle_states' at arch/powerpc/platforms/powernv/idle.c:870:1: arch/powerpc/platforms/powernv/idle.c:820:3: error: 'strncpy' specified bound 16 equals destination size [-Werror=stringop-truncation] strncpy(pnv_idle_states[i].name, temp_string[i], ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ PNV_IDLE_NAME_LEN); Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:39 +10:00
Aneesh Kumar K.V	0b6aa1a20a	powerpc/mm/tlbflush: update the mmu_gather page size while iterating address range This patch makes sure we update the mmu_gather page size even if we are requesting for a fullmm flush. This avoids triggering VM_WARN_ON in code paths like __tlb_remove_page_size that explicitly check for removing range page size to be same as mmu gather page size. Fixes: `5a6099346c` ("powerpc/64s/radix: tlb do not flush on page size when fullmm") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:39 +10:00
Mathieu Malaterre	fce278af81	powerpc/mm: remove warning about ‘type’ being set ‘type’ is only used when CONFIG_DEBUG_HIGHMEM is set. So add a possibly unused tag to variable. Remove warning treated as error with W=1: arch/powerpc/mm/highmem.c:59:6: error: variable ‘type’ set but not used [-Werror=unused-but-set-variable] Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:38 +10:00
Mathieu Malaterre	f2c6d0d109	powerpc/32: Include setup.h header file to fix warnings Make sure to include setup.h to provide the following prototypes: - irqstack_early_init - setup_power_save - initialize_cache_info Fix the following warnings (treated as error in W=1): arch/powerpc/kernel/setup_32.c:198:13: error: no previous prototype for ‘irqstack_early_init’ arch/powerpc/kernel/setup_32.c:238:13: error: no previous prototype for ‘setup_power_save’ arch/powerpc/kernel/setup_32.c:253:13: error: no previous prototype for ‘initialize_cache_info’ Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:38 +10:00
Mathieu Malaterre	eab00a208e	powerpc: Move `path` variable inside DEBUG_PROM Add gcc attribute unused for two variables. Fix warnings treated as errors with W=1: arch/powerpc/kernel/prom_init.c:1388:8: error: variable ‘path’ set but not used [-Werror=unused-but-set-variable] Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:38 +10:00
Mathieu Malaterre	618a89d738	powerpc/powermac: Make some functions static These functions can all be static, make it so. Fix warnings treated as errors with W=1: arch/powerpc/platforms/powermac/pci.c:1022:6: error: no previous prototype for ‘pmac_pci_fixup_ohci’ arch/powerpc/platforms/powermac/pci.c:1057:6: error: no previous prototype for ‘pmac_pci_fixup_cardbus’ arch/powerpc/platforms/powermac/pci.c:1094:6: error: no previous prototype for ‘pmac_pci_fixup_pciata’ Remove has_address declaration and assignment since it's not used. Also add gcc attribute unused to fix a warning treated as error with W=1: arch/powerpc/platforms/powermac/pci.c:784:19: error: variable ‘has_address’ set but not used arch/powerpc/platforms/powermac/pci.c:907:22: error: variable ‘ht’ set but not used Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:37 +10:00
Mathieu Malaterre	8921305c1e	powerpc/powermac: Remove variable x that's never read Since the value of x is never intended to be read, remove it. Fix warning treated as error with W=1: arch/powerpc/platforms/powermac/udbg_scc.c:76:9: error: variable ‘x’ set but not used Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Mathieu Malaterre <malat@debian.org> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:37 +10:00
Mathieu Malaterre	2fff0f07b8	powerpc/powermac: Add missing include of header pmac.h The header `pmac.h` was not included, leading to the following warnings, treated as error with W=1: arch/powerpc/platforms/powermac/time.c:69:13: error: no previous prototype for ‘pmac_time_init’ [-Werror=missing-prototypes] arch/powerpc/platforms/powermac/time.c:207:15: error: no previous prototype for ‘pmac_get_boot_time’ [-Werror=missing-prototypes] arch/powerpc/platforms/powermac/time.c:222:6: error: no previous prototype for ‘pmac_get_rtc_time’ [-Werror=missing-prototypes] arch/powerpc/platforms/powermac/time.c:240:5: error: no previous prototype for ‘pmac_set_rtc_time’ [-Werror=missing-prototypes] arch/powerpc/platforms/powermac/time.c:259:12: error: no previous prototype for ‘via_calibrate_decr’ [-Werror=missing-prototypes] arch/powerpc/platforms/powermac/time.c:311:13: error: no previous prototype for ‘pmac_calibrate_decr’ [-Werror=missing-prototypes] The function `via_calibrate_decr` was made static to silence a warning. Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:36 +10:00
Markus Elfring	baedcdf505	powerpc/kexec: Use common error handling code in setup_new_fdt() Add a jump target so that a bit of exception handling can be better reused at the end of this function. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:36 +10:00
Boqun Feng	302c7b0c4f	powerpc/xmon: Add address lookup for percpu symbols Currently, in xmon, there is no obvious way to get an address for a percpu symbol for a particular cpu. Having such an ability would be good for debugging the system when percpu variables got involved. Therefore, this patch introduces a new xmon command "lp" to lookup the address for percpu symbols. Usage of "lp" is similar to "ls", except that we could add a cpu number to choose the variable of which cpu we want to lookup. If no cpu number is given, lookup for current cpu. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:35 +10:00
Christophe Leroy	646dbe40fa	powerpc/mm: remove huge_pte_offset_and_shift() prototype huge_pte_offset_and_shift() has never existed Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:35 +10:00
Christophe Leroy	fa54a981ea	powerpc/lib: Use patch_site to patch copy_32 functions once cache is enabled The symbol memcpy_nocache_branch defined in order to allow patching of memset function once cache is enabled leads to confusing reports by perf tool. Using the new patch_site functionality solves this issue. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:35 +10:00
Mahesh Salgaonkar	cd813e1cd7	powerpc/pseries: Fix endianness while restoring of r3 in MCE handler. During Machine Check interrupt on pseries platform, register r3 points RTAS extended event log passed by hypervisor. Since hypervisor uses r3 to pass pointer to rtas log, it stores the original r3 value at the start of the memory (first 8 bytes) pointed by r3. Since hypervisor stores this info and rtas log is in BE format, linux should make sure to restore r3 value in correct endian format. Without this patch when MCE handler, after recovery, returns to code that that caused the MCE may end up with Data SLB access interrupt for invalid address followed by kernel panic or hang. Severe Machine check interrupt [Recovered] NIP [d00000000ca301b8]: init_module+0x1b8/0x338 [bork_kernel] Initiator: CPU Error type: SLB [Multihit] Effective address: d00000000ca70000 cpu 0xa: Vector: 380 (Data SLB Access) at [c0000000fc7775b0] pc: c0000000009694c0: vsnprintf+0x80/0x480 lr: c0000000009698e0: vscnprintf+0x20/0x60 sp: c0000000fc777830 msr: 8000000002009033 dar: a803a30c000000d0 current = 0xc00000000bc9ef00 paca = 0xc00000001eca5c00 softe: 3 irq_happened: 0x01 pid = 8860, comm = insmod vscnprintf+0x20/0x60 vprintk_emit+0xb4/0x4b0 vprintk_func+0x5c/0xd0 printk+0x38/0x4c init_module+0x1c0/0x338 [bork_kernel] do_one_initcall+0x54/0x230 do_init_module+0x8c/0x248 load_module+0x12b8/0x15b0 sys_finit_module+0xa8/0x110 system_call+0x58/0x6c --- Exception: c00 (System Call) at 00007fff8bda0644 SP (7fffdfbfe980) is in userspace This patch fixes this issue. Fixes: `a08a53ea4c` ("powerpc/le: Enable RTAS events support") Cc: stable@vger.kernel.org # v3.15+ Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:34 +10:00
Hari Bathini	ced1bf52f4	powerpc/fadump: merge adjacent memory ranges to reduce PT_LOAD segements With dynamic memory allocation support for crash memory ranges array, there is no hard limit on the no. of crash memory ranges kernel could export, but program headers count could overflow in the /proc/vmcore ELF file while exporting each memory range as PT_LOAD segment. Reduce the likelihood of a such scenario, by folding adjacent crash memory ranges which minimizes the total number of PT_LOAD segments. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:34 +10:00
Hari Bathini	1bd6a1c4b8	powerpc/fadump: handle crash memory ranges array index overflow Crash memory ranges is an array of memory ranges of the crashing kernel to be exported as a dump via /proc/vmcore file. The size of the array is set based on INIT_MEMBLOCK_REGIONS, which works alright in most cases where memblock memory regions count is less than INIT_MEMBLOCK_REGIONS value. But this count can grow beyond INIT_MEMBLOCK_REGIONS value since commit `142b45a72e` ("memblock: Add array resizing support"). On large memory systems with a few DLPAR operations, the memblock memory regions count could be larger than INIT_MEMBLOCK_REGIONS value. On such systems, registering fadump results in crash or other system failures like below: task: c00007f39a290010 ti: c00000000b738000 task.ti: c00000000b738000 NIP: c000000000047df4 LR: c0000000000f9e58 CTR: c00000000010f180 REGS: c00000000b73b570 TRAP: 0300 Tainted: G L X (4.4.140+) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 22004484 XER: 20000000 CFAR: c000000000008500 DAR: 000007a450000000 DSISR: 40000000 SOFTE: 0 ... NIP [c000000000047df4] smp_send_reschedule+0x24/0x80 LR [c0000000000f9e58] resched_curr+0x138/0x160 Call Trace: resched_curr+0x138/0x160 (unreliable) check_preempt_curr+0xc8/0xf0 ttwu_do_wakeup+0x38/0x150 try_to_wake_up+0x224/0x4d0 __wake_up_common+0x94/0x100 ep_poll_callback+0xac/0x1c0 __wake_up_common+0x94/0x100 __wake_up_sync_key+0x70/0xa0 sock_def_readable+0x58/0xa0 unix_stream_sendmsg+0x2dc/0x4c0 sock_sendmsg+0x68/0xa0 ___sys_sendmsg+0x2cc/0x2e0 __sys_sendmsg+0x5c/0xc0 SyS_socketcall+0x36c/0x3f0 system_call+0x3c/0x100 as array index overflow is not checked for while setting up crash memory ranges causing memory corruption. To resolve this issue, dynamically allocate memory for crash memory ranges and resize it incrementally, in units of pagesize, on hitting array size limit. Fixes: `2df173d9e8` ("fadump: Initialize elfcore header and add PT_LOAD program headers.") Cc: stable@vger.kernel.org # v3.4+ Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> [mpe: Just use PAGE_SIZE directly, fixup variable placement] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:34 +10:00
Christophe Leroy	6bd6d86722	powerpc/cpm1: fix compilation error with CONFIG_PPC_EARLY_DEBUG_CPM commit `e8cb7a55eb` ("powerpc: remove superflous inclusions of asm/fixmap.h") removed inclusion of asm/fixmap.h from files not including objects from that file. However, asm/mmu-8xx.h includes call to __fix_to_virt(). The proper way would be to include asm/fixmap.h in asm/mmu-8xx.h but it creates an inclusion loop. So we have to leave asm/fixmap.h in sysdep/cpm_common.c for CONFIG_PPC_EARLY_DEBUG_CPM CC arch/powerpc/sysdev/cpm_common.o In file included from ./arch/powerpc/include/asm/mmu.h:340:0, from ./arch/powerpc/include/asm/reg_8xx.h:8, from ./arch/powerpc/include/asm/reg.h:29, from ./arch/powerpc/include/asm/processor.h:13, from ./arch/powerpc/include/asm/thread_info.h:28, from ./include/linux/thread_info.h:38, from ./arch/powerpc/include/asm/ptrace.h:159, from ./arch/powerpc/include/asm/hw_irq.h:12, from ./arch/powerpc/include/asm/irqflags.h:12, from ./include/linux/irqflags.h:16, from ./include/asm-generic/cmpxchg-local.h:6, from ./arch/powerpc/include/asm/cmpxchg.h:537, from ./arch/powerpc/include/asm/atomic.h:11, from ./include/linux/atomic.h:5, from ./include/linux/mutex.h:18, from ./include/linux/kernfs.h:13, from ./include/linux/sysfs.h:16, from ./include/linux/kobject.h:20, from ./include/linux/device.h:16, from ./include/linux/node.h:18, from ./include/linux/cpu.h:17, from ./include/linux/of_device.h:5, from arch/powerpc/sysdev/cpm_common.c:21: arch/powerpc/sysdev/cpm_common.c: In function ‘udbg_init_cpm’: ./arch/powerpc/include/asm/mmu-8xx.h:218:25: error: implicit declaration of function ‘__fix_to_virt’ [-Werror=implicit-function-declaration] #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE)) ^ arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’ VIRT_IMMR_BASE); ^ ./arch/powerpc/include/asm/mmu-8xx.h:218:39: error: ‘FIX_IMMR_BASE’ undeclared (first use in this function) #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE)) ^ arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’ VIRT_IMMR_BASE); ^ ./arch/powerpc/include/asm/mmu-8xx.h:218:39: note: each undeclared identifier is reported only once for each function it appears in #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE)) ^ arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’ VIRT_IMMR_BASE); ^ cc1: all warnings being treated as errors make[1]: *** [arch/powerpc/sysdev/cpm_common.o] Error 1 Fixes: `e8cb7a55eb` ("powerpc: remove superflous inclusions of asm/fixmap.h") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:33 +10:00
Dan Carpenter	c42d3be0c0	powerpc: Fix size calculation using resource_size() The problem is the the calculation should be "end - start + 1" but the plus one is missing in this calculation. Fixes: `8626816e90` ("powerpc: add support for MPIC message register API") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:33 +10:00
Rashmica Gupta	d3da701d33	powerpc/powernv: Allow memory that has been hot-removed to be hot-added This patch allows the memory removed by memtrace to be readded to the kernel. So now you don't have to reboot your system to add the memory back to the kernel or to have a different amount of memory removed. Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com> Tested-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-10 22:12:31 +10:00
Michael Büsch	209b43759d	ssb: Remove SSB_WARN_ON, SSB_BUG_ON and SSB_DEBUG Use the standard WARN_ON instead. If a small kernel is desired, WARN_ON can be disabled globally. Also remove SSB_DEBUG. Besides WARN_ON it only adds a tiny debug check. Include this check unconditionally. Signed-off-by: Michael Buesch <m@bues.ch> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>	2018-08-09 18:47:47 +03:00
Camelia Groza	bd96461249	powerpc/dts/fsl: t2080rdb: use the Cortina PHY driver compatible The Cortina PHY is not compatible with IEEE 802.3 clause 45. Signed-off-by: Camelia Groza <camelia.groza@nxp.com> [scottwood: made commit message about compatibility, not driver choice] Signed-off-by: Scott Wood <oss@buserror.net>	2018-08-08 17:18:02 -05:00
Camelia Groza	39e560a918	powerpc/dts/fsl: t4240rdb: use the Cortina PHY driver compatible The Cortina PHY is not compatible with IEEE 802.3 clause 45. Signed-off-by: Camelia Groza <camelia.groza@nxp.com> [scottwood: made commit message about compatibility, not driver choice] Signed-off-by: Scott Wood <oss@buserror.net>	2018-08-08 17:14:12 -05:00
Camelia Groza	24f36ce616	powerpc/configs/dpaa: enable the Cortina PHY driver Cortina PHYs are present on T4240RDB and T2080RDB. Enable the driver by default. Signed-off-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net>	2018-08-08 17:13:45 -05:00
Christophe Leroy	2a39926c6a	powerpc/cpm1: fix compilation error with CONFIG_PPC_EARLY_DEBUG_CPM commit `e8cb7a55eb` ("powerpc: remove superflous inclusions of asm/fixmap.h") removed inclusion of asm/fixmap.h from files not including objects from that file. However, asm/mmu-8xx.h includes call to __fix_to_virt(). The proper way would be to include asm/fixmap.h in asm/mmu-8xx.h but it creates an inclusion loop. So we have to leave asm/fixmap.h in sysdep/cpm_common.c for CONFIG_PPC_EARLY_DEBUG_CPM CC arch/powerpc/sysdev/cpm_common.o In file included from ./arch/powerpc/include/asm/mmu.h:340:0, from ./arch/powerpc/include/asm/reg_8xx.h:8, from ./arch/powerpc/include/asm/reg.h:29, from ./arch/powerpc/include/asm/processor.h:13, from ./arch/powerpc/include/asm/thread_info.h:28, from ./include/linux/thread_info.h:38, from ./arch/powerpc/include/asm/ptrace.h:159, from ./arch/powerpc/include/asm/hw_irq.h:12, from ./arch/powerpc/include/asm/irqflags.h:12, from ./include/linux/irqflags.h:16, from ./include/asm-generic/cmpxchg-local.h:6, from ./arch/powerpc/include/asm/cmpxchg.h:537, from ./arch/powerpc/include/asm/atomic.h:11, from ./include/linux/atomic.h:5, from ./include/linux/mutex.h:18, from ./include/linux/kernfs.h:13, from ./include/linux/sysfs.h:16, from ./include/linux/kobject.h:20, from ./include/linux/device.h:16, from ./include/linux/node.h:18, from ./include/linux/cpu.h:17, from ./include/linux/of_device.h:5, from arch/powerpc/sysdev/cpm_common.c:21: arch/powerpc/sysdev/cpm_common.c: In function ‘udbg_init_cpm’: ./arch/powerpc/include/asm/mmu-8xx.h:218:25: error: implicit declaration of function ‘__fix_to_virt’ [-Werror=implicit-function-declaration] #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE)) ^ arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’ VIRT_IMMR_BASE); ^ ./arch/powerpc/include/asm/mmu-8xx.h:218:39: error: ‘FIX_IMMR_BASE’ undeclared (first use in this function) #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE)) ^ arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’ VIRT_IMMR_BASE); ^ ./arch/powerpc/include/asm/mmu-8xx.h:218:39: note: each undeclared identifier is reported only once for each function it appears in #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE)) ^ arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’ VIRT_IMMR_BASE); ^ cc1: all warnings being treated as errors make[1]: *** [arch/powerpc/sysdev/cpm_common.o] Error 1 Fixes: `e8cb7a55eb` ("powerpc: remove superflous inclusions of asm/fixmap.h") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>	2018-08-08 16:27:13 -05:00
Benjamin Herrenschmidt	77b5f703dc	powerpc/powernv/opal: Use standard interrupts property when available For (bad) historical reasons, OPAL used to create a non-standard pair of properties "opal-interrupts" and "opal-interrupts-names" for representing the list of interrupts it wants Linux to request on its behalf. Among other issues, the opal-interrupts doesn't have a way to carry the type of interrupts, and they were assumed to be all level sensitive. This is wrong on some recent systems where some of them are edge sensitive causing warnings in the XIVE code and possible misbehaviours if they need to be retriggered (typically the NPU2 TCE error interrupts). This makes Linux switch to using the standard "interrupts" and "interrupt-names" properties instead when they are available, using standard of_irq helpers, which can carry all the desired type information. Newer versions of OPAL will generate those properties in addition to the legacy ones. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Fixup prefix logic to check strlen(r->name). Reinstate setting of start = 0 in opal_event_shutdown() to avoid double free warnings] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:38 +10:00
Christophe Leroy	d6690b1a9b	powerpc: Allow CPU selection of e300core variants GCC supports -mcpu=e300c2 and -mcpu=e300c3 This patch gives the opportunity to tune kernel to one of those two types. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:37 +10:00
Christophe Leroy	0e00a8c9fd	powerpc: Allow CPU selection also on PPC32 This patch extends to PPC32 the capability to select the exact CPU type. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:37 +10:00
Christophe Leroy	cc62d20ce4	powerpc: Make CPU selection logic generic in Makefile At the time being, when adding a new CPU for selection, both Kconfig.cputype and Makefile have to be modified. This patch moves into Kconfig.cputype the name of the CPU to me passed to the -mcpu= argument. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Rename the option to TARGET_CPU to echo the gcc documentation] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:36 +10:00
Rodrigo R. Galvao	badf436f6f	powerpc/Makefiles: Convert ifeq to ifdef where possible In Makefiles if we're testing a CONFIG_FOO symbol for equality with 'y' we can instead just use ifdef. The latter reads easily, so convert to it where possible. Signed-off-by: Rodrigo R. Galvao <rosattig@linux.vnet.ibm.com> Reviewed-by: Mauro S. M. Rodrigues <maurosr@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:36 +10:00
Paul Mackerras	f8db2007ff	powerpc/64: Copy as much as possible in __copy_tofrom_user In __copy_tofrom_user, if we encounter an exception on a store, we stop copying and return the number of bytes not copied. However, if the store is wider than one byte and is to an unaligned address, it is possible that the store operand overlaps a page boundary and the exception occurred on the latter part of the store operand, meaning that it would be possible to copy a few more bytes. Since copy_to_user is generally expected to copy as much as possible, it would be better to copy those extra few bytes. This adds code to do that. Since this edge case is not performance-critical, the code has been written to be compact rather than as fast as possible. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:36 +10:00
Paul Mackerras	98c45f51f7	selftests/powerpc/64: Test all paths through copy routines The hand-coded assembler 64-bit copy routines include feature sections that select one code path or another depending on which CPU we are executing on. The self-tests for these copy routines end up testing just one path. This adds a mechanism for selecting any desired code path at compile time, and makes 2 or 3 versions of each test, each using a different code path, so as to cover all the possible paths. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> [mpe: Add -mcpu=power4 to CFLAGS for older compilers] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:35 +10:00
Paul Mackerras	a7c81ce398	powerpc/64: Make exception table clearer in __copy_tofrom_user_base This aims to make the generation of exception table entries for the loads and stores in __copy_tofrom_user_base clearer and easier to verify. Instead of having a series of local labels on the loads and stores, with a series of corresponding labels later for the exception handlers, we now use macros to generate exception table entries at the point of each load and store that could potentially trap. We do this with the macros lex (load exception) and stex (store exception). These macros are used right before the load or store to which they apply. Some complexity is introduced by the fact that we have some more work to do after hitting an exception, because we need to calculate and return the number of bytes not copied. The code uses r3 as the current pointer into the destination buffer, that is, the address of the first byte of the destination that has not been modified. However, at various points in the copy loops, r3 can be 4, 8, 16 or 24 bytes behind that point. To express this offset in an understandable way, we define a symbol r3_offset which is updated at various points so that it equal to the difference between the address of the first unmodified byte of the destination and the value in r3. (In fact it only needs to be accurate at the point of each lex or stex macro invocation.) The rules for updating r3_offset are as follows: * It starts out at 0 * An addi r3,r3,N instruction decreases r3_offset by N * A store instruction (stb, sth, stw, std) to N(r3) increases r3_offset by the width of the store (1, 2, 4, 8) * A store with update instruction (stbu, sthu, stwu, stdu) to N(r3) sets r3_offset to the width of the store. There is some trickiness to the way that the lex and stex macros and the associated exception handlers work. I would have liked to use the current value of r3_offset in the name of the symbol used as the exception handler, as in ".Lld_exc_$(r3_offset)" and then have symbols .Lld_exc_0, .Lld_exc_8, .Lld_exc_16 etc. corresponding to the offsets that needed to be added to r3. However, I couldn't see a way to do that with gas. Instead, the exception handler address is .Lld_exc - r3_offset or .Lst_exc - r3_offset, that is, the distance ahead of .Lld_exc/.Lst_exc that we start executing is equal to the amount that we need to add to r3. This works because r3_offset is always a small multiple of 4, and our instructions are 4 bytes long. This means that before .Lld_exc and .Lst_exc, we have a sequence of instructions that increments r3 by 4, 8, 16 or 24 depending on where we start. The sequence increments r3 by 4 per instruction (on average). We also replace the exception table for the 4k copy loop by a macro per load or store. These loads and stores all use exactly the same exception handler, which simply resets the argument registers r3, r4 and r5 to there original values and re-does the whole copy using the slower loop. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:34 +10:00
zhong jiang	81d7b08b3c	powerpc/powermac: of_node_put() is not needed after iterator for_each_node_by_name() iterators only exit normally when the loop cursor is NULL, So there is no need to call of_node_put(). Signed-off-by: zhong jiang <zhongjiang@huawei.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:34 +10:00
Haren Myneni	656ecc16e8	crypto/nx: Initialize 842 high and normal RxFIFO control registers NX increments readOffset by FIFO size in receive FIFO control register when CRB is read. But the index in RxFIFO has to match with the corresponding entry in FIFO maintained by VAS in kernel. Otherwise NX may be processing incorrect CRBs and can cause CRB timeout. VAS FIFO offset is 0 when the receive window is opened during initialization. When the module is reloaded or in kexec boot, readOffset in FIFO control register may not match with VAS entry. This patch adds nx_coproc_init OPAL call to reset readOffset and queued entries in FIFO control register for both high and normal FIFOs. Signed-off-by: Haren Myneni <haren@us.ibm.com> [mpe: Fixup uninitialized variable warning] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:34 +10:00
Haren Myneni	6e708000ec	powerpc/powernv: Export opal_check_token symbol Export opal_check_token symbol for modules to check the availability of OPAL calls before using them. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:33 +10:00
Randy Dunlap	f5daf77a55	powerpc/platforms/85xx: fix t1042rdb_diu.c build errors & warning Fix build errors and warnings in t1042rdb_diu.c by adding header files and MODULE_LICENSE(). ../arch/powerpc/platforms/85xx/t1042rdb_diu.c:152:1: warning: data definition has no type or storage class early_initcall(t1042rdb_diu_init); ../arch/powerpc/platforms/85xx/t1042rdb_diu.c:152:1: error: type defaults to 'int' in declaration of 'early_initcall' [-Werror=implicit-int] ../arch/powerpc/platforms/85xx/t1042rdb_diu.c:152:1: warning: parameter names (without types) in function declaration and WARNING: modpost: missing MODULE_LICENSE() in arch/powerpc/platforms/85xx/t1042rdb_diu.o Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Scott Wood <oss@buserror.net> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:33 +10:00
Anju T Sudhakar	7ccc4fe5ff	powerpc/perf: Remove sched_task function defined for thread-imc Call trace observed while running perf-fuzzer: CPU: 43 PID: 9088 Comm: perf_fuzzer Not tainted 4.13.0-32-generic #35~lp1746225 task: c000003f776ac900 task.stack: c000003f77728000 NIP: c000000000299b70 LR: c0000000002a4534 CTR: c00000000029bb80 REGS: c000003f7772b760 TRAP: 0700 Not tainted (4.13.0-32-generic) MSR: 900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24008822 XER: 00000000 CFAR: c000000000299a70 SOFTE: 0 GPR00: c0000000002a4534 c000003f7772b9e0 c000000001606200 c000003fef858908 GPR04: c000003f776ac900 0000000000000001 ffffffffffffffff 0000003fee730000 GPR08: 0000000000000000 0000000000000000 c0000000011220d8 0000000000000002 GPR12: c00000000029bb80 c000000007a3d900 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 c000003f776ad090 c000000000c71354 GPR24: c000003fef716780 0000003fee730000 c000003fe69d4200 c000003f776ad330 GPR28: c0000000011220d8 0000000000000001 c0000000014c6108 c000003fef858900 NIP [c000000000299b70] perf_pmu_sched_task+0x170/0x180 LR [c0000000002a4534] __perf_event_task_sched_in+0xc4/0x230 Call Trace: perf_iterate_sb+0x158/0x2a0 (unreliable) __perf_event_task_sched_in+0xc4/0x230 finish_task_switch+0x21c/0x310 __schedule+0x304/0xb80 schedule+0x40/0xc0 do_wait+0x254/0x2e0 kernel_wait4+0xa0/0x1a0 SyS_wait4+0x64/0xc0 system_call+0x58/0x6c Instruction dump: 3beafea0 7faa4800 409eff18 e8010060 eb610028 ebc10040 7c0803a6 38210050 eb81ffe0 eba1ffe8 ebe1fff8 4e800020 <0fe00000> 4bffffbc 60000000 60420000 ---[ end trace 8c46856d314c1811 ]--- The context switch call-backs for thread-imc are defined in sched_task function. So when thread-imc events are grouped with software pmu events, perf_pmu_sched_task hits the WARN_ON_ONCE condition, since software PMUs are assumed not to have a sched_task defined. Patch to move the thread_imc enable/disable opal call back from sched_task to event_[add/del] function Fixes: `f74c89bd80` ("powerpc/perf: Add thread IMC PMU support") Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Tested-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:32 +10:00
Nicholas Piggin	4231aba000	powerpc/64s: Fix page table fragment refcount race vs speculative references The page table fragment allocator uses the main page refcount racily with respect to speculative references. A customer observed a BUG due to page table page refcount underflow in the fragment allocator. This can be caused by the fragment allocator set_page_count stomping on a speculative reference, and then the speculative failure handler decrements the new reference, and the underflow eventually pops when the page tables are freed. Fix this by using a dedicated field in the struct page for the page table fragment allocator. Fixes: `5c1f6ee9a3` ("powerpc: Reduce PTE table memory wastage") Cc: stable@vger.kernel.org # v3.10+ Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:32 +10:00
Darren Stevens	e13606d732	powerpc/pasemi: Use pr_err/pr_warn... for kernel messages Pasemi code still uses printk(KERN_ERR/KERN_WARN ... change these to pr_err(, pr_warn(... to match other powerpc arch code. No functional changes. Signed-off-by: Darren Stevens <darren@stevens-zone.net> [mpe: Unsplit some strings while we're at it] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:31 +10:00
Murilo Opsfelder Araujo	a99b9c5ed4	powerpc/traps: Show instructions on exceptions Call show_user_instructions() in arch/powerpc/kernel/traps.c to dump instructions at faulty location, useful to debugging. Before this patch, an unhandled signal message looked like: pandafault[10524]: segfault (11) at 100007d0 nip 1000061c lr 7fffbd295100 code 2 in pandafault[10000000+10000] After this patch, it looks like: pandafault[10524]: segfault (11) at 100007d0 nip 1000061c lr 7fffbd295100 code 2 in pandafault[10000000+10000] pandafault[10524]: code: 4bfffeec 4bfffee8 3c401002 38427f00 fbe1fff8 f821ffc1 7c3f0b78 3d22fffe pandafault[10524]: code: 392988d0 f93f0020 e93f0020 39400048 <99490000> 39200000 7d234b78 383f0040 Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:30 +10:00
Murilo Opsfelder Araujo	88b0fe1757	powerpc: Add show_user_instructions() show_user_instructions() is a slightly modified version of show_instructions() that allows userspace instruction dump. This will be useful within show_signal_msg() to dump userspace instructions of the faulty location. Here is a sample of what show_user_instructions() outputs: pandafault[10850]: code: 4bfffeec 4bfffee8 3c401002 38427f00 fbe1fff8 f821ffc1 7c3f0b78 3d22fffe pandafault[10850]: code: 392988d0 f93f0020 e93f0020 39400048 <99490000> 39200000 7d234b78 383f0040 The current->comm and current->pid printed can serve as a glue that links the instructions dump to its originator, allowing messages to be interleaved in the logs. Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:30 +10:00
Murilo Opsfelder Araujo	0f642d616b	powerpc/traps: Print VMA for unhandled signals This adds VMA address in the message printed for unhandled signals, similarly to what other architectures, like x86, print. Before this patch, a page fault looked like: pandafault[61470]: unhandled signal 11 at 100007d0 nip 1000061c lr 7fff8d185100 code 2 After this patch, a page fault looks like: pandafault[6303]: segfault 11 at 100007d0 nip 1000061c lr 7fff93c55100 code 2 in pandafault[10000000+10000] Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:30 +10:00
Murilo Opsfelder Araujo	49d8f2011d	powerpc/traps: Use %lx format in show_signal_msg() Use %lx format to print registers. This avoids having two different formats and avoids checking for MSR_64BIT, improving readability of the function. Even though we could have used %px, which is functionally equivalent to %lx as per Documentation/core-api/printk-formats.rst, it is not semantically correct because the data printed are not pointers. And using %px requires casting data to (void *). Besides that, %lx matches the format used in show_regs(). Before this patch: pandafault[4808]: unhandled signal 11 at 0000000010000718 nip 0000000010000574 lr 00007fff935e7a6c code 2 After this patch: pandafault[4732]: unhandled signal 11 at 10000718 nip 10000574 lr 7fff86697a6c code 2 Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:29 +10:00
Murilo Opsfelder Araujo	35a52a10c3	powerpc/traps: Use an explicit ratelimit state for show_signal_msg() Replace printk_ratelimited() by printk() and a default rate limit burst to limit displaying unhandled signals messages. This will allow us to call print_vma_addr() in a future patch, which does not work with printk_ratelimited(). Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:29 +10:00
Murilo Opsfelder Araujo	658b0f92bc	powerpc/traps: Print unhandled signals in a separate function Isolate the logic of printing unhandled signals out of _exception_pkey(). No functional change, only code rearrangement. Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:29 +10:00
Michael Ellerman	78ee994637	powerpc/64s: Make rfi_flush_fallback a little more robust Because rfi_flush_fallback runs immediately before the return to userspace it currently runs with the user r1 (stack pointer). This means if we oops in there we will report a bad kernel stack pointer in the exception entry path, eg: Bad kernel stack pointer 7ffff7150e40 at c0000000000023b4 Oops: Bad kernel stack pointer, sig: 6 [#1] LE SMP NR_CPUS=32 NUMA PowerNV Modules linked in: CPU: 0 PID: 1246 Comm: klogd Not tainted 4.18.0-rc2-gcc-7.3.1-00175-g0443f8a69ba3 #7 NIP: c0000000000023b4 LR: 0000000010053e00 CTR: 0000000000000040 REGS: c0000000fffe7d40 TRAP: 4100 Not tainted (4.18.0-rc2-gcc-7.3.1-00175-g0443f8a69ba3) MSR: 9000000002803031 <SF,HV,VEC,VSX,FP,ME,IR,DR,LE> CR: 44000442 XER: 20000000 CFAR: c00000000000bac8 IRQMASK: c0000000f1e66a80 GPR00: 0000000002000000 00007ffff7150e40 00007fff93a99900 0000000000000020 ... NIP [c0000000000023b4] rfi_flush_fallback+0x34/0x80 LR [0000000010053e00] 0x10053e00 Although the NIP tells us where we were, and the TRAP number tells us what happened, it would still be nicer if we could report the actual exception rather than barfing about the stack pointer. We an do that fairly simply by loading the kernel stack pointer on entry and restoring the user value before returning. That way we see a regular oops such as: Unrecoverable exception 4100 at c00000000000239c Oops: Unrecoverable exception, sig: 6 [#1] LE SMP NR_CPUS=32 NUMA PowerNV Modules linked in: CPU: 0 PID: 1251 Comm: klogd Not tainted 4.18.0-rc3-gcc-7.3.1-00097-g4ebfcac65acd-dirty #40 NIP: c00000000000239c LR: 0000000010053e00 CTR: 0000000000000040 REGS: c0000000f1e17bb0 TRAP: 4100 Not tainted (4.18.0-rc3-gcc-7.3.1-00097-g4ebfcac65acd-dirty) MSR: 9000000002803031 <SF,HV,VEC,VSX,FP,ME,IR,DR,LE> CR: 44000442 XER: 20000000 CFAR: c00000000000bac8 IRQMASK: 0 ... NIP [c00000000000239c] rfi_flush_fallback+0x3c/0x80 LR [0000000010053e00] 0x10053e00 Call Trace: [c0000000f1e17e30] [c00000000000b9e4] system_call+0x5c/0x70 (unreliable) Note this shouldn't make the kernel stack pointer vulnerable to a meltdown attack, because it should be flushed from the cache before we return to userspace. The user r1 value will be in the cache, because we load it in the return path, but that is harmless. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>	2018-08-08 00:32:27 +10:00
Michael Ellerman	99d54754d3	powerpc/powernv: Query firmware for count cache flush settings Look for fw-features properties to determine the appropriate settings for the count cache flush, and then call the generic powerpc code to set it up based on the security feature flags. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:27 +10:00
Michael Ellerman	ba72dc1719	powerpc/pseries: Query hypervisor for count cache flush settings Use the existing hypercall to determine the appropriate settings for the count cache flush, and then call the generic powerpc code to set it up based on the security feature flags. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:26 +10:00
Michael Ellerman	ee13cb249f	powerpc/64s: Add support for software count cache flush Some CPU revisions support a mode where the count cache needs to be flushed by software on context switch. Additionally some revisions may have a hardware accelerated flush, in which case the software flush sequence can be shortened. If we detect the appropriate flag from firmware we patch a branch into _switch() which takes us to a count cache flush sequence. That sequence in turn may be patched to return early if we detect that the CPU supports accelerating the flush sequence in hardware. Add debugfs support for reporting the state of the flush, as well as runtime disabling it. And modify the spectre_v2 sysfs file to report the state of the software flush. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:26 +10:00
Michael Ellerman	dc8c6cce9a	powerpc/64s: Add new security feature flags for count cache flush Add security feature flags to indicate the need for software to flush the count cache on context switch, and for the presence of a hardware assisted count cache flush. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:26 +10:00
Michael Ellerman	06d0bbc6d0	powerpc/asm: Add a patch_site macro & helpers for patching instructions Add a macro and some helper C functions for patching single asm instructions. The gas macro means we can do something like: 1: nop patch_site 1b, patch__foo Which is less visually distracting than defining a GLOBAL symbol at 1, and also doesn't pollute the symbol table which can confuse eg. perf. These are obviously similar to our existing feature sections, but are not automatically patched based on CPU/MMU features, rather they are designed to be manually patched by C code at some arbitrary point. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:25 +10:00
Diana Craciun	c28218d4ab	powerpc/fsl: Sanitize the syscall table for NXP PowerPC 32 bit platforms Used barrier_nospec to sanitize the syscall table. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:24 +10:00
Diana Craciun	ebcd1bfc33	powerpc/fsl: Add barrier_nospec implementation for NXP PowerPC Book3E Implement the barrier_nospec as a isync;sync instruction sequence. The implementation uses the infrastructure built for BOOK3S 64. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:24 +10:00
Diana Craciun	406d2b6ae3	powerpc/64: Make meltdown reporting Book3S 64 specific In a subsequent patch we will enable building security.c for Book3E. However the NXP platforms are not vulnerable to Meltdown, so make the Meltdown vulnerability reporting PPC_BOOK3S_64 specific. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:24 +10:00
Michael Ellerman	af375eefbf	powerpc/64: Call setup_barrier_nospec() from setup_arch() Currently we require platform code to call setup_barrier_nospec(). But if we add an empty definition for the !CONFIG_PPC_BARRIER_NOSPEC case then we can call it in setup_arch(). Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:23 +10:00
Michael Ellerman	179ab1cbf8	powerpc/64: Add CONFIG_PPC_BARRIER_NOSPEC Add a config symbol to encode which platforms support the barrier_nospec speculation barrier. Currently this is just Book3S 64 but we will add Book3E in a future patch. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:23 +10:00
Diana Craciun	6453b532f2	powerpc/64: Make stf barrier PPC_BOOK3S_64 specific. NXP Book3E platforms are not vulnerable to speculative store bypass, so make the mitigations PPC_BOOK3S_64 specific. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-08 00:32:18 +10:00
Diana Craciun	cf175dc315	powerpc/64: Disable the speculation barrier from the command line The speculation barrier can be disabled from the command line with the parameter: "nospectre_v1". Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:39 +10:00
Michael Ellerman	0b924de4f6	powerpc/64s: Don't use __MASKABLE_EXCEPTION unnecessarily We only need to use __MASKABLE_EXCEPTION in one of the four cases for hardware interrupt, so use the helper macros in the other cases. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:39 +10:00
Michael Ellerman	b536da7c2d	powerpc/64s: Drop unused loc parameter to MASKABLE_EXCEPTION macros We pass the "loc" (location) parameter to MASKABLE_EXCEPTION and friends, but it's not used, so drop it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:38 +10:00
Michael Ellerman	0a55c24185	powerpc/64s: Remove PSERIES naming from the MASKABLE macros Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:38 +10:00
Michael Ellerman	6adc6e9c07	powerpc/64s: Drop _MASKABLE_RELON_EXCEPTION_PSERIES() _MASKABLE_RELON_EXCEPTION_PSERIES() does nothing useful, update all callers to use __MASKABLE_RELON_EXCEPTION_PSERIES() directly. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:37 +10:00
Michael Ellerman	9bf2877ac1	powerpc/64s: Drop _MASKABLE_EXCEPTION_PSERIES() _MASKABLE_EXCEPTION_PSERIES() does nothing useful, update all callers to use __MASKABLE_EXCEPTION_PSERIES() directly. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:37 +10:00
Michael Ellerman	bdf08e1da0	powerpc/64s: Rename EXCEPTION_PROLOG_PSERIES to EXCEPTION_PROLOG Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:36 +10:00
Michael Ellerman	270373f14f	powerpc/64s: Rename EXCEPTION_RELON_PROLOG_PSERIES To just EXCEPTION_RELON_PROLOG(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:36 +10:00
Michael Ellerman	6ebb939740	powerpc/64s: Rename EXCEPTION_RELON_PROLOG_PSERIES_1 The EXCEPTION_RELON_PROLOG_PSERIES_1() macro does the same job as EXCEPTION_PROLOG_2 (which we just recently created), except for "RELON" (relocation on) exceptions. So rename it as such. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:35 +10:00
Michael Ellerman	94f3cc8e36	powerpc/64s: Remove PSERIES from the NORI macros Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:35 +10:00
Michael Ellerman	cb58a4a4b3	powerpc/64s: Rename EXCEPTION_PROLOG_PSERIES_1 to EXCEPTION_PROLOG_2 As with the other patches in this series, we are removing the "PSERIES" from the name as it's no longer meaningful. In this case it's not simply a case of removing the "PSERIES" as that would result in a clash with the existing EXCEPTION_PROLOG_1. Instead we name this one EXCEPTION_PROLOG_2, as it's usually used in sequence after 0 and 1. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:35 +10:00
Michael Ellerman	b706f42362	powerpc/64s: Rename STD_RELON_EXCEPTION_PSERIES_OOL to STD_RELON_EXCEPTION_OOL Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:34 +10:00
Michael Ellerman	e42389c5f1	powerpc/64s: Rename STD_RELON_EXCEPTION_PSERIES to STD_RELON_EXCEPTION Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:34 +10:00
Michael Ellerman	75e8bef3d6	powerpc/64s: Rename STD_EXCEPTION_PSERIES_OOL to STD_EXCEPTION_OOL Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:33 +10:00
Michael Ellerman	e899fce509	powerpc/64s: Rename STD_EXCEPTION_PSERIES to STD_EXCEPTION The "PSERIES" in STD_EXCEPTION_PSERIES is to differentiate the macros from the legacy iSeries versions, which are called STD_EXCEPTION_ISERIES. It is not anything to do with pseries vs powernv or powermac etc. We removed the legacy iSeries code in 2012, in commit 8ee3e0d69623x ("powerpc: Remove the main legacy iSerie platform code"). So remove "PSERIES" from the macros. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:33 +10:00
Michael Ellerman	92b6d65c07	powerpc/64s: Move SET_SCRATCH0() into EXCEPTION_RELON_PROLOG_PSERIES() EXCEPTION_RELON_PROLOG_PSERIES() only has two users, STD_RELON_EXCEPTION_PSERIES() and STD_RELON_EXCEPTION_HV() both of which "call" SET_SCRATCH0(), so just move SET_SCRATCH0() into EXCEPTION_RELON_PROLOG_PSERIES(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:32 +10:00
Michael Ellerman	4a7a0a8444	powerpc/64s: Move SET_SCRATCH0() into EXCEPTION_PROLOG_PSERIES() EXCEPTION_PROLOG_PSERIES() only has two users, STD_EXCEPTION_PSERIES() and STD_EXCEPTION_HV() both of which "call" SET_SCRATCH0(), so just move SET_SCRATCH0() into EXCEPTION_PROLOG_PSERIES(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:31 +10:00
Darren Stevens	250a93501d	powerpc/pasemi: Search for PCI root bus by compatible property Pasemi arch code finds the root of the PCI-e bus by searching the device-tree for a node called 'pxp'. But the root bus has a compatible property of 'pasemi,rootbus' so search for that instead. Signed-off-by: Darren Stevens <darren@stevens-zone.net> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:31 +10:00
Christophe Leroy	9412b23450	powerpc/lib: Implement strlen() in assembly for PPC32 The generic implementation of strlen() reads strings byte per byte. This patch implements strlen() in assembly based on a read of entire words, in the same spirit as what some other arches and glibc do. On a 8xx the time spent in strlen is reduced by 3/4 for long strings. strlen() selftest on an 8xx provides the following values: Before the patch (ie with the generic strlen() in lib/string.c): len 256 : time = 1.195055 len 016 : time = 0.083745 len 008 : time = 0.046828 len 004 : time = 0.028390 After the patch: len 256 : time = 0.272185 ==> 78% improvment len 016 : time = 0.040632 ==> 51% improvment len 008 : time = 0.033060 ==> 29% improvment len 004 : time = 0.029149 ==> 2% degradation On a 832x: Before the patch: len 256 : time = 0.236125 len 016 : time = 0.018136 len 008 : time = 0.011000 len 004 : time = 0.007229 After the patch: len 256 : time = 0.094950 ==> 60% improvment len 016 : time = 0.013357 ==> 26% improvment len 008 : time = 0.010586 ==> 4% improvment len 004 : time = 0.008784 Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:30 +10:00
Mahesh Salgaonkar	94675cceac	powerpc/pseries: Defer the logging of rtas error to irq work queue. rtas_log_buf is a buffer to hold RTAS event data that are communicated to kernel by hypervisor. This buffer is then used to pass RTAS event data to user through proc fs. This buffer is allocated from vmalloc (non-linear mapping) area. On Machine check interrupt, register r3 points to RTAS extended event log passed by hypervisor that contains the MCE event. The pseries machine check handler then logs this error into rtas_log_buf. The rtas_log_buf is a vmalloc-ed (non-linear) buffer we end up taking up a page fault (vector 0x300) while accessing it. Since machine check interrupt handler runs in NMI context we can not afford to take any page fault. Page faults are not honored in NMI context and causes kernel panic. Apart from that, as Nick pointed out, pSeries_log_error() also takes a spin_lock while logging error which is not safe in NMI context. It may endup in deadlock if we get another MCE before releasing the lock. Fix this by deferring the logging of rtas error to irq work queue. Current implementation uses two different buffers to hold rtas error log depending on whether extended log is provided or not. This makes bit difficult to identify which buffer has valid data that needs to logged later in irq work. Simplify this using single buffer, one per paca, and copy rtas log to it irrespective of whether extended log is provided or not. Allocate this buffer below RMA region so that it can be accessed in real mode mce handler. Fixes: `b96672dd84` ("powerpc: Machine check interrupt is a non-maskable interrupt") Cc: stable@vger.kernel.org # v4.14+ Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:29 +10:00
Mahesh Salgaonkar	74e96bf44f	powerpc/pseries: Avoid using the size greater than RTAS_ERROR_LOG_MAX. The global mce data buffer that used to copy rtas error log is of 2048 (RTAS_ERROR_LOG_MAX) bytes in size. Before the copy we read extended_log_length from rtas error log header, then use max of extended_log_length and RTAS_ERROR_LOG_MAX as a size of data to be copied. Ideally the platform (phyp) will never send extended error log with size > 2048. But if that happens, then we have a risk of buffer overrun and corruption. Fix this by using min_t instead. Fixes: `d368514c30` ("powerpc: Fix corruption when grabbing FWNMI data") Reported-by: Michal Suchanek <msuchanek@suse.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:28 +10:00
Benjamin Herrenschmidt	e27e0a9465	powerpc/xive: Remove xive_kexec_teardown_cpu() It's identical to xive_teardown_cpu() so just use the latter Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:28 +10:00
Benjamin Herrenschmidt	dbc5740247	powerpc/xive: Remove now useless pr_debug statements Those overly verbose statement in the setup of the pool VP aren't particularly useful (esp. considering we don't actually use the pool, we configure it bcs HW requires it only). So remove them which improves the code readability. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:27 +10:00
Nicholas Piggin	34c604d275	powerpc/64s: free page table caches at exit_mmap time The kernel page table caches are tied to init_mm, so there is no more need for them after userspace is finished. destroy_context() gets called when we drop the last reference for an mm, which can be much later than the task exit due to other lazy mm references to it. We can free the page table cache pages on task exit because they only cache the userspace page tables and kernel threads should not access user space addresses. The mapping for kernel threads itself is maintained in init_mm and page table cache for that is attached to init_mm. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [mpe: Merge change log additions from Aneesh] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:27 +10:00
Nicholas Piggin	5a6099346c	powerpc/64s/radix: tlb do not flush on page size when fullmm When the mm is being torn down there will be a full PID flush so there is no need to flush the TLB on page size changes. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:26 +10:00
Michael Ellerman	7cd129b4b5	powerpc: Add a checkpatch wrapper with our preferred settings This makes it easy to run checkpatch with settings that I like. Usage is eg: $ ./arch/powerpc/tools/checkpatch.sh -g origin/master.. To check all commits since origin/master. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Russell Currey <ruscur@russell.cc>	2018-08-07 21:49:25 +10:00
Michael Ellerman	4da1f79227	powerpc/64: Disable irq restore warning for now We recently added a warning in arch_local_irq_restore() to check that the soft masking state matches reality. Unfortunately it trips in a few places, which are not entirely trivial to fix. The key problem is if we're doing function_graph tracing of restore_math(), the warning pops and then seems to recurse. It's not entirely clear because the system continuously oopses on all CPUs, with the output interleaved and unreadable. It's also been observed on a G5 coming out of idle. Until we can fix those cases disable the warning for now. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-07 21:49:24 +10:00
Paolo Bonzini	d2ce98ca0a	Linux 4.18-rc6 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAltU8z0eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG5X8H/2fJr7m3k242+t76 sitwvx1eoPqTgryW59dRKm9IuXAGA+AjauvHzaz1QxomeQa50JghGWefD0eiJfkA 1AphQ/24EOiAbbVk084dAI/C2p122dE4D5Fy7CrfLnuouyrbFaZI5STbnrRct7sR 9deeYW0GDHO1Uenp4WDCj0baaqJqaevZ+7GG09DnWpya2nQtSkGBjqn6GpYmrfOU mqFuxAX8mEOW6cwK16y/vYtnVjuuMAiZ63/OJ8AQ6d6ArGLwAsdn7f8Fn4I4tEr2 L0d3CRLUyegms4++Dmlu05k64buQu46WlPhjCZc5/Ts4kjrNxBuHejj2/jeSnUSt vJJlibI= =42a5 -----END PGP SIGNATURE----- Merge tag 'v4.18-rc6' into HEAD Pull bug fixes into the KVM development tree to avoid nasty conflicts.	2018-08-06 17:31:36 +02:00
Yangbo Lu	a16b5da54d	powerpc/mpc85xx: add clocks property for fman ptp timer node This patch is to add clocks property for fman ptp timer node. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-05 17:11:49 -07:00
David S. Miller	c1c8626fce	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net Lots of overlapping changes, mostly trivial in nature. The mlxsw conflict was resolving using the example resolution at: https://github.com/jpirko/linux_mlxsw/blob/combined_queue/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_actions.c Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-05 13:04:31 -07:00
Linus Torvalds	2ed7533cb7	powerpc fixes for 4.18 #5 One fix for a regression in a recent TLB flush optimisation, which caused us to incorrectly not send TLB invalidations to coprocessors. Thanks to: Frederic Barrat, Nicholas Piggin, Vaibhav Jain. -----BEGIN PGP SIGNATURE----- iQIwBAABCAAaBQJbZDxOExxtcGVAZWxsZXJtYW4uaWQuYXUACgkQUevqPMjhpYAm 7w//QcYYpBcBu9XK0E53XglSoC4iTzMrPnIIivPsSngLXWtEC/5315nbCOdhT4mh CTYxmwCm0pZG4kxJntnT4MXwSLhM46eFs586uOS1BgEJw+m85m2tiEsrNQSogJ14 GzZiIk1UbzKUZjpSTIXigx0Um6x+q83kIBhd/ewrejB1wz6d5T7Wktjv2iWewGdf kOJPXuvaWXeGmJWJ2Gek3T5j7PCII0humEQjsy40XVFR3NYC7jT1r4UONE3NeBTU xavo/EcB3JHx2MPPgAfJT/KAiHLQZzuR382azf7W5sQJ7/Yy2N1jpskSxFoMEMtc axYWniHfGwJsnwcO4HPisIA67tXR6EzugAyuta4IQufRVp1s+j2WIzyqjl4uOB7Q v5ZXBH/Y4wShqVHhUzzQasiJgGYZoUoeOvJIoFNEo5f6UFuT/l9mkuM4vPtNKi6i NpT2Hj/hTATNRGyrBnvjcktS8uHbVxSHE6Cm4ZHSfQ8POMulUKvhHTZ7gYsxpxUE ktKCNgNL57QHRQojY0AftpEMLrDYiJEE9+6OLjEk3Mf7ayiZAAjIpy7LRD9+p8Q8 eomEXmwQr1YOXGgvS0mFFPd+P2PfJoQztd1fWZtZIY1xksdwGep65e24hBOEjCAV 7vVx+3pm+dekEFVf976ZYgACHCjOn8Efd5t+AK7b8S0u0uw= =Z81h -----END PGP SIGNATURE----- Merge tag 'powerpc-4.18-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "One fix for a regression in a recent TLB flush optimisation, which caused us to incorrectly not send TLB invalidations to coprocessors. Thanks to Frederic Barrat, Nicholas Piggin, Vaibhav Jain" * tag 'powerpc-4.18-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s/radix: Fix missing global invalidations when removing copro	2018-08-03 10:38:21 -07:00
Reza Arbab	9eab9901b0	powerpc/powernv: Fix concurrency issue with npu->mmio_atsd_usage We've encountered a performance issue when multiple processors stress {get,put}_mmio_atsd_reg(). These functions contend for mmio_atsd_usage, an unsigned long used as a bitmask. The accesses to mmio_atsd_usage are done using test_and_set_bit_lock() and clear_bit_unlock(). As implemented, both of these will require a (successful) stwcx to that same cache line. What we end up with is thread A, attempting to unlock, being slowed by other threads repeatedly attempting to lock. A's stwcx instructions fail and retry because the memory reservation is lost every time a different thread beats it to the punch. There may be a long-term way to fix this at a larger scale, but for now resolve the immediate problem by gating our call to test_and_set_bit_lock() with one to test_bit(), which is obviously implemented without using a store. Fixes: `1ab66d1fba` ("powerpc/powernv: Introduce address translation services for Nvlink2") Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Acked-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-03 20:09:01 +10:00
Christoph Hellwig	06832fc004	powerpc: Do not redefine NEED_DMA_MAP_STATE kernel/dma/Kconfig already defines NEED_DMA_MAP_STATE, just select it from CONFIG_PPC using the same condition as an if guard. Signed-off-by: Christoph Hellwig <hch@lst.de> [mpe: Move it under PPC] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-03 20:09:00 +10:00
Guenter Roeck	6e0495c2e8	powerpc/4xx: Fix error return path in ppc4xx_msi_probe() An arbitrary error in ppc4xx_msi_probe() quite likely results in a crash similar to the following, seen after dma_alloc_coherent() returned an error. Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xc001bff0 Oops: Kernel access of bad area, sig: 11 [#1] BE Canyonlands Modules linked in: CPU: 0 PID: 1 Comm: swapper Tainted: G W 4.18.0-rc6-00010-gff33d1030a6c #1 NIP: c001bff0 LR: c001c418 CTR: c01faa7c REGS: cf82db40 TRAP: 0300 Tainted: G W (4.18.0-rc6-00010-gff33d1030a6c) MSR: 00029000 <CE,EE,ME> CR: 28002024 XER: 00000000 DEAR: 00000000 ESR: 00000000 GPR00: c001c418 cf82dbf0 cf828000 cf8de400 00000000 00000000 000000c4 000000c4 GPR08: c0481ea4 00000000 00000000 000000c4 22002024 00000000 c00025e8 00000000 GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 c0492380 0000004a GPR24: 00029000 0000000c 00000000 cf8de410 c0494d60 c0494d60 cf8bebc0 00000001 NIP [c001bff0] ppc4xx_of_msi_remove+0x48/0xa0 LR [c001c418] ppc4xx_msi_probe+0x294/0x3b8 Call Trace: [cf82dbf0] [00029000] 0x29000 (unreliable) [cf82dc10] [c001c418] ppc4xx_msi_probe+0x294/0x3b8 [cf82dc70] [c0209fbc] platform_drv_probe+0x40/0x9c [cf82dc90] [c0208240] driver_probe_device+0x2a8/0x350 [cf82dcc0] [c0206204] bus_for_each_drv+0x60/0xac [cf82dcf0] [c0207e88] __device_attach+0xe8/0x160 [cf82dd20] [c02071e0] bus_probe_device+0xa0/0xbc [cf82dd40] [c02050c8] device_add+0x404/0x5c4 [cf82dd90] [c0288978] of_platform_device_create_pdata+0x88/0xd8 [cf82ddb0] [c0288b70] of_platform_bus_create+0x134/0x220 [cf82de10] [c0288bcc] of_platform_bus_create+0x190/0x220 [cf82de70] [c0288cf4] of_platform_bus_probe+0x98/0xec [cf82de90] [c0449650] __machine_initcall_canyonlands_ppc460ex_device_probe+0x38/0x54 [cf82dea0] [c0002404] do_one_initcall+0x40/0x188 [cf82df00] [c043daec] kernel_init_freeable+0x130/0x1d0 [cf82df30] [c0002600] kernel_init+0x18/0x104 [cf82df40] [c000c23c] ret_from_kernel_thread+0x14/0x1c Instruction dump: 90010024 813d0024 2f890000 83c30058 41bd0014 48000038 813d0024 7f89f800 409d002c 813e000c 57ea103a 3bff0001 <7c69502e> 2f830000 419effe0 4803b26d ---[ end trace 8cf551077ecfc42a ]--- Fix it up. Specifically, - Return valid error codes from ppc4xx_setup_pcieh_hw(), have it clean up after itself, and only access hardware after all possible error conditions have been handled. - Use devm_kzalloc() instead of kzalloc() in ppc4xx_msi_probe() Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-03 20:08:20 +10:00
Herbert Xu	c5f5aeef9b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux Merge mainline to pick up `c7513c2a27` ("crypto/arm64: aes-ce-gcm - add missing kernel_neon_begin/end pair").	2018-08-03 17:55:12 +08:00
Nicholas Piggin	3127692deb	powernv/cpuidle: Fix idle states all being marked invalid Commit `9c7b185ab2` ("powernv/cpuidle: Parse dt idle properties into global structure") parses dt idle states into structs, but never marks them valid. This results in all idle states being lost. Fixes: `9c7b185ab2` ("powernv/cpuidle: Parse dt idle properties into global structure") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-03 18:30:33 +10:00
Linus Torvalds	ef46808b79	pci-v4.18-fixes-5 -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAltjEoMUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vw3gQ/+IK9Poej5yIKG06gtbtLjflq9wRXW ZP72U9P8gpfI/L/0T+uAHVpp5ewjD8tGCgsEhS/OzU7eu/fyr5V5TnqjfFE++fi4 JnQrl4J14y8miP/iYQGf+6CwGYsHM4TnVa/yo525FocZBqYWKcLAPtYuhcXNPWNC iOKa3wii1KErJnjwCU0mR1a45dP5vJC/TCEcDD1ZQYJPxDrfB9lwwAo+rXFyPjMe TBJso1bLOzd3XTuyQqiP/HBPdtGi024eb3JdK16K273EPKoFp9Iw8irpP4hZKPqK 60wbTc4kSlo7Sj9OOdTXH1XW2xe/xDJimyyJjsVz8Rg6y8DN6Vg0aXEazm/xFmRC wxP4/U5ciivhvmCrbqB6DPqcBtu83Yxh7sZZPKw4LGU8Dk75QDhlVAmE2+vH51u8 Owt11GAjj++6EHCjeTms+bvAL/HFWeGru/A2NZE93QEy/tq3UtCLXGkni5Av1vWj u2bQPVqia1s7xPdRR3lwCOeCa7yU/LwqpNm8w0uF/0oCRKs42Ao4pU/sLYQGihoo rwqAEBKYA6TI6L7C4TmCue4cegRvK0Mhn8wZdGrJhJTKu+1gtIs0ph1bwtMin28d U8zX6Zhi9/xQSJN7iARbEcdgCZtePKkLIMSiILKMmAh+bjAs3HzLMs2czQ2g7YKU +71+CqJfgQXmxjI= =pwJS -----END PGP SIGNATURE----- Merge tag 'pci-v4.18-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Fix integer overflow in new mobiveil driver (Dan Carpenter) - Fix race during NVMe removal/rescan (Hari Vyas) * tag 'pci-v4.18-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Fix is_added/is_busmaster race condition PCI: mobiveil: Avoid integer overflow in IB_WIN_SIZE	2018-08-02 10:59:19 -07:00
Ingo Molnar	16e0e6a83b	Merge branch 'perf/urgent' into perf/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-08-02 09:59:20 +02:00
Christoph Hellwig	87a4c37599	kconfig: include kernel/Kconfig.preempt from init/Kconfig Almost all architectures include it. Add a ARCH_NO_PREEMPT symbol to disable preempt support for alpha, hexagon, non-coldfire m68k and user mode Linux. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2018-08-02 08:06:54 +09:00
Christoph Hellwig	06ec64b84c	Kconfig: consolidate the "Kernel hacking" menu Move the source of lib/Kconfig.debug and arch/$(ARCH)/Kconfig.debug to the top-level Kconfig. For two architectures that means moving their arch-specific symbols in that menu into a new arch Kconfig.debug file, and for a few more creating a dummy file so that we can include it unconditionally. Also move the actual 'Kernel hacking' menu to lib/Kconfig.debug, where it belongs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2018-08-02 08:06:48 +09:00
Christoph Hellwig	1572497cb0	kconfig: include common Kconfig files from top-level Kconfig Instead of duplicating the source statements in every architecture just do it once in the toplevel Kconfig file. Note that with this the inclusion of arch/$(SRCARCH/Kconfig moves out of the top-level Kconfig into arch/Kconfig so that don't violate ordering constraits while keeping a sensible menu structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2018-08-02 08:03:23 +09:00
Frederic Barrat	cca19f0b68	powerpc/64s/radix: Fix missing global invalidations when removing copro With the optimizations for TLB invalidation from commit `0cef77c779` ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask"), the scope of a TLBI (global vs. local) can now be influenced by the value of the 'copros' counter of the memory context. When calling mm_context_remove_copro(), the 'copros' counter is decremented first before flushing. It may have the unintended side effect of sending local TLBIs when we explicitly need global invalidations in this case. Thus breaking any nMMU user in a bad and unpredictable way. Fix it by flushing first, before updating the 'copros' counter, so that invalidations will be global. Fixes: `0cef77c779` ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask") Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-08-01 23:23:41 +10:00
Hari Vyas	44bda4b7d2	PCI: Fix is_added/is_busmaster race condition When a PCI device is detected, pdev->is_added is set to 1 and proc and sysfs entries are created. When the device is removed, pdev->is_added is checked for one and then device is detached with clearing of proc and sys entries and at end, pdev->is_added is set to 0. is_added and is_busmaster are bit fields in pci_dev structure sharing same memory location. A strange issue was observed with multiple removal and rescan of a PCIe NVMe device using sysfs commands where is_added flag was observed as zero instead of one while removing device and proc,sys entries are not cleared. This causes issue in later device addition with warning message "proc_dir_entry" already registered. Debugging revealed a race condition between the PCI core setting the is_added bit in pci_bus_add_device() and the NVMe driver reset work-queue setting the is_busmaster bit in pci_set_master(). As these fields are not handled atomically, that clears the is_added bit. Move the is_added bit to a separate private flag variable and use atomic functions to set and retrieve the device addition state. This avoids the race because is_added no longer shares a memory location with is_busmaster. Link: https://bugzilla.kernel.org/show_bug.cgi?id=200283 Signed-off-by: Hari Vyas <hari.vyas@broadcom.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au>	2018-07-31 11:27:54 -05:00
Sam Bobroff	b87b9cf493	powerpc/pseries: fix EEH recovery of some IOV devices EEH recovery currently fails on pSeries for some IOV capable PCI devices, if CONFIG_PCI_IOV is on and the hypervisor doesn't provide certain device tree properties for the device. (Found on an IOV capable device using the ipr driver.) Recovery fails in pci_enable_resources() at the check on r->parent, because r->flags is set and r->parent is not. This state is due to sriov_init() setting the start, end and flags members of the IOV BARs but the parent not being set later in pseries_pci_fixup_iov_resources(), because the "ibm,open-sriov-vf-bar-info" property is missing. Correct this by zeroing the resource flags for IOV BARs when they can't be configured (this is the same method used by sriov_init() and __pci_read_base()). VFs cleared this way can't be enabled later, because that requires another device tree property, "ibm,number-of-configurable-vfs" as well as support for the RTAS function "ibm_map_pes". These are all part of hypervisor support for IOV and it seems unlikely that a hypervisor would ever partially, but not fully, support it. (None are currently provided by QEMU/KVM.) Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Reviewed-by: Bryant G. Ly <bryantly@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-07-31 19:56:46 +10:00
Shilpasri G Bhat	04baaf28f4	powerpc/powernv: Add support to enable sensor groups Adds support to enable/disable a sensor group at runtime. This can be used to select the sensor groups that needs to be copied to main memory by OCC. Sensor groups like power, temperature, current, voltage, frequency, utilization can be enabled/disabled at runtime. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-07-31 19:56:45 +10:00
Akshay Adiga	1961acad2f	powernv/cpuidle: Use parsed device tree values for cpuidle_init Export pnv_idle_states and nr_pnv_idle_states so that its accessible to cpuidle driver. Use properties from pnv_idle_states structure for powernv cpuidle_init. Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-07-31 19:56:44 +10:00
Akshay Adiga	9c7b185ab2	powernv/cpuidle: Parse dt idle properties into global structure Device-tree parsing happens twice, once while deciding idle state to be used for hotplug and once during cpuidle init. Hence, parsing the device tree and caching it will reduce code duplication. Parsing code has been moved to pnv_parse_cpuidle_dt() from pnv_probe_idle_states(). In addition to the properties in the device tree the number of available states is also required. Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-07-31 19:56:44 +10:00
Christoph Hellwig	a8651194f9	PCI: Call dma_debug_add_bus() for pci_bus_type from PCI core There is nothing arch-specific about PCI or dma-debug, so call dma_debug_add_bus() from the PCI core just after registering the bus type. Most of dma-debug is already generic; this just adds reporting of pending dma-allocations on driver unload for arches other than powerpc, sh, and x86. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)	2018-07-30 15:58:01 -05:00
Alexey Spirkov	f7e2a15223	powerpc/44x: Mark mmu_init_secondary() as __init mmu_init_secondary() calls ppc44x_pin_tlb() which is marked __init, leading to a warning: The function mmu_init_secondary() references the function __init ppc44x_pin_tlb(). There's no CPU hotplug support on 44x so mmu_init_secondary() will only be called at boot. Therefore we should mark it as __init. Signed-off-by: Alexey Spirkov <alexeis@astrosoft.ru> [mpe: Flesh out change log details] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-07-30 22:48:22 +10:00
Michael Ellerman	a984506c54	powerpc/mm: Don't report PUDs as memory leaks when using kmemleak Paul Menzel reported that kmemleak was producing reports such as: unreferenced object 0xc0000000f8b80000 (size 16384): comm "init", pid 1, jiffies 4294937416 (age 312.240s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000d997deb7>] __pud_alloc+0x80/0x190 [<0000000087f2e8a3>] move_page_tables+0xbac/0xdc0 [<00000000091e51c2>] shift_arg_pages+0xc0/0x210 [<00000000ab88670c>] setup_arg_pages+0x22c/0x2a0 [<0000000060871529>] load_elf_binary+0x41c/0x1648 [<00000000ecd9d2d4>] search_binary_handler.part.11+0xbc/0x280 [<0000000034e0cdd7>] __do_execve_file.isra.13+0x73c/0x940 [<000000005f953a6e>] sys_execve+0x58/0x70 [<000000009700a858>] system_call+0x5c/0x70 Indicating that a PUD was being leaked. However what's really happening is that kmemleak is not able to recognise the references from the PGD to the PUD, because they are not fully qualified pointers. We can confirm that in xmon, eg: Find the task struct for pid 1 "init": 0:mon> P task_struct ->thread.ksp PID PPID S P CMD c0000001fe7c0000 c0000001fe803960 1 0 S 13 systemd Dump virtual address 0 to find the PGD: 0:mon> dv 0 c0000001fe7c0000 pgd @ 0xc0000000f8b01000 Dump the memory of the PGD: 0:mon> d c0000000f8b01000 c0000000f8b01000 00000000f8b90000 0000000000000000 \|................\| c0000000f8b01010 0000000000000000 0000000000000000 \|................\| c0000000f8b01020 0000000000000000 0000000000000000 \|................\| c0000000f8b01030 0000000000000000 00000000f8b80000 \|................\| ^^^^^^^^^^^^^^^^ There we can see the reference to our supposedly leaked PUD. But because it's missing the leading 0xc, kmemleak won't recognise it. We can confirm it's still in use by translating an address that is mapped via it: 0:mon> dv 7fff94000000 c0000001fe7c0000 pgd @ 0xc0000000f8b01000 pgdp @ 0xc0000000f8b01038 = 0x00000000f8b80000 <-- pudp @ 0xc0000000f8b81ff8 = 0x00000000037c4000 pmdp @ 0xc0000000037c5ca0 = 0x00000000fbd89000 ptep @ 0xc0000000fbd89000 = 0xc0800001d5ce0386 Maps physical address = 0x00000001d5ce0000 Flags = Accessed Dirty Read Write The fix is fairly simple. We need to tell kmemleak to ignore PUD allocations and never report them as leaks. We can also tell it not to scan the PGD, because it will never find pointers in there. However it will still notice if we allocate a PGD and then leak it. Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-07-30 22:48:21 +10:00
Christophe Leroy	405cb4024e	powerpc: split asm/tlbflush.h Split asm/tlbflush.h into: asm/nohash/tlbflush.h asm/book3s/32/tlbflush.h asm/book3s/64/tlbflush.h (already existing) Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-07-30 22:48:21 +10:00

... 2 3 4 5 6 ...

18975 Commits