mirror of https://gitee.com/openkylin/linux.git
20725 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
Ingo Molnar | 1ca7feb590 |
Linux 5.4-rc7
-----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl3IqJQeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGOiUH+gOEDwid5OODaFAd CggXugdFIlBZefKqGVNW5sjgX8pxFWHXuEMC8iNb6QXtQZdFrI6LFf9hhUDmzQtm 6y1LPxxEiTZjObMEsBNylb7tyzgujFHcAlp0Zro3w/HLCqmYTSP3FF46i2u6KZfL XhkpM4X7R7qxlfpdhlfESv/ElRGocZe6SwXfC7pcPo5flFcmkdu9ijqhNd/6CZ/h Nf9rTsD/wEDVUelFbgVN+LJzlaB0tsyc4Zbof07n8OsFZjhdEOop8gfM/kTBLcyY 6bh66SfDScdsNnC/l8csbPjSZRx+i+nQs67DyhGNnsSAFgHBZdC4Tb/2mDCwhCLR dUvuYZc= =1N6F -----END PGP SIGNATURE----- Merge tag 'v5.4-rc7' into perf/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Linus Torvalds | 0058b0a506 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller: 1) BPF sample build fixes from Björn Töpel 2) Fix powerpc bpf tail call implementation, from Eric Dumazet. 3) DCCP leaks jiffies on the wire, fix also from Eric Dumazet. 4) Fix crash in ebtables when using dnat target, from Florian Westphal. 5) Fix port disable handling whne removing bcm_sf2 driver, from Florian Fainelli. 6) Fix kTLS sk_msg trim on fallback to copy mode, from Jakub Kicinski. 7) Various KCSAN fixes all over the networking, from Eric Dumazet. 8) Memory leaks in mlx5 driver, from Alex Vesker. 9) SMC interface refcounting fix, from Ursula Braun. 10) TSO descriptor handling fixes in stmmac driver, from Jose Abreu. 11) Add a TX lock to synchonize the kTLS TX path properly with crypto operations. From Jakub Kicinski. 12) Sock refcount during shutdown fix in vsock/virtio code, from Stefano Garzarella. 13) Infinite loop in Intel ice driver, from Colin Ian King. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (108 commits) ixgbe: need_wakeup flag might not be set for Tx i40e: need_wakeup flag might not be set for Tx igb/igc: use ktime accessors for skb->tstamp i40e: Fix for ethtool -m issue on X722 NIC iavf: initialize ITRN registers with correct values ice: fix potential infinite loop because loop counter being too small qede: fix NULL pointer deref in __qede_remove() net: fix data-race in neigh_event_send() vsock/virtio: fix sock refcnt holding during the shutdown net: ethernet: octeon_mgmt: Account for second possible VLAN header mac80211: fix station inactive_time shortly after boot net/fq_impl: Switch to kvmalloc() for memory allocation mac80211: fix ieee80211_txq_setup_flows() failure path ipv4: Fix table id reference in fib_sync_down_addr ipv6: fixes rt6_probe() and fib6_nh->last_probe init net: hns: Fix the stray netpoll locks causing deadlock in NAPI path net: usb: qmi_wwan: add support for DW5821e with eSIM support CDC-NCM: handle incomplete transfer of MTU nfc: netlink: fix double device reference drop NFC: st21nfca: fix double free ... |
|
David S. Miller | 41de23e223 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says: ==================== pull-request: bpf 2019-11-02 The following pull-request contains BPF updates for your *net* tree. We've added 6 non-merge commits during the last 6 day(s) which contain a total of 8 files changed, 35 insertions(+), 9 deletions(-). The main changes are: 1) Fix ppc BPF JIT's tail call implementation by performing a second pass to gather a stable JIT context before opcode emission, from Eric Dumazet. 2) Fix build of BPF samples sys_perf_event_open() usage to compiled out unavailable test_attr__{enabled,open} checks. Also fix potential overflows in bpf_map_{area_alloc,charge_init} on 32 bit archs, from Björn Töpel. 3) Fix narrow loads of bpf_sysctl context fields with offset > 0 on big endian archs like s390x and also improve the test coverage, from Ilya Leoshkevich. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
|
Linus Torvalds | 8194c28efd |
powerpc fixes for 5.4 #4
Our recent cleanup of EEH led to an oops on bare metal machines when the cxl (CAPI) driver creates virtual devices for an attached FPGA accelerator. The "secure virtual machine" support we added in v5.4 had a bug if the kernel was relocated (moved during boot), in those cases the signature of the kernel text wouldn't verify and the Ultravisor would refuse to run the VM. A recent change to disable interrupts before calling arch_cpu_idle_dead() caused a WARN_ON() in our bare metal CPU offline code to always trigger. The KUAP (SMAP) support we added for 32-bit Book3S had a bug if the address range crossed a segment (256MB) boundary which could lead to spurious faults. Thanks to: Christophe Leroy, Frederic Barrat, Michael Anderson, Nicholas Piggin, Sam Bobroff, Thiago Jung Bauermann. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl29N34THG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgBVfEAC/zus1Klhx1iHInj9gF8XnehrD21OM zo2XRKFs1v7xhk3PiYqUJDXKNvq7Bi8IDQZ1luJZbZ1vbJs/zHXg4CbqHT+tCrPP AQwCl3tzgEYxr5bXt2GdmfMhGP2Vkt2l71oIwqbGBI205466ys6OpM2yVws5pR+N a4O5L1xhZQphXXu4XsfM8tVoHhgPkVkDcjaqwTKjTyLV5nDwQPlhaUFYxf1Q16YI PumHWUuT84CXGq8f1vM6mBGLvQW3JXuCzIw6JAN1kukwEKa7ugmjesBiS7OUc6IU yyMdiHFX94S/YySdm8LwIHLWttbMfv2bPzDZIZerAQU5UM3TddBjd6TDyJqVA79S nsfgkNzQCYXhUOOM94WtE+cUZ9G2VoxS84qFTLrj0Y/nhMV7uGUG+YlTrkNL4DN5 GqWdBv1lYHYbjh9GZh5YVmewK0cF6o05daspEKFyHcxucMWXhqlPHSjL0gVIpQMO czfhlF3D2jyWCO0BDNHwDa6x7BUBuKGIvm0IYhs9nPkr9WtGHEJrgwcRxc8N3Pcp rUQEj+V8CDebV6r2EsW6SIS68UpAjihp1EleiQ+1GoU2P1rHtmioPluqmA302cDy 76bAFC38pG1q3ELhEeZS7LDDR8+N2SwvaklXVCRMyEhWoRaxY2noa5ZUnO6d5vUS 0gzoth1ve2/gNA== =W7+c -----END PGP SIGNATURE----- Merge tag 'powerpc-5.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Our recent cleanup of EEH led to an oops on bare metal machines when the cxl (CAPI) driver creates virtual devices for an attached FPGA accelerator. The "secure virtual machine" support we added in v5.4 had a bug if the kernel was relocated (moved during boot), in those cases the signature of the kernel text wouldn't verify and the Ultravisor would refuse to run the VM. A recent change to disable interrupts before calling arch_cpu_idle_dead() caused a WARN_ON() in our bare metal CPU offline code to always trigger. The KUAP (SMAP) support we added for 32-bit Book3S had a bug if the address range crossed a segment (256MB) boundary which could lead to spurious faults. Thanks to: Christophe Leroy, Frederic Barrat, Michael Anderson, Nicholas Piggin, Sam Bobroff, Thiago Jung Bauermann" * tag 'powerpc-5.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/powernv: Fix CPU idle to be called with IRQs disabled powerpc/prom_init: Undo relocation before entering secure mode powerpc/powernv/eeh: Fix oops when probing cxl devices powerpc/32s: fix allow/prevent_user_access() when crossing segment boundaries. |
|
Eric Dumazet | 7de0869093 |
powerpc/bpf: Fix tail call implementation
We have seen many crashes on powerpc hosts while loading bpf programs.
The problem here is that bpf_int_jit_compile() does a first pass
to compute the program length.
Then it allocates memory to store the generated program and
calls bpf_jit_build_body() a second time (and a third time
later)
What I have observed is that the second bpf_jit_build_body()
could end up using few more words than expected.
If bpf_jit_binary_alloc() put the space for the program
at the end of the allocated page, we then write on
a non mapped memory.
It appears that bpf_jit_emit_tail_call() calls
bpf_jit_emit_common_epilogue() while ctx->seen might not
be stable.
Only after the second pass we can be sure ctx->seen wont be changed.
Trying to avoid a second pass seems quite complex and probably
not worth it.
Fixes:
|
|
Nicholas Piggin | 7d6475051f |
powerpc/powernv: Fix CPU idle to be called with IRQs disabled
Commit |
|
Thiago Jung Bauermann | 05d9a95283 |
powerpc/prom_init: Undo relocation before entering secure mode
The ultravisor will do an integrity check of the kernel image but we
relocated it so the check will fail. Restore the original image by
relocating it back to the kernel virtual base address.
This works because during build vmlinux is linked with an expected
virtual runtime address of KERNELBASE.
Fixes:
|
|
Ingo Molnar | 65133033ee |
Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Frederic Barrat | a8a30219ba |
powerpc/powernv/eeh: Fix oops when probing cxl devices
Recent cleanup in the way EEH support is added to a device causes a
kernel oops when the cxl driver probes a device and creates virtual
devices discovered on the FPGA:
BUG: Kernel NULL pointer dereference at 0x000000a0
Faulting instruction address: 0xc000000000048070
Oops: Kernel access of bad area, sig: 7 [#1]
...
NIP eeh_add_device_late.part.9+0x50/0x1e0
LR eeh_add_device_late.part.9+0x3c/0x1e0
Call Trace:
_dev_info+0x5c/0x6c (unreliable)
pnv_pcibios_bus_add_device+0x60/0xb0
pcibios_bus_add_device+0x40/0x60
pci_bus_add_device+0x30/0x100
pci_bus_add_devices+0x64/0xd0
cxl_pci_vphb_add+0xe0/0x130 [cxl]
cxl_probe+0x504/0x5b0 [cxl]
local_pci_probe+0x6c/0x110
work_for_cpu_fn+0x38/0x60
The root cause is that those cxl virtual devices don't have a
representation in the device tree and therefore no associated pci_dn
structure. In eeh_add_device_late(), pdn is NULL, so edev is NULL and
we oops.
We never had explicit support for EEH for those virtual devices.
Instead, EEH events are reported to the (real) pci device and handled
by the cxl driver. Which can then forward to the virtual devices and
handle dependencies. The fact that we try adding EEH support for the
virtual devices is new and a side-effect of the recent cleanup.
This patch fixes it by skipping adding EEH support on powernv for
devices which don't have a pci_dn structure.
The cxl driver doesn't create virtual devices on pseries so this patch
doesn't fix it there intentionally.
Fixes:
|
|
Joel Fernandes (Google) | da97e18458 |
perf_event: Add support for LSM and SELinux checks
In current mainline, the degree of access to perf_event_open(2) system call depends on the perf_event_paranoid sysctl. This has a number of limitations: 1. The sysctl is only a single value. Many types of accesses are controlled based on the single value thus making the control very limited and coarse grained. 2. The sysctl is global, so if the sysctl is changed, then that means all processes get access to perf_event_open(2) opening the door to security issues. This patch adds LSM and SELinux access checking which will be used in Android to access perf_event_open(2) for the purposes of attaching BPF programs to tracepoints, perf profiling and other operations from userspace. These operations are intended for production systems. 5 new LSM hooks are added: 1. perf_event_open: This controls access during the perf_event_open(2) syscall itself. The hook is called from all the places that the perf_event_paranoid sysctl is checked to keep it consistent with the systctl. The hook gets passed a 'type' argument which controls CPU, kernel and tracepoint accesses (in this context, CPU, kernel and tracepoint have the same semantics as the perf_event_paranoid sysctl). Additionally, I added an 'open' type which is similar to perf_event_paranoid sysctl == 3 patch carried in Android and several other distros but was rejected in mainline [1] in 2016. 2. perf_event_alloc: This allocates a new security object for the event which stores the current SID within the event. It will be useful when the perf event's FD is passed through IPC to another process which may try to read the FD. Appropriate security checks will limit access. 3. perf_event_free: Called when the event is closed. 4. perf_event_read: Called from the read(2) and mmap(2) syscalls for the event. 5. perf_event_write: Called from the ioctl(2) syscalls for the event. [1] https://lwn.net/Articles/696240/ Since Peter had suggest LSM hooks in 2016 [1], I am adding his Suggested-by tag below. To use this patch, we set the perf_event_paranoid sysctl to -1 and then apply selinux checking as appropriate (default deny everything, and then add policy rules to give access to domains that need it). In the future we can remove the perf_event_paranoid sysctl altogether. Suggested-by: Peter Zijlstra <peterz@infradead.org> Co-developed-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: James Morris <jmorris@namei.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: rostedt@goodmis.org Cc: Yonghong Song <yhs@fb.com> Cc: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: jeffv@google.com Cc: Jiri Olsa <jolsa@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: primiano@google.com Cc: Song Liu <songliubraving@fb.com> Cc: rsavitski@google.com Cc: Namhyung Kim <namhyung@kernel.org> Cc: Matthew Garrett <matthewgarrett@google.com> Link: https://lkml.kernel.org/r/20191014170308.70668-1-joel@joelfernandes.org |
|
Christophe Leroy | d10f60ae27 |
powerpc/32s: fix allow/prevent_user_access() when crossing segment boundaries.
Make sure starting addr is aligned to segment boundary so that when
incrementing the segment, the starting address of the new segment is
below the end address. Otherwise the last segment might get missed.
Fixes:
|
|
Greg Kurz | 12ade69c1e |
KVM: PPC: Book3S HV: XIVE: Ensure VP isn't already in use
Connecting a vCPU to a XIVE KVM device means establishing a 1:1 association between a vCPU id and the offset (VP id) of a VP structure within a fixed size block of VPs. We currently try to enforce the 1:1 relationship by checking that a vCPU with the same id isn't already connected. This is good but unfortunately not enough because we don't map VP ids to raw vCPU ids but to packed vCPU ids, and the packing function kvmppc_pack_vcpu_id() isn't bijective by design. We got away with it because QEMU passes vCPU ids that fit well in the packing pattern. But nothing prevents userspace to come up with a forged vCPU id resulting in a packed id collision which causes the KVM device to associate two vCPUs to the same VP. This greatly confuses the irq layer and ultimately crashes the kernel, as shown below. Example: a guest with 1 guest thread per core, a core stride of 8 and 300 vCPUs has vCPU ids 0,8,16...2392. If QEMU is patched to inject at some point an invalid vCPU id 348, which is the packed version of itself and 2392, we get: genirq: Flags mismatch irq 199. 00010000 (kvm-2-2392) vs. 00010000 (kvm-2-348) CPU: 24 PID: 88176 Comm: qemu-system-ppc Not tainted 5.3.0-xive-nr-servers-5.3-gku+ #38 Call Trace: [c000003f7f9937e0] [c000000000c0110c] dump_stack+0xb0/0xf4 (unreliable) [c000003f7f993820] [c0000000001cb480] __setup_irq+0xa70/0xad0 [c000003f7f9938d0] [c0000000001cb75c] request_threaded_irq+0x13c/0x260 [c000003f7f993940] [c00800000d44e7ac] kvmppc_xive_attach_escalation+0x104/0x270 [kvm] [c000003f7f9939d0] [c00800000d45013c] kvmppc_xive_connect_vcpu+0x424/0x620 [kvm] [c000003f7f993ac0] [c00800000d444428] kvm_arch_vcpu_ioctl+0x260/0x448 [kvm] [c000003f7f993b90] [c00800000d43593c] kvm_vcpu_ioctl+0x154/0x7c8 [kvm] [c000003f7f993d00] [c0000000004840f0] do_vfs_ioctl+0xe0/0xc30 [c000003f7f993db0] [c000000000484d44] ksys_ioctl+0x104/0x120 [c000003f7f993e00] [c000000000484d88] sys_ioctl+0x28/0x80 [c000003f7f993e20] [c00000000000b278] system_call+0x5c/0x68 xive-kvm: Failed to request escalation interrupt for queue 0 of VCPU 2392 ------------[ cut here ]------------ remove_proc_entry: removing non-empty directory 'irq/199', leaking at least 'kvm-2-348' WARNING: CPU: 24 PID: 88176 at /home/greg/Work/linux/kernel-kvm-ppc/fs/proc/generic.c:684 remove_proc_entry+0x1ec/0x200 Modules linked in: kvm_hv kvm dm_mod vhost_net vhost tap xt_CHECKSUM iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter squashfs loop fuse i2c_dev sg ofpart ocxl powernv_flash at24 xts mtd uio_pdrv_genirq vmx_crypto opal_prd ipmi_powernv uio ipmi_devintf ipmi_msghandler ibmpowernv ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables ext4 mbcache jbd2 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq libcrc32c raid1 raid0 linear sd_mod ast i2c_algo_bit drm_vram_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ahci libahci libata tg3 drm_panel_orientation_quirks [last unloaded: kvm] CPU: 24 PID: 88176 Comm: qemu-system-ppc Not tainted 5.3.0-xive-nr-servers-5.3-gku+ #38 NIP: c00000000053b0cc LR: c00000000053b0c8 CTR: c0000000000ba3b0 REGS: c000003f7f9934b0 TRAP: 0700 Not tainted (5.3.0-xive-nr-servers-5.3-gku+) MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 48228222 XER: 20040000 CFAR: c000000000131a50 IRQMASK: 0 GPR00: c00000000053b0c8 c000003f7f993740 c0000000015ec500 0000000000000057 GPR04: 0000000000000001 0000000000000000 000049fb98484262 0000000000001bcf GPR08: 0000000000000007 0000000000000007 0000000000000001 9000000000001033 GPR12: 0000000000008000 c000003ffffeb800 0000000000000000 000000012f4ce5a1 GPR16: 000000012ef5a0c8 0000000000000000 000000012f113bb0 0000000000000000 GPR20: 000000012f45d918 c000003f863758b0 c000003f86375870 0000000000000006 GPR24: c000003f86375a30 0000000000000007 c0002039373d9020 c0000000014c4a48 GPR28: 0000000000000001 c000003fe62a4f6b c00020394b2e9fab c000003fe62a4ec0 NIP [c00000000053b0cc] remove_proc_entry+0x1ec/0x200 LR [c00000000053b0c8] remove_proc_entry+0x1e8/0x200 Call Trace: [c000003f7f993740] [c00000000053b0c8] remove_proc_entry+0x1e8/0x200 (unreliable) [c000003f7f9937e0] [c0000000001d3654] unregister_irq_proc+0x114/0x150 [c000003f7f993880] [c0000000001c6284] free_desc+0x54/0xb0 [c000003f7f9938c0] [c0000000001c65ec] irq_free_descs+0xac/0x100 [c000003f7f993910] [c0000000001d1ff8] irq_dispose_mapping+0x68/0x80 [c000003f7f993940] [c00800000d44e8a4] kvmppc_xive_attach_escalation+0x1fc/0x270 [kvm] [c000003f7f9939d0] [c00800000d45013c] kvmppc_xive_connect_vcpu+0x424/0x620 [kvm] [c000003f7f993ac0] [c00800000d444428] kvm_arch_vcpu_ioctl+0x260/0x448 [kvm] [c000003f7f993b90] [c00800000d43593c] kvm_vcpu_ioctl+0x154/0x7c8 [kvm] [c000003f7f993d00] [c0000000004840f0] do_vfs_ioctl+0xe0/0xc30 [c000003f7f993db0] [c000000000484d44] ksys_ioctl+0x104/0x120 [c000003f7f993e00] [c000000000484d88] sys_ioctl+0x28/0x80 [c000003f7f993e20] [c00000000000b278] system_call+0x5c/0x68 Instruction dump: 2c230000 41820008 3923ff78 e8e900a0 3c82ff69 3c62ff8d 7fa6eb78 7fc5f378 3884f080 3863b948 4bbf6925 60000000 <0fe00000> 4bffff7c fba10088 4bbf6e41 ---[ end trace b925b67a74a1d8d1 ]--- BUG: Kernel NULL pointer dereference at 0x00000010 Faulting instruction address: 0xc00800000d44fc04 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: kvm_hv kvm dm_mod vhost_net vhost tap xt_CHECKSUM iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter squashfs loop fuse i2c_dev sg ofpart ocxl powernv_flash at24 xts mtd uio_pdrv_genirq vmx_crypto opal_prd ipmi_powernv uio ipmi_devintf ipmi_msghandler ibmpowernv ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables ext4 mbcache jbd2 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq libcrc32c raid1 raid0 linear sd_mod ast i2c_algo_bit drm_vram_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ahci libahci libata tg3 drm_panel_orientation_quirks [last unloaded: kvm] CPU: 24 PID: 88176 Comm: qemu-system-ppc Tainted: G W 5.3.0-xive-nr-servers-5.3-gku+ #38 NIP: c00800000d44fc04 LR: c00800000d44fc00 CTR: c0000000001cd970 REGS: c000003f7f9938e0 TRAP: 0300 Tainted: G W (5.3.0-xive-nr-servers-5.3-gku+) MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24228882 XER: 20040000 CFAR: c0000000001cd9ac DAR: 0000000000000010 DSISR: 40000000 IRQMASK: 0 GPR00: c00800000d44fc00 c000003f7f993b70 c00800000d468300 0000000000000000 GPR04: 00000000000000c7 0000000000000000 0000000000000000 c000003ffacd06d8 GPR08: 0000000000000000 c000003ffacd0738 0000000000000000 fffffffffffffffd GPR12: 0000000000000040 c000003ffffeb800 0000000000000000 000000012f4ce5a1 GPR16: 000000012ef5a0c8 0000000000000000 000000012f113bb0 0000000000000000 GPR20: 000000012f45d918 00007ffffe0d9a80 000000012f4f5df0 000000012ef8c9f8 GPR24: 0000000000000001 0000000000000000 c000003fe4501ed0 c000003f8b1d0000 GPR28: c0000033314689c0 c000003fe4501c00 c000003fe4501e70 c000003fe4501e90 NIP [c00800000d44fc04] kvmppc_xive_cleanup_vcpu+0xfc/0x210 [kvm] LR [c00800000d44fc00] kvmppc_xive_cleanup_vcpu+0xf8/0x210 [kvm] Call Trace: [c000003f7f993b70] [c00800000d44fc00] kvmppc_xive_cleanup_vcpu+0xf8/0x210 [kvm] (unreliable) [c000003f7f993bd0] [c00800000d450bd4] kvmppc_xive_release+0xdc/0x1b0 [kvm] [c000003f7f993c30] [c00800000d436a98] kvm_device_release+0xb0/0x110 [kvm] [c000003f7f993c70] [c00000000046730c] __fput+0xec/0x320 [c000003f7f993cd0] [c000000000164ae0] task_work_run+0x150/0x1c0 [c000003f7f993d30] [c000000000025034] do_notify_resume+0x304/0x440 [c000003f7f993e20] [c00000000000dcc4] ret_from_except_lite+0x70/0x74 Instruction dump: 3bff0008 7fbfd040 419e0054 847e0004 2fa30000 419effec e93d0000 8929203c 2f890000 419effb8 4800821d e8410018 <e9230010> e9490008 9b2a0039 7c0004ac ---[ end trace b925b67a74a1d8d2 ]--- Kernel panic - not syncing: Fatal exception This affects both XIVE and XICS-on-XIVE devices since the beginning. Check the VP id instead of the vCPU id when a new vCPU is connected. The allocation of the XIVE CPU structure in kvmppc_xive_connect_vcpu() is moved after the check to avoid the need for rollback. Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> |
|
Emmanuel Nicolet | 2272905a45 |
spufs: fix a crash in spufs_create_root()
The spu_fs_context was not set in fc->fs_private, this caused a crash
when accessing ctx->mode in spufs_create_root().
Fixes:
|
|
Jordan Niethe | 7fe4e1176d |
powerpc/kvm: Fix kvmppc_vcore->in_guest value in kvmhv_switch_to_host
kvmhv_switch_to_host() in arch/powerpc/kvm/book3s_hv_rmhandlers.S needs to set kvmppc_vcore->in_guest to 0 to signal secondary CPUs to continue. This happens after resetting the PCR. Before commit |
|
Laurent Dufour | 4ab8a485f7 |
powerpc/pseries: Remove confusing warning message.
Since commit |
|
Stephen Rothwell | 18217da361 |
powerpc/64s/radix: Fix build failure with RADIX_MMU=n
After merging the powerpc tree, today's linux-next build (powerpc64
allnoconfig) failed like this:
arch/powerpc/mm/book3s64/pgtable.c:216:3:
error: implicit declaration of function 'radix__flush_all_lpid_guest'
radix__flush_all_lpid_guest() is only declared for
CONFIG_PPC_RADIX_MMU which is not set for this build.
Fix it by adding an empty version for the RADIX_MMU=n case, which
should never be called.
Fixes:
|
|
Linus Torvalds | 2d00aee21a |
Kbuild fixes for v5.4
- remove unneeded ar-option and KBUILD_ARFLAGS - remove long-deprecated SUBDIRS - fix modpost to suppress false-positive warnings for UML builds - fix namespace.pl to handle relative paths to ${objtree}, ${srctree} - make setlocalversion work for /bin/sh - make header archive reproducible - fix some Makefiles and documents -----BEGIN PGP SIGNATURE----- iQJSBAABCgA8FiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl2YPUEeHHlhbWFkYS5t YXNhaGlyb0Bzb2Npb25leHQuY29tAAoJED2LAQed4NsGVu4P/3Qv7Ov3/R4BlgYb +LaKupCY/ADE5bRAv/N5AAy37+TJmTLQswN2/JwHflYvTeWd4kZjquFpJkFNwMsk Qlb79NQvyM9+NlFfSFjap8HBNb0J8A+92aKmrHmh1sQqJJs6JPZ0MOGoAXmgsJaN SPLvhqophKpmYu7Oa0x2aC2kq+1DnCQyMLTOuVCdrtF0tF8w0hiowDz5GOmOi1U6 VK2ECfzjenFkfbqZOUVBPVfPR9hMpmVBdKdFLwD/iTKVkShZcWmdbxk/ADbemyet 2njehRF2HGp7opbwM4AxIeIubCqYSkThUpLJarKWk/8W87gksH6pCR8yIq1nOwkO l+/GY2YdvkBdDCkSKpLiQxtJaqnZb8Yv1ZPvCfGF09Ba8tFtwX+HSecSkLFHGyJv K9FD0XSOFBkQesZWdpIr/xeLwwiuSH80QACrub1Z5Q4OCURaBkKwrO/eDG1/2Xle YKGZO2va2VVkeo5bisOZ2vfISwZrtiaGakQ8vTdq/5RO59/JvQjsGB8KbccaKXAu Ozk8vVqkwTmLP6gzIEd2Wr/ICNGuAVc0EELT7lj07hcd6rzsCxPWVXqTFsCkGBJe 587i1jeH1z9oyrHUcP6dhR3joIuOUuUJk1uR7YZq4L4POSvrJnvzMFkSv6tBKL2p Uq9qD7mpt/9zl3PART7HK9KYfTGJ =fSXc -----END PGP SIGNATURE----- Merge tag 'kbuild-fixes-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - remove unneeded ar-option and KBUILD_ARFLAGS - remove long-deprecated SUBDIRS - fix modpost to suppress false-positive warnings for UML builds - fix namespace.pl to handle relative paths to ${objtree}, ${srctree} - make setlocalversion work for /bin/sh - make header archive reproducible - fix some Makefiles and documents * tag 'kbuild-fixes-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kheaders: make headers archive reproducible kbuild: update compile-test header list for v5.4-rc2 kbuild: two minor updates for Documentation/kbuild/modules.rst scripts/setlocalversion: clear local variable to make it work for sh namespace: fix namespace.pl script to support relative paths video/logo: do not generate unneeded logo C files video/logo: remove unneeded *.o pattern from clean-files integrity: remove pointless subdir-$(CONFIG_...) integrity: remove unneeded, broken attempt to add -fshort-wchar modpost: fix static EXPORT_SYMBOL warnings for UML build kbuild: correct formatting of header in kbuild module docs kbuild: remove SUBDIRS support kbuild: remove ar-option and KBUILD_ARFLAGS |
|
Linus Torvalds | b145b0eb20 |
ARM and x86 bugfixes of all kinds. The most visible one is that migrating
a nested hypervisor has always been busted on Broadwell and newer processors, and that has finally been fixed. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJdlzTRAAoJEL/70l94x66DElcH/Rvhn5VQE/n2J+tKEXAICxQu FqcTBJ5x2mp04aFe7xD3kWoKRJmz2lmHdw2ahFd4sqqLfGEFF/KW24ADI33vzLx/ UmT78O0Je3PX77TRnEXy+napbJny0iT6ikTAQKPbyQ151JlqlbPvatpDXXLPWQHv jj6nKHCvMBrhV3kgaXO3cTFl8swX1hvR9lo9PcA2gRNt+HMN0heUmpfKughPoOes JH+UNjsEr7MYlXYlIIc9o71EYH+kgPObwlLejy0ture+dvvZEJUJjZJE8H/XG5f2 ryXG9favaCOTAvaGf0R5Es+47A3crqUr6gHS0N28QKPn7x4hehIkKpA9dXQnWIw= =1/LN -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Paolo Bonzini: "ARM and x86 bugfixes of all kinds. The most visible one is that migrating a nested hypervisor has always been busted on Broadwell and newer processors, and that has finally been fixed" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits) KVM: x86: omit "impossible" pmu MSRs from MSR list KVM: nVMX: Fix consistency check on injected exception error code KVM: x86: omit absent pmu MSRs from MSR list selftests: kvm: Fix libkvm build error kvm: vmx: Limit guest PMCs to those supported on the host kvm: x86, powerpc: do not allow clearing largepages debugfs entry KVM: selftests: x86: clarify what is reported on KVM_GET_MSRS failure KVM: VMX: Set VMENTER_L1D_FLUSH_NOT_REQUIRED if !X86_BUG_L1TF selftests: kvm: add test for dirty logging inside nested guests KVM: x86: fix nested guest live migration with PML KVM: x86: assign two bits to track SPTE kinds KVM: x86: Expose XSAVEERPTR to the guest kvm: x86: Enumerate support for CLZERO instruction kvm: x86: Use AMD CPUID semantics for AMD vCPUs kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH KVM: X86: Fix userspace set invalid CR4 kvm: x86: Fix a spurious -E2BIG in __do_cpuid_func KVM: LAPIC: Loosen filter for adaptive tuning of lapic_timer_advance_ns KVM: arm/arm64: vgic: Use the appropriate TRACE_INCLUDE_PATH arm64: KVM: Kill hyp_alternate_select() ... |
|
Masahiro Yamada | 13dc8c029c |
kbuild: remove ar-option and KBUILD_ARFLAGS
Commit
|
|
Paolo Bonzini | 833b45de69 |
kvm: x86, powerpc: do not allow clearing largepages debugfs entry
The largepages debugfs entry is incremented/decremented as shadow pages are created or destroyed. Clearing it will result in an underflow, which is harmless to KVM but ugly (and could be misinterpreted by tools that use debugfs information), so make this particular statistic read-only. Cc: kvm-ppc@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
|
Linus Torvalds | a3c0e7b1fe |
libnvdimm fixes v5.4-rc1
- Complete the reworks to interoperate with powerpc dynamic huge page sizes - Fix a crash due to missed accounting for the powerpc 'struct page'-memmap mapping granularity. - Fix badblock initialization for volatile (DRAM emulated) pmem ranges. - Stop triggering request_key() notifications to userspace when NVDIMM-security is disabled / not present. - Miscellaneous small fixups. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJdkAprAAoJEB7SkWpmfYgCjXoQAIwJE1VzNP1V+ARxfs1rTGVz pbNJiBnj4gxDaCkcKoatiadRkytUxeUNEcPslEKsfoNinXYqkpjMQoWm2VpILOMU nY+SvIudGRnuesq2/Y+CP8zrX6rV4eBDfHK05RN/Zp1IlW7pTDItUx8mJ7glmDwG PW0vkvK7yZ+dRFnpQ7QFjhA0Q3oudO5YcTVBDK5YYtDGlv69xfXqc9LW8SszJ1kU rhCIT1kdoL5of0TIgG5pTfmggPSQ9y1xPsKjllOHNa3m50eGOkkQLELOVzQb1frW cjAsPLjRDSzvdHHSLyu0Is04Q5JU2CucxHl2SXGHiOt5tigH8dk5XFxWt0Pc8EXx acYYiBqUXC3MomSYWeLK4BdO2cRTqcPPXgJYAqXblqr+/0ys+rFepjw+j8JkiLZa 5UCC30l1GXEpw9u6gdCMqvvHN2gHvDB0BV82Sx8wTewJpeL18wCUJoKVuFmpsHko p1cCe7St1TzcK3eO+xfeW1rxNrcXUpKVYXVa/WOJW0vwErqAZ6YCdNuyJHocZzXn vNyIQmVDOlubsgBAI2ExxeZO6xc8UIwLhLg7XEJ0mg3k6UXA8HZxH2B2THJk1BSF RppodkYiMknh11sqgpGp+Hz5XSEg/jvmCdL/qRDGAwhsFhFaxDH37Kg4Qncj2/dg uDvDHXNCjbGpzCo3tyNx =Z6Fa -----END PGP SIGNATURE----- Merge tag 'libnvdimm-fixes-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm More libnvdimm updates from Dan Williams: - Complete the reworks to interoperate with powerpc dynamic huge page sizes - Fix a crash due to missed accounting for the powerpc 'struct page'-memmap mapping granularity - Fix badblock initialization for volatile (DRAM emulated) pmem ranges - Stop triggering request_key() notifications to userspace when NVDIMM-security is disabled / not present - Miscellaneous small fixups * tag 'libnvdimm-fixes-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm/region: Enable MAP_SYNC for volatile regions libnvdimm: prevent nvdimm from requesting key when security is disabled libnvdimm/region: Initialize bad block for volatile namespaces libnvdimm/nfit_test: Fix acpi_handle redefinition libnvdimm/altmap: Track namespace boundaries in altmap libnvdimm: Fix endian conversion issues libnvdimm/dax: Pick the right alignment default when creating dax devices powerpc/book3s64: Export has_transparent_hugepage() related functions. |
|
Linus Torvalds | a2953204b5 |
powerpc fixes for 5.4 #2
An assortment of fixes that were either missed by me, or didn't arrive quite in time for the first v5.4 pull. Most notable is a fix for an issue with tlbie (broadcast TLB invalidation) on Power9, when using the Radix MMU. The tlbie can race with an mtpid (move to PID register, essentially MMU context switch) on another thread of the core, which can cause stores to continue to go to a page after it's unmapped. A fix in our KVM code to add a missing barrier, the lack of which has been observed to cause missed IPIs and subsequently stuck CPUs in the host. A change to the way we initialise PCR (Processor Compatibility Register) to make it forward compatible with future CPUs. On some older PowerVM systems our H_BLOCK_REMOVE support could oops, fix it to detect such systems and fallback to the old invalidation method. A fix for an oops seen on some machines when using KASAN on 32-bit. A handful of other minor fixes, and two new selftests. Thanks to: Alistair Popple, Aneesh Kumar K.V, Christophe Leroy, Gustavo Romero, Joel Stanley, Jordan Niethe, Laurent Dufour, Michael Roth, Oliver O'Halloran. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl2PRGITHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgClRD/9jKIT6GjVpRMc+Dg9zHB5/Pir7gePk ztXKI+u15GrrXgjtWEZ1PaaXvtNIfs/IZHDQm5gyJjiBAKcGl2v+9ETaMzO5sjZ7 GSe1F8VX/MwzRnET8Jph8w/b0cy0Q8xndkEOjcJqJ7+TF+SSWqmJEdmBfkU23jWD B3kW4W1x2xt/XGsX25l1HpUgpcJqzukCeYUSCdqUu2j+sXAEZmfgTRG8uD4HffzZ 3As76TrBiJsDnkyH0qi2G1BuLXrQbAMdjTeSGi+cb0gTIunCr190gI4+Tjdu2/z7 ywWR2ZUkueCNDcLsqXaqpZx50utPJ44//uY750sk72vixJJOVuOWM6+5HKVW83se /v0zkOcI9+ywNHe0vLfP3Jm/OMMHYxkIwz6kVu2NSR6sE79B9AZpBFU+Nynq7kKl +Hc6md/HATvR+NK6LtQKtEGydRhvxU5n3KBmjq3SQj+B/ZlU6IdgerfhUWrNvg0B zzHeT35X6UBpswonhkQLgqJuaWpkClK9wsUy85MuA7aub1EP8S6/X7paKoiOtAHK NjlXM2JYV5OKwhjGgdCiI94Bdune7yudKPdsXV3Gr8Iw7wf2bQk1p7VH+LcruyE9 YJdXwCgN0PaoFUQh3AR4CqzzFwqDya8FQqdkFN3kqhRLVGAMq/PsV8/Tn+myTgQP rZnWnbfZh9BMjw== =dF42 -----END PGP SIGNATURE----- Merge tag 'powerpc-5.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "An assortment of fixes that were either missed by me, or didn't arrive quite in time for the first v5.4 pull. - Most notable is a fix for an issue with tlbie (broadcast TLB invalidation) on Power9, when using the Radix MMU. The tlbie can race with an mtpid (move to PID register, essentially MMU context switch) on another thread of the core, which can cause stores to continue to go to a page after it's unmapped. - A fix in our KVM code to add a missing barrier, the lack of which has been observed to cause missed IPIs and subsequently stuck CPUs in the host. - A change to the way we initialise PCR (Processor Compatibility Register) to make it forward compatible with future CPUs. - On some older PowerVM systems our H_BLOCK_REMOVE support could oops, fix it to detect such systems and fallback to the old invalidation method. - A fix for an oops seen on some machines when using KASAN on 32-bit. - A handful of other minor fixes, and two new selftests. Thanks to: Alistair Popple, Aneesh Kumar K.V, Christophe Leroy, Gustavo Romero, Joel Stanley, Jordan Niethe, Laurent Dufour, Michael Roth, Oliver O'Halloran" * tag 'powerpc-5.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/eeh: Fix eeh eeh_debugfs_break_device() with SRIOV devices powerpc/nvdimm: use H_SCM_QUERY hcall on H_OVERLAP error powerpc/nvdimm: Use HCALL error as the return value selftests/powerpc: Add test case for tlbie vs mtpidr ordering issue powerpc/mm: Fixup tlbie vs mtpidr/mtlpidr ordering issue on POWER9 powerpc/book3s64/radix: Rename CPU_FTR_P9_TLBIE_BUG feature flag powerpc/book3s64/mm: Don't do tlbie fixup for some hardware revisions powerpc/pseries: Call H_BLOCK_REMOVE when supported powerpc/pseries: Read TLB Block Invalidate Characteristics KVM: PPC: Book3S HV: use smp_mb() when setting/clearing host_ipi flag powerpc/mm: Fix an Oops in kasan_mmu_init() powerpc/mm: Add a helper to select PAGE_KERNEL_RO or PAGE_READONLY powerpc/64s: Set reserved PCR bits powerpc: Fix definition of PCR bits to work with old binutils powerpc/book3s64/radix: Remove WARN_ON in destroy_context() powerpc/tm: Add tm-poison test |
|
Oliver O'Halloran | 253c892193 |
powerpc/eeh: Fix eeh eeh_debugfs_break_device() with SRIOV devices
s/CONFIG_IOV/CONFIG_PCI_IOV/
Whoops.
Fixes:
|
|
Mark Rutland | b4ed71f557 |
mm: treewide: clarify pgtable_page_{ctor,dtor}() naming
The naming of pgtable_page_{ctor,dtor}() seems to have confused a few people, and until recently arm64 used these erroneously/pointlessly for other levels of page table. To make it incredibly clear that these only apply to the PTE level, and to align with the naming of pgtable_pmd_page_{ctor,dtor}(), let's rename them to pgtable_pte_page_{ctor,dtor}(). These changes were generated with the following shell script: ---- git grep -lw 'pgtable_page_.tor' | while read FILE; do sed -i '{s/pgtable_page_ctor/pgtable_pte_page_ctor/}' $FILE; sed -i '{s/pgtable_page_dtor/pgtable_pte_page_dtor/}' $FILE; done ---- ... with the documentation re-flowed to remain under 80 columns, and whitespace fixed up in macros to keep backslashes aligned. There should be no functional change as a result of this patch. Link: http://lkml.kernel.org/r/20190722141133.3116-1-mark.rutland@arm.com Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Linus Torvalds | 9c9fa97a8e |
Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton: - a few hot fixes - ocfs2 updates - almost all of -mm (slab-generic, slab, slub, kmemleak, kasan, cleanups, debug, pagecache, memcg, gup, pagemap, memory-hotplug, sparsemem, vmalloc, initialization, z3fold, compaction, mempolicy, oom-kill, hugetlb, migration, thp, mmap, madvise, shmem, zswap, zsmalloc) * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (132 commits) mm/zsmalloc.c: fix a -Wunused-function warning zswap: do not map same object twice zswap: use movable memory if zpool support allocate movable memory zpool: add malloc_support_movable to zpool_driver shmem: fix obsolete comment in shmem_getpage_gfp() mm/madvise: reduce code duplication in error handling paths mm: mmap: increase sockets maximum memory size pgoff for 32bits mm/mmap.c: refine find_vma_prev() with rb_last() riscv: make mmap allocation top-down by default mips: use generic mmap top-down layout and brk randomization mips: replace arch specific way to determine 32bit task with generic version mips: adjust brk randomization offset to fit generic version mips: use STACK_TOP when computing mmap base address mips: properly account for stack randomization and stack guard gap arm: use generic mmap top-down layout and brk randomization arm: use STACK_TOP when computing mmap base address arm: properly account for stack randomization and stack guard gap arm64, mm: make randomization selected by generic topdown mmap layout arm64, mm: move generic mmap layout functions to mm arm64: consider stack randomization for mmap base only when necessary ... |
|
Kefeng Wang | 9ef258bad7 |
thp: update split_huge_page_pmd() comment
According to
|
|
Mike Rapoport | 782de70c42 |
mm: consolidate pgtable_cache_init() and pgd_cache_init()
Both pgtable_cache_init() and pgd_cache_init() are used to initialize kmem cache for page table allocations on several architectures that do not use PAGE_SIZE tables for one or more levels of the page table hierarchy. Most architectures do not implement these functions and use __weak default NOP implementation of pgd_cache_init(). Since there is no such default for pgtable_cache_init(), its empty stub is duplicated among most architectures. Rename the definitions of pgd_cache_init() to pgtable_cache_init() and drop empty stubs of pgtable_cache_init(). Link: http://lkml.kernel.org/r/1566457046-22637-1-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Will Deacon <will@kernel.org> [arm64] Acked-by: Thomas Gleixner <tglx@linutronix.de> [x86] Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Nicholas Piggin | 13224794cb |
mm: remove quicklist page table caches
Patch series "mm: remove quicklist page table caches". A while ago Nicholas proposed to remove quicklist page table caches [1]. I've rebased his patch on the curren upstream and switched ia64 and sh to use generic versions of PTE allocation. [1] https://lore.kernel.org/linux-mm/20190711030339.20892-1-npiggin@gmail.com This patch (of 3): Remove page table allocator "quicklists". These have been around for a long time, but have not got much traction in the last decade and are only used on ia64 and sh architectures. The numbers in the initial commit look interesting but probably don't apply anymore. If anybody wants to resurrect this it's in the git history, but it's unhelpful to have this code and divergent allocator behaviour for minor archs. Also it might be better to instead make more general improvements to page allocator if this is still so slow. Link: http://lkml.kernel.org/r/1565250728-21721-2-git-send-email-rppt@linux.ibm.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Matthew Wilcox (Oracle) | d8c6546b1a |
mm: introduce compound_nr()
Replace 1 << compound_order(page) with compound_nr(page). Minor improvements in readability. Link: http://lkml.kernel.org/r/20190721104612.19120-4-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Matthew Wilcox (Oracle) | 94ad933810 |
mm: introduce page_shift()
Replace PAGE_SHIFT + compound_order(page) with the new page_shift() function. Minor improvements in readability. [akpm@linux-foundation.org: fix build in tce_page_is_contained()] Link: http://lkml.kernel.org/r/201907241853.yNQTrJWd%25lkp@intel.com Link: http://lkml.kernel.org/r/20190721104612.19120-3-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Aneesh Kumar K.V | faa6d21153 |
powerpc/nvdimm: use H_SCM_QUERY hcall on H_OVERLAP error
Right now we force an unbind of SCM memory at drcindex on H_OVERLAP error. This really slows down operations like kexec where we get the H_OVERLAP error because we don't go through a full hypervisor re init. H_OVERLAP error for a H_SCM_BIND_MEM hcall indicates that SCM memory at drc index is already bound. Since we don't specify a logical memory address for bind hcall, we can use the H_SCM_QUERY hcall to query the already bound logical address. Boot time difference with and without patch is: [ 5.583617] IOMMU table initialized, virtual merging enabled [ 5.603041] papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Retrying bind after unbinding [ 301.514221] papr_scm ibm,persistent-memory:ibm,pmemory@44108001: Retrying bind after unbinding [ 340.057238] hv-24x7: read 1530 catalog entries, created 537 event attrs (0 failures), 275 descs after fix [ 5.101572] IOMMU table initialized, virtual merging enabled [ 5.116984] papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Querying SCM details [ 5.117223] papr_scm ibm,persistent-memory:ibm,pmemory@44108001: Querying SCM details [ 5.120530] hv-24x7: read 1530 catalog entries, created 537 event attrs (0 failures), 275 descs Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903123452.28620-2-aneesh.kumar@linux.ibm.com |
|
Aneesh Kumar K.V | 4111cdef0e |
powerpc/nvdimm: Use HCALL error as the return value
This simplifies the error handling and also enable us to switch to H_SCM_QUERY hcall in a later patch on H_OVERLAP error. We also do some kernel print formatting fixup in this patch. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903123452.28620-1-aneesh.kumar@linux.ibm.com |
|
Linus Torvalds | 0b36c9eed2 |
Merge branch 'work.mount3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more mount API conversions from Al Viro: "Assorted conversions of options parsing to new API. gfs2 is probably the most serious one here; the rest is trivial stuff. Other things in what used to be #work.mount are going to wait for the next cycle (and preferably go via git trees of the filesystems involved)" * 'work.mount3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: gfs2: Convert gfs2 to fs_context vfs: Convert spufs to use the new mount API vfs: Convert hypfs to use the new mount API hypfs: Fix error number left in struct pointer member vfs: Convert functionfs to use the new mount API vfs: Convert bpf to use the new mount API |
|
Aneesh Kumar K.V | cf387d9644 |
libnvdimm/altmap: Track namespace boundaries in altmap
With PFN_MODE_PMEM namespace, the memmap area is allocated from the device
area. Some architectures map the memmap area with large page size. On
architectures like ppc64, 16MB page for memap mapping can map 262144 pfns.
This maps a namespace size of 16G.
When populating memmap region with 16MB page from the device area,
make sure the allocated space is not used to map resources outside this
namespace. Such usage of device area will prevent a namespace destroy.
Add resource end pnf in altmap and use that to check if the memmap area
allocation can map pfn outside the namespace. On ppc64 in such case we fallback
to allocation from memory.
This fix kernel crash reported below:
[ 132.034989] WARNING: CPU: 13 PID: 13719 at mm/memremap.c:133 devm_memremap_pages_release+0x2d8/0x2e0
[ 133.464754] BUG: Unable to handle kernel data access at 0xc00c00010b204000
[ 133.464760] Faulting instruction address: 0xc00000000007580c
[ 133.464766] Oops: Kernel access of bad area, sig: 11 [#1]
[ 133.464771] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
.....
[ 133.464901] NIP [c00000000007580c] vmemmap_free+0x2ac/0x3d0
[ 133.464906] LR [c0000000000757f8] vmemmap_free+0x298/0x3d0
[ 133.464910] Call Trace:
[ 133.464914] [c000007cbfd0f7b0] [c0000000000757f8] vmemmap_free+0x298/0x3d0 (unreliable)
[ 133.464921] [c000007cbfd0f8d0] [c000000000370a44] section_deactivate+0x1a4/0x240
[ 133.464928] [c000007cbfd0f980] [c000000000386270] __remove_pages+0x3a0/0x590
[ 133.464935] [c000007cbfd0fa50] [c000000000074158] arch_remove_memory+0x88/0x160
[ 133.464942] [c000007cbfd0fae0] [c0000000003be8c0] devm_memremap_pages_release+0x150/0x2e0
[ 133.464949] [c000007cbfd0fb70] [c000000000738ea0] devm_action_release+0x30/0x50
[ 133.464955] [c000007cbfd0fb90] [c00000000073a5a4] release_nodes+0x344/0x400
[ 133.464961] [c000007cbfd0fc40] [c00000000073378c] device_release_driver_internal+0x15c/0x250
[ 133.464968] [c000007cbfd0fc80] [c00000000072fd14] unbind_store+0x104/0x110
[ 133.464973] [c000007cbfd0fcd0] [c00000000072ee24] drv_attr_store+0x44/0x70
[ 133.464981] [c000007cbfd0fcf0] [c0000000004a32bc] sysfs_kf_write+0x6c/0xa0
[ 133.464987] [c000007cbfd0fd10] [c0000000004a1dfc] kernfs_fop_write+0x17c/0x250
[ 133.464993] [c000007cbfd0fd60] [c0000000003c348c] __vfs_write+0x3c/0x70
[ 133.464999] [c000007cbfd0fd80] [c0000000003c75d0] vfs_write+0xd0/0x250
djbw: Aneesh notes that this crash can likely be triggered in any kernel that
supports 'papr_scm', so flagging that commit for -stable consideration.
Fixes:
|
|
Aneesh Kumar K.V | a6f197f889 |
powerpc/book3s64: Export has_transparent_hugepage() related functions.
In later patch, we want to use hash_transparent_hugepage() in a kernel module. Export two related functions. Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/20190924042440.27946-1-aneesh.kumar@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> |
|
Aneesh Kumar K.V | 047e6575ae |
powerpc/mm: Fixup tlbie vs mtpidr/mtlpidr ordering issue on POWER9
On POWER9, under some circumstances, a broadcast TLB invalidation will
fail to invalidate the ERAT cache on some threads when there are
parallel mtpidr/mtlpidr happening on other threads of the same core.
This can cause stores to continue to go to a page after it's unmapped.
The workaround is to force an ERAT flush using PID=0 or LPID=0 tlbie
flush. This additional TLB flush will cause the ERAT cache
invalidation. Since we are using PID=0 or LPID=0, we don't get
filtered out by the TLB snoop filtering logic.
We need to still follow this up with another tlbie to take care of
store vs tlbie ordering issue explained in commit:
|
|
Aneesh Kumar K.V | 09ce98cacd |
powerpc/book3s64/radix: Rename CPU_FTR_P9_TLBIE_BUG feature flag
Rename the #define to indicate this is related to store vs tlbie
ordering issue. In the next patch, we will be adding another feature
flag that is used to handles ERAT flush vs tlbie ordering issue.
Fixes:
|
|
Aneesh Kumar K.V | 677733e296 |
powerpc/book3s64/mm: Don't do tlbie fixup for some hardware revisions
The store ordering vs tlbie issue mentioned in commit |
|
Laurent Dufour | 59545ebe33 |
powerpc/pseries: Call H_BLOCK_REMOVE when supported
Depending on the hardware and the hypervisor, the hcall H_BLOCK_REMOVE
may not be able to process all the page sizes for a segment base page
size, as reported by the TLB Invalidate Characteristics.
For each pair of base segment page size and actual page size, this
characteristic tells us the size of the block the hcall supports.
In the case, the hcall is not supporting a pair of base segment page
size, actual page size, it is returning H_PARAM which leads to a panic
like this:
kernel BUG at /home/srikar/work/linux.git/arch/powerpc/platforms/pseries/lpar.c:466!
Oops: Exception in kernel mode, sig: 5 [#1]
BE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 28 PID: 583 Comm: modprobe Not tainted 5.2.0-master #5
NIP: c0000000000be8dc LR: c0000000000be880 CTR: 0000000000000000
REGS: c0000007e77fb130 TRAP: 0700 Not tainted (5.2.0-master)
MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 42224824 XER: 20000000
CFAR: c0000000000be8fc IRQMASK: 0
GPR00: 0000000022224828 c0000007e77fb3c0 c000000001434d00 0000000000000005
GPR04: 9000000004fa8c00 0000000000000000 0000000000000003 0000000000000001
GPR08: c0000007e77fb450 0000000000000000 0000000000000001 ffffffffffffffff
GPR12: c0000007e77fb450 c00000000edfcb80 0000cd7d3ea30000 c0000000016022b0
GPR16: 00000000000000b0 0000cd7d3ea30000 0000000000000001 c080001f04f00105
GPR20: 0000000000000003 0000000000000004 c000000fbeb05f58 c000000001602200
GPR24: 0000000000000000 0000000000000004 8800000000000000 c000000000c5d148
GPR28: c000000000000000 8000000000000000 a000000000000000 c0000007e77fb580
NIP [c0000000000be8dc] .call_block_remove+0x12c/0x220
LR [c0000000000be880] .call_block_remove+0xd0/0x220
Call Trace:
0xc000000fb8c00240 (unreliable)
.pSeries_lpar_flush_hash_range+0x578/0x670
.flush_hash_range+0x44/0x100
.__flush_tlb_pending+0x3c/0xc0
.zap_pte_range+0x7ec/0x830
.unmap_page_range+0x3f4/0x540
.unmap_vmas+0x94/0x120
.exit_mmap+0xac/0x1f0
.mmput+0x9c/0x1f0
.do_exit+0x388/0xd60
.do_group_exit+0x54/0x100
.__se_sys_exit_group+0x14/0x20
system_call+0x5c/0x70
Instruction dump:
39400001 38a00000 4800003c 60000000 60420000 7fa9e800 38e00000 419e0014
7d29d278 7d290074 7929d182 69270001 <0b070000> 7d495378 394a0001 7fa93040
The call to H_BLOCK_REMOVE should only be made for the supported pair
of base segment page size, actual page size and using the correct
maximum block size.
Due to the required complexity in do_block_remove() and
call_block_remove(), and the fact that currently a block size of 8 is
returned by the hypervisor, we are only supporting 8 size block to the
H_BLOCK_REMOVE hcall.
In order to identify this limitation easily in the code, a local
define HBLKR_SUPPORTED_SIZE defining the currently supported block
size, and a dedicated checking helper is_supported_hlbkr() are
introduced.
For regular pages and hugetlb, the assumption is made that the page
size is equal to the base page size. For THP the page size is assumed
to be 16M.
Fixes:
|
|
Laurent Dufour | 1211ee61b4 |
powerpc/pseries: Read TLB Block Invalidate Characteristics
The PAPR document specifies the TLB Block Invalidate Characteristics
which tells for each pair of segment base page size, actual page size,
the size of the block the hcall H_BLOCK_REMOVE supports.
These characteristics are loaded at boot time in a new table
hblkr_size. The table is separate from the mmu_psize_def because this
is specific to the pseries platform.
A new init function, pseries_lpar_read_hblkrm_characteristics() is
added to read the characteristics. It is called from
pSeries_setup_arch().
Fixes:
|
|
Michael Roth | 3a83f677a6 |
KVM: PPC: Book3S HV: use smp_mb() when setting/clearing host_ipi flag
On a 2-socket Power9 system with 32 cores/128 threads (SMT4) and 1TB
of memory running the following guest configs:
guest A:
- 224GB of memory
- 56 VCPUs (sockets=1,cores=28,threads=2), where:
VCPUs 0-1 are pinned to CPUs 0-3,
VCPUs 2-3 are pinned to CPUs 4-7,
...
VCPUs 54-55 are pinned to CPUs 108-111
guest B:
- 4GB of memory
- 4 VCPUs (sockets=1,cores=4,threads=1)
with the following workloads (with KSM and THP enabled in all):
guest A:
stress --cpu 40 --io 20 --vm 20 --vm-bytes 512M
guest B:
stress --cpu 4 --io 4 --vm 4 --vm-bytes 512M
host:
stress --cpu 4 --io 4 --vm 2 --vm-bytes 256M
the below soft-lockup traces were observed after an hour or so and
persisted until the host was reset (this was found to be reliably
reproducible for this configuration, for kernels 4.15, 4.18, 5.0,
and 5.3-rc5):
[ 1253.183290] rcu: INFO: rcu_sched self-detected stall on CPU
[ 1253.183319] rcu: 124-....: (5250 ticks this GP) idle=10a/1/0x4000000000000002 softirq=5408/5408 fqs=1941
[ 1256.287426] watchdog: BUG: soft lockup - CPU#105 stuck for 23s! [CPU 52/KVM:19709]
[ 1264.075773] watchdog: BUG: soft lockup - CPU#24 stuck for 23s! [worker:19913]
[ 1264.079769] watchdog: BUG: soft lockup - CPU#31 stuck for 23s! [worker:20331]
[ 1264.095770] watchdog: BUG: soft lockup - CPU#45 stuck for 23s! [worker:20338]
[ 1264.131773] watchdog: BUG: soft lockup - CPU#64 stuck for 23s! [avocado:19525]
[ 1280.408480] watchdog: BUG: soft lockup - CPU#124 stuck for 22s! [ksmd:791]
[ 1316.198012] rcu: INFO: rcu_sched self-detected stall on CPU
[ 1316.198032] rcu: 124-....: (21003 ticks this GP) idle=10a/1/0x4000000000000002 softirq=5408/5408 fqs=8243
[ 1340.411024] watchdog: BUG: soft lockup - CPU#124 stuck for 22s! [ksmd:791]
[ 1379.212609] rcu: INFO: rcu_sched self-detected stall on CPU
[ 1379.212629] rcu: 124-....: (36756 ticks this GP) idle=10a/1/0x4000000000000002 softirq=5408/5408 fqs=14714
[ 1404.413615] watchdog: BUG: soft lockup - CPU#124 stuck for 22s! [ksmd:791]
[ 1442.227095] rcu: INFO: rcu_sched self-detected stall on CPU
[ 1442.227115] rcu: 124-....: (52509 ticks this GP) idle=10a/1/0x4000000000000002 softirq=5408/5408 fqs=21403
[ 1455.111787] INFO: task worker:19907 blocked for more than 120 seconds.
[ 1455.111822] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1
[ 1455.111833] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1455.111884] INFO: task worker:19908 blocked for more than 120 seconds.
[ 1455.111905] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1
[ 1455.111925] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1455.111966] INFO: task worker:20328 blocked for more than 120 seconds.
[ 1455.111986] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1
[ 1455.111998] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1455.112048] INFO: task worker:20330 blocked for more than 120 seconds.
[ 1455.112068] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1
[ 1455.112097] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1455.112138] INFO: task worker:20332 blocked for more than 120 seconds.
[ 1455.112159] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1
[ 1455.112179] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1455.112210] INFO: task worker:20333 blocked for more than 120 seconds.
[ 1455.112231] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1
[ 1455.112242] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1455.112282] INFO: task worker:20335 blocked for more than 120 seconds.
[ 1455.112303] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1
[ 1455.112332] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1455.112372] INFO: task worker:20336 blocked for more than 120 seconds.
[ 1455.112392] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1
CPUs 45, 24, and 124 are stuck on spin locks, likely held by
CPUs 105 and 31.
CPUs 105 and 31 are stuck in smp_call_function_many(), waiting on
target CPU 42. For instance:
# CPU 105 registers (via xmon)
R00 = c00000000020b20c R16 = 00007d1bcd800000
R01 = c00000363eaa7970 R17 = 0000000000000001
R02 = c0000000019b3a00 R18 = 000000000000006b
R03 = 000000000000002a R19 = 00007d537d7aecf0
R04 = 000000000000002a R20 = 60000000000000e0
R05 = 000000000000002a R21 = 0801000000000080
R06 = c0002073fb0caa08 R22 = 0000000000000d60
R07 = c0000000019ddd78 R23 = 0000000000000001
R08 = 000000000000002a R24 = c00000000147a700
R09 = 0000000000000001 R25 = c0002073fb0ca908
R10 = c000008ffeb4e660 R26 = 0000000000000000
R11 = c0002073fb0ca900 R27 = c0000000019e2464
R12 = c000000000050790 R28 = c0000000000812b0
R13 = c000207fff623e00 R29 = c0002073fb0ca808
R14 = 00007d1bbee00000 R30 = c0002073fb0ca800
R15 = 00007d1bcd600000 R31 = 0000000000000800
pc = c00000000020b260 smp_call_function_many+0x3d0/0x460
cfar= c00000000020b270 smp_call_function_many+0x3e0/0x460
lr = c00000000020b20c smp_call_function_many+0x37c/0x460
msr = 900000010288b033 cr = 44024824
ctr = c000000000050790 xer = 0000000000000000 trap = 100
CPU 42 is running normally, doing VCPU work:
# CPU 42 stack trace (via xmon)
[link register ] c00800001be17188 kvmppc_book3s_radix_page_fault+0x90/0x2b0 [kvm_hv]
[c000008ed3343820] c000008ed3343850 (unreliable)
[c000008ed33438d0] c00800001be11b6c kvmppc_book3s_hv_page_fault+0x264/0xe30 [kvm_hv]
[c000008ed33439d0] c00800001be0d7b4 kvmppc_vcpu_run_hv+0x8dc/0xb50 [kvm_hv]
[c000008ed3343ae0] c00800001c10891c kvmppc_vcpu_run+0x34/0x48 [kvm]
[c000008ed3343b00] c00800001c10475c kvm_arch_vcpu_ioctl_run+0x244/0x420 [kvm]
[c000008ed3343b90] c00800001c0f5a78 kvm_vcpu_ioctl+0x470/0x7c8 [kvm]
[c000008ed3343d00] c000000000475450 do_vfs_ioctl+0xe0/0xc70
[c000008ed3343db0] c0000000004760e4 ksys_ioctl+0x104/0x120
[c000008ed3343e00] c000000000476128 sys_ioctl+0x28/0x80
[c000008ed3343e20] c00000000000b388 system_call+0x5c/0x70
--- Exception: c00 (System Call) at 00007d545cfd7694
SP (7d53ff7edf50) is in userspace
It was subsequently found that ipi_message[PPC_MSG_CALL_FUNCTION]
was set for CPU 42 by at least 1 of the CPUs waiting in
smp_call_function_many(), but somehow the corresponding
call_single_queue entries were never processed by CPU 42, causing the
callers to spin in csd_lock_wait() indefinitely.
Nick Piggin suggested something similar to the following sequence as
a possible explanation (interleaving of CALL_FUNCTION/RESCHEDULE
IPI messages seems to be most common, but any mix of CALL_FUNCTION and
!CALL_FUNCTION messages could trigger it):
CPU
X: smp_muxed_ipi_set_message():
X: smp_mb()
X: message[RESCHEDULE] = 1
X: doorbell_global_ipi(42):
X: kvmppc_set_host_ipi(42, 1)
X: ppc_msgsnd_sync()/smp_mb()
X: ppc_msgsnd() -> 42
42: doorbell_exception(): // from CPU X
42: ppc_msgsync()
105: smp_muxed_ipi_set_message():
105: smb_mb()
// STORE DEFERRED DUE TO RE-ORDERING
--105: message[CALL_FUNCTION] = 1
| 105: doorbell_global_ipi(42):
| 105: kvmppc_set_host_ipi(42, 1)
| 42: kvmppc_set_host_ipi(42, 0)
| 42: smp_ipi_demux_relaxed()
| 42: // returns to executing guest
| // RE-ORDERED STORE COMPLETES
->105: message[CALL_FUNCTION] = 1
105: ppc_msgsnd_sync()/smp_mb()
105: ppc_msgsnd() -> 42
42: local_paca->kvm_hstate.host_ipi == 0 // IPI ignored
105: // hangs waiting on 42 to process messages/call_single_queue
This can be prevented with an smp_mb() at the beginning of
kvmppc_set_host_ipi(), such that stores to message[<type>] (or other
state indicated by the host_ipi flag) are ordered vs. the store to
to host_ipi.
However, doing so might still allow for the following scenario (not
yet observed):
CPU
X: smp_muxed_ipi_set_message():
X: smp_mb()
X: message[RESCHEDULE] = 1
X: doorbell_global_ipi(42):
X: kvmppc_set_host_ipi(42, 1)
X: ppc_msgsnd_sync()/smp_mb()
X: ppc_msgsnd() -> 42
42: doorbell_exception(): // from CPU X
42: ppc_msgsync()
// STORE DEFERRED DUE TO RE-ORDERING
-- 42: kvmppc_set_host_ipi(42, 0)
| 42: smp_ipi_demux_relaxed()
| 105: smp_muxed_ipi_set_message():
| 105: smb_mb()
| 105: message[CALL_FUNCTION] = 1
| 105: doorbell_global_ipi(42):
| 105: kvmppc_set_host_ipi(42, 1)
| // RE-ORDERED STORE COMPLETES
-> 42: kvmppc_set_host_ipi(42, 0)
42: // returns to executing guest
105: ppc_msgsnd_sync()/smp_mb()
105: ppc_msgsnd() -> 42
42: local_paca->kvm_hstate.host_ipi == 0 // IPI ignored
105: // hangs waiting on 42 to process messages/call_single_queue
Fixing this scenario would require an smp_mb() *after* clearing
host_ipi flag in kvmppc_set_host_ipi() to order the store vs.
subsequent processing of IPI messages.
To handle both cases, this patch splits kvmppc_set_host_ipi() into
separate set/clear functions, where we execute smp_mb() prior to
setting host_ipi flag, and after clearing host_ipi flag. These
functions pair with each other to synchronize the sender and receiver
sides.
With that change in place the above workload ran for 20 hours without
triggering any lock-ups.
Fixes:
|
|
Linus Torvalds | 299d14d4c3 |
pci-v5.4-changes
-----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl2JNVAUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vyTOA/9EZeyS7J+ZcOwihWz5vNijf0kfpKp /jZ9VF9nHjsL9Pw3/Fzha605Ssrtwcqge8g/sze9f0g/pxZk99lLHokE6dEOurEA GyKpNNMdiBol4YZMCsSoYji0MpwW0uMCuASPMiEwv2LxZ72A2Tu1RbgYLU+n4m1T fQldDTxsUMXc/OH/8SL8QDEh6o8qyDRhmSXFAOv8RGqN8N3iUwVwhQobKpwpmEvx ddzqWMS8f91qkhIKO7fgc9P4NI/7yI7kkF+wcdwtfiMO8Qkr4IdcdF7qwNVAtpKA A+sMRi59i2XxDTqRFx+wXXMa+rt+Pf1pucv77SO74xXWwpuXSxLVDYjULP1YQugK FTBo4SNmico/ts+n5cgm+CGMq2P2E29VYeqkI1Un6eDDvQnQlBgQdpdcBoadJ0rW y31OInjhRJC1ZK5bATKfCMbmB+VQxFsbyeUA7PBlrALyAmXZfw30iNxX9iHBhWqc myPNVEJJGp0cWTxGxMAU9MhelzeQxDAd+Eb44J5gv51bx0w9yqmZHECSDrOVdtYi HpOyI7E3Cb8m23BOHvCdB/v8igaYMZl08LUUJqu1S9mFclYyYVuOOIB04Yc2Qrx1 3PHtT8TC47FbWuzKwo12RflzoAiNShJGw+tNKo6T1jC+r5jdbKWWtTnsoRqbSfaG rG5RJpB7EuQSP1Y= =/xB3 -----END PGP SIGNATURE----- Merge tag 'pci-v5.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Consolidate _HPP/_HPX stuff in pci-acpi.c and simplify it (Krzysztof Wilczynski) - Fix incorrect PCIe device types and remove dev->has_secondary_link to simplify code that deals with upstream/downstream ports (Mika Westerberg) - After suspend, restore Resizable BAR size bits correctly for 1MB BARs (Sumit Saxena) - Enable PCI_MSI_IRQ_DOMAIN support for RISC-V (Wesley Terpstra) Virtualization: - Add ACS quirks for iProc PAXB (Abhinav Ratna), Amazon Annapurna Labs (Ali Saidi) - Move sysfs SR-IOV functions to iov.c (Kelsey Skunberg) - Remove group write permissions from sysfs sriov_numvfs, sriov_drivers_autoprobe (Kelsey Skunberg) Hotplug: - Simplify pciehp indicator control (Denis Efremov) Peer-to-peer DMA: - Allow P2P DMA between root ports for whitelisted bridges (Logan Gunthorpe) - Whitelist some Intel host bridges for P2P DMA (Logan Gunthorpe) - DMA map P2P DMA requests that traverse host bridge (Logan Gunthorpe) Amazon Annapurna Labs host bridge driver: - Add DT binding and controller driver (Jonathan Chocron) Hyper-V host bridge driver: - Fix hv_pci_dev->pci_slot use-after-free (Dexuan Cui) - Fix PCI domain number collisions (Haiyang Zhang) - Use instance ID bytes 4 & 5 as PCI domain numbers (Haiyang Zhang) - Fix build errors on non-SYSFS config (Randy Dunlap) i.MX6 host bridge driver: - Limit DBI register length (Stefan Agner) Intel VMD host bridge driver: - Fix config addressing issues (Jon Derrick) Layerscape host bridge driver: - Add bar_fixed_64bit property to endpoint driver (Xiaowei Bao) - Add CONFIG_PCI_LAYERSCAPE_EP to build EP/RC drivers separately (Xiaowei Bao) Mediatek host bridge driver: - Add MT7629 controller support (Jianjun Wang) Mobiveil host bridge driver: - Fix CPU base address setup (Hou Zhiqiang) - Make "num-lanes" property optional (Hou Zhiqiang) Tegra host bridge driver: - Fix OF node reference leak (Nishka Dasgupta) - Disable MSI for root ports to work around design problem (Vidya Sagar) - Add Tegra194 DT binding and controller support (Vidya Sagar) - Add support for sideband pins and slot regulators (Vidya Sagar) - Add PIPE2UPHY support (Vidya Sagar) Misc: - Remove unused pci_block_cfg_access() et al (Kelsey Skunberg) - Unexport pci_bus_get(), etc (Kelsey Skunberg) - Hide PM, VC, link speed, ATS, ECRC, PTM constants and interfaces in the PCI core (Kelsey Skunberg) - Clean up sysfs DEVICE_ATTR() usage (Kelsey Skunberg) - Mark expected switch fall-through (Gustavo A. R. Silva) - Propagate errors for optional regulators and PHYs (Thierry Reding) - Fix kernel command line resource_alignment parameter issues (Logan Gunthorpe)" * tag 'pci-v5.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (112 commits) PCI: Add pci_irq_vector() and other stubs when !CONFIG_PCI arm64: tegra: Add PCIe slot supply information in p2972-0000 platform arm64: tegra: Add configuration for PCIe C5 sideband signals PCI: tegra: Add support to enable slot regulators PCI: tegra: Add support to configure sideband pins PCI: vmd: Fix shadow offsets to reflect spec changes PCI: vmd: Fix config addressing when using bus offsets PCI: dwc: Add validation that PCIe core is set to correct mode PCI: dwc: al: Add Amazon Annapurna Labs PCIe controller driver dt-bindings: PCI: Add Amazon's Annapurna Labs PCIe host bridge binding PCI: Add quirk to disable MSI-X support for Amazon's Annapurna Labs Root Port PCI/VPD: Prevent VPD access for Amazon's Annapurna Labs Root Port PCI: Add ACS quirk for Amazon Annapurna Labs root ports PCI: Add Amazon's Annapurna Labs vendor ID MAINTAINERS: Add PCI native host/endpoint controllers designated reviewer PCI: hv: Use bytes 4 and 5 from instance ID as the PCI domain numbers dt-bindings: PCI: tegra: Add PCIe slot supplies regulator entries dt-bindings: PCI: tegra: Add sideband pins configuration entries PCI: tegra: Add Tegra194 PCIe support PCI: Get rid of dev->has_secondary_link flag ... |
|
Linus Torvalds | 84da111de0 |
hmm related patches for 5.4
This is more cleanup and consolidation of the hmm APIs and the very strongly related mmu_notifier interfaces. Many places across the tree using these interfaces are touched in the process. Beyond that a cleanup to the page walker API and a few memremap related changes round out the series: - General improvement of hmm_range_fault() and related APIs, more documentation, bug fixes from testing, API simplification & consolidation, and unused API removal - Simplify the hmm related kconfigs to HMM_MIRROR and DEVICE_PRIVATE, and make them internal kconfig selects - Hoist a lot of code related to mmu notifier attachment out of drivers by using a refcount get/put attachment idiom and remove the convoluted mmu_notifier_unregister_no_release() and related APIs. - General API improvement for the migrate_vma API and revision of its only user in nouveau - Annotate mmu_notifiers with lockdep and sleeping region debugging Two series unrelated to HMM or mmu_notifiers came along due to dependencies: - Allow pagemap's memremap_pages family of APIs to work without providing a struct device - Make walk_page_range() and related use a constant structure for function pointers -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl1/nnkACgkQOG33FX4g mxqaRg//c6FqowV1pQlLutvAOAgMdpzfZ9eaaDKngy9RVQxz+k/MmJrdRH/p/mMA Pq93A1XfwtraGKErHegFXGEDk4XhOustVAVFwvjyXO41dTUdoFVUkti6ftbrl/rS 6CT+X90jlvrwdRY7QBeuo7lxx7z8Qkqbk1O1kc1IOracjKfNJS+y6LTamy6weM3g tIMHI65PkxpRzN36DV9uCN5dMwFzJ73DWHp1b0acnDIigkl6u5zp6orAJVWRjyQX nmEd3/IOvdxaubAoAvboNS5CyVb4yS9xshWWMbH6AulKJv3Glca1Aa7QuSpBoN8v wy4c9+umzqRgzgUJUe1xwN9P49oBNhJpgBSu8MUlgBA4IOc3rDl/Tw0b5KCFVfkH yHkp8n6MP8VsRrzXTC6Kx0vdjIkAO8SUeylVJczAcVSyHIo6/JUJCVDeFLSTVymh EGWJ7zX2iRhUbssJ6/izQTTQyCH3YIyZ5QtqByWuX2U7ZrfkqS3/EnBW1Q+j+gPF Z2yW8iT6k0iENw6s8psE9czexuywa/Lttz94IyNlOQ8rJTiQqB9wLaAvg9hvUk7a kuspL+JGIZkrL3ouCeO/VA6xnaP+Q7nR8geWBRb8zKGHmtWrb5Gwmt6t+vTnCC2l olIDebrnnxwfBQhEJ5219W+M1pBpjiTpqK/UdBd92A4+sOOhOD0= =FRGg -----END PGP SIGNATURE----- Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull hmm updates from Jason Gunthorpe: "This is more cleanup and consolidation of the hmm APIs and the very strongly related mmu_notifier interfaces. Many places across the tree using these interfaces are touched in the process. Beyond that a cleanup to the page walker API and a few memremap related changes round out the series: - General improvement of hmm_range_fault() and related APIs, more documentation, bug fixes from testing, API simplification & consolidation, and unused API removal - Simplify the hmm related kconfigs to HMM_MIRROR and DEVICE_PRIVATE, and make them internal kconfig selects - Hoist a lot of code related to mmu notifier attachment out of drivers by using a refcount get/put attachment idiom and remove the convoluted mmu_notifier_unregister_no_release() and related APIs. - General API improvement for the migrate_vma API and revision of its only user in nouveau - Annotate mmu_notifiers with lockdep and sleeping region debugging Two series unrelated to HMM or mmu_notifiers came along due to dependencies: - Allow pagemap's memremap_pages family of APIs to work without providing a struct device - Make walk_page_range() and related use a constant structure for function pointers" * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (75 commits) libnvdimm: Enable unit test infrastructure compile checks mm, notifier: Catch sleeping/blocking for !blockable kernel.h: Add non_block_start/end() drm/radeon: guard against calling an unpaired radeon_mn_unregister() csky: add missing brackets in a macro for tlb.h pagewalk: use lockdep_assert_held for locking validation pagewalk: separate function pointers from iterator data mm: split out a new pagewalk.h header from mm.h mm/mmu_notifiers: annotate with might_sleep() mm/mmu_notifiers: prime lockdep mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end mm/mmu_notifiers: remove the __mmu_notifier_invalidate_range_start/end exports mm/hmm: hmm_range_fault() infinite loop mm/hmm: hmm_range_fault() NULL pointer bug mm/hmm: fix hmm_range_fault()'s handling of swapped out pages mm/mmu_notifiers: remove unregister_no_release RDMA/odp: remove ib_ucontext from ib_umem RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm' RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr RDMA/mlx5: Use ib_umem_start instead of umem.address ... |
|
Christophe Leroy | cbd18991e2 |
powerpc/mm: Fix an Oops in kasan_mmu_init()
Uncompressing Kernel Image ... OK
Loading Device Tree to 01ff7000, end 01fff74f ... OK
[ 0.000000] printk: bootconsole [udbg0] enabled
[ 0.000000] BUG: Unable to handle kernel data access at 0xf818c000
[ 0.000000] Faulting instruction address: 0xc0013c7c
[ 0.000000] Thread overran stack, or stack corrupted
[ 0.000000] Oops: Kernel access of bad area, sig: 11 [#1]
[ 0.000000] BE PAGE_SIZE=16K PREEMPT
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.3.0-rc4-s3k-dev-00743-g5abe4a3e8fd3-dirty #2080
[ 0.000000] NIP: c0013c7c LR: c0013310 CTR: 00000000
[ 0.000000] REGS: c0c5ff38 TRAP: 0300 Not tainted (5.3.0-rc4-s3k-dev-00743-g5abe4a3e8fd3-dirty)
[ 0.000000] MSR: 00001032 <ME,IR,DR,RI> CR: 99033955 XER: 80002100
[ 0.000000] DAR: f818c000 DSISR: 82000000
[ 0.000000] GPR00: c0013310 c0c5fff0 c0ad6ac0 c0c600c0 f818c031 82000000 00000000 ffffffff
[ 0.000000] GPR08: 00000000 f1f1f1f1 c0013c2c c0013304 99033955 00400008 00000000 07ff9598
[ 0.000000] GPR16: 00000000 07ffb94c 00000000 00000000 00000000 00000000 00000000 f818cfb2
[ 0.000000] GPR24: 00000000 00000000 00001000 ffffffff 00000000 c07dbf80 00000000 f818c000
[ 0.000000] NIP [c0013c7c] do_page_fault+0x50/0x904
[ 0.000000] LR [c0013310] handle_page_fault+0xc/0x38
[ 0.000000] Call Trace:
[ 0.000000] Instruction dump:
[ 0.000000] be010080 91410014 553fe8fe 3d40c001 3d20f1f1 7d800026 394a3c2c 3fffe000
[ 0.000000] 6129f1f1 900100c4 9181007c 91410018 <913f0000> 3d2001f4 6129f4f4 913f0004
Don't map the early shadow page read-only yet when creating the new
page tables for the real shadow memory, otherwise the memblock
allocations that immediately follows to create the real shadow pages
that are about to replace the early shadow page trigger a page fault
if they fall into the region being worked on at the moment.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Fixes:
|
|
Christophe Leroy | 4c0f5d1eb4 |
powerpc/mm: Add a helper to select PAGE_KERNEL_RO or PAGE_READONLY
In a couple of places there is a need to select whether read-only protection of shadow pages is performed with PAGE_KERNEL_RO or with PAGE_READONLY. Add a helper to avoid duplicating the choice. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9f33f44b9cd741c4a02b3dce7b8ef9438fe2cd2a.1566382750.git.christophe.leroy@c-s.fr |
|
Jordan Niethe | 13c7bb3c57 |
powerpc/64s: Set reserved PCR bits
Currently the reserved bits of the Processor Compatibility Register (PCR) are cleared as per the Programming Note in Section 1.3.3 of version 3.0B of the Power ISA. This causes all new architecture features to be made available when running on newer processors with new architecture features added to the PCR as bits must be set to disable a given feature. For example to disable new features added as part of Version 2.07 of the ISA the corresponding bit in the PCR needs to be set. As new processor features generally require explicit kernel support they should be disabled until such support is implemented. Therefore kernels should set all unknown/reserved bits in the PCR such that any new architecture features which the kernel does not currently know about get disabled. An update is planned to the ISA to clarify that the PCR is an exception to the Programming Note on reserved bits in Section 1.3.3. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Tested-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190917004605.22471-2-alistair@popple.id.au |
|
Alistair Popple | c6fadabb28 |
powerpc: Fix definition of PCR bits to work with old binutils
Commit
|
|
Aneesh Kumar K.V | 7aec584eaf |
powerpc/book3s64/radix: Remove WARN_ON in destroy_context()
On failed task initialization due to memory allocation failures, we can call into destroy_context() with process_tb entry already populated. This patch forces the process_tb entry to zero in destroy_context(). With this patch, we lose the ability to track if we are destroying a context without flushing the process table entry. WARNING: CPU: 4 PID: 6368 at arch/powerpc/mm/mmu_context_book3s64.c:246 destroy_context+0x58/0x340 NIP [c0000000000875f8] destroy_context+0x58/0x340 LR [c00000000013da18] __mmdrop+0x78/0x270 Call Trace: [c000000f7db77c80] [c00000000013da18] __mmdrop+0x78/0x270 [c000000f7db77cf0] [c0000000004d6a34] __do_execve_file.isra.13+0xbd4/0x1000 [c000000f7db77e00] [c0000000004d7428] sys_execve+0x58/0x70 [c000000f7db77e30] [c00000000000b388] system_call+0x5c/0x70 Reported-by: Priya M.A <priyama2@in.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [mpe: Reformat/tweak comment wording] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190918140103.24395-1-aneesh.kumar@linux.ibm.com |
|
Linus Torvalds | 45824fc0da |
powerpc updates for 5.4
- Initial support for running on a system with an Ultravisor, which is software that runs below the hypervisor and protects guests against some attacks by the hypervisor. - Support for building the kernel to run as a "Secure Virtual Machine", ie. as a guest capable of running on a system with an Ultravisor. - Some changes to our DMA code on bare metal, to allow devices with medium sized DMA masks (> 32 && < 59 bits) to use more than 2GB of DMA space. - Support for firmware assisted crash dumps on bare metal (powernv). - Two series fixing bugs in and refactoring our PCI EEH code. - A large series refactoring our exception entry code to use gas macros, both to make it more readable and also enable some future optimisations. As well as many cleanups and other minor features & fixups. Thanks to: Adam Zerella, Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anshuman Khandual, Balbir Singh, Benjamin Herrenschmidt, Cédric Le Goater, Christophe JAILLET, Christophe Leroy, Christopher M. Riedl, Christoph Hellwig, Claudio Carvalho, Daniel Axtens, David Gibson, David Hildenbrand, Desnes A. Nunes do Rosario, Ganesh Goudar, Gautham R. Shenoy, Greg Kurz, Guerney Hunt, Gustavo Romero, Halil Pasic, Hari Bathini, Joakim Tjernlund, Jonathan Neuschafer, Jordan Niethe, Leonardo Bras, Lianbo Jiang, Madhavan Srinivasan, Mahesh Salgaonkar, Mahesh Salgaonkar, Masahiro Yamada, Maxiwell S. Garcia, Michael Anderson, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Qian Cai, Ram Pai, Ravi Bangoria, Reza Arbab, Ryan Grimm, Sam Bobroff, Santosh Sivaraj, Segher Boessenkool, Sukadev Bhattiprolu, Thiago Bauermann, Thiago Jung Bauermann, Thomas Gleixner, Tom Lendacky, Vasant Hegde. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl2EtEcTHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgPfsD/9uXyBXn3anI/H08+mk74k5gCsmMQpn D442CD/ByogZcccp23yBTlhawtCE03hcHnCLygn0Xgd8a4YvHts/RGHUe3fPHqlG bEyZ7jsLVz5ebNZQP7r4eGs2pSzCajwJy2N9HJ/C1ojf15rrfRxoVJtnyhE2wXpm DL+6o2K+nUCB3gTQ1Inr3DnWzoGOOUfNTOea2u+J+yfHwGRqOBYpevwqiwy5eelK aRjUJCqMTvrzra49MeFwjo0Nt3/Y8UNcwA+JlGdeR8bRuWhFrYmyBRiZEKPaujNO 5EAfghBBlB0KQCqvF/tRM/c0OftHqK59AMobP9T7u9oOaBXeF/FpZX/iXjzNDPsN j9Oo2tKLTu/YVEXqBFuREGP+znANr1Wo4CFyOG8SbvYz0HFjR6XbtRJsS+0e8GWl kqX5/ZhYz3lBnKSNe9jgWOrh/J0KCSFigBTEWJT3xsn4YE8x8kK2l9KPqAIldWEP sKb2UjGS7v0NKq+NvShH88Q9AeQUEIjTcg/9aDDQDe6FaRQ7KiF8bUxSdwSPi+Fn j0lnF6i+1ATWZKuCr85veVi7C5qoe/+MqalnmP7MxULyzgXLLxUgN0SzEYO6QofK LQK/VaH2XVr5+M5YAb7K4/NX5gbM3s1bKrCiUy4EyHNvgG7gricYdbz6HgAjKpR7 oP0rHfgmVYvF1g== =WlW+ -----END PGP SIGNATURE----- Merge tag 'powerpc-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "This is a bit late, partly due to me travelling, and partly due to a power outage knocking out some of my test systems *while* I was travelling. - Initial support for running on a system with an Ultravisor, which is software that runs below the hypervisor and protects guests against some attacks by the hypervisor. - Support for building the kernel to run as a "Secure Virtual Machine", ie. as a guest capable of running on a system with an Ultravisor. - Some changes to our DMA code on bare metal, to allow devices with medium sized DMA masks (> 32 && < 59 bits) to use more than 2GB of DMA space. - Support for firmware assisted crash dumps on bare metal (powernv). - Two series fixing bugs in and refactoring our PCI EEH code. - A large series refactoring our exception entry code to use gas macros, both to make it more readable and also enable some future optimisations. As well as many cleanups and other minor features & fixups. Thanks to: Adam Zerella, Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anshuman Khandual, Balbir Singh, Benjamin Herrenschmidt, Cédric Le Goater, Christophe JAILLET, Christophe Leroy, Christopher M. Riedl, Christoph Hellwig, Claudio Carvalho, Daniel Axtens, David Gibson, David Hildenbrand, Desnes A. Nunes do Rosario, Ganesh Goudar, Gautham R. Shenoy, Greg Kurz, Guerney Hunt, Gustavo Romero, Halil Pasic, Hari Bathini, Joakim Tjernlund, Jonathan Neuschafer, Jordan Niethe, Leonardo Bras, Lianbo Jiang, Madhavan Srinivasan, Mahesh Salgaonkar, Mahesh Salgaonkar, Masahiro Yamada, Maxiwell S. Garcia, Michael Anderson, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Qian Cai, Ram Pai, Ravi Bangoria, Reza Arbab, Ryan Grimm, Sam Bobroff, Santosh Sivaraj, Segher Boessenkool, Sukadev Bhattiprolu, Thiago Bauermann, Thiago Jung Bauermann, Thomas Gleixner, Tom Lendacky, Vasant Hegde" * tag 'powerpc-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (264 commits) powerpc/mm/mce: Keep irqs disabled during lockless page table walk powerpc: Use ftrace_graph_ret_addr() when unwinding powerpc/ftrace: Enable HAVE_FUNCTION_GRAPH_RET_ADDR_PTR ftrace: Look up the address of return_to_handler() using helpers powerpc: dump kernel log before carrying out fadump or kdump docs: powerpc: Add missing documentation reference powerpc/xmon: Fix output of XIVE IPI powerpc/xmon: Improve output of XIVE interrupts powerpc/mm/radix: remove useless kernel messages powerpc/fadump: support holes in kernel boot memory area powerpc/fadump: remove RMA_START and RMA_END macros powerpc/fadump: update documentation about option to release opalcore powerpc/fadump: consider f/w load area powerpc/opalcore: provide an option to invalidate /sys/firmware/opal/core file powerpc/opalcore: export /sys/firmware/opal/core for analysing opal crashes powerpc/fadump: update documentation about CONFIG_PRESERVE_FA_DUMP powerpc/fadump: add support to preserve crash data on FADUMP disabled kernel powerpc/fadump: improve how crashed kernel's memory is reserved powerpc/fadump: consider reserved ranges while releasing memory powerpc/fadump: make crash memory ranges array allocation generic ... |
|
Linus Torvalds | d7b0827f28 |
Kbuild updates for v5.4
- add modpost warn exported symbols marked as 'static' because 'static' and EXPORT_SYMBOL is an odd combination - break the build early if gold linker is used - optimize the Bison rule to produce .c and .h files by a single pattern rule - handle PREEMPT_RT in the module vermagic and UTS_VERSION - warn CONFIG options leaked to the user-space except existing ones - make single targets work properly - rebuild modules when module linker scripts are updated - split the module final link stage into scripts/Makefile.modfinal - fix the missed error code in merge_config.sh - improve the error message displayed on the attempt of the O= build in unclean source tree - remove 'clean-dirs' syntax - disable -Wimplicit-fallthrough warning for Clang - add CONFIG_CC_OPTIMIZE_FOR_SIZE_O3 for ARC - remove ARCH_{CPP,A,C}FLAGS variables - add $(BASH) to run bash scripts - change *CFLAGS_<basetarget>.o to take the relative path to $(obj) instead of the basename - stop suppressing Clang's -Wunused-function warnings when W=1 - fix linux/export.h to avoid genksyms calculating CRC of trimmed exported symbols - misc cleanups -----BEGIN PGP SIGNATURE----- iQJSBAABCgA8FiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl1+OnoeHHlhbWFkYS5t YXNhaGlyb0Bzb2Npb25leHQuY29tAAoJED2LAQed4NsGoKEQAKcid9lDacMe5KWT 4Ic93hANMFKZ9Qy8WoxivnOr1a93NcloZ0Bhka96QUt7hYUkLmDCs99eMbxKuMfP m/ViHepojOBPzq+VtAGWOiIyPMCA7XDrTPph4wcPDKeOURTreK1PZ20fxDoAR4to +qaqKZJGdRcNf2DpJN1yIosz8Wj0Sa2LQrRi9jgUHi3bzgvLfL7P9WM2xyZMggAc GaSktCEFL0UzMFlMpYyDrKh2EV6ryOnN8+bVAKbmWP89tuU3njutycKdWOoL+bsj tH2kjFThxQyIcZGNHS1VzNunYAFE2q5nj2q47O1EDN6sjTYUoRn5cHwPam6x3Kly NH88xDEtJ7sUUc9GZEIXADWWD0f08QIhAH5x+jxFg3529lNgyrNHRSQ2XceYNAnG i/GnMJ0EhODOFKusXw7sNlWFKtukep+8/pwnvfTXWQu6plEm5EQ3a3RL5SESubVo mHzXsQDFCE0x/UrsJxEAww+3YO3pQEelfVi74W9z0cckpbRF8FuUq/69ltOT15l4 X+gCz80lXMWBKw/kNoR4GQoAJo3KboMEociawwoj72HXEHTPLJnCdUOsAf3n+opj xuz/UPZ4WYSgKdnbmmDbJ+1POA1NqtARZZXpMVyKVVCOiLafbJkLQYwLKEpE2mOO TP9igzP1i3/jPWec8cJ6Fa8UwuGh =VGqV -----END PGP SIGNATURE----- Merge tag 'kbuild-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - add modpost warn exported symbols marked as 'static' because 'static' and EXPORT_SYMBOL is an odd combination - break the build early if gold linker is used - optimize the Bison rule to produce .c and .h files by a single pattern rule - handle PREEMPT_RT in the module vermagic and UTS_VERSION - warn CONFIG options leaked to the user-space except existing ones - make single targets work properly - rebuild modules when module linker scripts are updated - split the module final link stage into scripts/Makefile.modfinal - fix the missed error code in merge_config.sh - improve the error message displayed on the attempt of the O= build in unclean source tree - remove 'clean-dirs' syntax - disable -Wimplicit-fallthrough warning for Clang - add CONFIG_CC_OPTIMIZE_FOR_SIZE_O3 for ARC - remove ARCH_{CPP,A,C}FLAGS variables - add $(BASH) to run bash scripts - change *CFLAGS_<basetarget>.o to take the relative path to $(obj) instead of the basename - stop suppressing Clang's -Wunused-function warnings when W=1 - fix linux/export.h to avoid genksyms calculating CRC of trimmed exported symbols - misc cleanups * tag 'kbuild-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (63 commits) genksyms: convert to SPDX License Identifier for lex.l and parse.y modpost: use __section in the output to *.mod.c modpost: use MODULE_INFO() for __module_depends export.h, genksyms: do not make genksyms calculate CRC of trimmed symbols export.h: remove defined(__KERNEL__), which is no longer needed kbuild: allow Clang to find unused static inline functions for W=1 build kbuild: rename KBUILD_ENABLE_EXTRA_GCC_CHECKS to KBUILD_EXTRA_WARN kbuild: refactor scripts/Makefile.extrawarn merge_config.sh: ignore unwanted grep errors kbuild: change *FLAGS_<basetarget>.o to take the path relative to $(obj) modpost: add NOFAIL to strndup modpost: add guid_t type definition kbuild: add $(BASH) to run scripts with bash-extension kbuild: remove ARCH_{CPP,A,C}FLAGS kbuild,arc: add CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3 for ARC kbuild: Do not enable -Wimplicit-fallthrough for clang for now kbuild: clean up subdir-ymn calculation in Makefile.clean kbuild: remove unneeded '+' marker from cmd_clean kbuild: remove clean-dirs syntax kbuild: check clean srctree even earlier ... |