mirror of https://gitee.com/openkylin/linux.git
67503 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
Andy Shevchenko | aa6159ab99 |
kernel.h: split out mathematical helpers
kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out mathematical helpers. At the same time convert users in header and lib folder to use new header. Though for time being include new header back to kernel.h to avoid twisted indirected includes for existing users. [sfr@canb.auug.org.au: fix powerpc build] Link: https://lkml.kernel.org/r/20201029150809.13059608@canb.auug.org.au Link: https://lkml.kernel.org/r/20201028173212.41768-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Jeff Layton <jlayton@kernel.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Hui Su | a9389683fa |
fs/proc: make pde_get() return nothing
We don't need pde_get()'s return value, so make pde_get() return nothing Link: https://lkml.kernel.org/r/20201211061944.GA2387571@rlk Signed-off-by: Hui Su <sh_def@163.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Alexey Dobriyan | c6c75deda8 |
proc: fix lookup in /proc/net subdirectories after setns(2)
Commit |
|
Anand K Mistry | fe71988834 |
proc: provide details on indirect branch speculation
Similar to speculation store bypass, show information about the indirect branch speculation mode of a task in /proc/$pid/status. For testing/benchmarking, I needed to see whether IB (Indirect Branch) speculation (see Spectre-v2) is enabled on a task, to see whether an IBPB instruction should be executed on an address space switch. Unfortunately, this information isn't available anywhere else and currently the only way to get it is to hack the kernel to expose it (like this change). It also helped expose a bug with conditional IB speculation on certain CPUs. Another place this could be useful is to audit the system when using sanboxing. With this change, I can confirm that seccomp-enabled process have IB speculation force disabled as expected when the kernel command line parameter `spectre_v2_user=seccomp`. Since there's already a 'Speculation_Store_Bypass' field, I used that as precedent for adding this one. [amistry@google.com: remove underscores from field name to workaround documentation issue] Link: https://lkml.kernel.org/r/20201106131015.v2.1.I7782b0cedb705384a634cfd8898eb7523562da99@changeid Link: https://lkml.kernel.org/r/20201030172731.1.I7782b0cedb705384a634cfd8898eb7523562da99@changeid Signed-off-by: Anand K Mistry <amistry@google.com> Cc: Anthony Steinhauser <asteinhauser@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Anand K Mistry <amistry@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Alexey Gladkov <gladkov.alexey@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: NeilBrown <neilb@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Randy Dunlap | d2928e8550 |
procfs: delete duplicated words + other fixes
Delete repeated words in fs/proc/. {the, which} where "which which" was changed to "with which". Link: https://lkml.kernel.org/r/20201028191525.13413-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Linus Torvalds | 7240153a9b |
Driver core updates for 5.11-rc1
Here is the big driver core updates for 5.11-rc1 This time there was a lot of different work happening here for some reason: - redo of the fwnode link logic, speeding it up greatly - auxiliary bus added (this was a tag that will be pulled in from other trees/maintainers this merge window as well, as driver subsystems started to rely on it) - platform driver core cleanups on the way to fixing some long-time api updates in future releases - minor fixes and tweaks. All have been in linux-next with no (finally) reported issues. Testing there did helped in shaking issues out a lot :) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCX9iEUQ8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ynBJwCgjBAtVWXquZz4m/pyjn0HoTC7tdYAnAlQIj9s vRbPjOgH9R+YRJzFs1Kx =X6UP -----END PGP SIGNATURE----- Merge tag 'driver-core-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big driver core updates for 5.11-rc1 This time there was a lot of different work happening here for some reason: - redo of the fwnode link logic, speeding it up greatly - auxiliary bus added (this was a tag that will be pulled in from other trees/maintainers this merge window as well, as driver subsystems started to rely on it) - platform driver core cleanups on the way to fixing some long-time api updates in future releases - minor fixes and tweaks. All have been in linux-next with no (finally) reported issues. Testing there did helped in shaking issues out a lot :)" * tag 'driver-core-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (39 commits) driver core: platform: don't oops in platform_shutdown() on unbound devices ACPI: Use fwnode_init() to set up fwnode misc: pvpanic: Replace OF headers by mod_devicetable.h misc: pvpanic: Combine ACPI and platform drivers usb: host: sl811: Switch to use platform_get_mem_or_io() vfio: platform: Switch to use platform_get_mem_or_io() driver core: platform: Introduce platform_get_mem_or_io() dyndbg: fix use before null check soc: fix comment for freeing soc_dev_attr driver core: platform: use bus_type functions driver core: platform: change logic implementing platform_driver_probe driver core: platform: reorder functions driver core: make driver_probe_device() static driver core: Fix a couple of typos driver core: Reorder devices on successful probe driver core: Delete pointless parameter in fwnode_operations.add_links driver core: Refactor fw_devlink feature efi: Update implementation of add_links() to create fwnode links of: property: Update implementation of add_links() to create fwnode links driver core: Use device's fwnode to check if it is waiting for suppliers ... |
|
Linus Torvalds | d635a69dd4 |
Networking updates for 5.11
Core: - support "prefer busy polling" NAPI operation mode, where we defer softirq for some time expecting applications to periodically busy poll - AF_XDP: improve efficiency by more batching and hindering the adjacency cache prefetcher - af_packet: make packet_fanout.arr size configurable up to 64K - tcp: optimize TCP zero copy receive in presence of partial or unaligned reads making zero copy a performance win for much smaller messages - XDP: add bulk APIs for returning / freeing frames - sched: support fragmenting IP packets as they come out of conntrack - net: allow virtual netdevs to forward UDP L4 and fraglist GSO skbs BPF: - BPF switch from crude rlimit-based to memcg-based memory accounting - BPF type format information for kernel modules and related tracing enhancements - BPF implement task local storage for BPF LSM - allow the FENTRY/FEXIT/RAW_TP tracing programs to use bpf_sk_storage Protocols: - mptcp: improve multiple xmit streams support, memory accounting and many smaller improvements - TLS: support CHACHA20-POLY1305 cipher - seg6: add support for SRv6 End.DT4/DT6 behavior - sctp: Implement RFC 6951: UDP Encapsulation of SCTP - ppp_generic: add ability to bridge channels directly - bridge: Connectivity Fault Management (CFM) support as is defined in IEEE 802.1Q section 12.14. Drivers: - mlx5: make use of the new auxiliary bus to organize the driver internals - mlx5: more accurate port TX timestamping support - mlxsw: - improve the efficiency of offloaded next hop updates by using the new nexthop object API - support blackhole nexthops - support IEEE 802.1ad (Q-in-Q) bridging - rtw88: major bluetooth co-existance improvements - iwlwifi: support new 6 GHz frequency band - ath11k: Fast Initial Link Setup (FILS) - mt7915: dual band concurrent (DBDC) support - net: ipa: add basic support for IPA v4.5 Refactor: - a few pieces of in_interrupt() cleanup work from Sebastian Andrzej Siewior - phy: add support for shared interrupts; get rid of multiple driver APIs and have the drivers write a full IRQ handler, slight growth of driver code should be compensated by the simpler API which also allows shared IRQs - add common code for handling netdev per-cpu counters - move TX packet re-allocation from Ethernet switch tag drivers to a central place - improve efficiency and rename nla_strlcpy - number of W=1 warning cleanups as we now catch those in a patchwork build bot Old code removal: - wan: delete the DLCI / SDLA drivers - wimax: move to staging - wifi: remove old WDS wifi bridging support Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl/YXmUACgkQMUZtbf5S IrvSQBAAgOrt4EFopEvVqlTHZbqI45IEqgtXS+YWmlgnjZCgshyMj8q1yK1zzane qYxr/NNJ9kV3FdtaynmmHPgEEEfR5kJ/D3B2BsxYDkaDDrD0vbNsBGw+L+/Gbhxl N/5l/9FjLyLY1D+EErknuwR5XGuQ6BSDVaKQMhYOiK2hgdnAAI4hszo8Chf6wdD0 XDBslQ7vpD/05r+eMj0IkS5dSAoGOIFXUxhJ5dqrDbRHiKsIyWqA3PLbYemfAhxI s2XckjfmSgGE3FKL8PSFu+EcfHbJQQjLcULJUnqgVcdwEEtRuE9ggEi52nZRXMWM 4e8sQJAR9Fx7pZy0G1xfS149j6iPU5LjRlU9TNSpVABz14Vvvo3gEL6gyIdsz+xh hMN7UBdp0FEaP028CXoIYpaBesvQqj0BSndmee8qsYAtN6j+QKcM2AOSr7JN1uMH C/86EDoGAATiEQIVWJvnX5MPmlAoblyLA+RuVhmxkIBx2InGXkFmWqRkXT5l4jtk LVl8/TArR4alSQqLXictXCjYlCm9j5N4zFFtEVasSYi7/ZoPfgRNWT+lJ2R8Y+Zv +htzGaFuyj6RJTVeFQMrkl3whAtBamo2a0kwg45NnxmmXcspN6kJX1WOIy82+MhD Yht7uplSs7MGKA78q/CDU0XBeGjpABUvmplUQBIfrR/jKLW2730= =GXs1 -----END PGP SIGNATURE----- Merge tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - support "prefer busy polling" NAPI operation mode, where we defer softirq for some time expecting applications to periodically busy poll - AF_XDP: improve efficiency by more batching and hindering the adjacency cache prefetcher - af_packet: make packet_fanout.arr size configurable up to 64K - tcp: optimize TCP zero copy receive in presence of partial or unaligned reads making zero copy a performance win for much smaller messages - XDP: add bulk APIs for returning / freeing frames - sched: support fragmenting IP packets as they come out of conntrack - net: allow virtual netdevs to forward UDP L4 and fraglist GSO skbs BPF: - BPF switch from crude rlimit-based to memcg-based memory accounting - BPF type format information for kernel modules and related tracing enhancements - BPF implement task local storage for BPF LSM - allow the FENTRY/FEXIT/RAW_TP tracing programs to use bpf_sk_storage Protocols: - mptcp: improve multiple xmit streams support, memory accounting and many smaller improvements - TLS: support CHACHA20-POLY1305 cipher - seg6: add support for SRv6 End.DT4/DT6 behavior - sctp: Implement RFC 6951: UDP Encapsulation of SCTP - ppp_generic: add ability to bridge channels directly - bridge: Connectivity Fault Management (CFM) support as is defined in IEEE 802.1Q section 12.14. Drivers: - mlx5: make use of the new auxiliary bus to organize the driver internals - mlx5: more accurate port TX timestamping support - mlxsw: - improve the efficiency of offloaded next hop updates by using the new nexthop object API - support blackhole nexthops - support IEEE 802.1ad (Q-in-Q) bridging - rtw88: major bluetooth co-existance improvements - iwlwifi: support new 6 GHz frequency band - ath11k: Fast Initial Link Setup (FILS) - mt7915: dual band concurrent (DBDC) support - net: ipa: add basic support for IPA v4.5 Refactor: - a few pieces of in_interrupt() cleanup work from Sebastian Andrzej Siewior - phy: add support for shared interrupts; get rid of multiple driver APIs and have the drivers write a full IRQ handler, slight growth of driver code should be compensated by the simpler API which also allows shared IRQs - add common code for handling netdev per-cpu counters - move TX packet re-allocation from Ethernet switch tag drivers to a central place - improve efficiency and rename nla_strlcpy - number of W=1 warning cleanups as we now catch those in a patchwork build bot Old code removal: - wan: delete the DLCI / SDLA drivers - wimax: move to staging - wifi: remove old WDS wifi bridging support" * tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1922 commits) net: hns3: fix expression that is currently always true net: fix proc_fs init handling in af_packet and tls nfc: pn533: convert comma to semicolon af_vsock: Assign the vsock transport considering the vsock address flags af_vsock: Set VMADDR_FLAG_TO_HOST flag on the receive path vsock_addr: Check for supported flag values vm_sockets: Add VMADDR_FLAG_TO_HOST vsock flag vm_sockets: Add flags field in the vsock address data structure net: Disable NETIF_F_HW_TLS_TX when HW_CSUM is disabled tcp: Add logic to check for SYN w/ data in tcp_simple_retransmit net: mscc: ocelot: install MAC addresses in .ndo_set_rx_mode from process context nfc: s3fwrn5: Release the nfc firmware net: vxget: clean up sparse warnings mlxsw: spectrum_router: Use eXtended mezzanine to offload IPv4 router mlxsw: spectrum: Set KVH XLT cache mode for Spectrum2/3 mlxsw: spectrum_router_xm: Introduce basic XM cache flushing mlxsw: reg: Add Router LPM Cache Enable Register mlxsw: reg: Add Router LPM Cache ML Delete Register mlxsw: spectrum_router_xm: Implement L-value tracking for M-index mlxsw: reg: Add XM Router M Table Register ... |
|
Linus Torvalds | ac73e3dc8a |
Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton: - a few random little subsystems - almost all of the MM patches which are staged ahead of linux-next material. I'll trickle to post-linux-next work in as the dependents get merged up. Subsystems affected by this patch series: kthread, kbuild, ide, ntfs, ocfs2, arch, and mm (slab-generic, slab, slub, dax, debug, pagecache, gup, swap, shmem, memcg, pagemap, mremap, hmm, vmalloc, documentation, kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction, oom-kill, migration, cma, page-poison, userfaultfd, zswap, zsmalloc, uaccess, zram, and cleanups). * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (200 commits) mm: cleanup kstrto*() usage mm: fix fall-through warnings for Clang mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at mm: shmem: convert shmem_enabled_show to use sysfs_emit_at mm:backing-dev: use sysfs_emit in macro defining functions mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening mm: use sysfs_emit for struct kobject * uses mm: fix kernel-doc markups zram: break the strict dependency from lzo zram: add stat to gather incompressible pages since zram set up zram: support page writeback mm/process_vm_access: remove redundant initialization of iov_r mm/zsmalloc.c: rework the list_add code in insert_zspage() mm/zswap: move to use crypto_acomp API for hardware acceleration mm/zswap: fix passing zero to 'PTR_ERR' warning mm/zswap: make struct kernel_param_ops definitions const userfaultfd/selftests: hint the test runner on required privilege userfaultfd/selftests: fix retval check for userfaultfd_open() userfaultfd/selftests: always dump something in modes userfaultfd: selftests: make __{s,u}64 format specifiers portable ... |
|
Lokesh Gidra | d0d4730ac2 |
userfaultfd: add user-mode only option to unprivileged_userfaultfd sysctl knob
With this change, when the knob is set to 0, it allows unprivileged users to call userfaultfd, like when it is set to 1, but with the restriction that page faults from only user-mode can be handled. In this mode, an unprivileged user (without SYS_CAP_PTRACE capability) must pass UFFD_USER_MODE_ONLY to userfaultd or the API will fail with EPERM. This enables administrators to reduce the likelihood that an attacker with access to userfaultfd can delay faulting kernel code to widen timing windows for other exploits. The default value of this knob is changed to 0. This is required for correct functioning of pipe mutex. However, this will fail postcopy live migration, which will be unnoticeable to the VM guests. To avoid this, set 'vm.userfault = 1' in /sys/sysctl.conf. The main reason this change is desirable as in the short term is that the Android userland will behave as with the sysctl set to zero. So without this commit, any Linux binary using userfaultfd to manage its memory would behave differently if run within the Android userland. For more details, refer to Andrea's reply [1]. [1] https://lore.kernel.org/lkml/20200904033438.GI9411@redhat.com/ Link: https://lkml.kernel.org/r/20201120030411.2690816-3-lokeshgidra@google.com Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Peter Xu <peterx@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Daniel Colascione <dancol@dancol.org> Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: <calin@google.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Shaohua Li <shli@fb.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Nitin Gupta <nigupta@nvidia.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Daniel Colascione <dancol@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Lokesh Gidra | 37cd0575b8 |
userfaultfd: add UFFD_USER_MODE_ONLY
Patch series "Control over userfaultfd kernel-fault handling", v6. This patch series is split from [1]. The other series enables SELinux support for userfaultfd file descriptors so that its creation and movement can be controlled. It has been demonstrated on various occasions that suspending kernel code execution for an arbitrary amount of time at any access to userspace memory (copy_from_user()/copy_to_user()/...) can be exploited to change the intended behavior of the kernel. For instance, handling page faults in kernel-mode using userfaultfd has been exploited in [2, 3]. Likewise, FUSE, which is similar to userfaultfd in this respect, has been exploited in [4, 5] for similar outcome. This small patch series adds a new flag to userfaultfd(2) that allows callers to give up the ability to handle kernel-mode faults with the resulting UFFD file object. It then adds a 'user-mode only' option to the unprivileged_userfaultfd sysctl knob to require unprivileged callers to use this new flag. The purpose of this new interface is to decrease the chance of an unprivileged userfaultfd user taking advantage of userfaultfd to enhance security vulnerabilities by lengthening the race window in kernel code. [1] https://lore.kernel.org/lkml/20200211225547.235083-1-dancol@google.com/ [2] https://duasynt.com/blog/linux-kernel-heap-spray [3] https://duasynt.com/blog/cve-2016-6187-heap-off-by-one-exploit [4] https://googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html [5] https://bugs.chromium.org/p/project-zero/issues/detail?id=808 This patch (of 2): userfaultfd handles page faults from both user and kernel code. Add a new UFFD_USER_MODE_ONLY flag for userfaultfd(2) that makes the resulting userfaultfd object refuse to handle faults from kernel mode, treating these faults as if SIGBUS were always raised, causing the kernel code to fail with EFAULT. A future patch adds a knob allowing administrators to give some processes the ability to create userfaultfd file objects only if they pass UFFD_USER_MODE_ONLY, reducing the likelihood that these processes will exploit userfaultfd's ability to delay kernel page faults to open timing windows for future exploits. Link: https://lkml.kernel.org/r/20201120030411.2690816-1-lokeshgidra@google.com Link: https://lkml.kernel.org/r/20201120030411.2690816-2-lokeshgidra@google.com Signed-off-by: Daniel Colascione <dancol@google.com> Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: <calin@google.com> Cc: Daniel Colascione <dancol@dancol.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Nitin Gupta <nigupta@nvidia.com> Cc: Peter Xu <peterx@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Shaohua Li <shli@fb.com> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Mike Rapoport | 5e545df329 |
arm: remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
ARM is the only architecture that defines CONFIG_ARCH_HAS_HOLES_MEMORYMODEL which in turn enables memmap_valid_within() function that is intended to verify existence of struct page associated with a pfn when there are holes in the memory map. However, the ARCH_HAS_HOLES_MEMORYMODEL also enables HAVE_ARCH_PFN_VALID and arch-specific pfn_valid() implementation that also deals with the holes in the memory map. The only two users of memmap_valid_within() call this function after a call to pfn_valid() so the memmap_valid_within() check becomes redundant. Remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL and memmap_valid_within() and rely entirely on ARM's implementation of pfn_valid() that is now enabled unconditionally. Link: https://lkml.kernel.org/r/20201101170454.9567-9-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matt Turner <mattst88@gmail.com> Cc: Meelis Roos <mroos@linux.ee> Cc: Michael Schmitz <schmitzmic@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Dmitry Safonov | cd544fd1dc |
mremap: don't allow MREMAP_DONTUNMAP on special_mappings and aio
As kernel expect to see only one of such mappings, any further operations
on the VMA-copy may be unexpected by the kernel. Maybe it's being on the
safe side, but there doesn't seem to be any expected use-case for this, so
restrict it now.
Link: https://lkml.kernel.org/r/20201013013416.390574-4-dima@arista.com
Fixes: commit
|
|
Shakeel Butt | f0c0c115fb |
mm: memcontrol: account pagetables per node
For many workloads, pagetable consumption is significant and it makes sense to expose it in the memory.stat for the memory cgroups. However at the moment, the pagetables are accounted per-zone. Converting them to per-node and using the right interface will correctly account for the memory cgroups as well. [akpm@linux-foundation.org: export __mod_lruvec_page_state to modules for arch/mips/kvm/] Link: https://lkml.kernel.org/r/20201130212541.2781790-3-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Mauricio Faria de Oliveira | 45680967ee |
ocfs2: ratelimit the 'max lookup times reached' notice
Running stress-ng on ocfs2 completely fills the kernel log with 'max lookup times reached, filesystem may have nested directories.' Let's ratelimit this message as done with others in the code. Test-case: # mkfs.ocfs2 --mount local $DEV # mount $DEV $MNT # cd $MNT # dmesg -C # stress-ng --dirdeep 1 --dirdeep-ops 1000 # dmesg | grep -c 'max lookup times reached' Before: # dmesg -C # stress-ng --dirdeep 1 --dirdeep-ops 1000 ... stress-ng: info: [11116] successful run completed in 3.03s # dmesg | grep -c 'max lookup times reached' 967 After: # dmesg -C # stress-ng --dirdeep 1 --dirdeep-ops 1000 ... stress-ng: info: [739] successful run completed in 0.96s # dmesg | grep -c 'max lookup times reached' 10 # dmesg [ 259.086086] ocfs2_check_if_ancestor: 1990 callbacks suppressed [ 259.086092] (stress-ng-dirde,740,1):ocfs2_check_if_ancestor:1091 max lookup times reached, filesystem may have nested directories, src inode: 18007, dest inode: 17940. ... Link: https://lkml.kernel.org/r/20201001224417.478263-1-mfo@canonical.com Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Tom Rix | a0823b5e44 |
fs/ocfs2/cluster/tcp.c: remove unneeded break
A break is not needed if it is preceded by a goto Link: https://lkml.kernel.org/r/20201019175216.2329-1-trix@redhat.com Signed-off-by: Tom Rix <trix@redhat.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Alex Shi | 4dad18f477 |
fs/ntfs: remove unused variable attr_len
This variable isn't used anymore, remove it to skip W=1 warning: fs/ntfs/inode.c:2350:6: warning: variable `attr_len' set but not used [-Wunused-but-set-variable] Link: https://lkml.kernel.org/r/4194376f-898b-b602-81c3-210567712092@linux.alibaba.com Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Acked-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Alex Shi | 3f10c2fa40 |
fs/ntfs: remove unused varibles
We actually don't use these varibles, so remove them to avoid gcc warning: fs/ntfs/file.c:326:14: warning: variable `base_ni' set but not used [-Wunused-but-set-variable] fs/ntfs/logfile.c:481:21: warning: variable `log_page_mask' set but not used [-Wunused-but-set-variable] Link: https://lkml.kernel.org/r/1604821092-54631-1-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Acked-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Linus Torvalds | edd7ab7684 |
The new preemtible kmap_local() implementation:
- Consolidate all kmap_atomic() internals into a generic implementation which builds the base for the kmap_local() API and make the kmap_atomic() interface wrappers which handle the disabling/enabling of preemption and pagefaults. - Switch the storage from per-CPU to per task and provide scheduler support for clearing mapping when scheduling out and restoring them when scheduling back in. - Merge the migrate_disable/enable() code, which is also part of the scheduler pull request. This was required to make the kmap_local() interface available which does not disable preemption when a mapping is established. It has to disable migration instead to guarantee that the virtual address of the mapped slot is the same accross preemption. - Provide better debug facilities: guard pages and enforced utilization of the mapping mechanics on 64bit systems when the architecture allows it. - Provide the new kmap_local() API which can now be used to cleanup the kmap_atomic() usage sites all over the place. Most of the usage sites do not require the implicit disabling of preemption and pagefaults so the penalty on 64bit and 32bit non-highmem systems is removed and quite some of the code can be simplified. A wholesale conversion is not possible because some usage depends on the implicit side effects and some need to be cleaned up because they work around these side effects. The migrate disable side effect is only effective on highmem systems and when enforced debugging is enabled. On 64bit and 32bit non-highmem systems the overhead is completely avoided. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/XyQwTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoUolD/9+R+BX96fGir+I8rG9dc3cbLw5meSi 0I/Nq3PToZMs2Iqv50DsoaPYHHz/M6fcAO9LRIgsE9jRbnY93GnsBM0wU9Y8yQaT 4wUzOG5WHaLDfqIkx/CN9coUl458oEiwOEbn79A2FmPXFzr7IpkufnV3ybGDwzwP p73bjMJMPPFrsa9ig87YiYfV/5IAZHi82PN8Cq1v4yNzgXRP3Tg6QoAuCO84ZnWF RYlrfKjcJ2xPdn+RuYyXolPtxr1hJQ0bOUpe4xu/UfeZjxZ7i1wtwLN9kWZe8CKH +x4Lz8HZZ5QMTQ9sCHOLtKzu2MceMcpISzoQH4/aFQCNMgLn1zLbS790XkYiQCuR ne9Cua+IqgYfGMG8cq8+bkU9HCNKaXqIBgPEKE/iHYVmqzCOqhW5Cogu4KFekf6V Wi7pyyUdX2en8BAWpk5NHc8de9cGcc+HXMq2NIcgXjVWvPaqRP6DeITERTZLJOmz XPxq5oPLGl7wdm7z+ICIaNApy8zuxpzb6sPLNcn7l5OeorViORlUu08AN8587wAj FiVjp6ZYomg+gyMkiNkDqFOGDH5TMENpOFoB0hNNEyJwwS0xh6CgWuwZcv+N8aPO HuS/P+tNANbD8ggT4UparXYce7YCtgOf3IG4GA3JJYvYmJ6pU+AZOWRoDScWq4o+ +jlfoJhMbtx5Gg== =n71I -----END PGP SIGNATURE----- Merge tag 'core-mm-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull kmap updates from Thomas Gleixner: "The new preemtible kmap_local() implementation: - Consolidate all kmap_atomic() internals into a generic implementation which builds the base for the kmap_local() API and make the kmap_atomic() interface wrappers which handle the disabling/enabling of preemption and pagefaults. - Switch the storage from per-CPU to per task and provide scheduler support for clearing mapping when scheduling out and restoring them when scheduling back in. - Merge the migrate_disable/enable() code, which is also part of the scheduler pull request. This was required to make the kmap_local() interface available which does not disable preemption when a mapping is established. It has to disable migration instead to guarantee that the virtual address of the mapped slot is the same across preemption. - Provide better debug facilities: guard pages and enforced utilization of the mapping mechanics on 64bit systems when the architecture allows it. - Provide the new kmap_local() API which can now be used to cleanup the kmap_atomic() usage sites all over the place. Most of the usage sites do not require the implicit disabling of preemption and pagefaults so the penalty on 64bit and 32bit non-highmem systems is removed and quite some of the code can be simplified. A wholesale conversion is not possible because some usage depends on the implicit side effects and some need to be cleaned up because they work around these side effects. The migrate disable side effect is only effective on highmem systems and when enforced debugging is enabled. On 64bit and 32bit non-highmem systems the overhead is completely avoided" * tag 'core-mm-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) ARM: highmem: Fix cache_is_vivt() reference x86/crashdump/32: Simplify copy_oldmem_page() io-mapping: Provide iomap_local variant mm/highmem: Provide kmap_local* sched: highmem: Store local kmaps in task struct x86: Support kmap_local() forced debugging mm/highmem: Provide CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP mm/highmem: Provide and use CONFIG_DEBUG_KMAP_LOCAL microblaze/mm/highmem: Add dropped #ifdef back xtensa/mm/highmem: Make generic kmap_atomic() work correctly mm/highmem: Take kmap_high_get() properly into account highmem: High implementation details and document API Documentation/io-mapping: Remove outdated blurb io-mapping: Cleanup atomic iomap mm/highmem: Remove the old kmap_atomic cruft highmem: Get rid of kmap_types.h xtensa/mm/highmem: Switch to generic kmap atomic sparc/mm/highmem: Switch to generic kmap atomic powerpc/mm/highmem: Switch to generic kmap atomic nds32/mm/highmem: Switch to generic kmap atomic ... |
|
Linus Torvalds | adb35e8dc9 |
Scheduler updates:
- migrate_disable/enable() support which originates from the RT tree and is now a prerequisite for the new preemptible kmap_local() API which aims to replace kmap_atomic(). - A fair amount of topology and NUMA related improvements - Improvements for the frequency invariant calculations - Enhanced robustness for the global CPU priority tracking and decision making - The usual small fixes and enhancements all over the place -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/XwK4THHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoX28D/9cVrvziSQGfBfuQWnUiw8iOIq1QBa2 Me+Tvenhfrlt7xU6rbP9ciFu7eTN+fS06m5uQPGI+t22WuJmHzbmw1bJVXfkvYfI /QoU+Hg7DkDAn1p7ZKXh0dRkV0nI9ixxSHl0E+Zf1ATBxCUMV2SO85flg6z/4qJq 3VWUye0dmR7/bhtkIjv5rwce9v2JB2g1AbgYXYTW9lHVoUdGoMSdiZAF4tGyHLnx sJ6DMqQ+k+dmPyYO0z5MTzjW/fXit4n9w2e3z9TvRH/uBu58WSW1RBmQYX6aHBAg dhT9F4lvTs6lJY23x5RSFWDOv6xAvKF5a0xfb8UZcyH5EoLYrPRvm42a0BbjdeRa u0z7LbwIlKA+RFdZzFZWz8UvvO0ljyMjmiuqZnZ5dY9Cd80LSBuxrWeQYG0qg6lR Y2povhhCepEG+q8AXIe2YjHKWKKC1s/l/VY3CNnCzcd21JPQjQ4Z5eWGmHif5IED CntaeFFhZadR3w02tkX35zFmY3w4soKKrbI4EKWrQwd+cIEQlOSY7dEPI/b5BbYj MWAb3P4EG9N77AWTNmbhK4nN0brEYb+rBbCA+5dtNBVhHTxAC7OTWElJOC2O66FI e06dREjvwYtOkRUkUguWwErbIai2gJ2MH0VILV3hHoh64oRk7jjM8PZYnjQkdptQ Gsq0rJW5iiu/OQ== =Oz1V -----END PGP SIGNATURE----- Merge tag 'sched-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Thomas Gleixner: - migrate_disable/enable() support which originates from the RT tree and is now a prerequisite for the new preemptible kmap_local() API which aims to replace kmap_atomic(). - A fair amount of topology and NUMA related improvements - Improvements for the frequency invariant calculations - Enhanced robustness for the global CPU priority tracking and decision making - The usual small fixes and enhancements all over the place * tag 'sched-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (61 commits) sched/fair: Trivial correction of the newidle_balance() comment sched/fair: Clear SMT siblings after determining the core is not idle sched: Fix kernel-doc markup x86: Print ratio freq_max/freq_base used in frequency invariance calculations x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC x86, sched: Calculate frequency invariance for AMD systems irq_work: Optimize irq_work_single() smp: Cleanup smp_call_function*() irq_work: Cleanup sched: Limit the amount of NUMA imbalance that can exist at fork time sched/numa: Allow a floating imbalance between NUMA nodes sched: Avoid unnecessary calculation of load imbalance at clone time sched/numa: Rename nr_running and break out the magic number sched: Make migrate_disable/enable() independent of RT sched/topology: Condition EAS enablement on FIE support arm64: Rebuild sched domains on invariance status changes sched/topology,schedutil: Wrap sched domains rebuild sched/uclamp: Allow to reset a task uclamp constraint value sched/core: Fix typos in comments Documentation: scheduler: fix information on arch SD flags, sched_domain and sched_debug ... |
|
Linus Torvalds | 1ac0884d54 |
A set of updates for entry/exit handling:
- More generalization of entry/exit functionality - The consolidation work to reclaim TIF flags on x86 and also for non-x86 specific TIF flags which are solely relevant for syscall related work and have been moved into their own storage space. The x86 specific part had to be merged in to avoid a major conflict. - The TIF_NOTIFY_SIGNAL work which replaces the inefficient signal delivery mode of task work and results in an impressive performance improvement for io_uring. The non-x86 consolidation of this is going to come seperate via Jens. - The selective syscall redirection facility which provides a clean and efficient way to support the non-Linux syscalls of WINE by catching them at syscall entry and redirecting them to the user space emulation. This can be utilized for other purposes as well and has been designed carefully to avoid overhead for the regular fastpath. This includes the core changes and the x86 support code. - Simplification of the context tracking entry/exit handling for the users of the generic entry code which guarantee the proper ordering and protection. - Preparatory changes to make the generic entry code accomodate S390 specific requirements which are mostly related to their syscall restart mechanism. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/XoPoTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoe0tD/4jSKHIogVM9kVpiYfwjDGS1NluaBXn 71ZoASbX9GZebyGandMyF2QP1iJ24ZO0RztBwHEVH6fyomKB2iFNedssCpO9yfWV 3eFRpOvMpbszY2W2bd0QG3GrqaTttjVfB4ahkGLzqeSbchdob6hZpNDYtBZnujA6 GSnrrurfJkCGoQny+yJQYdQJXQU+BIX90B2a2Q+jW123Luy/iHXC1f/krZSA1m14 fC9xYLSUjPphTzh2ZOW+C3DgdjOL5PfAm/6F+DArt4GtLgrEGD7R74aLSFhvetky dn5QtG+yAsz1i0cc5Wu/JBcT9tOkY92rPYSyLI9bYQUSQ/bMyuprz6oYKj3dubsu ZSsKPdkNFPIniL4fLdCMWZcIXX5xgnrxKjdgXZXW3gtrcxSns8w8uED3Sh7dgE08 pgIeq67E5g/OB8kJXH1VxdewmeQb9cOmnzzHwNO7TrrGbBKjDTYHNdYOKf1dUTTK ZX1UjLfGwxTkMYAbQD1k0JGZ2OLRshzSaH5BW/ZKa3bvJW6yYOq+/YT8B8hbJ8U3 vThlO75/55IJxS5r5Y3vZd/IHdsYbPuETD+TA8tNYtPqNZasW8nnk4TYctWqzDuO /Ka1wvWYid3c6ySznQn4zSyRjr968AfHeZ9YTUMhWufy5waXVmdBMG41u3IKfsVt osyzNc4EK19/Mg== =hsjV -----END PGP SIGNATURE----- Merge tag 'core-entry-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core entry/exit updates from Thomas Gleixner: "A set of updates for entry/exit handling: - More generalization of entry/exit functionality - The consolidation work to reclaim TIF flags on x86 and also for non-x86 specific TIF flags which are solely relevant for syscall related work and have been moved into their own storage space. The x86 specific part had to be merged in to avoid a major conflict. - The TIF_NOTIFY_SIGNAL work which replaces the inefficient signal delivery mode of task work and results in an impressive performance improvement for io_uring. The non-x86 consolidation of this is going to come seperate via Jens. - The selective syscall redirection facility which provides a clean and efficient way to support the non-Linux syscalls of WINE by catching them at syscall entry and redirecting them to the user space emulation. This can be utilized for other purposes as well and has been designed carefully to avoid overhead for the regular fastpath. This includes the core changes and the x86 support code. - Simplification of the context tracking entry/exit handling for the users of the generic entry code which guarantee the proper ordering and protection. - Preparatory changes to make the generic entry code accomodate S390 specific requirements which are mostly related to their syscall restart mechanism" * tag 'core-entry-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) entry: Add syscall_exit_to_user_mode_work() entry: Add exit_to_user_mode() wrapper entry_Add_enter_from_user_mode_wrapper entry: Rename exit_to_user_mode() entry: Rename enter_from_user_mode() docs: Document Syscall User Dispatch selftests: Add benchmark for syscall user dispatch selftests: Add kselftest for syscall user dispatch entry: Support Syscall User Dispatch on common syscall entry kernel: Implement selective syscall userspace redirection signal: Expose SYS_USER_DISPATCH si_code type x86: vdso: Expose sigreturn address on vdso to the kernel MAINTAINERS: Add entry for common entry code entry: Fix boot for !CONFIG_GENERIC_ENTRY x86: Support HAVE_CONTEXT_TRACKING_OFFSTACK context_tracking: Only define schedule_user() on !HAVE_CONTEXT_TRACKING_OFFSTACK archs sched: Detect call to schedule from critical entry code context_tracking: Don't implement exception_enter/exit() on CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK context_tracking: Introduce HAVE_CONTEXT_TRACKING_OFFSTACK x86: Reclaim unused x86 TI flags ... |
|
Linus Torvalds | f9b4240b07 |
fixes-v5.11
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCX9daOgAKCRCRxhvAZXjc
ohPkAQChXUB2BAjtIzXlCkZoDBbzHHblm5DZ37oy/4xYFmAcEwEA5sw6dQqyGHnF
GEP9def51HvXLpBV2BzNUGggo1SoGgQ=
=w/cO
-----END PGP SIGNATURE-----
Merge tag 'fixes-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull misc fixes from Christian Brauner:
"This contains several fixes which felt worth being combined into a
single branch:
- Use put_nsproxy() instead of open-coding it switch_task_namespaces()
- Kirill's work to unify lifecycle management for all namespaces. The
lifetime counters are used identically for all namespaces types.
Namespaces may of course have additional unrelated counters and
these are not altered. This work allows us to unify the type of the
counters and reduces maintenance cost by moving the counter in one
place and indicating that basic lifetime management is identical
for all namespaces.
- Peilin's fix adding three byte padding to Dmitry's
PTRACE_GET_SYSCALL_INFO uapi struct to prevent an info leak.
- Two smal patches to convert from the /* fall through */ comment
annotation to the fallthrough keyword annotation which I had taken
into my branch and into -next before
|
|
Linus Torvalds | 6d93a1971a |
time-namespace-v5.11
-----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCX9cwgAAKCRCRxhvAZXjc onViAP9CDMQct0RfdpdKOrh4NkxWiheBp7CzVSP1Xfy8KHBslgD/X7kilcthT8PC JTJmngrVWoehX+s49kl2PSuuLsGElAo= =llnx -----END PGP SIGNATURE----- Merge tag 'time-namespace-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull time namespace updates from Christian Brauner: "When time namespaces were introduced we missed to virtualize the 'btime' field in /proc/stat. This confuses tasks which are in another time namespace with a virtualized boottime which is common in some container workloads. This contains Michael's series to fix 'btime' which Thomas asked me to take through my tree. To fix 'btime' virtualization we simply subtract the offset of the time namespace's boottime from btime before printing the stats. Note that since start_boottime of processes are seconds since boottime and the boottime stamp is now shifted according to the time namespace's offset, the offset of the time namespace also needs to be applied before the process stats are given to userspace. This avoids that processes shown by tools such as 'ps' appear as time travelers in the corresponding time namespace. Selftests are included to verify that btime virtualization in /proc/stat works as expected" * tag 'time-namespace-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: namespace: make timens_on_fork() return nothing selftests/timens: added selftest for /proc/stat btime fs/proc: apply the time namespace offset to /proc/stat btime timens: additional helper functions for boottime offset handling |
|
Jakub Kicinski | a6b5e026e6 |
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says: ==================== pull-request: bpf-next 2020-12-14 1) Expose bpf_sk_storage_*() helpers to iterator programs, from Florent Revest. 2) Add AF_XDP selftests based on veth devs to BPF selftests, from Weqaar Janjua. 3) Support for finding BTF based kernel attach targets through libbpf's bpf_program__set_attach_target() API, from Andrii Nakryiko. 4) Permit pointers on stack for helper calls in the verifier, from Yonghong Song. 5) Fix overflows in hash map elem size after rlimit removal, from Eric Dumazet. 6) Get rid of direct invocation of llc in BPF selftests, from Andrew Delgadillo. 7) Fix xsk_recvmsg() to reorder socket state check before access, from Björn Töpel. 8) Add new libbpf API helper to retrieve ring buffer epoll fd, from Brendan Jackman. 9) Batch of minor BPF selftest improvements all over the place, from Florian Lehner, KP Singh, Jiri Olsa and various others. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (31 commits) selftests/bpf: Add a test for ptr_to_map_value on stack for helper access bpf: Permits pointers on stack for helper calls libbpf: Expose libbpf ring_buffer epoll_fd selftests/bpf: Add set_attach_target() API selftest for module target libbpf: Support modules in bpf_program__set_attach_target() API selftests/bpf: Silence ima_setup.sh when not running in verbose mode. selftests/bpf: Drop the need for LLVM's llc selftests/bpf: fix bpf_testmod.ko recompilation logic samples/bpf: Fix possible hang in xdpsock with multiple threads selftests/bpf: Make selftest compilation work on clang 11 selftests/bpf: Xsk selftests - adding xdpxceiver to .gitignore selftests/bpf: Drop tcp-{client,server}.py from Makefile selftests/bpf: Xsk selftests - Bi-directional Sockets - SKB, DRV selftests/bpf: Xsk selftests - Socket Teardown - SKB, DRV selftests/bpf: Xsk selftests - DRV POLL, NOPOLL selftests/bpf: Xsk selftests - SKB POLL, NOPOLL selftests/bpf: Xsk selftests framework bpf: Only provide bpf_sock_from_file with CONFIG_NET bpf: Return -ENOTSUPP when attaching to non-kernel BTF xsk: Validate socket state in xsk_recvmsg, prior touching socket members ... ==================== Link: https://lore.kernel.org/r/20201214214316.20642-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
Linus Torvalds | 405f868f13 |
- Remove all uses of TIF_IA32 and TIF_X32 and reclaim the two bits in the end
(Gabriel Krisman Bertazi) - All kinds of minor cleanups all over the tree. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl/XgtoACgkQEsHwGGHe VUqGuA/9GqN2zNQdhgRvAQ+FLZiOYK9MfXcoayfMq8T61VRPDBWaQRfVYKmfmEjS 0l5OnYgZQ9n6vzqFy6pmgc/ix8Jr553dZp5NCamcOqjCTcuO/LwRRh+ZBeFSBTPi r2qFYKKRYvM7nbyUMm4WqvAakxJ18xsjNbIslr9Aqe8WtHBKKX3MOu8SOpFtGyXz aEc4rhsS45iZa5gTXhvOn73tr3yHGWU1rzyyAAAmDGTgAxRwsTna8v16C4+v+Bua Zg18Wiutj8ZjtFpzKJtGWGZoSBap3Jw2Ys64g42MBQUE56KY/99tQVo/SvbYvvlf PHWLH0f3rPNJ6J2qeKwhtNzPlEAH/6e416A1/6TVwsK+8pdfGmkfaQh2iDHLhJ5i CSwF61H44ZaE3pc1tHHbC5ALvydPlup7D4MKgztfq0mZ3OoV2Vg7dtyyr+Ybz72b G+Kl/tmyacQTXo0FiYbZKETo3/VfTdBXGyVax1rHkx3pt8zvhFg3kxb1TT/l/CoM eSTx53PtTdVtbGOq1CjnUm0FKlbh4+kLoNuo9DYKeXUQBs8PWOCZmL3wXmm4cqlZ mDZVWvll7CjToY8izzcE/AG279cWkgcL5Tcg7W7CR66+egfDdpuqOZ4tv4TyzoWq 0J7WeNj+TAo98b7RA0Ux8LOlszRxS2ykuI6uB2MgwCaRMbbaQao= =lLiH -----END PGP SIGNATURE----- Merge tag 'x86_cleanups_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: "Another branch with a nicely negative diffstat, just the way I like 'em: - Remove all uses of TIF_IA32 and TIF_X32 and reclaim the two bits in the end (Gabriel Krisman Bertazi) - All kinds of minor cleanups all over the tree" * tag 'x86_cleanups_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/ia32_signal: Propagate __user annotation properly x86/alternative: Update text_poke_bp() kernel-doc comment x86/PCI: Make a kernel-doc comment a normal one x86/asm: Drop unused RDPID macro x86/boot/compressed/64: Use TEST %reg,%reg instead of CMP $0,%reg x86/head64: Remove duplicate include x86/mm: Declare 'start' variable where it is used x86/head/64: Remove unused GET_CR2_INTO() macro x86/boot: Remove unused finalize_identity_maps() x86/uaccess: Document copy_from_user_nmi() x86/dumpstack: Make show_trace_log_lvl() static x86/mtrr: Fix a kernel-doc markup x86/setup: Remove unused MCA variables x86, libnvdimm/test: Remove COPY_MC_TEST x86: Reclaim TIF_IA32 and TIF_X32 x86/mm: Convert mmu context ia32_compat into a proper flags field x86/elf: Use e_machine to check for x32/ia32 in setup_additional_pages() elf: Expose ELF header on arch_setup_additional_pages() x86/elf: Use e_machine to select start_thread for x32 elf: Expose ELF header in compat_start_thread() ... |
|
Linus Torvalds | 9e4b0d55d8 |
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu: "API: - Add speed testing on 1420-byte blocks for networking Algorithms: - Improve performance of chacha on ARM for network packets - Improve performance of aegis128 on ARM for network packets Drivers: - Add support for Keem Bay OCS AES/SM4 - Add support for QAT 4xxx devices - Enable crypto-engine retry mechanism in caam - Enable support for crypto engine on sdm845 in qce - Add HiSilicon PRNG driver support" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (161 commits) crypto: qat - add capability detection logic in qat_4xxx crypto: qat - add AES-XTS support for QAT GEN4 devices crypto: qat - add AES-CTR support for QAT GEN4 devices crypto: atmel-i2c - select CONFIG_BITREVERSE crypto: hisilicon/trng - replace atomic_add_return() crypto: keembay - Add support for Keem Bay OCS AES/SM4 dt-bindings: Add Keem Bay OCS AES bindings crypto: aegis128 - avoid spurious references crypto_aegis128_update_simd crypto: seed - remove trailing semicolon in macro definition crypto: x86/poly1305 - Use TEST %reg,%reg instead of CMP $0,%reg crypto: x86/sha512 - Use TEST %reg,%reg instead of CMP $0,%reg crypto: aesni - Use TEST %reg,%reg instead of CMP $0,%reg crypto: cpt - Fix sparse warnings in cptpf hwrng: ks-sa - Add dependency on IOMEM and OF crypto: lib/blake2s - Move selftest prototype into header file crypto: arm/aes-ce - work around Cortex-A57/A72 silion errata crypto: ecdh - avoid unaligned accesses in ecdh_set_secret() crypto: ccree - rework cache parameters handling crypto: cavium - Use dma_set_mask_and_coherent to simplify code crypto: marvell/octeontx - Use dma_set_mask_and_coherent to simplify code ... |
|
Linus Torvalds | 51895d58c7 |
fsverity updates for 5.11
Some cleanups for fs-verity: - Rename some names that have been causing confusion. - Move structs needed for file signing to the UAPI header. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCX9bedBQcZWJpZ2dlcnNA Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK7s6AQD4xMyyPTAQbznYlxPzn+2d8H3y+qik xbX7JwATnyOM4AEApDMGPvokvmEJO4z/ionUunR20i//JIyOklxy8w4whQ0= =XO4L -----END PGP SIGNATURE----- Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt Pull fsverity updates from Eric Biggers: "Some cleanups for fs-verity: - Rename some names that have been causing confusion - Move structs needed for file signing to the UAPI header" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fs-verity: move structs needed for file signing to UAPI header fs-verity: rename "file measurement" to "file digest" fs-verity: rename fsverity_signed_digest to fsverity_formatted_digest fs-verity: remove filenames from file comments |
|
Linus Torvalds | 7c7fdaf6ad |
fscrypt updates for 5.11
This release there are some fixes for longstanding problems, as well as some cleanups: - Fix a race condition where a duplicate filename could be created in an encrypted directory if a syscall that creates a new filename raced with the directory's encryption key being added. - Allow deleting files that use an unsupported encryption policy. - Simplify the locking for 'struct fscrypt_master_key'. - Remove kernel-internal constants from the UAPI header. As usual, all these patches have been in linux-next with no reported issues, and I've tested them with xfstests. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCX9bcDxQcZWJpZ2dlcnNA Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK/HRAP95FGQqS47rIEh4LrvS7cohMJxb5NiX KokAyB88GgmzLQD/c4Xh+iYOxxhFX5NRgruuoec876DrzsuNbEt7WNJ6CQc= =CBoc -----END PGP SIGNATURE----- Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt Pull fscrypt updates from Eric Biggers: "This release there are some fixes for longstanding problems, as well as some cleanups: - Fix a race condition where a duplicate filename could be created in an encrypted directory if a syscall that creates a new filename raced with the directory's encryption key being added. - Allow deleting files that use an unsupported encryption policy. - Simplify the locking for 'struct fscrypt_master_key'. - Remove kernel-internal constants from the UAPI header. As usual, all these patches have been in linux-next with no reported issues, and I've tested them with xfstests" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: allow deleting files with unsupported encryption policy fscrypt: unexport fscrypt_get_encryption_info() fscrypt: move fscrypt_require_key() to fscrypt_private.h fscrypt: move body of fscrypt_prepare_setattr() out-of-line fscrypt: introduce fscrypt_prepare_readdir() ext4: don't call fscrypt_get_encryption_info() from dx_show_leaf() ubifs: remove ubifs_dir_open() f2fs: remove f2fs_dir_open() ext4: remove ext4_dir_open() fscrypt: simplify master key locking fscrypt: remove unnecessary calls to fscrypt_require_key() ubifs: prevent creating duplicate encrypted filenames f2fs: prevent creating duplicate encrypted filenames ext4: prevent creating duplicate encrypted filenames fscrypt: add fscrypt_is_nokey_name() fscrypt: remove kernel-internal constants from UAPI header |
|
Linus Torvalds | 31d00f6eb1 |
io_uring-5.10-2020-12-11
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl/UMOwQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpmI3EAC22QohUftfWAJeks142J9wFKQgYfNZpZHs iJR07NTYUSghp9rpfGVNGY/ZonUoKds0VSbDZpRU8TTjglGf9GguzYE4iOMedtG3 MA9diNOnAAK0gwZrBx1uB882X4pf1qYGChMe9fXXJ4mQsHpG84q00GnJI0cPjkKn PaYRVugr8AxCzG4dHHL0ckQtsEHWaPwe7i5u4UkoOC3IsrVjRri6wYAa98xDuznv fqssoMp2cSOcVHsfypQAVZbg0heR3Zo2OuXE4uw6yjB7w5j8tuLIV9XKtqwiRn2n 57rdnGfY6vSfJb9EPm6fwBRnve2cI2Xbr30x/XWSIx+QcUOFOsBfdblaJikX8cSJ FG5c0AgBeRo1heWC4FRNIFStAubm0RPJAPqNm9p6T0DzO4hC3xTTdkl04B5bA4Ms 4tNWhk2fP8YRKhC0gRcw7zRHAl/qjuMjrUor5L68jxGlmFR7h+uMky0AsS4KZKNZ o1+HFmc/m+3ZBNzL7NoXkJzWFaKzZVf0iuEqJP3Xqy91hCA85pvh7DGAlQpZBppM C0eNeJ59rWvfSuUS9bLlLfoMO3Ab5BsE4IEji7K2lBNZsCXQpXHVv4xJxgKJRja9 t+SuBgYpNOfHrIPSY0LaJh93MNzoP22TP8qF8U5gWQ0Bus/mVKE9S1EXSUAvfHPm IyOfTB9M6g== =wjOp -----END PGP SIGNATURE----- Merge tag 'io_uring-5.10-2020-12-11' of git://git.kernel.dk/linux-block Pull io_uring fixes from Jens Axboe: "Two fixes in here, fixing issues introduced in this merge window" * tag 'io_uring-5.10-2020-12-11' of git://git.kernel.dk/linux-block: io_uring: fix file leak on error path of io ctx creation io_uring: fix mis-seting personality's creds |
|
Jakub Kicinski | 46d5e62dd3 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
xdp_return_frame_bulk() needs to pass a xdp_buff to __xdp_return(). strlcpy got converted to strscpy but here it makes no functional difference, so just keep the right code. Conflicts: net/netfilter/nf_tables_api.c Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
Linus Torvalds | 782598ecea |
zonefs fixes for 5.10-rc7
A single patch in this pull request to fix a BIO and page reference leak when writing sequential zone files. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCX9NY0gAKCRDdoc3SxdoY dga9AQCq+hAOyA1FeCkWwBG0gywIIVhPscNphe1bH576JoaIZAEA86DsGB9BED7H yaezfh5W1lr63ctxtSVdyo+x5172fgY= =Te7s -----END PGP SIGNATURE----- Merge tag 'zonefs-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs Pull zonefs fix from Damien Le Moal: "A single patch in this pull request to fix a BIO and page reference leak when writing sequential zone files" * tag 'zonefs-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: fix page reference and BIO leak |
|
Miles Chen | 40d6366e9d |
proc: use untagged_addr() for pagemap_read addresses
When we try to visit the pagemap of a tagged userspace pointer, we find that the start_vaddr is not correct because of the tag. To fix it, we should untag the userspace pointers in pagemap_read(). I tested with 5.10-rc4 and the issue remains. Explanation from Catalin in [1]: "Arguably, that's a user-space bug since tagged file offsets were never supported. In this case it's not even a tag at bit 56 as per the arm64 tagged address ABI but rather down to bit 47. You could say that the problem is caused by the C library (malloc()) or whoever created the tagged vaddr and passed it to this function. It's not a kernel regression as we've never supported it. Now, pagemap is a special case where the offset is usually not generated as a classic file offset but rather derived by shifting a user virtual address. I guess we can make a concession for pagemap (only) and allow such offset with the tag at bit (56 - PAGE_SHIFT + 3)" My test code is based on [2]: A userspace pointer which has been tagged by 0xb4: 0xb400007662f541c8 userspace program: uint64 OsLayer::VirtualToPhysical(void *vaddr) { uint64 frame, paddr, pfnmask, pagemask; int pagesize = sysconf(_SC_PAGESIZE); off64_t off = ((uintptr_t)vaddr) / pagesize * 8; // off = 0xb400007662f541c8 / pagesize * 8 = 0x5a00003b317aa0 int fd = open(kPagemapPath, O_RDONLY); ... if (lseek64(fd, off, SEEK_SET) != off || read(fd, &frame, 8) != 8) { int err = errno; string errtxt = ErrorString(err); if (fd >= 0) close(fd); return 0; } ... } kernel fs/proc/task_mmu.c: static ssize_t pagemap_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) { ... src = *ppos; svpfn = src / PM_ENTRY_BYTES; // svpfn == 0xb400007662f54 start_vaddr = svpfn << PAGE_SHIFT; // start_vaddr == 0xb400007662f54000 end_vaddr = mm->task_size; /* watch out for wraparound */ // svpfn == 0xb400007662f54 // (mm->task_size >> PAGE) == 0x8000000 if (svpfn > mm->task_size >> PAGE_SHIFT) // the condition is true because of the tag 0xb4 start_vaddr = end_vaddr; ret = 0; while (count && (start_vaddr < end_vaddr)) { // we cannot visit correct entry because start_vaddr is set to end_vaddr int len; unsigned long end; ... } ... } [1] https://lore.kernel.org/patchwork/patch/1343258/ [2] https://github.com/stressapptest/stressapptest/blob/master/src/os.cc#L158 Link: https://lkml.kernel.org/r/20201204024347.8295-1-miles.chen@mediatek.com Signed-off-by: Miles Chen <miles.chen@mediatek.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Will Deacon <will@kernel.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com> Cc: <stable@vger.kernel.org> [5.4-] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Linus Torvalds | 6840a3dcc2 |
More NFS Client Bugfixes for Linux 5.10
- Bugfixes: - Fix array overflow when flexfiles mirroring is enabled - Fix rpcrdma_inline_fixup() crash with new LISTXATTRS - Fix 5 second delay when doing inter-server copy - Disable READ_PLUS by default -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAl/Sl70ACgkQ18tUv7Cl QOuGdw/9GMmrLF26brO3q3lSTD9xdljJGQi0q1DEF2amNUL9cPgxB0xZB9lyUPqs AbpLxsCgfJ+rN3sbOnDdz8Ae/wiWS1PkeJWw4cu0/kIIjyPD2Uu4jWuIwxpFvP1v lFfBHeEdsNgqS5t9t9cF9u30PjPKt9SHG/64/8GY92zlrKDRUaJUyvW1zUKo4ne4 m3JxZ1rCxNkBYV+odrQIYE9gqFI6OBWpVsUeIXjBWWDZ/+zK9rGiO5fxZ8oe+yT2 cmrR/vOznoPgUGaH380HrCbXcIKwim2mn76t1MH9fScZQc1ocSWckrZRRdOVWlJ+ 228ImDPACLFaTqpgi8ZhFmuD3Mwbz4AtPLce52lvZLuMAPVMoFhR8xrm/VpJFSim pNjwsl/j9qlGSAe5XdGW+X/DzmHPVd6uhfzOFAd7ZErt7rU0gjyJTWzrmayoB3N1 GF5g4LaK1dYYj7SG8d8DrEV2CGyYle2UP4aDm8qJ/ZLQhd10RHeVt4BHfSKwY05l 03Gim7wlH/ercYfgU6wrgmEG2SWjuFJr+j1v//aqcmJVhA7FvTKygGpGyNK16pNL 9dZ81wUHKHzVt5C/ruG+3tARvnobvmB4ngQAQmtu0aXe9c+Kr2XlaXUAusHPK+Me pVqtmaKBgddUtJwFms8JF6dePiSBRQgpwyvluqNi1D7ntRhS+Ik= =EXrS -----END PGP SIGNATURE----- Merge tag 'nfs-for-5.10-3' of git://git.linux-nfs.org/projects/anna/linux-nfs Pull NFS client fixes from Anna Schumaker: "Here are a handful more bugfixes for 5.10. Unfortunately, we found some problems with the new READ_PLUS operation that aren't easy to fix. We've decided to disable this codepath through a Kconfig option for now, but a series of patches going into 5.11 will clean up the code and fix the issues at the same time. This seemed like the best way to go about it. Summary: - Fix array overflow when flexfiles mirroring is enabled - Fix rpcrdma_inline_fixup() crash with new LISTXATTRS - Fix 5 second delay when doing inter-server copy - Disable READ_PLUS by default" * tag 'nfs-for-5.10-3' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFS: Disable READ_PLUS by default NFSv4.2: Fix 5 seconds delay when doing inter server copy NFS: Fix rpcrdma_inline_fixup() crash with new LISTXATTRS operation pNFS/flexfiles: Fix array overflow when flexfiles mirroring is enabled |
|
Anna Schumaker | 21e31401fc |
NFS: Disable READ_PLUS by default
We've been seeing failures with xfstests generic/091 and generic/263 when using READ_PLUS. I've made some progress on these issues, and the tests fail later on but still don't pass. Let's disable READ_PLUS by default until we can work out what is going on. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> |
|
Dai Ngo | fe8eb820e3 |
NFSv4.2: Fix 5 seconds delay when doing inter server copy
Since commit |
|
Chuck Lever | 1c87b85162 |
NFS: Fix rpcrdma_inline_fixup() crash with new LISTXATTRS operation
By switching to an XFS-backed export, I am able to reproduce the
ibcomp worker crash on my client with xfstests generic/013.
For the failing LISTXATTRS operation, xdr_inline_pages() is called
with page_len=12 and buflen=128.
- When ->send_request() is called, rpcrdma_marshal_req() does not
set up a Reply chunk because buflen is smaller than the inline
threshold. Thus rpcrdma_convert_iovs() does not get invoked at
all and the transport's XDRBUF_SPARSE_PAGES logic is not invoked
on the receive buffer.
- During reply processing, rpcrdma_inline_fixup() tries to copy
received data into rq_rcv_buf->pages because page_len is positive.
But there are no receive pages because rpcrdma_marshal_req() never
allocated them.
The result is that the ibcomp worker faults and dies. Sometimes that
causes a visible crash, and sometimes it results in a transport hang
without other symptoms.
RPC/RDMA's XDRBUF_SPARSE_PAGES support is not entirely correct, and
should eventually be fixed or replaced. However, my preference is
that upper-layer operations should explicitly allocate their receive
buffers (using GFP_KERNEL) when possible, rather than relying on
XDRBUF_SPARSE_PAGES.
Reported-by: Olga kornievskaia <kolga@netapp.com>
Suggested-by: Olga kornievskaia <kolga@netapp.com>
Fixes:
|
|
Damien Le Moal | 6bea0225a4 |
zonefs: fix page reference and BIO leak
In zonefs_file_dio_append(), the pages obtained using
bio_iov_iter_get_pages() are not released on completion of the
REQ_OP_APPEND BIO, nor when bio_iov_iter_get_pages() fails.
Furthermore, a call to bio_put() is missing when
bio_iov_iter_get_pages() fails.
Fix these resource leaks by adding BIO resource release code (bio_put()i
and bio_release_pages()) at the end of the function after the BIO
execution and add a jump to this resource cleanup code in case of
bio_iov_iter_get_pages() failure.
While at it, also fix the call to task_io_account_write() to be passed
the correct BIO size instead of bio_iov_iter_get_pages() return value.
Reported-by: Christoph Hellwig <hch@lst.de>
Fixes:
|
|
Hui Su | 0288e7fa35 |
fs/kernfs: remove the double check of dentry->inode
In both kernfs_node_from_dentry() and in kernfs_dentry_node(), we will check the dentry->inode is NULL or not, which is superfluous. So remove the check in kernfs_node_from_dentry(). Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Hui Su <sh_def@163.com> Link: https://lore.kernel.org/r/20201113132143.GA119541@rlk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
|
David Howells | 4cb6829647 |
afs: Fix memory leak when mounting with multiple source parameters
There's a memory leak in afs_parse_source() whereby multiple source=
parameters overwrite fc->source in the fs_context struct without freeing
the previously recorded source.
Fix this by only permitting a single source parameter and rejecting with
an error all subsequent ones.
This was caught by syzbot with the kernel memory leak detector, showing
something like the following trace:
unreferenced object 0xffff888114375440 (size 32):
comm "repro", pid 5168, jiffies 4294923723 (age 569.948s)
backtrace:
slab_post_alloc_hook+0x42/0x79
__kmalloc_track_caller+0x125/0x16a
kmemdup_nul+0x24/0x3c
vfs_parse_fs_string+0x5a/0xa1
generic_parse_monolithic+0x9d/0xc5
do_new_mount+0x10d/0x15a
do_mount+0x5f/0x8e
__do_sys_mount+0xff/0x127
do_syscall_64+0x2d/0x3a
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes:
|
|
Linus Torvalds | 7d8761ba27 |
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull seq_file fix from Al Viro: "This fixes a regression introduced in this cycle wrt iov_iter based variant for reading a seq_file" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix return values of seq_read_iter() |
|
Hillf Danton | f26c08b444 |
io_uring: fix file leak on error path of io ctx creation
Put file as part of error handling when setting up io ctx to fix
memory leaks like the following one.
BUG: memory leak
unreferenced object 0xffff888101ea2200 (size 256):
comm "syz-executor355", pid 8470, jiffies 4294953658 (age 32.400s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
20 59 03 01 81 88 ff ff 80 87 a8 10 81 88 ff ff Y..............
backtrace:
[<000000002e0a7c5f>] kmem_cache_zalloc include/linux/slab.h:654 [inline]
[<000000002e0a7c5f>] __alloc_file+0x1f/0x130 fs/file_table.c:101
[<000000001a55b73a>] alloc_empty_file+0x69/0x120 fs/file_table.c:151
[<00000000fb22349e>] alloc_file+0x33/0x1b0 fs/file_table.c:193
[<000000006e1465bb>] alloc_file_pseudo+0xb2/0x140 fs/file_table.c:233
[<000000007118092a>] anon_inode_getfile fs/anon_inodes.c:91 [inline]
[<000000007118092a>] anon_inode_getfile+0xaa/0x120 fs/anon_inodes.c:74
[<000000002ae99012>] io_uring_get_fd fs/io_uring.c:9198 [inline]
[<000000002ae99012>] io_uring_create fs/io_uring.c:9377 [inline]
[<000000002ae99012>] io_uring_setup+0x1125/0x1630 fs/io_uring.c:9411
[<000000008280baad>] do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
[<00000000685d8cf0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported-by: syzbot+71c4697e27c99fddcf17@syzkaller.appspotmail.com
Fixes:
|
|
Pavel Begunkov | e8c954df23 |
io_uring: fix mis-seting personality's creds
After io_identity_cow() copies an work.identity it wants to copy creds to the new just allocated id, not the old one. Otherwise it's akin to req->work.identity->creds = req->work.identity->creds. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
Menglong Dong | 2bf509d96d |
coredump: fix core_pattern parse error
'format_corename()' will splite 'core_pattern' on spaces when it is in
pipe mode, and take helper_argv[0] as the path to usermode executable.
It works fine in most cases.
However, if there is a space between '|' and '/file/path', such as
'| /usr/lib/systemd/systemd-coredump %P %u %g', then helper_argv[0] will
be parsed as '', and users will get a 'Core dump to | disabled'.
It is not friendly to users, as the pattern above was valid previously.
Fix this by ignoring the spaces between '|' and '/file/path'.
Fixes:
|
|
Linus Torvalds | 619ca2664c |
io_uring-5.10-2020-12-05
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl/L/o4QHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpifFD/9tqDSwqcRIKRhSHqsD6OJRagZ0UB37cPvH ZXmtHblK9BkeXc83OPy+NmlcPq1mpDmbMgy0rls8UOGnZs0RINN9En5hHGOeRVD4 Ma6Abh9ypzXhNKqY4HO+lcsN+elGqoaYNSnu/Nbcib7Z5yYzUSNDnpIeTQy7Onbw dTGNTJalZYqrQ+6ytglfVeQuXSSKYPJfjFdcYd1It6Ks0aSKDgcQxJZgPeuF6xE+ mxEWADs+zpZG8UqZYyLnS6+9dxuJwhx+Cd47Ntky0VdaWmxCogQiI3X/Q3vspnVq gz9VqnUIA3Qgabm0UJkyjCi87W2KNDf4r5OOaQw4dLc0kDhIVS6ii67iGVs9Dl7F xCZMCE4pkEvs0pPYCyITUUNPh/fBONHgDHAWxxhztkE8ldvsAUmqNyV/LWMMvtSl SUQli6OZMVyyUX9sroHOWKhi8rQYt1UZa0tm+UZE/+YrZLfp6qBCZHFC6hdNfQoS k7PiuxecsH4i/OweJMPtggChJ7VWHZUorTljfHnl14oDuFuVD22AJvc3q+OSStI1 Ua5N27b2/pWXnf9P3dPGzikprOU/vXPQnlifmBxXNj+/TD29+GPUHhV2+kibc5/F gMbrhX3TX4tstHIGxUCoeB8XepiBqOmn1eLkWNQQGi8RbqCKfE4j/5o07HgxST2P Ns2GaEKb4A== =CPrM -----END PGP SIGNATURE----- Merge tag 'io_uring-5.10-2020-12-05' of git://git.kernel.dk/linux-block Pull io_uring fix from Jens Axboe: "Just a small fix this time, for an issue with 32-bit compat apps and buffer selection with recvmsg" * tag 'io_uring-5.10-2020-12-05' of git://git.kernel.dk/linux-block: io_uring: fix recvmsg setup with compat buf-select |
|
Linus Torvalds | d4e904198c |
Three smb3 fixes, two for stable
-----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAl/K3U0ACgkQiiy9cAdy T1Gewwv7Bjtq+c4tvc0LZj04oCfgsmcdu9s82Xff+e81lJlPyK+04zrp1kPiVQ8O iV5+bQr96BPgA96f2Et/OdnO4ei4T1vp9yTfXkTHQCWHQhd57N98TNudtJ+JNKCv ziaY+X6BFBBFnbCiPQxLZjvXsXenS3PFTq+wJB6ot4Qsj7z3kk88nXrvDDP7kiFL 3WqZzUWKPJaRHjJQIrUch9M77P1cGk1duKT8Ydj/XKQujocWFWYXEWVYMGKcFFpK cXBEbN6vBUTogn6fA8lQKy3KTarRGT8qpl8c91UCZ61sH8t6RlHQ53rtPrLPG/9k KH8xyN+Lu4+w9LDni5HqsIHAqi0+dnPvKRZWIkxMI/ZbZcV4Mwoi4kMZ/H85psp6 wEDgA6czaQSS9c3jZ0lvNgOHvpKkz1lz8sXwUuTYVehMaQrTcxNdEgh8TdlfZ9P6 St404ZXSsthXChac2N22G9JX4XcM/wocSWq/sfpoUvr9eKJMjFMdB7rP7ORPQofb S/u8v5pD =3b+C -----END PGP SIGNATURE----- Merge tag '5.10-rc6-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 Pull cifs fixes from Steve French: "Three smb3 fixes (two for stable) fixing - a null pointer issue in a DFS error path - a problem with excessive padding when mounted with "idsfromsid" causing owner fields to get corrupted - a more recent problem with compounded reparse point query found in testing to the Linux kernel server" * tag '5.10-rc6-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: refactor create_sd_buf() and and avoid corrupting the buffer cifs: add NULL check for ses->tcon_ipc smb3: set COMPOUND_FID to FileID field of subsequent compound request |
|
Florent Revest | dba4a9256b |
net: Remove the err argument from sock_from_file
Currently, the sock_from_file prototype takes an "err" pointer that is either not set or set to -ENOTSOCK IFF the returned socket is NULL. This makes the error redundant and it is ignored by a few callers. This patch simplifies the API by letting callers deduce the error based on whether the returned socket is NULL or not. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Florent Revest <revest@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: KP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20201204113609.1850150-1-revest@google.com |
|
Jakub Kicinski | a1dd1d8697 |
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-12-03 The main changes are: 1) Support BTF in kernel modules, from Andrii. 2) Introduce preferred busy-polling, from Björn. 3) bpf_ima_inode_hash() and bpf_bprm_opts_set() helpers, from KP Singh. 4) Memcg-based memory accounting for bpf objects, from Roman. 5) Allow bpf_{s,g}etsockopt from cgroup bind{4,6} hooks, from Stanislav. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (118 commits) selftests/bpf: Fix invalid use of strncat in test_sockmap libbpf: Use memcpy instead of strncpy to please GCC selftests/bpf: Add fentry/fexit/fmod_ret selftest for kernel module selftests/bpf: Add tp_btf CO-RE reloc test for modules libbpf: Support attachment of BPF tracing programs to kernel modules libbpf: Factor out low-level BPF program loading helper bpf: Allow to specify kernel module BTFs when attaching BPF programs bpf: Remove hard-coded btf_vmlinux assumption from BPF verifier selftests/bpf: Add CO-RE relocs selftest relying on kernel module BTF selftests/bpf: Add support for marking sub-tests as skipped selftests/bpf: Add bpf_testmod kernel module for testing libbpf: Add kernel module BTF support for CO-RE relocations libbpf: Refactor CO-RE relocs to not assume a single BTF object libbpf: Add internal helper to load BTF data by FD bpf: Keep module's btf_data_size intact after load bpf: Fix bpf_put_raw_tracepoint()'s use of __module_address() selftests/bpf: Add Userspace tests for TCP_WINDOW_CLAMP bpf: Adds support for setting window clamp samples/bpf: Fix spelling mistake "recieving" -> "receiving" bpf: Fix cold build of test_progs-no_alu32 ... ==================== Link: https://lore.kernel.org/r/20201204021936.85653-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
Ronnie Sahlberg | ea64370bca |
cifs: refactor create_sd_buf() and and avoid corrupting the buffer
When mounting with "idsfromsid" mount option, Azure corrupted the owner SIDs due to excessive padding caused by placing the owner fields at the end of the security descriptor on create. Placing owners at the front of the security descriptor (rather than the end) is also safer, as the number of ACEs (that follow it) are variable. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Suggested-by: Rohith Surabattula <rohiths@microsoft.com> CC: Stable <stable@vger.kernel.org> # v5.8 Signed-off-by: Steve French <stfrench@microsoft.com> |
|
Aurelien Aptel | 59463eb888 |
cifs: add NULL check for ses->tcon_ipc
In some scenarios (DFS and BAD_NETWORK_NAME) set_root_set() can be called with a NULL ses->tcon_ipc. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> |
|
Namjae Jeon | 7963178485 |
smb3: set COMPOUND_FID to FileID field of subsequent compound request
For an operation compounded with an SMB2 CREATE request, client must set
COMPOUND_FID(0xFFFFFFFFFFFFFFFF) to FileID field of smb2 ioctl.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Fixes:
|
|
Linus Torvalds | c82a505c00 |
Restore splice functionality for 9p
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE/IPbcYBuWt0zoYhOq06b7GqY5nAFAl/IvNUACgkQq06b7GqY 5nBRAA//UQJ3ioEvTlIpdHR6wKeH9XV2bucqjbAO2PRU0r8qEBC6wRr8IDBv4J4D xUyQ+teWjBLoKekdhoXsz3xXWtwqPNchVXClFOtjsUqYn2F71pUYNhRYJL9Xy+yy pumUxG3vJpJ4QTWkuSeB7/Ha561waWHILIeo1knT6z0Q49jmYI1J9Bl84q7RiC/7 wH+bgyE0RSORoIfkfapIvgwdjamUsnzMVhSVKNOol3gwkjFJGfOtk16VVSKq/RlK 5uOnjmWiVAkfPdDOA93mu6KE3FA4FB/sQaESwBR2fawsd646WooG5hn8fhMa0kHP y6AZo6fdYglcOD/OE1S0GpcSOocAdglfcU51kfxX8Z7BwfAbl7WCPCS2Rl75PYFf PhTzidD+vQsroGEaCF2+w90yIZBGgLz0Lh6az3zmaNGo7lcrRyP6GzQMmKILp2O1 MuPWNFRzt5ZAmyq7roVVclMzXnXEQM2v5YS3ib+wuUpQHWwGAUzdV6u9uFsp6vXx QSnsD4Y9wwMSI03QkkzW7TaH1laImRt820emivo3rOoWk1f7sCG5r3WlNPRIh1cN 9n6oCCOvFgXpTmiE0btqSKeMMxCqjtPHGZ5dQTTnftf0euNXGIZDPMSL45ANDXh+ 5Ezhjggqa91ohUSBRJgwc9MGsZy0GxOneDWPMpXAIDe9OlRkSC0= =zCUA -----END PGP SIGNATURE----- Merge tag '9p-for-5.10-rc7' of git://github.com/martinetd/linux Pull 9p fixes from Dominique Martinet: "Restore splice functionality for 9p" * tag '9p-for-5.10-rc7' of git://github.com/martinetd/linux: fs: 9p: add generic splice_write file operation fs: 9p: add generic splice_read file operations |