Merge more updates from Andrew Morton:
"More MM work: a memcg scalability improvememt"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/lru: revise the comments of lru_lock
mm/lru: introduce relock_page_lruvec()
mm/lru: replace pgdat lru_lock with lruvec lock
mm/swap.c: serialize memcg changes in pagevec_lru_move_fn
mm/compaction: do page isolation first in compaction
mm/lru: introduce TestClearPageLRU()
mm/mlock: remove __munlock_isolate_lru_page()
mm/mlock: remove lru_lock on TestClearPageMlocked
mm/vmscan: remove lruvec reget in move_pages_to_lru
mm/lru: move lock into lru_note_cost
mm/swap.c: fold vm event PGROTATED into pagevec_move_tail_fn
mm/memcg: add debug checking in lock_page_memcg
mm: page_idle_get_page() does not need lru_lock
mm/rmap: stop store reordering issue on page->mapping
mm/vmscan: remove unnecessary lruvec adding
mm/thp: narrow lru locking
mm/thp: simplify lru_add_page_tail()
mm/thp: use head for head page in lru_add_page_tail()
mm/thp: move lru_add_page_tail() to huge_memory.c
Since we changed the pgdat->lru_lock to lruvec->lru_lock, it's time to fix
the incorrect comments in code. Also fixed some zone->lru_lock comment
error from ancient time. etc.
I struggled to understand the comment above move_pages_to_lru() (surely
it never calls page_referenced()), and eventually realized that most of
it had got separated from shrink_active_list(): move that comment back.
Link: https://lkml.kernel.org/r/1604566549-62481-20-git-send-email-alex.shi@linux.alibaba.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Jann Horn <jannh@google.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Chen, Rong A" <rong.a.chen@intel.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mika Penttilä <mika.penttila@nextfour.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here is the big char/misc driver update for 5.11-rc1.
Continuing the tradition of previous -rc1 pulls, there seems to be more
and more tiny driver subsystems flowing through this tree.
Lots of different things, all of which have been in linux-next for a
while with no reported issues:
- extcon driver updates
- habannalab driver updates
- mei driver updates
- uio driver updates
- binder fixes and features added
- soundwire driver updates
- mhi bus driver updates
- phy driver updates
- coresight driver updates
- fpga driver updates
- speakup driver updates
- slimbus driver updates
- various small char and misc driver updates
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCX9iDZA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylRMACgqxKS2CUcY8tPnR5weHEsbz6O+KAAn3BtEFnK
7V9EnSuZe4L1jNOHOB5V
=xzHh
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc driver updates from Greg KH:
"Here is the big char/misc driver update for 5.11-rc1.
Continuing the tradition of previous -rc1 pulls, there seems to be
more and more tiny driver subsystems flowing through this tree.
Lots of different things, all of which have been in linux-next for a
while with no reported issues:
- extcon driver updates
- habannalab driver updates
- mei driver updates
- uio driver updates
- binder fixes and features added
- soundwire driver updates
- mhi bus driver updates
- phy driver updates
- coresight driver updates
- fpga driver updates
- speakup driver updates
- slimbus driver updates
- various small char and misc driver updates"
* tag 'char-misc-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (305 commits)
extcon: max77693: Fix modalias string
extcon: fsa9480: Support TI TSU6111 variant
extcon: fsa9480: Rewrite bindings in YAML and extend
dt-bindings: extcon: add binding for TUSB320
extcon: Add driver for TI TUSB320
slimbus: qcom: fix potential NULL dereference in qcom_slim_prg_slew()
siox: Make remove callback return void
siox: Use bus_type functions for probe, remove and shutdown
spmi: Add driver shutdown support
spmi: fix some coding style issues at the spmi core
spmi: get rid of a warning when built with W=1
uio: uio_hv_generic: use devm_kzalloc() for private data alloc
uio: uio_fsl_elbc_gpcm: use device-managed allocators
uio: uio_aec: use devm_kzalloc() for uio_info object
uio: uio_cif: use devm_kzalloc() for uio_info object
uio: uio_netx: use devm_kzalloc() for or uio_info object
uio: uio_mf624: use devm_kzalloc() for uio_info object
uio: uio_sercos3: use device-managed functions for simple allocs
uio: uio_dmem_genirq: finalize conversion of probe to devm_ handlers
uio: uio_dmem_genirq: convert simple allocations to device-managed
...
Here is the big USB and thunderbolt pull request for 5.11-rc1.
Nothing major in here, just the grind of constant development to support
new hardware and fix old issues:
- thunderbolt updates for new USB4 hardware
- cdns3 major driver updates
- lots of typec updates and additions as more hardware is available
- usb serial driver updates and fixes
- other tiny USB driver updates
All have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCX9iKFQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yme2gCfacEYztlnc0fh7PyyatYXgdkspbYAn2ri6mfF
4VY86HYXKqQRDW8w/lSg
=f8KJ
-----END PGP SIGNATURE-----
Merge tag 'usb-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt updates from Greg KH:
"Here is the big USB and thunderbolt pull request for 5.11-rc1.
Nothing major in here, just the grind of constant development to
support new hardware and fix old issues:
- thunderbolt updates for new USB4 hardware
- cdns3 major driver updates
- lots of typec updates and additions as more hardware is available
- usb serial driver updates and fixes
- other tiny USB driver updates
All have been in linux-next with no reported issues"
* tag 'usb-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (172 commits)
usb: phy: convert comma to semicolon
usb: ucsi: convert comma to semicolon
usb: typec: tcpm: convert comma to semicolon
usb: typec: tcpm: Update vbus_vsafe0v on init
usb: typec: tcpci: Enable bleed discharge when auto discharge is enabled
usb: typec: Add class for plug alt mode device
USB: typec: tcpci: Add Bleed discharge to POWER_CONTROL definition
USB: typec: tcpm: Add a 30ms room for tPSSourceOn in PR_SWAP
USB: typec: tcpm: Fix PR_SWAP error handling
USB: typec: tcpm: Hard Reset after not receiving a Request
USB: gadget: f_fs: remove likely/unlikely
usb: gadget: f_fs: Re-use SS descriptors for SuperSpeedPlus
USB: gadget: f_midi: setup SuperSpeed Plus descriptors
USB: gadget: f_acm: add support for SuperSpeed Plus
USB: gadget: f_rndis: fix bitrate for SuperSpeed and above
usb: typec: intel_pmc_mux: Configure cable generation value for USB4
MAINTAINERS: Add myself as a reviewer for CADENCE USB3 DRD IP DRIVER
usb: chipidea: ci_hdrc_imx: Use of_device_get_match_data()
usb: chipidea: usbmisc_imx: Use of_device_get_match_data()
usb: cdns3: fix NULL pointer dereference on no platform data
...
Core:
- support "prefer busy polling" NAPI operation mode, where we defer softirq
for some time expecting applications to periodically busy poll
- AF_XDP: improve efficiency by more batching and hindering
the adjacency cache prefetcher
- af_packet: make packet_fanout.arr size configurable up to 64K
- tcp: optimize TCP zero copy receive in presence of partial or unaligned
reads making zero copy a performance win for much smaller messages
- XDP: add bulk APIs for returning / freeing frames
- sched: support fragmenting IP packets as they come out of conntrack
- net: allow virtual netdevs to forward UDP L4 and fraglist GSO skbs
BPF:
- BPF switch from crude rlimit-based to memcg-based memory accounting
- BPF type format information for kernel modules and related tracing
enhancements
- BPF implement task local storage for BPF LSM
- allow the FENTRY/FEXIT/RAW_TP tracing programs to use bpf_sk_storage
Protocols:
- mptcp: improve multiple xmit streams support, memory accounting and
many smaller improvements
- TLS: support CHACHA20-POLY1305 cipher
- seg6: add support for SRv6 End.DT4/DT6 behavior
- sctp: Implement RFC 6951: UDP Encapsulation of SCTP
- ppp_generic: add ability to bridge channels directly
- bridge: Connectivity Fault Management (CFM) support as is defined in
IEEE 802.1Q section 12.14.
Drivers:
- mlx5: make use of the new auxiliary bus to organize the driver internals
- mlx5: more accurate port TX timestamping support
- mlxsw:
- improve the efficiency of offloaded next hop updates by using
the new nexthop object API
- support blackhole nexthops
- support IEEE 802.1ad (Q-in-Q) bridging
- rtw88: major bluetooth co-existance improvements
- iwlwifi: support new 6 GHz frequency band
- ath11k: Fast Initial Link Setup (FILS)
- mt7915: dual band concurrent (DBDC) support
- net: ipa: add basic support for IPA v4.5
Refactor:
- a few pieces of in_interrupt() cleanup work from Sebastian Andrzej Siewior
- phy: add support for shared interrupts; get rid of multiple driver
APIs and have the drivers write a full IRQ handler, slight growth
of driver code should be compensated by the simpler API which
also allows shared IRQs
- add common code for handling netdev per-cpu counters
- move TX packet re-allocation from Ethernet switch tag drivers to
a central place
- improve efficiency and rename nla_strlcpy
- number of W=1 warning cleanups as we now catch those in a patchwork
build bot
Old code removal:
- wan: delete the DLCI / SDLA drivers
- wimax: move to staging
- wifi: remove old WDS wifi bridging support
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl/YXmUACgkQMUZtbf5S
IrvSQBAAgOrt4EFopEvVqlTHZbqI45IEqgtXS+YWmlgnjZCgshyMj8q1yK1zzane
qYxr/NNJ9kV3FdtaynmmHPgEEEfR5kJ/D3B2BsxYDkaDDrD0vbNsBGw+L+/Gbhxl
N/5l/9FjLyLY1D+EErknuwR5XGuQ6BSDVaKQMhYOiK2hgdnAAI4hszo8Chf6wdD0
XDBslQ7vpD/05r+eMj0IkS5dSAoGOIFXUxhJ5dqrDbRHiKsIyWqA3PLbYemfAhxI
s2XckjfmSgGE3FKL8PSFu+EcfHbJQQjLcULJUnqgVcdwEEtRuE9ggEi52nZRXMWM
4e8sQJAR9Fx7pZy0G1xfS149j6iPU5LjRlU9TNSpVABz14Vvvo3gEL6gyIdsz+xh
hMN7UBdp0FEaP028CXoIYpaBesvQqj0BSndmee8qsYAtN6j+QKcM2AOSr7JN1uMH
C/86EDoGAATiEQIVWJvnX5MPmlAoblyLA+RuVhmxkIBx2InGXkFmWqRkXT5l4jtk
LVl8/TArR4alSQqLXictXCjYlCm9j5N4zFFtEVasSYi7/ZoPfgRNWT+lJ2R8Y+Zv
+htzGaFuyj6RJTVeFQMrkl3whAtBamo2a0kwg45NnxmmXcspN6kJX1WOIy82+MhD
Yht7uplSs7MGKA78q/CDU0XBeGjpABUvmplUQBIfrR/jKLW2730=
=GXs1
-----END PGP SIGNATURE-----
Merge tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- support "prefer busy polling" NAPI operation mode, where we defer
softirq for some time expecting applications to periodically busy
poll
- AF_XDP: improve efficiency by more batching and hindering the
adjacency cache prefetcher
- af_packet: make packet_fanout.arr size configurable up to 64K
- tcp: optimize TCP zero copy receive in presence of partial or
unaligned reads making zero copy a performance win for much smaller
messages
- XDP: add bulk APIs for returning / freeing frames
- sched: support fragmenting IP packets as they come out of conntrack
- net: allow virtual netdevs to forward UDP L4 and fraglist GSO skbs
BPF:
- BPF switch from crude rlimit-based to memcg-based memory accounting
- BPF type format information for kernel modules and related tracing
enhancements
- BPF implement task local storage for BPF LSM
- allow the FENTRY/FEXIT/RAW_TP tracing programs to use
bpf_sk_storage
Protocols:
- mptcp: improve multiple xmit streams support, memory accounting and
many smaller improvements
- TLS: support CHACHA20-POLY1305 cipher
- seg6: add support for SRv6 End.DT4/DT6 behavior
- sctp: Implement RFC 6951: UDP Encapsulation of SCTP
- ppp_generic: add ability to bridge channels directly
- bridge: Connectivity Fault Management (CFM) support as is defined
in IEEE 802.1Q section 12.14.
Drivers:
- mlx5: make use of the new auxiliary bus to organize the driver
internals
- mlx5: more accurate port TX timestamping support
- mlxsw:
- improve the efficiency of offloaded next hop updates by using
the new nexthop object API
- support blackhole nexthops
- support IEEE 802.1ad (Q-in-Q) bridging
- rtw88: major bluetooth co-existance improvements
- iwlwifi: support new 6 GHz frequency band
- ath11k: Fast Initial Link Setup (FILS)
- mt7915: dual band concurrent (DBDC) support
- net: ipa: add basic support for IPA v4.5
Refactor:
- a few pieces of in_interrupt() cleanup work from Sebastian Andrzej
Siewior
- phy: add support for shared interrupts; get rid of multiple driver
APIs and have the drivers write a full IRQ handler, slight growth
of driver code should be compensated by the simpler API which also
allows shared IRQs
- add common code for handling netdev per-cpu counters
- move TX packet re-allocation from Ethernet switch tag drivers to a
central place
- improve efficiency and rename nla_strlcpy
- number of W=1 warning cleanups as we now catch those in a patchwork
build bot
Old code removal:
- wan: delete the DLCI / SDLA drivers
- wimax: move to staging
- wifi: remove old WDS wifi bridging support"
* tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1922 commits)
net: hns3: fix expression that is currently always true
net: fix proc_fs init handling in af_packet and tls
nfc: pn533: convert comma to semicolon
af_vsock: Assign the vsock transport considering the vsock address flags
af_vsock: Set VMADDR_FLAG_TO_HOST flag on the receive path
vsock_addr: Check for supported flag values
vm_sockets: Add VMADDR_FLAG_TO_HOST vsock flag
vm_sockets: Add flags field in the vsock address data structure
net: Disable NETIF_F_HW_TLS_TX when HW_CSUM is disabled
tcp: Add logic to check for SYN w/ data in tcp_simple_retransmit
net: mscc: ocelot: install MAC addresses in .ndo_set_rx_mode from process context
nfc: s3fwrn5: Release the nfc firmware
net: vxget: clean up sparse warnings
mlxsw: spectrum_router: Use eXtended mezzanine to offload IPv4 router
mlxsw: spectrum: Set KVH XLT cache mode for Spectrum2/3
mlxsw: spectrum_router_xm: Introduce basic XM cache flushing
mlxsw: reg: Add Router LPM Cache Enable Register
mlxsw: reg: Add Router LPM Cache ML Delete Register
mlxsw: spectrum_router_xm: Implement L-value tracking for M-index
mlxsw: reg: Add XM Router M Table Register
...
Merge misc updates from Andrew Morton:
- a few random little subsystems
- almost all of the MM patches which are staged ahead of linux-next
material. I'll trickle to post-linux-next work in as the dependents
get merged up.
Subsystems affected by this patch series: kthread, kbuild, ide, ntfs,
ocfs2, arch, and mm (slab-generic, slab, slub, dax, debug, pagecache,
gup, swap, shmem, memcg, pagemap, mremap, hmm, vmalloc, documentation,
kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction,
oom-kill, migration, cma, page-poison, userfaultfd, zswap, zsmalloc,
uaccess, zram, and cleanups).
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (200 commits)
mm: cleanup kstrto*() usage
mm: fix fall-through warnings for Clang
mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at
mm: shmem: convert shmem_enabled_show to use sysfs_emit_at
mm:backing-dev: use sysfs_emit in macro defining functions
mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening
mm: use sysfs_emit for struct kobject * uses
mm: fix kernel-doc markups
zram: break the strict dependency from lzo
zram: add stat to gather incompressible pages since zram set up
zram: support page writeback
mm/process_vm_access: remove redundant initialization of iov_r
mm/zsmalloc.c: rework the list_add code in insert_zspage()
mm/zswap: move to use crypto_acomp API for hardware acceleration
mm/zswap: fix passing zero to 'PTR_ERR' warning
mm/zswap: make struct kernel_param_ops definitions const
userfaultfd/selftests: hint the test runner on required privilege
userfaultfd/selftests: fix retval check for userfaultfd_open()
userfaultfd/selftests: always dump something in modes
userfaultfd: selftests: make __{s,u}64 format specifiers portable
...
Currently, zram supports the stat via /sys/block/zram/mm_stat to represent
how many of incompressible pages are stored at the moment but it couldn't
show how many times incompressible pages were wrote down since zram set
up. It's also good indication to see how zram is effective in the system.
Link: https://lkml.kernel.org/r/20201130201907.1284910-1-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is demand to writeback specific process pages to backing store
instead of all idles pages in the system due to storage wear out concerns
and to launching latency of apps which are most of the time idle but are
critical for resume latency.
This patch extends the writeback knob to support a specific page
writeback.
Link: https://lkml.kernel.org/r/20201020190506.3758660-1-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With this change, when the knob is set to 0, it allows unprivileged users
to call userfaultfd, like when it is set to 1, but with the restriction
that page faults from only user-mode can be handled. In this mode, an
unprivileged user (without SYS_CAP_PTRACE capability) must pass
UFFD_USER_MODE_ONLY to userfaultd or the API will fail with EPERM.
This enables administrators to reduce the likelihood that an attacker with
access to userfaultfd can delay faulting kernel code to widen timing
windows for other exploits.
The default value of this knob is changed to 0. This is required for
correct functioning of pipe mutex. However, this will fail postcopy live
migration, which will be unnoticeable to the VM guests. To avoid this,
set 'vm.userfault = 1' in /sys/sysctl.conf.
The main reason this change is desirable as in the short term is that the
Android userland will behave as with the sysctl set to zero. So without
this commit, any Linux binary using userfaultfd to manage its memory would
behave differently if run within the Android userland. For more details,
refer to Andrea's reply [1].
[1] https://lore.kernel.org/lkml/20200904033438.GI9411@redhat.com/
Link: https://lkml.kernel.org/r/20201120030411.2690816-3-lokeshgidra@google.com
Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Daniel Colascione <dancol@dancol.org>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: <calin@google.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Nitin Gupta <nigupta@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Daniel Colascione <dancol@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 5647bc293a ("mm: compaction: Move migration fail/success
stats to migrate.c"), removed 3 items in /proc/vmstat. but the docs
still has their explanation. let's remove them.
"compact_blocks_moved",
"compact_pages_moved",
"compact_pagemigrate_failed",
Link: https://lkml.kernel.org/r/1605520282-51993-1-git-send-email-alex.shi@linux.alibaba.com
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For many workloads, pagetable consumption is significant and it makes
sense to expose it in the memory.stat for the memory cgroups. However at
the moment, the pagetables are accounted per-zone. Converting them to
per-node and using the right interface will correctly account for the
memory cgroups as well.
[akpm@linux-foundation.org: export __mod_lruvec_page_state to modules for arch/mips/kvm/]
Link: https://lkml.kernel.org/r/20201130212541.2781790-3-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Update cgroup v1 docs after the deprecation of the non-hierarchical mode
of the memory controller.
Link: https://lkml.kernel.org/r/20201110220800.929549-3-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As huge page usage in the page cache and for shmem files proliferates in
our production environment, the performance monitoring team has asked for
per-cgroup stats on those pages.
We already track and export anon_thp per cgroup. We already track file
THP and shmem THP per node, so making them per-cgroup is only a matter of
switching from node to lruvec counters. All callsites are in places where
the pages are charged and locked, so page->memcg is stable.
[hannes@cmpxchg.org: add documentation]
Link: https://lkml.kernel.org/r/20201026174029.GC548555@cmpxchg.org
Link: https://lkml.kernel.org/r/20201022151844.489337-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- More generalization of entry/exit functionality
- The consolidation work to reclaim TIF flags on x86 and also for non-x86
specific TIF flags which are solely relevant for syscall related work
and have been moved into their own storage space. The x86 specific part
had to be merged in to avoid a major conflict.
- The TIF_NOTIFY_SIGNAL work which replaces the inefficient signal
delivery mode of task work and results in an impressive performance
improvement for io_uring. The non-x86 consolidation of this is going to
come seperate via Jens.
- The selective syscall redirection facility which provides a clean and
efficient way to support the non-Linux syscalls of WINE by catching them
at syscall entry and redirecting them to the user space emulation. This
can be utilized for other purposes as well and has been designed
carefully to avoid overhead for the regular fastpath. This includes the
core changes and the x86 support code.
- Simplification of the context tracking entry/exit handling for the users
of the generic entry code which guarantee the proper ordering and
protection.
- Preparatory changes to make the generic entry code accomodate S390
specific requirements which are mostly related to their syscall restart
mechanism.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/XoPoTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoe0tD/4jSKHIogVM9kVpiYfwjDGS1NluaBXn
71ZoASbX9GZebyGandMyF2QP1iJ24ZO0RztBwHEVH6fyomKB2iFNedssCpO9yfWV
3eFRpOvMpbszY2W2bd0QG3GrqaTttjVfB4ahkGLzqeSbchdob6hZpNDYtBZnujA6
GSnrrurfJkCGoQny+yJQYdQJXQU+BIX90B2a2Q+jW123Luy/iHXC1f/krZSA1m14
fC9xYLSUjPphTzh2ZOW+C3DgdjOL5PfAm/6F+DArt4GtLgrEGD7R74aLSFhvetky
dn5QtG+yAsz1i0cc5Wu/JBcT9tOkY92rPYSyLI9bYQUSQ/bMyuprz6oYKj3dubsu
ZSsKPdkNFPIniL4fLdCMWZcIXX5xgnrxKjdgXZXW3gtrcxSns8w8uED3Sh7dgE08
pgIeq67E5g/OB8kJXH1VxdewmeQb9cOmnzzHwNO7TrrGbBKjDTYHNdYOKf1dUTTK
ZX1UjLfGwxTkMYAbQD1k0JGZ2OLRshzSaH5BW/ZKa3bvJW6yYOq+/YT8B8hbJ8U3
vThlO75/55IJxS5r5Y3vZd/IHdsYbPuETD+TA8tNYtPqNZasW8nnk4TYctWqzDuO
/Ka1wvWYid3c6ySznQn4zSyRjr968AfHeZ9YTUMhWufy5waXVmdBMG41u3IKfsVt
osyzNc4EK19/Mg==
=hsjV
-----END PGP SIGNATURE-----
Merge tag 'core-entry-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core entry/exit updates from Thomas Gleixner:
"A set of updates for entry/exit handling:
- More generalization of entry/exit functionality
- The consolidation work to reclaim TIF flags on x86 and also for
non-x86 specific TIF flags which are solely relevant for syscall
related work and have been moved into their own storage space. The
x86 specific part had to be merged in to avoid a major conflict.
- The TIF_NOTIFY_SIGNAL work which replaces the inefficient signal
delivery mode of task work and results in an impressive performance
improvement for io_uring. The non-x86 consolidation of this is
going to come seperate via Jens.
- The selective syscall redirection facility which provides a clean
and efficient way to support the non-Linux syscalls of WINE by
catching them at syscall entry and redirecting them to the user
space emulation. This can be utilized for other purposes as well
and has been designed carefully to avoid overhead for the regular
fastpath. This includes the core changes and the x86 support code.
- Simplification of the context tracking entry/exit handling for the
users of the generic entry code which guarantee the proper ordering
and protection.
- Preparatory changes to make the generic entry code accomodate S390
specific requirements which are mostly related to their syscall
restart mechanism"
* tag 'core-entry-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
entry: Add syscall_exit_to_user_mode_work()
entry: Add exit_to_user_mode() wrapper
entry_Add_enter_from_user_mode_wrapper
entry: Rename exit_to_user_mode()
entry: Rename enter_from_user_mode()
docs: Document Syscall User Dispatch
selftests: Add benchmark for syscall user dispatch
selftests: Add kselftest for syscall user dispatch
entry: Support Syscall User Dispatch on common syscall entry
kernel: Implement selective syscall userspace redirection
signal: Expose SYS_USER_DISPATCH si_code type
x86: vdso: Expose sigreturn address on vdso to the kernel
MAINTAINERS: Add entry for common entry code
entry: Fix boot for !CONFIG_GENERIC_ENTRY
x86: Support HAVE_CONTEXT_TRACKING_OFFSTACK
context_tracking: Only define schedule_user() on !HAVE_CONTEXT_TRACKING_OFFSTACK archs
sched: Detect call to schedule from critical entry code
context_tracking: Don't implement exception_enter/exit() on CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK
context_tracking: Introduce HAVE_CONTEXT_TRACKING_OFFSTACK
x86: Reclaim unused x86 TI flags
...
of the churn behind us. Significant stuff in this pull includes:
- A set of new Chinese translations
- Italian translation updates
- A mechanism from Mauro to automatically format Documentation/features
for the built docs
- Automatic cross references without explicit :ref: markup
- A new reset-controller document
- An extensive new document on reporting problems from Thorsten
That last patch also adds the CC-BY-4.0 license to LICENSES/dual; there was
some discussion on this, but we seem to have consensus and an ack from Greg
for that addition.
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAl/XyewPHGNvcmJldEBs
d24ubmV0AAoJEBdDWhNsDH5YUeYH/AiNNlVIF/80T45TAkm+1kpy2Fb+d/5wbtGK
PB7OTXPyDmmqwZNldlF9IsRhp5W+wYC3PNlulYMG44hT7/Jqf2CMFw8SOZqGLmBV
LhWwoS+TAWLB19IOOMrVXbhAlNsX01NwBDY/dwONjW1Jcu+tuAsBR47T9lKjw4kJ
qGFGMQTvZG9Ig1x7E6X38mAd7W3SD1viNuUePS2YcoB15GAocWfVVHvu1r+RHUTS
27ET8tWzMMuiaCAD6toVY9L4T7iCI7YSPXQm8BOkf/f4LXDnpo8Fo11LE5ozTAh3
+avnNt8vnrRXc06MnzwsvNHm2TqN97B4shkeDiPAV3ySXI8Zu/w=
=HScX
-----END PGP SIGNATURE-----
Merge tag 'docs-5.11' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"A much quieter cycle for documentation (happily), with, one hopes, the
bulk of the churn behind us. Significant stuff in this pull includes:
- A set of new Chinese translations
- Italian translation updates
- A mechanism from Mauro to automatically format
Documentation/features for the built docs
- Automatic cross references without explicit :ref: markup
- A new reset-controller document
- An extensive new document on reporting problems from Thorsten
That last patch also adds the CC-BY-4.0 license to LICENSES/dual;
there was some discussion on this, but we seem to have consensus and
an ack from Greg for that addition"
* tag 'docs-5.11' of git://git.lwn.net/linux: (50 commits)
docs: fix broken cross reference in translations/zh_CN
docs: Note that sphinx 1.7 will be required soon
docs: update requirements to install six module
docs: reporting-issues: move 'outdated, need help' note to proper place
docs: Update documentation to reflect what TAINT_CPU_OUT_OF_SPEC means
docs: add a reset controller chapter to the driver API docs
docs: make reporting-bugs.rst obsolete
docs: Add a new text describing how to report bugs
LICENSES: Add the CC-BY-4.0 license
Documentation: fix multiple typos found in the admin-guide subdirectory
Documentation: fix typos found in admin-guide subdirectory
kernel-doc: Fix example in Nested structs/unions
docs: clean up sysctl/kernel: titles, version
docs: trace: fix event state structure name
docs: nios2: add missing ReST file
scripts: get_feat.pl: reduce table width for all features output
scripts: get_feat.pl: change the group by order
scripts: get_feat.pl: make complete table more coincise
scripts: kernel-doc: fix parsing function-like typedefs
Documentation: fix typos found in process, dev-tools, and doc-guide subdirectories
...
applications to populate protected regions of user code and data called
enclaves. Once activated, the new hardware protects enclave code and
data from outside access and modification.
Enclaves provide a place to store secrets and process data with those
secrets. SGX has been used, for example, to decrypt video without
exposing the decryption keys to nosy debuggers that might be used to
subvert DRM. Software has generally been rewritten specifically to
run in enclaves, but there are also projects that try to run limited
unmodified software in enclaves."
Most of the functionality is concentrated into arch/x86/kernel/cpu/sgx/
except the addition of a new mprotect() hook to control enclave page
permissions and support for vDSO exceptions fixup which will is used by
SGX enclaves.
All this work by Sean Christopherson, Jarkko Sakkinen and many others.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl/XTtMACgkQEsHwGGHe
VUqxFw/+NZGf2b3CWPcrvwXCpkvSpIrqh1jQwyvkZyJ1gen7Vy8dkvf99h8+zQPI
4wSArEyjhYJKAAmBNefLKi/Cs/bdkGzLlZyDGqtM641XRjf0xXIpQkOBb6UBa+Pv
to8veQmVH2bBTM49qnd+H1wM6FzYvhTYCD8xr4HlLXtIfpP2CK2GvCb8s/4LifgD
fTucZX9TFwLgVkWOHWHN0n8XMR2Fjb2YCrwjFMKyr/M2W+pPoOCTIt4PWDuXiOeG
rFP7R4DT9jDg8ht5j2dHQT/Bo8TvTCB4Oj98MrX1TTgkSjLJySSMfyQg5EwNfSIa
HC0lg/6qwAxnhWX7cCCBETNZ4aYDmz/dxcCSsLbomGP9nMaUgUy7qn5nNuNbJilb
oCBsr8LDMzu1LJzmkduM8Uw6OINh+J8ICoVXaR5pS7gSZz/+vqIP/rK691AiqhJL
QeMkI9gQ83jEXpr/AV7ABCjGCAeqELOkgravUyTDev24eEc0LyU0qENpgxqWSTca
OvwSWSwNuhCKd2IyKZBnOmjXGwvncwX0gp1KxL9WuLkR6O8XldLAYmVCwVAOrIh7
snRot8+3qNjELa65Nh5DapwLJrU24TRoKLHLgfWK8dlqrMejNtXKucQ574Np0feR
p2hrNisOrtCwxAt7OAgWygw8agN6cJiY18onIsr4wSBm5H7Syb0=
=k7tj
-----END PGP SIGNATURE-----
Merge tag 'x86_sgx_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SGC support from Borislav Petkov:
"Intel Software Guard eXtensions enablement. This has been long in the
making, we were one revision number short of 42. :)
Intel SGX is new hardware functionality that can be used by
applications to populate protected regions of user code and data
called enclaves. Once activated, the new hardware protects enclave
code and data from outside access and modification.
Enclaves provide a place to store secrets and process data with those
secrets. SGX has been used, for example, to decrypt video without
exposing the decryption keys to nosy debuggers that might be used to
subvert DRM. Software has generally been rewritten specifically to run
in enclaves, but there are also projects that try to run limited
unmodified software in enclaves.
Most of the functionality is concentrated into arch/x86/kernel/cpu/sgx/
except the addition of a new mprotect() hook to control enclave page
permissions and support for vDSO exceptions fixup which will is used
by SGX enclaves.
All this work by Sean Christopherson, Jarkko Sakkinen and many others"
* tag 'x86_sgx_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
x86/sgx: Return -EINVAL on a zero length buffer in sgx_ioc_enclave_add_pages()
x86/sgx: Fix a typo in kernel-doc markup
x86/sgx: Fix sgx_ioc_enclave_provision() kernel-doc comment
x86/sgx: Return -ERESTARTSYS in sgx_ioc_enclave_add_pages()
selftests/sgx: Use a statically generated 3072-bit RSA key
x86/sgx: Clarify 'laundry_list' locking
x86/sgx: Update MAINTAINERS
Documentation/x86: Document SGX kernel architecture
x86/sgx: Add ptrace() support for the SGX driver
x86/sgx: Add a page reclaimer
selftests/x86: Add a selftest for SGX
x86/vdso: Implement a vDSO for Intel SGX enclave call
x86/traps: Attempt to fixup exceptions in vDSO before signaling
x86/fault: Add a helper function to sanitize error code
x86/vdso: Add support for exception fixup in vDSO functions
x86/sgx: Add SGX_IOC_ENCLAVE_PROVISION
x86/sgx: Add SGX_IOC_ENCLAVE_INIT
x86/sgx: Add SGX_IOC_ENCLAVE_ADD_PAGES
x86/sgx: Add SGX_IOC_ENCLAVE_CREATE
x86/sgx: Add an SGX misc driver interface
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAl/XHngACgkQCF8+vY7k
4RXHjg/8CVAkeLzVHFJ8odrt/tABXd5UxFE8RvqDnrb9SRvtx1tyLmcRb/WXoAhw
Eg0MM+o5qYN8t7uP3x16yOxzsm3ix2Z+imRiIWBLSju14BTPSD7kLp+W+AaY6kT5
cudI/907vqIb7uEZvG7jF7jM6BJfz58Du8dnmhCgehWTBguUOChc0lBxjuTG/KGZ
Cueiq+LgvxKeZk9GvN4H6xeMPsn/ZEB5VSe/Knp95iCA6kEFq56DC0oYCUFzi2ao
5sX5UsX9xpertFXna/tZBTo34RIofpPcctNd98La36oIV4XIVDp0FMpKCpmaDcHM
wCMmK/K7sGRLqS6pmPZvfA6V1uIITQbYLz4z3WO9k0rJb3LgD9ied0XmHfcgNP8P
NxTPm4jYTk6ELc/bgB/2k1AXrOm6kWItiITKZThcuCBffoLOrRcYgsVdP+ieSeb5
8XkhjH5jADtB2HdSNvkX9CikJMB3XzaFjqLzcaFgwDqTgw1voh2ardSp5xuZuiEJ
fw3QEpnBYbN5XFXlkwJY7VA94Dt93OQX5pfT2fUAh6MJt+SzmW17Tup+6LsfvzL5
yJcZ18rjyo5rr0kIfBl7FLZ7nrM9PA4erayJ2gZwCUxP9mF+URW+/UI/ytL1cOIu
iFqzj7KRD2nwfySd5UgOkD1yViXb6M4dLf5E/t5VbyU3qIHUpwM=
=mi60
-----END PGP SIGNATURE-----
Merge tag 'media/v5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- some rework at the uAPI pixel format docs
- the smiapp driver has started to gain support for MIPI CSS camera
sensors and was renamed
- two new sensor drivers: ov02a10 and ov9734
- Meson gained a driver for the 2D acceleration unit
- Rockchip rkisp1 driver was promoted from staging
- Cedrus driver gained support for VP8
- two new remote controller keymaps were added
- the usual set of fixes cleanups and driver improvements
* tag 'media/v5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (447 commits)
media: ccs: Add support for obtaining C-PHY configuration from firmware
media: ccs-pll: Print pixel rates
media: ccs: Print written register values
media: ccs: Add support for DDR OP SYS and OP PIX clocks
media: ccs-pll: Add support for DDR OP system and pixel clocks
media: ccs: Dual PLL support
media: ccs-pll: Add trivial dual PLL support
media: ccs-pll: Separate VT divisor limit calculation from the rest
media: ccs-pll: Fix VT post-PLL divisor calculation
media: ccs-pll: Make VT divisors 16-bit
media: ccs-pll: Rework bounds checks
media: ccs-pll: Print relevant information on PLL tree
media: ccs-pll: Better separate OP and VT sub-tree calculation
media: ccs-pll: Check for derating and overrating, support non-derating sensors
media: ccs-pll: Split off VT subtree calculation
media: ccs-pll: Add C-PHY support
media: ccs-pll: Add sanity checks
media: ccs-pll: Add support flexible OP PLL pixel clock divider
media: ccs-pll: Support two cycles per pixel on OP domain
media: ccs-pll: Add support for extended input PLL clock divider
...
UAS does not share the pessimistic assumption storage is making that
devices cannot deal with WRITE_SAME. A few devices supported by UAS,
are reported to not deal well with WRITE_SAME. Those need a quirk.
Add it to the device that needs it.
Reported-by: David C. Partridge <david.partridge@perdrix.co.uk>
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20201209152639.9195-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here's a patch updating the meaning of TAINT_CPU_OUT_OF_SPEC after
Borislav introduced changes in a7e1f67ed2 and upcoming patches in tip.
TAINT_CPU_OUT_OF_SPEC now means a bit more what it implies as the
flag isn't set just because of a CPU misconfiguration or mismatch.
Historically it was for SMP kernel oops on an officially SMP incapable
processor but now it also covers CPUs whose MSRs have been incorrectly
poked at from userspace, drivers being used on non supported
architectures, broken firmware, mismatched CPUs, ...
Update documentation and script to reflect that.
Signed-off-by: Mathieu Chouquet-Stringer <me@mathieu.digital>
Link: https://lore.kernel.org/r/20201202153244.709752-1-me@mathieu.digital
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Make various places which point to
Documentation/admin-guide/reporting-bugs.rst point to
Documentation/admin-guide/reporting-issues.rst instead. That document is
brand new and as of now is not completely finished. But even at this
stage it's a lot more helpful and accurate than reporting-bugs.rst.
Hence also add a note to reporting-bugs.rst, telling people they're
better off reading reporting-issues.rst instead.
reporting-bugs.rst is scheduled for removal once reporting-issues.rst
is considered ready.
Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/r/3df7c2d16de112b47bb6e6158138608e78562bf5.1607063223.git.linux@leemhuis.info
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Add a mostly finished document describing how to report issues with the
Linux kernel to its developers. It is designed to be a lot more straight
forward and easier to follow than the current text about this
(Documentation/admin-guide/reporting-bugs.rst); at the same time the new
text should be more helpful for people unfamiliar with the topic, as it
provides a lot more details, too.
The main work on the text is done, but some polishing is still needed.
The text also needs to be reviewed by more people and a few issues still
might need some discussion. To make these tasks easier, it was decided
([1]) to add this document to the kernel sources in parallel to the
existing text; the latter will be removed once this text is considered
good enough(tm).
This document is quite long and provides a lot of details, but was
carefully crafted to make sure it's can also serve people that are in a
hurry. That's mainly achieved by having a TDLR and a step-by-step guide,
which should be good enough for quite a lot of people. Everybody that
wants or need more explanations can find them in a reference section,
which describes all the needed steps in detail.
Thanks to this structure the text can work for kernel developers that
just need to look something up, experienced FLOSS contributors that are
unfamiliar with the kernel's bug reporting workflow, and users reporting
something upstream for the first time. The text is thus a bit like the
kernel itself, which works well for embedded machines, a typical desktop
PC, cloud servers, and HPC.
The document was written in the hope it will improve the quality of the
bug reports, especially those that come from people unfamiliar with how
Linux kernel development works. Sadly quite a few reports from this
group are currently of poor quality and/or get submitted to the wrong
place. Part of the problem is the old reporting-bugs document, as it
makes its essence hard to grasp; it's and also inaccurate and slightly
outdated in a few spots. Due to this quite a few valid reports are
ignored in the end, which is annoying for those that compiled them and
bad for the kernel's quality.
The document near the top points out that it's still unfinished, but
nevertheless ready for consumption. Those few areas in the text that
might need some further discussion contain a note pointing this out.
Besides lack of review from core developers there is only one major
issue left: the section 'Decode failure message' is known to be
outdated: it's waiting for someone familiar with the topic to write
something up or give at least provide some hints and pointers what to
write there.
The new document is dual-licensed under GPL-2.0+ or CC-BY-4.0. The
latter is way more liberal and makes it attractive to use this text as a
base when writing about this topic on websites or in books. This
hopefully increases the chances that such texts are accurate and stick
to official way of doing things.
[1] https://lkml.kernel.org/r/20201118172958.5b014a44@lwn.net
Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/e2db808f954744b79f10937a923d9c99bdca1fca.1607063223.git.linux@leemhuis.info
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
This cleans up a few titles with extra colons, and removes the
reference to kernel 2.2. The docs don't yet cover *all* of 5.10 or
5.11, but I think they're close enough. Most entries are documented,
and have been checked against current kernels.
Signed-off-by: Stephen Kitt <steve@sk2.org>
Link: https://lore.kernel.org/r/20201208074922.30359-1-steve@sk2.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
We want the fixes in here, and this resolves a merge issue with
drivers/misc/habanalabs/common/memory.c.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix warnings from make htmlddocs:
Documentation/output/videodev2.h.rst:6: WARNING: undefined label: v4l2-meta-fmt-rk-isp1-params (if the link has no caption the label must precede a section header)
Documentation/output/videodev2.h.rst:6: WARNING: undefined label: v4l2-meta-fmt-rk-isp1-stat-3a (if the link has no caption the label must precede a section header)
Fixes: df22026aeb ("media: videodev2.h, v4l2-ioctl: add rkisp1 meta buffer format")
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
In case the bootconfig is created on one kind of endian machine, and then
read on the other kind of endian kernel, the size and checksum will be
incorrect. Instead, have both the size and checksum always be little
endian and have the tool and the kernel convert it from little endian to
or from the host endian.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCX8brThQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qiBMAQDe1vsp/SyHO9H5pnsepdmk4fERn0bC
Q0qtCoYp1xUKOQEAjnOJKdCE1O6n24u+b+3jw3BHswQLyUKOFaPcIM7jSgM=
=Z6kA
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.10-rc6-bootconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull bootconfig fixes from Steven Rostedt:
"Have bootconfig size and checksum be little endian
In case the bootconfig is created on one kind of endian machine, and
then read on the other kind of endian kernel, the size and checksum
will be incorrect. Instead, have both the size and checksum always be
little endian and have the tool and the kernel convert it from little
endian to or from the host endian"
* tag 'trace-v5.10-rc6-bootconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
docs: bootconfig: Add the endianness of fields
tools/bootconfig: Store size and checksum in footer as le32
bootconfig: Load size and checksum in the footer as le32
Explain the interface, provide some background and security notes.
[ tglx: Add note about non-visibility, add it to the index and fix the
kerneldoc warning ]
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20201127193238.821364-8-krisman@collabora.com
- Use correct timestamp variable for ring buffer write stamp update
- Fix up before stamp and write stamp when crossing ring buffer sub
buffers
- Keep a zero delta in ring buffer in slow path if cmpxchg fails
- Fix trace_printk static buffer for archs that care
- Fix ftrace record accounting for ftrace ops with trampolines
- Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency
- Remove WARN_ON in hwlat tracer that triggers on something that is OK
- Make "my_tramp" trampoline in ftrace direct sample code global
- Fixes in the bootconfig tool for better alignment management
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCX8ZzghQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qg0JAQCII1bDQyF3APLlNFRqfHf3bTo7Zl5z
WaUd1Cd7JkY+WAD/eF1dWjN0JRtfU+oRlk6UZ4oNmp8WMJvQ7oV26ub2egE=
=lts8
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
- Use correct timestamp variable for ring buffer write stamp update
- Fix up before stamp and write stamp when crossing ring buffer sub
buffers
- Keep a zero delta in ring buffer in slow path if cmpxchg fails
- Fix trace_printk static buffer for archs that care
- Fix ftrace record accounting for ftrace ops with trampolines
- Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency
- Remove WARN_ON in hwlat tracer that triggers on something that is OK
- Make "my_tramp" trampoline in ftrace direct sample code global
- Fixes in the bootconfig tool for better alignment management
* tag 'trace-v5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ring-buffer: Always check to put back before stamp when crossing pages
ftrace: Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency
ftrace: Fix updating FTRACE_FL_TRAMP
tracing: Fix alignment of static buffer
tracing: Remove WARN_ON in start_thread()
samples/ftrace: Mark my_tramp[12]? global
ring-buffer: Set the right timestamp in the slow path of __rb_reserve_next()
ring-buffer: Update write stamp with the correct ts
docs: bootconfig: Update file format on initrd image
tools/bootconfig: Align the bootconfig applied initrd image size to 4
tools/bootconfig: Fix to check the write failure correctly
tools/bootconfig: Fix errno reference after printf()
Add a description about the endianness of the size and the checksum
fields. Those must be stored as le32 instead of u32. This will allow
us to apply bootconfig to the cross build initrd without caring
the endianness.
Link: https://lkml.kernel.org/r/160583936246.547349.10964204130590955409.stgit@devnote2
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
early_param memmap is only implemented on X86, MIPS and XTENSA. To avoid
wasting users’ time on trying this on platform like ARM, mark it clearly.
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Link: https://lore.kernel.org/r/20201128195121.2556-1-song.bao.hua@hisilicon.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
From Daniel's cover letter:
IBM Power9 processors can speculatively operate on data in the L1 cache
before it has been completely validated, via a way-prediction mechanism. It
is not possible for an attacker to determine the contents of impermissible
memory using this method, since these systems implement a combination of
hardware and software security measures to prevent scenarios where
protected data could be leaked.
However these measures don't address the scenario where an attacker induces
the operating system to speculatively execute instructions using data that
the attacker controls. This can be used for example to speculatively bypass
"kernel user access prevention" techniques, as discovered by Anthony
Steinhauser of Google's Safeside Project. This is not an attack by itself,
but there is a possibility it could be used in conjunction with
side-channels or other weaknesses in the privileged code to construct an
attack.
This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern.
This patch series flushes the L1 cache on kernel entry (patch 2) and after the
kernel performs any user accesses (patch 3). It also adds a self-test and
performs some related cleanups.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl+2aqETHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgG+hD/4njSFct2amqWfqDYR9b2OykWmnMQXn
geookk5SbItQF7vh1q2SVA6r43s5ZAxgD5fezx4LgG6p3QU39+Tr0RhzUUHWMPDV
UNGZK6x/N/GSYeq0bqvMHmVwS0FDjPE8nOtA8Hn2T9mUUsu9G0okpgYPLnEu6rb1
gIyS35zlLBh9obi3MfJzyln/AmCE7hdonKRtLAxvGiERJAyfAG757lrdjrwavyHy
mwz+XPl5PF88jfO5cbcZT9gNHmZZPzVsOVwNcstCh2FcwuePv9dWe1pxsBxxKqP5
UXceXPcKM7VlRNmehimq7q/hfbget4RJGGKYPNXeKHOo6yfy7lJPiQV4h+5z2pSs
SPP2fQQPq0aubmcO23CXFtZl4WRHQ4pax6opepnpIfC2vZ0HLXJtPrhMKcbFJNTo
qPis6HWQPpIuI6l4MJfs+YO9ETxCR31Yd28qFAfPFoHlnQZTfx6NPhw8HKxTbSh2
Svr4X6Y14j3UsQgLTCArCXWAG/hlfRwxDZJ4AvR9EU0HJGDyZ45Y+LTD1N8bbsny
zcYfPqWGPIanLcNPNFYIQwDZo7ff08KdmngUvf/Q9om60mP1hsPJMHf6VhPXj4fC
2TZ11fORssSlBSNtIkFkbjEG+aiWtWnz3fN3uSyT50rgGwtDHJzVzLiUWHlZKcxW
X73YdxuT8fqQwg==
=Yibq
-----END PGP SIGNATURE-----
Merge tag 'powerpc-cve-2020-4788' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Fixes for CVE-2020-4788.
From Daniel's cover letter:
IBM Power9 processors can speculatively operate on data in the L1
cache before it has been completely validated, via a way-prediction
mechanism. It is not possible for an attacker to determine the
contents of impermissible memory using this method, since these
systems implement a combination of hardware and software security
measures to prevent scenarios where protected data could be leaked.
However these measures don't address the scenario where an attacker
induces the operating system to speculatively execute instructions
using data that the attacker controls. This can be used for example to
speculatively bypass "kernel user access prevention" techniques, as
discovered by Anthony Steinhauser of Google's Safeside Project. This
is not an attack by itself, but there is a possibility it could be
used in conjunction with side-channels or other weaknesses in the
privileged code to construct an attack.
This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern.
This patch series flushes the L1 cache on kernel entry (patch 2) and
after the kernel performs any user accesses (patch 3). It also adds a
self-test and performs some related cleanups"
* tag 'powerpc-cve-2020-4788' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: rename pnv|pseries_setup_rfi_flush to _setup_security_mitigations
selftests/powerpc: refactor entry and rfi_flush tests
selftests/powerpc: entry flush test
powerpc: Only include kup-radix.h for 64-bit Book3S
powerpc/64s: flush L1D after user accesses
powerpc/64s: flush L1D on kernel entry
selftests/powerpc: rfi_flush: disable entry flush if present
IBM Power9 processors can speculatively operate on data in the L1 cache
before it has been completely validated, via a way-prediction mechanism. It
is not possible for an attacker to determine the contents of impermissible
memory using this method, since these systems implement a combination of
hardware and software security measures to prevent scenarios where
protected data could be leaked.
However these measures don't address the scenario where an attacker induces
the operating system to speculatively execute instructions using data that
the attacker controls. This can be used for example to speculatively bypass
"kernel user access prevention" techniques, as discovered by Anthony
Steinhauser of Google's Safeside Project. This is not an attack by itself,
but there is a possibility it could be used in conjunction with
side-channels or other weaknesses in the privileged code to construct an
attack.
This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This patch flushes the L1 cache after user accesses.
This is part of the fix for CVE-2020-4788.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
IBM Power9 processors can speculatively operate on data in the L1 cache
before it has been completely validated, via a way-prediction mechanism. It
is not possible for an attacker to determine the contents of impermissible
memory using this method, since these systems implement a combination of
hardware and software security measures to prevent scenarios where
protected data could be leaked.
However these measures don't address the scenario where an attacker induces
the operating system to speculatively execute instructions using data that
the attacker controls. This can be used for example to speculatively bypass
"kernel user access prevention" techniques, as discovered by Anthony
Steinhauser of Google's Safeside Project. This is not an attack by itself,
but there is a possibility it could be used in conjunction with
side-channels or other weaknesses in the privileged code to construct an
attack.
This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This patch flushes the L1 cache on kernel entry.
This is part of the fix for CVE-2020-4788.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add a kernel parameter to disable SGX kernel support and document it.
[ bp: Massage. ]
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Acked-by: Jethro Beekman <jethro@fortanix.com>
Tested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Link: https://lkml.kernel.org/r/20201112220135.165028-9-jarkko@kernel.org
This was implemented a long time ago, but never actually added to the
documentation.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Link: https://lore.kernel.org/r/20201108181824.bso5exam72b4p4tk@function
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Unify the handling of managed and stateless device links in the
runtime PM framework and prevent runtime PM references to devices
from being leaked after device link removal (Rafael Wysocki).
- Fix two mistakes in the cpuidle documentation (Julia Lawall).
- Prevent the schedutil cpufreq governor from missing policy
limits updates in some cases (Viresh Kumar).
- Prevent static OPPs from being dropped by mistake (Viresh Kumar).
- Prevent helper function in the OPP framework from returning
prematurely (Viresh Kumar).
- Prevent opp_table_lock from being held too long during removal
of OPP tables with no more active references (Viresh Kumar).
- Drop redundant semicolon from the Intel RAPL power capping
driver (Tom Rix).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl+kAnUSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxND0P/j5e/wqFZ/34XEaUa4PIs27TDoJHq1Jh
PfUxqSBEVO6Xm8TMCtOIuS8gw4x7ehenXC8gFdivy+Fu+4QNkPKYggF5/LgWO8Gl
cMNzYwJ8shIBQWQZftIx01Bn5tAvk1YhV1mnSNf580Iy7FhKqojMnrvzQnpGD4jR
piBkvBcbIJWDk+T96RzbqnqmMD0euvlYfzg1KnsyqsOpoRl7kZoH4ahWYskdIxDY
NtA4SqQ8dvxjTwI8+a+JBb//ua9jjjDyjd7FUV87HMdcVh9rZKbmHKkk4Yv/3C4C
jZuCTV6zaWIHCbbkZwi+OTNE8q21GgWm1xqwnGWVTFSGrs8ZyfjLs73/g9S5zI3N
C/n4YUJxiYDVcnrAVwR7kSTEouiQ6mTuChOt7T3r1Ilx67rw4TDsI59Fk1Xn07bZ
1wfaPASUPsFAxF7vMhkdzsidhpR3BYNgMwmAWo9R4Yvw4evyRN4tywoUQJ1X27zg
ERir6mFVz6XCnXPrJhyx0bWLo+VD8J/arhfIPSTqHR7wn0tgc23aVeYy7wvfYiiu
QQJqQ58Aoa0Z7ZpeAv3xbSung+eqQ/dDC9FnXqxdZCI6brYUhrZ19OhsFnEyzGUR
IDRzcP1/72AxhidpjO71txOJSJFTqKF2LzL/wuS6kMbYwmFjYfj18xX5GvOUuXmL
hO+E+QdleQFo
=lKQA
-----END PGP SIGNATURE-----
Merge tag 'pm-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix the device links support in runtime PM, correct mistakes in
the cpuidle documentation, fix the handling of policy limits changes
in the schedutil cpufreq governor, fix assorted issues in the OPP
(operating performance points) framework and make one janitorial
change.
Specifics:
- Unify the handling of managed and stateless device links in the
runtime PM framework and prevent runtime PM references to devices
from being leaked after device link removal (Rafael Wysocki).
- Fix two mistakes in the cpuidle documentation (Julia Lawall).
- Prevent the schedutil cpufreq governor from missing policy limits
updates in some cases (Viresh Kumar).
- Prevent static OPPs from being dropped by mistake (Viresh Kumar).
- Prevent helper function in the OPP framework from returning
prematurely (Viresh Kumar).
- Prevent opp_table_lock from being held too long during removal of
OPP tables with no more active references (Viresh Kumar).
- Drop redundant semicolon from the Intel RAPL power capping driver
(Tom Rix)"
* tag 'pm-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: runtime: Resume the device earlier in __device_release_driver()
PM: runtime: Drop pm_runtime_clean_up_links()
PM: runtime: Drop runtime PM references to supplier on link removal
powercap/intel_rapl: remove unneeded semicolon
Documentation: PM: cpuidle: correct path name
Documentation: PM: cpuidle: correct typo
cpufreq: schedutil: Don't skip freq update if need_freq_update is set
opp: Reduce the size of critical section in _opp_table_kref_release()
opp: Fix early exit from dev_pm_opp_register_set_opp_helper()
opp: Don't always remove static OPPs in _of_add_opp_table_v1()
number of warnings from the once-noisy docs build process is nearly zero.
Getting to this point has required a lot of work; once there, hopefully we
can keep things that way.
I have packaged this as a separate pull because it does a fair amount of
reaching outside of Documentation/. The changes are all in comments and in
code placement. It's all been in linux-next since last week.
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAl+hscQPHGNvcmJldEBs
d24ubmV0AAoJEBdDWhNsDH5YgZAH/0JeDA/1VLZYYTmdABz8mjBZsoW9tyPGGztF
nsh5ykdHhL3MeTRwumW5armLVrfKhd1XT+nIzD7OcWlqu+RDOvQ5I95rahr473hP
1SHTjqm3/AlJwQoeS72X5U6QEJQ58e2IwCbP23H3x7I3Q3snEA/HhswzxurfoB/Z
j81YzDV2YPEc0LJWZ5Vn0NEdwP8cdpFv5rojsQmepq7K0yJ7tEHb7/u2cEuUBgXS
8LcYCNPLpiN+q5N8uQ5oDjIUNdLQvP03kgKtQWiCTr4BRydOrDlJie28LIedamEz
anu7UfaVK4bxn+ugRI0g2+aWQKux81ULCinKUWmLRNbcxjhaQqQ=
=hDfp
-----END PGP SIGNATURE-----
Merge tag 'docs-5.10-warnings' of git://git.lwn.net/linux
Pull documentation build warning fixes from Jonathan Corbet:
"This contains a series of warning fixes from Mauro; once applied, the
number of warnings from the once-noisy docs build process is nearly
zero.
Getting to this point has required a lot of work; once there,
hopefully we can keep things that way.
I have packaged this as a separate pull because it does a fair amount
of reaching outside of Documentation/. The changes are all in comments
and in code placement. It's all been in linux-next since last week"
* tag 'docs-5.10-warnings' of git://git.lwn.net/linux: (24 commits)
docs: SafeSetID: fix a warning
amdgpu: fix a few kernel-doc markup issues
selftests: kselftest_harness.h: fix kernel-doc markups
drm: amdgpu_dm: fix a typo
gpu: docs: amdgpu.rst: get rid of wrong kernel-doc markups
drm: amdgpu: kernel-doc: update some adev parameters
docs: fs: api-summary.rst: get rid of kernel-doc include
IB/srpt: docs: add a description for cq_size member
locking/refcount: move kernel-doc markups to the proper place
docs: lockdep-design: fix some warning issues
MAINTAINERS: fix broken doc refs due to yaml conversion
ice: docs fix a devlink info that broke a table
crypto: sun8x-ce*: update entries to its documentation
net: phy: remove kernel-doc duplication
mm: pagemap.h: fix two kernel-doc markups
blk-mq: docs: add kernel-doc description for a new struct member
docs: userspace-api: add iommu.rst to the index file
docs: hwmon: mp2975.rst: address some html build warnings
docs: net: statistics.rst: remove a duplicated kernel-doc
docs: kasan.rst: add two missing blank lines
...