so that the work can scale better.
* New driver for Mellanox' BlueField SoC DDR controller (Shravan Kumar Ramani)
* AMD Rome support in amd64_edac (Yazen Ghannam and Isaac Vaughn)
* Misc fixes, cleanups and code improvements
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl17RrMACgkQEsHwGGHe
VUpt1g//WHs2FkGvwzHFmNHaUpATWLVKJ3azrNs+q1Av53xGngg/V8qTewwcQ1tx
OdgCqHj6PIMAoJbI2IIMIrjdMfM7EFXKgtk1kSut2uBowODkt29Ccj1ncdQ6uZOa
OSBIt7fXcsQQoX+0/SbutR3GB0gH6/rJQ+LFZix1g1o0pUfqgPGf1ypZ+gaSc8zr
bq+Ke8mXwaw13qy06eXAoOb32sAOfHqHjQUs53ecA+shmhua6AhRwuH+LcQyORoI
RdIKLIKb6yw/Vri3PdrokVKZo4OPa8hbdSlC9JPhZ3SfAMlmrDfr7OiVqneKfBJz
wRBfVVNNozbIHu5lv7ZxVaDAUh6VnWhhW036HxiMSG6NdjalVb9HB8xpKNbBlDxy
L7q0Y5CLEcRQpkmO/S0SK8KA4Vr8w8Vw/rYVRO9ypM6LEz+uoyMIZ5+mJjiCnjYo
sClPz5TqqgxCieId0tyAs+IZ1iXItRpEogBTXmvgIrrnWXkhQmo6Jhvd7VGDM6bT
3RwrkWXApzxdLQSv/JqRC7N8mdII0WwDI6AjBUoC7cOeBaMQcHrcD3sLYgjWL1Up
IQgSfHdR13S2p4Og1Phni/gSOnruCzCJQI1aCC8jjj92f1z8a/T0rjyMkE/qpZ+X
tPOqNGkXfw86LzBDFFIOtGI6v8fdVX+Di/pTQid5hm3YdPMe1D8=
=lEnD
-----END PGP SIGNATURE-----
Merge tag 'edac_for_5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:
"The new thing this time around is that we have three maintainers now
and a new, old repo. New because it is new for the EDAC tree which is
hosted there from now on and old because it is Tony's and mine's old
RAS repo which we still use occasionally when the stuff isn't in tip.
Summary:
- EDAC tree has three maintainers and one new designated reviewer
now, so that the work can scale better.
- New driver for Mellanox' BlueField SoC DDR controller (Shravan
Kumar Ramani)
- AMD Rome support in amd64_edac (Yazen Ghannam and Isaac Vaughn)
- Misc fixes, cleanups and code improvements"
* tag 'edac_for_5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/amd64: Add PCI device IDs for family 17h, model 70h
MAINTAINERS: Add Robert as a EDAC reviewer
EDAC/mc_sysfs: Make debug messages consistent
EDAC/mc_sysfs: Remove pointless gotos
EDAC: Prefer 'unsigned int' to bare use of 'unsigned'
EDAC/amd64: Support asymmetric dual-rank DIMMs
EDAC/amd64: Cache secondary Chip Select registers
EDAC/amd64: Decode syndrome before translating address
EDAC/amd64: Find Chip Select memory size using Address Mask
EDAC/amd64: Initialize DIMM info for systems with more than two channels
EDAC/amd64: Recognize DRAM device type ECC capability
EDAC/amd64: Support more than two controllers for chip selects handling
EDAC/mc: Cleanup _edac_mc_free() code
EDAC, pnd2: Fix ioremap() size in dnv_rd_reg()
EDAC, mellanox: Add ECC support for BlueField DDR4
EDAC/altera: Use the proper type for the IRQ status bits
EDAC/mc: Fix grain_bits calculation
edac: altera: Move Stratix10 SDRAM ECC to peripheral
MAINTAINERS: update EDAC entry to reflect current tree and maintainers
-----BEGIN PGP SIGNATURE-----
iJYEABYIAD4WIQRE6pSOnaBC00OEHEIaerohdGur0gUCXW0lVyAcamFya2tvLnNh
a2tpbmVuQGxpbnV4LmludGVsLmNvbQAKCRAaerohdGur0sjdAQD3lpIS7YeT37Bu
QVdUkRObzkN3gWvv2oZkGXPg72843AD+KH0GL+M9SfBrYO/1StBJascarOHIIqUt
/1uqUgL7fAI=
=IV4V
-----END PGP SIGNATURE-----
Merge tag 'tpmdd-next-20190902' of git://git.infradead.org/users/jjs/linux-tpmdd
Pull tpm updates from Jarkko Sakkinen:
"A new driver for fTPM living inside ARM TEE was added this round.
In addition to that, there are three bug fixes and one clean up"
* tag 'tpmdd-next-20190902' of git://git.infradead.org/users/jjs/linux-tpmdd:
tpm/tpm_ftpm_tee: Document fTPM TEE driver
tpm/tpm_ftpm_tee: A driver for firmware TPM running inside TEE
tpm: Remove a deprecated comments about implicit sysfs locking
tpm_tis_core: Set TPM_CHIP_FLAG_IRQ before probing for interrupts
tpm_tis_core: Turn on the TPM before probing IRQ's
MAINTAINERS: fix style in KEYS-TRUSTED entry
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXXe8mQAKCRCRxhvAZXjc
ou7oAQCszihkNfpjORSSSOqenMDrxxDW++A7TIOLuq7UyZQl8QD+LM1wvT/xypfJ
ORD9XX8+Wrv07AQn85fZBEFXGrnengk=
=o+VL
-----END PGP SIGNATURE-----
Merge tag 'core-process-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull pidfd/waitid updates from Christian Brauner:
"This contains two features and various tests.
First, it adds support for waiting on process through pidfds by adding
the P_PIDFD type to the waitid() syscall. This completes the basic
functionality of the pidfd api (cf. [1]). In the meantime we also have
a new adition to the userspace projects that make use of the pidfd
api. The qt project was nice enough to send a mail pointing out that
they have a pr up to switch to the pidfd api (cf. [2]).
Second, this tag contains an extension to the waitid() syscall to make
it possible to wait on the current process group in a race free manner
(even though the actual problem is very unlikely) by specifing 0
together with the P_PGID type. This extension traces back to a
discussion on the glibc development mailing list.
There are also a range of tests for the features above. Additionally,
the test-suite which detected the pidfd-polling race we fixed in [3]
is included in this tag"
[1] https://lwn.net/Articles/794707/
[2] https://codereview.qt-project.org/c/qt/qtbase/+/108456
[3] commit b191d6491b ("pidfd: fix a poll race when setting exit_state")
* tag 'core-process-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
waitid: Add support for waiting for the current process group
tests: add pidfd poll tests
tests: move common definitions and functions into pidfd.h
pidfd: add pidfd_wait tests
pidfd: add P_PIDFD to waitid()
Rewrite some lines to match line length and replace
format string 0x%x to %#x. Add and remove blank line.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
After recent changes the MSI message data needs to specify the
function-relative IRQ number.
Reported-and-tested-by: Alexander Schmidt <alexs@linux.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
This reverts commit b03755ad6f.
This is sad, and done for all the wrong reasons. Because that commit is
good, and does exactly what it says: avoids a lot of small disk requests
for the inode table read-ahead.
However, it turns out that it causes an entirely unrelated problem: the
getrandom() system call was introduced back in 2014 by commit
c6e9d6f388 ("random: introduce getrandom(2) system call"), and people
use it as a convenient source of good random numbers.
But part of the current semantics for getrandom() is that it waits for
the entropy pool to fill at least partially (unlike /dev/urandom). And
at least ArchLinux apparently has a systemd that uses getrandom() at
boot time, and the improvements in IO patterns means that existing
installations suddenly start hanging, waiting for entropy that will
never happen.
It seems to be an unlucky combination of not _quite_ enough entropy,
together with a particular systemd version and configuration. Lennart
says that the systemd-random-seed process (which is what does this early
access) is supposed to not block any other boot activity, but sadly that
doesn't actually seem to be the case (possibly due bogus dependencies on
cryptsetup for encrypted swapspace).
The correct fix is to fix getrandom() to not block when it's not
appropriate, but that fix is going to take a lot more discussion. Do we
just make it act like /dev/urandom by default, and add a new flag for
"wait for entropy"? Do we add a boot-time option? Or do we just limit
the amount of time it will wait for entropy?
So in the meantime, we do the revert to give us time to discuss the
eventual fix for the fundamental problem, at which point we can re-apply
the ext4 inode table access optimization.
Reported-by: Ahmed S. Darwish <darwish.07@gmail.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Alexander E. Patrakov <patrakov@gmail.com>
Cc: Lennart Poettering <mzxreary@0pointer.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the Distribution Kernels track at Linux Plumbers Conference there
was some discussion around the difficulty of making kernel builds
reproducible.
This is a solved problem, but the solutions don't appear to be
documented in one place. This document lists the issues I know about
and the settings needed to ensure reproducibility.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
code that was thought unnecessary; however, since then KVM has grown quite
a few cond_resched()s and for that reason the simplified code is prone to
livelocks---one CPUs tries to empty a list of guest page tables while the
others keep adding to them. This adds back the generation-based zapping of
guest page tables, which was not unnecessary after all.
On top of this, there is a fix for a kernel memory leak and a couple of
s390 fixlets as well.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJdfJcxAAoJEL/70l94x66D8ocH/jhz+95+TBNF5j0xGmBWiDbH
zQlWmEMkAQ8o66J+503bc2ltQhM8078Ohumgtmq8HFaRgctDiRdjLBcce6aOr4tH
09gBdlWgkVWLhN8AhydSbHh+SLcCWdZQSAWQrvt1aM2BRdz5WECkUdauSL2oHsTP
P58318EL0JrqQaQZtK4qI4eNDiYWdKq2luMu9BTYm3f1Hnk28gorErgBehScYuxE
LlnM4RYedH54UR8opdUDmhHxO7bGTW4SVz4obq0sSOJs190DuVbCxbaJjhP+tSwk
7hPac3tHoYFUIhVtgGD3N/LrsFgvcShmjawLP0szCrR2sWrtTqmb0R63DLxpmyg=
=yuZ9
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"The main change here is a revert of reverts. We recently simplified
some code that was thought unnecessary; however, since then KVM has
grown quite a few cond_resched()s and for that reason the simplified
code is prone to livelocks---one CPUs tries to empty a list of guest
page tables while the others keep adding to them. This adds back the
generation-based zapping of guest page tables, which was not
unnecessary after all.
On top of this, there is a fix for a kernel memory leak and a couple
of s390 fixlets as well"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86/mmu: Reintroduce fast invalidate/zap for flushing memslot
KVM: x86: work around leak of uninitialized stack contents
KVM: nVMX: handle page fault in vmread
KVM: s390: Do not leak kernel stack data in the KVM_S390_INTERRUPT ioctl
KVM: s390: kvm_s390_vm_start_migration: check dirty_bitmap before using it as target for memset()
32 bit build got broken by the latest defence in depth patch.
Revert and we'll try again in the next cycle.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJdfT6AAAoJECgfDbjSjVRpOrUH/09N3JI/VSIvH5y75UYAU1pc
nDrekWI7TLYYIKJCUwfcxIzxskw1EOENxSFNRAtkKyRKZtq7HmhWOBec4NdnHkvy
ze1Cbt2aqQiMfbxJYStri/AYNSpC+aTttFSDAMm4TfE+QxfEjumO3J/HSRk0RYdt
leGziB4H2BjsZM/2JpMD9UFIq/D9SeGZTwd2sZTCTQ+RIvYEN2hGUXoG5hYl/xv6
+g6/Rkj/eAHqilpvyAt2PWFxslqLsQhWt/S/PHa61HpQLuPBCBYAu38O6X0vD93F
8vNQfcedz4qpyyHdfaGB+jquZ800BxqaUvdLdyAdQJXXGaytopAYzncEa1iPyyg=
=s4iF
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fix from Michael Tsirkin:
"A last minute revert
The 32-bit build got broken by the latest defence in depth patch.
Revert and we'll try again in the next cycle"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
Revert "vhost: block speculation of translated descriptors"
Last week, Palmer and I learned that there was an error in the RISC-V
kernel image header format that could make it less compatible with the
ARM64 kernel image header format. I had missed this error during my
original reviews of the patch.
The kernel image header format is an interface that impacts
bootloaders, QEMU, and other user tools. Those packages must be
updated to align with whatever is merged in the kernel. We would like
to avoid proliferating these image formats by keeping the RISC-V
header as close as possible to the existing ARM64 header. Since the
arch/riscv patch that adds support for the image header was merged
with our v5.3-rc1 pull request as commit 0f327f2aaa ("RISC-V: Add
an Image header that boot loader can parse."), we think it wise to try
to fix this error before v5.3 is released.
The fix itself should be backwards-compatible with any project that
has already merged support for premature versions of this interface.
It primarily involves ensuring that the RISC-V image header has
something useful in the same field as the ARM64 image header.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAl1870sACgkQx4+xDQu9
Kkt33A//SG4fTIyz0rIGTZpJPKV3nXacBq6XvOFxFsHRHlEvD2f/JkSK1Ab+hV5R
vmTkVGCSCVz1C/OEA+KWsjuiJEglII6eOLIRqST1Wm6KumwAwLc78xdgEb1Sm/SC
E7OTYtSqbUjCqzzD1BFcXfbP4mGF/9IBjWI3OcCnb1UcLuL29Mt35gvxI9fF1FB6
+EU96MBQbk4gVUYjKXObTvaAZwWIYrMkOFQmdRgb4jqk42i0hLmKx//WkI1Ajlp8
FDjE2nIo2NAt0N7pImJ/QtqxkOsQjMtOyOscoTyhB4eGJW0+fTyVrt6FpUdYQDQq
vZI/WS2RFYUi2wfj+JNQ959MgsWZZ8z21KbFWwR0HC4k2xRZaxCO48g/VweJA/QW
3f6+CMxYgwF5KzToHvUjlo0wNMW2Xo/FX9bky3gb8rJPWnSx9uu9lfoh17FUD4Ty
cEknaLtmMALA8Lgr8hwTKbZLg7J1ih5r1SPj0UvjpjEmwDUl2doA0EONuuBroEHM
KDerGitg6D0g4B4VlGsHuLMd6Gj/5r2teno97tPoaf5J9mCZ1v2/Q5OL0QwBYd84
5cp+Ox1aQTY6SJq8gftBOD3MmW2lKCC5tT6H0bJvKBAE7tJaLPv5YIj6dp1jfXKB
klzJUdGRsL60EwlL/cbFOurDfhBeQlq8akdzG5Cg5e8q+mISSTE=
=Jt6U
-----END PGP SIGNATURE-----
Merge tag 'riscv/for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fix from Paul Walmsley:
"Last week, Palmer and I learned that there was an error in the RISC-V
kernel image header format that could make it less compatible with the
ARM64 kernel image header format. I had missed this error during my
original reviews of the patch.
The kernel image header format is an interface that impacts
bootloaders, QEMU, and other user tools. Those packages must be
updated to align with whatever is merged in the kernel. We would like
to avoid proliferating these image formats by keeping the RISC-V
header as close as possible to the existing ARM64 header. Since the
arch/riscv patch that adds support for the image header was merged
with our v5.3-rc1 pull request as commit 0f327f2aaa ("RISC-V: Add
an Image header that boot loader can parse."), we think it wise to try
to fix this error before v5.3 is released.
The fix itself should be backwards-compatible with any project that
has already merged support for premature versions of this interface.
It primarily involves ensuring that the RISC-V image header has
something useful in the same field as the ARM64 image header"
* tag 'riscv/for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: modify the Image header to improve compatibility with the ARM64 header
This reverts commit a89db445fb.
I was hasty to include this patch, and it breaks the build on 32 bit.
Defence in depth is good but let's do it properly.
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Pull networking fixes from David Miller:
1) Don't corrupt xfrm_interface parms before validation, from Nicolas
Dichtel.
2) Revert use of usb-wakeup in btusb, from Mario Limonciello.
3) Block ipv6 packets in bridge netfilter if ipv6 is disabled, from
Leonardo Bras.
4) IPS_OFFLOAD not honored in ctnetlink, from Pablo Neira Ayuso.
5) Missing ULP check in sock_map, from John Fastabend.
6) Fix receive statistic handling in forcedeth, from Zhu Yanjun.
7) Fix length of SKB allocated in 6pack driver, from Christophe
JAILLET.
8) ip6_route_info_create() returns an error pointer, not NULL. From
Maciej Żenczykowski.
9) Only add RDS sock to the hashes after rs_transport is set, from
Ka-Cheong Poon.
10) Don't double clean TX descriptors in ixgbe, from Ilya Maximets.
11) Presence of transmit IPSEC offload in an SKB is not tested for
correctly in ixgbe and ixgbevf. From Steffen Klassert and Jeff
Kirsher.
12) Need rcu_barrier() when register_netdevice() takes one of the
notifier based failure paths, from Subash Abhinov Kasiviswanathan.
13) Fix leak in sctp_do_bind(), from Mao Wenan.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
cdc_ether: fix rndis support for Mediatek based smartphones
sctp: destroy bucket if failed to bind addr
sctp: remove redundant assignment when call sctp_get_port_local
sctp: change return type of sctp_get_port_local
ixgbevf: Fix secpath usage for IPsec Tx offload
sctp: Fix the link time qualifier of 'sctp_ctrlsock_exit()'
ixgbe: Fix secpath usage for IPsec TX offload.
net: qrtr: fix memort leak in qrtr_tun_write_iter
net: Fix null de-reference of device refcount
ipv6: Fix the link time qualifier of 'ping_v6_proc_exit_net()'
tun: fix use-after-free when register netdev failed
tcp: fix tcp_ecn_withdraw_cwr() to clear TCP_ECN_QUEUE_CWR
ixgbe: fix double clean of Tx descriptors with xdp
ixgbe: Prevent u8 wrapping of ITR value to something less than 10us
mlx4: fix spelling mistake "veify" -> "verify"
net: hns3: fix spelling mistake "undeflow" -> "underflow"
net: lmc: fix spelling mistake "runnin" -> "running"
NFC: st95hf: fix spelling mistake "receieve" -> "receive"
net/rds: An rds_sock is added too early to the hash table
mac80211: Do not send Layer 2 Update frame before authorization
...
lima:
- fix gem_wait ioctl
core:
- constify modes list
i915:
- DP MST high color depth regression
- GPU hangs on vulkan compute workloads
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJdemkOAAoJEAx081l5xIa+4ooP/RGV1E+pHsdC1SzsZYLZ+GDp
TLeg0KnNMijxfP8TCqjssPwJFbYVO3RxZHCDKbCQdESRG6TKS5wqJ9bgR2y3jdXZ
rjbhGY8YkxNkZbpAYPsYkzNcRqIrZFOYOnKDeafXP02KnHoiIckf8W3DKuCNr5XF
/htgh4rQnKZnkiFJm1OzGwxdTTI+yv4aqi50cLR5pGAnSUKpxeHm2M+S9P2BmMgL
BvsjDFFz+WGQllQ7nKUiMZ9oPbAGiQjWmICfucBniBUnhdleMk8qiNTuD8YQ1Cwh
m5Ix2wa8vPZBeg1Hb55iJGhC+l4ePGdLPtJiPUUUMM+d8A0QlcMLrFpXM8jvtgA5
h+cZFR9Po8QS1bb6Jic/i9OLzG+KhFAQLSREcBNqGcyqfCZHqKArPSlTqqjUkRRj
KW4OhdGxPrpmIzUHrHInSZKXzZtdWXKF35YFDWbjy0WNo91RW/X4I/qvCsML99uH
WDoPumEG4eolPLK5JCcTnMMhRYULRopulVc8Tw/yTIxlpUEVhWk7fGhc8XAK/F0D
S5bpxx8OQMfSYAaln5tzGhf4vUWVmdFNv0kyIdozZLRCQzYNuECg6bB16gZ9utvN
np8k9x6CKFWTlNerAhO2jpvmh/pVupQ4GylGY5uIc2Z4P9LPMzORltrNU8cgaZB/
8g9NDRvzCpfqW/UWXwFD
=wL1R
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2019-09-13' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"From the maintainer summit, just some last minute fixes for final:
lima:
- fix gem_wait ioctl
core:
- constify modes list
i915:
- DP MST high color depth regression
- GPU hangs on vulkan compute workloads"
* tag 'drm-fixes-2019-09-13' of git://anongit.freedesktop.org/drm/drm:
drm/lima: fix lima_gem_wait() return value
drm/i915: Restore relaxed padding (OCL_OOB_SUPPRES_ENABLE) for skl+
drm/i915: Limit MST to <= 8bpc once again
drm/modes: Make the whitelist more const
Standard integer promotion is already done and %hx and %hhx is useless
so do not encourage the use of %hh[xudi] or %h[xudi].
As Linus said in:
https://lore.kernel.org/lkml/CAHk-=wgoxnmsj8GEVFJSvTwdnWm8wVJthefNk2n6+4TC=20e0Q@mail.gmail.com/
It's a pointless warning, making for more complex code, and
making people remember esoteric printf format details that have no
reason for existing.
The "h" and "hh" things should never be used. The only reason for them
being used if if you have an "int", but you want to print it out as a
"char" (and honestly, that is a really bad reason, you'd be better off
just using a proper cast to make the code more obvious).
So if what you have a "char" (or unsigned char) you should always just
print it out as an "int", knowing that the compiler already did the
proper type conversion.
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Louis Taylor <louis@kragniz.eu>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
This argument is supported on RISC-V systems and widely used, but was
not documented here.
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Remove the clever example about read-write lock because this type of
lock is not recommended anymore (according to the very same document).
So there is no reason to teach clever things that people should not do.
Signed-off-by: Federico Vaga <federico.vaga@vaga.pv.it>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Describe how the comedi minor device numbers are split across comedi
devices and comedi subdevices.
Replace the current, long dead URL with an official URL for the Comedi
project.
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
- prevent a user triggerable oops in the migration code
- do not leak kernel stack content
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJdejosAAoJEBF7vIC1phx8ZcYP/09WMmcbOexGvopqyMzIWgAv
xpSHAW0+mGriu9b41OwkxBsMG3MxUzk86b3zL0r5eaigWXSuE2NU0OhScqF9ehMX
pTtoeSzFJsPFwGQrOKIhpgcNzOJ+YfVqTDlf5dxq9uSNYF32suuz0Dw4P9PdFJOg
k8prJXiKu+bL21TcbhWsAAP7Gb5/DA26p4d5KM3wJe351Af9lrLrDF2z+pKe9fbY
v0vMcH3tJoBOOTYUSJeptEWU9OlYljMrJN7kkmXCEC8yklwoXPDNgAC8Yg2SfqYM
xNKVkX/rY97cn1Dq0LpAvEjMDYvu7KbOM1qQE9A67gRLIjuGJnDyEa+j/iB/tOrz
BMmTdut44XRaVZVdDL+d2pg3LKI+1+UV4XTwpD4g1tSpYLar3dJVb9mq00OzdCAg
TsK+pQYTSZig+H4ubtikgm9pFGKOB2Jsp2+FoC7jYxhYQWyj4syBkSoaaUdY0LvE
/Du3NY3RaG4yi2K2XV0yjBVAjpXxYMWqvzJYTC9XlrEQJ5nAmiefTgxZmcg4ZCMw
0YVRigG7vz8oKpVRl/6smGd/U+qTNZN4cXnFgUr71yONiIxsSndUZ/Yledtf+KQR
uzPfvIwYpRzwqVnXkkFb+PNxvJVftCbe2rRI4D549VsbmEJmSadjiB5aW1Rj3fMN
47ZjXZmmGETR8BtQEM37
=LxGy
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-master-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master
KVM: s390: Fixes for 5.3
- prevent a user triggerable oops in the migration code
- do not leak kernel stack content
James Harvey reported a livelock that was introduced by commit
d012a06ab1 ("Revert "KVM: x86/mmu: Zap only the relevant pages when
removing a memslot"").
The livelock occurs because kvm_mmu_zap_all() as it exists today will
voluntarily reschedule and drop KVM's mmu_lock, which allows other vCPUs
to add shadow pages. With enough vCPUs, kvm_mmu_zap_all() can get stuck
in an infinite loop as it can never zap all pages before observing lock
contention or the need to reschedule. The equivalent of kvm_mmu_zap_all()
that was in use at the time of the reverted commit (4e103134b8, "KVM:
x86/mmu: Zap only the relevant pages when removing a memslot") employed
a fast invalidate mechanism and was not susceptible to the above livelock.
There are three ways to fix the livelock:
- Reverting the revert (commit d012a06ab1) is not a viable option as
the revert is needed to fix a regression that occurs when the guest has
one or more assigned devices. It's unlikely we'll root cause the device
assignment regression soon enough to fix the regression timely.
- Remove the conditional reschedule from kvm_mmu_zap_all(). However, although
removing the reschedule would be a smaller code change, it's less safe
in the sense that the resulting kvm_mmu_zap_all() hasn't been used in
the wild for flushing memslots since the fast invalidate mechanism was
introduced by commit 6ca18b6950 ("KVM: x86: use the fast way to
invalidate all pages"), back in 2013.
- Reintroduce the fast invalidate mechanism and use it when zapping shadow
pages in response to a memslot being deleted/moved, which is what this
patch does.
For all intents and purposes, this is a revert of commit ea145aacf4
("Revert "KVM: MMU: fast invalidate all pages"") and a partial revert of
commit 7390de1e99 ("Revert "KVM: x86: use the fast way to invalidate
all pages""), i.e. restores the behavior of commit 5304b8d37c ("KVM:
MMU: fast invalidate all pages") and commit 6ca18b6950 ("KVM: x86:
use the fast way to invalidate all pages") respectively.
Fixes: d012a06ab1 ("Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"")
Reported-by: James Harvey <jamespharvey20@gmail.com>
Cc: Alex Willamson <alex.williamson@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Emulation of VMPTRST can incorrectly inject a page fault
when passed an operand that points to an MMIO address.
The page fault will use uninitialized kernel stack memory
as the CR2 and error code.
The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
exit to userspace; however, it is not an easy fix, so for now just ensure
that the error code and CR2 are zero.
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Cc: stable@vger.kernel.org
[add comment]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The implementation of vmread to memory is still incomplete, as it
lacks the ability to do vmread to I/O memory just like vmptrst.
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Part of the intention during the definition of the RISC-V kernel image
header was to lay the groundwork for a future merge with the ARM64
image header. One error during my original review was not noticing
that the RISC-V header's "magic" field was at a different size and
position than the ARM64's "magic" field. If the existing ARM64 Image
header parsing code were to attempt to parse an existing RISC-V kernel
image header format, it would see a magic number 0. This is
undesirable, since it's our intention to align as closely as possible
with the ARM64 header format. Another problem was that the original
"res3" field was not being initialized correctly to zero.
Address these issues by creating a 32-bit "magic2" field in the RISC-V
header which matches the ARM64 "magic" field. RISC-V binaries will
store "RSC\x05" in this field. The intention is that the use of the
existing 64-bit "magic" field in the RISC-V header will be deprecated
over time. Increment the minor version number of the file format to
indicate this change, and update the documentation accordingly. Fix
the assembler directives in head.S to ensure that reserved fields are
properly zero-initialized.
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Reported-by: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Cc: Atish Patra <atish.patra@wdc.com>
Cc: Karsten Merker <merker@debian.org>
Link: https://lore.kernel.org/linux-riscv/194c2f10c9806720623430dbf0cc59a965e50448.camel@wdc.com/T/#u
Link: https://lore.kernel.org/linux-riscv/mhng-755b14c4-8f35-4079-a7ff-e421fd1b02bc@palmer-si-x1e/T/#t
A Mediatek based smartphone owner reports problems with USB
tethering in Linux. The verbose USB listing shows a rndis_host
interface pair (e0/01/03 + 10/00/00), but the driver fails to
bind with
[ 355.960428] usb 1-4: bad CDC descriptors
The problem is a failsafe test intended to filter out ACM serial
functions using the same 02/02/ff class/subclass/protocol as RNDIS.
The serial functions are recognized by their non-zero bmCapabilities.
No RNDIS function with non-zero bmCapabilities were known at the time
this failsafe was added. But it turns out that some Wireless class
RNDIS functions are using the bmCapabilities field. These functions
are uniquely identified as RNDIS by their class/subclass/protocol, so
the failing test can safely be disabled. The same applies to the two
types of Misc class RNDIS functions.
Applying the failsafe to Communication class functions only retains
the original functionality, and fixes the problem for the Mediatek based
smartphone.
Tow examples of CDC functional descriptors with non-zero bmCapabilities
from Wireless class RNDIS functions are:
0e8d:000a Mediatek Crosscall Spider X5 3G Phone
CDC Header:
bcdCDC 1.10
CDC ACM:
bmCapabilities 0x0f
connection notifications
sends break
line coding and serial state
get/set/clear comm features
CDC Union:
bMasterInterface 0
bSlaveInterface 1
CDC Call Management:
bmCapabilities 0x03
call management
use DataInterface
bDataInterface 1
and
19d2:1023 ZTE K4201-z
CDC Header:
bcdCDC 1.10
CDC ACM:
bmCapabilities 0x02
line coding and serial state
CDC Call Management:
bmCapabilities 0x03
call management
use DataInterface
bDataInterface 1
CDC Union:
bMasterInterface 0
bSlaveInterface 1
The Mediatek example is believed to apply to most smartphones with
Mediatek firmware. The ZTE example is most likely also part of a larger
family of devices/firmwares.
Suggested-by: Lars Melin <larsm17@gmail.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mao Wenan says:
====================
fix memory leak for sctp_do_bind
First two patches are to do cleanup, remove redundant assignment,
and change return type of sctp_get_port_local.
Third patch is to fix memory leak for sctp_do_bind if failed
to bind address.
v2: add one patch to change return type of sctp_get_port_local.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
There are more parentheses in if clause when call sctp_get_port_local
in sctp_do_bind, and redundant assignment to 'ret'. This patch is to
do cleanup.
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently sctp_get_port_local() returns a long
which is either 0,1 or a pointer casted to long.
It's neither of the callers use the return value since
commit 62208f1245 ("net: sctp: simplify sctp_get_port").
Now two callers are sctp_get_port and sctp_do_bind,
they actually assumend a casted to an int was the same as
a pointer casted to a long, and they don't save the return
value just check whether it is zero or non-zero, so
it would better change return type from long to int for
sctp_get_port_local.
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Port the same fix for ixgbe to ixgbevf.
The ixgbevf driver currently does IPsec Tx offloading
based on an existing secpath. However, the secpath
can also come from the Rx side, in this case it is
misinterpreted for Tx offload and the packets are
dropped with a "bad sa_idx" error. Fix this by using
the xfrm_offload() function to test for Tx offload.
CC: Shannon Nelson <snelson@pensando.io>
Fixes: 7f68d43067 ("ixgbevf: enable VF IPsec offload operations")
Reported-by: Jonathan Tooker <jonathan@reliablehosting.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
New driver should use devm_hwmon_device_register_with_info() or
hwmon_device_register_with_info() to register with the hwmon subsystem.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Accessing the device when it may be runtime suspended is a bug, which is
the case in tmio_mmc_host_remove(). Let's fix the behaviour.
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
The tmio_mmc_host_probe() calls pm_runtime_set_active() to update the
runtime PM status of the device, as to make it reflect the current status
of the HW. This works fine for most cases, but unfortunate not for all.
Especially, there is a generic problem when the device has a genpd attached
and that genpd have the ->start|stop() callbacks assigned.
More precisely, if the driver calls pm_runtime_set_active() during
->probe(), genpd does not get to invoke the ->start() callback for it,
which means the HW isn't really fully powered on. Furthermore, in the next
phase, when the device becomes runtime suspended, genpd will invoke the
->stop() callback for it, potentially leading to usage count imbalance
problems, depending on what's implemented behind the callbacks of course.
To fix this problem, convert to call pm_runtime_get_sync() from
tmio_mmc_host_probe() rather than pm_runtime_set_active(). Additionally, to
avoid bumping usage counters and unnecessary re-initializing the HW the
first time the tmio driver's ->runtime_resume() callback is called,
introduce a state flag to keeping track of this.
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
This reverts commit 7ff2131933.
It turns out that the above commit introduces other problems. For example,
calling pm_runtime_set_active() must not be done prior calling
pm_runtime_enable() as that makes it fail. This leads to additional
problems, such as clock enables being wrongly balanced.
Rather than fixing the problem on top, let's start over by doing a revert.
Fixes: 7ff2131933 ("mmc: tmio: move runtime PM enablement to the driver implementations")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Add detection for machine types 0x8562 and 8x8561 and set the ELF platform
name to z15. Add the miscellaneous-instruction-extension 3 facility to
the list of facilities for z15.
And allow to generate code that only runs on a z15 machine.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
This patch introduces sha3 support for s390.
- Rework the s390-specific SHA1 and SHA2 related code to
provide the basis for SHA3.
- Provide two new kernel modules sha3_256_s390 and
sha3_512_s390 together with new kernel options.
Signed-off-by: Joerg Schmidbauer <jschmidb@de.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Try to print out startup pgm check info including exact linux kernel
version, pgm interruption code and ilc, psw and general registers. Like
the following:
Linux version 5.3.0-rc7-07282-ge7b4d41d61bd-dirty (gor@tuxmaker) #3 SMP PREEMPT Thu Sep 5 16:07:34 CEST 2019
Kernel fault: interruption code 0005 ilc:2
PSW : 0000000180000000 0000000000012e52
R:0 T:0 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:0 CC:0 PM:0 RI:0 EA:3
GPRS: 0000000000000000 00ffffffffffffff 0000000000000000 0000000000019a58
000000000000bf68 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 000000000001a041 0000000000000000
0000000004c9c000 0000000000010070 0000000000012e42 000000000000beb0
This info makes it apparent that kernel startup failed and might help
to understand what went wrong without actual standalone dump.
Printing code runs on its own stack of 1 page (at unused 0x5000), which
should be sufficient for sclp_early_printk usage (typical stack usage
observed has been around 512 bytes).
The code has pgm check recursion prevention, despite pgm check info
printing failure (follow on pgm check) or success it restores original
faulty psw and gprs and does disabled wait.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Pull cgroup fix from Tejun Heo:
"Roman found and fixed a bug in the cgroup2 freezer which allows new
child cgroup to escape frozen state"
* 'for-5.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: freezer: fix frozen state inheritance
kselftests: cgroup: add freezer mkdir test
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl16aTEACgkQxWXV+ddt
WDvICg//cSn5w+g6EnxbrAZ6IYQJ4GA7lZSk2i6Dc/lI3BTfs7Wj0SPRKd01pBjT
N+wbqoOgubsS1jkNfJsGCN80XzSa0tvyQdbezj5ncgSPXp4FRlT0K24EUQNPaqbg
SsvvxAOCerVN3Yj2qrHNWIS5qZ5/8/NjLXca1DJ/OYmrkKfhe+Z6/b9EuKffPnco
erMnaeSvQ27hYkkcdM0DGcWDoHHAQrefGNjQzp5vncJNN1F7+EGLbcH31UwApk1K
/hvOQ6Q6SoR/NKbQu3AitrR9u7v9uhWP9jHJZT46q1m89CzI4S5FjK2wKZFjPE6r
0PGRqnpdaGAERaTo3s6jIqv/X2gzJkhhhzGMiPgPJCQbAH39f/fFGEX22TjG33Yq
2CiGSIPnmKQ7HE494YLuSyHD/89SutMMCkbF0sFBoKuTnu2HQMn9r5Pk6bEKtvIY
iTk75/WTXR02qWCVhTyNDa9QnxewQGJC1d1KNQ6MwbzBiYyG9S/DDZnjLJPNx7DF
KAAANCDdyPpraLcmw2sD/qd1o10HfQmn9z1L2v3YvJBfjMe76SQFCP5WwaJRcjOm
c3ScAX9bXeXJgH+E98kWc7T6p49IPdMDGAtArQmtjO4V8pFRuqG+2Ibg6Za/y5XZ
fkaS5UY+XIk3TUpEqkWKMPMigM9a3jgHskyMgdRLQfVnoOc6Z+k=
=KXB8
-----END PGP SIGNATURE-----
Merge tag 'for-5.3-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"Here are two fixes, one of them urgent fixing a bug introduced in 5.2
and reported by many users. It took time to identify the root cause,
catching the 5.3 release is higly desired also to push the fix to 5.2
stable tree.
The bug is a mess up of return values after adding proper error
handling and honestly the kind of bug that can cause sleeping
disorders until it's caught. My appologies to everybody who was
affected.
Summary of what could happen:
1) either a hang when committing a transaction, if this happens
there's no risk of corruption, still the hang is very inconvenient
and can't be resolved without a reboot
2) writeback for some btree nodes may never be started and we end up
committing a transaction without noticing that, this is really
serious and that will lead to the "parent transid verify failed"
messages"
* tag 'for-5.3-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Btrfs: fix unwritten extent buffers and hangs on future writeback attempts
Btrfs: fix assertion failure during fsync and use of stale transaction
EAS computes the energy impact of migrating a waking task when deciding
on which CPU it should run. However, the current approach is known to
have a high algorithmic complexity, which can result in prohibitively
high wake-up latencies on systems with complex energy models, such as
systems with per-CPU DVFS. On such systems, the algorithm complexity is
in O(n^2) (ignoring the cost of searching for performance states in the
EM) with 'n' the number of CPUs.
To address this, re-factor the EAS wake-up path to compute the energy
'delta' (with and without the task) on a per-performance domain basis,
rather than system-wide, which brings the complexity down to O(n).
No functional changes intended.
Test results
~~~~~~~~~~~~
* Setup: Tested on a Google Pixel 3, with a Snapdragon 845 (4+4 CPUs,
A55/A75). Base kernel is 5.3-rc5 + Pixel3 specific patches. Android
userspace, no graphics.
* Test case: Run a periodic rt-app task, with 16ms period, ramping down
from 70% to 10%, in 5% steps of 500 ms each (json avail. at [1]).
Frequencies of all CPUs are pinned to max (using scaling_min_freq
CPUFreq sysfs entries) to reduce variability. The time to run
select_task_rq_fair() is measured using the function profiler
(/sys/kernel/debug/tracing/trace_stat/function*). See the test script
for more details [2].
Test 1:
I hacked the DT to 'fake' per-CPU DVFS. That is, we end up with one
CPUFreq policy per CPU (8 policies in total). Since all frequencies are
pinned to max for the test, this should have no impact on the actual
frequency selection, but it does in the EAS calculation.
+---------------------------+----------------------------------+
| Without patch | With patch |
+-----+-----+----------+----------+-----+-----------------+----------+
| CPU | Hit | Avg (us) | s^2 (us) | Hit | Avg (us) | s^2 (us) |
|-----+-----+----------+----------+-----+-----------------+----------+
| 0 | 274 | 38.303 | 1750.239 | 401 | 14.126 (-63.1%) | 146.625 |
| 1 | 197 | 49.529 | 1695.852 | 314 | 16.135 (-67.4%) | 167.525 |
| 2 | 142 | 34.296 | 1758.665 | 302 | 14.133 (-58.8%) | 130.071 |
| 3 | 172 | 31.734 | 1490.975 | 641 | 14.637 (-53.9%) | 139.189 |
| 4 | 316 | 7.834 | 178.217 | 425 | 5.413 (-30.9%) | 20.803 |
| 5 | 447 | 8.424 | 144.638 | 556 | 5.929 (-29.6%) | 27.301 |
| 6 | 581 | 14.886 | 346.793 | 456 | 5.711 (-61.6%) | 23.124 |
| 7 | 456 | 10.005 | 211.187 | 997 | 4.708 (-52.9%) | 21.144 |
+-----+-----+----------+----------+-----+-----------------+----------+
* Hit, Avg and s^2 are as reported by the function profiler
Test 2:
I also ran the same test with a normal DT, with 2 CPUFreq policies, to
see if this causes regressions in the most common case.
+---------------------------+----------------------------------+
| Without patch | With patch |
+-----+-----+----------+----------+-----+-----------------+----------+
| CPU | Hit | Avg (us) | s^2 (us) | Hit | Avg (us) | s^2 (us) |
|-----+-----+----------+----------+-----+-----------------+----------+
| 0 | 345 | 22.184 | 215.321 | 580 | 18.635 (-16.0%) | 146.892 |
| 1 | 358 | 18.597 | 200.596 | 438 | 12.934 (-30.5%) | 104.604 |
| 2 | 359 | 25.566 | 200.217 | 397 | 10.826 (-57.7%) | 74.021 |
| 3 | 362 | 16.881 | 200.291 | 718 | 11.455 (-32.1%) | 102.280 |
| 4 | 457 | 3.822 | 9.895 | 757 | 4.616 (+20.8%) | 13.369 |
| 5 | 344 | 4.301 | 7.121 | 594 | 5.320 (+23.7%) | 18.798 |
| 6 | 472 | 4.326 | 7.849 | 464 | 5.648 (+30.6%) | 22.022 |
| 7 | 331 | 4.630 | 13.937 | 408 | 5.299 (+14.4%) | 18.273 |
+-----+-----+----------+----------+-----+-----------------+----------+
* Hit, Avg and s^2 are as reported by the function profiler
In addition to these two tests, I also ran 50 iterations of the Lisa
EAS functional test suite [3] with this patch applied on Arm Juno r0,
Arm Juno r2, Arm TC2 and Hikey960, and could not see any regressions
(all EAS functional tests are passing).
[1] https://paste.debian.net/1100055/
[2] https://paste.debian.net/1100057/
[3] https://github.com/ARM-software/lisa/blob/master/lisa/tests/scheduler/eas_behaviour.py
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: juri.lelli@redhat.com
Cc: morten.rasmussen@arm.com
Cc: qais.yousef@arm.com
Cc: qperret@qperret.net
Cc: rjw@rjwysocki.net
Cc: tkjos@google.com
Cc: valentin.schneider@arm.com
Cc: vincent.guittot@linaro.org
Link: https://lkml.kernel.org/r/20190912094404.13802-1-qperret@qperret.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If a new child cgroup is created in the frozen cgroup hierarchy
(one or more of ancestor cgroups is frozen), the CGRP_FREEZE cgroup
flag should be set. Otherwise if a process will be attached to the
child cgroup, it won't become frozen.
The problem can be reproduced with the test_cgfreezer_mkdir test.
This is the output before this patch:
~/test_freezer
ok 1 test_cgfreezer_simple
ok 2 test_cgfreezer_tree
ok 3 test_cgfreezer_forkbomb
Cgroup /sys/fs/cgroup/cg_test_mkdir_A/cg_test_mkdir_B isn't frozen
not ok 4 test_cgfreezer_mkdir
ok 5 test_cgfreezer_rmdir
ok 6 test_cgfreezer_migrate
ok 7 test_cgfreezer_ptrace
ok 8 test_cgfreezer_stopped
ok 9 test_cgfreezer_ptraced
ok 10 test_cgfreezer_vfork
And with this patch:
~/test_freezer
ok 1 test_cgfreezer_simple
ok 2 test_cgfreezer_tree
ok 3 test_cgfreezer_forkbomb
ok 4 test_cgfreezer_mkdir
ok 5 test_cgfreezer_rmdir
ok 6 test_cgfreezer_migrate
ok 7 test_cgfreezer_ptrace
ok 8 test_cgfreezer_stopped
ok 9 test_cgfreezer_ptraced
ok 10 test_cgfreezer_vfork
Reported-by: Mark Crossen <mcrossen@fb.com>
Signed-off-by: Roman Gushchin <guro@fb.com>
Fixes: 76f969e894 ("cgroup: cgroup v2 freezer")
Cc: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Tejun Heo <tj@kernel.org>