When writing a tool for enabling events in the tracing system,
an anomaly was discovered. The top level event "enable" file would
never show "1" when all events were enabled. The system and event
"enable" files worked as expected. The reason was because the top
level event "enable" file included the "ftrace" tracer events,
which are not controlled by the "enable" file and would cause the
output to be wrong. This appears to have been a bug since it was created.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYCGOmxQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qhDFAQDjSrHmSC0ziTck9QMXSUdxLs0gjENr
R0n5WPZ/mRboxQD/aWlw99TnuSwFDzB0gTlwDuDd1Ge2snqqmFCRTscU7gE=
=Pig3
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"Fix output of top level event tracing 'enable' file.
When writing a tool for enabling events in the tracing system, an
anomaly was discovered. The top level event 'enable' file would never
show '1' when all events were enabled.
The system and event 'enable' files worked as expected.
The reason was because the top level event 'enable' file included the
'ftrace' tracer events, which are not controlled by the 'enable' file
and would cause the output to be wrong. This appears to have been a
bug since it was created"
* tag 'trace-v5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Do not count ftrace events in top level enable output
- fix a 32 vs 64-bit padding issue in the new benchmark code
(Barry Song)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmAgE/ALHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYM1pw/+MDNm/z5v8hNUkffBuEygZz36VP2Nupc9pDS8ctFF
0YracQ9SWmFFFzpXKwkMA49QvQR07hBodqBrd+lDsuXtwaSu5lAnZa3H24l3eZGO
UYaNIl3n/yYM0ALOD0OZ6OPmj/RHMJMQSHtEiVRjBusCNIrgZd5EBP0h0my3Wu1D
nRbbZDdoeI9jVCiYfiIh8UasJKGtL32LYiQDQMlUL+IA3Vuh3dCS9CojURuOs4EU
9+U80MKH5TMwHaSQqQXr8bosiDY4IImOhUvlEiy1c4bk0Uof6IOuq/LmucqCzLPw
srUZjY7paz8ntO5M2jIH1UbUmeE9/4YH35xv3DVGYCOu24TohLUO4WP4T9VNUtx7
vQk1weBs4q6IWYkGNdYaomM4514u/59MBd24MdQsnQxxYzPFzSxX7VmK2tFNUHuS
AqgUppT4IqkBqGMMcJmnOM48Xhy+q996cpkWZCtfGKoFclIaoEC+kD3YBNfvm1vs
9upivyD9Ht1h/4jfWFvSKyxKF257AoueYugYVd57pNY6PNIbTf221CW6d57lzPA6
rCpQLUlN6A6QQ9ifa7FtSbClj7PQrbUb0iFcdAerJU8FgyURMbncpNoc+t54Lxyw
zO+tLUn+yZ+6ji7kydsOqs/RIt5chi7cDsv+p+yUqlBdBDyb3UisihAhiYlKtpju
Bu0=
=OqA5
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-5.11-2' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:
"Fix a 32 vs 64-bit padding issue in the new benchmark code (Barry
Song)"
* tag 'dma-mapping-5.11-2' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: benchmark: use u8 for reserved field in uAPI structure
- A fix for MSI activation of PCI endpoints with multiple MSIs.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmAf9E8ACgkQEsHwGGHe
VUrdnBAAshn35KlffL7TayhPnO9FArEHw9GRoRdVOvfLp/NEQsALlEFx3ecaYo5j
Rxoh/+/UdIx3pp/OTWu6uDnAxSnwctNZ50o1MFSiXZlYkoC/vVawauOPS29+W3bL
40fhGcA8RNx6Hi7a0Cgj0uioxmRJpZ0x8NvLzKT5uvkPYnRfLQSf7xqrkhQR9pm/
lJaG11aa/LNXndamYlrC1PllkDmX2UwZ6z0XBP9PJf6tDHlfR8sLHhGJ1E/ACaY6
Vw03DKsXHdiqqa+1bc8XduagHfchL4RCQXe9FS0IymH0a3lrjdOtdqZznTHR8S7N
uwyPyNSdQDOV6Ni+qgc/Icoxfkj0/ZXytD4wkgpLP6ShUnGUaO6PrA5tm7CX/eoj
900eh1p2ZHHB5UP3FtG1ldUV0vn2HVtk7XOwSiPURoUldcBAnvJThQvxFA2wkeZA
BnhTfoWCl2cncyWmUndNJ5kQFObGW7u8V6rU8kHgKNQDUKrD7hOGgOeFcPQ4j4I6
lXqrHKXu3yGCxVNZKt+4Ay5rRVQL8vKzXjDZbHhmLAomxuX4BCOqTCgWVFszX2Nr
3mLHw13tXAYobFDnq24CfPhljgGj7HUIOvadOJtoTG/5Kb4M7hybyqnlHRx8GVMh
fOS3/o6TKhHQbfwMkx1Km3EiKQkDmvhJrzp/fQ6NcxXa8PY65T8=
=v33D
-----END PGP SIGNATURE-----
Merge tag 'irq_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:
- Prevent device managed IRQ allocation helpers from returning IRQ 0
- A fix for MSI activation of PCI endpoints with multiple MSIs
* tag 'irq_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Prevent [devm_]irq_alloc_desc from returning irq 0
genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set
redirection range specification before the API has been made official in 5.11.
- Ensure tasks using the generic syscall code do trap after returning
from a syscall when single-stepping is requested.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmAfz7gACgkQEsHwGGHe
VUp+8hAAlNdy5EJVBVEBT8U6K9ZxHJ2Mnk/uPteD8Sq9o37dndfJ5utrXd52h9om
JFfcsIVO7Ej2i7bKNVzM1FgUeO5UqtwGoZyJxuyT4ma+MZIjFibaem0+ousovJiU
MhB6Vl+jkEBIEJXg2z9btoLTa86SPJM77u+gtJXaeQegcNJENY1jpUHYlV22q90/
b3b3MTVNNbw3bQty5hwWSU9G6PEXa888CJ+lEeuSjMQrVTmQ5i5oSMfYbUMCZIwm
RQGcC/8qlDFfECBP9qMfq6sSoGnJ9uYmcT2Dzo7NiZHvBhtkzoWP4myjVF5g1oc/
H5nUwrG2EXem73xuAdxbPe1nqVoU2byd658GjZ0St/Zcb5usanNEOkgJa3f+O3X5
eRT5u9PFzhaTo2UDcLo02DlEqi/4Ed7bXJ2gxryHHxVi91Dr4G1uR+PL04MXJ6r8
8YCf10c5qOrQ8u5DJ7/yq7uZkNpecdwzvEpQWkR7SmEjY0hNo2yt0Lt8JcD6eFcv
Jx27bETAseUTrynnJJmyG7y+HvDds5M+t1gj8NPPs7vA/XkdEFRUdKoDGCJE+p6+
y+cvRemx5p9YTiiTIEaiG187jR3M460DOvmT54xHcIWEWoJz3WfcRfXUqkx4xWOB
TdJW5qTUnIkPr8XvHVcJUl6o9HIODclJCgZ7F7ceUP8XF2s2ATw=
=l5j7
-----END PGP SIGNATURE-----
Merge tag 'core_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull syscall entry fixes from Borislav Petkov:
- For syscall user dispatch, separate prctl operation from syscall
redirection range specification before the API has been made official
in 5.11.
- Ensure tasks using the generic syscall code do trap after returning
from a syscall when single-stepping is requested.
* tag 'core_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
entry: Use different define for selector variable in SUD
entry: Ensure trap after single-step on system call return
and trigger suspend assertion checks in the i2c subsystem.
- Correct a previous RTC validation change to check only bit 6 in register D
because some Intel machines use bits 0-5.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmAfxt8ACgkQEsHwGGHe
VUqGnQ//W1gu/MyIGauA2Ds6WHtvgguyOLUfjQbykSXXHol9aygcdray6Zhca/+D
6bf7gudkIQVYy6A38dD6tH1/2brHelY9SsxJ/MOhKJ2zh3wistdV4tJsH682Dp8G
9BgmLYkc/QRuSMh04GKL+UoXxdv3IsDy6q2dZfMoQj6cDwx65JL2qdIp4HvAYZ+B
FwF8BJxakLGr4ZHRurYQaT/+OKwc6rrF1/ix8zGl6sN8BATZTbcn0SVHWiiaoNlj
TVXDLoVUHWw1X3xWdLwZlhD0SPsc1f3nO8Y+q/86zbf0r9YUJVq5dhuAqRRcAl2L
CQDDmOUJLmtf6S2ZOcbQIRbC0gjQulMGoEOVZclYa8x1eeUywBwyHSZwVVzhvyVC
jvtXu9yW7Y0kAbKbQnL42hwVJra+0fIwIG1ay3h2kZzlBKazxSom2JozFuhcQ/6M
gNbHk8QZ4FPDNVl/gN6hxDtKcVv6ObvZGZnNbr6xjRCUSJ57O/kcmq/vkwYeRof/
vS2SPaY6OifrBYQVuH10CxpE4HJBA309eQ1vdwHtfq5+IcJE50XBNNm5VG1xu5h3
RQQINsQXg8+mERT1Jkpyy/JTTnBje2Hp0qxyC6FYRwDsBjNv8HjrhZT/H2rTWioG
a3D9BZ0tcnJK/pu47FlA9gKQ2WMrnSJ7K2nHjHam5su0iIZTRk4=
=UHNm
-----END PGP SIGNATURE-----
Merge tag 'timers_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Borislav Petkov:
"Two more timers-related fixes for v5.11:
- Use a freezable workqueue for RTC sync because the sync can happen
at any time and trigger suspend assertion checks in the i2c
subsystem.
- Correct a previous RTC validation change to check only bit 6 in
register D because some Intel machines use bits 0-5"
* tag 'timers_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ntp: Use freezable workqueue for RTC synchronization
rtc: mc146818: Dont test for bit 0-5 in Register D
Michael Kerrisk suggested that, from an API perspective, it is a bad
idea to share the PR_SYS_DISPATCH_ defines between the prctl operation
and the selector variable.
Therefore, define two new constants to be used by SUD's selector variable
and update the corresponding documentation and test cases.
While this changes the API syscall user dispatch has never been part of a
Linux release, it will show up for the first time in 5.11.
Suggested-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210205184321.2062251-1-krisman@collabora.com
Commit 2991552447 ("entry: Drop usage of TIF flags in the generic syscall
code") introduced a bug on architectures using the generic syscall entry
code, in which processes stopped by PTRACE_SYSCALL do not trap on syscall
return after receiving a TIF_SINGLESTEP.
The reason is that the meaning of TIF_SINGLESTEP flag is overloaded to
cause the trap after a system call is executed, but since the above commit,
the syscall call handler only checks for the SYSCALL_WORK flags on the exit
work.
Split the meaning of TIF_SINGLESTEP such that it only means single-step
mode, and create a new type of SYSCALL_WORK to request a trap immediately
after a syscall in single-step mode. In the current implementation, the
SYSCALL_WORK flag shadows the TIF_SINGLESTEP flag for simplicity.
Update x86 to flip this bit when a tracer enables single stepping.
Fixes: 2991552447 ("entry: Drop usage of TIF flags in the generic syscall code")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kyle Huey <me@kylehuey.com>
Link: https://lore.kernel.org/r/87h7mtc9pr.fsf_-_@collabora.com
The file /sys/kernel/tracing/events/enable is used to enable all events by
echoing in "1", or disabling all events when echoing in "0". To know if all
events are enabled, disabled, or some are enabled but not all of them,
cating the file should show either "1" (all enabled), "0" (all disabled), or
"X" (some enabled but not all of them). This works the same as the "enable"
files in the individule system directories (like tracing/events/sched/enable).
But when all events are enabled, the top level "enable" file shows "X". The
reason is that its checking the "ftrace" events, which are special events
that only exist for their format files. These include the format for the
function tracer events, that are enabled when the function tracer is
enabled, but not by the "enable" file. The check includes these events,
which will always be disabled, and even though all true events are enabled,
the top level "enable" file will show "X" instead of "1".
To fix this, have the check test the event's flags to see if it has the
"IGNORE_ENABLE" flag set, and if so, not test it.
Cc: stable@vger.kernel.org
Fixes: 553552ce17 ("tracing: Combine event filter_active and enable into single flags field")
Reported-by: "Yordan Karadzhov (VMware)" <y.karadz@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
On ARCH=um, loading a module doesn't result in its constructors getting
called, which breaks module gcov since the debugfs files are never
registered. On the other hand, in-kernel constructors have already been
called by the dynamic linker, so we can't call them again.
Get out of this conundrum by allowing CONFIG_CONSTRUCTORS to be
selected, but avoiding the in-kernel constructor calls.
Also remove the "if !UML" from GCOV selecting CONSTRUCTORS now, since we
really do want CONSTRUCTORS, just not kernel binary ones.
Link: https://lkml.kernel.org/r/20210120172041.c246a2cac2fb.I1358f584b76f1898373adfed77f4462c8705b736@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The bug fixed by commit e3fab2f3de ("ntp: Fix RTC synchronization on
32-bit platforms") revealed an underlying issue: RTC synchronization may
happen anytime, even while the system is partially suspended.
On systems where the RTC is connected to an I2C bus, the I2C bus controller
may already or still be suspended, triggering a WARNING during suspend or
resume from s2ram:
WARNING: CPU: 0 PID: 124 at drivers/i2c/i2c-core.h:54 __i2c_transfer+0x634/0x680
i2c i2c-6: Transfer while suspended
[...]
Workqueue: events_power_efficient sync_hw_clock
[...]
(__i2c_transfer)
(i2c_transfer)
(regmap_i2c_read)
...
(da9063_rtc_set_time)
(rtc_set_time)
(sync_hw_clock)
(process_one_work)
Fix this race condition by using the freezable instead of the normal
power-efficient workqueue.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20210125143039.1051912-1-geert+renesas@glider.be
The original code put five u32 before a u64 expansion[10] array. Five is
odd, this will cause trouble in the extension of the structure by adding
new features. This patch moves to use u8 for reserved field to avoid
future alignment risk.
Meanwhile, it also clears the memory of struct map_benchmark in tools,
otherwise, if users use old version to run on newer kernel, the random
expansion value will cause side effect on newer kernel.
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
- Initialize tracing-graph-pause at task creation, not start of
function tracing. Causes the pause counter to be corrupted.
- Set "pause-on-trace" for latency tracers as that option breaks
their output (regression).
- Fix the wrong error return for setting kretprobes on future
modules (before they are loaded).
- Fix re-registering the same kretprobe.
- Add missing value check for added RCU variable reload.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYBnNzhQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qlpoAP4hU98lfAButfYTuuS7Id+/r21bB4lG
9HHB72wkpEfs8AEAlTDC5c3eXhnXXJC4a8b4sGv1wvBiHL2ZoW/yQ/4oZgA=
=hpY/
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
- Initialize tracing-graph-pause at task creation, not start of
function tracing, to avoid corrupting the pause counter.
- Set "pause-on-trace" for latency tracers as that option breaks their
output (regression).
- Fix the wrong error return for setting kretprobes on future modules
(before they are loaded).
- Fix re-registering the same kretprobe.
- Add missing value check for added RCU variable reload.
* tag 'trace-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracepoint: Fix race between tracing and removing tracepoint
kretprobe: Avoid re-registration of the same kretprobe earlier
tracing/kprobe: Fix to support kretprobe events on unloaded modules
tracing: Use pause-on-trace with the latency tracers
fgraph: Initialize tracing_graph_pause at task creation
- fix a kernel crash in the new dma-mapping benchmark test (Barry Song)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmAZGkkLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYNF0A//WjFHsPCMbbhlOlCMcRD/I1+JLQYchjlWp48wAzxq
K3fmGnLlsvd3lCyUQzLLcUMSs+NsaTlzNNtH+MSNfGvX3x/8Mz9F57AgJ7C2xlaD
XXbob5SKAqls6UFw6sBhlNbUe/l3Tup7LgqyCqQGRfpftycO7Vk70oHfFir0xqeB
IX2s2s9UK+iePRtqfhyOylipIPXG3A3TnDJ+T3x5wsw6m4ejr8TVUVNsA1Xg0mha
xsMVELyPwp3pEYp0+LNZsvtRC6uv7MeNf11mqmtk9VOZtrnS1VsD+ZPeNXzoCP3n
5iipcFGeiz8ac2bdldIjwYa4wyIBZR1kOOQ+1+gq42/Hs/Eu9N3aQydhgrzw8ZnY
MxUHbHazhpB2qt2lxyMjODVRKFdEt2FrkP7a7d3guw2Il4o2f14g9MZS72c5H5jk
Vg90VSW8UQr51MKamZkgLeIgFaqVTOkgXO5h7e+dfD/XMgm4Vt3qsjcBSQfTcCrj
4efmJwUvyv8x13HXoSvistKXG6kTH+rBtbxD6RvzE2C9dasb5L3mYIj+loQJqU7+
u/+X9ZUVHTvXlH7j7Fc9vBgCUf0VlZJ8N3LXl5zN7zRP40vcvudlhyNj1PWhiUlu
G5RBhuyTm3jd+NqCYE7XCqf+dvA7BGgkW2LZM/qAhlZgmJaWLwlJwAe0XA0KR9uR
U5w=
=l+IX
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-5.11-1' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:
"Fix a kernel crash in the new dma-mapping benchmark test (Barry Song)"
* tag 'dma-mapping-5.11-1' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: benchmark: fix kernel crash when dma_map_single fails
trees.
Current release - regressions:
- ip_tunnel: fix mtu calculation
- mlx5: fix function calculation for page trees
Previous releases - regressions:
- vsock: fix the race conditions in multi-transport support
- neighbour: prevent a dead entry from updating gc_list
- dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add
Previous releases - always broken:
- bpf, cgroup: two copy_{from,to}_user() warn_on_once splats for BPF
cgroup getsockopt infra when user space is trying
to race against optlen, from Loris Reiff.
- bpf: add missing fput() in BPF inode storage map update helper
- udp: ipv4: manipulate network header of NATed UDP GRO fraglist
- mac80211: fix station rate table updates on assoc
- r8169: work around RTL8125 UDP HW bug
- igc: report speed and duplex as unknown when device is runtime
suspended
- rxrpc: fix deadlock around release of dst cached on udp tunnel
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmAZjwQACgkQMUZtbf5S
IruLbQ//Yg9+xEnqhDuOJZtYHB0rsJjLlKmtvgOsBr8BaTcUEPoPoqUPm+EMvCHb
o1fFa1qIrbS5luVEofu9hNX7DGXwvgawaMW2TympJhqLZQqjazCMB/st99LphhJw
RvaZI8aDOikosT4c+I0vm83jDQETonrjziIcPfHHPjn/Q+amGRRRXiTSQnRF/MlU
oARCG+U3kHsHBDUPNSCtSjKXshoZPjFb/pD7fQAlzzm7CssvbPhNWbducueyP2Fb
XW4RwJu9QBBH2JS6uZJ1Y6LVoRzusmE9dUam3KhkiL/CHs72lWPsc+Rn5gbBPvc5
Y4T4h61Xti1O4ULKdqhGceror6XY+4Qb1VlHWWztOhIo00wIAv3IHbTup/4o0HBr
j84MtcyOl/qxSFXjunPJkbWJngXikrkIMS0Bl6ZcPAejYM9wN6vCgbvFCHbEg1Rx
cWFnYyS9FCLduaxHSizv050tWhknOdX+zHK3fOtlW0yWnreJAB8Hoc21Zm7YKvg0
GxxcGK6AhqJ6s2ixVDv7MyJrltJ/hOJQb+T3HgHFuY2BYUs8F2r/HoHU/u4uCl76
RdBzbC/sLnBpMHf6r1rHTnGPsapoJOOYWnej71l425vX1qr5xnmxVNNB6HReObNv
+/jPoRYa5BVsVt2LmDcuH1O32pXJPWKVBR7Yfa6Bn2yzhcbECTc=
=ZByM
-----END PGP SIGNATURE-----
Merge tag 'net-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.11-rc7, including fixes from bpf and mac80211
trees.
Current release - regressions:
- ip_tunnel: fix mtu calculation
- mlx5: fix function calculation for page trees
Previous releases - regressions:
- vsock: fix the race conditions in multi-transport support
- neighbour: prevent a dead entry from updating gc_list
- dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add
Previous releases - always broken:
- bpf, cgroup: two copy_{from,to}_user() warn_on_once splats for BPF
cgroup getsockopt infra when user space is trying to race against
optlen, from Loris Reiff.
- bpf: add missing fput() in BPF inode storage map update helper
- udp: ipv4: manipulate network header of NATed UDP GRO fraglist
- mac80211: fix station rate table updates on assoc
- r8169: work around RTL8125 UDP HW bug
- igc: report speed and duplex as unknown when device is runtime
suspended
- rxrpc: fix deadlock around release of dst cached on udp tunnel"
* tag 'net-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
net: hsr: align sup_multicast_addr in struct hsr_priv to u16 boundary
net: ipa: fix two format specifier errors
net: ipa: use the right accessor in ipa_endpoint_status_skip()
net: ipa: be explicit about endianness
net: ipa: add a missing __iomem attribute
net: ipa: pass correct dma_handle to dma_free_coherent()
r8169: fix WoL on shutdown if CONFIG_DEBUG_SHIRQ is set
net/rds: restrict iovecs length for RDS_CMSG_RDMA_ARGS
net: mvpp2: TCAM entry enable should be written after SRAM data
net: lapb: Copy the skb before sending a packet
net/mlx5e: Release skb in case of failure in tc update skb
net/mlx5e: Update max_opened_tc also when channels are closed
net/mlx5: Fix leak upon failure of rule creation
net/mlx5: Fix function calculation for page trees
docs: networking: swap words in icmp_errors_use_inbound_ifaddr doc
udp: ipv4: manipulate network header of NATed UDP GRO fraglist
net: ip_tunnel: fix mtu calculation
vsock: fix the race conditions in multi-transport support
net: sched: replaced invalid qdisc tree flush helper in qdisc_replace
ibmvnic: device remove has higher precedence over reset
...
condition wrong when moving SYSCALL_EMU away from TIF flags.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmAWh7cTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYof8zEAC+Qm4Myg9SiHWr8EiZa4+TqmUxTge8
oV36+Je18y7vHFElGBByCwfEHvsLO/mi3kgKn2lBZsSyiSiUs15p0S5M/7A7HmbW
mcFmHioECe7VUL5Ml1Y6mhyhA9o3QdAv3PAHwNBbvUwbJSrCS7rld94T4xeZiaBh
y00qFikxKTbblgSZpVKDG7wUKYHVQwJMqYVw6I6Y4iB+QfM1EGQxWFzV2td3H/UE
A7g8Ay8QOXxd/agnwZaOTHrQy2Rsnp3n9sD5Y6hVpZLT3FulxsaSftK/ngn9uTou
bFYagpXxJRPt6TylK2Y8Nn2Y1ZcLoq/bj7XKSN0MpgcM+y3/vV9GUOpyFmDTug2F
P5onx7S6vKxG3ews+WlTxHYaSRWbO0OHWLTM+FHbW7ben/DjWNVNBa4L1u3w0Skq
igyqmCzQURjkDbsCaMsdKPeG0KJOlCqTNj4aImskNGv5OUt77rziGg42jI07MLYV
mE9+e/cw5P1FVoVaaMwplUvOmGaG8647IdapDo0UctHm7Y+GC81bwry/bbW/oesi
7acnmCrO/sILwzE1H+YQnxofVlTXW/pCx3MUfNUEyJOUuI7orobfd1MOJjVUKj++
Zm5bRy8h0RZ5q9Xy2GwCh0mSRihbtQzBXdbwIDpltFYUBF6cHh1ryqKAHBE55JiB
IQ4J+3OK1f8dNQ==
=WMj1
-----END PGP SIGNATURE-----
Merge tag 'core-urgent-2021-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull single stepping fix from Thomas Gleixner:
"A single fix for the single step reporting regression caused by
getting the condition wrong when moving SYSCALL_EMU away from TIF
flags"
[ There's apparently another problem too, fix pending ]
* tag 'core-urgent-2021-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
entry: Unbreak single step reporting behaviour
When MSI_FLAG_ACTIVATE_EARLY is set (which is the case for PCI),
__msi_domain_alloc_irqs() performs the activation of the interrupt (which
in the case of PCI results in the endpoint being programmed) as soon as the
interrupt is allocated.
But it appears that this is only done for the first vector, introducing an
inconsistent behaviour for PCI Multi-MSI.
Fix it by iterating over the number of vectors allocated to each MSI
descriptor. This is easily achieved by introducing a new
"for_each_msi_vector" iterator, together with a tiny bit of refactoring.
Fixes: f3b0946d62 ("genirq/msi: Make sure PCI MSIs are activated early")
Reported-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210123122759.1781359-1-maz@kernel.org
Our system encountered a re-init error when re-registering same kretprobe,
where the kretprobe_instance in rp->free_instances is illegally accessed
after re-init.
Implementation to avoid re-registration has been introduced for kprobe
before, but lags for register_kretprobe(). We must check if kprobe has
been re-registered before re-initializing kretprobe, otherwise it will
destroy the data struct of kretprobe registered, which can lead to memory
leak, system crash, also some unexpected behaviors.
We use check_kprobe_rereg() to check if kprobe has been re-registered
before running register_kretprobe()'s body, for giving a warning message
and terminate registration process.
Link: https://lkml.kernel.org/r/20210128124427.2031088-1-bobo.shaobowang@huawei.com
Cc: stable@vger.kernel.org
Fixes: 1f0ab40976 ("kprobes: Prevent re-registration of the same kprobe")
[ The above commit should have been done for kretprobes too ]
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@linux.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
- Fix a deadlock caused by attempting to acquire the same mutex
twice in a row in the "kexec jump" code (Baoquan He).
- Modify the hibernation image saving code to flush the unwritten
data to the swap storage later so as to avoid failing to write the
image signature which is possible in some cases (Laurent Badel).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmAUNj8SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxrEcP/2KQPLD4PkHMw8qr2h2m9Dp6Lc5bl+C2
bEL/IeDNojtndF7z9q3Fp7EOpffJJV1q9zX06HEKZF4d59fa9gE5oGt9bRcpRbpf
74cDRTLCNr4UpigzTJux2wfgy9XZ8mWuRzIQUTOHgn17YK2tKteTFInxsCqo45+A
i6zj0EYM/0UVGX48ZPf/JS6QqzI5Zh73dOuz/PjqTsmKBKQl3X1mJRGyLKeBhb6I
MTaBR622PyTDCXzksLxApk4k1Oh7+f6TRUMmykA8KdIwRZCfdp23AxzT8EWaRXZD
BNUwCBCKLSiQFtuySvXLgeMAf2yPk0B+0CHFAriy8YiuGqJSN4Q4/PtnDl7TS61J
BieKAJPbNClvNRc3j8XxyWHR1lcNabxsoE4l4PKXVrrsHu7qrylJV1+d/ZfeL5o+
k0izFUf5PCECBo0nIA1sWWWJU0ro5YQ3mkTB6Yk0jTt4PK//UaZjrFhpbebtPWnS
M06El03mzebRDl87K6L5/kDAty8yx+5Y1L3Y/KSk3X4LTsySnwsIbPJh1ZUL9HLe
FXJRa7zUYX0CiwXT65oWhnrbaat02BA/CrkFVmkFPA/+izhgN580TcDx7ljC3Hyt
1WrsWyvmmmPYrTDqB6DirrwwAYqF9XO53lqf42CFSzdu+fjoDHwDVUyEOMQMO50p
HuLwvCyGb7Mm
=jh7b
-----END PGP SIGNATURE-----
Merge tag 'pm-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a deadlock in the 'kexec jump' code and address a possible
hibernation image creation issue.
Specifics:
- Fix a deadlock caused by attempting to acquire the same mutex twice
in a row in the "kexec jump" code (Baoquan He)
- Modify the hibernation image saving code to flush the unwritten
data to the swap storage later so as to avoid failing to write the
image signature which is possible in some cases (Laurent Badel)"
* tag 'pm-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: hibernate: flush swap writer after marking
kernel: kexec: remove the lock operation of system_transition_mutex
Fix kprobe_on_func_entry() returns error code instead of false so that
register_kretprobe() can return an appropriate error code.
append_trace_kprobe() expects the kprobe registration returns -ENOENT
when the target symbol is not found, and it checks whether the target
module is unloaded or not. If the target module doesn't exist, it
defers to probe the target symbol until the module is loaded.
However, since register_kretprobe() returns -EINVAL instead of -ENOENT
in that case, it always fail on putting the kretprobe event on unloaded
modules. e.g.
Kprobe event:
/sys/kernel/debug/tracing # echo p xfs:xfs_end_io >> kprobe_events
[ 16.515574] trace_kprobe: This probe might be able to register after target module is loaded. Continue.
Kretprobe event: (p -> r)
/sys/kernel/debug/tracing # echo r xfs:xfs_end_io >> kprobe_events
sh: write error: Invalid argument
/sys/kernel/debug/tracing # cat error_log
[ 41.122514] trace_kprobe: error: Failed to register probe event
Command: r xfs:xfs_end_io
^
To fix this bug, change kprobe_on_func_entry() to detect symbol lookup
failure and return -ENOENT in that case. Otherwise it returns -EINVAL
or 0 (succeeded, given address is on the entry).
Link: https://lkml.kernel.org/r/161176187132.1067016.8118042342894378981.stgit@devnote2
Cc: stable@vger.kernel.org
Fixes: 59158ec4ae ("tracing/kprobes: Check the probe on unloaded module correctly")
Reported-by: Jianlin Lv <Jianlin.Lv@arm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Eaerlier, tracing was disabled when reading the trace file. This behavior
was changed with:
commit 06e0a548ba ("tracing: Do not disable tracing when reading the
trace file").
This doesn't seem to work with the latency tracers.
The above mentioned commit dit not only change the behavior but also added
an option to emulate the old behavior. The idea with this patch is to
enable this pause-on-trace option when the latency tracers are used.
Link: https://lkml.kernel.org/r/20210119164344.37500-2-Viktor.Rosendahl@bmw.de
Cc: stable@vger.kernel.org
Fixes: 06e0a548ba ("tracing: Do not disable tracing when reading the trace file")
Signed-off-by: Viktor Rosendahl <Viktor.Rosendahl@bmw.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
On some archs, the idle task can call into cpu_suspend(). The cpu_suspend()
will disable or pause function graph tracing, as there's some paths in
bringing down the CPU that can have issues with its return address being
modified. The task_struct structure has a "tracing_graph_pause" atomic
counter, that when set to something other than zero, the function graph
tracer will not modify the return address.
The problem is that the tracing_graph_pause counter is initialized when the
function graph tracer is enabled. This can corrupt the counter for the idle
task if it is suspended in these architectures.
CPU 1 CPU 2
----- -----
do_idle()
cpu_suspend()
pause_graph_tracing()
task_struct->tracing_graph_pause++ (0 -> 1)
start_graph_tracing()
for_each_online_cpu(cpu) {
ftrace_graph_init_idle_task(cpu)
task-struct->tracing_graph_pause = 0 (1 -> 0)
unpause_graph_tracing()
task_struct->tracing_graph_pause-- (0 -> -1)
The above should have gone from 1 to zero, and enabled function graph
tracing again. But instead, it is set to -1, which keeps it disabled.
There's no reason that the field tracing_graph_pause on the task_struct can
not be initialized at boot up.
Cc: stable@vger.kernel.org
Fixes: 380c4b1411 ("tracing/function-graph-tracer: append the tracing_graph_flag")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211339
Reported-by: pierre.gondois@arm.com
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Daniel Borkmann says:
====================
pull-request: bpf 2021-01-29
1) Fix two copy_{from,to}_user() warn_on_once splats for BPF cgroup getsockopt
infra when user space is trying to race against optlen, from Loris Reiff.
2) Fix a missing fput() in BPF inode storage map update helper, from Pan Bian.
3) Fix a build error on unresolved symbols on disabled networking / keys LSM
hooks, from Mikko Ylinen.
4) Fix preload BPF prog build when the output directory from make points to a
relative path, from Quentin Monnet.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf, preload: Fix build when $(O) points to a relative path
bpf: Drop disabled LSM hooks from the sleepable set
bpf, inode_storage: Put file handler if no storage was found
bpf, cgroup: Fix problematic bounds check
bpf, cgroup: Fix optlen WARN_ON_ONCE toctou
====================
Link: https://lore.kernel.org/r/20210129001556.6648-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The move of TIF_SYSCALL_EMU to SYSCALL_WORK_SYSCALL_EMU broke single step
reporting. The original code reported the single step when TIF_SINGLESTEP
was set and TIF_SYSCALL_EMU was not set. The SYSCALL_WORK conversion got
the logic wrong and now the reporting only happens when both bits are set.
Restore the original behaviour.
[ tglx: Massaged changelog and dropped the pointless double negation ]
Fixes: 64eb35f701 ("ptrace: Migrate TIF_SYSCALL_EMU to use SYSCALL_WORK flag")
Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Link: https://lore.kernel.org/r/877do3gaq9.fsf@m5Zedd9JOGzJrf0
Building the kernel with CONFIG_BPF_PRELOAD, and by providing a relative
path for the output directory, may fail with the following error:
$ make O=build bindeb-pkg
...
/.../linux/tools/scripts/Makefile.include:5: *** O=build does not exist. Stop.
make[7]: *** [/.../linux/kernel/bpf/preload/Makefile:9: kernel/bpf/preload/libbpf.a] Error 2
make[6]: *** [/.../linux/scripts/Makefile.build:500: kernel/bpf/preload] Error 2
make[5]: *** [/.../linux/scripts/Makefile.build:500: kernel/bpf] Error 2
make[4]: *** [/.../linux/Makefile:1799: kernel] Error 2
make[4]: *** Waiting for unfinished jobs....
In the case above, for the "bindeb-pkg" target, the error is produced by
the "dummy" check in Makefile.include, called from libbpf's Makefile.
This check changes directory to $(PWD) before checking for the existence
of $(O). But at this step we have $(PWD) pointing to "/.../linux/build",
and $(O) pointing to "build". So the Makefile.include tries in fact to
assert the existence of a directory named "/.../linux/build/build",
which does not exist.
Note that the error does not occur for all make targets and
architectures combinations. This was observed on x86 for "bindeb-pkg",
or for a regular build for UML [0].
Here are some details. The root Makefile recursively calls itself once,
after changing directory to $(O). The content for the variable $(PWD) is
preserved across recursive calls to make, so it is unchanged at this
step. For "bindeb-pkg", $(PWD) is eventually updated because the target
writes a new Makefile (as debian/rules) and calls it indirectly through
dpkg-buildpackage. This script does not preserve $(PWD), which is reset
to the current working directory when the target in debian/rules is
called.
Although not investigated, it seems likely that something similar causes
UML to change its value for $(PWD).
Non-trivial fixes could be to remove the use of $(PWD) from the "dummy"
check, or to make sure that $(PWD) and $(O) are preserved or updated to
always play well and form a valid $(PWD)/$(O) path across the different
targets and architectures. Instead, we take a simpler approach and just
update $(O) when calling libbpf's Makefile, so it points to an absolute
path which should always resolve for the "dummy" check run (through
includes) by that Makefile.
David Gow previously posted a slightly different version of this patch
as a RFC [0], two months ago or so.
[0] https://lore.kernel.org/bpf/20201119085022.3606135-1-davidgow@google.com/t/#u
Fixes: d71fa5c976 ("bpf: Add kernel module with user mode driver that populates bpffs.")
Reported-by: David Gow <davidgow@google.com>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/bpf/20210126161320.24561-1-quentin@isovalent.com
Some networking and keys LSM hooks are conditionally enabled
and when building the new sleepable BPF LSM hooks with those
LSM hooks disabled, the following build error occurs:
BTFIDS vmlinux
FAILED unresolved symbol bpf_lsm_socket_socketpair
To fix the error, conditionally add the relevant networking/keys
LSM hooks to the sleepable set.
Fixes: 423f16108c ("bpf: Augment the set of sleepable LSM hooks")
Signed-off-by: Mikko Ylinen <mikko.ylinen@linux.intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/bpf/20210125063936.89365-1-mikko.ylinen@linux.intel.com
fixup_pi_state_owner() tries to ensure that the state of the rtmutex,
pi_state and the user space value related to the PI futex are consistent
before returning to user space. In case that the user space value update
faults and the fault cannot be resolved by faulting the page in via
fault_in_user_writeable() the function returns with -EFAULT and leaves
the rtmutex and pi_state owner state inconsistent.
A subsequent futex_unlock_pi() operates on the inconsistent pi_state and
releases the rtmutex despite not owning it which can corrupt the RB tree of
the rtmutex and cause a subsequent kernel stack use after free.
It was suggested to loop forever in fixup_pi_state_owner() if the fault
cannot be resolved, but that results in runaway tasks which is especially
undesired when the problem happens due to a programming error and not due
to malice.
As the user space value cannot be fixed up, the proper solution is to make
the rtmutex and the pi_state consistent so both have the same owner. This
leaves the user space value out of sync. Any subsequent operation on the
futex will fail because the 10th rule of PI futexes (pi_state owner and
user space value are consistent) has been violated.
As a consequence this removes the inept attempts of 'fixing' the situation
in case that the current task owns the rtmutex when returning with an
unresolvable fault by unlocking the rtmutex which left pi_state::owner and
rtmutex::owner out of sync in a different and only slightly less dangerous
way.
Fixes: 1b7558e457 ("futexes: fix fault handling in futex_lock_pi")
Reported-by: gzobqq@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Too many gotos already and an upcoming fix would make it even more
unreadable.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
No point in open coding it. This way it gains the extra sanity checks.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Nothing uses the argument. Remove it as preparation to use
pi_state_update_owner().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Updating pi_state::owner is done at several places with the same
code. Provide a function for it and use that at the obvious places.
This is also a preparation for a bug fix to avoid yet another copy of the
same code or alternatively introducing a completely unpenetratable mess of
gotos.
Originally-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
If that unexpected case of inconsistent arguments ever happens then the
futex state is left completely inconsistent and the printk is not really
helpful. Replace it with a warning and make the state consistent.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
In case that futex_lock_pi() was aborted by a signal or a timeout and the
task returned without acquiring the rtmutex, but is the designated owner of
the futex due to a concurrent futex_unlock_pi() fixup_owner() is invoked to
establish consistent state. In that case it invokes fixup_pi_state_owner()
which in turn tries to acquire the rtmutex again. If that succeeds then it
does not propagate this success to fixup_owner() and futex_lock_pi()
returns -EINTR or -ETIMEOUT despite having the futex locked.
Return success from fixup_pi_state_owner() in all cases where the current
task owns the rtmutex and therefore the futex and propagate it correctly
through fixup_owner(). Fixup the other callsite which does not expect a
positive return value.
Fixes: c1e2f0eaf0 ("futex: Avoid violating the 10th rule of futex")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmAOyaUACgkQUqAMR0iA
lPJmKw//cBHluHINQnKgGMfbdjyJilIm2Atd5jm95ysmblXjgAzbB8AJqnKFqhmH
hC+YiTCa7t84tw2Vh2NpTnosRFjdq8jtjjFpGlqLuLoREprSSRRmKyOob15XIm5d
5aio1Vh9ks+VoWwHUFfc6zyJuTNGfwRPWgIxA9KdQvt4C8+M3IiQjT3UcdXYzWOB
Of74rBLZzlImRasL9SFW06oLxZN02CSsA3I49D37WezHFRJ0LlptVPAOVbmTsPXa
WnX7Q8xOe70WsFH23so7+4H4mieSpkBqNz/+WHFBAekohae0BW5nnK84Nl9+uuUh
3FF28fXZW3xyGd5JmmBD70tVCV/nUKTSOrkc6Eo0dcrwpwxq+R1wPAojgSUgru8L
1em015/kRC1DRejCr1mdT3muySQF9gQ7/nEa+7cpo4bqmUb36TNN1pS9OjJMg4/7
4ZgoKB2lQ1UfueDqiPMgDz/JJ91jRcUCkB86MotyaOGO7FI8oaRSuSp8rvorQ13n
5LEHV8sX1ENIcu9jTcqN4oxEBg8dXVkxHwLFqWB283Rugk21L0f1IYbUbRPH4RF1
L5AVoaWBxA6ftXDI/OywszrWtr93sDqcmdebmqT0CTXOiHtTqt2cad8InjWHDh4h
wtIdV22ak7glMLR0DdMAPKjggjzHsXJLMy9SsowEKVzDN52jVcs=
=pBad
-----END PGP SIGNATURE-----
Merge tag 'printk-for-5.11-urgent-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk fix from Petr Mladek:
"The fix of a potential buffer overflow in 5.11-rc5 introduced another
one. The trailing '\0' might be written up to the message "len" past
the buffer. Fortunately, it is not that easy to hit.
Most readers use 1kB buffers for a single message. Typical messages
fit into the temporary buffer with enough reserve.
Also readers do not rely on the '\0'. It is related to the previous
fix. Some readers required the space for the trailing '\0'. We decided
to write it there to avoid such regressions in the future.
The most realistic victims are dumpers using kmsg_dump_get_buffer().
They are filling the entire buffer with as many messages as possible.
They are typically used when handling panic()"
* tag 'printk-for-5.11-urgent-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk: fix string termination for record_print_text()
Flush the swap writer after, not before, marking the files, to ensure the
signature is properly written.
Fixes: 6f612af578 ("PM / Hibernate: Group swap ops")
Signed-off-by: Laurent Badel <laurentbadel@eaton.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Function kernel_kexec() is called with lock system_transition_mutex
held in reboot system call. While inside kernel_kexec(), it will
acquire system_transition_mutex agin. This will lead to dead lock.
The dead lock should be easily triggered, it hasn't caused any
failure report just because the feature 'kexec jump' is almost not
used by anyone as far as I know. An inquiry can be made about who
is using 'kexec jump' and where it's used. Before that, let's simply
remove the lock operation inside CONFIG_KEXEC_JUMP ifdeffery scope.
Fixes: 55f2503c3b ("PM / reboot: Eliminate race between reboot and suspend")
Signed-off-by: Baoquan He <bhe@redhat.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Pingfan Liu <kernelfans@gmail.com>
Cc: 4.19+ <stable@vger.kernel.org> # 4.19+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit f0e386ee0c ("printk: fix buffer overflow potential for
print_text()") added string termination in record_print_text().
However it used the wrong base pointer for adding the terminator.
This led to a 0-byte being written somewhere beyond the buffer.
Use the correct base pointer when adding the terminator.
Fixes: f0e386ee0c ("printk: fix buffer overflow potential for print_text()")
Reported-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210124202728.4718-1-john.ogness@linutronix.de
- Fix to not lose IPIs on bcm2836.
- Fix for a bogus marking of ITS devices as shared due to unitialized
stack variable.
- Clear a phantom interrupt on qcom-pdc to unblock suspend.
- Small cleanups, warning and build fixes.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmANZ4kACgkQEsHwGGHe
VUohthAAtkwq9kXmn1H4hQuay8Zly7L5D7dn6EUrVl+347cZi+DZmjYRv0Mu/5Ad
L7KhLHDJ6XqGW4qsPxGkYJPgaWiXyo+C0OQXLdwFcrT0kOvjbOYoFCAarRkqIdQb
5jr6J6t1MWb5Ktb8xTebUXv5kn91rvYj8ArcdlaVB8Kdp6jB8Irq9AROjZMK3Seh
AHC9jWnf3tGPUXRhNk0rpBeDzdkI6G82KjO2LdZMT2SC9WLX2vH710wBeY8+Um42
5rseckupLl7KJb02WpK0IDi/6bidWc9K3NsOh9QqjQme4O9a0459q7f6SoR8/PYi
FWwB95NwGWahnVk26qfLwiV/Sd3RKM72uQxICu6XaI0+asrbOflIuK1NbEAbpf+S
ZgtVqV3c8i09ij3NsJIWsZ5vjwtFyRUU26JgTxzWKzRf4FftSW0pHWr/VvMflXhK
l0E68kvB4KmV7Bd5r3gavctWVFeMrZhTGAHxiiG91oISUpp+X634CCa4NyMh+Ryr
6V+LhIADlzNOuyzHsG8xpsUmtSwutcsHSFQtqUkcXpG/1o2OpSPmhYQRySD4lz+4
dpo3CFtvrBUWRAHo2bsaLorqU0LlBFc+h9sTUlYMLgyRKkOQtde1UN1lTIu7z0Ll
GFqAYmttyjSlDKo/5MeBQeWp4Oq8d+Ckz9xkNIL4qJV0LLTUvMs=
=Z32e
-----END PGP SIGNATURE-----
Merge tag 'irq_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:
- Fix a kernel panic in mips-cpu due to invalid irq domain hierarchy.
- Fix to not lose IPIs on bcm2836.
- Fix for a bogus marking of ITS devices as shared due to unitialized
stack variable.
- Clear a phantom interrupt on qcom-pdc to unblock suspend.
- Small cleanups, warning and build fixes.
* tag 'irq_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Export irq_check_status_bit()
irqchip/mips-cpu: Set IPI domain parent chip
irqchip/pruss: Simplify the TI_PRUSS_INTC Kconfig
irqchip/loongson-liointc: Fix build warnings
driver core: platform: Add extra error check in devm_platform_get_irqs_affinity()
irqchip/bcm2836: Fix IPI acknowledgement after conversion to handle_percpu_devid_irq
irqchip/irq-sl28cpld: Convert comma to semicolon
genirq/msi: Initialize msi_alloc_info before calling msi_domain_prepare_irqs()
single CPU vs such which are affine to only one CPU, mark per-cpu workqueue
threads as such and make sure that marking "survives" CPU hotplug. Fix CPU
hotplug issues with such kthreads.
- A fix to not push away tasks on CPUs coming online.
- Have workqueue CPU hotplug code use cpu_possible_mask when breaking affinity
on CPU offlining so that pending workers can finish on newly arrived onlined
CPUs too.
- Dump tasks which haven't vacated a CPU which is currently being unplugged.
- Register a special scale invariance callback which gets called on resume
from RAM to read out APERF/MPERF after resume and thus make the schedutil
scaling governor more precise.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmANYCAACgkQEsHwGGHe
VUo+OBAAjfqkijDlXiGX6lrT5gRx5NZICpeMgbWa7J13XHT1ysD/b0fMGFIUyF6k
aszDLTl8U/S1/qGAYlzTSPAFcdZ+ENiFqQ48ozMk4jZC3p0quHTjs/PdiSG6kYBi
+e4smht+bSyLKxsG8hN0kJ+mLEd+uIQ13kP4YkxPgWbJ9WNP/U6HHGBo0rBchtSe
Kn6bdd8CfwmC6rSazp7kdQoFoWeQaoMI1ODX3VphK1GtL1wq8WSICzRhpg3caeyG
3lCIddoNW9mCA9Nkc6R6HeV3uW9JGkPAjnmtTIEHDbg9pib7xNT978ieTQuqNDCi
DlAHDGumzoaiVJZhD/1fj/RXMJr2YUHxtrXWNsXpiKJ9g8Tn+WC0UW/4+Mx2L/km
0RSoXJlMs1fGopS2I/fObZ6RPhmg4D+gJsMCdaHQzX4NgxZAGhNNPxMckZ0IM8A0
2NNXSHUZHVTHeJEW0E/glOcpWb5hG+vDwiBMNEWfTwYpTfrw2EEOZaKniZE7WlSL
4ItM9rkLGl1KToJzAH4A0oUtSy3vtSCo8B1noGlc09Lj+oCIBlr81z9+C79a2oxG
qE7Xd4X7y7Qs3JeCbRZWQa7/2Kf1v4XnjELrJJeCZC85r0ZqJDwRX8w7lkmW2XPU
m4J2prr/DDZSqrRh23/xC1fsU+vcBKSfKUFKAH4Lg2VIaUfSUEk=
=2DAF
-----END PGP SIGNATURE-----
Merge tag 'sched_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:
- Correct the marking of kthreads which are supposed to run on a
specific, single CPU vs such which are affine to only one CPU, mark
per-cpu workqueue threads as such and make sure that marking
"survives" CPU hotplug. Fix CPU hotplug issues with such kthreads.
- A fix to not push away tasks on CPUs coming online.
- Have workqueue CPU hotplug code use cpu_possible_mask when breaking
affinity on CPU offlining so that pending workers can finish on newly
arrived onlined CPUs too.
- Dump tasks which haven't vacated a CPU which is currently being
unplugged.
- Register a special scale invariance callback which gets called on
resume from RAM to read out APERF/MPERF after resume and thus make
the schedutil scaling governor more precise.
* tag 'sched_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Relax the set_cpus_allowed_ptr() semantics
sched: Fix CPU hotplug / tighten is_per_cpu_kthread()
sched: Prepare to use balance_push in ttwu()
workqueue: Restrict affinity change to rescuer
workqueue: Tag bound workers with KTHREAD_IS_PER_CPU
kthread: Extract KTHREAD_IS_PER_CPU
sched: Don't run cpu-online with balance_push() enabled
workqueue: Use cpu_possible_mask instead of cpu_active_mask to break affinity
sched/core: Print out straggler tasks in sched_cpu_dying()
x86: PM: Register syscore_ops for scale invariance
happening every 2 seconds instead of the intended every 11 minutes.
- Get rid of now unused get_seconds().
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmANVSEACgkQEsHwGGHe
VUr6pQ/+MLoKCld4bR4rXbzD2TSKaIaHt7BuYCOcE4Z3PyxpfYFdX0iJVn8J277X
VAvbHOEJLFijGylpCD5vcMnK018GaW5DTvPgb7ijOEgdzHzqOoC/a30WMJWtnBVo
dKrtEgbQPgB8eSPuQah5RSWRd6lrywJp79WNbB4k7yqsTVvQLQmbKtjc656jCvCI
PKpc7zT/UZcChue6zZUtSXnLPBOM1aog0/7pXbViZc2F8Lv3ulIZy27o9sXSetu2
bsdZzexKEY/Thqx1LiEOt5PhQm5gKSBsBWuS+wPIyI0V0E5eb0rnAlTLyx7hE3yL
9R9axXPhGH5gRM05ikYaF2k4VscImgGLfT18blIb4jWSZ/C+HCtlyfJmcx5VgG4Z
F4ZC5esiUD9vM++cdrw8Xeu/QmP+yHBrcspKshjK/NcaEnTm7DyhtyGbFzdEglti
NpnNyhrX6EX1AY9RoVpLS3delytbdYABK2BXUBdVpLb034BG5f9p2Z6D1hsmzqE+
rOLxTS0d34enfFglwxCjUkwrO/weEvdpCeqcnm0TyLtq6BGoGGbkKcf510MQ1CB+
UqZxS7WIeYF1IIrm4YVD/ONiH6O1hn/QK6AmHFivgW5fKxvZXIJG850+kYX1JNca
LUfF43xi6OfgGdllUWaPsi6fRlkvP+YVd7y37d9pSC7QStdysLQ=
=fFOL
-----END PGP SIGNATURE-----
Merge tag 'timers_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Borislav Petkov:
- Fix an integer overflow in the NTP RTC synchronization which led to
the latter happening every 2 seconds instead of the intended every 11
minutes.
- Get rid of now unused get_seconds().
* tag 'timers_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ntp: Fix RTC synchronization on 32-bit platforms
timekeeping: Remove unused get_seconds()
- Differentiate which aspects of the FPU state get saved/restored when the FPU
is used in-kernel and fix a boot crash on K7 due to early MXCSR access before
CR4.OSFXSR is even set.
- A couple of noinstr annotation fixes
- Correct die ID setting on AMD for users of topology information which need
the correct die ID
- A SEV-ES fix to handle string port IO to/from kernel memory properly
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmANUr0ACgkQEsHwGGHe
VUos4hAAlBik/z+y+DaZGJyxtpST2YQaEbwbW3UMqyLsdVnLTTRnKzC1T+fEfD2Q
SxtCPYH5iuPbCgOOoQboWt6Aa53JlX9bRBZ/87Ub/ELJ9NgMxMQFXAiaDZAAY6Zy
L2B13KpoGOifPjrGDgksnafyqYv1CYesiArfOffHgvC3/0j7ONdda2SRDQ697TBw
FSV/WfUjCo0+JdXRRaP6YH5t9MxFerHxVH38xTDFwXikS9CVyddosLo5EP2wAQvi
5+160i2jB25vyMEsFBr5wE0xDpWLUdClVpzHXXPG2i0P+NHATiBcreTMPzeYOUXu
Hfc/y4ukOVDoMGlHLNKHq89alI87soMJIEjm2sAG1ZIypKyMJw7YUXQNRR3TcP0U
c7/C3W1mCWD1+8nLtlIMM0Z20DacQOf9YWko95+uh08+S52KpTOgnx+mpoZjK1PQ
Wv9HxPJKycrgRNhfverN5FSiOEW/DdvqNfVHTjuuzNLyKdM1NoZ/YTIyABk4RfFq
USUnC5rk4GqvCYdaLTEKkAJvLCmRKgVYd75Rc4/pPKILS6kv82vpj3BjClBaH0h1
yrvpafvXzOhwKP/J5q0vm57NJdqPZwuW4Ah+74tptmQL4rga84U4FOs3JpNJq0uu
1mj6xSFD8ZyI11BSkYbZAHTy1eNERze+azftCSPq/6EifYvqnsE=
=3rZM
-----END PGP SIGNATURE-----
Merge tag 'x86_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Add a new Intel model number for Alder Lake
- Differentiate which aspects of the FPU state get saved/restored when
the FPU is used in-kernel and fix a boot crash on K7 due to early
MXCSR access before CR4.OSFXSR is even set.
- A couple of noinstr annotation fixes
- Correct die ID setting on AMD for users of topology information which
need the correct die ID
- A SEV-ES fix to handle string port IO to/from kernel memory properly
* tag 'x86_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Add another Alder Lake CPU to the Intel family
x86/mmx: Use KFPU_387 for MMX string operations
x86/fpu: Add kernel_fpu_begin_mask() to selectively initialize state
x86/topology: Make __max_die_per_package available unconditionally
x86: __always_inline __{rd,wr}msr()
x86/mce: Remove explicit/superfluous tracing
locking/lockdep: Avoid noinstr warning for DEBUG_LOCKDEP
locking/lockdep: Cure noinstr fail
x86/sev: Fix nonistr violation
x86/entry: Fix noinstr fail
x86/cpu/amd: Set __max_die_per_package on AMD
x86/sev-es: Handle string port IO to kernel memory properly
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYA1opwAKCRCRxhvAZXjc
osnpAP4wjExvtwgh1eA7IgBPtAFzL1EPK2lrv7WM6yuMJNh23wEAxU+quoNrBT7U
R5UQvmXi2SwxjeGXR/BTLq/HU9rSJA4=
=6YJX
-----END PGP SIGNATURE-----
Merge tag 'for-linus-2021-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull misc fixes from Christian Brauner:
- Jann reported sparse complaints because of a missing __user
annotation in a helper we added way back when we added
pidfd_send_signal() to avoid compat syscall handling. Fix it.
- Yanfei replaces a reference in a comment to the _do_fork() helper I
removed a while ago with a reference to the new kernel_clone()
replacement
- Alexander Guril added a simple coding style fix
* tag 'for-linus-2021-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
kthread: remove comments about old _do_fork() helper
Kernel: fork.c: Fix coding style: Do not use {} around single-line statements
signal: Add missing __user annotation to copy_siginfo_from_user_any
Put file f if inode_storage_ptr() returns NULL.
Fixes: 8ea636848a ("bpf: Implement bpf_local_storage for inodes")
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/bpf/20210121020856.25507-1-bianpan2016@163.com
Since ctx.optlen is signed, a larger value than max_value could be
passed, as it is later on used as unsigned, which causes a WARN_ON_ONCE
in the copy_to_user.
Fixes: 0d01da6afc ("bpf: implement getsockopt and setsockopt hooks")
Signed-off-by: Loris Reiff <loris.reiff@liblor.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20210122164232.61770-2-loris.reiff@liblor.ch
A toctou issue in `__cgroup_bpf_run_filter_getsockopt` can trigger a
WARN_ON_ONCE in a check of `copy_from_user`.
`*optlen` is checked to be non-negative in the individual getsockopt
functions beforehand. Changing `*optlen` in a race to a negative value
will result in a `copy_from_user(ctx.optval, optval, ctx.optlen)` with
`ctx.optlen` being a negative integer.
Fixes: 0d01da6afc ("bpf: implement getsockopt and setsockopt hooks")
Signed-off-by: Loris Reiff <loris.reiff@liblor.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20210122164232.61770-1-loris.reiff@liblor.ch
Now that we have KTHREAD_IS_PER_CPU to denote the critical per-cpu
tasks to retain during CPU offline, we can relax the warning in
set_cpus_allowed_ptr(). Any spurious kthread that wants to get on at
the last minute will get pushed off before it can run.
While during CPU online there is no harm, and actual benefit, to
allowing kthreads back on early, it simplifies hotplug code and fixes
a number of outstanding races.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Lai jiangshan <jiangshanlai@gmail.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210121103507.240724591@infradead.org
Prior to commit 1cf12e08bc ("sched/hotplug: Consolidate task
migration on CPU unplug") we'd leave any task on the dying CPU and
break affinity and force them off at the very end.
This scheme had to change in order to enable migrate_disable(). One
cannot wait for migrate_disable() to complete while stuck in
stop_machine(). Furthermore, since we need at the very least: idle,
hotplug and stop threads at any point before stop_machine, we can't
break affinity and/or push those away.
Under the assumption that all per-cpu kthreads are sanely handled by
CPU hotplug, the new code no long breaks affinity or migrates any of
them (which then includes the critical ones above).
However, there's an important difference between per-cpu kthreads and
kthreads that happen to have a single CPU affinity which is lost. The
latter class very much relies on the forced affinity breaking and
migration semantics previously provided.
Use the new kthread_is_per_cpu() infrastructure to tighten
is_per_cpu_kthread() and fix the hot-unplug problems stemming from the
change.
Fixes: 1cf12e08bc ("sched/hotplug: Consolidate task migration on CPU unplug")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210121103507.102416009@infradead.org
In preparation of using the balance_push state in ttwu() we need it to
provide a reliable and consistent state.
The immediate problem is that rq->balance_callback gets cleared every
schedule() and then re-set in the balance_push_callback() itself. This
is not a reliable signal, so add a variable that stays set during the
entire time.
Also move setting it before the synchronize_rcu() in
sched_cpu_deactivate(), such that we get guaranteed visibility to
ttwu(), which is a preempt-disable region.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210121103506.966069627@infradead.org
create_worker() will already set the right affinity using
kthread_bind_mask(), this means only the rescuer will need to change
it's affinity.
Howveer, while in cpu-hot-unplug a regular task is not allowed to run
on online&&!active as it would be pushed away quite agressively. We
need KTHREAD_IS_PER_CPU to survive in that environment.
Therefore set the affinity after getting that magic flag.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210121103506.826629830@infradead.org