mirror of https://gitee.com/openkylin/linux.git
967836 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
John Fastabend | 4363023d26 |
bpf, sockmap: Avoid failures from skb_to_sgvec when skb has frag_list
When skb has a frag_list its possible for skb_to_sgvec() to fail. This
happens when the scatterlist has fewer elements to store pages than would
be needed for the initial skb plus any of its frags.
This case appears rare, but is possible when running an RX parser/verdict
programs exposed to the internet. Currently, when this happens we throw
an error, break the pipe, and kfree the msg. This effectively breaks the
application or forces it to do a retry.
Lets catch this case and handle it by doing an skb_linearize() on any
skb we receive with frags. At this point skb_to_sgvec should not fail
because the failing conditions would require frags to be in place.
Fixes:
|
|
John Fastabend | 2443ca6667 |
bpf, sockmap: Handle memory acct if skb_verdict prog redirects to self
If the skb_verdict_prog redirects an skb knowingly to itself, fix your
BPF program this is not optimal and an abuse of the API please use
SK_PASS. That said there may be cases, such as socket load balancing,
where picking the socket is hashed based or otherwise picks the same
socket it was received on in some rare cases. If this happens we don't
want to confuse userspace giving them an EAGAIN error if we can avoid
it.
To avoid double accounting in these cases. At the moment even if the
skb has already been charged against the sockets rcvbuf and forward
alloc we check it again and do set_owner_r() causing it to be orphaned
and recharged. For one this is useless work, but more importantly we
can have a case where the skb could be put on the ingress queue, but
because we are under memory pressure we return EAGAIN. The trouble
here is the skb has already been accounted for so any rcvbuf checks
include the memory associated with the packet already. This rolls
up and can result in unnecessary EAGAIN errors in userspace read()
calls.
Fix by doing an unlikely check and skipping checks if skb->sk == sk.
Fixes:
|
|
John Fastabend | 6fa9201a89 |
bpf, sockmap: Avoid returning unneeded EAGAIN when redirecting to self
If a socket redirects to itself and it is under memory pressure it is
possible to get a socket stuck so that recv() returns EAGAIN and the
socket can not advance for some time. This happens because when
redirecting a skb to the same socket we received the skb on we first
check if it is OK to enqueue the skb on the receiving socket by checking
memory limits. But, if the skb is itself the object holding the memory
needed to enqueue the skb we will keep retrying from kernel side
and always fail with EAGAIN. Then userspace will get a recv() EAGAIN
error if there are no skbs in the psock ingress queue. This will continue
until either some skbs get kfree'd causing the memory pressure to
reduce far enough that we can enqueue the pending packet or the
socket is destroyed. In some cases its possible to get a socket
stuck for a noticeable amount of time if the socket is only receiving
skbs from sk_skb verdict programs. To reproduce I make the socket
memory limits ridiculously low so sockets are always under memory
pressure. More often though if under memory pressure it looks like
a spurious EAGAIN error on user space side causing userspace to retry
and typically enough has moved on the memory side that it works.
To fix skip memory checks and skb_orphan if receiving on the same
sock as already assigned.
For SK_PASS cases this is easy, its always the same socket so we
can just omit the orphan/set_owner pair.
For backlog cases we need to check skb->sk and decide if the orphan
and set_owner pair are needed.
Fixes:
|
|
John Fastabend | 70796fb751 |
bpf, sockmap: Use truesize with sk_rmem_schedule()
We use skb->size with sk_rmem_scheduled() which is not correct. Instead use truesize to align with socket and tcp stack usage of sk_rmem_schedule. Suggested-by: Daniel Borkman <daniel@iogearbox.net> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/160556570616.73229.17003722112077507863.stgit@john-XPS-13-9370 |
|
John Fastabend | 36cd0e696a |
bpf, sockmap: Ensure SO_RCVBUF memory is observed on ingress redirect
Fix sockmap sk_skb programs so that they observe sk_rcvbuf limits. This
allows users to tune SO_RCVBUF and sockmap will honor them.
We can refactor the if(charge) case out in later patches. But, keep this
fix to the point.
Fixes:
|
|
John Fastabend | c9c89dcd87 |
bpf, sockmap: Fix partial copy_page_to_iter so progress can still be made
If copy_page_to_iter() fails or even partially completes, but with fewer
bytes copied than expected we currently reset sg.start and return EFAULT.
This proves problematic if we already copied data into the user buffer
before we return an error. Because we leave the copied data in the user
buffer and fail to unwind the scatterlist so kernel side believes data
has been copied and user side believes data has _not_ been received.
Expected behavior should be to return number of bytes copied and then
on the next read we need to return the error assuming its still there. This
can happen if we have a copy length spanning multiple scatterlist elements
and one or more complete before the error is hit.
The error is rare enough though that my normal testing with server side
programs, such as nginx, httpd, envoy, etc., I have never seen this. The
only reliable way to reproduce that I've found is to stream movies over
my browser for a day or so and wait for it to hang. Not very scientific,
but with a few extra WARN_ON()s in the code the bug was obvious.
When we review the errors from copy_page_to_iter() it seems we are hitting
a page fault from copy_page_to_iter_iovec() where the code checks
fault_in_pages_writeable(buf, copy) where buf is the user buffer. It
also seems typical server applications don't hit this case.
The other way to try and reproduce this is run the sockmap selftest tool
test_sockmap with data verification enabled, but it doesn't reproduce the
fault. Perhaps we can trigger this case artificially somehow from the
test tools. I haven't sorted out a way to do that yet though.
Fixes:
|
|
Tariq Toukan | 138559b9f9 |
net/tls: Fix wrong record sn in async mode of device resync
In async_resync mode, we log the TCP seq of records until the async request
is completed. Later, in case one of the logged seqs matches the resync
request, we return it, together with its record serial number. Before this
fix, we mistakenly returned the serial number of the current record
instead.
Fixes:
|
|
Jens Axboe | c993df5a68 |
io_uring: don't double complete failed reissue request
Zorro reports that an xfstest test case is failing, and it turns out that
for the reissue path we can potentially issue a double completion on the
request for the failure path. There's an issue around the retry as well,
but for now, at least just make sure that we handle the error path
correctly.
Cc: stable@vger.kernel.org
Fixes:
|
|
Taehee Yoo | a5bbcbf290 |
netdevsim: set .owner to THIS_MODULE
If THIS_MODULE is not set, the module would be removed while debugfs is being used. It eventually makes kernel panic. Fixes: |
|
Mickaël Salaün | fb14528e44 |
seccomp: Set PF_SUPERPRIV when checking capability
Replace the use of security_capable(current_cred(), ...) with ns_capable_noaudit() which set PF_SUPERPRIV. Since commit |
|
Mickaël Salaün | cf23705244 |
ptrace: Set PF_SUPERPRIV when checking capability
Commit |
|
Alex Marginean | fd5736bf9f |
enetc: Workaround for MDIO register access issue
Due to a hardware issue, an access to MDIO registers that is concurrent with other ENETC register accesses may lead to the MDIO access being dropped or corrupted. The workaround introduces locking for all register accesses to the ENETC register space. To reduce performance impact, a readers-writers locking scheme has been implemented. The writer in this case is the MDIO access code (irrelevant whether that MDIO access is a register read or write), and the reader is any access code to non-MDIO ENETC registers. Also, the datapath functions acquire the read lock fewer times and use _hot accessors. All the rest of the code uses the _wa accessors which lock every register access. The commit introducing MDIO support is - commit |
|
Linus Torvalds | 0fa8ee0d9a |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov: "A fix for use-after-free in the Sun keyboard driver, a fix to firmware updates on newer ICs in the Elan touchpad diver, and a couple misc driver fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: elan_i2c - fix firmware update on newer ICs Input: resistive-adc-touch - fix kconfig dependency on IIO_BUFFER Input: sunkbd - avoid use-after-free in teardown paths Input: i8042 - allow insmod to succeed on devices without an i8042 controller Input: adxl34x - clean up a data type in adxl34x_probe() |
|
Wang Hai | 68ec32daf7 |
net/mlx5: fix error return code in mlx5e_tc_nic_init()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes:
|
|
Eli Cohen | 5b8631c7b2 |
net/mlx5: E-Switch, Fail mlx5_esw_modify_vport_rate if qos disabled
Avoid calling mlx5_esw_modify_vport_rate() if qos is not enabled and
avoid unnecessary syndrome messages from firmware.
Fixes:
|
|
Vladyslav Tarasiuk | 470b747582 |
net/mlx5: Disable QoS when min_rates on all VFs are zero
Currently when QoS is enabled for VF and any min_rate is configured,
the driver sets bw_share value to at least 1 and doesn’t allow to set
it to 0 to make minimal rate unlimited. It means there is always a
minimal rate configured for every VF, even if user tries to remove it.
In order to make QoS disable possible, check whether all vports have
configured min_rate = 0. If this is true, set their bw_share to 0 to
disable min_rate limitations.
Fixes:
|
|
Vladyslav Tarasiuk | 1ce5fc724a |
net/mlx5: Clear bw_share upon VF disable
Currently, if user disables VFs with some min and max rates configured,
they are cleared. But QoS data is not cleared and restored upon next VF
enable placing limits on minimal rate for given VF, when user expects
none.
To match cleared vport->info struct with QoS-related min and max rates
upon VF disable, clear vport->qos struct too.
Fixes:
|
|
Michael Guralnik | 8cbcc5ef2a |
net/mlx5: Add handling of port type in rule deletion
Handle destruction of rules with port destination type to enable
full destruction of flow.
Without this handling of TX rules the deletion of these rules fails.
Dmesg of flow destruction failure:
[ 203.714146] mlx5_core 0000:00:0b.0: mlx5_cmd_check:753:(pid 342): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x144b7a)
[ 210.547387] ------------[ cut here ]------------
[ 210.548663] refcount_t: decrement hit 0; leaking memory.
[ 210.550651] WARNING: CPU: 4 PID: 342 at lib/refcount.c:31 refcount_warn_saturate+0x5c/0x110
[ 210.550654] Modules linked in: mlx5_ib mlx5_core ib_ipoib rdma_ucm rdma_cm iw_cm ib_cm ib_umad ib_uverbs ib_core
[ 210.550675] CPU: 4 PID: 342 Comm: test Not tainted 5.8.0-rc2+ #116
[ 210.550678] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[ 210.550680] RIP: 0010:refcount_warn_saturate+0x5c/0x110
[ 210.550685] Code: c6 d1 1b 01 00 0f 84 ad 00 00 00 5b 5d c3 80 3d b5 d1 1b 01 00 75 f4 48 c7 c7 20 d1 15 82 c6 05 a5 d1 1b 01 01 e8 a7 eb af ff <0f> 0b eb dd 80 3d 99 d1 1b 01 00 75 d4 48 c7 c7 c0 cf 15 82 c6 05
[ 210.550687] RSP: 0018:ffff8881642e77e8 EFLAGS: 00010282
[ 210.550691] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000000
[ 210.550694] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed102c85ceef
[ 210.550696] RBP: ffff888161720428 R08: ffffffff8124c10e R09: ffffed103243beae
[ 210.550698] R10: ffff8881921df56b R11: ffffed103243bead R12: ffff8881841b4180
[ 210.550701] R13: ffff888161720428 R14: ffff8881616d0000 R15: ffff888161720380
[ 210.550704] FS: 00007fc27f025740(0000) GS:ffff888192000000(0000) knlGS:0000000000000000
[ 210.550706] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 210.550708] CR2: 0000557e4b41a6a0 CR3: 0000000002415004 CR4: 0000000000360ea0
[ 210.550711] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 210.550713] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 210.550715] Call Trace:
[ 210.550717] mlx5_del_flow_rules+0x484/0x490 [mlx5_core]
[ 210.550720] ? mlx5_cmd_set_fte+0xa80/0xa80 [mlx5_core]
[ 210.550722] mlx5_ib_destroy_flow+0x17f/0x280 [mlx5_ib]
[ 210.550724] uverbs_free_flow+0x4c/0x90 [ib_uverbs]
[ 210.550726] destroy_hw_idr_uobject+0x41/0xb0 [ib_uverbs]
[ 210.550728] uverbs_destroy_uobject+0xaa/0x390 [ib_uverbs]
[ 210.550731] __uverbs_cleanup_ufile+0x129/0x1b0 [ib_uverbs]
[ 210.550733] ? uverbs_destroy_uobject+0x390/0x390 [ib_uverbs]
[ 210.550735] uverbs_destroy_ufile_hw+0x78/0x190 [ib_uverbs]
[ 210.550737] ib_uverbs_close+0x36/0x140 [ib_uverbs]
[ 210.550739] __fput+0x181/0x380
[ 210.550741] task_work_run+0x88/0xd0
[ 210.550743] do_exit+0x5f6/0x13b0
[ 210.550745] ? sched_clock_cpu+0x30/0x140
[ 210.550747] ? is_current_pgrp_orphaned+0x70/0x70
[ 210.550750] ? lock_downgrade+0x360/0x360
[ 210.550752] ? mark_held_locks+0x1d/0x90
[ 210.550754] do_group_exit+0x8a/0x140
[ 210.550756] get_signal+0x20a/0xf50
[ 210.550758] do_signal+0x8c/0xbe0
[ 210.550760] ? hrtimer_nanosleep+0x1d8/0x200
[ 210.550762] ? nanosleep_copyout+0x50/0x50
[ 210.550764] ? restore_sigcontext+0x320/0x320
[ 210.550766] ? __hrtimer_init+0xf0/0xf0
[ 210.550768] ? timespec64_add_safe+0x150/0x150
[ 210.550770] ? mark_held_locks+0x1d/0x90
[ 210.550772] ? lockdep_hardirqs_on_prepare+0x14c/0x240
[ 210.550774] __prepare_exit_to_usermode+0x119/0x170
[ 210.550776] do_syscall_64+0x65/0x300
[ 210.550778] ? trace_hardirqs_off+0x10/0x120
[ 210.550781] ? mark_held_locks+0x1d/0x90
[ 210.550783] ? asm_sysvec_apic_timer_interrupt+0xa/0x20
[ 210.550785] ? lockdep_hardirqs_on+0x112/0x190
[ 210.550787] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 210.550789] RIP: 0033:0x7fc27f1cd157
[ 210.550791] Code: Bad RIP value.
[ 210.550793] RSP: 002b:00007ffd4db27ea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000023
[ 210.550798] RAX: fffffffffffffdfc RBX: ffffffffffffff80 RCX: 00007fc27f1cd157
[ 210.550800] RDX: 00007fc27f025740 RSI: 00007ffd4db27eb0 RDI: 00007ffd4db27eb0
[ 210.550803] RBP: 0000000000000016 R08: 0000000000000000 R09: 000000000000000e
[ 210.550805] R10: 00007ffd4db27dc7 R11: 0000000000000246 R12: 0000000000400c00
[ 210.550808] R13: 00007ffd4db285f0 R14: 0000000000000000 R15: 0000000000000000
[ 210.550809] irq event stamp: 49399
[ 210.550812] hardirqs last enabled at (49399): [<ffffffff81172d36>] console_unlock+0x556/0x6f0
[ 210.550815] hardirqs last disabled at (49398): [<ffffffff81172897>] console_unlock+0xb7/0x6f0
[ 210.550818] softirqs last enabled at (48706): [<ffffffff81e0037b>] __do_softirq+0x37b/0x60c
[ 210.550820] softirqs last disabled at (48697): [<ffffffff81c00e2f>] asm_call_on_stack+0xf/0x20
[ 210.550822] ---[ end trace ad18c0e6fa846454 ]---
[ 210.581862] mlx5_core 0000:00:0c.0: mlx5_destroy_flow_table:2132:(pid 342): Flow table 262150 wasn't destroyed, refcount > 1
Fixes:
|
|
Maor Dickman | 219b3267ca |
net/mlx5e: Fix check if netdev is bond slave
Bond events handler uses bond_slave_get_rtnl to check if net device
is bond slave. bond_slave_get_rtnl return the rcu rx_handler pointer
from the netdev which exists for bond slaves but also exists for
devices that are attached to linux bridge so using it as indication
for bond slave is wrong.
Fix by using netif_is_lag_port instead.
Fixes:
|
|
Huy Nguyen | 6248ce991f |
net/mlx5e: Fix IPsec packet drop by mlx5e_tc_update_skb
Both TC and IPsec crypto offload use metadata_regB to store
private information. Since TC does not use bit 31 of regB, IPsec
will use bit 31 as the IPsec packet marker. The IPsec's regB usage
is changed to:
Bit31: IPsec marker
Bit30-24: IPsec syndrome
Bit23-0: IPsec obj id
Fixes:
|
|
Huy Nguyen | 5cfb540ef2 |
net/mlx5e: Set IPsec WAs only in IP's non checksum partial case.
The IP's checksum partial still requires L4 csum flag on Ethernet WQE.
Make the IPsec WAs only for the IP's non checksum partial case
(for example icmd packet)
Fixes:
|
|
Maxim Mikityanskiy | ea63609857 |
net/mlx5e: Fix refcount leak on kTLS RX resync
On resync, the driver calls inet_lookup_established
(__inet6_lookup_established) that increases sk_refcnt of the socket. To
decrease it, the driver set skb->destructor to sock_edemux. However, it
didn't work well, because the TCP stack also sets this destructor for
early demux, and the refcount gets decreased only once, while increased
two times (in mlx5e and in the TCP stack). It leads to a socket leak, a
TLS context leak, which in the end leads to calling tls_dev_del twice:
on socket close and on driver unload, which in turn leads to a crash.
This commit fixes the refcount leak by calling sock_gen_put right away
after using the socket, thus fixing all the subsequent issues.
Fixes:
|
|
Linus Torvalds | 111e91a6df |
- fix system call exit path; avoid return to user space with
any TIF/CIF/PIF set - fix file permission for cpum_sfb_size parameter - another small defconfig update -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAl+0EOoACgkQIg7DeRsp bsIh5g/+LuitpduJRvbyb9Km1A+CSflhC7USScBnyO0n2MRlo1c4M2rRIoUtOVD4 ktOV03CcKAjahw3umIx9euHkqYo5qgUVkSgx9q23R0GMf3iSXwh4wKKpe/YAuiad qsAunpD6xRtKm2xqnnSGYiZ8gKDRw7N+nZBWNrZa74I/thcNZbq1d7TBmrTJrkYL EH8JxTvPohNpjtDYOwVh8XcKPl1tT3R7N9/bmudTiOtGQtDWCsOjg4XisAc10ovw thCmr1t32SBifdhk6HE7AQrA73EpazDQlAUZlPVb+E7JJypp5gUDSkJO1wZ+TkhW WJgIgJGzeyJ9iqLQQcnwdxQW91spKr/gYw9yy5gZDZ1uvCclqdfKo/sha1N+xX3F j67+h/LEOGV3d02mBlDi6+4fnjHbnyWhDUivi3Atp7PHmWGd1qTPjLqZ3NsXZrLT 8sTX6c77g8YqzC++Q2goXPDmToxqcT1LCPpAVSNYY3BdAsOgvMJSlFVia1If5SAv 6MU8MUWTORBqh7c/hB0Ka+cVJUxtZ6Pt/HESM9qONhTEmAqNfeWvPgsSSlLytl39 PS9RDL6vw29rsOvu9kLEaISRl1G31RaLRYLdIZ4HyTl+8m+skQ3VAmursEnLrTnb oRFBuNp6Y5jPGWkqXhE6t7z3ozzRNZERXA1AEqM2VozKHOXYtiI= =TVeJ -----END PGP SIGNATURE----- Merge tag 's390-5.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Heiko Carstens: - fix system call exit path; avoid return to user space with any TIF/CIF/PIF set - fix file permission for cpum_sfb_size parameter - another small defconfig update * tag 's390-5.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cpum_sf.c: fix file permission for cpum_sfb_size s390: update defconfigs s390: fix system call exit path |
|
Linus Torvalds | ed129cd75a |
- fix bug preventing booting on several platforms
- fix for build error, when modules need has_transparent_hugepage - fix for memleak in alchemy clk setup -----BEGIN PGP SIGNATURE----- iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAl+0DxIaHHRzYm9nZW5k QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHAJARAAo9kPj9P488PlKjw168SJ neNK00OmtW8D6grnXIHnj/lYzJ1ul+dgbfEiKYMKrBibwc9jxdqeBjR+aLbADs0H 8u4Z3+dLRNzCqnxd4EYwP53+WdICrlNN9c7H+AjvWP4aMzZLNQW2fAQi+7hGR8p7 gvR+7CZyBHtVns92IGrMPEm1CBVrP/PrKxU6TRi3GYxxf7jCGYMHRI/bwXkBSGhg mzxeCtygnDKivUcLv2FqFQ6BmSbPbxZVqQNKkBCqEyEIvYmvNkmUMuqf36NYOtBp dXloYV5px9FjEqBhfyRYeO3xElyphCm4fYWV4+Bvf60w4m3c33peezg8ltJxqFAu zFZk6NhfgZrRRJTGTfJ3juisagmtLzGx5rcK/JJGmcvW3975cwMlEtd545JGrECW Lq3GZLwROc8hEh8dydrf3jGjP3+Ty2emO6rjYQrDqoWAhu4IqpLIb7s58IWzVlYv z7KejBo1bTuSGar6sHREmEGCPzlrJp4ebq6niD4j57Yk9gzp0BoTd+JogE/EaCz/ P92OV/Log/hgKhFB0sI3Mtl2OOF3KIPq1IFfeu8731yMq9muOVZMhTOS03LRySFl yyxVFn+uI00oE5A7Vw4ETEdiw1lbBjD4rC6HLQK36RhlZR5vGGL8rIejXRuKJCZY F5yTub6Y6KDbKpBCjAkDros= =5NkQ -----END PGP SIGNATURE----- Merge tag 'mips_fixes_5.10_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - fix bug preventing booting on several platforms - fix for build error, when modules need has_transparent_hugepage - fix for memleak in alchemy clk setup * tag 'mips_fixes_5.10_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: Alchemy: Fix memleak in alchemy_clk_setup_cpu MIPS: kernel: Fix for_each_memblock conversion MIPS: export has_transparent_hugepage() for modules |
|
Ryan Sharpelletti | 1b9e2a8c99 |
tcp: only postpone PROBE_RTT if RTT is < current min_rtt estimate
During loss recovery, retransmitted packets are forced to use TCP
timestamps to calculate the RTT samples, which have a millisecond
granularity. BBR is designed using a microsecond granularity. As a
result, multiple RTT samples could be truncated to the same RTT value
during loss recovery. This is problematic, as BBR will not enter
PROBE_RTT if the RTT sample is <= the current min_rtt sample, meaning
that if there are persistent losses, PROBE_RTT will constantly be
pushed off and potentially never re-entered. This patch makes sure
that BBR enters PROBE_RTT by checking if RTT sample is < the current
min_rtt sample, rather than <=.
The Netflix transport/TCP team discovered this bug in the Linux TCP
BBR code during lab tests.
Fixes:
|
|
Joel Stanley | 3d5179458d |
net: ftgmac100: Fix crash when removing driver
When removing the driver we would hit BUG_ON(!list_empty(&dev->ptype_specific))
in net/core/dev.c due to still having the NC-SI packet handler
registered.
# echo 1e660000.ethernet > /sys/bus/platform/drivers/ftgmac100/unbind
------------[ cut here ]------------
kernel BUG at net/core/dev.c:10254!
Internal error: Oops - BUG: 0 [#1] SMP ARM
CPU: 0 PID: 115 Comm: sh Not tainted 5.10.0-rc3-next-20201111-00007-g02e0365710c4 #46
Hardware name: Generic DT based system
PC is at netdev_run_todo+0x314/0x394
LR is at cpumask_next+0x20/0x24
pc : [<806f5830>] lr : [<80863cb0>] psr: 80000153
sp : 855bbd58 ip : 00000001 fp : 855bbdac
r10: 80c03d00 r9 : 80c06228 r8 : 81158c54
r7 : 00000000 r6 : 80c05dec r5 : 80c05d18 r4 : 813b9280
r3 : 813b9054 r2 : 8122c470 r1 : 00000002 r0 : 00000002
Flags: Nzcv IRQs on FIQs off Mode SVC_32 ISA ARM Segment none
Control: 00c5387d Table: 85514008 DAC: 00000051
Process sh (pid: 115, stack limit = 0x7cb5703d)
...
Backtrace:
[<806f551c>] (netdev_run_todo) from [<80707eec>] (rtnl_unlock+0x18/0x1c)
r10:00000051 r9:854ed710 r8:81158c54 r7:80c76bb0 r6:81158c10 r5:8115b410
r4:813b9000
[<80707ed4>] (rtnl_unlock) from [<806f5db8>] (unregister_netdev+0x2c/0x30)
[<806f5d8c>] (unregister_netdev) from [<805a8180>] (ftgmac100_remove+0x20/0xa8)
r5:8115b410 r4:813b9000
[<805a8160>] (ftgmac100_remove) from [<805355e4>] (platform_drv_remove+0x34/0x4c)
Fixes:
|
|
Zhang Changzhong | 7b027c249d |
net: b44: fix error return code in b44_init_one()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes:
|
|
Linus Torvalds | be1dd6692a |
perf tools updates for v5.10: 3rd batch.
- Fix file corruption due to event deletion in 'perf inject'. - Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem memcpy', silencing perf build warning. - Avoid an msan warning in a copied stack in 'perf test'. - Correct tracepoint field name "flags" in ARM's CS-ETM hardware tracing 'perf test' entry. - Update branch sample pattern for cs-etm to cope with excluding guest in userspace counting. - Don't free "lock_seq_stat" if read_count isn't zero in 'perf lock'. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Test results: The first ones are container based builds of tools/perf with and without libelf support. Where clang is available, it is also used to build perf with/without libelf, and building with LIBCLANGLLVM=1 (built-in clang) with gcc and clang when clang and its devel libraries are installed. The objtool and samples/bpf/ builds are disabled now that I'm switching from using the sources in a local volume to fetching them from a http server to build it inside the container, to make it easier to build in a container cluster. Those will come back later. Several are cross builds, the ones with -x-ARCH and the android one, and those may not have all the features built, due to lack of multi-arch devel packages, available and being used so far on just a few, like debian:experimental-x-{arm64,mipsel}. The 'perf test' one will perform a variety of tests exercising tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands with a variety of command line event specifications to then intercept the sys_perf_event syscall to check that the perf_event_attr fields are set up as expected, among a variety of other unit tests. Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/ with a variety of feature sets, exercising the build with an incomplete set of features as well as with a complete one. It is planned to have it run on each of the containers mentioned above, using some container orchestration infrastructure. Get in contact if interested in helping having this in place. $ grep "model name" -m1 /proc/cpuinfo model name: AMD Ryzen 9 3900X 12-Core Processor $ export PERF_TARBALL=http://192.168.86.5/perf/perf-5.10.0-rc3.tar.xz $ dm 1 71.39 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0, clang version 3.8.0 (tags/RELEASE_380/final) 2 70.77 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822, clang version 3.8.1 (tags/RELEASE_381/final) 3 73.70 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0, clang version 4.0.0 (tags/RELEASE_400/final) 4 82.24 alpine:3.7 : Ok gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.0 (tags/RELEASE_500/final) (based on LLVM 5.0.0) 5 82.21 alpine:3.8 : Ok gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1) 6 84.79 alpine:3.9 : Ok gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 5.0.1 (tags/RELEASE_502/final) (based on LLVM 5.0.1) 7 106.15 alpine:3.10 : Ok gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 8.0.0 (tags/RELEASE_800/final) (based on LLVM 8.0.0) 8 120.21 alpine:3.11 : Ok gcc (Alpine 9.3.0) 9.3.0, Alpine clang version 9.0.0 (https://git.alpinelinux.org/aports f7f0d2c2b8bcd6a5843401a9a702029556492689) (based on LLVM 9.0.0) 9 111.49 alpine:3.12 : Ok gcc (Alpine 9.3.0) 9.3.0, Alpine clang version 10.0.0 (https://gitlab.alpinelinux.org/alpine/aports.git 7445adce501f8473efdb93b17b5eaf2f1445ed4c) 10 119.72 alpine:edge : Ok gcc (Alpine 10.2.0) 10.2.0, Alpine clang version 10.0.1 11 70.03 alt:p8 : Ok x86_64-alt-linux-gcc (GCC) 5.3.1 20151207 (ALT p8 5.3.1-alt3.M80P.1), clang version 3.8.0 (tags/RELEASE_380/final) 12 86.89 alt:p9 : Ok x86_64-alt-linux-gcc (GCC) 8.4.1 20200305 (ALT p9 8.4.1-alt0.p9.1), clang version 10.0.0 13 84.45 alt:sisyphus : Ok x86_64-alt-linux-gcc (GCC) 9.3.1 20200518 (ALT Sisyphus 9.3.1-alt1), clang version 10.0.1 14 67.56 amazonlinux:1 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2), clang version 3.6.2 (tags/RELEASE_362/final) 15 101.57 amazonlinux:2 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-9), clang version 7.0.1 (Amazon Linux 2 7.0.1-1.amzn2.0.2) 16 22.59 android-ndk:r12b-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease) 17 22.50 android-ndk:r15c-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease) 18 12.05 centos:6 : FAIL gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23) Ancient gcc get this wrong: /git/linux/tools/lib/bpf/libbpf.h:203: error: wrong number of arguments specified for 'deprecated' attribute But when using NO_LIBBPF we should not hit this, changes for that to happen will land in 5.11 as they require more surgery than advisable for this late in 5.10 rc. 19 32.72 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44) 20 118.02 centos:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5), clang version 9.0.1 (Red Hat 9.0.1-2.module_el8.2.0+309+0c7b6b03) 21 62.52 clearlinux:latest : Ok gcc (Clear Linux OS for Intel Architecture) 10.2.1 20201105 releases/gcc-10.2.0-465-g2b4cba9a30, clang version 10.0.1 22 78.48 debian:8 : Ok gcc (Debian 4.9.2-10+deb8u2) 4.9.2, Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0) 23 79.18 debian:9 : Ok gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, clang version 3.8.1-24 (tags/RELEASE_381/final) 24 76.33 debian:10 : Ok gcc (Debian 8.3.0-6) 8.3.0, clang version 7.0.1-8+deb10u2 (tags/RELEASE_701/final) 25 93.16 debian:experimental : Ok gcc (Debian 10.2.0-16) 10.2.0, Debian clang version 11.0.0-4 26 30.47 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 9.3.0-8) 9.3.0 27 31.23 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7) 28 31.77 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.5.0 (tags/RELEASE_350/final) 29 70.66 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.7.0 (tags/RELEASE_370/final) 30 82.84 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1), clang version 3.8.1 (tags/RELEASE_381/final) 31 25.41 fedora:24-x-ARC-uClibc : Ok arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710 32 84.37 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1), clang version 3.9.1 (tags/RELEASE_391/final) 33 96.38 fedora:26 : Ok gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2), clang version 4.0.1 (tags/RELEASE_401/final) 34 97.16 fedora:27 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), clang version 5.0.2 (tags/RELEASE_502/final) 35 107.89 fedora:28 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 6.0.1 (tags/RELEASE_601/final) 36 113.43 fedora:29 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 7.0.1 (Fedora 7.0.1-6.fc29) 37 117.85 fedora:30 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2), clang version 8.0.0 (Fedora 8.0.0-3.fc30) 38 25.36 fedora:30-x-ARC-uClibc : Ok arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225 39 116.54 fedora:31 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2), clang version 9.0.1 (Fedora 9.0.1-4.fc31) 40 98.36 fedora:32 : Ok gcc (GCC) 10.2.1 20201016 (Red Hat 10.2.1-6), clang version 10.0.1 (Fedora 10.0.1-3.fc32) 41 97.72 fedora:33 : Ok gcc (GCC) 10.2.1 20201005 (Red Hat 10.2.1-5), clang version 11.0.0 (Fedora 11.0.0-1.fc33) 42 98.60 fedora:rawhide : Ok gcc (GCC) 10.2.1 20201112 (Red Hat 10.2.1-8), clang version 11.0.0 (Fedora 11.0.0-2.fc34) 43 35.00 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 9.3.0-r1 p3) 9.3.0 44 70.27 mageia:5 : Ok gcc (GCC) 4.9.2, clang version 3.5.2 (tags/RELEASE_352/final) 45 86.27 mageia:6 : Ok gcc (Mageia 5.5.0-1.mga6) 5.5.0, clang version 3.9.1 (tags/RELEASE_391/final) 46 103.63 manjaro:latest : Ok gcc (GCC) 10.2.0, clang version 10.0.1 47 228.99 openmandriva:cooker : Ok gcc (GCC) 10.2.0 20200723 (OpenMandriva), OpenMandriva 11.0.0-1 clang version 11.0.0 (/builddir/build/BUILD/llvm-project-llvmorg-11.0.0/clang 63e22714ac938c6b537bd958f70680d3331a2030) 48 120.44 opensuse:15.0 : Ok gcc (SUSE Linux) 7.4.1 20190905 [gcc-7-branch revision 275407], clang version 5.0.1 (tags/RELEASE_501/final 312548) 49 127.26 opensuse:15.1 : Ok gcc (SUSE Linux) 7.5.0, clang version 7.0.1 (tags/RELEASE_701/final 349238) 50 117.76 opensuse:15.2 : Ok gcc (SUSE Linux) 7.5.0, clang version 9.0.1 51 114.96 opensuse:42.3 : Ok gcc (SUSE Linux) 4.8.5, clang version 3.8.0 (tags/RELEASE_380/final 262553) 52 110.55 opensuse:tumbleweed : Ok gcc (SUSE Linux) 10.2.1 20200825 [revision c0746a1beb1ba073c7981eb09f55b3d993b32e5c], clang version 10.0.1 53 11.81 oraclelinux:6 : FAIL gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1) See explanation for centos:6 54 32.31 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44.0.3) 55 116.81 oraclelinux:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5.0.3), clang version 9.0.1 (Red Hat 9.0.1-2.0.1.module+el8.2.0+5599+9ed9ef6d) 56 28.44 ubuntu:12.04 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0) 57 31.16 ubuntu:14.04 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4 58 79.71 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final) 59 27.05 ubuntu:16.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 60 27.14 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 61 25.89 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 62 26.39 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 63 26.32 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 64 25.61 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 65 91.53 ubuntu:18.04 : Ok gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final) 66 28.76 ubuntu:18.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0 67 27.64 ubuntu:18.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0 68 22.90 ubuntu:18.04-x-m68k : Ok m68k-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 69 27.23 ubuntu:18.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 70 29.56 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 71 29.76 ubuntu:18.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 72 161.85 ubuntu:18.04-x-riscv64 : Ok riscv64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 73 25.41 ubuntu:18.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 74 27.77 ubuntu:18.04-x-sh4 : Ok sh4-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 75 25.38 ubuntu:18.04-x-sparc64 : Ok sparc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 76 71.37 ubuntu:19.10 : Ok gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, clang version 8.0.1-3build1 (tags/RELEASE_801/final) 77 27.53 ubuntu:19.10-x-alpha : Ok alpha-linux-gnu-gcc (Ubuntu 9.2.1-9ubuntu1) 9.2.1 20191008 78 24.90 ubuntu:19.10-x-hppa : Ok hppa-linux-gnu-gcc (Ubuntu 9.2.1-9ubuntu1) 9.2.1 20191008 79 75.35 ubuntu:20.04 : Ok gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, clang version 10.0.0-4ubuntu1 80 31.41 ubuntu:20.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0 81 76.36 ubuntu:20.10 : Ok gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0, Ubuntu clang version 11.0.0-2 $ # uname -a Linux quaco 5.8.14-200.fc32.x86_64 #1 SMP Wed Oct 7 14:47:56 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux # git log --oneline -1 |
|
Zhang Changzhong | cb47d16ea2 |
qed: fix error return code in qed_iwarp_ll2_start()
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: |
|
Linus Torvalds | 9dacf44c38 |
Merge branch 'urgent-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU fix from Paul McKenney: "A single commit that fixes a bug that was introduced a couple of merge windows ago, but which rather more recently converged to an agreed-upon fix. The bug is that interrupts can be incorrectly enabled while holding an irq-disabled spinlock. This can of course result in self-deadlocks. The bug is a bit difficult to trigger. It requires that a preempted task be blocking a preemptible-RCU grace period long enough to trigger an RCU CPU stall warning. In addition, an interrupt must occur at just the right time, and that interrupt's handler must acquire that same irq-disabled spinlock. Still, a deadlock is a deadlock. Furthermore, we do now have a fix, and that fix survives kernel test robot, -next, and rcutorture testing. It has also been verified by Sebastian as fixing the bug. Therefore..." * 'urgent-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: rcu: Don't invoke try_invoke_on_locked_down_task() with irqs disabled |
|
Xiongfeng Wang | 6654b57866 |
drm/sun4i: dw-hdmi: fix error return code in sun8i_dw_hdmi_bind()
Fix to return a negative error code from the error handling case instead
of 0 in function sun8i_dw_hdmi_bind().
Fixes:
|
|
Lukas Wunner |
04a9cd51d3
|
spi: npcm-fiu: Don't leak SPI master in probe error path
If the calls to of_match_device(), of_alias_get_id(),
devm_ioremap_resource(), devm_regmap_init_mmio() or devm_clk_get()
fail on probe of the NPCM FIU SPI driver, the spi_controller struct is
erroneously not freed.
Fix by switching over to the new devm_spi_alloc_master() helper.
Fixes:
|
|
Serge Semin |
a41b0ad07b
|
spi: dw: Set transfer handler before unmasking the IRQs
It turns out the IRQs most like can be unmasked before the controller is enabled with no problematic consequences. The manual doesn't explicitly state that, but the examples perform the controller initialization procedure in that order. So the commit |
|
Laurent Pinchart | dc293f2106 |
xtensa: uaccess: Add missing __user to strncpy_from_user() prototype
When adding __user annotations in commit
|
|
Joakim Tjernlund | 54a2a3898f |
ALSA: usb-audio: Add delay quirk for all Logitech USB devices
Found one more Logitech device, BCC950 ConferenceCam, which needs the same delay here. This makes 3 out of 3 devices I have tried. Therefore, add a delay for all Logitech devices as it does not hurt. Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com> Cc: <stable@vger.kernel.org> # 4.19.y, 5.4.y Link: https://lore.kernel.org/r/20201117122803.24310-1-joakim.tjernlund@infinera.com Signed-off-by: Takashi Iwai <tiwai@suse.de> |
|
Rafael J. Wysocki | 14c620cf2e |
Merge branch 'cpufreq/arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Pull cpufreq-arm fixes for 5.10-rc5 from Viresh Kumar: "- tegra186: Fix ->get() callback. - arm/scmi: Add dummy clock provider to fix failure." * 'cpufreq/arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: scmi: Fix OPP addition failure with a dummy clock provider cpufreq: tegra186: Fix get frequency callback |
|
Sami Tolvanen | ebd19fc372 |
perf/x86: fix sysfs type mismatches
This change switches rapl to use PMU_FORMAT_ATTR, and fixes two other macros to use device_attribute instead of kobj_attribute to avoid callback type mismatches that trip indirect call checking with Clang's Control-Flow Integrity (CFI). Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20201113183126.1239404-1-samitolvanen@google.com |
|
Boqun Feng | 43be4388e9 |
lockdep: Put graph lock/unlock under lock_recursion protection
A warning was hit when running xfstests/generic/068 in a Hyper-V guest: [...] ------------[ cut here ]------------ [...] DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled()) [...] WARNING: CPU: 2 PID: 1350 at kernel/locking/lockdep.c:5280 check_flags.part.0+0x165/0x170 [...] ... [...] Workqueue: events pwq_unbound_release_workfn [...] RIP: 0010:check_flags.part.0+0x165/0x170 [...] ... [...] Call Trace: [...] lock_is_held_type+0x72/0x150 [...] ? lock_acquire+0x16e/0x4a0 [...] rcu_read_lock_sched_held+0x3f/0x80 [...] __send_ipi_one+0x14d/0x1b0 [...] hv_send_ipi+0x12/0x30 [...] __pv_queued_spin_unlock_slowpath+0xd1/0x110 [...] __raw_callee_save___pv_queued_spin_unlock_slowpath+0x11/0x20 [...] .slowpath+0x9/0xe [...] lockdep_unregister_key+0x128/0x180 [...] pwq_unbound_release_workfn+0xbb/0xf0 [...] process_one_work+0x227/0x5c0 [...] worker_thread+0x55/0x3c0 [...] ? process_one_work+0x5c0/0x5c0 [...] kthread+0x153/0x170 [...] ? __kthread_bind_mask+0x60/0x60 [...] ret_from_fork+0x1f/0x30 The cause of the problem is we have call chain lockdep_unregister_key() -> <irq disabled by raw_local_irq_save()> lockdep_unlock() -> arch_spin_unlock() -> __pv_queued_spin_unlock_slowpath() -> pv_kick() -> __send_ipi_one() -> trace_hyperv_send_ipi_one(). Although this particular warning is triggered because Hyper-V has a trace point in ipi sending, but in general arch_spin_unlock() may call another function having a trace point in it, so put the arch_spin_lock() and arch_spin_unlock() after lock_recursion protection to fix this problem and avoid similiar problems. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201113110512.1056501-1-boqun.feng@gmail.com |
|
Juri Lelli | 2279f540ea |
sched/deadline: Fix priority inheritance with multiple scheduling classes
Glenn reported that "an application [he developed produces] a BUG in deadline.c when a SCHED_DEADLINE task contends with CFS tasks on nested PTHREAD_PRIO_INHERIT mutexes. I believe the bug is triggered when a CFS task that was boosted by a SCHED_DEADLINE task boosts another CFS task (nested priority inheritance). ------------[ cut here ]------------ kernel BUG at kernel/sched/deadline.c:1462! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 12 PID: 19171 Comm: dl_boost_bug Tainted: ... Hardware name: ... RIP: 0010:enqueue_task_dl+0x335/0x910 Code: ... RSP: 0018:ffffc9000c2bbc68 EFLAGS: 00010002 RAX: 0000000000000009 RBX: ffff888c0af94c00 RCX: ffffffff81e12500 RDX: 000000000000002e RSI: ffff888c0af94c00 RDI: ffff888c10b22600 RBP: ffffc9000c2bbd08 R08: 0000000000000009 R09: 0000000000000078 R10: ffffffff81e12440 R11: ffffffff81e1236c R12: ffff888bc8932600 R13: ffff888c0af94eb8 R14: ffff888c10b22600 R15: ffff888bc8932600 FS: 00007fa58ac55700(0000) GS:ffff888c10b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa58b523230 CR3: 0000000bf44ab003 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: ? intel_pstate_update_util_hwp+0x13/0x170 rt_mutex_setprio+0x1cc/0x4b0 task_blocks_on_rt_mutex+0x225/0x260 rt_spin_lock_slowlock_locked+0xab/0x2d0 rt_spin_lock_slowlock+0x50/0x80 hrtimer_grab_expiry_lock+0x20/0x30 hrtimer_cancel+0x13/0x30 do_nanosleep+0xa0/0x150 hrtimer_nanosleep+0xe1/0x230 ? __hrtimer_init_sleeper+0x60/0x60 __x64_sys_nanosleep+0x8d/0xa0 do_syscall_64+0x4a/0x100 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fa58b52330d ... ---[ end trace 0000000000000002 ]— He also provided a simple reproducer creating the situation below: So the execution order of locking steps are the following (N1 and N2 are non-deadline tasks. D1 is a deadline task. M1 and M2 are mutexes that are enabled * with priority inheritance.) Time moves forward as this timeline goes down: N1 N2 D1 | | | | | | Lock(M1) | | | | | | Lock(M2) | | | | | | Lock(M2) | | | | Lock(M1) | | (!!bug triggered!) | Daniel reported a similar situation as well, by just letting ksoftirqd run with DEADLINE (and eventually block on a mutex). Problem is that boosted entities (Priority Inheritance) use static DEADLINE parameters of the top priority waiter. However, there might be cases where top waiter could be a non-DEADLINE entity that is currently boosted by a DEADLINE entity from a different lock chain (i.e., nested priority chains involving entities of non-DEADLINE classes). In this case, top waiter static DEADLINE parameters could be null (initialized to 0 at fork()) and replenish_dl_entity() would hit a BUG(). Fix this by keeping track of the original donor and using its parameters when a task is boosted. Reported-by: Glenn Elliott <glenn@aurora.tech> Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Daniel Bristot de Oliveira <bristot@redhat.com> Link: https://lkml.kernel.org/r/20201117061432.517340-1-juri.lelli@redhat.com |
|
Peter Zijlstra | ec618b84f6 |
sched: Fix rq->nr_iowait ordering
schedule() ttwu()
deactivate_task(); if (p->on_rq && ...) // false
atomic_dec(&task_rq(p)->nr_iowait);
if (prev->in_iowait)
atomic_inc(&rq->nr_iowait);
Allows nr_iowait to be decremented before it gets incremented,
resulting in more dodgy IO-wait numbers than usual.
Note that because we can now do ttwu_queue_wakelist() before
p->on_cpu==0, we lose the natural ordering and have to further delay
the decrement.
Fixes:
|
|
Peter Zijlstra | f97bb5272d |
sched: Fix data-race in wakeup
Mel reported that on some ARM64 platforms loadavg goes bananas and Will tracked it down to the following race: CPU0 CPU1 schedule() prev->sched_contributes_to_load = X; deactivate_task(prev); try_to_wake_up() if (p->on_rq &&) // false if (smp_load_acquire(&p->on_cpu) && // true ttwu_queue_wakelist()) p->sched_remote_wakeup = Y; smp_store_release(prev->on_cpu, 0); where both p->sched_contributes_to_load and p->sched_remote_wakeup are in the same word, and thus the stores X and Y race (and can clobber one another's data). Whereas prior to commit |
|
Quentin Perret | 8e1ac4299a |
sched/fair: Fix overutilized update in enqueue_task_fair()
enqueue_task_fair() attempts to skip the overutilized update for new
tasks as their util_avg is not accurate yet. However, the flag we check
to do so is overwritten earlier on in the function, which makes the
condition pretty much a nop.
Fix this by saving the flag early on.
Fixes:
|
|
Zhang Qilong | ac3b57adf8 |
MIPS: Alchemy: Fix memleak in alchemy_clk_setup_cpu
If the clk_register fails, we should free h before
function returns to prevent memleak.
Fixes:
|
|
Manish Narani | d06d60d52e |
mmc: sdhci-of-arasan: Issue DLL reset explicitly
In the current implementation DLL reset will be issued for
each ITAP and OTAP setting inside ATF, this is creating issues
in some scenarios and this sequence is not inline with the TRM.
To fix the issue, DLL reset should be removed from the ATF and
host driver will request it explicitly.
This patch update host driver to explicitly request for DLL reset
before ITAP (assert DLL) and after OTAP (release DLL) settings.
Fixes:
|
|
Manish Narani | d338c6d01d |
mmc: sdhci-of-arasan: Use Mask writes for Tap delays
Mask the ITAP and OTAP delay bits before updating with the new
tap value for Versal platform.
Fixes:
|
|
Manish Narani | 9e95343293 |
mmc: sdhci-of-arasan: Allow configuring zero tap values
Allow configuring the Output and Input tap values with zero to avoid
failures in some cases (one of them is SD boot mode) where the output
and input tap values may be already set to non-zero.
Fixes:
|
|
Adrian Hunter | 60d5356610 |
mmc: sdhci-pci: Prefer SDR25 timing for High Speed mode for BYT-based Intel controllers
A UHS setting of SDR25 can give better results for High Speed mode.
This is because there is no setting corresponding to high speed. Currently
SDHCI sets no value, which means zero which is also the setting for SDR12.
There was an attempt to change this in sdhci.c but it caused problems for
some drivers, so it was reverted and the change was made to sdhci-brcmstb
in commit
|
|
Greg Kroah-Hartman | 2dde2821b5 |
First set of IIO and counter fixes for the 5.10 cycle.
IIO cros_ec - Provide defauts for max and min frequency when older machines fail to return them correctly. ingenic-adc - Fix wrong vref value for JZ4770 SoC - Fix AUX / VBAT readings when touchscreen in use by pausing touchscreen readings during a read of these channels. kxcjk1013 - Fix an issue with KIOX010A ACPI id using devices which need to run a ACPI device specific method to avoid leaving the keyboard disabled. Includes a minor precursor patch to make this fix easier to do. mt6577-auxadc - Fix an issue with dev_comp not being set resulting in a null ptr deref. st_lsm6dsx - Set a 10ms min shub slave timeout to handle fast snesors where more time is needed to set up the config than the cycles allowed. stm32-adc - Fix an issue due to a clash between an ADC configured to use IRQs and a second configured to use DMA cause by some incorrect register masking. vcnl4035 - Kconfig missing dependency Counter ti-eqep - wrong value for max_register as one beyond the end instead of the end. -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAl+yyk8RHGppYzIzQGtl cm5lbC5vcmcACgkQVIU0mcT0FohJURAAi5xCdV2cUQoAUCi2ldlwEVaIb+yCdHzk V1mB6O9sdJTmkaJx6qopkx/Dn0Sou1G48paKlzOFznno0no0PRY4IXDZq5TdR+M+ 0WTR6TR0UwT1AIVELzhMPuW9+G1YbWdxRDBEeXOAT+HrJftZLpgU2Mk3Of8h7p0n /CAaT60wwVf13VJ9ESjr7UKqisuMIt0zsqN5lS8sXRNnIx1mCZe1DFb6v1cs3w6n G77lub5ylNxu+u6l6YTYILNZ+Sj6NCiUzxVE7R5eSpmfk0+t7bFtA6tXXekZCdvq +DxzM2vnLXmx1PwNJOd97sArLI0I94AmOwrcphKaa/HAb4akRfcPhapsNjGMny/n I8AlCDrLcFZq//dAd0mt7A676CyL1sNG8w7wu91IDfsPkfYgInNemlft5nmwokGu f+crhtKoNJjaV5n1yIXDPgXpH0s6U5Hmy+ELMLr2rw85hwpwzJF0pYd4+DqRr8aI RCogkKP2ey80RfZlDl8qRTiz/wUSmlxsujjZPAYG+1Spaj08EiT3e5Oad+NmVuKr jcehdqOFZHXML7nK2ntpuQ7UW1xY4USyxjtPJ6q6IUVEiF4wTOGhHX9ePM1y38b1 W/F4Lcq2mAQ9rxoWP+FUHuB36zMB0cZ1KXVohvlciTkdl6DSyF4eio19M5MAYqxI RolkTf2Fa4A= =QQWq -----END PGP SIGNATURE----- Merge tag 'iio-fixes-for-5.10a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus Jonathan writes: First set of IIO and counter fixes for the 5.10 cycle. IIO cros_ec - Provide defauts for max and min frequency when older machines fail to return them correctly. ingenic-adc - Fix wrong vref value for JZ4770 SoC - Fix AUX / VBAT readings when touchscreen in use by pausing touchscreen readings during a read of these channels. kxcjk1013 - Fix an issue with KIOX010A ACPI id using devices which need to run a ACPI device specific method to avoid leaving the keyboard disabled. Includes a minor precursor patch to make this fix easier to do. mt6577-auxadc - Fix an issue with dev_comp not being set resulting in a null ptr deref. st_lsm6dsx - Set a 10ms min shub slave timeout to handle fast snesors where more time is needed to set up the config than the cycles allowed. stm32-adc - Fix an issue due to a clash between an ADC configured to use IRQs and a second configured to use DMA cause by some incorrect register masking. vcnl4035 - Kconfig missing dependency Counter ti-eqep - wrong value for max_register as one beyond the end instead of the end. * tag 'iio-fixes-for-5.10a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: iio: accel: kxcjk1013: Add support for KIOX010A ACPI DSM for setting tablet-mode iio: accel: kxcjk1013: Replace is_smo8500_device with an acpi_type enum iio: light: fix kconfig dependency bug for VCNL4035 iio/adc: ingenic: Fix AUX/VBAT readings when touchscreen is used iio/adc: ingenic: Fix battery VREF for JZ4770 SoC iio: imu: st_lsm6dsx: set 10ms as min shub slave timeout counter/ti-eqep: Fix regmap max_register iio: adc: stm32-adc: fix a regression when using dma and irq iio: adc: mediatek: fix unset field iio: cros_ec: Use default frequencies when EC returns invalid information |
|
Chen Yu | 1a371e67dc |
x86/microcode/intel: Check patch signature before saving microcode for early loading
Currently, scan_microcode() leverages microcode_matches() to check
if the microcode matches the CPU by comparing the family and model.
However, the processor stepping and flags of the microcode signature
should also be considered when saving a microcode patch for early
update.
Use find_matching_signature() in scan_microcode() and get rid of the
now-unused microcode_matches() which is a good cleanup in itself.
Complete the verification of the patch being saved for early loading in
save_microcode_patch() directly. This needs to be done there too because
save_mc_for_early() will call save_microcode_patch() too.
The second reason why this needs to be done is because the loader still
tries to support, at least hypothetically, mixed-steppings systems and
thus adds all patches to the cache that belong to the same CPU model
albeit with different steppings.
For example:
microcode: CPU: sig=0x906ec, pf=0x2, rev=0xd6
microcode: mc_saved[0]: sig=0x906e9, pf=0x2a, rev=0xd6, total size=0x19400, date = 2020-04-23
microcode: mc_saved[1]: sig=0x906ea, pf=0x22, rev=0xd6, total size=0x19000, date = 2020-04-27
microcode: mc_saved[2]: sig=0x906eb, pf=0x2, rev=0xd6, total size=0x19400, date = 2020-04-23
microcode: mc_saved[3]: sig=0x906ec, pf=0x22, rev=0xd6, total size=0x19000, date = 2020-04-27
microcode: mc_saved[4]: sig=0x906ed, pf=0x22, rev=0xd6, total size=0x19400, date = 2020-04-23
The patch which is being saved for early loading, however, can only be
the one which fits the CPU this runs on so do the signature verification
before saving.
[ bp: Do signature verification in save_microcode_patch()
and rewrite commit message. ]
Fixes:
|
|
Thomas Bogendoerfer | 61a2f1aecf |
MIPS: kernel: Fix for_each_memblock conversion
The loop over all memblocks works with PFNs and not physical
addresses, so we need for_each_mem_pfn_range().
Fixes:
|