mirror of https://gitee.com/openkylin/linux.git
10796 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
Linus Torvalds | 61a09258f2 |
Second RDMA 5.6 pull request
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Nothing particularly exciting, some small ODP regressions from the mmu notifier rework, another bunch of syzkaller fixes, and a bug fix for a botched syzkaller fix in the first rc pull request.
- Fix busted syzkaller fix in 'get_new_pps' - this turned out to crash on certain HW configurations
- Bug fixes for various missed things in error unwinds
- Add a missing rcu_read_lock annotation in hfi/qib
- Fix two ODP related regressions from the recent mmu notifier changes
- Several more syzkaller bugs in siw, RDMA netlink, verbs and iwcm
- Revert an old patch in CMA as it is now shown to not be allocating port numbers properly"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/iwcm: Fix iwcm work deallocation
RDMA/siw: Fix failure handling during device creation
RDMA/nldev: Fix crash when set a QP to a new counter but QPN is missing
RDMA/odp: Ensure the mm is still alive before creating an implicit child
RDMA/core: Fix protection fault in ib_mr_pool_destroy
IB/mlx5: Fix implicit ODP race
IB/hfi1, qib: Ensure RCU is locked when accessing list
RDMA/core: Fix pkey and port assignment in get_new_pps
RMDA/cm: Fix missing ib_cm_destroy_id() in ib_cm_insert_listen()
RDMA/rw: Fix error flow during RDMA context initialization
RDMA/core: Fix use of logical OR in get_new_pps
Revert "RDMA/cma: Simplify rdma_resolve_addr() error flow" |
|
Bernard Metzler | 810dbc6908 |
RDMA/iwcm: Fix iwcm work deallocation
The dealloc_work_entries() function must update the work_free_list pointer
while freeing its entries, since it may be called again on the same list. A
second iteration of the work list caused a system crash. This happens if
work allocation fails during cma_iw_listen() and free_cm_id() tries to
free the list again during cleanup.
Fixes:
|
|
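The iwcm fix above (810dbc6908) comes down to keeping the list head in step with the frees, so that a second cleanup pass finds an empty list. A minimal userspace sketch of that idea, assuming a simplified singly linked free list; `struct work_entry`, `struct owner`, `alloc_work_entries()` and `free_work_entries()` are hypothetical stand-ins, not the kernel's definitions:

```c
#include <stdlib.h>

/* Hypothetical stand-in for an entry on a per-connection free list. */
struct work_entry {
	struct work_entry *next;
};

/* Hypothetical owner object holding the head of the free list. */
struct owner {
	struct work_entry *free_list;
};

/* Allocate n entries and push them onto the free list; 0 on success. */
int alloc_work_entries(struct owner *o, int n)
{
	while (n-- > 0) {
		struct work_entry *e = malloc(sizeof(*e));

		if (!e)
			return -1;
		e->next = o->free_list;
		o->free_list = e;
	}
	return 0;
}

/*
 * Free every entry, advancing the head before each free so the list is
 * empty on return. A second call then finds a NULL head and does nothing
 * instead of walking freed memory. Returns the number of entries freed.
 */
int free_work_entries(struct owner *o)
{
	int freed = 0;

	while (o->free_list) {
		struct work_entry *e = o->free_list;

		o->free_list = e->next;
		free(e);
		freed++;
	}
	return freed;
}
```

Calling `free_work_entries()` twice on the same owner is harmless in this sketch: the second call returns 0 because the head was already reset.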
Bernard Metzler | 12e5eef0f4 |
RDMA/siw: Fix failure handling during device creation
A failing call to ib_device_set_netdev() during device creation caused a
system crash: device deallocation hit xa_destroy() on an uninitialized
xarray. Fixed by moving the xarray initialization before any potential
device deallocation.
Fixes:
|
|
Mark Zhang | 78f34a16c2 |
RDMA/nldev: Fix crash when set a QP to a new counter but QPN is missing
This fixes the kernel crash when an RDMA_NLDEV_CMD_STAT_SET command is
received but the QP number parameter is not available.
iwpm_register_pid: Unable to send a nlmsg (client = 2)
infiniband syz1: RDMA CMA: cma_listen_on_dev, error -98
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 PID: 9754 Comm: syz-executor069 Not tainted 5.6.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:nla_get_u32 include/net/netlink.h:1474 [inline]
RIP: 0010:nldev_stat_set_doit+0x63c/0xb70 drivers/infiniband/core/nldev.c:1760
Code: fc 01 0f 84 58 03 00 00 e8 41 83 bf fb 4c 8b a3 58 fd ff ff 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 04 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 6d
RSP: 0018:ffffc900068bf350 EFLAGS: 00010247
RAX: dffffc0000000000 RBX: ffffc900068bf728 RCX: ffffffff85b60470
RDX: 0000000000000000 RSI: ffffffff85b6047f RDI: 0000000000000004
RBP: ffffc900068bf750 R08: ffff88808c3ee140 R09: ffff8880a25e6010
R10: ffffed10144bcddc R11: ffff8880a25e6ee3 R12: 0000000000000000
R13: ffff88809acb0000 R14: ffff888092a42c80 R15: 000000009ef2e29a
FS: 0000000001ff0880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4733e34000 CR3: 00000000a9b27000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:195 [inline]
rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
rdma_nl_rcv+0x5d9/0x980 drivers/infiniband/core/netlink.c:259
netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1329
netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1918
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:672
____sys_sendmsg+0x753/0x880 net/socket.c:2343
___sys_sendmsg+0x100/0x170 net/socket.c:2397
__sys_sendmsg+0x105/0x1d0 net/socket.c:2430
__do_sys_sendmsg net/socket.c:2439 [inline]
__se_sys_sendmsg net/socket.c:2437 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4403d9
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc0efbc5c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004403d9
RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000004
RBP: 00000000006ca018 R08: 0000000000000008 R09: 00000000004002c8
R10: 000000000000004a R11: 0000000000000246 R12: 0000000000401c60
R13: 0000000000401cf0 R14: 0000000000000000 R15: 0000000000000000
Fixes:
|
|
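The nldev crash above (78f34a16c2) is a read of an optional attribute that was never supplied. A minimal sketch of the guard, assuming a parsed attribute table in which a NULL slot means "not present"; `struct attr`, the `ATTR_*` indices and `get_qpn()` are hypothetical and only loosely modelled on netlink attribute tables:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute indices; not the real RDMA netlink values. */
enum { ATTR_COUNTER_ID, ATTR_QPN, ATTR_MAX };

/* Hypothetical parsed attribute: a NULL table slot means "not supplied". */
struct attr {
	uint32_t value;
};

/*
 * Read the QP number only after confirming the attribute is present.
 * Returns 0 and stores the value in *qpn, or -EINVAL when the attribute
 * is missing, instead of dereferencing a NULL slot.
 */
int get_qpn(const struct attr *tb[ATTR_MAX], uint32_t *qpn)
{
	if (!tb[ATTR_QPN])
		return -EINVAL;
	*qpn = tb[ATTR_QPN]->value;
	return 0;
}
```

Returning an error for the missing attribute keeps the failure in the caller's hands rather than turning malformed input into a fault.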
Jason Gunthorpe | a4e63bce14 |
RDMA/odp: Ensure the mm is still alive before creating an implicit child
Registration of a mmu_notifier requires the caller to hold a mmget() on
the mm as registration is not permitted to race with exit_mmap(). There is
a BUG_ON inside the mmu_notifier to guard against this.
Normally creating a umem is done against current which implicitly holds
the mmget(), however an implicit ODP child is created from a pagefault
work queue and is not guaranteed to have a mmget().
Call mmget() around this registration and abort faulting if the MM has
gone to exit_mmap().
Before the patch below the notifier was registered when the implicit ODP
parent was created, so there was no chance to register a notifier outside
of current.
Fixes:
|
|
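The ODP fix above (a4e63bce14) takes a reference on the mm around the registration and aborts if the mm is already being torn down. A minimal userspace sketch of a "get unless zero" reference using C11 atomics; `struct mm`, `mm_get_not_zero()`, `mm_put()` and `create_child()` are hypothetical names for illustration and do not reproduce the kernel's mm or mmu_notifier code:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an address space with a user count. */
struct mm {
	atomic_int users;
};

/*
 * Take a reference only if the count is still non-zero. Once the count
 * has dropped to zero, teardown may be in progress, so the caller must
 * abort rather than register anything against the object.
 */
bool mm_get_not_zero(struct mm *mm)
{
	int old = atomic_load(&mm->users);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&mm->users, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference taken by mm_get_not_zero(). */
void mm_put(struct mm *mm)
{
	atomic_fetch_sub(&mm->users, 1);
}

/* Hypothetical fault-path step: register only while holding a reference. */
int create_child(struct mm *mm, int *registered)
{
	if (!mm_get_not_zero(mm))
		return -1;	/* mm is gone: abort the fault */
	(*registered)++;	/* stands in for notifier registration */
	mm_put(mm);
	return 0;
}
```

The compare-and-swap loop is what prevents a count of zero from being resurrected by a late caller.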
Maor Gottlieb | e38b55ea04 |
RDMA/core: Fix protection fault in ib_mr_pool_destroy
Fix a NULL pointer dereference in the error flow of ib_create_qp_user
when accessing the uninitialized list pointers rdma_mrs and sig_mrs.
The following crash from syzkaller revealed it.
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN PTI
CPU: 1 PID: 23167 Comm: syz-executor.1 Not tainted 5.5.0-rc5 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
RIP: 0010:ib_mr_pool_destroy+0x81/0x1f0
Code: 00 00 fc ff df 49 c1 ec 03 4d 01 fc e8 a8 ea 72 fe 41 80 3c 24 00
0f 85 62 01 00 00 48 8b 13 48 89 d6 4c 8d 6a c8 48 c1 ee 03 <42> 80 3c
3e 00 0f 85 34 01 00 00 48 8d 7a 08 4c 8b 02 48 89 fe 48
RSP: 0018:ffffc9000951f8b0 EFLAGS: 00010046
RAX: 0000000000040000 RBX: ffff88810f268038 RCX: ffffffff82c41628
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffc9000951f850
RBP: ffff88810f268020 R08: 0000000000000004 R09: fffff520012a3f0a
R10: 0000000000000001 R11: fffff520012a3f0a R12: ffffed1021e4d007
R13: ffffffffffffffc8 R14: 0000000000000246 R15: dffffc0000000000
FS: 00007f54bc788700(0000) GS:ffff88811b100000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000116920002 CR4: 0000000000360ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
rdma_rw_cleanup_mrs+0x15/0x30
ib_destroy_qp_user+0x674/0x7d0
ib_create_qp_user+0xb01/0x11c0
create_qp+0x1517/0x2130
ib_uverbs_create_qp+0x13e/0x190
ib_uverbs_write+0xaa5/0xdf0
__vfs_write+0x7c/0x100
vfs_write+0x168/0x4a0
ksys_write+0xc8/0x200
do_syscall_64+0x9c/0x390
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x465b49
Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89
f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f54bc787c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000465b49
RDX: 0000000000000040 RSI: 0000000020000540 RDI: 0000000000000003
RBP: 00007f54bc787c70 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f54bc7886bc
R13: 00000000004ca2ec R14: 000000000070ded0 R15: 0000000000000005
Fixes:
|
|
Artemy Kovalyov | de5ed007a0 |
IB/mlx5: Fix implicit ODP race
The following race may occur because of the call_srcu and the placement of
the synchronize_srcu vs the xa_erase.
    CPU0                               CPU1
mlx5_ib_free_implicit_mr:          destroy_unused_implicit_child_mr:
 xa_erase(odp_mkeys)
 synchronize_srcu()
                                    xa_lock(implicit_children)
                                    if (still in xarray)
                                       atomic_inc()
                                       call_srcu()
                                    xa_unlock(implicit_children)
 xa_erase(implicit_children):
   xa_lock(implicit_children)
   __xa_erase()
   xa_unlock(implicit_children)
 flush_workqueue()
                                   [..]
                                    free_implicit_child_mr_rcu:
                                     (via call_srcu)
                                      queue_work()
 WARN_ON(atomic_read())
                                   [..]
                                    free_implicit_child_mr_work:
                                     (via wq)
                                      free_implicit_child_mr()
 mlx5_mr_cache_invalidate()
                                     mlx5_ib_update_xlt() <-- UMR QP fail
                                     atomic_dec()
The wait_event() solves the race because it blocks until
free_implicit_child_mr_work() completes.
Fixes:
|
|
Dennis Dalessandro | 817a68a658 |
IB/hfi1, qib: Ensure RCU is locked when accessing list
The packet handling function, specifically the iteration of the QP list
for MAD packet processing, misses locking RCU before running through the
list. Not only is this incorrect, but the list_for_each_entry_rcu() call
cannot be called with a conditional check for lock dependency. Remedy
this by invoking the rcu lock and unlock around the critical section.
This brings MAD packet processing in line with what is done for non-MAD
packets.
Fixes:
|
|
Maor Gottlieb | 801b67f3ea |
RDMA/core: Fix pkey and port assignment in get_new_pps
When port is part of the modify mask, then we should take it from the
qp_attr and not from the old pps. Same for PKEY. Otherwise there are
panics in some configurations:
RIP: 0010:get_pkey_idx_qp_list+0x50/0x80 [ib_core]
Code: c7 18 e8 13 04 30 ef 0f b6 43 06 48 69 c0 b8 00 00 00 48 03 85 a0 04 00 00 48 8b 50 20 48 8d 48 20 48 39 ca 74 1a 0f b7 73 04 <66> 39 72 10 75 08 eb 10 66 39 72 10 74 0a 48 8b 12 48 39 ca 75 f2
RSP: 0018:ffffafb3480932f0 EFLAGS: 00010203
RAX: ffff98059ababa10 RBX: ffff980d926e8cc0 RCX: ffff98059ababa30
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff98059ababa28
RBP: ffff98059b940000 R08: 00000000000310c0 R09: ffff97fe47c07480
R10: 0000000000000036 R11: 0000000000000200 R12: 0000000000000071
R13: ffff98059b940000 R14: ffff980d87f948a0 R15: 0000000000000000
FS: 00007f88deb31740(0000) GS:ffff98059f600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 0000000853e26001 CR4: 00000000001606e0
Call Trace:
port_pkey_list_insert+0x3d/0x1b0 [ib_core]
? kmem_cache_alloc_trace+0x215/0x220
ib_security_modify_qp+0x226/0x3a0 [ib_core]
_ib_modify_qp+0xcf/0x390 [ib_core]
ipoib_init_qp+0x7f/0x200 [ib_ipoib]
? rvt_modify_port+0xd0/0xd0 [rdmavt]
? ib_find_pkey+0x99/0xf0 [ib_core]
ipoib_ib_dev_open_default+0x1a/0x200 [ib_ipoib]
ipoib_ib_dev_open+0x96/0x130 [ib_ipoib]
ipoib_open+0x44/0x130 [ib_ipoib]
__dev_open+0xd1/0x160
__dev_change_flags+0x1ab/0x1f0
dev_change_flags+0x23/0x60
do_setlink+0x328/0xe30
? __nla_validate_parse+0x54/0x900
__rtnl_newlink+0x54e/0x810
? __alloc_pages_nodemask+0x17d/0x320
? page_fault+0x30/0x50
? _cond_resched+0x15/0x30
? kmem_cache_alloc_trace+0x1c8/0x220
rtnl_newlink+0x43/0x60
rtnetlink_rcv_msg+0x28f/0x350
? kmem_cache_alloc+0x1fb/0x200
? _cond_resched+0x15/0x30
? __kmalloc_node_track_caller+0x24d/0x2d0
? rtnl_calcit.isra.31+0x120/0x120
netlink_rcv_skb+0xcb/0x100
netlink_unicast+0x1e0/0x340
netlink_sendmsg+0x317/0x480
? __check_object_size+0x48/0x1d0
sock_sendmsg+0x65/0x80
____sys_sendmsg+0x223/0x260
? copy_msghdr_from_user+0xdc/0x140
___sys_sendmsg+0x7c/0xc0
? skb_dequeue+0x57/0x70
? __inode_wait_for_writeback+0x75/0xe0
? fsnotify_grab_connector+0x45/0x80
? __dentry_kill+0x12c/0x180
__sys_sendmsg+0x58/0xa0
do_syscall_64+0x5b/0x200
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f88de467f10
Link: https://lore.kernel.org/r/20200227125728.100551-1-leon@kernel.org
Cc: <stable@vger.kernel.org>
Fixes:
|
|
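The rule stated in the get_new_pps fix above (801b67f3ea) is that a field named in the modify mask comes from the new attributes, and anything else is carried over from the old settings. A minimal sketch of that selection, assuming simplified structures; `struct attr_in`, `struct pps`, the `MASK_*` bits and `new_pps()` are hypothetical and omit the rest of the kernel function's logic:

```c
#include <stdint.h>

/* Hypothetical mask bits and structures; not the kernel's definitions. */
#define MASK_PKEY_INDEX (1u << 4)
#define MASK_PORT       (1u << 5)

struct attr_in {
	uint16_t pkey_index;
	uint8_t port_num;
};

struct pps {
	uint16_t pkey_index;
	uint8_t port_num;
};

/*
 * Build the new settings: a field named in the mask comes from the
 * request, any other field is carried over from the old settings.
 */
struct pps new_pps(const struct pps *old, const struct attr_in *attr,
		   unsigned int mask)
{
	struct pps out = *old;

	if (mask & MASK_PORT)
		out.port_num = attr->port_num;
	if (mask & MASK_PKEY_INDEX)
		out.pkey_index = attr->pkey_index;
	return out;
}
```

Starting from a copy of the old settings and overriding per mask bit keeps each field's source explicit.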
Jason Gunthorpe | c14dfddbd8 |
RMDA/cm: Fix missing ib_cm_destroy_id() in ib_cm_insert_listen()
The algorithm pre-allocates a cm_id since allocation cannot be done while
holding the cm.lock spinlock, however it doesn't free it on one error
path, leading to a memory leak.
Fixes:
|
|
Linus Torvalds | b98b809c0a |
SCSI fixes on 20200221
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Four non-core fixes. Two are reverts of target fixes which turned out to have unwanted side effects, one is a revert of an RDMA fix with the same problem and the final one fixes an incorrect warning about memory allocation failures in megaraid_sas (the driver actually reduces the allocation size until it succeeds)"
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: Revert "target: iscsi: Wait for all commands to finish before freeing a session"
scsi: Revert "RDMA/isert: Fix a recently introduced regression related to logout"
scsi: megaraid_sas: silence a warning
scsi: Revert "target/core: Inline transport_lun_remove_cmd()" |
|
Max Gurtovoy | 6affca140c |
RDMA/rw: Fix error flow during RDMA context initialization
If the SGL was mapped for a P2P DMA operation, it must be unmapped with
pci_p2pdma_unmap_sg() during the error unwind of rdma_rw_ctx_init().
Fixes:
|
|
Nathan Chancellor | 4ca501d6aa |
RDMA/core: Fix use of logical OR in get_new_pps
Clang warns:
../drivers/infiniband/core/security.c:351:41: warning: converting the
enum constant to a boolean [-Wint-in-bool-context]
if (!(qp_attr_mask & (IB_QP_PKEY_INDEX || IB_QP_PORT)) && qp_pps) {
^
1 warning generated.
A bitwise OR should have been used instead.
Fixes:
|
|
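The warning in the commit above (4ca501d6aa) points at a real logic error: a logical OR of two non-zero constants evaluates to 1, so the mask test only ever looked at bit 0. A minimal sketch contrasting the two forms; the `FLAG_*` values and function names are hypothetical, and the buggy form is written as `mask & 1u` to show what it reduces to:

```c
#include <stdbool.h>

/* Hypothetical flag values chosen for illustration. */
enum {
	FLAG_PKEY_INDEX = 1 << 4,
	FLAG_PORT       = 1 << 5,
};

/*
 * What the buggy test computed: a logical OR of two non-zero constants
 * is 1, so "mask & (A || B)" only ever tests bit 0 of the mask.
 */
bool neither_set_buggy(unsigned int mask)
{
	return !(mask & 1u);
}

/* The bitwise OR builds the intended two-bit mask. */
bool neither_set_fixed(unsigned int mask)
{
	return !(mask & (FLAG_PKEY_INDEX | FLAG_PORT));
}
```

With only `FLAG_PORT` set, the buggy form still reports that neither flag is set, while the fixed form does not.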
Parav Pandit | e4103312d7 |
Revert "RDMA/cma: Simplify rdma_resolve_addr() error flow"
This reverts commit |
|
Bart Van Assche | 76261ada16 |
scsi: Revert "RDMA/isert: Fix a recently introduced regression related to logout"
Since commit |
|
Jason Gunthorpe | 685eff5131 |
IB/mlx5: Use div64_u64 for num_var_hw_entries calculation
On i386:
ERROR: "__udivdi3" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined!
ERROR: "__divdi3" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined!
Fixes:
|
|
Leon Romanovsky | 1dd017882e |
RDMA/core: Fix protection fault in get_pkey_idx_qp_list
We don't need to set pkey as valid when the user sets only one of pkey
index or port number; otherwise it results in a NULL pointer
dereference while accessing the uninitialized pkey list. The following
crash from Syzkaller revealed it.
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN PTI
CPU: 1 PID: 14753 Comm: syz-executor.2 Not tainted 5.5.0-rc5 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
RIP: 0010:get_pkey_idx_qp_list+0x161/0x2d0
Code: 01 00 00 49 8b 5e 20 4c 39 e3 0f 84 b9 00 00 00 e8 e4 42 6e fe 48
8d 7b 10 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04
02 84 c0 74 08 3c 01 0f 8e d0 00 00 00 48 8d 7d 04 48 b8
RSP: 0018:ffffc9000bc6f950 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff82c8bdec
RDX: 0000000000000002 RSI: ffffc900030a8000 RDI: 0000000000000010
RBP: ffff888112c8ce80 R08: 0000000000000004 R09: fffff5200178df1f
R10: 0000000000000001 R11: fffff5200178df1f R12: ffff888115dc4430
R13: ffff888115da8498 R14: ffff888115dc4410 R15: ffff888115da8000
FS: 00007f20777de700(0000) GS:ffff88811b100000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2f721000 CR3: 00000001173ca002 CR4: 0000000000360ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
port_pkey_list_insert+0xd7/0x7c0
ib_security_modify_qp+0x6fa/0xfc0
_ib_modify_qp+0x8c4/0xbf0
modify_qp+0x10da/0x16d0
ib_uverbs_modify_qp+0x9a/0x100
ib_uverbs_write+0xaa5/0xdf0
__vfs_write+0x7c/0x100
vfs_write+0x168/0x4a0
ksys_write+0xc8/0x200
do_syscall_64+0x9c/0x390
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes:
|
|
Zhu Yanjun | 8ac0e6641c |
RDMA/rxe: Fix soft lockup problem due to using tasklets in softirq
When running stress tests with RXE, the following call traces often occur:
watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [swapper/2:0]
...
Call Trace:
<IRQ>
create_object+0x3f/0x3b0
kmem_cache_alloc_node_trace+0x129/0x2d0
__kmalloc_reserve.isra.52+0x2e/0x80
__alloc_skb+0x83/0x270
rxe_init_packet+0x99/0x150 [rdma_rxe]
rxe_requester+0x34e/0x11a0 [rdma_rxe]
rxe_do_task+0x85/0xf0 [rdma_rxe]
tasklet_action_common.isra.21+0xeb/0x100
__do_softirq+0xd0/0x298
irq_exit+0xc5/0xd0
smp_apic_timer_interrupt+0x68/0x120
apic_timer_interrupt+0xf/0x20
</IRQ>
...
The root cause is that a tasklet is actually a softirq. In a tasklet
handler, another softirq handler is triggered. Usually these softirq
handlers run on the same CPU core, so this causes the "soft lockup" BUG.
Fixes:
|
|
Leon Romanovsky | 9b6d3bbc13 |
RDMA/mlx5: Prevent overflow in mmap offset calculations
The cmd and index variables are declared as u16 and the result is supposed
to be stored in a u64. The C arithmetic rules don't promote "(index >> 8) <<
16" to u64, so the end result is left as u16.
Fixes:
|
|
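The overflow fix above (9b6d3bbc13) is an instance of a general C pitfall: a shift is evaluated in the type of its promoted operand, and widening the result afterwards does not bring back bits that were already lost. A minimal sketch, assuming 32-bit unsigned operands (rather than the u16 variables the commit describes) so that the loss of high bits is well defined and easy to observe; `SKETCH_PAGE_SHIFT` and both function names are hypothetical:

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages, for illustration */

/*
 * Shift performed in 32 bits: the high bits are lost before the result
 * is widened to 64 bits.
 */
uint64_t offset_truncated(uint32_t v)
{
	return v << SKETCH_PAGE_SHIFT;
}

/* Widen first, then shift: the full value survives. */
uint64_t offset_widened(uint32_t v)
{
	return (uint64_t)v << SKETCH_PAGE_SHIFT;
}
```

The cast has to be applied to the operand before the shift; casting the finished expression is too late.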
Yonatan Cohen | 9ea04d0df6 |
IB/umad: Fix kernel crash while unloading ib_umad
When disassociating a device from umad we must ensure that the sysfs
access is prevented before blocking the fops, otherwise assumptions in
sysfs don't hold:
    CPU0                     CPU1
ib_umad_kill_port()      ibdev_show()
  port->ib_dev = NULL
                           dev_name(port->ib_dev)
The prior patch made an error in moving the device_destroy(); it should
have been split into device_del() (above) and put_device() (below). At
this point we already have the split, so move the device_del() back to its
original place.
kernel stack
PF: error_code(0x0000) - not-present page
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
RIP: 0010:ibdev_show+0x18/0x50 [ib_umad]
RSP: 0018:ffffc9000097fe40 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffffffa0441120 RCX: ffff8881df514000
RDX: ffff8881df514000 RSI: ffffffffa0441120 RDI: ffff8881df1e8870
RBP: ffffffff81caf000 R08: ffff8881df1e8870 R09: 0000000000000000
R10: 0000000000001000 R11: 0000000000000003 R12: ffff88822f550b40
R13: 0000000000000001 R14: ffffc9000097ff08 R15: ffff8882238bad58
FS: 00007f1437ff3740(0000) GS:ffff888236940000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000004e8 CR3: 00000001e0dfc001 CR4: 00000000001606e0
Call Trace:
dev_attr_show+0x15/0x50
sysfs_kf_seq_show+0xb8/0x1a0
seq_read+0x12d/0x350
vfs_read+0x89/0x140
ksys_read+0x55/0xd0
do_syscall_64+0x55/0x1b0
entry_SYSCALL_64_after_hwframe+0x44/0xa9:
Fixes:
|
|
Yishai Hadas | a8af8694a5 |
RDMA/mlx5: Fix async events cleanup flows
As in the prior patch, the devx code is not fully cleaning up its
event_lists before finishing driver_destroy, allowing a later read to
trigger use after free conditions.
Re-arrange things so that the event_list is always empty after destroy and
ensure it remains empty until the file is closed.
Fixes:
|
|
Michael Guralnik | a0767da777 |
RDMA/core: Add missing list deletion on freeing event queue
When the uobject file scheme was revised to allow device disassociation
from the file it became possible for read() to still happen after the
driver destroys the uobject.
The old close code was not tolerant to concurrent read, and when it was
moved to the driver destroy it created a bug.
Ensure the event_list is empty after driver destroy by adding the missing
list_del(). Otherwise read() can trigger a use after free and double
kfree.
Fixes:
|
|
Krishnamraju Eraparaju | 663218a3e7 |
RDMA/siw: Remove unwanted WARN_ON in siw_cm_llp_data_ready()
Warnings like below can fill up the dmesg while disconnecting RDMA
connections. Hence, remove the unwanted WARN_ON.
WARNING: CPU: 6 PID: 0 at drivers/infiniband/sw/siw/siw_cm.c:1229
siw_cm_llp_data_ready+0xc1/0xd0 [siw]
RIP: 0010:siw_cm_llp_data_ready+0xc1/0xd0 [siw]
Call Trace:
<IRQ>
tcp_data_queue+0x226/0xb40
tcp_rcv_established+0x220/0x620
tcp_v4_do_rcv+0x12a/0x1e0
tcp_v4_rcv+0xb05/0xc00
ip_local_deliver_finish+0x69/0x210
ip_local_deliver+0x6b/0xe0
ip_rcv+0x273/0x362
__netif_receive_skb_core+0xb35/0xc30
netif_receive_skb_internal+0x3d/0xb0
napi_gro_frags+0x13b/0x200
t4_ethrx_handler+0x433/0x7d0 [cxgb4]
process_responses+0x318/0x580 [cxgb4]
napi_rx_handler+0x14/0x100 [cxgb4]
net_rx_action+0x149/0x3b0
__do_softirq+0xe3/0x30a
irq_exit+0x100/0x110
do_IRQ+0x7f/0xe0
common_interrupt+0xf/0xf
</IRQ>
Link: https://lore.kernel.org/r/20200207141429.27927-1-krishna2@chelsio.com
Signed-off-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> |
|
Krishnamraju Eraparaju | d219face90 |
RDMA/iw_cxgb4: initiate CLOSE when entering TERM
As per draft-hilland-iwarp-verbs-v1.0, sec 6.2.3, always initiate a CLOSE
when entering into TERM state.
In c4iw_modify_qp(), the disconnect operation should only be performed when
the modify_qp call is invoked from ib_core. All other internal modify_qp
calls (invoked within iw_cxgb4) that need 'disconnect' should call
c4iw_ep_disconnect() explicitly after modify_qp. Otherwise, deadlocks
like below can occur:
Call Trace:
schedule+0x2f/0xa0
schedule_preempt_disabled+0xa/0x10
__mutex_lock.isra.5+0x2d0/0x4a0
c4iw_ep_disconnect+0x39/0x430 => tries to reacquire ep lock again
c4iw_modify_qp+0x468/0x10d0
rx_data+0x218/0x570 => acquires ep lock
process_work+0x5f/0x70
process_one_work+0x1a7/0x3b0
worker_thread+0x30/0x390
kthread+0x112/0x130
ret_from_fork+0x35/0x40
Fixes:
|
|
Mark Zhang | 10189e8e6f |
IB/mlx5: Return failure when rts2rts_qp_counters_set_id is not supported
When binding a QP with a counter and the QP state is not RESET, return
failure if the rts2rts_qp_counters_set_id is not supported by the
device.
This is to prevent cases like manual bind for Connect-IB devices from
returning success when the feature is not supported.
Fixes:
|
|
Avihai Horon | a72f4ac1d7 |
RDMA/core: Fix invalid memory access in spec_filter_size
Add a check that the size specified in the flow spec header doesn't cause
an overflow when calculating the filter size, and thus prevent access to
invalid memory. The following crash from syzkaller revealed it.
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN PTI
CPU: 1 PID: 17834 Comm: syz-executor.3 Not tainted 5.5.0-rc5 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
RIP: 0010:memchr_inv+0xd3/0x330
Code: 89 f9 89 f5 83 e1 07 0f 85 f9 00 00 00 49 89 d5 49 c1 ed 03 45 85
ed 74 6f 48 89 d9 48 b8 00 00 00 00 00 fc ff df 48 c1 e9 03 <80> 3c 01
00 0f 85 0d 02 00 00 44 0f b6 e5 48 b8 01 01 01 01 01 01
RSP: 0018:ffffc9000a13fa50 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 7fff88810de9d820 RCX: 0ffff11021bd3b04
RDX: 000000000000fff8 RSI: 0000000000000000 RDI: 7fff88810de9d820
RBP: 0000000000000000 R08: ffff888110d69018 R09: 0000000000000009
R10: 0000000000000001 R11: ffffed10236267cc R12: 0000000000000004
R13: 0000000000001fff R14: ffff88810de9d820 R15: 0000000000000040
FS: 00007f9ee0e51700(0000) GS:ffff88811b100000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000115ea0006 CR4: 0000000000360ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
spec_filter_size.part.16+0x34/0x50
ib_uverbs_kern_spec_to_ib_spec_filter+0x691/0x770
ib_uverbs_ex_create_flow+0x9ea/0x1b40
ib_uverbs_write+0xaa5/0xdf0
__vfs_write+0x7c/0x100
vfs_write+0x168/0x4a0
ksys_write+0xc8/0x200
do_syscall_64+0x9c/0x390
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x465b49
Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89
f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f9ee0e50c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000465b49
RDX: 00000000000003a0 RSI: 00000000200007c0 RDI: 0000000000000004
RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9ee0e516bc
R13: 00000000004ca2da R14: 000000000070deb8 R15: 00000000ffffffff
Modules linked in:
Dumping ftrace buffer:
(ftrace buffer empty)
Fixes:
|
|
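The spec_filter_size fix above (a72f4ac1d7) validates a user-supplied size before using it in arithmetic. A minimal sketch of that kind of check, assuming the payload after the header is a value filter followed by an equally sized mask filter; `struct spec_hdr` and `filter_size()` are hypothetical and not the uverbs definitions:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header that prefixes a user-supplied spec. */
struct spec_hdr {
	uint32_t type;
	uint16_t size;		/* total spec size claimed by the caller */
	uint16_t reserved;
};

/*
 * Derive the per-filter size from the claimed total, assuming the
 * payload is a value filter followed by a mask filter of equal size.
 * Rejects totals smaller than the header, which would otherwise make
 * the unsigned subtraction wrap to a huge length.
 */
int filter_size(uint16_t total, size_t *out)
{
	if (total < sizeof(struct spec_hdr))
		return -EINVAL;
	*out = (total - sizeof(struct spec_hdr)) / 2;
	return 0;
}
```

Rejecting the undersized total up front means no later code ever sees a length derived from a wrapped subtraction.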
Kaike Wan | f92e487188 |
IB/rdmavt: Reset all QPs when the device is shut down
When the hfi1 device is shut down during a system reboot, it is possible
that some QPs might not have been freed by ULPs. More requests could be
post sent and a lingering timer could be triggered to schedule more packet
sends, leading to a crash:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000102
IP: [ffffffff810a65f2] __queue_work+0x32/0x3c0
PGD 0
Oops: 0000 1 SMP
Modules linked in: nvmet_rdma(OE) nvmet(OE) nvme(OE) dm_round_robin nvme_rdma(OE) nvme_fabrics(OE) nvme_core(OE) pal_raw(POE) pal_pmt(POE) pal_cache(POE) pal_pile(POE) pal(POE) pal_compatible(OE) rpcrdma sunrpc ib_isert iscsi_target_mod target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support mxm_wmi ipmi_ssif pcspkr ses enclosure joydev scsi_transport_sas i2c_i801 sg mei_me lpc_ich mei ioatdma shpchp ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_power_meter acpi_pad dm_multipath hangcheck_timer ip_tables ext4 mbcache jbd2 mlx4_en
sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm mlx4_core crct10dif_pclmul crct10dif_common hfi1(OE) igb crc32c_intel rdmavt(OE) ahci ib_core libahci libata ptp megaraid_sas pps_core dca i2c_algo_bit i2c_core devlink dm_mirror dm_region_hash dm_log dm_mod
CPU: 23 PID: 0 Comm: swapper/23 Tainted: P OE ------------ 3.10.0-693.el7.x86_64 #1
Hardware name: Intel Corporation S2600CWR/S2600CWR, BIOS SE5C610.86B.01.01.0028.121720182203 12/17/2018
task: ffff8808f4ec4f10 ti: ffff8808f4ed8000 task.ti: ffff8808f4ed8000
RIP: 0010:[ffffffff810a65f2] [ffffffff810a65f2] __queue_work+0x32/0x3c0
RSP: 0018:ffff88105df43d48 EFLAGS: 00010046
RAX: 0000000000000086 RBX: 0000000000000086 RCX: 0000000000000000
RDX: ffff880f74e758b0 RSI: 0000000000000000 RDI: 000000000000001f
RBP: ffff88105df43d80 R08: ffff8808f3c583c8 R09: ffff8808f3c58000
R10: 0000000000000002 R11: ffff88105df43da8 R12: ffff880f74e758b0
R13: 000000000000001f R14: 0000000000000000 R15: ffff88105a300000
FS: 0000000000000000(0000) GS:ffff88105df40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000102 CR3: 00000000019f2000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffff88105b6dd708 0000001f00000286 0000000000000086 ffff88105a300000
ffff880f74e75800 0000000000000000 ffff88105a300000 ffff88105df43d98
ffffffff810a6b85 ffff88105a301e80 ffff88105df43dc8 ffffffffc0224cde
Call Trace:
IRQ
[ffffffff810a6b85] queue_work_on+0x45/0x50
[ffffffffc0224cde] _hfi1_schedule_send+0x6e/0xc0 [hfi1]
[ffffffffc0170570] ? get_map_page+0x60/0x60 [rdmavt]
[ffffffffc0224d62] hfi1_schedule_send+0x32/0x70 [hfi1]
[ffffffffc0170644] rvt_rc_timeout+0xd4/0x120 [rdmavt]
[ffffffffc0170570] ? get_map_page+0x60/0x60 [rdmavt]
[ffffffff81097316] call_timer_fn+0x36/0x110
[ffffffffc0170570] ? get_map_page+0x60/0x60 [rdmavt]
[ffffffff8109982d] run_timer_softirq+0x22d/0x310
[ffffffff81090b3f] __do_softirq+0xef/0x280
[ffffffff816b6a5c] call_softirq+0x1c/0x30
[ffffffff8102d3c5] do_softirq+0x65/0xa0
[ffffffff81090ec5] irq_exit+0x105/0x110
[ffffffff816b76c2] smp_apic_timer_interrupt+0x42/0x50
[ffffffff816b5c1d] apic_timer_interrupt+0x6d/0x80
EOI
[ffffffff81527a02] ? cpuidle_enter_state+0x52/0xc0
[ffffffff81527b48] cpuidle_idle_call+0xd8/0x210
[ffffffff81034fee] arch_cpu_idle+0xe/0x30
[ffffffff810e7bca] cpu_startup_entry+0x14a/0x1c0
[ffffffff81051af6] start_secondary+0x1b6/0x230
Code: 89 e5 41 57 41 56 49 89 f6 41 55 41 89 fd 41 54 49 89 d4 53 48 83 ec 10 89 7d d4 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 02 00 00 41 f6 86 02 01 00 00 01 0f 85 58 02 00 00 49 c7 c7 28 19 01 00
RIP [ffffffff810a65f2] __queue_work+0x32/0x3c0
RSP ffff88105df43d48
CR2: 0000000000000102
The solution is to reset the QPs before the device resources are freed.
This reset will change the QP state to prevent post sends and delete
timers to prevent callbacks.
Fixes:
|
|
Mike Marciniszyn | be8638344c |
IB/hfi1: Close window for pq and request coliding
Cleaning up a pq can result in the following warning and panic:
WARNING: CPU: 52 PID: 77418 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0
list_del corruption, ffff88cb2c6ac068->next is LIST_POISON1 (dead000000000100)
Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables
nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit]
CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G OE ------------ 3.10.0-957.38.3.el7.x86_64 #1
Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019
Call Trace:
[<ffffffff90365ac0>] dump_stack+0x19/0x1b
[<ffffffff8fc98b78>] __warn+0xd8/0x100
[<ffffffff8fc98bff>] warn_slowpath_fmt+0x5f/0x80
[<ffffffff8ff970c3>] __list_del_entry+0x63/0xd0
[<ffffffff8ff9713d>] list_del+0xd/0x30
[<ffffffff8fddda70>] kmem_cache_destroy+0x50/0x110
[<ffffffffc0328130>] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1]
[<ffffffffc02e2350>] hfi1_file_close+0x70/0x1e0 [hfi1]
[<ffffffff8fe4519c>] __fput+0xec/0x260
[<ffffffff8fe453fe>] ____fput+0xe/0x10
[<ffffffff8fcbfd1b>] task_work_run+0xbb/0xe0
[<ffffffff8fc2bc65>] do_notify_resume+0xa5/0xc0
[<ffffffff90379134>] int_signal+0x12/0x17
BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300
PGD 2cdab19067 PUD 2f7bfdb067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables
nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit]
CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.38.3.el7.x86_64 #1
Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019
task: ffff88cc26db9040 ti: ffff88b5393a8000 task.ti: ffff88b5393a8000
RIP: 0010:[<ffffffff8fe1f93e>] [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300
RSP: 0018:ffff88b5393abd60 EFLAGS: 00010287
RAX: 0000000000000000 RBX: ffff88cb2c6ac000 RCX: 0000000000000003
RDX: 0000000000000400 RSI: 0000000000000400 RDI: ffffffff9095b800
RBP: ffff88b5393abdb0 R08: ffffffff9095b808 R09: ffffffff8ff77c19
R10: ffff88b73ce1f160 R11: ffffddecddde9800 R12: ffff88cb2c6ac000
R13: 000000000000000c R14: ffff88cf3fdca780 R15: 0000000000000000
FS: 00002aaaaab52500(0000) GS:ffff88b73ce00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 0000002d27664000 CR4: 00000000007607e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
[<ffffffff8fe20d44>] __kmem_cache_shutdown+0x14/0x80
[<ffffffff8fddda78>] kmem_cache_destroy+0x58/0x110
[<ffffffffc0328130>] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1]
[<ffffffffc02e2350>] hfi1_file_close+0x70/0x1e0 [hfi1]
[<ffffffff8fe4519c>] __fput+0xec/0x260
[<ffffffff8fe453fe>] ____fput+0xe/0x10
[<ffffffff8fcbfd1b>] task_work_run+0xbb/0xe0
[<ffffffff8fc2bc65>] do_notify_resume+0xa5/0xc0
[<ffffffff90379134>] int_signal+0x12/0x17
Code: 00 00 ba 00 04 00 00 0f 4f c2 3d 00 04 00 00 89 45 bc 0f 84 e7 01 00 00 48 63 45 bc 49 8d 04 c4 48 89 45 b0 48 8b 80 c8 00 00 00 <48> 8b 78 10 48 89 45 c0 48 83 c0 10 48 89 45 d0 48 8b 17 48 39
RIP [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300
RSP <ffff88b5393abd60>
CR2: 0000000000000010
The panic is the result of slab entries being freed during the destruction
of the pq slab.
The code attempts to quiesce the pq, but looking for n_req == 0 doesn't
account for new requests.
Fix the issue by using SRCU to get a pq pointer and by adjusting the pq free
logic to NULL the fd pq pointer prior to the quiesce.
Fixes:
|
|
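The ordering in the pq fix above (unpublish the pointer first, then quiesce) can be illustrated outside the kernel. The following is a userspace analogy under stated assumptions, not the hfi1 code: it swaps SRCU for a small reader counter built on C11 atomics, and every name in it (`struct file_data`, `pq_get`, `pq_put`, `pq_unpublish`, `pq_quiesced`) is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the packet queue. */
struct pkt_queue {
    atomic_int n_reqs;                  /* requests currently using the queue */
};

/* Hypothetical stand-in for the per-file data that publishes the queue. */
struct file_data {
    _Atomic(struct pkt_queue *) pq;     /* NULL once teardown has begun */
    atomic_int readers;                 /* callers inside pq_get() right now */
};

/* Submitter side: take a request reference only while the queue is published. */
static struct pkt_queue *pq_get(struct file_data *fd)
{
    struct pkt_queue *pq;

    atomic_fetch_add(&fd->readers, 1);  /* enter the read-side section */
    pq = atomic_load(&fd->pq);
    if (pq)
        atomic_fetch_add(&pq->n_reqs, 1);
    atomic_fetch_sub(&fd->readers, 1);  /* leave the read-side section */
    return pq;                          /* NULL means: reject the new request */
}

static void pq_put(struct pkt_queue *pq)
{
    int prev = atomic_fetch_sub(&pq->n_reqs, 1);

    assert(prev > 0);
    (void)prev;
}

/* Teardown step 1: hide the queue so that no new request can find it. */
static struct pkt_queue *pq_unpublish(struct file_data *fd)
{
    return atomic_exchange(&fd->pq, (struct pkt_queue *)NULL);
}

/* Teardown step 2: safe to free only when no reader and no request remains. */
static bool pq_quiesced(struct file_data *fd, struct pkt_queue *pq)
{
    return atomic_load(&fd->readers) == 0 && atomic_load(&pq->n_reqs) == 0;
}
```

A teardown path in this sketch would call `pq_unpublish()`, wait until `pq_quiesced()` returns true, and only then free the queue. Checking the request count alone, without first hiding the pointer, would leave the same window for new requests that the commit describes.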
Kaike Wan | a70ed0f2e6 |
IB/hfi1: Acquire lock to release TID entries when user file is closed
Each user context is allocated a certain number of RcvArray (TID)
entries and these entries are managed through TID groups. These groups
are put into one of three lists in each user context: tid_group_list,
tid_used_list, and tid_full_list, depending on the number of used TID
entries within each group. When TID packets are expected, one or more
TID groups will be allocated. After the packets are received, the TID
groups will be freed. Since multiple user threads may access the TID
groups simultaneously, a mutex exp_mutex is used to synchronize the
access. However, when the user file is closed, the driver tries to release
all TID groups without acquiring the mutex first, which risks a race
condition with another thread that may be releasing its TID groups,
leading to data corruption.
This patch addresses the issue by acquiring the mutex first before
releasing the TID groups when the file is closed.
Fixes:
|
|
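The locking rule in the TID commit above (the close path takes the same mutex as every other path that touches the groups) can be sketched in userspace. This is an illustration only, assuming POSIX threads; `struct tid_ctx`, its counters, and both functions are hypothetical and much simpler than the driver's three lists.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a user context's TID bookkeeping. */
struct tid_ctx {
    pthread_mutex_t exp_mutex;   /* serializes all access to the counters */
    size_t used_groups;          /* groups currently handed out */
    size_t free_groups;          /* groups available for allocation */
};

/* Move every used group back to the free pool. Caller must hold exp_mutex. */
static void release_all_locked(struct tid_ctx *ctx)
{
    ctx->free_groups += ctx->used_groups;
    ctx->used_groups = 0;
}

/* Close path: take the same mutex that the alloc and free paths use. */
static void tid_ctx_close(struct tid_ctx *ctx)
{
    assert(ctx != NULL);
    pthread_mutex_lock(&ctx->exp_mutex);
    release_all_locked(ctx);
    pthread_mutex_unlock(&ctx->exp_mutex);
}
```

Some toolchains may need `-pthread` when building this sketch.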
Kamal Heib | 8a4f300b97 |
RDMA/hfi1: Fix memory leak in _dev_comp_vect_mappings_create
Make sure to free the allocated cpumask_var_t's to avoid the following
memory leak reported by kmemleak:
$ cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff8897f812d6a8 (size 8):
comm "kworker/1:1", pid 347, jiffies 4294751400 (age 101.703s)
hex dump (first 8 bytes):
00 00 00 00 00 00 00 00 ........
backtrace:
[<00000000bff49664>] alloc_cpumask_var_node+0x4c/0xb0
[<0000000075d3ca81>] hfi1_comp_vectors_set_up+0x20f/0x800 [hfi1]
[<0000000098d420df>] hfi1_init_dd+0x3311/0x4960 [hfi1]
[<0000000071be7e52>] init_one+0x25e/0xf10 [hfi1]
[<000000005483d4c2>] local_pci_probe+0xd4/0x180
[<000000007c3cbc6e>] work_for_cpu_fn+0x51/0xa0
[<000000001d626905>] process_one_work+0x8f0/0x17b0
[<000000007e569e7e>] worker_thread+0x536/0xb50
[<00000000fd39a4a5>] kthread+0x30c/0x3d0
[<0000000056f2edb3>] ret_from_fork+0x3a/0x50
Fixes:
|
|
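The class of leak in the kmemleak report above is a temporary allocation that some exit path forgets to free. A minimal userspace sketch of the usual remedy, a single exit label that releases every temporary, might look like this. It assumes plain `calloc`/`free` in place of the kernel's cpumask allocators; `live_allocs`, `mask_alloc`, `mask_free`, `mappings_create`, and the `fail` hook are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocs;   /* instrumentation for this sketch only */

static unsigned long *mask_alloc(size_t words)
{
    unsigned long *m = calloc(words, sizeof(*m));

    if (m)
        live_allocs++;
    return m;
}

static void mask_free(unsigned long *m)
{
    if (m) {
        live_allocs--;
        free(m);
    }
}

/*
 * Build a mapping with two scratch masks. 'fail' simulates an error that
 * occurs after both allocations succeeded.
 */
static int mappings_create(size_t words, int fail)
{
    unsigned long *available = NULL, *non_intr = NULL;
    int ret = -1;

    available = mask_alloc(words);
    if (!available)
        goto out;
    non_intr = mask_alloc(words);
    if (!non_intr)
        goto out;
    if (fail)
        goto out;

    /* ... real work would consume the scratch masks here ... */
    ret = 0;
out:
    /* Both the success path and every error path release the temporaries. */
    mask_free(non_intr);
    mask_free(available);
    assert(live_allocs >= 0);
    return ret;
}
```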
Linus Torvalds | 8fdd4019bc |
RDMA subsystem updates for 5.6
- Driver updates and cleanup for qedr, bnxt_re, hns, siw, mlx5, mlx4, rxe, i40iw - Larger series doing cleanup and rework for hns and hfi1. - Some general reworking of the CM code to make it a little more understandable - Unify the different code paths connected to the uverbs FD scheme - New UAPI ioctls conversions for get context and get async fd - Trace points for CQ and CM portions of the RDMA stack - mlx5 driver support for virtio-net formatted rings as RDMA raw ethernet QPs - verbs support for setting the PCI-E relaxed ordering bit on DMA traffic connected to a MR - A couple of bug fixes that came too late to make rc7 -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl4zPwQACgkQOG33FX4g mxoURw//fuQmuJ7aTMH+0qrhaZUmzXOcI/WKvY0YMyYLvxolRcIO+uCL239wxezR 9iTHPO7HeYXUQ4W8Hi/fTyuQ9hzaPOP3wgOJfQhm4QT/XDpRW0H3Mb+hTLHTUAcA rgKc9suAn+5BbIDOz7hEfeOTssx1wYrLsaHDc11NZ42JuG6uvPR33lhXiKWG+5tH 2MpfeTU6BjL035dm3YZXCo+ouobpdMuvzJItYIsB2E5Nl0s91SMzsymIYiD0gb3t yUJ3wqPW3pchfAl8VEn+W5AHTUYYgGjmEblL8WdVq5JRrkQgQzj8QtCRT9NOPAT0 LivCvgBrm0kscaQS2TjtG56Ojbwz8z1QPE/4shf0pj/G2lZfacYDAeaUc/2VafxY y/KG+3dB1DxtYY3eXJUxbB7Vpk7kfr35p5b75NdMhd2t49oPgV7EKoZMLYGzfX4S PtyNyNSiwx8qsRTr4lznOMswmrDLfG4XiywWgYo6NGOWyKYlARWIYBAEQZ0DPTiE 9mqJ19gusdSdAgm8LGDInPmH6/AojGOVzYonJFWdlOtwCXGNXL4Gx02x4WYHykDG w+oy5NMJbU3b6+MWEagkuQNcrwqv02MT1mB/Lgv4GPm6rS0UXR7zUPDeccE50fSL X36k28UlftlPlaD7PeJdTOAhyBv5DxfpL5rbB2TfpUTpNxjayuU= =hepK -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "A very quiet cycle with few notable changes. Mostly the usual list of one or two patches to drivers changing something that isn't quite rc worthy. The subsystem seems to be seeing a larger number of rework and cleanup style patches right now, I feel that several vendors are prepping their drivers for new silicon. 
Summary: - Driver updates and cleanup for qedr, bnxt_re, hns, siw, mlx5, mlx4, rxe, i40iw - Larger series doing cleanup and rework for hns and hfi1. - Some general reworking of the CM code to make it a little more understandable - Unify the different code paths connected to the uverbs FD scheme - New UAPI ioctls conversions for get context and get async fd - Trace points for CQ and CM portions of the RDMA stack - mlx5 driver support for virtio-net formatted rings as RDMA raw ethernet QPs - verbs support for setting the PCI-E relaxed ordering bit on DMA traffic connected to a MR - A couple of bug fixes that came too late to make rc7" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (108 commits) RDMA/core: Make the entire API tree static RDMA/efa: Mask access flags with the correct optional range RDMA/cma: Fix unbalanced cm_id reference count during address resolve RDMA/umem: Fix ib_umem_find_best_pgsz() IB/mlx4: Fix leak in id_map_find_del IB/opa_vnic: Spelling correction of 'erorr' to 'error' IB/hfi1: Fix logical condition in msix_request_irq RDMA/cm: Remove CM message structs RDMA/cm: Use IBA functions for complex structure members RDMA/cm: Use IBA functions for simple structure members RDMA/cm: Use IBA functions for swapping get/set acessors RDMA/cm: Use IBA functions for simple get/set acessors RDMA/cm: Add SET/GET implementations to hide IBA wire format RDMA/cm: Add accessors for CM_REQ transport_type IB/mlx5: Return the administrative GUID if exists RDMA/core: Ensure that rdma_user_mmap_entry_remove() is a fence IB/mlx4: Fix memory leak in add_gid error flow IB/mlx5: Expose RoCE accelerator counters RDMA/mlx5: Set relaxed ordering when requested RDMA/core: Add the core support field to METHOD_GET_CONTEXT ... |
|
John Hubbard | f1f6a7dd9b |
mm, tree-wide: rename put_user_page*() to unpin_user_page*()
Rename put_user_page*() to unpin_user_page*() in order to provide a clearer, more symmetric API for pinning and unpinning DMA pages. This way, pin_user_pages*() calls match up with unpin_user_pages*() calls, and the API is a lot closer to being self-explanatory.
Link: http://lkml.kernel.org/r/20200107224558.2362728-23-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Björn Töpel <bjorn.topel@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
John Hubbard | dfa0a4fff1 |
IB/{core,hw,umem}: set FOLL_PIN via pin_user_pages*(), fix up ODP
Convert infiniband to use the new pin_user_pages*() calls. Also, revert earlier changes to Infiniband ODP that had it using put_user_page(). ODP is "Case 3" in Documentation/core-api/pin_user_pages.rst, which is to say, normal get_user_pages() and put_page() is the API to use there. The new pin_user_pages*() calls replace corresponding get_user_pages*() calls, and set the FOLL_PIN flag. The FOLL_PIN flag requires that the caller must return the pages via put_user_page*() calls, but infiniband was already doing that as part of an earlier commit. Link: http://lkml.kernel.org/r/20200107224558.2362728-14-jhubbard@nvidia.com Signed-off-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Björn Töpel <bjorn.topel@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
John Hubbard | 4789fcdd14 |
IB/umem: use get_user_pages_fast() to pin DMA pages
And get rid of the mmap_sem calls, as part of that. Note that get_user_pages_fast() will, if necessary, fall back to __gup_longterm_unlocked(), which takes the mmap_sem as needed. Link: http://lkml.kernel.org/r/20200107224558.2362728-10-jhubbard@nvidia.com Signed-off-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Björn Töpel <bjorn.topel@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Jason Gunthorpe | 8889f6fa35 |
RDMA/core: Make the entire API tree static
Compilation of mlx5 driver without CONFIG_INFINIBAND_USER_ACCESS generates
the following error.
on x86_64:
ld: drivers/infiniband/hw/mlx5/main.o: in function `mlx5_ib_handler_MLX5_IB_METHOD_VAR_OBJ_ALLOC':
main.c:(.text+0x186d): undefined reference to `ib_uverbs_get_ucontext_file'
ld: drivers/infiniband/hw/mlx5/main.o:(.rodata+0x2480): undefined reference to `uverbs_idr_class'
ld: drivers/infiniband/hw/mlx5/main.o:(.rodata+0x24d8): undefined reference to `uverbs_destroy_def_handler'
This happens because some parts of the UAPI description are not
static. This is a holdover from earlier code that relied on struct
pointers to refer to object types; object types are now referenced by
number. Remove the unused globals and add statics to the remaining UAPI
description elements.
Remove the redundant #ifdefs around mlx5_ib_*defs and the obsolete
mlx5_ib_get_devx_tree().
The compiler now trims a lot more unused code, including the above
problematic definitions when !CONFIG_INFINIBAND_USER_ACCESS.
Fixes:
|
|
Gal Pressman | ba19e16651 |
RDMA/efa: Mask access flags with the correct optional range
The uapi value IB_UVERBS_ACCESS_OPTIONAL_RANGE shouldn't be used inside
the driver, use IB_ACCESS_OPTIONAL instead.
Fixes:
|
|
Linus Torvalds | bd2463ac7d |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller: 1) Add WireGuard 2) Add HE and TWT support to ath11k driver, from John Crispin. 3) Add ESP in TCP encapsulation support, from Sabrina Dubroca. 4) Add variable window congestion control to TIPC, from Jon Maloy. 5) Add BCM84881 PHY driver, from Russell King. 6) Start adding netlink support for ethtool operations, from Michal Kubecek. 7) Add XDP drop and TX action support to ena driver, from Sameeh Jubran. 8) Add new ipv4 route notifications so that mlxsw driver does not have to handle identical routes itself. From Ido Schimmel. 9) Add BPF dynamic program extensions, from Alexei Starovoitov. 10) Support RX and TX timestamping in igc, from Vinicius Costa Gomes. 11) Add support for macsec HW offloading, from Antoine Tenart. 12) Add initial support for MPTCP protocol, from Christoph Paasch, Matthieu Baerts, Florian Westphal, Peter Krystad, and many others. 13) Add Octeontx2 PF support, from Sunil Goutham, Geetha sowjanya, Linu Cherian, and others. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1469 commits) net: phy: add default ARCH_BCM_IPROC for MDIO_BCM_IPROC udp: segment looped gso packets correctly netem: change mailing list qed: FW 8.42.2.0 debug features qed: rt init valid initialization changed qed: Debug feature: ilt and mdump qed: FW 8.42.2.0 Add fw overlay feature qed: FW 8.42.2.0 HSI changes qed: FW 8.42.2.0 iscsi/fcoe changes qed: Add abstraction for different hsi values per chip qed: FW 8.42.2.0 Additional ll2 type qed: Use dmae to write to widebus registers in fw_funcs qed: FW 8.42.2.0 Parser offsets modified qed: FW 8.42.2.0 Queue Manager changes qed: FW 8.42.2.0 Expose new registers and change windows qed: FW 8.42.2.0 Internal ram offsets modifications MAINTAINERS: Add entry for Marvell OcteonTX2 Physical Function driver Documentation: net: octeontx2: Add RVU HW and drivers overview octeontx2-pf: ethtool RSS config support octeontx2-pf: Add basic ethtool support ... |
|
Parav Pandit | b4fb4cc5ba |
RDMA/cma: Fix unbalanced cm_id reference count during address resolve
The commit below missed the AF_IB and loopback code flow in
rdma_resolve_addr(). This leads to an unbalanced cm_id refcount in
cma_work_handler(), which drops a reference that was never taken prior
to queuing the work.
The following call trace is observed in that code flow:
BUG: unable to handle kernel NULL pointer dereference at (null)
[<ffffffff96b67e16>] __mutex_lock_slowpath+0x166/0x1d0
[<ffffffff96b6715f>] mutex_lock+0x1f/0x2f
[<ffffffffc0beabb5>] cma_work_handler+0x25/0xa0
[<ffffffff964b9ebf>] process_one_work+0x17f/0x440
[<ffffffff964baf56>] worker_thread+0x126/0x3c0
Hence, hold the cm_id reference when scheduling the resolve work item.
Fixes:
|
|
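The refcount rule in the cm_id commit above is that whoever queues the work takes the reference that the handler later drops. A minimal single-threaded sketch follows, assuming a plain integer count instead of the kernel's atomic refcounting; `struct cm_id`, `struct work`, and all four functions are hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in for the connection identifier. */
struct cm_id {
    int refcount;
};

struct work {
    struct cm_id *id;
};

static void id_get(struct cm_id *id)
{
    id->refcount++;
}

static void id_put(struct cm_id *id)
{
    assert(id->refcount > 0);   /* a put without a matching get is a bug */
    id->refcount--;
}

/* Queueing takes the reference that the handler will later drop. */
static void queue_resolve_work(struct work *w, struct cm_id *id)
{
    id_get(id);
    w->id = id;
}

/* The handler always ends by dropping exactly one reference. */
static void work_handler(struct work *w)
{
    /* ... process the address-resolved event here ... */
    id_put(w->id);
}
```

If any queueing path skipped `id_get()`, the handler's `id_put()` would drop a reference owned by someone else, which is the imbalance the commit message describes.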
Artemy Kovalyov | 36798d5ae1 |
RDMA/umem: Fix ib_umem_find_best_pgsz()
Except for the last entry, the ending iova alignment sets the maximum
possible page size, because the low bits of the iova must be zero when
starting the next chunk.
Fixes:
|
|
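The alignment argument in the ib_umem_find_best_pgsz() commit above can be turned into a small calculation. The sketch below is an illustration, not the kernel function: it ignores the supported-page-size bitmap and the first-page offset, and `struct chunk` and `best_page_shift` are hypothetical names. It collects every address bit that must not fall inside a page offset, then takes the lowest such bit as the page-size limit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one physically contiguous chunk. */
struct chunk {
    uint64_t dma;   /* device address where the chunk starts */
    uint64_t len;   /* length in bytes */
};

/*
 * Return log2 of the largest page size, capped at max_shift, that can map
 * the chunks back to back starting at 'iova'.
 */
static unsigned int best_page_shift(const struct chunk *c, size_t n,
                                    uint64_t iova, unsigned int max_shift)
{
    uint64_t mask = 0;
    unsigned int shift = 0;

    assert(max_shift < 64);
    for (size_t i = 0; i < n; i++) {
        /* Bits where dma and iova differ cannot be part of a page offset. */
        mask |= c[i].dma ^ iova;
        iova += c[i].len;
        /* Except for the last chunk, the ending iova starts the next chunk. */
        if (i + 1 < n)
            mask |= iova;
    }
    /* The lowest set bit of mask bounds the page size. */
    while (shift < max_shift && !(mask & ((uint64_t)1 << shift)))
        shift++;
    return shift;
}
```

With a single chunk the ending iova never enters the mask, which matches the "except for the last entry" wording of the commit message.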
Linus Torvalds | c0e809e244 |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar: "Kernel side changes: - Ftrace is one of the last W^X violators (after this only KLP is left). These patches move it over to the generic text_poke() interface and thereby get rid of this oddity. This requires a surprising amount of surgery, by Peter Zijlstra. - x86/AMD PMUs: add support for 'Large Increment per Cycle Events' to count certain types of events that have a special, quirky hw ABI (by Kim Phillips) - kprobes fixes by Masami Hiramatsu Lots of tooling updates as well, the following subcommands were updated: annotate/report/top, c2c, clang, record, report/top TUI, sched timehist, tests; plus updates were done to the gtk ui, libperf, headers and the parser" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits) perf/x86/amd: Add support for Large Increment per Cycle Events perf/x86/amd: Constrain Large Increment per Cycle events perf/x86/intel/rapl: Add Comet Lake support tracing: Initialize ret in syscall_enter_define_fields() perf header: Use last modification time for timestamp perf c2c: Fix return type for histogram sorting comparision functions perf beauty sockaddr: Fix augmented syscall format warning perf/ui/gtk: Fix gtk2 build perf ui gtk: Add missing zalloc object perf tools: Use %define api.pure full instead of %pure-parser libperf: Setup initial evlist::all_cpus value perf report: Fix no libunwind compiled warning break s390 issue perf tools: Support --prefix/--prefix-strip perf report: Clarify in help that --children is default tools build: Fix test-clang.cpp with Clang 8+ perf clang: Fix build with Clang 9 kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic tools lib: Fix builds when glibc contains strlcpy() perf report/top: Make 'e' visible in the help and make it toggle showing callchains perf report/top: Do not offer annotation for symbols without samples ... |
|
Linus Torvalds | 634cd4b6af |
Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar: "The main changes in this cycle were: - Cleanup of the GOP [graphics output] handling code in the EFI stub - Complete refactoring of the mixed mode handling in the x86 EFI stub - Overhaul of the x86 EFI boot/runtime code - Increase robustness for mixed mode code - Add the ability to disable DMA at the root port level in the EFI stub - Get rid of RWX mappings in the EFI memory map and page tables, where possible - Move the support code for the old EFI memory mapping style into its only user, the SGI UV1+ support code. - plus misc fixes, updates, smaller cleanups. ... and due to interactions with the RWX changes, another round of PAT cleanups make a guest appearance via the EFI tree - with no side effects intended" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits) efi/x86: Disable instrumentation in the EFI runtime handling code efi/libstub/x86: Fix EFI server boot failure efi/x86: Disallow efi=old_map in mixed mode x86/boot/compressed: Relax sed symbol type regex for LLVM ld.lld efi/x86: avoid KASAN false positives when accessing the 1: 1 mapping efi: Fix handling of multiple efi_fake_mem= entries efi: Fix efi_memmap_alloc() leaks efi: Add tracking for dynamically allocated memmaps efi: Add a flags parameter to efi_memory_map efi: Fix comment for efi_mem_type() wrt absent physical addresses efi/arm: Defer probe of PCIe backed efifb on DT systems efi/x86: Limit EFI old memory map to SGI UV machines efi/x86: Avoid RWX mappings for all of DRAM efi/x86: Don't map the entire kernel text RW for mixed mode x86/mm: Fix NX bit clearing issue in kernel_map_pages_in_pgd efi/libstub/x86: Fix unused-variable warning efi/libstub/x86: Use mandatory 16-byte stack alignment in mixed mode efi/libstub/x86: Use const attribute for efi_is_64bit() efi: Allow disabling PCI busmastering on bridges during boot efi/x86: Allow translating 64-bit arguments for mixed mode calls ... |
|
Linus Torvalds | 6a1000bd27 |
ioremap changes for 5.6
- remove ioremap_nocache given that it is equivalent to ioremap everywhere -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl4vKHwLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYMPGBAAuVNUZaZfWYHpiVP2oRcUQUguFiD3NTbknsyzV2oH J9P0GfeENSKwE9OOhZ7XIjnCZAJwQgTK/ppQY5yiQ/KAtYyyXjXEJ6jqqjiTDInr +3+I3t/LhkgrK7tMrb7ylTGa/d7KhaciljnOXC8+b75iddvM9I1z2pbHDbppZMS9 wT4RXL/cFtRb85AfOyPLybcka3f5P2gGvQz38qyimhJYEzHDXZu9VO1Bd20f8+Xf eLBKX0o6yWMhcaPLma8tm0M0zaXHEfLHUKLSOkiOk+eHTWBZ3b/w5nsOQZYZ7uQp 25yaClbameAn7k5dHajduLGEJv//ZjLRWcN3HJWJ5vzO111aHhswpE7JgTZJSVWI ggCVkytD3ESXapvswmACSeCIDMmiJMzvn6JvwuSMVB7a6e5mcqTuGo/FN+DrBF/R IP+/gY/T7zIIOaljhQVkiEIIwiD/akYo0V9fheHTBnqcKEDTHV4WjKbeF6aCwcO+ b8inHyXZSKSMG//UlDuN84/KH/o1l62oKaB1uDIYrrL8JVyjAxctWt3GOt5KgSFq wVz1lMw4kIvWtC/Sy2H4oB+RtODLp6yJDqmvmPkeJwKDUcd/1JKf0KsZ8j3FpGei /rEkBEss0KBKyFAgBSRO2jIpdj2epgcBcsdB/r5mlhcn8L77AS6mHbA173kY4pQ/ Kdg= =TUCJ -----END PGP SIGNATURE----- Merge tag 'ioremap-5.6' of git://git.infradead.org/users/hch/ioremap Pull ioremap updates from Christoph Hellwig: "Remove the ioremap_nocache API (plus wrappers) that are always identical to ioremap" * tag 'ioremap-5.6' of git://git.infradead.org/users/hch/ioremap: remove ioremap_nocache and devm_ioremap_nocache MIPS: define ioremap_nocache to ioremap |
|
Håkon Bugge | ea660ad7c1 |
IB/mlx4: Fix leak in id_map_find_del
Using CX-3 virtual functions, either from a bare-metal machine or
pass-through from a VM, MAD packets are proxied through the PF driver.
Since the VF drivers have separate name spaces for MAD Transaction Ids
(TIDs), the PF driver has to re-map the TIDs and keep the bookkeeping in
a cache.
Following the RDMA Connection Manager (CM) protocol, it is clear when an
entry has to be evicted from the cache. When a DREP is sent from
mlx4_ib_multiplex_cm_handler(), id_map_find_del() is called. Similarly, when
a REJ is received by the mlx4_ib_demux_cm_handler(), id_map_find_del() is
called.
This function wipes out the TID in use from the IDR or XArray and removes
the id_map_entry from the table.
In short, it does everything except the topping of the cake, which is to
remove the entry from the list and free it. In other words, for the REJ
case enumerated above, one id_map_entry will be leaked.
For the other case above, a DREQ has been received first. The reception of
the DREQ will trigger queuing of a delayed work to delete the
id_map_entry, for the case where the VM doesn't send back a DREP.
In the normal case, the VM _will_ send back a DREP, and id_map_find_del()
will be called.
But this scenario introduces a secondary leak. First, when the DREQ is
received, a delayed work is queued. The VM will then return a DREP, which
will call id_map_find_del(). As stated above, this will free the TID used
from the XArray or IDR. Now, there is a window where that particular TID can
be re-allocated, let's say by an outgoing REQ. This TID will later be wiped
out by the delayed work, when the function id_map_ent_timeout() is
called. But the id_map_entry allocated by the outgoing REQ will not be
de-allocated, and we have a leak.
Both leaks are fixed by removing the id_map_find_del() function and only
using schedule_delayed(). A check has also been added to schedule_delayed()
to see whether the work has already been queued.
Another benefit of always using the delayed version for deleting entries
is that we get a TimeWait effect: a TID no longer in use will occupy
the XArray or IDR for CM_CLEANUP_CACHE_TIMEOUT time, without any ability
of being re-used for that time period.
Fixes:
|
|
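The "check whether the work has already been queued" part of the mlx4 fix above can be sketched as a guard flag on the entry. This is an assumption-laden illustration, not the driver code: `struct id_map_entry` here is a reduced, hypothetical version, `queued_count` exists only so the sketch can be checked, and the real delayed-work queueing is replaced by a counter increment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced cache entry. */
struct id_map_entry {
    bool scheduled_delete;   /* set once delayed deletion has been queued */
    int queued_count;        /* instrumentation: times the work was queued */
};

/* Returns true if this call queued the deletion, false if it was pending. */
static bool schedule_delete(struct id_map_entry *ent)
{
    assert(ent != NULL);
    if (ent->scheduled_delete)
        return false;        /* already queued: do not queue it twice */
    ent->scheduled_delete = true;
    ent->queued_count++;     /* stands in for queueing the delayed work */
    return true;
}
```

Because every deletion goes through the same delayed path, the entry and its TID are released together when the timeout fires, rather than the TID being freed early by a separate immediate path.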
Linus Torvalds | 54343d9518 |
SCSI fixes on 20200126
Two last minute fixes, both in drivers. The fnic one is a highly unlikely condition, but the RDMA one is a recently introduced regression that causes a kernel warning to trigger in every RDMA logon, which would be unsightly if it got into the final release. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJsEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXi3VRyYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbrpAP9I/pEp TWu/QkqFFrmuYbzuxtRML7X2T7+B96J/CRtQvQD3TAIW0gvw49Uj25yEwTRnVzCs 1A+eELAahzBPW+rRBw== =C3yx -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two last minute fixes, both in drivers. The fnic one is a highly unlikely condition, but the RDMA one is a recently introduced regression that causes a kernel warning to trigger in every RDMA logon, which would be unsightly if it got into the final release" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: RDMA/isert: Fix a recently introduced regression related to logout scsi: fnic: do not queue commands during fwreset |
|
Dillon Brock | 7f04c71f1f |
IB/opa_vnic: Spelling correction of 'erorr' to 'error'
Correcting a minor spelling mistake in the comments. Link: https://lore.kernel.org/r/20200118162542.15188-1-dab9861@gmail.com Signed-off-by: Dillon Brock <dab9861@gmail.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> |
|
Nathan Chancellor | 79ba4f9310 |
IB/hfi1: Fix logical condition in msix_request_irq
Clang warns:
drivers/infiniband/hw/hfi1/msix.c:136:22: warning: overlapping
comparisons always evaluate to false [-Wtautological-overlap-compare]
if (type < IRQ_SDMA && type >= IRQ_OTHER)
~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
1 warning generated.
It is impossible for something to be less than 0 (IRQ_SDMA) and greater
than or equal to 3 (IRQ_OTHER) at the same time. A logical OR should
have been used to keep the same logic as before.
Link: https://lore.kernel.org/r/20200116222658.5285-1-natechancellor@gmail.com
Link: https://github.com/ClangBuiltLinux/linux/issues/841
Fixes:
|
|
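The range check discussed in the msix_request_irq commit above is small enough to show in full. The sketch below assumes only what the commit message states, namely that IRQ_SDMA is 0 and IRQ_OTHER is 3; the two middle enumerator names and the function name are placeholders.

```c
#include <assert.h>
#include <stdbool.h>

/* Bounds taken from the commit message; the middle names are placeholders. */
enum irq_type {
    IRQ_SDMA = 0,
    IRQ_PLACEHOLDER_1,
    IRQ_PLACEHOLDER_2,
    IRQ_OTHER,
};

static_assert(IRQ_OTHER == 3, "the sketch assumes IRQ_OTHER == 3");

/* True when 'type' falls outside the half-open range [IRQ_SDMA, IRQ_OTHER). */
static bool irq_type_out_of_range(int type)
{
    /* With && this could never be true: no value is both < 0 and >= 3. */
    return type < IRQ_SDMA || type >= IRQ_OTHER;
}
```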
Jason Gunthorpe | 13e0af1801 |
RDMA/cm: Remove CM message structs
All accesses now use the new IBA accessor scheme, so delete the structs entirely and generate the structures from the schema file.
Link: https://lore.kernel.org/r/20200116170037.30109-8-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> |
|
Jason Gunthorpe | 4ca662a30a |
RDMA/cm: Use IBA functions for complex structure members
Use a Coccinelle spatch to replace CM structure members used as structures, arrays, or pointers with IBA_GET/SET versions.
Applied with
    $ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c
The spatch file was generated using the template pattern:
    @@
    expression src;
    expression len;
    {struct} *msg;
    @@
    - memcpy(msg->{old_name}, src, len)
    + IBA_SET_MEM({new_name}, msg, src, len)
    @@
    {struct} *msg;
    identifier x;
    @@
    - msg->{old_name}.x
    + IBA_GET_MEM_PTR({new_name}, msg)->x
    @@
    {struct} *msg;
    @@
    - &msg->{old_name}
    + IBA_GET_MEM_PTR({new_name}, msg)
For GIDs:
    @@
    {struct} *msg;
    @@
    - msg->{old_name}
    + *IBA_GET_MEM_PTR({new_name}, msg)
For non-GIDs:
    @@
    {struct} *msg;
    @@
    - msg->{old_name}
    + IBA_GET_MEM_PTR({new_name}, msg)
Iterated for every remaining IBA_CHECK_OFF()/IBA_CHECK_GET() pairing.
Touched up with clang-format after.
Link: https://lore.kernel.org/r/20200116170037.30109-7-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> |
|
Jason Gunthorpe | 91b60a7128 |
RDMA/cm: Use IBA functions for simple structure members
Use a Coccinelle spatch script to replace use of simple CM structure members with IBA_GET/SET versions.
Applied with
    $ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c
The spatch file was generated using the template pattern:
    @@
    expression val;
    {struct} *msg;
    @@
    - msg->{old_name} = val
    + IBA_SET({new_name}, msg, be{bits}_to_cpu(val))
    @@
    {struct} *msg;
    @@
    - msg->{old_name}
    + cpu_to_be{bits}(IBA_GET({new_name}, msg))
Iterated for every IBA_CHECK_OFF that isn't a CM_FIELD_MLOC.
And the below iterated over all byte sizes to remove doubled byte swaps:
    @@
    expression val;
    @@
    -be{bits}_to_cpu(cpu_to_be{bits}(val))
    +val
(and __be_to_cpu and ntoh variants)
Touched up with clang-format after.
Link: https://lore.kernel.org/r/20200116170037.30109-6-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> |
|
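The idea behind the accessor conversion in the RDMA/cm commits above is that callers pass and receive CPU-order values while the helper alone knows the big-endian wire layout, so no byte swap appears at the call site. A minimal sketch of that idea follows. It is not the kernel's IBA_GET/IBA_SET implementation: `msg_set32` and `msg_get32` are hypothetical helpers that handle only whole 32-bit fields at a byte offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a CPU-order value big-endian at offset 'off'. */
static void msg_set32(uint8_t *msg, size_t off, uint32_t val)
{
    assert(msg != NULL);
    msg[off + 0] = (uint8_t)(val >> 24);
    msg[off + 1] = (uint8_t)(val >> 16);
    msg[off + 2] = (uint8_t)(val >> 8);
    msg[off + 3] = (uint8_t)val;
}

/* Hypothetical helper: read a big-endian field back as a CPU-order value. */
static uint32_t msg_get32(const uint8_t *msg, size_t off)
{
    assert(msg != NULL);
    return ((uint32_t)msg[off + 0] << 24) |
           ((uint32_t)msg[off + 1] << 16) |
           ((uint32_t)msg[off + 2] << 8) |
           (uint32_t)msg[off + 3];
}
```

Because the helpers are written with shifts rather than host-dependent casts, the byte layout in the buffer is the same on little-endian and big-endian hosts.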
Jason Gunthorpe | 01adb7f46f |
RDMA/cm: Use IBA functions for swapping get/set accessors
Use a Coccinelle spatch script to replace CM helper functions that return/accept BE values with IBA_GET/SET versions.
Applied with
    $ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c
The spatch file was generated using the template pattern:
    @@
    expression val;
    {struct} *msg;
    @@
    - {old_setter}(msg, val)
    + IBA_SET({new_name}, msg, be{bits}_to_cpu(val))
    @@
    {struct} *msg;
    @@
    - {old_getter}(msg)
    + cpu_to_be{bits}(IBA_GET({new_name}, msg))
Iterated for every IBA_CHECK_GET_BE()/IBA_CHECK_SET_BE() pairing.
And the below iterated over all byte sizes to remove doubled byte swaps:
    @@
    expression val;
    @@
    -be{bits}_to_cpu(cpu_to_be{bits}(val))
    +val
(and __be_to_cpu and ntoh variants)
Touched up with clang-format after.
Link: https://lore.kernel.org/r/20200116170037.30109-5-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> |