linux/drivers/infiniband/core
Parav Pandit 549af00833 IB/core: Avoid deadlock during netlink message handling
When rdmacm module is not loaded, and when netlink message is received to
get char device info, it results into a deadlock due to recursive locking
of rdma_nl_mutex with the below call sequence.

[..]
  rdma_nl_rcv()
  mutex_lock()
   [..]
   rdma_nl_rcv_msg()
      ib_get_client_nl_info()
         request_module()
           iw_cm_init()
             rdma_nl_register()
               mutex_lock(); <- Deadlock, acquiring mutex again

Due to above call sequence, following call trace and deadlock is observed.

  kernel: __mutex_lock+0x35e/0x860
  kernel: ? __mutex_lock+0x129/0x860
  kernel: ? rdma_nl_register+0x1a/0x90 [ib_core]
  kernel: rdma_nl_register+0x1a/0x90 [ib_core]
  kernel: ? 0xffffffffc029b000
  kernel: iw_cm_init+0x34/0x1000 [iw_cm]
  kernel: do_one_initcall+0x67/0x2d4
  kernel: ? kmem_cache_alloc_trace+0x1ec/0x2a0
  kernel: do_init_module+0x5a/0x223
  kernel: load_module+0x1998/0x1e10
  kernel: ? __symbol_put+0x60/0x60
  kernel: __do_sys_finit_module+0x94/0xe0
  kernel: do_syscall_64+0x5a/0x270
  kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe

  process stack trace:
  [<0>] __request_module+0x1c9/0x460
  [<0>] ib_get_client_nl_info+0x5e/0xb0 [ib_core]
  [<0>] nldev_get_chardev+0x1ac/0x320 [ib_core]
  [<0>] rdma_nl_rcv_msg+0xeb/0x1d0 [ib_core]
  [<0>] rdma_nl_rcv+0xcd/0x120 [ib_core]
  [<0>] netlink_unicast+0x179/0x220
  [<0>] netlink_sendmsg+0x2f6/0x3f0
  [<0>] sock_sendmsg+0x30/0x40
  [<0>] ___sys_sendmsg+0x27a/0x290
  [<0>] __sys_sendmsg+0x58/0xa0
  [<0>] do_syscall_64+0x5a/0x270
  [<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe

To overcome this deadlock and to allow multiple netlink messages to
progress in parallel, following scheme is implemented.

1. Split the lock protecting the cb_table into a per-index lock, and make
   it a rwlock. This lock is used to ensure no callbacks are running after
   unregistration returns. Since a module will not be registered once it
   is already running callbacks, this avoids the deadlock.

2. Use smp_store_release() to update the cb_table during registration so
   that no lock is required. This avoids lockdep problems with thinking
   all the rwsems are the same lock class.

Fixes: 0e2d00eb6f ("RDMA: Add NLDEV_GET_CHARDEV to allow char dev discovery and autoload")
Link: https://lore.kernel.org/r/20191015080733.18625-1-leon@kernel.org
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-24 20:49:37 -03:00
..
Makefile RDMA/counter: Add set/clear per-port auto mode support 2019-07-05 10:22:54 -03:00
addr.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-09-28 17:47:33 -07:00
agent.c RDMA: Mark if destroy address handle is in a sleepable context 2018-12-19 16:28:03 -07:00
agent.h
cache.c RDMA/core: Annotate destroy of mutex to ensure that it is released as unlocked 2019-07-25 12:07:14 -03:00
cgroup.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 2019-06-05 17:36:37 +02:00
cm.c RDMA/cm: Fix memory leak in cm_add/remove_one 2019-10-04 14:58:31 -03:00
cm_msgs.h RDMA: Use __packed annotation instead of __attribute__ ((packed)) 2019-03-25 21:14:12 -03:00
cma.c RDMA/iwcm: Fix a lock inversion issue 2019-10-01 12:11:50 -03:00
cma_configfs.c RDMA/core: Annotate destroy of mutex to ensure that it is released as unlocked 2019-07-25 12:07:14 -03:00
cma_priv.h IB/cma: Define option to set ack timeout and pack tos_set 2019-02-08 16:14:21 -07:00
core_priv.h IB/core: Avoid deadlock during netlink message handling 2019-10-24 20:49:37 -03:00
counters.c Merge tag 'v5.3-rc8' into rdma.git for-next 2019-09-13 16:59:51 -03:00
cq.c rdma: Enable ib_alloc_cq to spread work over a device's comp_vectors 2019-08-05 11:50:32 -04:00
device.c IB/core: Avoid deadlock during netlink message handling 2019-10-24 20:49:37 -03:00
fmr_pool.c RDMA: Delete DEBUG code 2019-08-20 13:27:53 -04:00
iwcm.c RDMA/iwcm: move iw_rem_ref() calls out of spinlock 2019-10-18 14:40:01 -04:00
iwcm.h iw_cm: free cm_id resources on the last deref 2016-08-02 13:15:18 -04:00
iwpm_msg.c RDMA/iwpm: Delete unnecessary checks before the macro call "dev_kfree_skb" 2019-08-27 13:09:23 -03:00
iwpm_util.c RDMA/iwpm: Delete unnecessary checks before the macro call "dev_kfree_skb" 2019-08-27 13:09:23 -03:00
iwpm_util.h RDMA/IWPM: Support no port mapping requirements 2019-02-04 16:26:02 -07:00
mad.c IB/mad: Fix use-after-free in ib mad completion handling 2019-08-01 11:58:54 -04:00
mad_priv.h RDMA: Use __packed annotation instead of __attribute__ ((packed)) 2019-03-25 21:14:12 -03:00
mad_rmpp.c RDMA: Mark if destroy address handle is in a sleepable context 2018-12-19 16:28:03 -07:00
mad_rmpp.h
mr_pool.c Linux 5.2-rc6 2019-06-28 21:18:23 -03:00
multicast.c IB/core, ipoib: Do not overreact to SM LID change event 2019-05-07 16:06:03 -03:00
netlink.c IB/core: Avoid deadlock during netlink message handling 2019-10-24 20:49:37 -03:00
nldev.c RDMA/nldev: Skip counter if port doesn't match 2019-10-24 15:31:23 -03:00
opa_smi.h RDMA: Start use ib_device_ops 2018-12-12 07:40:16 -07:00
packer.c
rdma_core.c uverbs: Convert idr to XArray 2019-04-25 12:27:11 -03:00
rdma_core.h RDMA/core: Clear out the udata before error unwind 2019-05-27 14:35:26 -03:00
restrack.c RDMA/restrack: Rewrite PID namespace check to be reliable 2019-08-20 13:44:44 -04:00
restrack.h RDMA/restrack: Make is_visible_in_pid_ns() as an API 2019-07-05 10:22:54 -03:00
roce_gid_mgmt.c drivers: use in_dev_for_each_ifa_rtnl/rcu 2019-06-02 18:06:26 -07:00
rw.c PCI/P2PDMA: Introduce pci_p2pdma_unmap_sg() 2019-08-16 08:41:26 -05:00
sa.h RDMA/core: Annotate timeout as unsigned long 2018-10-16 13:34:01 -04:00
sa_query.c RDMA/core: Support netlink commands in non init_net net namespaces 2019-07-25 14:12:41 -03:00
security.c IB/core: Fix wrong iterating on ports 2019-10-04 15:50:27 -03:00
smi.c
smi.h RDMA: Start use ib_device_ops 2018-12-12 07:40:16 -07:00
sysfs.c RDMA: Introduce ib_port_phys_state enum 2019-08-12 10:18:52 -04:00
ucma.c RDMA: Report available cdevs through RDMA_NLDEV_CMD_GET_CHARDEV 2019-06-18 22:44:08 -04:00
ud_header.c
umem.c mm/gup: add make_dirty arg to put_user_pages_dirty_lock() 2019-09-24 15:54:08 -07:00
umem_odp.c RDMA/odp: Lift umem_mutex out of ib_umem_odp_unmap_dma_pages() 2019-10-04 15:54:21 -03:00
user_mad.c Merge branch 'odp_fixes' into rdma.git for-next 2019-08-21 14:10:36 -03:00
uverbs.h RDMA/uverbs: Prevent potential underflow 2019-10-22 15:05:36 -03:00
uverbs_cmd.c RDMA subsystem updates for 5.4 2019-09-21 10:26:24 -07:00
uverbs_ioctl.c mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options 2019-07-12 11:05:46 -07:00
uverbs_main.c RDMA subsystem updates for 5.4 2019-09-21 10:26:24 -07:00
uverbs_marshall.c IB/cm: Replace members of sa_path_rec with 'struct sgid_attr *' 2018-06-25 14:19:57 -06:00
uverbs_std_types.c IB: Remove 'uobject->context' dependency in object destroy APIs 2019-04-01 14:59:35 -03:00
uverbs_std_types_counters.c IB: When attrs.udata/ufile is available use that instead of uobject 2019-04-08 13:05:25 -03:00
uverbs_std_types_cq.c Linux 5.2-rc6 2019-06-28 21:18:23 -03:00
uverbs_std_types_device.c IB/uverbs: Fix ioctl query port to consider device disassociation 2019-01-25 11:58:06 -07:00
uverbs_std_types_dm.c IB: When attrs.udata/ufile is available use that instead of uobject 2019-04-08 13:05:25 -03:00
uverbs_std_types_flow_action.c IB: When attrs.udata/ufile is available use that instead of uobject 2019-04-08 13:05:25 -03:00
uverbs_std_types_mr.c Linux 5.2-rc6 2019-06-28 21:18:23 -03:00
uverbs_uapi.c RDMA: Move driver_id into struct ib_device_ops 2019-06-10 16:56:02 -03:00
verbs.c IB/core: Use rdma_read_gid_l2_fields to compare GID L2 fields 2019-10-18 14:55:33 -04:00