Commit Graph

11254 Commits

Author SHA1 Message Date
Jason Gunthorpe a0e46db4e7 RDMA/cm: Increment the refcount inside cm_find_listen()
All callers need the 'get', so do it in a central place before returning
the pointer.

Link: https://lore.kernel.org/r/20200506074701.9775-11-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 21:32:54 -03:00
Jason Gunthorpe 51e8463cfc RDMA/cm: Remove needless cm_id variable
Just put the expression in the only reader

Link: https://lore.kernel.org/r/20200506074701.9775-10-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 21:32:54 -03:00
Jason Gunthorpe 1cc44279f2 RDMA/cm: Remove the cm_free_id() wrapper function
Just call xa_erase directly during cm_destroy_id()

Link: https://lore.kernel.org/r/20200506074701.9775-9-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 21:32:54 -03:00
Jason Gunthorpe cfa68b0d04 RDMA/cm: Make find_remote_id() return a cm_id_private
The only caller doesn't care about the timewait, so acquire and return the
cm_id_private from the function.

Link: https://lore.kernel.org/r/20200506074701.9775-8-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 21:32:54 -03:00
Jason Gunthorpe 09fb406a56 RDMA/cm: Add a note explaining how the timewait is eventually freed
The way the cm_timewait_info is converted into a work and then freed
is very subtle and surprising, add a note clarifying the lifetime
here.

Link: https://lore.kernel.org/r/20200506074701.9775-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 21:32:53 -03:00
Jason Gunthorpe 9767a27e1a RDMA/cm: Pass the cm_id_private into cm_cleanup_timewait
Also rename it to cm_remove_remote(). This function now removes the
tracking of the remote ID/QPN in the redblack trees from a cm_id_private.

Replace a open-coded version with a call. The open coded version was
deleting only the remote_id, however at this call site the qpn can not
have been in the RB tree either, so the cm_remove_remote() will do the
same.

Link: https://lore.kernel.org/r/20200506074701.9775-6-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 21:32:53 -03:00
Jason Gunthorpe e83f195aa4 RDMA/cm: Pull duplicated code into cm_queue_work_unlock()
While unlocking a spinlock held by the caller is a disturbing pattern,
this extensively duplicated code is even worse. Pull all the duplicates
into a function and explain the purpose of the algorithm.

The on creation side call in cm_req_handler() which is different has been
micro-optimized on the basis that the work_count == -1 during creation,
remove that and just use the normal function.

Link: https://lore.kernel.org/r/20200506074701.9775-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 21:32:53 -03:00
Danit Goldberg 42113eed8f RDMA/cm: Remove unused store to ret in cm_rej_handler
The 'goto out' label doesn't read ret, so don't set it.

Link: https://lore.kernel.org/r/20200506074701.9775-4-leon@kernel.org
Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 21:32:53 -03:00
Jason Gunthorpe d3552fb65d RDMA/cm: Remove return code from add_cm_id_to_port_list
This cannot happen, all callers pass in one of the two pointers. Use
a WARN_ON guard instead.

Link: https://lore.kernel.org/r/20200506074701.9775-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 21:32:53 -03:00
Jason Gunthorpe f8f2a576cb RDMA/addr: Mark addr_resolve as might_sleep()
Under one path through ib_nl_fetch_ha() this calls nlmsg_new(GFP_KERNEL)
which is a sleeping call. This is a very rare path, so mark fetch_ha() and
the module external entry point that conditionally calls through to
fetch_ha() as might_sleep().

Link: https://lore.kernel.org/r/20200506074701.9775-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 21:32:52 -03:00
Lang Cheng 90ae0b57e4 RDMA/hns: Combine enable flags of qp
It's easier to understand and maintain enable flags of qp using a single
field in type of unsigned long than defining a field for every flags in
the structure hns_roce_qp, and we can add new flags for features more
conveniently in the future.

Link: https://lore.kernel.org/r/1588674607-25337-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 20:37:06 -03:00
Weihang Li 30661322b8 RDMA/hns: Extend capability flags for HIP08_C
12 bits is not enough for HIP08_C, so extend a new field in length of 16
bits for it.

Link: https://lore.kernel.org/r/1588674607-25337-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 20:37:06 -03:00
Leon Romanovsky 17793833f8 RDMA/ucma: Return stable IB device index as identifier
The librdmacm uses node_guid as identifier to correlate between IB devices
and CMA devices. However FW resets cause to such "connection" to be lost
and require from the user to restart its application.

Extend UCMA to return IB device index, which is stable identifier.

Link: https://lore.kernel.org/r/20200504132541.355710-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 19:52:27 -03:00
Jason Gunthorpe ccfdbaa5cf RDMA/uverbs: Move IB_EVENT_DEVICE_FATAL to destroy_uobj
When multiple async FDs were allowed to exist the idea was for all
broadcast events to be delivered to all async FDs, however
IB_EVENT_DEVICE_FATAL was missed.

Instead of having ib_uverbs_free_hw_resources() special case the global
async_fd, have it cause the event during the uobject destruction. Every
async fd is now a uobject so simply generate the IB_EVENT_DEVICE_FATAL
while destroying the async fd uobject. This ensures every async FD gets a
copy of the event.

Fixes: d680e88e20 ("RDMA/core: Add UVERBS_METHOD_ASYNC_EVENT_ALLOC")
Link: https://lore.kernel.org/r/20200507063348.98713-3-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 17:02:25 -03:00
Jason Gunthorpe c485b19d52 RDMA/uverbs: Do not discard the IB_EVENT_DEVICE_FATAL event
The commit below moved all of the destruction to the disassociate step and
cleaned up the event channel during destroy_uobj.

However, when ib_uverbs_free_hw_resources() pushes IB_EVENT_DEVICE_FATAL
and then immediately goes to destroy all uobjects this causes
ib_uverbs_free_event_queue() to discard the queued event if userspace
hasn't already read() it.

Unlike all other event queues async FD needs to defer the
ib_uverbs_free_event_queue() until FD release. This still unregisters the
handler from the IB device during disassociation.

Fixes: 3e032c0e92 ("RDMA/core: Make ib_uverbs_async_event_file into a uobject")
Link: https://lore.kernel.org/r/20200507063348.98713-2-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 17:02:25 -03:00
Potnuri Bharat Teja c8b1f340e5 RDMA/iw_cxgb4: Fix incorrect function parameters
While reading the TCB field in t4_tcb_get_field32() the wrong mask is
passed as a parameter which leads the driver eventually to a kernel
panic/app segfault from access to an illegal SRQ index while flushing the
SRQ completions during connection teardown.

Fixes: 11a27e2121 ("iw_cxgb4: complete the cached SRQ buffers")
Link: https://lore.kernel.org/r/20200511185608.5202-1-bharat@chelsio.com
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 11:47:48 -03:00
Maor Gottlieb 50bbe3d34f RDMA/core: Fix double put of resource
Do not decrease the reference count of resource tracker object twice in
the error flow of res_get_common_doit.

Fixes: c5dfe0ea6f ("RDMA/nldev: Add resource tracker doit callback")
Link: https://lore.kernel.org/r/20200507062942.98305-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 11:47:48 -03:00
Jack Morgenstein 1901b91f99 IB/core: Fix potential NULL pointer dereference in pkey cache
The IB core pkey cache is populated by procedure ib_cache_update().
Initially, the pkey cache pointer is NULL. ib_cache_update allocates a
buffer and populates it with the device's pkeys, via repeated calls to
procedure ib_query_pkey().

If there is a failure in populating the pkey buffer via ib_query_pkey(),
ib_cache_update does not replace the old pkey buffer cache with the
updated one -- it leaves the old cache as is.

Since initially the pkey buffer cache is NULL, when calling
ib_cache_update the first time, a failure in ib_query_pkey() will cause
the pkey buffer cache pointer to remain NULL.

In this situation, any calls subsequent to ib_get_cached_pkey(),
ib_find_cached_pkey(), or ib_find_cached_pkey_exact() will try to
dereference the NULL pkey cache pointer, causing a kernel panic.

Fix this by checking the ib_cache_update() return value.

Fixes: 8faea9fd4a ("RDMA/cache: Move the cache per-port data into the main ib_port_data")
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Link: https://lore.kernel.org/r/20200507071012.100594-1-leon@kernel.org
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 11:47:48 -03:00
Mike Marciniszyn fa8dac3968 IB/hfi1: Fix another case where pq is left on waitlist
The commit noted below fixed a case where a pq is left on the sdma wait
list.

It however missed another case.

user_sdma_send_pkts() has two calls from hfi1_user_sdma_process_request().

If the first one fails as indicated by -EBUSY, the pq will be placed on
the waitlist as by design.

If the second call then succeeds, the pq is still on the waitlist setting
up a race with the interrupt handler if a subsequent request uses a
different SDMA engine

Fix by deleting the first call.

The use of pcount and the intent to send a short burst of packets followed
by the larger balance of packets was never correctly implemented, because
the two calls always send pcount packets no matter what.  A subsequent
patch will correct that issue.

Fixes: 9a293d1e21 ("IB/hfi1: Ensure pq is not left on waitlist")
Link: https://lore.kernel.org/r/20200504130917.175613.43231.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 11:47:48 -03:00
Denis V. Lunev 856ec7f646 IB/i40iw: Remove bogus call to netdev_master_upper_dev_get()
Local variable netdev is not used in these calls.

It should be noted, that this change is required to work in bonded mode.
Otherwise we would get the following assert:

 "RTNL: assertion failed at net/core/dev.c (5665)"

With the calltrace as follows:
	dump_stack+0x19/0x1b
	netdev_master_upper_dev_get+0x61/0x70
	i40iw_addr_resolve_neigh+0x1e8/0x220
	i40iw_make_cm_node+0x296/0x700
	? i40iw_find_listener.isra.10+0xcc/0x110
	i40iw_receive_ilq+0x3d4/0x810
	i40iw_puda_poll_completion+0x341/0x420
	i40iw_process_ceq+0xa5/0x280
	i40iw_ceq_dpc+0x1e/0x40
	tasklet_action+0x83/0x140
	__do_softirq+0x125/0x2bb
	call_softirq+0x1c/0x30
	do_softirq+0x65/0xa0
	irq_exit+0x105/0x110
	do_IRQ+0x56/0xf0
	common_interrupt+0x16a/0x16a
	? cpuidle_enter_state+0x57/0xd0
	cpuidle_idle_call+0xde/0x230
	arch_cpu_idle+0xe/0xc0
	cpu_startup_entry+0x14a/0x1e0
	start_secondary+0x1f7/0x270
	start_cpu+0x5/0x14

Link: https://lore.kernel.org/r/20200428131511.11049-1-den@openvz.org
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 11:47:48 -03:00
Jack Morgenstein 6693ca95bd IB/mlx4: Test return value of calls to ib_get_cached_pkey
In the mlx4_ib_post_send() flow, some functions call ib_get_cached_pkey()
without checking its return value. If ib_get_cached_pkey() returns an
error code, these functions should return failure.

Fixes: 1ffeb2eb8b ("IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support")
Fixes: 225c7b1fee ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Fixes: e622f2f4ad ("IB: split struct ib_send_wr")
Link: https://lore.kernel.org/r/20200426075921.130074-1-leon@kernel.org
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 11:47:48 -03:00
Sudip Mukherjee bb43c8e382 RDMA/rxe: Always return ERR_PTR from rxe_create_mmap_info()
The commit below modified rxe_create_mmap_info() to return ERR_PTR's but
didn't update the callers to handle them. Modify rxe_create_mmap_info() to
only return ERR_PTR and fix all error checking after
rxe_create_mmap_info() is called.

Ensure that all other exit paths properly set the error return.

Fixes: ff23dfa134 ("IB: Pass only ib_udata in function prototypes")
Link: https://lore.kernel.org/r/20200425233545.17210-1-sudipm.mukherjee@gmail.com
Link: https://lore.kernel.org/r/20200511183742.GB225608@mwanda
Cc: stable@vger.kernel.org [5.4+]
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 11:47:00 -03:00
Colin Ian King 52c81f47f0 RDMA/mlx5: Remove duplicated assignment to variable rcqe_sz
The variable rcqe_sz is being unnecessarily assigned twice, fix this by
removing one of the duplicates.

Fixes: 8bde2c509e ("RDMA/mlx5: Update all DRIVER QP places to use QP subtype")
Link: https://lore.kernel.org/r/20200507151610.52636-1-colin.king@canonical.com
Addresses-Coverity: ("Evaluation order violation")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-07 21:02:58 -03:00
David S. Miller 3793faad7b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts were all overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06 22:10:13 -07:00
Mark Bloch 42caf9cb59 RDMA/mlx5: Allow only raw Ethernet QPs when RoCE isn't enabled
When operating in switchdev mode or using devlink to disable RoCE
only raw Ethernet QPs are allowed to be created.

When in switchdev mode this can lead to passing an invalid port number
as part of the modify qp firmware cmd and will lead to a syndrome
reported back to the user, such as:

 * mlx5_cmd_check:803:(pid 50148): RST2INIT_QP(0x502) op_mod(0x0) failed,
   status bad parameter(0x3), syndrome (0x177405).

Internal UD QP might be used to test for write combining support (even if
externally we report RoCE as disabled) check for that specific flag and
allow is specifically.

Fixes: b5ca15ad7e ("IB/mlx5: Add proper representors support")
Link: https://lore.kernel.org/r/20200506071602.7177-3-leon@kernel.org
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 17:52:01 -03:00
Mark Bloch 8d93efb8c5 RDMA/mlx5: Assign profile before calling stages
Assign the profile to the IB device before executing stages. This will
allow to check which profile is being used from within a stage.

Link: https://lore.kernel.org/r/20200506071602.7177-2-leon@kernel.org
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 17:52:01 -03:00
Leon Romanovsky 029e88fd1e RDMA/mlx5: Move all WR logic from qp.c to separate file
Split qp.c by removing all WR logic to separate file.

Link: https://lore.kernel.org/r/20200506065513.4668-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 17:42:45 -03:00
Max Gurtovoy 6671cde83d RDMA/mlx5: Refactor mlx5_post_send() to improve readability
Add small helpers in order to avoid code duplication and improve code
readability. Decrease the amount of code in the gigantic post_send
function and divide it to readable methods that will help in code
maintenance in the future.

Link: https://lore.kernel.org/r/20200506065513.4668-3-leon@kernel.org
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 17:42:45 -03:00
Leon Romanovsky 31578defe4 RDMA/mlx5: Update mlx5_ib to use new cmd interface
Reuse newly introduced mlx5_cmd_exec_in() and mlx5_cmd_exec_inout() to
reduce code duplication in mlx5_ib module.

Link: https://lore.kernel.org/r/20200506065513.4668-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 17:42:45 -03:00
Wenpeng Liang e4faa478c6 RDMA/hns: Remove redundant assignment of caps
These caps are assigned in query_pf_caps() or set_default_caps(), and
should not be assigned out of these two functions.

Link: https://lore.kernel.org/r/1588242691-12913-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 17:33:20 -03:00
Weihang Li b713128de7 RDMA/hns: Adjust lp_pktn_ini dynamically
lp_pktn_ini means the number of loopback slice packets for long messages,
it should depend on MTU(fixed to 4096B currently) and max size of SQ
inline.

Link: https://lore.kernel.org/r/1588242691-12913-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 17:33:20 -03:00
Weihang Li 23190b8f47 RDMA/hns: Fix comments with non-English symbols
There is a comments with some chinese semicolons that cause encoding
issues each time hns_roc_hw_v2.h was modified from a IDE. So fix this by
using correct symbols.

Link: https://lore.kernel.org/r/1588242691-12913-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 17:33:19 -03:00
Xi Wang 67954a6e37 RDMA/hns: Optimize SRQ buffer size calculating process
Optimize the SRQ's WQE buffer parameters calculating process to make the
codes more readable by using new functions about multi-hop addressing to
calculating capabilities of SRQ.

Link: https://lore.kernel.org/r/1588071823-40200-6-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 17:26:44 -03:00
Yixian Liu ffb1308b88 RDMA/hns: Move SRQ code to the reasonable place
Just move the SRQ related code to more reasonable place, and unify format
of some prints.

Link: https://lore.kernel.org/r/1588071823-40200-5-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 17:26:43 -03:00
Xi Wang 54d6638765 RDMA/hns: Optimize WQE buffer size calculating process
Optimize the QP's WQE buffer parameters calculating process to make the
codes more readable mainly by merging calculation of extended sge space of
kernel and userspace. In addition, add some inline functions to simply
codes about multi-hop addressing.

Link: https://lore.kernel.org/r/1588071823-40200-4-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 17:26:43 -03:00
Xi Wang 2929c40f08 RDMA/hns: Remove unused MTT functions
The MTT (Memory Translate Table) interface is no longer used to configure
the buffer address to BT (Base Address Table) that requires driver
mapping.  Because the MTT is not compatible with multi-hop addressing of
the hip08, it is replaced by MTR (Memory Translate Region) interface, and
all the MTT functions should be removed.

Link: https://lore.kernel.org/r/1588071823-40200-3-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 17:26:43 -03:00
Xi Wang 9b2cf76c9f RDMA/hns: Optimize PBL buffer allocation process
PBL table has its own implementation for multi-hop addressing currently,
but for the hardware, all table's addressing use the same logic, there is
no need to implement repeatedly. So optimize the PBL buffer allocation
process by using the mtr's interfaces.

Link: https://lore.kernel.org/r/1588071823-40200-2-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 17:26:43 -03:00
Mark Zhang 5ac55dfc6d RDMA/mlx5: Set UDP source port based on the grh.flow_label
Calculate UDP source port based on the grh.flow_label. If grh.flow_label
is not valid, we will use minimal supported UDP source port.

Link: https://lore.kernel.org/r/20200504051935.269708-6-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 16:51:44 -03:00
Mark Zhang f665340519 RDMA/cma: Initialize the flow label of CM's route path record
If flow label is not set by the user or it's not IPv4, initialize it with
the cma src/dst based on the "Kernighan and Ritchie's hash function".

Link: https://lore.kernel.org/r/20200504051935.269708-5-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 16:51:44 -03:00
Mark Zhang 2b880b2e5e RDMA/mlx5: Define RoCEv2 udp source port when set path
Calculate and set UDP source port based on the flow label. If flow label
is not defined in GRH then calculate it based on lqpn/rqpn.

Link: https://lore.kernel.org/r/20200504051935.269708-4-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 16:51:44 -03:00
Maor Gottlieb 9611d53aa1 RDMA/core: Consider flow label when building skb
Use rdma_flow_label_to_udp_sport to calculate the UDP source port of the
RoCEV2 packet.

Link: https://lore.kernel.org/r/20200504051935.269708-3-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 16:51:43 -03:00
Jason Gunthorpe 11a0ae4c4b RDMA: Allow ib_client's to fail when add() is called
When a client is added it isn't allowed to fail, but all the client's have
various failure paths within their add routines.

This creates the very fringe condition where the client was added, failed
during add and didn't set the client_data. The core code will then still
call other client_data centric ops like remove(), rename(), get_nl_info(),
and get_net_dev_by_params() with NULL client_data - which is confusing and
unexpected.

If the add() callback fails, then do not call any more client ops for the
device, even remove.

Remove all the now redundant checks for NULL client_data in ops callbacks.

Update all the add() callbacks to return error codes
appropriately. EOPNOTSUPP is used for cases where the ULP does not support
the ib_device - eg because it only works with IB.

Link: https://lore.kernel.org/r/20200421172440.387069-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 11:57:33 -03:00
Maor Gottlieb 04c349a965 RDMA/mad: Remove snoop interface
Snoop interface is not used. Remove it.

Link: https://lore.kernel.org/r/20200413132408.931084-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 11:50:22 -03:00
Dan Carpenter 37e31d2d26 i40iw: Fix error handling in i40iw_manage_arp_cache()
The i40iw_arp_table() function can return -EOVERFLOW if
i40iw_alloc_resource() fails so we can't just test for "== -1".

Fixes: 4e9042e647 ("i40iw: add hw and utils files")
Link: https://lore.kernel.org/r/20200422092211.GA195357@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-04 14:12:35 -03:00
Gal Pressman f86e34374a RDMA/efa: Count admin commands errors
Add a new stat that counts admin commands failures, which might help when
debugging different issues.

Link: https://lore.kernel.org/r/20200420062213.44577-4-galpress@amazon.com
Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-02 20:32:14 -03:00
Gal Pressman eca5757f80 RDMA/efa: Count mmap failures
Add a new stat that counts mmap failures, which might help when debugging
different issues.

Link: https://lore.kernel.org/r/20200420062213.44577-3-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-02 20:32:14 -03:00
Gal Pressman b2ea69b3b4 RDMA/efa: Report create CQ error counter
Create CQ errors are already being counted, report them along all other
counters.

Link: https://lore.kernel.org/r/20200420062213.44577-2-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-02 20:32:14 -03:00
Maor Gottlieb cfc1a89e44 RDMA/mlx5: Set lag tx affinity according to slave
The patch sets the lag tx affinity of the data QPs and the GSI QPs
according to the LAG xmit slave.

For GSI QPs, in case the link layer is Ethenet (RoCE) we create two GSI
QPs, one for each physical port. When the driver selects the GSI QP, it
will consider the port affinity result.  For connected QPs, the driver
sets the affinity of the xmit slave.

The above, ensures that RC QP and it's corresponding GSI QP will transmit
from the same physical port.

Link: https://lore.kernel.org/r/20200430192146.12863-17-maorg@mellanox.com
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-02 20:19:54 -03:00
Maor Gottlieb 5163b2743a RDMA/mlx5: Refactor affinity related code
Move affinity related code in modify qp to function.  It's a preparation
for next patch the extend the affinity calculation to consider the xmit
slave.

Link: https://lore.kernel.org/r/20200430192146.12863-16-maorg@mellanox.com
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-02 20:19:54 -03:00
Maor Gottlieb 51aab12631 RDMA/core: Get xmit slave for LAG
Add a call to rdma_lag_get_ah_roce_slave() when the address handle is
created. Lower driver can use it to select the QP's affinity port.

Link: https://lore.kernel.org/r/20200430192146.12863-15-maorg@mellanox.com
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-02 20:19:54 -03:00
Maor Gottlieb bd3920eac1 RDMA/core: Add LAG functionality
Add support to get the RoCE LAG xmit slave by building skb of the RoCE
packet and call to master_get_xmit_slave.  If driver wants to get the
slave assume all slaves are available, then need to set
RDMA_LAG_FLAGS_HASH_ALL_SLAVES in flags.

Link: https://lore.kernel.org/r/20200430192146.12863-14-maorg@mellanox.com
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-02 20:19:54 -03:00
Maor Gottlieb fa5d010c56 RDMA: Group create AH arguments in struct
Following patch adds additional argument to the create AH function, so it
make sense to group ah_attr and flags arguments in struct.

Link: https://lore.kernel.org/r/20200430192146.12863-13-maorg@mellanox.com
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Acked-by: Devesh Sharma <devesh.sharma@broadcom.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Acked-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-02 20:19:53 -03:00
Aharon Landau 0eacc574aa RDMA/mlx5: Verify that QP is created with RQ or SQ
RAW packet QP and underlay QP must be created with either
RQ or SQ, check that.

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Link: https://lore.kernel.org/r/20200427154636.381474-37-leon@kernel.org
Signed-off-by: Aharon Landau <aharonl@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:46 -03:00
Leon Romanovsky 968f0b6f9c RDMA/mlx5: Consolidate into special function all create QP calls
Finish separation to blocks of mlx5_ib_create_qp() functions,
so all internal create QP implementation are located in one place.

Link: https://lore.kernel.org/r/20200427154636.381474-36-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:46 -03:00
Leon Romanovsky 6367da46d3 RDMA/mlx5: Remove redundant destroy QP call
After major refactoring in create QP flow, it is no needed to call
to destroy QP in XRC_TGT flow.

Link: https://lore.kernel.org/r/20200427154636.381474-35-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:46 -03:00
Leon Romanovsky 08d5397660 RDMA/mlx5: Copy response to the user in one place
Update all the places in create QP flows to copy response
to the user in one place.

Link: https://lore.kernel.org/r/20200427154636.381474-34-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:46 -03:00
Leon Romanovsky 6f2cf76e6e RDMA/mlx5: Handle udate outlen checks in one place
Place in one function all udata size checks. This will allow
us move ib_copy_to_udata() in general place and ensure that
it will be performed after call to the FW.

Link: https://lore.kernel.org/r/20200427154636.381474-33-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:45 -03:00
Leon Romanovsky 5d6fffed1c RDMA/mlx5: Promote RSS RAW QP flags check to higher level
Move check that user didn't supplied RSS RAW QP unsupported
command flags to the function that checks all such flags.

Link: https://lore.kernel.org/r/20200427154636.381474-32-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:45 -03:00
Leon Romanovsky f78d358cec RDMA/mlx5: Group all create QP parameters to simplify in-kernel interfaces
The amount of parameters passed in and out between internal mlx5
create QP functions is too large to easily follow the flow. Change
it by grouping all create QP parameter into one structure.

Link: https://lore.kernel.org/r/20200427154636.381474-31-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:45 -03:00
Leon Romanovsky 747c519cdb RDMA/mlx5: Reduce amount of duplication in QP destroy
Delete both PD argument and checks if udata was provided, in favour
of unified destroy QP functions.

Link: https://lore.kernel.org/r/20200427154636.381474-30-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:45 -03:00
Leon Romanovsky 98fc1126c4 RDMA/mlx5: Separate to user/kernel create QP flows
The kernel and user create QP flows have very little common code,
separate them to simplify the future work of creating per-type
create_*_qp() functions.

Link: https://lore.kernel.org/r/20200427154636.381474-29-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:44 -03:00
Leon Romanovsky 04bcc1c2d0 RDMA/mlx5: Separate XRC_TGT QP creation from common flow
XRC_TGT QP doesn't fail into kernel or user flow separation. It is
initiated by the user, but is created through in-kernel verbs flow
and doesn't have PD and udata in similar way to kernel QPs.

So let's separate creation of that QP type from the common flow.

Link: https://lore.kernel.org/r/20200427154636.381474-28-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:44 -03:00
Leon Romanovsky 21aad80b17 RDMA/mlx5: Globally parse DEVX UID
Remove duplication in parsing of DEVX UID.

Link: https://lore.kernel.org/r/20200427154636.381474-27-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:44 -03:00
Leon Romanovsky 0ce300b15a RDMA/mlx5: Delete impossible inlen check
The inlen is set to be above zero in all flows before
and can't be negative at this stage.

Link: https://lore.kernel.org/r/20200427154636.381474-26-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:44 -03:00
Leon Romanovsky 03c4077b28 RDMA/mlx5: Rely on existence of udata to separate kernel/user flows
Instead of keeping special field to separate kernel/user create/destroy
flows, rely on existence of udata pointer. All allocation flows are
using kzalloc() and leave uninitialized pointers as NULL which makes
MLX5_QP_EMPTY and MLX5_QP_KERNEL flows to be the same.

Link: https://lore.kernel.org/r/20200427154636.381474-25-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:43 -03:00
Leon Romanovsky 76883a6cc1 RDMA/mlx5: Remove second user copy in create_user_qp
Combine copy_from_user() from create_user_qp() and general code.

Link: https://lore.kernel.org/r/20200427154636.381474-24-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:43 -03:00
Leon Romanovsky 5ce0592b0e RDMA/mlx5: Combine copy of create QP command in RSS RAW QP
Change the create QP flow to handle all copy_from_user() operations in
one place.

Link: https://lore.kernel.org/r/20200427154636.381474-23-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:43 -03:00
Leon Romanovsky 266424eba6 RDMA/mlx5: Promote RSS RAW QP attribute check in higher level
Perform check of attributes of RAW PACKET QP in separate function.

Link: https://lore.kernel.org/r/20200427154636.381474-22-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:43 -03:00
Leon Romanovsky 7aede1a25f RDMA/mlx5: Store QP type in the vendor QP structure
QP type is stored in the IB/core QP struct, but it doesn't have all the
needed information, like internal QP type used in the driver itself.
Update mlx5_ib to have cached QP type which includes both IBTA and
Mellanox specific one.

Such change allows us to make even further cleanup of QP creation flow.

Link: https://lore.kernel.org/r/20200427154636.381474-21-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:42 -03:00
Leon Romanovsky 3ae7e66a01 RDMA/mlx5: Delete unsupported QP types
There is no need to explicitly check unsupported QP types,
rely on  "default" keyword in switch-case to catch them.

Link: https://lore.kernel.org/r/20200427154636.381474-20-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-30 18:45:42 -03:00
Jason Gunthorpe dfb25edd97 Merge branch 'mlx5_ib_qp_refactor_1' into rdma.git for-next
Leon Romanovsky says:

====================
This is first part of series which tries to return some sanity to
mlx5_ib_create_qp() function. Such refactoring is required to make
extension of that function with less worries of breaking driver.

Extra goal of such refactoring is to ensure that QP is allocated at the
beginning of function and released at the end. It will allow us to move QP
allocation to be under IB/core responsibility.
====================

Based on the mlx5-next branch at
 git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Due to dependencies

* branch 'mlx5_ib_qp_refactor_1': (66 commits)
  RDMA/mlx5: Process all vendor flags in one place
  RDMA/mlx5: Return all configured create flags through query QP
  RDMA/mlx5: Change scatter CQE flag to be set like other vendor flags
  RDMA/mlx5: Use flags_en mechanism to mark QP created with WQE signature
  RDMA/mlx5: Process create QP flags in one place
  RDMA/mlx5: Delete create QP flags obfuscation
  RDMA/mlx5: Initial separation of RAW_PACKET QP from common flow
  RDMA/mlx5: Remove second copy from user for non RSS RAW QPs
  RDMA/mlx5: Move DRIVER QP flags check into separate function
  RDMA/mlx5: Update all DRIVER QP places to use QP subtype
  RDMA/mlx5: Split scatter CQE configuration for DCT QP
  RDMA/mlx5: Separate create QP flows to be based on type
  RDMA/mlx5: Set QP subtype immediately when it is known
  RDMA/mlx5: Avoid setting redundant NULL for XRC QPs
  RDMA/mlx5: Prepare QP allocation for future removal
  RDMA/mlx5: Perform check if QP creation flow is valid
  RDMA/mlx5: Delete impossible GSI port check
  RDMA/mlx5: Organize QP types checks in one place

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 21:44:51 -03:00
Leon Romanovsky 37518fa49f RDMA/mlx5: Process all vendor flags in one place
Check that vendor flags provided through ucmd are valid.

Link: https://lore.kernel.org/r/20200427154636.381474-19-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:24 -03:00
Leon Romanovsky a8f3ea61e1 RDMA/mlx5: Return all configured create flags through query QP
The "flags" field in struct mlx5_ib_qp contains all UAPI flags
configured at the create QP stage. Return all the data as is
without masking.

Link: https://lore.kernel.org/r/20200427154636.381474-18-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:24 -03:00
Leon Romanovsky 90ecb37a75 RDMA/mlx5: Change scatter CQE flag to be set like other vendor flags
In similar way to wqe_sig, the scat_cqe was treated differently from
other create QP vendor flags. Change it to be similar to other flags
and use flags_en mechanism.

Link: https://lore.kernel.org/r/20200427154636.381474-17-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:23 -03:00
Leon Romanovsky c95e6d5397 RDMA/mlx5: Use flags_en mechanism to mark QP created with WQE signature
MLX5_QP_FLAG_SIGNATURE is exposed to the users but in the kernel
the create_qp flow treated it differently from other MLX5_QP_FLAG_*s.
Fix it by ditching wq_sig boolean variable and use general flag_en
mechanism.

Link: https://lore.kernel.org/r/20200427154636.381474-16-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:23 -03:00
Leon Romanovsky 2978975ce7 RDMA/mlx5: Process create QP flags in one place
create_flags is checked in too many places and scattered across all
the code, consolidate all the checks inside one function, so we will
be easily see the flow. As part of such change, delete unreachable code,
because IB/core is responsible sanitize the input.

Link: https://lore.kernel.org/r/20200427154636.381474-15-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:23 -03:00
Leon Romanovsky 2be08c308f RDMA/mlx5: Delete create QP flags obfuscation
There is no point in redefinition of stable and exposed to users create
flags. Their values won't be changed and it is equal to used by the
mlx5. Delete the mlx5 definitions and use IB/core fields.

Link: https://lore.kernel.org/r/20200427154636.381474-14-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:23 -03:00
Leon Romanovsky 5d0dc3d96c RDMA/mlx5: Initial separation of RAW_PACKET QP from common flow
Create initial function for IB_QPT_RAW_PACKET flow.

Link: https://lore.kernel.org/r/20200427154636.381474-13-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:22 -03:00
Leon Romanovsky 2dfac92dbb RDMA/mlx5: Remove second copy from user for non RSS RAW QPs
Change the common code to use already copied user command buffer.

Link: https://lore.kernel.org/r/20200427154636.381474-12-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:22 -03:00
Leon Romanovsky 2fdddbd5c9 RDMA/mlx5: Move DRIVER QP flags check into separate function
Perform validation of DRIVER QP in relevant function.

Link: https://lore.kernel.org/r/20200427154636.381474-11-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:22 -03:00
Leon Romanovsky 8bde2c509e RDMA/mlx5: Update all DRIVER QP places to use QP subtype
Instead of overwriting QP init attributes with driver QP subtype,
use that subtype directly. This change will allow us to remove
logic which cached QP init attributes.

Link: https://lore.kernel.org/r/20200427154636.381474-10-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:22 -03:00
Leon Romanovsky fd9dab7edc RDMA/mlx5: Split scatter CQE configuration for DCT QP
DCT QPs have separate creation flow and can be easily extracted
from configure_responder_scat_cqe(), this makes both updated
functions more clear.

Link: https://lore.kernel.org/r/20200427154636.381474-9-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:22 -03:00
Leon Romanovsky 47c806121a RDMA/mlx5: Separate create QP flows to be based on type
Move driver QP creation flow to separate functions to simplify
the create_qp() and allow future separation of create_qp_common()
to subtypes.

Link: https://lore.kernel.org/r/20200427154636.381474-8-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:21 -03:00
Leon Romanovsky 318d2b06fb RDMA/mlx5: Set QP subtype immediately when it is known
There is no need to delay QP subtype assignment to the end of the
create_qp() function and it is better to move it to be immediately
after it is checked so we would be able to rewrite later checks
to be based on it and not on over-written struct ib_qp_init_attr.

Link: https://lore.kernel.org/r/20200427154636.381474-7-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:21 -03:00
Leon Romanovsky c86936e6eb RDMA/mlx5: Avoid setting redundant NULL for XRC QPs
There is no need to set NULL in recv_cq and send_cq, they are already
set to NULL by the IB/core logic.

Link: https://lore.kernel.org/r/20200427154636.381474-6-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:21 -03:00
Leon Romanovsky 9c2ba4ede4 RDMA/mlx5: Prepare QP allocation for future removal
Unify the QP memory allocation across different paths,
so it will be in one place.

Link: https://lore.kernel.org/r/20200427154636.381474-5-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:21 -03:00
Leon Romanovsky 2242cc25ce RDMA/mlx5: Perform check if QP creation flow is valid
Fast check that kernel and user flows provides enough
data to create QP.

Link: https://lore.kernel.org/r/20200427154636.381474-4-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:20 -03:00
Leon Romanovsky 1265d9f7a5 RDMA/mlx5: Delete impossible GSI port check
GSI QP is created in the kernel with very strict parameters,
there is no possible way that port number will be wrong in
such flow.

Link: https://lore.kernel.org/r/20200427154636.381474-3-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:20 -03:00
Leon Romanovsky 6eb7edffb2 RDMA/mlx5: Organize QP types checks in one place
Perform check if QP type is supported in one place at the beginning of
the create_qp function instead of current implementation with checks
buried inside of the code.

Link: https://lore.kernel.org/r/20200427154636.381474-2-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-28 20:42:20 -03:00
Raed Salem 244faedfd4 net/mlx5: Refactor imm_inval_pkey field in cqe struct
The imm_inval_pkey field can hold four different types of data,
depends on the usage, the data could be one of the below:
- Immediate field of the received message
- Invalidate rkey
- Pkey of the packet
- Flow table metadata

Current implementation doesn't reflect the intended usage of the
field at usage time.

Reflect the different types by replace this field with a union,
modify code where this field is used to reflect its intended
usage.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-28 12:45:15 -07:00
Erez Shitrit dff8e2d152 net/mlx5: Use aligned variable while allocating ICM memory
The alignment value is part of the input structure, so use it and spare
extra memory allocation when is not needed.
Now, using the new ability when allocating icm for Direct-Rule
insertion.
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-28 12:45:10 -07:00
Huy Nguyen d65dbedfd2 net/mlx5: Add support for COPY steering action
Add COPY type to modify_header action. IPsec feature is the first
feature that needs COPY steering action.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-28 12:44:44 -07:00
Leon Romanovsky f0abc761bb RDMA/core: Fix race between destroy and release FD object
The call to ->lookup_put() was too early and it caused an unlock of the
read/write protection of the uobject after the FD was put. This allows a
race:

     CPU1                                 CPU2
 rdma_lookup_put_uobject()
   lookup_put_fd_uobject()
     fput()
				   fput()
				     uverbs_uobject_fd_release()
				       WARN_ON(uverbs_try_lock_object(uobj,
					       UVERBS_LOOKUP_WRITE));
   atomic_dec(usecnt)

Fix the code by changing the order, first unlock and call to
->lookup_put() after that.

Fixes: 3832125624 ("IB/core: Add support for idr types")
Link: https://lore.kernel.org/r/20200423060122.6182-1-leon@kernel.org
Suggested-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-24 15:40:41 -03:00
Sudip Mukherjee 47c370c1a5 IB/rdmavt: Always return ERR_PTR from rvt_create_mmap_info()
The commit below modified rvt_create_mmap_info() to return ERR_PTR's but
didn't update the callers to handle them. Modify rvt_create_mmap_info() to
only return ERR_PTR and fix all error checking after
rvt_create_mmap_info() was called.

Fixes: ff23dfa134 ("IB: Pass only ib_udata in function prototypes")
Link: https://lore.kernel.org/r/20200424173146.10970-1-sudipm.mukherjee@gmail.com
Cc: stable@vger.kernel.org [5.4+]
Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-24 15:31:26 -03:00
Lang Cheng a97bf49f82 RDMA/hns: Simplify the status judgment code of hns_roce_v1_m_qp()
Use status table to reduce cyclomatic complexity.

Link: https://lore.kernel.org/r/1586938475-37049-7-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-24 10:19:11 -03:00
Lang Cheng 357f342946 RDMA/hns: Simplify the state judgment code of qp
Use state table to make the qp state migrate code more readable.

Link: https://lore.kernel.org/r/1586938475-37049-6-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-24 10:19:11 -03:00
Lang Cheng 7c044adca2 RDMA/hns: Simplify the cqe code of poll cq
Encapsulate codes to get status of cqe into a function and use map table
instead of switch-case to reduce cyclomatic complexity of
hns_roce_v2_poll_one().

Link: https://lore.kernel.org/r/1586938475-37049-5-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-24 10:19:10 -03:00
Lang Cheng a3de9e8381 RDMA/hns: Simplify the qp state convert code
Use type map table to reduce the cyclomatic complexity.

Link: https://lore.kernel.org/r/1586938475-37049-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-24 10:19:10 -03:00
Lijun Ou 375898e83d RDMA/hns: Optimize hns_roce_v2_set_mac()
Removes the unnecessary memset opertaion and adjust style of some lines in
hns_roce_v2_set_mac().

Link: https://lore.kernel.org/r/1586938475-37049-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-24 10:19:10 -03:00
Lijun Ou 9976ea27b5 RDMA/hns: Optimize hns_roce_config_link_table()
Remove the unnecessary memset operation and adjust style of some lines in
hns_roce_config_link_table().

Link: https://lore.kernel.org/r/1586938475-37049-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-24 10:19:10 -03:00
Leon Romanovsky e0b4b4722d net/mlx5: Update transobj.c new cmd interface
Do mass update of transobj.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:16 +03:00
Leon Romanovsky 5d1c9a114a net/mlx5: Update vport.c to new cmd interface
Do mass update of vport.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:02 +03:00
Leon Romanovsky 322f3d45a1 RDMA/bnxt: Delete 'nq_ptr' variable which is not used
The variable "nq_ptr" is set but never used, this generates the following
warning while compiling kernel with W=1 option.

drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_service_nq':
drivers/infiniband/hw/bnxt_re/qplib_fp.c:303:25: warning:
   variable 'nq_ptr' set but not used [-Wunused-but-set-variable]
303 |  struct nq_base *nqe, **nq_ptr;
    |

Fixes: fddcbbb02a ("RDMA/bnxt_re: Simplify obtaining queue entry from hw ring")
Link: https://lore.kernel.org/r/20200419132046.123887-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-22 17:05:13 -03:00
Xi Wang 744b7bdfa7 RDMA/hns: Support 0 hop addressing for CQE buffer
Add the zero hop addressing support by using mtr interface for CQE buffer,
so the hns driver can support addressing hopnum between 0 to 3 for CQE.

Link: https://lore.kernel.org/r/1586779091-51410-7-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-22 16:22:11 -03:00
Xi Wang 6fd610c573 RDMA/hns: Support 0 hop addressing for SRQ buffer
Add the zero hop addressing support by using mtr interface for SRQ buffer,
so the hns driver can support addressing hopnum between 0 to 3 for SRQ.

Link: https://lore.kernel.org/r/1586779091-51410-6-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-22 16:21:37 -03:00
Xi Wang d563099e3e RDMA/hns: Support 0 hop addressing for WQE buffer
Add the zero hop addressing support by using new mtr interface for WQE
buffer and simple mtr invoking process, so WQE buffer can support hopnum
between 0 to 3.

Link: https://lore.kernel.org/r/1586779091-51410-5-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-22 15:59:54 -03:00
Xi Wang 477a0a3870 RDMA/hns: Optimize 0 hop addressing for EQE buffer
Use the new mtr interface to simple the hop 0 addressing and multihop
addressing process.

Link: https://lore.kernel.org/r/1586779091-51410-4-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-22 15:59:54 -03:00
Xi Wang cc23267aed RDMA/hns: Optimize hns buffer allocation flow
When the value of nbufs is 1, the buffer is in direct mode, which may cause
confusion. So optimizes current codes to make it easier to maintain and
understand.

Link: https://lore.kernel.org/r/1586779091-51410-3-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-22 15:59:54 -03:00
Xi Wang 3c873161a0 RDMA/hns: Add support for addressing when hopnum is 0
Currently, WQE and EQE table have already used the mtr interface to config
and access memory by multi-hop addressing when hopnum is from 1 to 3. But
if hopnum is 0, each table need write its own but repetitive logic, and
many duplicate code exists in the mtr interfaces invoke process.

So wraps the public logic as 3 functions: hns_roce_mtr_create(),
hns_roce_mtr_destroy() and hns_roce_mtr_map() to support hopnum ranges from
0 to 3. In addition, makes the mtr interfaces easier to use.

Link: https://lore.kernel.org/r/1586779091-51410-2-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-22 15:59:53 -03:00
Leon Romanovsky 83a2670212 RDMA/core: Fix overwriting of uobj in case of error
In case of failure to get file, the uobj is overwritten and causes to
supply bad pointer as an input to uverbs_uobject_put().

  BUG: KASAN: null-ptr-deref in atomic_fetch_sub include/asm-generic/atomic-instrumented.h:199 [inline]
  BUG: KASAN: null-ptr-deref in refcount_sub_and_test include/linux/refcount.h:253 [inline]
  BUG: KASAN: null-ptr-deref in refcount_dec_and_test include/linux/refcount.h:281 [inline]
  BUG: KASAN: null-ptr-deref in kref_put include/linux/kref.h:64 [inline]
  BUG: KASAN: null-ptr-deref in uverbs_uobject_put+0x22/0x90 drivers/infiniband/core/rdma_core.c:57
  Write of size 4 at addr 0000000000000030 by task syz-executor.4/1691

  CPU: 1 PID: 1691 Comm: syz-executor.4 Not tainted 5.6.0 #17
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
  Call Trace:
   __dump_stack lib/dump_stack.c:77 [inline]
   dump_stack+0x94/0xce lib/dump_stack.c:118
   __kasan_report+0x10c/0x190 mm/kasan/report.c:515
   kasan_report+0x32/0x50 mm/kasan/common.c:625
   check_memory_region_inline mm/kasan/generic.c:187 [inline]
   check_memory_region+0x16d/0x1c0 mm/kasan/generic.c:193
   atomic_fetch_sub include/asm-generic/atomic-instrumented.h:199 [inline]
   refcount_sub_and_test include/linux/refcount.h:253 [inline]
   refcount_dec_and_test include/linux/refcount.h:281 [inline]
   kref_put include/linux/kref.h:64 [inline]
   uverbs_uobject_put+0x22/0x90 drivers/infiniband/core/rdma_core.c:57
   alloc_begin_fd_uobject+0x1d0/0x250 drivers/infiniband/core/rdma_core.c:486
   rdma_alloc_begin_uobject+0xa8/0xf0 drivers/infiniband/core/rdma_core.c:509
   __uobj_alloc include/rdma/uverbs_std_types.h:117 [inline]
   ib_uverbs_create_comp_channel+0x16d/0x230 drivers/infiniband/core/uverbs_cmd.c:982
   ib_uverbs_write+0xaa5/0xdf0 drivers/infiniband/core/uverbs_main.c:665
   __vfs_write+0x7c/0x100 fs/read_write.c:494
   vfs_write+0x168/0x4a0 fs/read_write.c:558
   ksys_write+0xc8/0x200 fs/read_write.c:611
   do_syscall_64+0x9c/0x390 arch/x86/entry/common.c:295
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x466479
  Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
  RSP: 002b:00007efe9f6a7c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
  RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000466479
  RDX: 0000000000000018 RSI: 0000000020000040 RDI: 0000000000000003
  RBP: 00007efe9f6a86bc R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000005
  R13: 0000000000000bf2 R14: 00000000004cb80a R15: 00000000006fefc0

Fixes: 849e149063 ("RDMA/core: Do not allow alloc_commit to fail")
Link: https://lore.kernel.org/r/20200421082929.311931-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-22 15:53:42 -03:00
Leon Romanovsky 0fb00941dc RDMA/core: Prevent mixed use of FDs between shared ufiles
FDs can only be used on the ufile that created them, they cannot be mixed
to other ufiles. We are lacking a check to prevent it.

  BUG: KASAN: null-ptr-deref in atomic64_sub_and_test include/asm-generic/atomic-instrumented.h:1547 [inline]
  BUG: KASAN: null-ptr-deref in atomic_long_sub_and_test include/asm-generic/atomic-long.h:460 [inline]
  BUG: KASAN: null-ptr-deref in fput_many+0x1a/0x140 fs/file_table.c:336
  Write of size 8 at addr 0000000000000038 by task syz-executor179/284

  CPU: 0 PID: 284 Comm: syz-executor179 Not tainted 5.5.0-rc5+ #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
  Call Trace:
   __dump_stack lib/dump_stack.c:77 [inline]
   dump_stack+0x94/0xce lib/dump_stack.c:118
   __kasan_report+0x18f/0x1b7 mm/kasan/report.c:510
   kasan_report+0xe/0x20 mm/kasan/common.c:639
   check_memory_region_inline mm/kasan/generic.c:185 [inline]
   check_memory_region+0x15d/0x1b0 mm/kasan/generic.c:192
   atomic64_sub_and_test include/asm-generic/atomic-instrumented.h:1547 [inline]
   atomic_long_sub_and_test include/asm-generic/atomic-long.h:460 [inline]
   fput_many+0x1a/0x140 fs/file_table.c:336
   rdma_lookup_put_uobject+0x85/0x130 drivers/infiniband/core/rdma_core.c:692
   uobj_put_read include/rdma/uverbs_std_types.h:96 [inline]
   _ib_uverbs_lookup_comp_file drivers/infiniband/core/uverbs_cmd.c:198 [inline]
   create_cq+0x375/0xba0 drivers/infiniband/core/uverbs_cmd.c:1006
   ib_uverbs_create_cq+0x114/0x140 drivers/infiniband/core/uverbs_cmd.c:1089
   ib_uverbs_write+0xaa5/0xdf0 drivers/infiniband/core/uverbs_main.c:769
   __vfs_write+0x7c/0x100 fs/read_write.c:494
   vfs_write+0x168/0x4a0 fs/read_write.c:558
   ksys_write+0xc8/0x200 fs/read_write.c:611
   do_syscall_64+0x9c/0x390 arch/x86/entry/common.c:294
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x44ef99
  Code: 00 b8 00 01 00 00 eb e1 e8 74 1c 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c4 ff ff ff f7 d8 64 89 01 48
  RSP: 002b:00007ffc0b74c028 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
  RAX: ffffffffffffffda RBX: 00007ffc0b74c030 RCX: 000000000044ef99
  RDX: 0000000000000040 RSI: 0000000020000040 RDI: 0000000000000005
  RBP: 00007ffc0b74c038 R08: 0000000000401830 R09: 0000000000401830
  R10: 00007ffc0b74c038 R11: 0000000000000246 R12: 0000000000000000
  R13: 0000000000000000 R14: 00000000006be018 R15: 0000000000000000

Fixes: cf8966b347 ("IB/core: Add support for fd objects")
Link: https://lore.kernel.org/r/20200421082929.311931-2-leon@kernel.org
Suggested-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-22 15:53:23 -03:00
Jason Gunthorpe 39c011a538 RDMA/uverbs: Fix a race with disassociate and exit_mmap()
If uverbs_user_mmap_disassociate() is called while the mmap is
concurrently doing exit_mmap then the ordering of the
rdma_user_mmap_entry_put() is not reliable.

The put must be done before uvers_user_mmap_disassociate() returns,
otherwise there can be a use after free on the ucontext, and a left over
entry in the xarray. If the put is not done here then it is done during
rdma_umap_close() later.

Add the missing put to the error exit path.

  WARNING: CPU: 7 PID: 7111 at drivers/infiniband/core/rdma_core.c:810 uverbs_destroy_ufile_hw+0x2a5/0x340 [ib_uverbs]
  Modules linked in: bonding ipip tunnel4 geneve ip6_udp_tunnel udp_tunnel ip6_gre ip6_tunnel tunnel6 ip_gre ip_tunnel gre mlx5_ib mlx5_core mlxfw pci_hyperv_intf act_ct nf_flow_table ptp pps_core rdma_ucm ib_uverbs ib_ipoib ib_umad 8021q garp mrp openvswitch nsh nf_conncount nfsv3 nfs_acl xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype iptable_filter xt_conntrack br_netfilter bridge stp llc rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache overlay rpcrdma ib_isert iscsi_target_mod ib_iser kvm_intel ib_srpt iTCO_wdt target_core_mod iTCO_vendor_support kvm ib_srp nf_nat irqbypass crc32_pclmul crc32c_intel nf_conntrack rfkill nf_defrag_ipv6 virtio_net nf_defrag_ipv4 pcspkr ghash_clmulni_intel i2c_i801 net_failover failover i2c_core lpc_ich mfd_core rdma_cm ib_cm iw_cm button ib_core sunrpc sch_fq_codel ip_tables serio_raw [last unloaded: tunnel4]
  CPU: 7 PID: 7111 Comm: python3 Tainted: G        W         5.6.0-rc6-for-upstream-dbg-2020-03-21_06-41-26-18 #1
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
  RIP: 0010:uverbs_destroy_ufile_hw+0x2a5/0x340 [ib_uverbs]
  Code: ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 74 49 8b 84 24 08 01 00 00 48 85 c0 0f 84 13 ff ff ff 48 89 ef ff d0 e9 09 ff ff ff <0f> 0b e9 77 ff ff ff e8 0f d8 fa e0 e9 c5 fd ff ff e8 05 d8 fa e0
  RSP: 0018:ffff88840e0779a0 EFLAGS: 00010286
  RAX: dffffc0000000000 RBX: ffff8882a7721c00 RCX: 0000000000000000
  RDX: 1ffff11054ee469f RSI: ffffffff8446d7e0 RDI: ffff8882a77234f8
  RBP: ffff8882a7723400 R08: ffffed1085c0112c R09: 0000000000000001
  R10: 0000000000000001 R11: ffffed1085c0112b R12: ffff888403c30000
  R13: 0000000000000002 R14: ffff8882a7721cb0 R15: ffff8882a7721cd0
  FS:  00007f2046089700(0000) GS:ffff88842de00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f7cfe9a6e20 CR3: 000000040b8ac006 CR4: 0000000000360ee0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   ib_uverbs_remove_one+0x273/0x480 [ib_uverbs]
   ? up_write+0x15c/0x4a0
   remove_client_context+0xa6/0xf0 [ib_core]
   disable_device+0x12d/0x200 [ib_core]
   ? remove_client_context+0xf0/0xf0 [ib_core]
   ? mnt_get_count+0x1d0/0x1d0
   __ib_unregister_device+0x79/0x150 [ib_core]
   ib_unregister_device+0x21/0x30 [ib_core]
   __mlx5_ib_remove+0x91/0x110 [mlx5_ib]
   ? __mlx5_ib_remove+0x110/0x110 [mlx5_ib]
   mlx5_remove_device+0x241/0x310 [mlx5_core]
   mlx5_unregister_device+0x4d/0x1e0 [mlx5_core]
   mlx5_unload_one+0xc0/0x260 [mlx5_core]
   remove_one+0x5c/0x160 [mlx5_core]
   pci_device_remove+0xef/0x2a0
   ? pcibios_free_irq+0x10/0x10
   device_release_driver_internal+0x1d8/0x470
   unbind_store+0x152/0x200
   ? sysfs_kf_write+0x3b/0x180
   ? sysfs_file_ops+0x160/0x160
   kernfs_fop_write+0x284/0x460
   ? __sb_start_write+0x243/0x3a0
   vfs_write+0x197/0x4a0
   ksys_write+0x156/0x1e0
   ? __x64_sys_read+0xb0/0xb0
   ? do_syscall_64+0x73/0x1330
   ? do_syscall_64+0x73/0x1330
   do_syscall_64+0xe7/0x1330
   ? down_write_nested+0x3e0/0x3e0
   ? syscall_return_slowpath+0x970/0x970
   ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
   ? lockdep_hardirqs_off+0x1de/0x2d0
   ? trace_hardirqs_off_thunk+0x1a/0x1c
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
  RIP: 0033:0x7f20a3ff0cdb
  Code: 53 48 89 d5 48 89 f3 48 83 ec 18 48 89 7c 24 08 e8 5a fd ff ff 48 89 ea 41 89 c0 48 89 de 48 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 90 fd ff ff 48
  RSP: 002b:00007f2046087040 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
  RAX: ffffffffffffffda RBX: 00007f2038016df0 RCX: 00007f20a3ff0cdb
  RDX: 000000000000000d RSI: 00007f2038016df0 RDI: 0000000000000018
  RBP: 000000000000000d R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000100 R11: 0000000000000293 R12: 00007f2046e29630
  R13: 00007f20280035a0 R14: 0000000000000018 R15: 00007f2038016df0

Fixes: c043ff2cfb ("RDMA: Connect between the mmap entry and the umap_priv structure")
Link: https://lore.kernel.org/r/20200413132136.930388-1-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-22 15:43:46 -03:00
Aharon Landau 2d7e3ff7b6 RDMA/mlx5: Set GRH fields in query QP on RoCE
GRH fields such as sgid_index, hop limit, et. are set in the QP context
when QP is created/modified.

Currently, when query QP is performed, we fill the GRH fields only if the
GRH bit is set in the QP context, but this bit is not set for RoCE. Adjust
the check so we will set all relevant data for the RoCE too.

Since this data is returned to userspace, the below is an ABI regression.

Fixes: d8966fcd4c ("IB/core: Use rdma_ah_attr accessor functions")
Link: https://lore.kernel.org/r/20200413132028.930109-1-leon@kernel.org
Signed-off-by: Aharon Landau <aharonl@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-22 15:43:46 -03:00
Leon Romanovsky 333fbaa025 net/mlx5: Move QP logic to mlx5_ib
The mlx5_core doesn't need any functionality coded in qp.c, so move
that file to drivers/infiniband/ be under mlx5_ib responsibility.

Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:21 +03:00
Leon Romanovsky 42f9bbd112 RDMA/mlx5: Alphabetically sort build artifacts
Sort .o objects in makefile to make addition of new object
less cumbersome.

Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:20 +03:00
Leon Romanovsky 9c275ee4ad net/mlx5: Delete not-used cmd header
The structures defined in the cmd header are not used and can be safely
removed from the driver. This patch removes that file and deletes all
relevant includes.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:20 +03:00
Leon Romanovsky bfd745f8f3 RDMA/mlx5: Delete Q counter allocations command
Remove mlx5_ib implementation of Q counter allocation logic
together with cleaning boolean which controlled validity of the
counter. It is not needed, because counter_id == 0 means that
counter is not valid.

Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:20 +03:00
Leon Romanovsky 66247fbb28 net/mlx5: Remove Q counter low level helper APIs
mlx5 core users are encouraged to use low level API (mlx5_cmd_exec)
without the need of helper functions, do this for q counters, remove
helper functions and call mlx5_cmd_exec directly from users.

This will help reduce the total amount of code and reduction of the
mlx5_core symbol table.

Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:20 +03:00
Max Gurtovoy 95a776e8a6 RDMA/rw: use DIV_ROUND_UP to calculate nr_ops
Don't open-code DIV_ROUND_UP() kernel macro.

Link: https://lore.kernel.org/r/20200413133905.933343-1-leon@kernel.org
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-15 11:34:49 -03:00
Jason Gunthorpe 6e051971b0 RDMA/siw: Fix potential siw_mem refcnt leak in siw_fastreg_mr()
siw_fastreg_mr() invokes siw_mem_id2obj(), which returns a local reference
of the siw_mem object to "mem" with increased refcnt.  When
siw_fastreg_mr() returns, "mem" becomes invalid, so the refcount should be
decreased to keep refcount balanced.

The issue happens in one error path of siw_fastreg_mr(). When "base_mr"
equals to NULL but "mem" is not NULL, the function forgets to decrease the
refcnt increased by siw_mem_id2obj() and causes a refcnt leak.

Reorganize the flow so that the goto unwind can be used as expected.

Fixes: b9be6f18cf ("rdma/siw: transmit path")
Link: https://lore.kernel.org/r/1586939949-69856-1-git-send-email-xiyuyang19@fudan.edu.cn
Reported-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-15 11:26:51 -03:00
Alaa Hleihel c08cfb2d8d RDMA/mlx4: Initialize ib_spec on the stack
Initialize ib_spec on the stack before using it, otherwise we will have
garbage values that will break creating default rules with invalid parsing
error.

Fixes: a37a1a4284 ("IB/mlx4: Add mechanism to support flow steering over IB links")
Link: https://lore.kernel.org/r/20200413132235.930642-1-leon@kernel.org
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 20:42:31 -03:00
Leon Romanovsky dd302ee41e RDMA/cma: Limit the scope of rdma_is_consumer_reject function
The function is local to cma.c, so let's limit its scope.

Link: https://lore.kernel.org/r/20200413132323.930869-1-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 19:44:12 -03:00
Devesh Sharma 8ce111d00e RDMA/bnxt_re: Remove dead code from rcfw
In the previous refactoring serise there were few leftover functions which
are not is use anymore.  Removed them as it is a dead code.

Fixes: 6f53196bc5 ("RDMA/bnxt_re: Refactor doorbell management functions")
Link: https://lore.kernel.org/r/1585851136-2316-5-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 16:39:35 -03:00
Devesh Sharma fddcbbb02a RDMA/bnxt_re: Simplify obtaining queue entry from hw ring
Restructring the data path and control path queue management code to
simplify the way a queue element is extracted from the hardware ring.

Introduced a new function which will give a pointer to the next ring item
depending upon the current cons/prod index in the hardware queue.

Further, there are hardcoding when size of queue entry is calculated,
replacing it with an inline function. This function would be easier to
expand if need going forward.

The code section to initialize the PSN search areas has also been
restructured and couple of functions has been added there.

Link: https://lore.kernel.org/r/1585851136-2316-4-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 16:39:35 -03:00
Devesh Sharma c78671a4e6 RDMA/bnxt_re: Update missing hsi data structures
Adding fast path support data structure into hardware HSI. These
structures are header only definition of RQE/SRQE/SQE. This is to help
calculating the size of hardware wqe size.

Link: https://lore.kernel.org/r/1585851136-2316-3-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 16:39:35 -03:00
Devesh Sharma 99bf84e24e RDMA/bnxt_re: Reduce device page size detection code
Getting rid of the repeated code in the driver when deciding on the page
size of the hardware ring memory. A new common function would translate
the ring page size into device specific page size.

Link: https://lore.kernel.org/r/1585851136-2316-2-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 16:39:34 -03:00
Zou Wei 4f95308911 IB/qib: Remove unused variable ret
This patch fixes below warnings reported by coccicheck

drivers/infiniband/hw/qib/qib_iba7322.c:6878:8-11:
Unneeded variable: "ret". Return "0" on line 6907
drivers/infiniband/hw/qib/qib_iba7322.c:2378:5-8:
Unneeded variable: "ret". Return "0" on line 2513

Link: https://lore.kernel.org/r/1586745724-107477-1-git-send-email-zou_wei@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 16:34:17 -03:00
Mauro Carvalho Chehab 255e636df4 IB: Fix some documentation warnings
Parsing verbs.c with kernel-doc produce some warnings:

	./drivers/infiniband/core/verbs.c:2579: WARNING: Unexpected indentation.
	./drivers/infiniband/core/verbs.c:2581: WARNING: Block quote ends without a blank line; unexpected unindent.
	./drivers/infiniband/core/verbs.c:2613: WARNING: Unexpected indentation.
	./drivers/infiniband/core/verbs.c:2579: WARNING: Unexpected indentation.
	./drivers/infiniband/core/verbs.c:2581: WARNING: Block quote ends without a blank line; unexpected unindent.
	./drivers/infiniband/core/verbs.c:2613: WARNING: Unexpected indentation.

Address them by adding an extra blank line and converting the parameters
on one of the arguments to a table.

Link: https://lore.kernel.org/r/4c5466d0f450c5a9952138150c3485740b37f9c5.1586881715.git.mchehab+huawei@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 16:31:28 -03:00
Jason Gunthorpe 1587982e70 RDMA: Remove a few extra calls to ib_get_client_data()
These four places already have easy access to the client data, just use
that instead.

Link: https://lore.kernel.org/r/0-v1-fae83f600b4a+68-less_get_client_data%25jgg@mellanox.com
Acked-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 16:05:08 -03:00
Dan Carpenter 9836535158 RDMA/cm: Fix an error check in cm_alloc_id_priv()
The xa_alloc_cyclic_irq() function returns either 0 or 1 on success and
negatives on error.  This code treats 1 as an error and returns ERR_PTR(1)
which will cause an Oops in the caller.

Fixes: ae78ff3a0f ("RDMA/cm: Convert local_id_table to XArray")
Link: https://lore.kernel.org/r/20200407093714.GA80285@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 16:01:38 -03:00
Jason Gunthorpe eb356e6dc1 RDMA/uverbs: Make the event_queue fds return POLLERR when disassociated
If is_closed is set, and the event list is empty, then read() will return
-EIO without blocking. After setting is_closed in
ib_uverbs_free_event_queue(), we do trigger a wake_up on the poll_wait,
but the fops->poll() function does not check it, so poll will continue to
sleep on an empty list.

Fixes: 14e23bd6d2 ("RDMA/core: Fix locking in ib_uverbs_event_read")
Link: https://lore.kernel.org/r/0-v1-ace813388969+48859-uverbs_poll_fix%25jgg@mellanox.com
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 15:56:34 -03:00
Yishai Hadas cf26deff90 RDMA/mlx5: Fix udata response upon SRQ creation
Fix udata response upon SRQ creation to use the UAPI structure (i.e.
mlx5_ib_create_srq_resp). It did not zero the reserved field in userspace.

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Link: https://lore.kernel.org/r/20200406173540.1466477-1-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 15:56:34 -03:00
Zhu Yanjun 0184afd15a RDMA/rxe: Set default vendor ID
The RXE driver doesn't set vendor_id and user space applications see
zeros. This causes to pyverbs tests to fail with the following traceback,
because the expectation is to have valid vendor_id.

Traceback (most recent call last):
  File "tests/test_device.py", line 51, in test_query_device
    self.verify_device_attr(attr)
  File "tests/test_device.py", line 77, in verify_device_attr
    assert attr.vendor_id != 0

In order to fix it, we will set vendor_id 0XFFFFFF, according to the IBTA
v1.4 A3.3.1 VENDOR INFORMATION section.

"""
A vendor that produces a generic controller (i.e., one that supports a
standard I/O protocol such as SRP), which does not have vendor specific
device drivers, may use the value of 0xFFFFFF in the VendorID field.
"""

Before:

hca_id: rxe0
        transport:                      InfiniBand (0)
        fw_ver:                         0.0.0
        node_guid:                      5054:00ff:feaa:5363
        sys_image_guid:                 5054:00ff:feaa:5363
        vendor_id:                      0x0000

After:

hca_id: rxe0
        transport:                      InfiniBand (0)
        fw_ver:                         0.0.0
        node_guid:                      5054:00ff:feaa:5363
        sys_image_guid:                 5054:00ff:feaa:5363
        vendor_id:                      0xffffff

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20200406173501.1466273-1-leon@kernel.org
Signed-off-by: Zhu Yanjun <yanjunz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 15:52:38 -03:00
Leon Romanovsky 0c6949c3d1 RDMA/cm: Fix missing RDMA_CM_EVENT_REJECTED event after receiving REJ message
The cm_reset_to_idle() call before formatting event changed the CM_ID
state from IB_CM_REQ_RCVD to be IB_CM_IDLE. It caused to wrong value of
CM_REJ_MESSAGE_REJECTED field.

The result of that was that rdma_reject() calls in the passive side didn't
generate RDMA_CM_EVENT_REJECTED event in the active side.

Fixes: 81ddb41f87 ("RDMA/cm: Allow ib_send_cm_rej() to be done under lock")
Link: https://lore.kernel.org/r/20200406173242.1465911-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 15:50:49 -03:00
Colin Ian King f70968f05d i40iw: fix null pointer dereference on a null wqe pointer
Currently the null check for wqe is incorrect and lets a null wqe
be passed to set_64bit_val and this indexes into the null pointer
causing a null pointer dereference.  Fix this by fixing the null
pointer check to return an error if wqe is null.

Link: https://lore.kernel.org/r/20200401224921.405279-1-colin.king@canonical.com
Addresses-Coverity: ("dereference after a null check")
Fixes: 4b34e23f4e ("i40iw: Report correct firmware version")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 15:44:50 -03:00
Linus Torvalds 919dce2470 RDMA 5.7 pull request
The majority of the patches are cleanups, refactorings and clarity
 improvements
 
 - Various driver updates for siw, bnxt_re, rxe, efa, mlx5, hfi1
 
 - Lots of cleanup patches for hns
 
 - Convert more places to use refcount
 
 - Aggressively lock the RDMA CM code that syzkaller says isn't working
 
 - Work to clarify ib_cm
 
 - Use the new ib_device lifecycle model in bnxt_re
 
 - Fix mlx5's MR cache which seems to be failing more often with the new
   ODP code
 
 - mlx5 'dynamic uar' and 'tx steering' user interfaces
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl6CSr0ACgkQOG33FX4g
 mxrtKg//XovbOfYAO7nC05FtGz9iEkIUBiwQOjUojgNSi6RMNDqRW1bmqKUugm1o
 9nXA6tw+fueEvUNSD541SCcxkUZJzWvubO9wHB6N3Fgy68N3Vf2rKV3EBBTh99rK
 Cb7rnmTTN6izRyI1wdyP2sjDJGyF8zvsgIrG2sibzLnqlyScrnD98YS0FdPZUfOa
 a1mmXBN/T7eaQ4TbE3lLLzGnifRlGmZ5vxEvOQmAlOdqAfIKQdbbW7oCRLVIleso
 gfQlOOvIgzHRwQ3VrFa3i6ETYtywXq7EgmQxCjqPVJQjCA79n5TLBkP1iRhvn8xi
 3+LO4YCkiSJ/NjTA2d9KwT6K4djj3cYfcueuqo2MoXXr0YLiY6TLv1OffKcUIq7c
 LM3d4CSwIAG+C2FZwaQrdSGa2h/CNfLAEeKxv430zggeDNKlwHJPV5w3rUJ8lT56
 wlyT7Lzosl0O9Z/1BSLYckTvbBCtYcmanVyCfHG8EJKAM1/tXy5LS8btJ3e51rPu
 XekR9ELrTTA2CTuoSCQGP6J0dBD2U7qO4XRCQ9N5BYLrI6RdP7Z4xYzzey49Z3Cx
 JaF86eurM7nS5biUszTtwww8AJMyYicB+0VyjBfk+mhv90w8tS1vZ1aZKzaQ1L6Z
 jWn8WgIN4rWY0YGQs6PiovT1FplyGs3p1wNmjn92WO0wZZ3WsmQ=
 =ae+a
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "The majority of the patches are cleanups, refactorings and clarity
  improvements.

  This cycle saw some more activity from Syzkaller, I think we are now
  clean on all but one of those bugs, including the long standing and
  obnoxious rdma_cm locking design defect. Continue to see many drivers
  getting cleanups, with a few new user visible features.

  Summary:

   - Various driver updates for siw, bnxt_re, rxe, efa, mlx5, hfi1

   - Lots of cleanup patches for hns

   - Convert more places to use refcount

   - Aggressively lock the RDMA CM code that syzkaller says isn't
     working

   - Work to clarify ib_cm

   - Use the new ib_device lifecycle model in bnxt_re

   - Fix mlx5's MR cache which seems to be failing more often with the
     new ODP code

   - mlx5 'dynamic uar' and 'tx steering' user interfaces"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (144 commits)
  RDMA/bnxt_re: make bnxt_re_ib_init static
  IB/qib: Delete struct qib_ivdev.qp_rnd
  RDMA/hns: Fix uninitialized variable bug
  RDMA/hns: Modify the mask of QP number for CQE of hip08
  RDMA/hns: Reduce the maximum number of extend SGE per WQE
  RDMA/hns: Reduce PFC frames in congestion scenarios
  RDMA/mlx5: Add support for RDMA TX flow table
  net/mlx5: Add support for RDMA TX steering
  IB/hfi1: Call kobject_put() when kobject_init_and_add() fails
  IB/hfi1: Fix memory leaks in sysfs registration and unregistration
  IB/mlx5: Move to fully dynamic UAR mode once user space supports it
  IB/mlx5: Limit the scope of struct mlx5_bfreg_info to mlx5_ib
  IB/mlx5: Extend QP creation to get uar page index from user space
  IB/mlx5: Extend CQ creation to get uar page index from user space
  IB/mlx5: Expose UAR object and its alloc/destroy commands
  IB/hfi1: Get rid of a warning
  RDMA/hns: Remove redundant judgment of qp_type
  RDMA/hns: Remove redundant assignment of wc->smac when polling cq
  RDMA/hns: Remove redundant qpc setup operations
  RDMA/hns: Remove meaningless prints
  ...
2020-04-01 18:18:18 -07:00
Linus Torvalds 29d9f30d4c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Fix the iwlwifi regression, from Johannes Berg.

   2) Support BSS coloring and 802.11 encapsulation offloading in
      hardware, from John Crispin.

   3) Fix some potential Spectre issues in qtnfmac, from Sergey
      Matyukevich.

   4) Add TTL decrement action to openvswitch, from Matteo Croce.

   5) Allow paralleization through flow_action setup by not taking the
      RTNL mutex, from Vlad Buslov.

   6) A lot of zero-length array to flexible-array conversions, from
      Gustavo A. R. Silva.

   7) Align XDP statistics names across several drivers for consistency,
      from Lorenzo Bianconi.

   8) Add various pieces of infrastructure for offloading conntrack, and
      make use of it in mlx5 driver, from Paul Blakey.

   9) Allow using listening sockets in BPF sockmap, from Jakub Sitnicki.

  10) Lots of parallelization improvements during configuration changes
      in mlxsw driver, from Ido Schimmel.

  11) Add support to devlink for generic packet traps, which report
      packets dropped during ACL processing. And use them in mlxsw
      driver. From Jiri Pirko.

  12) Support bcmgenet on ACPI, from Jeremy Linton.

  13) Make BPF compatible with RT, from Thomas Gleixnet, Alexei
      Starovoitov, and your's truly.

  14) Support XDP meta-data in virtio_net, from Yuya Kusakabe.

  15) Fix sysfs permissions when network devices change namespaces, from
      Christian Brauner.

  16) Add a flags element to ethtool_ops so that drivers can more simply
      indicate which coalescing parameters they actually support, and
      therefore the generic layer can validate the user's ethtool
      request. Use this in all drivers, from Jakub Kicinski.

  17) Offload FIFO qdisc in mlxsw, from Petr Machata.

  18) Support UDP sockets in sockmap, from Lorenz Bauer.

  19) Fix stretch ACK bugs in several TCP congestion control modules,
      from Pengcheng Yang.

  20) Support virtual functiosn in octeontx2 driver, from Tomasz
      Duszynski.

  21) Add region operations for devlink and use it in ice driver to dump
      NVM contents, from Jacob Keller.

  22) Add support for hw offload of MACSEC, from Antoine Tenart.

  23) Add support for BPF programs that can be attached to LSM hooks,
      from KP Singh.

  24) Support for multiple paths, path managers, and counters in MPTCP.
      From Peter Krystad, Paolo Abeni, Florian Westphal, Davide Caratti,
      and others.

  25) More progress on adding the netlink interface to ethtool, from
      Michal Kubecek"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2121 commits)
  net: ipv6: rpl_iptunnel: Fix potential memory leak in rpl_do_srh_inline
  cxgb4/chcr: nic-tls stats in ethtool
  net: dsa: fix oops while probing Marvell DSA switches
  net/bpfilter: remove superfluous testing message
  net: macb: Fix handling of fixed-link node
  net: dsa: ksz: Select KSZ protocol tag
  netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_write
  net: stmmac: add EHL 2.5Gbps PCI info and PCI ID
  net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID
  net: stmmac: create dwmac-intel.c to contain all Intel platform
  net: dsa: bcm_sf2: Support specifying VLAN tag egress rule
  net: dsa: bcm_sf2: Add support for matching VLAN TCI
  net: dsa: bcm_sf2: Move writing of CFP_DATA(5) into slicing functions
  net: dsa: bcm_sf2: Check earlier for FLOW_EXT and FLOW_MAC_EXT
  net: dsa: bcm_sf2: Disable learning for ASP port
  net: dsa: b53: Deny enslaving port 7 for 7278 into a bridge
  net: dsa: b53: Prevent tagged VLAN on port 7 for 7278
  net: dsa: b53: Restore VLAN entries upon (re)configuration
  net: dsa: bcm_sf2: Fix overflow checks
  hv_netvsc: Remove unnecessary round_up for recv_completion_cnt
  ...
2020-03-31 17:29:33 -07:00
Linus Torvalds a776c270a0 Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
 "The EFI changes in this cycle are much larger than usual, for two
  (positive) reasons:

   - The GRUB project is showing signs of life again, resulting in the
     introduction of the generic Linux/UEFI boot protocol, instead of
     x86 specific hacks which are increasingly difficult to maintain.
     There's hope that all future extensions will now go through that
     boot protocol.

   - Preparatory work for RISC-V EFI support.

  The main changes are:

   - Boot time GDT handling changes

   - Simplify handling of EFI properties table on arm64

   - Generic EFI stub cleanups, to improve command line handling, file
     I/O, memory allocation, etc.

   - Introduce a generic initrd loading method based on calling back
     into the firmware, instead of relying on the x86 EFI handover
     protocol or device tree.

   - Introduce a mixed mode boot method that does not rely on the x86
     EFI handover protocol either, and could potentially be adopted by
     other architectures (if another one ever surfaces where one
     execution mode is a superset of another)

   - Clean up the contents of 'struct efi', and move out everything that
     doesn't need to be stored there.

   - Incorporate support for UEFI spec v2.8A changes that permit
     firmware implementations to return EFI_UNSUPPORTED from UEFI
     runtime services at OS runtime, and expose a mask of which ones are
     supported or unsupported via a configuration table.

   - Partial fix for the lack of by-VA cache maintenance in the
     decompressor on 32-bit ARM.

   - Changes to load device firmware from EFI boot service memory
     regions

   - Various documentation updates and minor code cleanups and fixes"

* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
  efi/libstub/arm: Fix spurious message that an initrd was loaded
  efi/libstub/arm64: Avoid image_base value from efi_loaded_image
  partitions/efi: Fix partition name parsing in GUID partition entry
  efi/x86: Fix cast of image argument
  efi/libstub/x86: Use ULONG_MAX as upper bound for all allocations
  efi: Fix a mistype in comments mentioning efivar_entry_iter_begin()
  efi/libstub: Avoid linking libstub/lib-ksyms.o into vmlinux
  efi/x86: Preserve %ebx correctly in efi_set_virtual_address_map()
  efi/x86: Ignore the memory attributes table on i386
  efi/x86: Don't relocate the kernel unless necessary
  efi/x86: Remove extra headroom for setup block
  efi/x86: Add kernel preferred address to PE header
  efi/x86: Decompress at start of PE image load address
  x86/boot/compressed/32: Save the output address instead of recalculating it
  efi/libstub/x86: Deal with exit() boot service returning
  x86/boot: Use unsigned comparison for addresses
  efi/x86: Avoid using code32_start
  efi/x86: Make efi32_pe_entry() more readable
  efi/x86: Respect 32-bit ABI in efi32_pe_entry()
  efi/x86: Annotate the LOADED_IMAGE_PROTOCOL_GUID with SYM_DATA
  ...
2020-03-30 16:13:08 -07:00
YueHaibing b4d8ddf835 RDMA/bnxt_re: make bnxt_re_ib_init static
Fix sparse warning:

drivers/infiniband/hw/bnxt_re/main.c:1313:5:
 warning: symbol 'bnxt_re_ib_init' was not declared. Should it be static?

Link: https://lore.kernel.org/r/20200330110219.24448-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-30 15:03:19 -03:00
Saeed Mahameed e999a7343d Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  mlx5: Remove uninitialized use of key in mlx5_core_create_mkey
  {IB,net}/mlx5: Move asynchronous mkey creation to mlx5_ib
  {IB,net}/mlx5: Assign mkey variant in mlx5_ib only
  {IB,net}/mlx5: Setup mkey variant before mr create command invocation

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-29 23:42:11 -07:00
David S. Miller f0b5989745 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor comment conflict in mac80211.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29 21:25:29 -07:00
George Spelvin 3e87f43130 IB/qib: Delete struct qib_ivdev.qp_rnd
I was checking the field to see if it needed the full get_random_bytes()
and discovered it's unused.

Only compile-tested, as I don't have the hardware, but I'm still pretty
confident.

Link: https://lore.kernel.org/r/202003281643.02SGh6eG002694@sdf.org
Signed-off-by: George Spelvin <lkml@sdf.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-29 11:15:16 -03:00
Gustavo A. R. Silva d35dc58dd2 RDMA/hns: Fix uninitialized variable bug
There is a potential execution path in which variable *ret* is returned
without being properly initialized, previously.

Fix this by initializing variable *ret* to 0.

Link: https://lore.kernel.org/r/20200328023539.GA32016@embeddedor
Addresses-Coverity-ID: 1491917 ("Uninitialized scalar variable")
Fixes: 2f49de21f3 ("RDMA/hns: Optimize mhop get flow for multi-hop addressing")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-29 11:09:55 -03:00
Lang Cheng 90e735aecc RDMA/hns: Modify the mask of QP number for CQE of hip08
The hip08 supports up to 1M QPs, so the qpn mask of cqe should be
modified.

Link: https://lore.kernel.org/r/1585194018-4381-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-29 11:04:21 -03:00
Lang Cheng 019cd05ce5 RDMA/hns: Reduce the maximum number of extend SGE per WQE
Just reduce the default number to 64 for backward compatibility, the
driver can still get this configuration from the firmware.

Link: https://lore.kernel.org/r/1585194018-4381-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-29 11:04:21 -03:00
Jihua Tao 9d04d56c47 RDMA/hns: Reduce PFC frames in congestion scenarios
The original value means sending 16 packets at a time, and it should be
configured to 0 which means sending 1 packet instead. It is modified to
reduce the number of PFC frames to make sure the performance meets
expectations when flow control is enabled on hip08.

Link: https://lore.kernel.org/r/1585194018-4381-2-git-send-email-liweihang@huawei.com
Signed-off-by: Jihua Tao <taojihua4@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-29 11:04:21 -03:00
Jason Gunthorpe dbdf8909d0 Merge branch 'mlx5_tx_steering' into rdma.git for-next
Leon Romanovsky says:

====================
Those two patches from Michael extends mlx5_core and mlx5_ib flow steering
to support RDMA TX in similar way to already supported RDMA RX.
====================

Based on the mlx5-next branch at
 git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Due to dependencies

* branch 'mlx5_tx_steering':
  RDMA/mlx5: Add support for RDMA TX flow table
  net/mlx5: Add support for RDMA TX steering
2020-03-27 13:26:59 -03:00
Michael Guralnik af9c38411d RDMA/mlx5: Add support for RDMA TX flow table
Enable user application to add rules for RDMA TX steering table.
Rules in this steering table will allow to steer transmitted RDMA
traffic.

Link: https://lore.kernel.org/r/20200324061425.1570190-3-leon@kernel.org
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-27 13:24:48 -03:00
Kaike Wan dfb5394f80 IB/hfi1: Call kobject_put() when kobject_init_and_add() fails
When kobject_init_and_add() returns an error in the function
hfi1_create_port_files(), the function kobject_put() is not called for the
corresponding kobject, which potentially leads to memory leak.

This patch fixes the issue by calling kobject_put() even if
kobject_init_and_add() fails.

Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200326163813.21129.44280.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-27 13:13:36 -03:00
Kaike Wan 5c15abc432 IB/hfi1: Fix memory leaks in sysfs registration and unregistration
When the hfi1 driver is unloaded, kmemleak will report the following
issue:

unreferenced object 0xffff8888461a4c08 (size 8):
comm "kworker/0:0", pid 5, jiffies 4298601264 (age 2047.134s)
hex dump (first 8 bytes):
73 64 6d 61 30 00 ff ff sdma0...
backtrace:
[<00000000311a6ef5>] kvasprintf+0x62/0xd0
[<00000000ade94d9f>] kobject_set_name_vargs+0x1c/0x90
[<0000000060657dbb>] kobject_init_and_add+0x5d/0xb0
[<00000000346fe72b>] 0xffffffffa0c5ecba
[<000000006cfc5819>] 0xffffffffa0c866b9
[<0000000031c65580>] 0xffffffffa0c38e87
[<00000000e9739b3f>] local_pci_probe+0x41/0x80
[<000000006c69911d>] work_for_cpu_fn+0x16/0x20
[<00000000601267b5>] process_one_work+0x171/0x380
[<0000000049a0eefa>] worker_thread+0x1d1/0x3f0
[<00000000909cf2b9>] kthread+0xf8/0x130
[<0000000058f5f874>] ret_from_fork+0x35/0x40

This patch fixes the issue by:

- Releasing dd->per_sdma[i].kobject in hfi1_unregister_sysfs().
  - This will fix the memory leak.

- Calling kobject_put() to unwind operations only for those entries in
   dd->per_sdma[] whose operations have succeeded (including the current
   one that has just failed) in hfi1_verbs_register_sysfs().

Cc: <stable@vger.kernel.org>
Fixes: 0cb2aa690c ("IB/hfi1: Add sysfs interface for affinity setup")
Link: https://lore.kernel.org/r/20200326163807.21129.27371.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-27 13:13:36 -03:00
Yishai Hadas 0a2fd01c28 IB/mlx5: Move to fully dynamic UAR mode once user space supports it
Move to fully dynamic UAR mode once user space supports it.  In this case
we prevent any legacy mode of UARs on the allocated context and prevent
redundant allocation of the static ones.

Link: https://lore.kernel.org/r/20200324060143.1569116-6-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-27 12:59:05 -03:00
Leon Romanovsky 2152862298 IB/mlx5: Limit the scope of struct mlx5_bfreg_info to mlx5_ib
struct mlx5_bfreg_info is used by mlx5_ib only but is exposed to both RDMA
and netdev parts of mlx5 driver. Move that struct to mlx5_ib namespace,
clean vertical space alignment and convert lib_uar_4k from bool to
bitfield.

Link: https://lore.kernel.org/r/20200324060143.1569116-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-27 12:59:04 -03:00
Yishai Hadas ac42a5ee92 IB/mlx5: Extend QP creation to get uar page index from user space
Extend QP creation to get uar page index from user space, this mode can be
used with the UAR dynamic mode APIs to allocate/destroy a UAR object.

As part of enabling this option blocked the weird/un-supported cross
channel option which uses index 0 hard-coded.

This QP flag wasn't exposed to user space as part of any formal upstream
release, the dynamic option can allow having valid UAR page index instead.

Link: https://lore.kernel.org/r/20200324060143.1569116-4-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-27 12:59:04 -03:00
Yishai Hadas 64d99f6a62 IB/mlx5: Extend CQ creation to get uar page index from user space
Extend CQ creation to get uar page index from user space, this mode can be
used with the UAR dynamic mode APIs to allocate/destroy a UAR object.

Link: https://lore.kernel.org/r/20200324060143.1569116-3-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-27 12:59:04 -03:00
Yishai Hadas 342ee59de9 IB/mlx5: Expose UAR object and its alloc/destroy commands
Expose UAR object and its alloc/destroy commands to be used over the ioctl
interface by user space applications.

This API supports both BF & NC modes and enables a dynamic allocation of
UARs once really needed.

As the number of driver objects were limited by the core ones when the
merged tree is prepared, had to decrease the number of core objects to
enable the new UAR object usage.

Link: https://lore.kernel.org/r/20200324060143.1569116-2-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-27 12:59:04 -03:00
Mauro Carvalho Chehab a4da83c215 IB/hfi1: Get rid of a warning
The right markup for a variable is @foo, and not @foo[].

Using a wrong markup caused this warning:

	./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:243: WARNING: Inline strong start-string without end-string.

Link: https://lore.kernel.org/r/9dce702510505556d75a13d9641e09218a4b4a65.1584456635.git.mchehab+huawei@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-27 12:51:04 -03:00
Weihang Li e0b0722643 RDMA/hns: Remove redundant judgment of qp_type
Type of qp has been checked in check_send_valid(), so this judgment should
be removed.

Link: https://lore.kernel.org/r/1584674622-52773-11-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26 16:52:29 -03:00
Weihang Li cd4a70bb7d RDMA/hns: Remove redundant assignment of wc->smac when polling cq
The field smac in ib_wc was used for create AH and then it will be treated
as destination mac address in UD sqwqe, but related code about filling smac
into AH has been removed in core. Actually, the dmac in UD sqwqe is parsed
from the dgid in grh which is passed in by ULP now, so this assignment
should be removed.

Link: https://lore.kernel.org/r/1584674622-52773-10-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26 16:52:29 -03:00
Lang Cheng f4c5d869c8 RDMA/hns: Remove redundant qpc setup operations
Before calling modify_qp_reset_to_init(), the entire qpc mask has been
cleared, so it is no longer necessary to clear the specific fields in the
mask.

Link: https://lore.kernel.org/r/1584674622-52773-9-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26 16:52:28 -03:00
Wenpeng Liang bceda6e67b RDMA/hns: Remove meaningless prints
ceq and aeq is a ring buffer, consumer index of them will be set to zero
after reaching the maximum value. The warning should be removed or it may
mislead the users.

Link: https://lore.kernel.org/r/1584674622-52773-8-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26 16:52:28 -03:00
Lang Cheng f91b919687 RDMA/hns: Remove definition of cq doorbell structure
The struct hns_roce_v2_cq_db is unused, it should be removed.

Link: https://lore.kernel.org/r/1584674622-52773-7-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26 16:52:28 -03:00
Lang Cheng fd72926c33 RDMA/hns: Adjust the qp status value sequence of the hardware
Interchange SQD and SQE to match the protocol.

Link: https://lore.kernel.org/r/1584674622-52773-6-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26 16:52:28 -03:00
Lijun Ou 99e713f8da RDMA/hns: Optimize hns_roce_alloc_vf_resource()
The capbilities of hardware should be got at first and then used in
hns_roce_alloc_vf_resource(). Also removes an unnecessary if ... else
condition in it.

Link: https://lore.kernel.org/r/1584674622-52773-5-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26 16:52:27 -03:00
Lang Cheng d398d4ca5f RDMA/hns: Simplify attribute judgment code
Combine attribute flags before masking them.

Link: https://lore.kernel.org/r/1584674622-52773-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26 16:52:27 -03:00
Weihang Li 30d41e18c3 RDMA/hns: Fix a wrong judgment of return value
hns_roce_alloc_mtt_range() never return -1, ret should be checked
whether it is zero instead of -1.

Fixes: 1ceb0b11a8 ("RDMA/hns: Fix non-standard error codes")
Link: https://lore.kernel.org/r/1584674622-52773-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26 16:52:27 -03:00
Lijun Ou ae1c61489c RDMA/hns: Unify format of prints
Use ibdev_err/dbg/warn() instead of dev_err/dbg/warn(), and modify some
prints into format of "failed to do something, ret = n".

Link: https://lore.kernel.org/r/1584674622-52773-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26 16:52:26 -03:00
Sergey Gorenko 26e28deb81 IB/iser: Always check sig MR before putting it to the free pool
libiscsi calls the check_protection transport handler only if SCSI-Respose
is received. So, the handler is never called if iSCSI task is completed
for some other reason like a timeout or error handling. And this behavior
looks correct. But the iSER does not handle this case properly because it
puts a non-checked signature MR to the free pool. Then the error occurs at
reusing the MR because it is not allowed to invalidate a signature MR
without checking.

This commit adds an extra check to iser_unreg_mem_fastreg(), which is a
part of the task cleanup flow. Now the signature MR is checked there if it
is needed.

Link: https://lore.kernel.org/r/20200325151210.1548-1-sergeygo@mellanox.com
Signed-off-by: Sergey Gorenko <sergeygo@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26 16:46:54 -03:00
Zhu Yanjun d0ca2c35dd RDMA/rxe: Set sys_image_guid to be aligned with HW IB devices
The RXE driver doesn't set sys_image_guid and user space applications see
zeros. This causes to pyverbs tests to fail with the following traceback,
because the IBTA spec requires to have valid sys_image_guid.

 Traceback (most recent call last):
   File "./tests/test_device.py", line 51, in test_query_device
     self.verify_device_attr(attr)
   File "./tests/test_device.py", line 74, in verify_device_attr
     assert attr.sys_image_guid != 0

In order to fix it, set sys_image_guid to be equal to node_guid.

Before:
 5: rxe0: ... node_guid 5054:00ff:feaa:5363 sys_image_guid
 0000:0000:0000:0000

After:
 5: rxe0: ... node_guid 5054:00ff:feaa:5363 sys_image_guid
 5054:00ff:feaa:5363

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20200323112800.1444784-1-leon@kernel.org
Signed-off-by: Zhu Yanjun <yanjunz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26 16:45:29 -03:00
Takashi Iwai 23ab5261e2 IB/hfi1: Use scnprintf() for avoiding potential buffer overflow
Since snprintf() returns the would-be-output size instead of the actual
output size, the succeeding calls may go beyond the given buffer limit.
Fix it by replacing with scnprintf().

Link: https://lore.kernel.org/r/20200319154641.23711-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26 15:06:14 -03:00
Avihai Horon 987914ab84 RDMA/cm: Update num_paths in cma_resolve_iboe_route error flow
After a successful allocation of path_rec, num_paths is set to 1, but any
error after such allocation will leave num_paths uncleared.

This causes to de-referencing a NULL pointer later on. Hence, num_paths
needs to be set back to 0 if such an error occurs.

The following crash from syzkaller revealed it.

  kasan: CONFIG_KASAN_INLINE enabled
  kasan: GPF could be caused by NULL-ptr deref or user memory access
  general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
  CPU: 0 PID: 357 Comm: syz-executor060 Not tainted 4.18.0+ #311
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
  rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
  RIP: 0010:ib_copy_path_rec_to_user+0x94/0x3e0
  Code: f1 f1 f1 f1 c7 40 0c 00 00 f4 f4 65 48 8b 04 25 28 00 00 00 48 89
  45 c8 31 c0 e8 d7 60 24 ff 48 8d 7b 4c 48 89 f8 48 c1 e8 03 <42> 0f b6
  14 30 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85
  RSP: 0018:ffff88006586f980 EFLAGS: 00010207
  RAX: 0000000000000009 RBX: 0000000000000000 RCX: 1ffff1000d5fe475
  RDX: ffff8800621e17c0 RSI: ffffffff820d45f9 RDI: 000000000000004c
  RBP: ffff88006586fa50 R08: ffffed000cb0df73 R09: ffffed000cb0df72
  R10: ffff88006586fa70 R11: ffffed000cb0df73 R12: 1ffff1000cb0df30
  R13: ffff88006586fae8 R14: dffffc0000000000 R15: ffff88006aff2200
  FS: 00000000016fc880(0000) GS:ffff88006d000000(0000)
  knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000020000040 CR3: 0000000063fec000 CR4: 00000000000006b0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
  ? ib_copy_path_rec_from_user+0xcc0/0xcc0
  ? __mutex_unlock_slowpath+0xfc/0x670
  ? wait_for_completion+0x3b0/0x3b0
  ? ucma_query_route+0x818/0xc60
  ucma_query_route+0x818/0xc60
  ? ucma_listen+0x1b0/0x1b0
  ? sched_clock_cpu+0x18/0x1d0
  ? sched_clock_cpu+0x18/0x1d0
  ? ucma_listen+0x1b0/0x1b0
  ? ucma_write+0x292/0x460
  ucma_write+0x292/0x460
  ? ucma_close_id+0x60/0x60
  ? sched_clock_cpu+0x18/0x1d0
  ? sched_clock_cpu+0x18/0x1d0
  __vfs_write+0xf7/0x620
  ? ucma_close_id+0x60/0x60
  ? kernel_read+0x110/0x110
  ? time_hardirqs_on+0x19/0x580
  ? lock_acquire+0x18b/0x3a0
  ? finish_task_switch+0xf3/0x5d0
  ? _raw_spin_unlock_irq+0x29/0x40
  ? _raw_spin_unlock_irq+0x29/0x40
  ? finish_task_switch+0x1be/0x5d0
  ? __switch_to_asm+0x34/0x70
  ? __switch_to_asm+0x40/0x70
  ? security_file_permission+0x172/0x1e0
  vfs_write+0x192/0x460
  ksys_write+0xc6/0x1a0
  ? __ia32_sys_read+0xb0/0xb0
  ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
  ? do_syscall_64+0x1d/0x470
  do_syscall_64+0x9e/0x470
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 3c86aa70bf ("RDMA/cm: Add RDMA CM support for IBoE devices")
Link: https://lore.kernel.org/r/20200318101741.47211-1-leon@kernel.org
Signed-off-by: Avihai Horon <avihaih@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26 14:43:12 -03:00
Maor Gottlieb ba80013fba RDMA/mlx5: Block delay drop to unprivileged users
It has been discovered that this feature can globally block the RX port,
so it should be allowed for highly privileged users only.

Fixes: 03404e8ae652("IB/mlx5: Add support to dropless RQ")
Link: https://lore.kernel.org/r/20200322124906.1173790-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-25 09:56:30 -03:00
Yishai Hadas 1f3db16188 IB/mlx5: Generally use the WC auto detection test result
Now that we have direct and reliable detection of WC support by the
system, use is broadly. The only case we have to worry about is when the
WC autodetector cannot run.

For this fringe case generally assume that that WC is available, except in
the well defined case of no PAT support on x86 which is tested by calling
arch_can_pci_mmap_wc().

If WC is wrongly assumed to be available then it causes a small
performance hit on paths in userspace that are tuned to the assumption
that WC is available. There is no functional loss.

It is very unlikely that any platforms exist that lack WC and also care
about the micro optimization of WC in the fringe case where autodetection
does not work.

By removing the fairly bogus CONFIG tests this makes WC work broadly on
all arches and all platforms.

Link: https://lore.kernel.org/r/20200318100323.46659-1-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-24 20:22:21 -03:00
Xi Wang 38dcb35048 RDMA/hns: Optimize mhop put flow for multi-hop addressing
Optimizes hns_roce_table_mhop_get() by encapsulating code about clearing
hem into clear_mhop_hem(), which will make the code flow clearer.

Link: https://lore.kernel.org/r/1584417324-2255-3-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-24 20:18:56 -03:00
Xi Wang 2f49de21f3 RDMA/hns: Optimize mhop get flow for multi-hop addressing
Splits hns_roce_table_mhop_get() into 4 sub-functions to make the code flow
clearer.

Link: https://lore.kernel.org/r/1584417324-2255-2-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-24 20:18:56 -03:00
Selvin Xavier b1d56fdcb6 RDMA/bnxt_re: Wait for all the CQ events before freeing CQ data structures
Destroy CQ command to firmware returns the num_cnq_events as a
response. This indicates the driver about the number of CQ events
generated for this CQ. Driver should wait for all these events before
freeing the CQ host structures.  Also, add routine to clean all the
pending notification for the CQs getting destroyed. This avoids the
possibility of accessing the CQ data structures after its freed.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/1584120842-3200-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-24 20:15:36 -03:00
Leon Romanovsky 950bf4f177 RDMA/mlx5: Fix access to wrong pointer while performing flush due to error
The main difference between send and receive SW completions is related to
separate treatment of WQ queue. For receive completions, the initial index
to be flushed is stored in "tail", while for send completions, it is in
deleted "last_poll".

  CPU: 54 PID: 53405 Comm: kworker/u161:0 Kdump: loaded Tainted: G           OE    --------- -t - 4.18.0-147.el8.ppc64le #1
  Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
  NIP:  c000003c7c00a000 LR: c00800000e586af4 CTR: c000003c7c00a000
  REGS: c0000036cc9db940 TRAP: 0400   Tainted: G           OE    --------- -t -  (4.18.0-147.el8.ppc64le)
  MSR:  9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 24004488  XER: 20040000
  CFAR: c00800000e586af0 IRQMASK: 0
  GPR00: c00800000e586ab4 c0000036cc9dbbc0 c00800000e5f1a00 c0000037d8433800
  GPR04: c000003895a26800 c0000037293f2000 0000000000000201 0000000000000011
  GPR08: c000003895a26c80 c000003c7c00a000 0000000000000000 c00800000ed30438
  GPR12: c000003c7c00a000 c000003fff684b80 c00000000017c388 c00000396ec4be40
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: c00000000151e498 0000000000000010 c000003895a26848 0000000000000010
  GPR24: 0000000000000010 0000000000010000 c000003895a26800 0000000000000000
  GPR28: 0000000000000010 c0000037d8433800 c000003895a26c80 c000003895a26800
  NIP [c000003c7c00a000] 0xc000003c7c00a000
  LR [c00800000e586af4] __ib_process_cq+0xec/0x1b0 [ib_core]
  Call Trace:
  [c0000036cc9dbbc0] [c00800000e586ab4] __ib_process_cq+0xac/0x1b0 [ib_core] (unreliable)
  [c0000036cc9dbc40] [c00800000e586c88] ib_cq_poll_work+0x40/0xb0 [ib_core]
  [c0000036cc9dbc70] [c000000000171f44] process_one_work+0x2f4/0x5c0
  [c0000036cc9dbd10] [c000000000172a0c] worker_thread+0xcc/0x760
  [c0000036cc9dbdc0] [c00000000017c52c] kthread+0x1ac/0x1c0
  [c0000036cc9dbe30] [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80

Fixes: 8e3b688301 ("RDMA/mlx5: Delete unreachable handle_atomic code by simplifying SW completion")
Link: https://lore.kernel.org/r/20200318091640.44069-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-24 19:54:57 -03:00
Mike Marciniszyn 2d47fbacf2 RDMA/core: Ensure security pkey modify is not lost
The following modify sequence (loosely based on ipoib) will lose a pkey
modifcation:

- Modify (pkey index, port)
- Modify (new pkey index, NO port)

After the first modify, the qp_pps list will have saved the pkey and the
unit on the main list.

During the second modify, get_new_pps() will fetch the port from qp_pps
and read the new pkey index from qp_attr->pkey_index.  The state will
still be zero, or IB_PORT_PKEY_NOT_VALID. Because of the invalid state,
the new values will never replace the one in the qp pps list, losing the
new pkey.

This happens because the following if statements will never correct the
state because the first term will be false. If the code had been executed,
it would incorrectly overwrite valid values.

  if ((qp_attr_mask & IB_QP_PKEY_INDEX) && (qp_attr_mask & IB_QP_PORT))
	  new_pps->main.state = IB_PORT_PKEY_VALID;

  if (!(qp_attr_mask & (IB_QP_PKEY_INDEX | IB_QP_PORT)) && qp_pps) {
	  new_pps->main.port_num = qp_pps->main.port_num;
	  new_pps->main.pkey_index = qp_pps->main.pkey_index;
	  if (qp_pps->main.state != IB_PORT_PKEY_NOT_VALID)
		  new_pps->main.state = IB_PORT_PKEY_VALID;
  }

Fix by joining the two if statements with an or test to see if qp_pps is
non-NULL and in the correct state.

Fixes: 1dd017882e ("RDMA/core: Fix protection fault in get_pkey_idx_qp_list")
Link: https://lore.kernel.org/r/20200313124704.14982.55907.stgit@awfm-01.aw.intel.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-24 19:53:25 -03:00
Dan Carpenter a766fa8473 IB/mlx5: Fix a NULL vs IS_ERR() check
The kzalloc() function returns NULL, not error pointers.

Fixes: 30f2fe40c7 ("IB/mlx5: Introduce UAPIs to manage packet pacing")
Link: https://lore.kernel.org/r/20200320132641.GF95012@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-24 19:47:55 -03:00
Andrew Morton 5fb5186383 RDMA/siw: Suppress uninitialized var warning
drivers/infiniband/sw/siw/siw_qp_rx.c: In function siw_proc_send:
./include/linux/spinlock.h:288:3: warning: flags may be used uninitialized in this function [-Wmaybe-uninitialized]
   _raw_spin_unlock_irqrestore(lock, flags); \
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/infiniband/sw/siw/siw_qp_rx.c:335:16: note: flags was declared here
  unsigned long flags;

Link: https://lore.kernel.org/r/20200323184627.ZWPg91uin%akpm@linux-foundation.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-23 22:22:37 -03:00
Mike Marciniszyn 9a293d1e21 IB/hfi1: Ensure pq is not left on waitlist
The following warning can occur when a pq is left on the dmawait list and
the pq is then freed:

  WARNING: CPU: 47 PID: 3546 at lib/list_debug.c:29 __list_add+0x65/0xc0
  list_add corruption. next->prev should be prev (ffff939228da1880), but was ffff939cabb52230. (next=ffff939cabb52230).
  Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ast ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev drm_panel_orientation_quirks i2c_i801 mei_me lpc_ich mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables
  nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci libahci i2c_algo_bit dca libata ptp pps_core crc32c_intel [last unloaded: i2c_algo_bit]
  CPU: 47 PID: 3546 Comm: wrf.exe Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.41.1.el7.x86_64 #1
  Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019
  Call Trace:
  [<ffffffff91f65ac0>] dump_stack+0x19/0x1b
  [<ffffffff91898b78>] __warn+0xd8/0x100
  [<ffffffff91898bff>] warn_slowpath_fmt+0x5f/0x80
  [<ffffffff91a1dabe>] ? ___slab_alloc+0x24e/0x4f0
  [<ffffffff91b97025>] __list_add+0x65/0xc0
  [<ffffffffc03926a5>] defer_packet_queue+0x145/0x1a0 [hfi1]
  [<ffffffffc0372987>] sdma_check_progress+0x67/0xa0 [hfi1]
  [<ffffffffc03779d2>] sdma_send_txlist+0x432/0x550 [hfi1]
  [<ffffffff91a20009>] ? kmem_cache_alloc+0x179/0x1f0
  [<ffffffffc0392973>] ? user_sdma_send_pkts+0xc3/0x1990 [hfi1]
  [<ffffffffc0393e3a>] user_sdma_send_pkts+0x158a/0x1990 [hfi1]
  [<ffffffff918ab65e>] ? try_to_del_timer_sync+0x5e/0x90
  [<ffffffff91a3fe1a>] ? __check_object_size+0x1ca/0x250
  [<ffffffffc0395546>] hfi1_user_sdma_process_request+0xd66/0x1280 [hfi1]
  [<ffffffffc034e0da>] hfi1_aio_write+0xca/0x120 [hfi1]
  [<ffffffff91a4245b>] do_sync_readv_writev+0x7b/0xd0
  [<ffffffff91a4409e>] do_readv_writev+0xce/0x260
  [<ffffffff918df69f>] ? pick_next_task_fair+0x5f/0x1b0
  [<ffffffff918db535>] ? sched_clock_cpu+0x85/0xc0
  [<ffffffff91f6b16a>] ? __schedule+0x13a/0x860
  [<ffffffff91a442c5>] vfs_writev+0x35/0x60
  [<ffffffff91a4447f>] SyS_writev+0x7f/0x110
  [<ffffffff91f78ddb>] system_call_fastpath+0x22/0x27

The issue happens when wait_event_interruptible_timeout() returns a value
<= 0.

In that case, the pq is left on the list. The code continues sending
packets and potentially can complete the current request with the pq still
on the dmawait list provided no descriptor shortage is seen.

If the pq is torn down in that state, the sdma interrupt handler could
find the now freed pq on the list with list corruption or memory
corruption resulting.

Fix by adding a flush routine to ensure that the pq is never on a list
after processing a request.

A follow-up patch series will address issues with seqlock surfaced in:
https://lore.kernel.org/r/20200320003129.GP20941@ziepe.ca

The seqlock use for sdma will then be converted to a spin lock since the
list_empty() doesn't need the protection afforded by the sequence lock
currently in use.

Fixes: a0d406934a ("staging/rdma/hfi1: Add page lock limit check for SDMA requests")
Link: https://lore.kernel.org/r/20200320200200.23203.37777.stgit@awfm-01.aw.intel.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-23 21:57:57 -03:00
Leon Romanovsky fa8a44f6b2 RDMA/efa: Use in-kernel offsetofend() to check field availability
Remove custom and duplicated variant of offsetofend().

Link: https://lore.kernel.org/r/20200310091438.248429-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-18 21:06:37 -03:00
Kaike Wan 5ab17a24cb IB/hfi1: Remove kobj from hfi1_devdata
The field kobj was added to hfi1_devdata structure to manage the life time
of the hfi1_devdata structure for PSM accesses:

commit e11ffbd575 ("IB/hfi1: Do not free hfi1 cdev parent structure early")

Later another mechanism user_refcount/user_comp was introduced to provide
the same functionality:

commit acd7c8fe14 ("IB/hfi1: Fix an Oops on pci device force remove")

This patch will remove this kobj field, as it is no longer needed.

Link: https://lore.kernel.org/r/20200316210500.7753.4145.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-18 19:53:47 -03:00
Mike Marciniszyn d61ba1b9ae IB/rdmavt: Delete unused routine
This routine was obsoleted by the patch below.

Delete it.

Fixes: a2a074ef39 ("RDMA: Handle ucontext allocations by IB/core")
Link: https://lore.kernel.org/r/20200316210454.7753.94689.stgit@awfm-01.aw.intel.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-18 19:53:47 -03:00
Lang Cheng 026ded3734 RDMA/hns: Check if depth of qp is 0 before configure
Depth of qp shouldn't be allowed to be set to zero, after ensuring that,
subsequent process can be simplified. And when qp is changed from reset to
reset, the capability of minimum qp depth was used to identify hardware of
hip06, it should be changed into a more readable form.

Link: https://lore.kernel.org/r/1584006624-11846-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-18 19:30:36 -03:00
Sindhu, Devale 4b34e23f4e i40iw: Report correct firmware version
The driver uses a hard-coded value for FW version and reports an
inconsistent FW version between ibv_devinfo and
/sys/class/infiniband/i40iw/fw_ver.

Retrieve the FW version via a Control QP (CQP) operation and report it
consistently across sysfs and query device.

Fixes: d374984179 ("i40iw: add files for iwarp interface")
Link: https://lore.kernel.org/r/20200313214406.2159-1-shiraz.saleem@intel.com
Reported-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Sindhu, Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-18 13:53:44 -03:00
Xi Wang d6a3627e31 RDMA/hns: Optimize wqe buffer set flow for post send
Splits hns_roce_v2_post_send() into three sub-functions: set_rc_wqe(),
set_ud_wqe() and update_sq_db() to simplify the code.

Link: https://lore.kernel.org/r/1583839084-31579-6-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-18 10:23:12 -03:00
Xi Wang 1133401412 RDMA/hns: Optimize base address table config flow for qp buffer
Currently, before the qp is created, a page size needs to be calculated
for the base address table to store all base addresses in the mtr. As a
result, the parameter configuration of the mtr is complex. So integrate
the process of calculating the base table page size into the hem related
interface to simplify the process of using mtr.

Link: https://lore.kernel.org/r/1583839084-31579-5-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-18 10:23:12 -03:00
Xi Wang e363f7de4e RDMA/hns: Optimize the wr opcode conversion from ib to hns
Simplify the wr opcode conversion from ib to hns by using a map table
instead of the switch-case statement.

Link: https://lore.kernel.org/r/1583839084-31579-4-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-18 10:23:12 -03:00
Xi Wang 00a59d30f3 RDMA/hns: Optimize wqe buffer filling process for post send
Encapsulates the wqe buffer process details for datagram seg, fast mr seg
and atomic seg.

Link: https://lore.kernel.org/r/1583839084-31579-3-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-18 10:23:12 -03:00
Xi Wang 6c6e39212b RDMA/hns: Rename wqe buffer related functions
There are serval global functions related to wqe buffer in the hns driver
and are called in different files. These symbols cannot directly represent
the namespace they belong to. So add prefix 'hns_roce_' to 3 wqe buffer
related global functions: get_recv_wqe(), get_send_wqe(), and
get_send_extend_sge().

Link: https://lore.kernel.org/r/1583839084-31579-2-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-18 10:23:11 -03:00
Selvin Xavier 4e88cef11d RDMA/bnxt_re: Remove unnecessary sched count
Since the lifetime of bnxt_re_task is controlled by the kref of device,
sched_count is no longer required.  Remove it.

Link: https://lore.kernel.org/r/1584117207-2664-4-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 20:15:03 -03:00
Jason Gunthorpe 8a6c617047 RDMA/bnxt_re: Fix lifetimes in bnxt_re_task
A work queue cannot just rely on the ib_device not being freed, it must
hold a kref on the memory so that the BNXT_RE_FLAG_IBDEV_REGISTERED check
works.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/1584117207-2664-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 20:15:03 -03:00
Jason Gunthorpe 3cae58047c RDMA/bnxt_re: Use ib_device_try_get()
There are a couple places in this driver running from a work queue that
need the ib_device to be registered. Instead of using a broken internal
bit rely on the new core code to guarantee device registration.

Link: https://lore.kernel.org/r/1584117207-2664-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 20:15:03 -03:00
Jason Gunthorpe 67b3c8dcea RDMA/cm: Make sure the cm_id is in the IB_CM_IDLE state in destroy
The first switch statement in cm_destroy_id() tries to move the ID to
either IB_CM_IDLE or IB_CM_TIMEWAIT. Both states will block concurrent
MAD handlers from progressing.

Previous patches removed the unreliably lock/unlock sequences in this
flow, this patch removes the extra locking steps and adds the missing
parts to guarantee that destroy reaches IB_CM_IDLE. There is no point in
leaving the ID in the IB_CM_TIMEWAIT state the memory about to be kfreed.

Rework things to hold the lock across all the state transitions and
directly assert when done that it ended up in IB_CM_IDLE as expected.

This was accompanied by a careful audit of all the state transitions here,
which generally did end up in IDLE on their success and non-racy paths.

Link: https://lore.kernel.org/r/20200310092545.251365-16-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:54 -03:00
Jason Gunthorpe 6a8824a74b RDMA/cm: Allow ib_send_cm_sidr_rep() to be done under lock
The first thing ib_send_cm_sidr_rep() does is obtain the lock, so use the
usual unlocked wrapper, locked actor pattern here.

Get rid of the cm_reject_sidr_req() wrapper so each call site can call the
locked or unlocked version as required.

This avoids a sketchy lock/unlock sequence (which could allow state to
change) during cm_destroy_id().

Link: https://lore.kernel.org/r/20200310092545.251365-15-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:54 -03:00
Jason Gunthorpe 81ddb41f87 RDMA/cm: Allow ib_send_cm_rej() to be done under lock
The first thing ib_send_cm_rej() does is obtain the lock, so use the usual
unlocked wrapper, locked actor pattern here.

This avoids a sketchy lock/unlock sequence (which could allow state to
change) during cm_destroy_id().

While here simplify some of the logic in the implementation.

Link: https://lore.kernel.org/r/20200310092545.251365-14-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:54 -03:00
Jason Gunthorpe 87cabf3e09 RDMA/cm: Allow ib_send_cm_drep() to be done under lock
The first thing ib_send_cm_drep() does is obtain the lock, so use the
usual unlocked wrapper, locked actor pattern here.

This avoids a sketchy lock/unlock sequence (which could allow state to
change) during cm_destroy_id().

Link: https://lore.kernel.org/r/20200310092545.251365-13-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:53 -03:00
Jason Gunthorpe e029fdc068 RDMA/cm: Allow ib_send_cm_dreq() to be done under lock
The first thing ib_send_cm_dreq() does is obtain the lock, so use the
usual unlocked wrapper, locked actor pattern here.

This avoids a sketchy lock/unlock sequence (which could allow state to
change) during cm_destroy_id().

Link: https://lore.kernel.org/r/20200310092545.251365-12-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:53 -03:00
Jason Gunthorpe 00777a68ae RDMA/cm: Add some lockdep assertions for cm_id_priv->lock
These functions all touch state, so must be called under the lock.
Inspection shows this is currently true.

Link: https://lore.kernel.org/r/20200310092545.251365-11-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:53 -03:00
Jason Gunthorpe d1de9a8807 RDMA/cm: Add missing locking around id.state in cm_dup_req_handler
All accesses to id.state must be done under the spinlock.

Fixes: a977049dac ("[PATCH] IB: Add the kernel CM implementation")
Link: https://lore.kernel.org/r/20200310092545.251365-10-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:53 -03:00