Enable RDMA DIM by default for better user experience.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Userspace expects the IB_TM_CAP_RC bit to indicate that the device
supports RC transport tag matching with rendezvous offload. However the
firmware splits this into two capabilities for eager and rendezvous tag
matching.
Only if the FW supports both modes should userspace be told the tag
matching capability is available.
Cc: <stable@vger.kernel.org> # 4.13
Fixes: eb76189435 ("IB/mlx5: Fill XRQ capabilities")
Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Max Gurtovoy says:
====================
Those two patches introduce VHCA tunnel mechanism to DEVX interface
needed for Bluefield SOC. See extensive commit messages for more
information.
====================
Based on the mlx5-next branch from
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux for
dependencies
* branch 'vcha-tunnel':
IB/mlx5: Implement VHCA tunnel mechanism in DEVX
net/mlx5: Introduce VHCA tunnel device capability
This mechanism will allow function-A to perform operations "on behalf" of
function-B via tunnel object. Function-A will have privileges for creating
and using this tunnel object.
For example, in the device emulation feature presented in Bluefield-1 SoC,
using device emulation capability, one can present NVMe function to the
host OS.
Since the NVMe function doesn't have a normal command interface to the HCA
HW, here is a need to create a channel that will be able to issue commands
"on behalf" of this function.
This channel is the VHCA_TUNNEL general object. The emulation software
will create this tunnel for every managed function and issue commands via
devx general cmd interface using the appropriate tunnel ID. When devX
context will receive a command with non-zero vhca_tunnel_id, it will pass
the command as-is down to the HCA.
All the validation, security and resource tracking of the commands and the
created tunneled objects is in the responsibility of the HCA FW. When a
VHCA_TUNNEL object destroyed, the device will issue an internal
FLR (function level reset) to the emulated function associated with this
tunnel. This will destroy all the created resources using the tunnel
mechanism.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Here Clean up unnecessary initial value for some variable.
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When smmu is enable, if execute the perftest command and then use 'kill
-9' to exit, follow this operation repeatedly, the kernel will have a high
probability to print the following smmu event:
arm-smmu-v3 arm-smmu-v3.1.auto: event 0x10 received:
arm-smmu-v3 arm-smmu-v3.1.auto: 0x00007d0000000010
arm-smmu-v3 arm-smmu-v3.1.auto: 0x0000020900000080
arm-smmu-v3 arm-smmu-v3.1.auto: 0x00000000f47cf000
arm-smmu-v3 arm-smmu-v3.1.auto: 0x00000000f47cf000
This is because the hw will periodically refresh the qpc cache until the
next reset.
This patch fixed it by removing the action that release qpc memory in the
'hns_roce_qp_free' function.
Fixes: 9a4435375c ("IB/hns: Add driver files for hns RoCE driver")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The format specifier \"%p\" can leak kernel addresses. Use \"%pK\"
instead.
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The buffer size of qp which used to allocate qp buffer space for storing
sqwqe and rqwqe will be the length of buffer space. The kernel driver will
use the buffer address and the same size to get the user memory. The same
size named buff_size of qp. According the algorithm of calculating, The
size of the two is not equal when users set the max sge of sq.
Fixes: b28ca7ccef ("RDMA/hns: Limit extend sq sge num")
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When hw resetting, there is no response from hw when driver sending cmdq.
If driver still send cmdq to hw, the reset process may be blocked. So
reset flag should be set to intercept the cmdq command when driver
receiving "notify down" signal.
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Currently, the depth of cq only supports 64K. According to the UM, the
depth of cq is up to 4M, Therefore the ba page size of cqe was modified to
support the maximum specification of cq depth.
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Hip06 reserve 12 qps, Hip08 reserve 8 qps. When the QP is released, the
chip model is not judged, and the Hip08 cannot release the qpn 8~12
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
It uses hns_roce_mtr_init in hns_roce_create_qp_common function. As a
result, it should use hns_roce_mtr_cleanup function for cleaning mtr when
destroying qp.
Fixes: 8d18ad83f1 ("RDMA/hns: Fix bug when wqe num is larger than 16K")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add support for ib callback counter_alloc_stats() and
counter_update_stats().
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add support for ib callbacks counter_bind_qp(), counter_unbind_qp() and
counter_dealloc().
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add counter set id as a parameter so that this API can be used for
querying any q counter.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Support bind a qp with counter. If counter is null then bind the qp to the
default counter. Different QP state has different operation:
- RESET: Set the counter field so that it will take effective during
RST2INIT change;
- RTS: Issue an RTS2RTS change to update the QP counter;
- Other: Set the counter field and mark the counter_pending flag, when QP
is moved to RTS state and this flag is set, then issue an RTS2RTS
modification to update the counter.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Required for dependencies in the next patches.
* mlx5-next:
net/mlx5: Add rts2rts_qp_counters_set_id field in hca cap
net/mlx5: Properly name the generic WQE control field
net/mlx5: Introduce TLS TX offload hardware bits and structures
net/mlx5: Refactor mlx5_esw_query_functions for modularity
net/mlx5: E-Switch prepare functions change handler to be modular
net/mlx5: Introduce and use mlx5_eswitch_get_total_vports()
Make admin commands id easier to distinguish by using relevant bits from
the producer counter.
This allows us to differentiate admin commands with the same producer
index (happens after admin queue overlap), which is helpful when
debugging.
Signed-off-by: Daniel Kranzdorf <dkkranzd@amazon.com>
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
There is no need in custom memory zeroing, because it can be done
by using kzalloc from the beginning.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The patch below wasn't fully tested for all combinations of module and
configs, and causes a compile failure:
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_ah.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_alloc.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_cmd.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_cq.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_db.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_hem.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_mr.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_pd.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_qp.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_restrack.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_srq.o
see include/linux/module.h for more information
ERROR: "hns_roce_bitmap_cleanup" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined!
ERROR: "hns_roce_bitmap_init" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined!
ERROR: "hns_roce_free_cmd_mailbox" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined!
ERROR: "hns_roce_alloc_cmd_mailbox" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined!
ERROR: "hns_roce_table_get" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined!
ERROR: "hns_roce_bitmap_alloc" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined!
ERROR: "hns_roce_table_find" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined!
The fix is to put the module sub components in the right line.
Fixes: e9816ddf2a ("RDMA/hns: Cleanup unnecessary exported symbols")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
No need any more to hold mlx5_core_dev on the devx_object, it can be
accessed from ib_dev.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add DEVX support for CQ events by creating and destroying the CQ via
mlx5_core and set an handler to manage its completions.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Implement DEVX dispatching event by looking up for the applicable
subscriptions for the reported event and using their target fd to
signal/set the event.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Enable subscription for device events over DEVX.
Each subscription is added to the two level xarray data structure
according to its event number and the DEVX object information in case was
given with the given target fd.
Those events will be reported over the given fd once will occur.
Downstream patches will mange the dispatching to any subscription.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Register DEVX with with mlx5_core to get async events. This will enable
to dispatch the applicable events to its consumers in down stream patches.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Introduce MLX5_IB_OBJECT_DEVX_ASYNC_EVENT_FD and its initial
implementation.
This object is from type class FD and will be used to read DEVX
async events.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Instead MLX5_TOTAL_VPORTS, use mlx5_eswitch_get_total_vports().
mlx5_eswitch_get_total_vports() in subsequent patch accounts for SF
vports as well.
Expanding MLX5_TOTAL_VPORTS macro would require exposing SF internals to
more generic vport.h header file. Such exposure is not desired.
Hence a mlx5_eswitch_get_total_vports() is introduced.
Given that mlx5_eswitch_get_total_vports() API wants to work on const
mlx5_core_dev*, change its helper functions also to accept const *dev.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Required for dependencies in the next patches.
Resolved the conflicts:
- esw_destroy_offloads_acl_tables() use the newer mlx5_esw_for_all_vports()
version
- esw_offloads_steering_init() drop the cap test
- esw_offloads_init() drop the extra function arguments
* branch 'mlx5-next': (39 commits)
net/mlx5: Expose device definitions for object events
net/mlx5: Report EQE data upon CQ completion
net/mlx5: Report a CQ error event only when a handler was set
net/mlx5: mlx5_core_create_cq() enhancements
net/mlx5: Expose the API to register for ANY event
net/mlx5: Use event mask based on device capabilities
net/mlx5: Fix mlx5_core_destroy_cq() error flow
net/mlx5: E-Switch, Handle UC address change in switchdev mode
net/mlx5: E-Switch, Consider host PF for inline mode and vlan pop
net/mlx5: E-Switch, Use iterator for vlan and min-inline setups
net/mlx5: E-Switch, Reg/unreg function changed event at correct stage
net/mlx5: E-Switch, Consolidate eswitch function number of VFs
net/mlx5: E-Switch, Refactor eswitch SR-IOV interface
net/mlx5: Handle host PF vport mac/guid for ECPF
net/mlx5: E-Switch, Use correct flags when configuring vlan
net/mlx5: Reduce dependency on enabled_vfs counter and num_vfs
net/mlx5: Don't handle VF func change if host PF is disabled
net/mlx5: Limit scope of mlx5_get_next_phys_dev() to PCI PF devices
net/mlx5: Move pci status reg access mutex to mlx5_pci_init
net/mlx5: Rename mlx5_pci_dev_type to mlx5_coredev_type
...
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Currently during dual port IB device registration in below code flow,
ib_register_device()
ib_device_register_sysfs()
ib_setup_port_attrs()
add_port()
get_counter_table()
get_perf_mad()
process_mad()
mlx5_ib_process_mad()
mlx5_ib_process_mad() fails on 2nd port when both the ports are not fully
setup at the device level (because 2nd port is unaffiliated).
As a result, get_perf_mad() registers different PMA counter group for 1st
and 2nd port, namely pma_counter_ext and pma_counter. However both ports
have the same capability and counter offsets.
Due to this when counters are read by the user via sysfs in below code
flow, counters are queried from wrong location from the device mainly from
PPCNT instead of VPORT counters.
show_pma_counter()
get_perf_mad()
process_mad()
mlx5_ib_process_mad()
process_pma_cmd()
This shows all zero counters for 2nd port.
To overcome this, process_pma_cmd() is invoked, and when unaffiliated port
is not yet setup during device registration phase, make the query on the
first port. while at it, only process_pma_cmd() needs to work on the
native port number and underlying mdev, so shift the get, put calls to
where its needed inside process_pma_cmd().
Fixes: 212f2a87b7 ("IB/mlx5: Route MADs for dual port RoCE")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Report EQE data upon CQ completion to let upper layers use this data.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Enhance mlx5_core_create_cq() to get the command out buffer from the
callers to let them use the output.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Use the reported device capabilities for the supported user events (i.e.
affiliated and un-affiliated) to set the EQ mask.
As the event mask can be up to 256 defined by 4 entries of u64 change
the applicable code to work accordingly.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/infiniband/hw/hns/hns_roce_hw_v2.c: In function 'hns_roce_function_clear':
drivers/infiniband/hw/hns/hns_roce_hw_v2.c:1135:7: warning:
variable 'fclr_write_fail_flag' set but not used [-Wunused-but-set-variable]
It is never used, so can be removed.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The API for ib_query_qp requires the driver to set qp_state and
cur_qp_state on return, add the missing sets.
Fixes: d374984179 ("i40iw: add files for iwarp interface")
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Use kmemdump instead of kzmalloc + memcpy.
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In commit af7ddd8a62 ("Merge tag 'dma-mapping-4.21' of
git://git.infradead.org/users/hch/dma-mapping"),
dma_alloc_coherent/dmam_alloc_coherent always zeroed the returned memory.
So the memset after a coherent allocation function is not needed.
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Devlink eswitch mode is not necessarily related to SR-IOV, e.g, ECPF
can be at offload mode when SR-IOV is not enabled.
Rename the interface and eswitch mode names to decouple from SR-IOV,
and cleanup eswitch messages accordingly.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When an IB rep is loaded, netdev for the same vport is saved for later
reference. However, it's not cleaned up when doing unload. For ECPF,
kernel crashes when driver is referring to the already removed netdev.
Following steps lead to a shown call trace:
1. Create n VFs from host PF
2. Distroy the VFs
3. Run "rdma link" from ARM
Call trace:
mlx5_ib_get_netdev+0x9c/0xe8 [mlx5_ib]
mlx5_query_port_roce+0x268/0x558 [mlx5_ib]
mlx5_ib_rep_query_port+0x14/0x34 [mlx5_ib]
ib_query_port+0x9c/0xfc [ib_core]
fill_port_info+0x74/0x28c [ib_core]
nldev_port_get_doit+0x1a8/0x1e8 [ib_core]
rdma_nl_rcv_msg+0x16c/0x1c0 [ib_core]
rdma_nl_rcv+0xe8/0x144 [ib_core]
netlink_unicast+0x184/0x214
netlink_sendmsg+0x288/0x354
sock_sendmsg+0x18/0x2c
__sys_sendto+0xbc/0x138
__arm64_sys_sendto+0x28/0x34
el0_svc_common+0xb0/0x100
el0_svc_handler+0x6c/0x84
el0_svc+0x8/0xc
Cleanup the rep and netdev reference when unloading IB rep.
Fixes: 26628e2d58 ("RDMA/mlx5: Move to single device multiport ports in switchdev mode")
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In the single IB device mode, the mapping between vport number and
rep relies on a counter. However for dynamic vport allocation, it is
desired to keep consistent map of eswitch vport and IB port.
Hence, simplify code to remove the free running counter and instead
use the available vport index during load/unload sequence from the
eswitch.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Suggested-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The call in debugfs.c for try_module_get() is not needed. A reference to
the module will be taken by the VFS layer as long as the owner field is
set in the file ops struct. So set this as well as remove the call.
Suggested-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This was missed in the original implementation of the memory management
extensions.
Fixes: 0db3dfa03c ("IB/hfi1: Work request processing for fast register mr and invalidate")
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Uninline the aspm API since it increases code space for no reason.
Move the aspm module param to the new aspm C file.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add some helper functions to hide struct rvt_swqe details.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Historically rdmavt destroy_ah() has returned an -EBUSY when the AH has a
non-zero reference count. IBTA 11.2.2 notes no such return value or error
case:
Output Modifiers:
- Verb results:
- Operation completed successfully.
- Invalid HCA handle.
- Invalid address handle.
ULPs never test for this error and this will leak memory.
The reference count exists to allow for driver independent progress
mechanisms to process UD SWQEs in parallel with post sends. The SWQE will
hold a reference count until the UD SWQE completes and then drops the
reference.
Fix by removing need to reference count the AH. Add a UD specific
allocation to each SWQE entry to cache the necessary information for
independent progress. Copy the information during the post send
processing.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When a completion queue is full, the associated queue pairs are not put
into the error state. According to the IBTA specification, this is a
violation.
Quote from IBTA spec:
C9-218: A Requester Class F error occurs when the CQ is inaccessible or
full and an attempt is made to complete a WQE. The Affected QP shall be
moved to the error state and affiliated asynchronous errors generated as
described in 11.6.3.1 Affiliated Asynchronous Events on page 678. The
current WQE and any subsequent WQEs are left in an unknown state.
C11-37: The CI shall generate a CQ Error when a CQ overrun is
detected. This condition will result in an Affiliated Asynchronous Error
for any associated Work Queues when they attempt to use that
CQ. Completions can no longer be added to the CQ. It is not guaranteed
that completions present in the CQ at the time the error occurred can be
retrieved. Possible causes include a CQ overrun or a CQ protection error.
Put the qp in error state when cq is full. Implement a state called full
to continue to put other associated QPs in error state.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The rvt_cq_wc struct elements are shared between rdmavt and the providers
but not in uapi directory. As per the comment in
https://marc.info/?l=linux-rdma&m=152296522708522&w=2 The hfi1 driver and
the rdma core driver are not using shared structures in the uapi
directory.
In that case, move rvt_cq_wc struct into the rvt-abi.h header file and
create a rvt_k_cq_w for the kernel completion queue.
Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0Os1seHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGtx4H/j6i482XzcGFKTBm
A7mBoQpy+kLtoUov4EtBAR62OuwI8rsahW9di37QKndPoQrczWaKBmr3De6LCdPe
v3pl3O6wBbvH5ru+qBPFX9PdNbDvimEChh7LHxmMxNQq3M+AjZAZVJyfpoiFnx35
Fbge+LZaH/k8HMwZmkMr5t9Mpkip715qKg2o9Bua6dkH0AqlcpLlC8d9a+HIVw/z
aAsyGSU8jRwhoAOJsE9bJf0acQ/pZSqmFp0rDKqeFTSDMsbDRKLGq/dgv4nW0RiW
s7xqsjb/rdcvirRj3rv9+lcTVkOtEqwk0PVdL9WOf7g4iYrb3SOIZh8ZyViaDSeH
VTS5zps=
=huBY
-----END PGP SIGNATURE-----
Merge tag 'v5.2-rc6' into rdma.git for-next
For dependencies in next patches.
Resolve conflicts:
- Use uverbs_get_cleared_udata() with new cq allocation flow
- Continue to delete nes despite SPDX conflict
- Resolve list appends in mlx5_command_str()
- Use u16 for vport_rule stuff
- Resolve list appends in struct ib_client
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
If vport metadata matching is enabled in eswitch, the rule created
must be changed to match on the metadata, instead of source port.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Refactor the flow data structures, add new flow_context and move
flow_tag into it, as flow_tag doesn't belong to the rule action.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
There is a spelling mistake in an dev_err message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>