The commit below introduced a few compilation warnings.
In file included from ./include/rdma/ib_verbs.h:64,
from ./include/linux/mlx5/device.h:37,
from ./include/linux/mlx5/driver.h:51,
from drivers/net/ethernet/mellanox/mlx5/core/uar.c:36:
./include/linux/dim.h:378:1: warning: 'rdma_dim_prof' defined but not
used [-Wunused-const-variable=]
rdma_dim_prof[RDMA_DIM_PARAMS_NUM_PROFILES] = {
^~~~~~~~~~~~~
In file included from ./include/rdma/ib_verbs.h:64,
from ./include/linux/mlx5/device.h:37,
from ./include/linux/mlx5/driver.h:51,
from
drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:37:
./include/linux/dim.h:378:1: warning: 'rdma_dim_prof' defined but not
used [-Wunused-const-variable=]
rdma_dim_prof[RDMA_DIM_PARAMS_NUM_PROFILES] = {
^~~~~~~~~~~~~
Since only ib_cq_rdma_dim_work() in drivers/infiniband/core/cq.c uses it,
just move the definition over there.
Fixes: f4915455dc ("linux/dim: Implement RDMA adaptive moderation (DIM)")
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/infiniband/sw/siw/siw_cm.c: In function siw_cm_llp_state_change:
drivers/infiniband/sw/siw/siw_cm.c:1278:17: warning: variable s set but not used [-Wunused-but-set-variable]
Fixes: 6c52fdc244 ("rdma/siw: connection management")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
If LIBCRC32C and DMA_VIRT_OPS are not enabled:
drivers/infiniband/sw/siw/siw_main.o: In function `siw_newlink':
siw_main.c:(.text+0x35c): undefined reference to `dma_virt_ops'
drivers/infiniband/sw/siw/siw_qp_rx.o: In function `siw_csum_update':
siw_qp_rx.c:(.text+0x16): undefined reference to `crc32c'
Fix the first issue by adding a select of DMA_VIRT_OPS. Fix the second
issue by replacing the unneeded dependency on CRYPTO_CRC32 by a dependency
on LIBCRC32C.
Reported-by: noreply@ellerman.id.au (first issue)
Fixes: c0cf5bdde4 ("rdma/siw: addition to kernel build environment")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
ifa is protected by rcu or rtnl, add the missing locking. In this case we
have to use rtnl since siw_listen_address() is sleeping.
Fixes: 6c52fdc244 ("rdma/siw: connection management")
Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
There is already a warning if we cannot start any thread, and stopping
those threads is not worth spamming the console.
This also corrects a warning from gcc:
drivers/infiniband/sw/siw/siw_main.c: In function 'siw_create_tx_threads':
drivers/infiniband/sw/siw/siw_main.c:91:11: warning:
variable 'rv' set but not used [-Wunused-but-set-variable]
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
clang warns:
drivers/infiniband/sw/rdmavt/cq.c:260:7: warning: variable 'err' is used
uninitialized whenever 'if' condition is true
[-Wsometimes-uninitialized]
if (err)
^~~
drivers/infiniband/sw/rdmavt/cq.c:310:9: note: uninitialized use occurs
here
return err;
^~~
drivers/infiniband/sw/rdmavt/cq.c:260:3: note: remove the 'if' if its
condition is always false
if (err)
^~~~~~~~
drivers/infiniband/sw/rdmavt/cq.c:253:7: warning: variable 'err' is used
uninitialized whenever 'if' condition is true
[-Wsometimes-uninitialized]
if (!cq->ip) {
^~~~~~~
drivers/infiniband/sw/rdmavt/cq.c:310:9: note: uninitialized use occurs
here
return err;
^~~
drivers/infiniband/sw/rdmavt/cq.c:253:3: note: remove the 'if' if its
condition is always false
if (!cq->ip) {
^~~~~~~~~~~~~~
drivers/infiniband/sw/rdmavt/cq.c:211:9: note: initialize the variable
'err' to silence this warning
int err;
^
= 0
2 warnings generated.
The function scoped err variable is uninitialized when the flow jumps into
the if statement. The if scoped err variable shadows the function scoped
err variable, preventing the err assignments within the if statement to be
reflected at the function level, which will cause uninitialized use when
the goto statements are taken.
Just remove the if scoped err declaration so that there is only one copy
of the err variable for this function.
Fixes: 239b0e52d8 ("IB/hfi1: Move rvt_cq_wc struct into uapi directory")
Link: https://github.com/ClangBuiltLinux/linux/issues/594
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Use the neighbour lock when copying the MAC address from the neighbour
data struct in dst_fetch_ha.
When not using the lock, it is possible for the function to race with
neigh_update(), causing it to copy an torn MAC address:
rdma_resolve_addr()
rdma_resolve_ip()
addr_resolve()
addr_resolve_neigh()
fetch_ha()
dst_fetch_ha()
memcpy(dev_addr->dst_dev_addr, n->ha, MAX_ADDR_LEN)
and
net_ioctl()
arp_ioctl()
arp_rec_delete()
arp_invalidate()
neigh_update()
__neigh_update()
memcpy(&neigh->ha, lladdr, dev->addr_len)
It is possible to provoke this error by calling rdma_resolve_addr() in a
tight loop, while deleting the corresponding ARP entry in another tight
loop.
Fixes: 51d4597451 ("infiniband: addr: Consolidate code to fetch neighbour hardware address from dst.")
Signed-off-by: Dag Moxnes <dag.moxnes@oracle.com>
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5.4-rc1 will have new compile time debugging to test that headers can be
compiled stand alone. Many rdma headers are already broken and excluded
from the mechanism, however to avoid compile failures during the merge
window fix enough so that the newly added header compiles clean.
Fixes: 413d334750 ("RDMA/counter: Add set/clear per-port auto mode support")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Mark Zhang <markz@mellanox.com>
While creating new RDMA devices based on netdevice name, consider the net
namespace of the caller skb's socket similar to rest of the doit()
callbacks and nldev_dellink() which deletes the RDMA device created using
nldev_newlink().
Fixes: 3856ec4b93 ("RDMA/core: Add RDMA_NLDEV_CMD_NEWLINK/DELLINK support")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Calculate the correct byte_len on the receiving side when a work
completion is generated with IB_WC_RECV_RDMA_WITH_IMM opcode.
According to the IBA byte_len must indicate the number of written bytes,
whereas it was always equal to zero for the IB_WC_RECV_RDMA_WITH_IMM
opcode, even though data was transferred.
Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Konstantin Taranov <konstantin.taranov@inf.ethz.ch>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Enable RDMA DIM by default for better user experience.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Added parameter in ib_device for enabling dynamic interrupt moderation so
that it can be configured in userspace using rdma tool.
In order to set adaptive-moderation for an ib device the command is:
rdma dev set [DEV] adaptive-moderation [on|off]
Please set on/off.
rdma dev show
0: mlx5_0: node_type ca fw 16.26.0055 node_guid 248a:0703:00a5:29d0
sys_image_guid 248a:0703:00a5:29d0 adaptive-moderation on
rdma resource show cq
dev mlx5_0 cqn 0 cqe 1023 users 4 poll-ctx UNBOUND_WORKQUEUE
adaptive-moderation off comm [ib_core]
Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
RDMA DIM implements a different algorithm from net DIM and is based on
completions which is how we can implement interrupt moderation in RDMA.
The algorithm optimizes for number of completions and ratio between
completions and events. In order to avoid long latencies, the
implementation performs fast reduction of moderation level when the
traffic changes.
Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Generic DIM
From: Tal Gilboa and Yamin Fridman
Implement net DIM over a generic DIM library, add RDMA DIM
dim.h lib exposes an implementation of the DIM algorithm for
dynamically-tuned interrupt moderation for networking interfaces.
We want a similar functionality for other protocols, which might need to
optimize interrupts differently. Main motivation here is DIM for NVMf
storage protocol.
Current DIM implementation prioritizes reducing interrupt overhead over
latency. Also, in order to reduce DIM's own overhead, the algorithm might
take some time to identify it needs to change profiles. While this is
acceptable for networking, it might not work well on other scenarios.
Here we propose a new structure to DIM. The idea is to allow a slightly
modified functionality without the risk of breaking Net DIM behavior for
netdev. We verified there are no degradations in current DIM behavior with
the modified solution.
Suggested solution:
- Common logic is implemented in lib/dim/dim.c
- Net DIM (existing) logic is implemented in lib/dim/net_dim.c, which uses
the common logic in dim.c
- Any new DIM logic will be implemented in "lib/dim/new_dim.c".
This new implementation will expose modified versions of profiles,
dim_step() and dim_decision().
- DIM API is declared in include/linux/dim.h for all implementations.
Pros for this solution are:
- Zero impact on existing net_dim implementation and usage
- Relatively more code reuse (compared to two separate solutions)
- Increased extensibility
Required for dependencies in the next series.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Userspace expects the IB_TM_CAP_RC bit to indicate that the device
supports RC transport tag matching with rendezvous offload. However the
firmware splits this into two capabilities for eager and rendezvous tag
matching.
Only if the FW supports both modes should userspace be told the tag
matching capability is available.
Cc: <stable@vger.kernel.org> # 4.13
Fixes: eb76189435 ("IB/mlx5: Fill XRQ capabilities")
Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
While this contains some uAPI stuff, it was intended to be read by a
kernel doc. So, let's not move it to a different dir, but, instead, just
add it to the driver-api bookset.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Max Gurtovoy says:
====================
Those two patches introduce VHCA tunnel mechanism to DEVX interface
needed for Bluefield SOC. See extensive commit messages for more
information.
====================
Based on the mlx5-next branch from
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux for
dependencies
* branch 'vcha-tunnel':
IB/mlx5: Implement VHCA tunnel mechanism in DEVX
net/mlx5: Introduce VHCA tunnel device capability
This mechanism will allow function-A to perform operations "on behalf" of
function-B via tunnel object. Function-A will have privileges for creating
and using this tunnel object.
For example, in the device emulation feature presented in Bluefield-1 SoC,
using device emulation capability, one can present NVMe function to the
host OS.
Since the NVMe function doesn't have a normal command interface to the HCA
HW, here is a need to create a channel that will be able to issue commands
"on behalf" of this function.
This channel is the VHCA_TUNNEL general object. The emulation software
will create this tunnel for every managed function and issue commands via
devx general cmd interface using the appropriate tunnel ID. When devX
context will receive a command with non-zero vhca_tunnel_id, it will pass
the command as-is down to the HCA.
All the validation, security and resource tracking of the commands and the
created tunneled objects is in the responsibility of the HCA FW. When a
VHCA_TUNNEL object destroyed, the device will issue an internal
FLR (function level reset) to the emulated function associated with this
tunnel. This will destroy all the created resources using the tunnel
mechanism.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
rvt was using ib_sge as part of it's ABI, which is not allowed. Introduce
a new struct with the same layout and use it instead.
Fixes: dabac6e460 ("IB/hfi1: Move receive work queue struct into uapi directory")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The initializer for the variable cannot be inside the macro (and zero
initialization isn't needed anyhow).
include/linux/percpu-defs.h:92:33: warning: '__pcpu_unique_use_cnt' initialized and declared 'extern'
extern __PCPU_DUMMY_ATTRS char __pcpu_unique_##name; \
^~~~~~~~~~~~~~
include/linux/percpu-defs.h:115:2: note: in expansion of macro 'DEFINE_PER_CPU_SECTION'
DEFINE_PER_CPU_SECTION(type, name, "")
^~~~~~~~~~~~~~~~~~~~~~
drivers/infiniband/sw/siw/siw_main.c:129:8: note: in expansion of macro 'DEFINE_PER_CPU'
static DEFINE_PER_CPU(atomic_t, use_cnt = ATOMIC_INIT(0));
^~~~~~~~~~~~~~
Also the rules for PER_CPU require the variable names to be globally
unique, so prefix them with siw_
Fixes: b9be6f18cf ("rdma/siw: transmit path")
Fixes: bdcf26bf9b ("rdma/siw: network and RDMA core interface")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In some cases (not in this particular one) variable self-initialization
can lead to undefined behavior. In this case, it is just obscure code.
Signed-off-by: Maksym Planeta <mplaneta@os.inf.tu-dresden.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Here Clean up unnecessary initial value for some variable.
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When smmu is enable, if execute the perftest command and then use 'kill
-9' to exit, follow this operation repeatedly, the kernel will have a high
probability to print the following smmu event:
arm-smmu-v3 arm-smmu-v3.1.auto: event 0x10 received:
arm-smmu-v3 arm-smmu-v3.1.auto: 0x00007d0000000010
arm-smmu-v3 arm-smmu-v3.1.auto: 0x0000020900000080
arm-smmu-v3 arm-smmu-v3.1.auto: 0x00000000f47cf000
arm-smmu-v3 arm-smmu-v3.1.auto: 0x00000000f47cf000
This is because the hw will periodically refresh the qpc cache until the
next reset.
This patch fixed it by removing the action that release qpc memory in the
'hns_roce_qp_free' function.
Fixes: 9a4435375c ("IB/hns: Add driver files for hns RoCE driver")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The format specifier \"%p\" can leak kernel addresses. Use \"%pK\"
instead.
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The buffer size of qp which used to allocate qp buffer space for storing
sqwqe and rqwqe will be the length of buffer space. The kernel driver will
use the buffer address and the same size to get the user memory. The same
size named buff_size of qp. According the algorithm of calculating, The
size of the two is not equal when users set the max sge of sq.
Fixes: b28ca7ccef ("RDMA/hns: Limit extend sq sge num")
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When using the device emulation feature (introduced in Bluefield-1 SOC),
a privileged function (the device emulation manager) will be able to
create a channel to execute commands on behalf of the emulated function.
This channel will be a general object of type VHCA_TUNNEL that will have
a unique ID for each emulated function. This ID will be passed in each
cmd that will be issued by the emulation SW in a well known offset in
the command header.
This channel is needed since the emulated function doesn't have a normal
command interface to the HCA HW, but some basic configuration for that
function is needed (e.g. initialize and enable the HCA). For that matter,
a specific command-set was defined and only those commands will be issued
by the HCA.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
When hw resetting, there is no response from hw when driver sending cmdq.
If driver still send cmdq to hw, the reset process may be blocked. So
reset flag should be set to intercept the cmdq command when driver
receiving "notify down" signal.
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Currently, the depth of cq only supports 64K. According to the UM, the
depth of cq is up to 4M, Therefore the ba page size of cqe was modified to
support the maximum specification of cq depth.
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Hip06 reserve 12 qps, Hip08 reserve 8 qps. When the QP is released, the
chip model is not judged, and the Hip08 cannot release the qpn 8~12
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
It uses hns_roce_mtr_init in hns_roce_create_qp_common function. As a
result, it should use hns_roce_mtr_cleanup function for cleaning mtr when
destroying qp.
Fixes: 8d18ad83f1 ("RDMA/hns: Fix bug when wqe num is larger than 16K")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This patch adds the ability to return the hwstats of per-port default
counters (which can also be queried through sysfs nodes).
Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Provide an option to get current counter mode through RDMA netlink.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Provide an option to allow users to manually bind a qp with a counter
through RDMA netlink. Limit it to users with ADMIN capability only.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In manual mode a QP is bound to a counter manually. If counter is not
specified then a new one will be allocated.
Manual mode is enabled when user binds a QP, and disabled when the last
manually bound QP is unbound.
When auto-mode is turned off and there are counters left, manual mode is
enabled so that the user is able to access these counters.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Since a QP can only be bound to one counter, then if it is bound to a
separate counter, for backward compatibility purpose, the statistic value
must be:
* stat of default counter
+ stat of all running allocated counters
+ stat of all deallocated counters (history stats)
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add support for ib callback counter_alloc_stats() and
counter_update_stats().
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This patch adds the ability to return all available counters together with
their properties and hwstats.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Provide an option to enable/disable per-port counter auto mode through
RDMA netlink. Limit it to users with ADMIN capability only.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add support for ib callbacks counter_bind_qp(), counter_unbind_qp() and
counter_dealloc().
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add counter set id as a parameter so that this API can be used for
querying any q counter.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Support bind a qp with counter. If counter is null then bind the qp to the
default counter. Different QP state has different operation:
- RESET: Set the counter field so that it will take effective during
RST2INIT change;
- RTS: Issue an RTS2RTS change to update the QP counter;
- Other: Set the counter field and mark the counter_pending flag, when QP
is moved to RTS state and this flag is set, then issue an RTS2RTS
modification to update the counter.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In auto mode all QPs belong to one category are bind automatically to a
single counter set. Currently only "qp type" is supported.
In this mode the qp counter is set in RST2INIT modification, and when a qp
is destroyed the counter is unbound.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add an API to support set/clear per-port auto mode.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Remove is_visible_in_pid_ns() from nldev.c and make it as a restrack API,
so that it can be taken advantage by other parts like counter.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add rdma_restrack_attach_task() which is able to attach a task other then
"current" to a resource.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Introduce statistic counter as a new resource. It allows a user to monitor
specific objects (e.g., QPs) by binding to a counter.
In some cases a user counter resource is created with task other then
"current", because its creation is done as part of rdmatool call.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Required for dependencies in the next patches.
* mlx5-next:
net/mlx5: Add rts2rts_qp_counters_set_id field in hca cap
net/mlx5: Properly name the generic WQE control field
net/mlx5: Introduce TLS TX offload hardware bits and structures
net/mlx5: Refactor mlx5_esw_query_functions for modularity
net/mlx5: E-Switch prepare functions change handler to be modular
net/mlx5: Introduce and use mlx5_eswitch_get_total_vports()
Add rts2rts_qp_counters_set_id field in hca cap so that RTS2RTS
qp modification can be used to change the counter of a QP.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>