Commit Graph

269 Commits

Author SHA1 Message Date
Jason Gunthorpe 65a1662015 RDMA/bnxt_re: Using vmalloc requires including vmalloc.h
Add it

Fixes: 0c4dcd6028 ("RDMA/bnxt_re: Refactor hardware queue memory allocation")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-26 13:24:41 -04:00
Devesh Sharma 6ccad8483b RDMA/bnxt_re: use ibdev based message printing functions
Replacing the dev_err/dbg/warn with ibdev_err/dbg/warn. In the IB device
provider driver these functions are recommended to use.

Currently qplib layer function calls has not been replaced due to
unavailability of ib_device pointer at that layer.

Link: https://lore.kernel.org/r/1581786665-23705-9-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:44 -04:00
Devesh Sharma 6f53196bc5 RDMA/bnxt_re: Refactor doorbell management functions
Moving all the fast path doorbell functions at one place under
qplib_res.h. To pass doorbell record information a new structure
bnxt_qplib_db_info has been introduced.  Every roce object holds an
instance of this structure and doorbell information is initialized during
resource creation.

When DB is rung only the current queue index is read from hardware ring
and rest of the data is taken from pre-initialized dbinfo structure.

Link: https://lore.kernel.org/r/1581786665-23705-8-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:44 -04:00
Devesh Sharma 9555352bac RDMA/bnxt_re: Refactor notification queue management code
Cleaning up the notification queue data structures and management
code. The CQ and SRQ event handlers have been type defined instead of
in-place declaration. NQ doorbell register descriptor has been added in
base NQ structure.  The nq->vector has been renamed to nq->msix_vec.

Link: https://lore.kernel.org/r/1581786665-23705-7-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:43 -04:00
Devesh Sharma cee0c7bba4 RDMA/bnxt_re: Refactor command queue management code
Refactoring the command queue (rcfw) management code. A new data-structure
is introduced to describe the bar register.  each object which deals with
mmio space should have a descriptor structure. This structure specifically
hold DB register information.  Thus, slow path creq structure now hold a
bar register descriptor.

Further cleanup the rcfw structure to introduce the command queue context
and command response event queue context structures. Rest of the rcfw
related code has been touched to incorporate these three structures.

Link: https://lore.kernel.org/r/1581786665-23705-6-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:43 -04:00
Devesh Sharma b08fe048a6 RDMA/bnxt_re: Refactor net ring allocation function
Introducing a new attribute structure to reduce the long list of arguments
passed in bnxt_re_net_ring_alloc() function.

The caller of bnxt_re_net_ring_alloc should fill in the list of attributes
in bnxt_re_ring_attr structure and then pass the pointer to the function.

Link: https://lore.kernel.org/r/1581786665-23705-5-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:43 -04:00
Devesh Sharma 0c4dcd6028 RDMA/bnxt_re: Refactor hardware queue memory allocation
At top level there are three major data structure addition.  viz
bnxt_qplib_hwq_attr, bnxt_qplib_sg_info and bnxt_qplib_tqm_ctx

Intorduction of first data structure reduces the arguments list to
bnxt_re_alloc_init_hwq() function. There are changes all over the driver
code to incorporate this new structure. The caller needs to fill the
attribute data structure and pass to this function.

The second data structure is to pass memory region description
viz. sghead, page_size and page_shift. There are changes all over the
driver code to initialize bnxt_re_sg_info data structure. The new data
structure helps to reduce the argument list of __alloc_pbl() function
call.

Till now the TQM rings related members were not collected under any
specific data-structure making it hard to manage. The third data
sctructure bnxt_qplib_tqm_ctx is added to refactor the TQM queue
allocation and initialization.

Link: https://lore.kernel.org/r/1581786665-23705-4-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:43 -04:00
Devesh Sharma 0cfb329db9 RDMA/bnxt_re: Replace chip context structure with pointer
The chip_ctx member in bnxt_re_dev structure is now a pointer to struct
bnxt_qplib_chip_ctx. Since the member type has changed there are changes
in rest of the code wherever dev->chip_ctx is used.

Link: https://lore.kernel.org/r/1581786665-23705-3-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:43 -04:00
Devesh Sharma 8dae419f9e RDMA/bnxt_re: Refactor queue pair creation code
Restructuring the bnxt_re_create_qp function. Listing below the major
changes:
 - Monolithic central part of create_qp where attributes are initialized
   is now enclosed in one function and this new function has few more
   sub-functions.
 - Top level qp limit checking code moved to a function.
 - GSI QP creation and GSI Shadow qp creation code is handled in a sub
   function.

Link: https://lore.kernel.org/r/1581786665-23705-2-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:42 -04:00
Selvin Xavier 0a01623b74 RDMA/bnxt_re: Use rdma_read_gid_hw_context to retrieve HW gid index
bnxt_re HW maintains a GID table with only a single entry for the two
duplicate GID entries (v1 and v2). Driver needs to map stack gid index to
the HW table gid index.  Use the new API rdma_read_gid_hw_context () to
retrieve the HW GID context to get the HW table index.

Link: https://lore.kernel.org/r/1582107594-5180-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-19 16:54:25 -04:00
Linus Torvalds 8fdd4019bc RDMA subsystem updates for 5.6
- Driver updates and cleanup for qedr, bnxt_re, hns, siw, mlx5, mlx4, rxe,
   i40iw
 
 - Larger series doing cleanup and rework for hns and hfi1.
 
 - Some general reworking of the CM code to make it a little more
   understandable
 
 - Unify the different code paths connected to the uverbs FD scheme
 
 - New UAPI ioctls conversions for get context and get async fd
 
 - Trace points for CQ and CM portions of the RDMA stack
 
 - mlx5 driver support for virtio-net formatted rings as RDMA raw ethernet QPs
 
 - verbs support for setting the PCI-E relaxed ordering bit on DMA traffic
   connected to a MR
 
 - A couple of bug fixes that came too late to make rc7
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl4zPwQACgkQOG33FX4g
 mxoURw//fuQmuJ7aTMH+0qrhaZUmzXOcI/WKvY0YMyYLvxolRcIO+uCL239wxezR
 9iTHPO7HeYXUQ4W8Hi/fTyuQ9hzaPOP3wgOJfQhm4QT/XDpRW0H3Mb+hTLHTUAcA
 rgKc9suAn+5BbIDOz7hEfeOTssx1wYrLsaHDc11NZ42JuG6uvPR33lhXiKWG+5tH
 2MpfeTU6BjL035dm3YZXCo+ouobpdMuvzJItYIsB2E5Nl0s91SMzsymIYiD0gb3t
 yUJ3wqPW3pchfAl8VEn+W5AHTUYYgGjmEblL8WdVq5JRrkQgQzj8QtCRT9NOPAT0
 LivCvgBrm0kscaQS2TjtG56Ojbwz8z1QPE/4shf0pj/G2lZfacYDAeaUc/2VafxY
 y/KG+3dB1DxtYY3eXJUxbB7Vpk7kfr35p5b75NdMhd2t49oPgV7EKoZMLYGzfX4S
 PtyNyNSiwx8qsRTr4lznOMswmrDLfG4XiywWgYo6NGOWyKYlARWIYBAEQZ0DPTiE
 9mqJ19gusdSdAgm8LGDInPmH6/AojGOVzYonJFWdlOtwCXGNXL4Gx02x4WYHykDG
 w+oy5NMJbU3b6+MWEagkuQNcrwqv02MT1mB/Lgv4GPm6rS0UXR7zUPDeccE50fSL
 X36k28UlftlPlaD7PeJdTOAhyBv5DxfpL5rbB2TfpUTpNxjayuU=
 =hepK
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A very quiet cycle with few notable changes. Mostly the usual list of
  one or two patches to drivers changing something that isn't quite rc
  worthy. The subsystem seems to be seeing a larger number of rework and
  cleanup style patches right now, I feel that several vendors are
  prepping their drivers for new silicon.

  Summary:

   - Driver updates and cleanup for qedr, bnxt_re, hns, siw, mlx5, mlx4,
     rxe, i40iw

   - Larger series doing cleanup and rework for hns and hfi1.

   - Some general reworking of the CM code to make it a little more
     understandable

   - Unify the different code paths connected to the uverbs FD scheme

   - New UAPI ioctls conversions for get context and get async fd

   - Trace points for CQ and CM portions of the RDMA stack

   - mlx5 driver support for virtio-net formatted rings as RDMA raw
     ethernet QPs

   - verbs support for setting the PCI-E relaxed ordering bit on DMA
     traffic connected to a MR

   - A couple of bug fixes that came too late to make rc7"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (108 commits)
  RDMA/core: Make the entire API tree static
  RDMA/efa: Mask access flags with the correct optional range
  RDMA/cma: Fix unbalanced cm_id reference count during address resolve
  RDMA/umem: Fix ib_umem_find_best_pgsz()
  IB/mlx4: Fix leak in id_map_find_del
  IB/opa_vnic: Spelling correction of 'erorr' to 'error'
  IB/hfi1: Fix logical condition in msix_request_irq
  RDMA/cm: Remove CM message structs
  RDMA/cm: Use IBA functions for complex structure members
  RDMA/cm: Use IBA functions for simple structure members
  RDMA/cm: Use IBA functions for swapping get/set acessors
  RDMA/cm: Use IBA functions for simple get/set acessors
  RDMA/cm: Add SET/GET implementations to hide IBA wire format
  RDMA/cm: Add accessors for CM_REQ transport_type
  IB/mlx5: Return the administrative GUID if exists
  RDMA/core: Ensure that rdma_user_mmap_entry_remove() is a fence
  IB/mlx4: Fix memory leak in add_gid error flow
  IB/mlx5: Expose RoCE accelerator counters
  RDMA/mlx5: Set relaxed ordering when requested
  RDMA/core: Add the core support field to METHOD_GET_CONTEXT
  ...
2020-01-31 14:40:36 -08:00
Linus Torvalds bd2463ac7d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:

 1) Add WireGuard

 2) Add HE and TWT support to ath11k driver, from John Crispin.

 3) Add ESP in TCP encapsulation support, from Sabrina Dubroca.

 4) Add variable window congestion control to TIPC, from Jon Maloy.

 5) Add BCM84881 PHY driver, from Russell King.

 6) Start adding netlink support for ethtool operations, from Michal
    Kubecek.

 7) Add XDP drop and TX action support to ena driver, from Sameeh
    Jubran.

 8) Add new ipv4 route notifications so that mlxsw driver does not have
    to handle identical routes itself. From Ido Schimmel.

 9) Add BPF dynamic program extensions, from Alexei Starovoitov.

10) Support RX and TX timestamping in igc, from Vinicius Costa Gomes.

11) Add support for macsec HW offloading, from Antoine Tenart.

12) Add initial support for MPTCP protocol, from Christoph Paasch,
    Matthieu Baerts, Florian Westphal, Peter Krystad, and many others.

13) Add Octeontx2 PF support, from Sunil Goutham, Geetha sowjanya, Linu
    Cherian, and others.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1469 commits)
  net: phy: add default ARCH_BCM_IPROC for MDIO_BCM_IPROC
  udp: segment looped gso packets correctly
  netem: change mailing list
  qed: FW 8.42.2.0 debug features
  qed: rt init valid initialization changed
  qed: Debug feature: ilt and mdump
  qed: FW 8.42.2.0 Add fw overlay feature
  qed: FW 8.42.2.0 HSI changes
  qed: FW 8.42.2.0 iscsi/fcoe changes
  qed: Add abstraction for different hsi values per chip
  qed: FW 8.42.2.0 Additional ll2 type
  qed: Use dmae to write to widebus registers in fw_funcs
  qed: FW 8.42.2.0 Parser offsets modified
  qed: FW 8.42.2.0 Queue Manager changes
  qed: FW 8.42.2.0 Expose new registers and change windows
  qed: FW 8.42.2.0 Internal ram offsets modifications
  MAINTAINERS: Add entry for Marvell OcteonTX2 Physical Function driver
  Documentation: net: octeontx2: Add RVU HW and drivers overview
  octeontx2-pf: ethtool RSS config support
  octeontx2-pf: Add basic ethtool support
  ...
2020-01-28 16:02:33 -08:00
Linus Torvalds 6a1000bd27 ioremap changes for 5.6
- remove ioremap_nocache given that is is equivalent to
    ioremap everywhere
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl4vKHwLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYMPGBAAuVNUZaZfWYHpiVP2oRcUQUguFiD3NTbknsyzV2oH
 J9P0GfeENSKwE9OOhZ7XIjnCZAJwQgTK/ppQY5yiQ/KAtYyyXjXEJ6jqqjiTDInr
 +3+I3t/LhkgrK7tMrb7ylTGa/d7KhaciljnOXC8+b75iddvM9I1z2pbHDbppZMS9
 wT4RXL/cFtRb85AfOyPLybcka3f5P2gGvQz38qyimhJYEzHDXZu9VO1Bd20f8+Xf
 eLBKX0o6yWMhcaPLma8tm0M0zaXHEfLHUKLSOkiOk+eHTWBZ3b/w5nsOQZYZ7uQp
 25yaClbameAn7k5dHajduLGEJv//ZjLRWcN3HJWJ5vzO111aHhswpE7JgTZJSVWI
 ggCVkytD3ESXapvswmACSeCIDMmiJMzvn6JvwuSMVB7a6e5mcqTuGo/FN+DrBF/R
 IP+/gY/T7zIIOaljhQVkiEIIwiD/akYo0V9fheHTBnqcKEDTHV4WjKbeF6aCwcO+
 b8inHyXZSKSMG//UlDuN84/KH/o1l62oKaB1uDIYrrL8JVyjAxctWt3GOt5KgSFq
 wVz1lMw4kIvWtC/Sy2H4oB+RtODLp6yJDqmvmPkeJwKDUcd/1JKf0KsZ8j3FpGei
 /rEkBEss0KBKyFAgBSRO2jIpdj2epgcBcsdB/r5mlhcn8L77AS6mHbA173kY4pQ/
 Kdg=
 =TUCJ
 -----END PGP SIGNATURE-----

Merge tag 'ioremap-5.6' of git://git.infradead.org/users/hch/ioremap

Pull ioremap updates from Christoph Hellwig:
 "Remove the ioremap_nocache API (plus wrappers) that are always
  identical to ioremap"

* tag 'ioremap-5.6' of git://git.infradead.org/users/hch/ioremap:
  remove ioremap_nocache and devm_ioremap_nocache
  MIPS: define ioremap_nocache to ioremap
2020-01-27 13:03:00 -08:00
Jason Gunthorpe e8b3a426fb Use ODP MRs for kernel ULPs
The following series extends MR creation routines to allow creation of
 user MRs through kernel ULPs as a proxy. The immediate use case is to
 allow RDS to work over FS-DAX, which requires ODP (on-demand-paging)
 MRs to be created and such MRs were not possible to create prior this
 series.
 
 The first part of this patchset extends RDMA to have special verb
 ib_reg_user_mr(). The common use case that uses this function is a
 userspace application that allocates memory for HCA access but the
 responsibility to register the memory at the HCA is on an kernel ULP.
 This ULP acts as an agent for the userspace application.
 
 The second part provides advise MR functionality for ULPs. This is
 integral part of ODP flows and used to trigger pagefaults in advance
 to prepare memory before running working set.
 
 The third part is actual user of those in-kernel APIs.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQT1m3YD37UfMCUQBNwp8NhrnBAZsQUCXiVO8AAKCRAp8NhrnBAZ
 scTrAP9gb0d3qv0IOtHw5aGI1DAgjTUn/SzUOnsjDEn7DIoh9gEA2+ZmaEyLXKrl
 +UcZb31auy5P8ueJYokRLhLAyRcOIAg=
 =yaHb
 -----END PGP SIGNATURE-----

Merge tag 'rds-odp-for-5.5' into rdma.git for-next

From https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma

Leon Romanovsky says:

====================
Use ODP MRs for kernel ULPs

The following series extends MR creation routines to allow creation of
user MRs through kernel ULPs as a proxy. The immediate use case is to
allow RDS to work over FS-DAX, which requires ODP (on-demand-paging)
MRs to be created and such MRs were not possible to create prior this
series.

The first part of this patchset extends RDMA to have special verb
ib_reg_user_mr(). The common use case that uses this function is a
userspace application that allocates memory for HCA access but the
responsibility to register the memory at the HCA is on an kernel ULP.
This ULP acts as an agent for the userspace application.

The second part provides advise MR functionality for ULPs. This is
integral part of ODP flows and used to trigger pagefaults in advance
to prepare memory before running working set.

The third part is actual user of those in-kernel APIs.
====================

* tag 'rds-odp-for-5.5':
  net/rds: Use prefetch for On-Demand-Paging MR
  net/rds: Handle ODP mr registration/unregistration
  net/rds: Detect need of On-Demand-Paging memory registration
  RDMA/mlx5: Fix handling of IOVA != user_va in ODP paths
  IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs
  RDMA/mlx5: Don't fake udata for kernel path
  IB/mlx5: Add ODP WQE handlers for kernel QPs
  IB/core: Add interface to advise_mr for kernel users
  IB/core: Introduce ib_reg_user_mr
  IB: Allow calls to ib_umem_get from kernel ULPs

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-21 09:55:04 -04:00
Moni Shoua c320e527e1 IB: Allow calls to ib_umem_get from kernel ULPs
So far the assumption was that ib_umem_get() and ib_umem_odp_get()
are called from flows that start in UVERBS and therefore has a user
context. This assumption restricts flows that are initiated by ULPs
and need the service that ib_umem_get() provides.

This patch changes ib_umem_get() and ib_umem_odp_get() to get IB device
directly by relying on the fact that both UVERBS and ULPs sets that
field correctly.

Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16 16:14:28 +02:00
Christoph Hellwig 4bdc0d676a remove ioremap_nocache and devm_ioremap_nocache
ioremap has provided non-cached semantics by default since the Linux 2.6
days, so remove the additional ioremap_nocache interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2020-01-06 09:45:59 +01:00
Selvin Xavier 53bb802315 RDMA/bnxt_re: Report more number of completion vectors
Report the the data path MSIx vectors allocated by driver as number of
completion vectors. One interrupt vector is used for Control path. So
reporting one less than the total number of MSIx vectors allocated by the
driver.

Link: https://lore.kernel.org/r/1574671174-5064-7-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03 15:45:31 -04:00
Selvin Xavier c527572358 RDMA/bnxt_re: Fix Send Work Entry state check while polling completions
Some adapters need a fence Work Entry to handle retransmission.  Currently
the driver checks for this condition, only if the Send queue entry is
signalled. Implement the condition check, irrespective of the signalled
state of the Work queue entries

Failure to add the fence can result in access to memory that is already
marked as completed, triggering data corruption, transmission failure,
IOMMU failures, etc.

Fixes: 9152e0b722 ("RDMA/bnxt_re: HW workarounds for handling specific conditions")
Link: https://lore.kernel.org/r/1574671174-5064-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03 15:27:22 -04:00
Selvin Xavier 9a4467a6b2 RDMA/bnxt_re: Avoid freeing MR resources if dereg fails
The driver returns an error code for MR dereg, but frees the MR structure.
When the MR dereg is retried due to previous error, the system crashes as
the structure is already freed.

  BUG: unable to handle kernel NULL pointer dereference at 00000000000001b8
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 7 PID: 12178 Comm: ib_send_bw Kdump: loaded Not tainted 4.18.0-124.el8.x86_64 #1
  Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.1.10 03/10/2015
  RIP: 0010:__dev_printk+0x2a/0x70
  Code: 0f 1f 44 00 00 49 89 d1 48 85 f6 0f 84 f6 2b 00 00 4c 8b 46 70 4d 85 c0 75 04 4c 8b
46 10 48 8b 86 a8 00 00 00 48 85 c0 74 16 <48> 8b 08 0f be 7f 01 48 c7 c2 13 ac ac 83 83 ef 30 e9 10 fe ff ff
  RSP: 0018:ffffaf7c04607a60 EFLAGS: 00010006
  RAX: 00000000000001b8 RBX: ffffa0010c91c488 RCX: 0000000000000246
  RDX: ffffaf7c04607a68 RSI: ffffa0010c91caa8 RDI: ffffffff83a788eb
  RBP: ffffaf7c04607ac8 R08: 0000000000000000 R09: ffffaf7c04607a68
  R10: 0000000000000000 R11: 0000000000000001 R12: ffffaf7c04607b90
  R13: 000000000000000e R14: 0000000000000000 R15: 00000000ffffa001
  FS:  0000146fa1f1cdc0(0000) GS:ffffa0012fac0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000001b8 CR3: 000000007680a003 CR4: 00000000001606e0
  Call Trace:
   dev_err+0x6c/0x90
   ? dev_printk_emit+0x4e/0x70
   bnxt_qplib_rcfw_send_message+0x594/0x660 [bnxt_re]
   ? dev_err+0x6c/0x90
   bnxt_qplib_free_mrw+0x80/0xe0 [bnxt_re]
   bnxt_re_dereg_mr+0x2e/0xd0 [bnxt_re]
   ib_dereg_mr+0x2f/0x50 [ib_core]
   destroy_hw_idr_uobject+0x20/0x70 [ib_uverbs]
   uverbs_destroy_uobject+0x2e/0x170 [ib_uverbs]
   __uverbs_cleanup_ufile+0x6e/0x90 [ib_uverbs]
   uverbs_destroy_ufile_hw+0x61/0x130 [ib_uverbs]
   ib_uverbs_close+0x1f/0x80 [ib_uverbs]
   __fput+0xb7/0x230
   task_work_run+0x8a/0xb0
   do_exit+0x2da/0xb40
...
  RIP: 0033:0x146fa113a387
  Code: Bad RIP value.
  RSP: 002b:00007fff945d1478 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff02
  RAX: 0000000000000000 RBX: 000055a248908d70 RCX: 0000000000000000
  RDX: 0000146fa1f2b000 RSI: 0000000000000001 RDI: 000055a248906488
  RBP: 000055a248909630 R08: 0000000000010000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: 000055a248906488
  R13: 0000000000000001 R14: 0000000000000000 R15: 000055a2489095f0

Do not free the MR structures, when driver returns error to the stack.

Fixes: 872f357824 ("RDMA/bnxt_re: Add support for MRs with Huge pages")
Link: https://lore.kernel.org/r/1574671174-5064-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03 15:20:42 -04:00
Devesh Sharma fca5b9dc09 RDMA/bnxt_re: Fix missing le16_to_cpu
From sparse:

drivers/infiniband/hw/bnxt_re/main.c:1274:18: warning: cast from restricted __le16
drivers/infiniband/hw/bnxt_re/main.c:1275:18: warning: cast from restricted __le16
drivers/infiniband/hw/bnxt_re/main.c:1276:18: warning: cast from restricted __le16
drivers/infiniband/hw/bnxt_re/main.c:1277:21: warning: restricted __le16 degrades to integer

Fixes: 2b827ea192 ("RDMA/bnxt_re: Query HWRM Interface version from FW")
Link: https://lore.kernel.org/r/1574317343-23300-4-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-25 10:31:47 -04:00
Devesh Sharma 98998ffe52 RDMA/bnxt_re: Fix stat push into dma buffer on gen p5 devices
Due to recent advances in the firmware for Broadcom's gen p5 series of
adaptors the driver code to report hardware counters has been broken
w.r.t. roce devices.

The new firmware command expects dma length to be specified during stat
dma buffer allocation.

Fixes: 2792b5b95e ("bnxt_en: Update firmware interface spec. to 1.10.0.89.")
Link: https://lore.kernel.org/r/1574317343-23300-3-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-25 10:31:47 -04:00
Luke Starrett e284b159c6 RDMA/bnxt_re: Fix chip number validation Broadcom's Gen P5 series
In the first version of Gen P5 ASIC, chip-id was always set to 0x1750 for
all adaptor port configurations. This has been fixed in the new chip rev.

Due to this missing fix users are not able to use adaptors based on latest
chip rev of Broadcom's Gen P5 adaptors.

Fixes: ae8637e131 ("RDMA/bnxt_re: Add chip context to identify 57500 series")
Link: https://lore.kernel.org/r/1574317343-23300-2-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Luke Starrett <luke.starrett@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-25 10:31:47 -04:00
Krzysztof Kozlowski 6e419e35e6 RDMA/bnxt_re: Fix Kconfig indentation
Adjust indentation from spaces to tab (+optional two spaces) as in coding
style with command like:
	$ sed -e 's/^        /\t/' -i */Kconfig

Link: https://lore.kernel.org/r/20191120134138.15245-1-krzk@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-25 10:31:47 -04:00
Christoph Hellwig 72b894b09a IB/umem: remove the dmasync argument to ib_umem_get
The argument is always ignored, so remove it.

Link: https://lore.kernel.org/r/20191113073214.9514-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-17 10:37:00 -04:00
Devesh Sharma 39c48c5146 RDMA/bnxt_re: Enable SRIOV VF support on Broadcom's 57500 adapter series
Broadcom's 575xx adapter series has support for SRIOV VFs.  Making changes
to enable SRIOV VF support. There are two major area where changes are
done:

 - Added new DB location for control-path and data-path DB ring

 - New devices do not need to issue the sriov-config slow-path command
   thus, skipping to call that firmware command.

For now enabling support for 64 RoCE VFs.

Link: https://lore.kernel.org/r/1570081715-14301-1-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-08 16:22:24 -03:00
Kamal Heib 39ce85f3b1 RDMA/bnxt_re: Remove unsupported modify_device callback
There is no need to return always zero for function which is not
supported.

Link: https://lore.kernel.org/r/20190923104158.5331-3-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-01 13:06:10 -03:00
Colin Ian King d97a3e92f3 RDMA/bnxt_re: Fix spelling mistake "missin_resp" -> "missing_resp"
There is a spelling mistake in a literal string, fix it.

Fixes: 89f81008ba ("RDMA/bnxt_re: expose detailed stats retrieved from HW")
Link: https://lore.kernel.org/r/20190911092856.11146-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-09-16 10:58:57 -03:00
Jason Gunthorpe 75c66515e4 Merge tag 'v5.3-rc8' into rdma.git for-next
To resolve dependencies in following patches

mlx5_ib.h conflict resolved by keeing both hunks

Linux 5.3-rc8

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-09-13 16:59:51 -03:00
Navid Emamdoost 4a9d46a9fe RDMA: Fix goto target to release the allocated memory
In bnxt_re_create_srq(), when ib_copy_to_udata() fails allocated memory
should be released by goto fail.

Fixes: 37cb11acf1 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Link: https://lore.kernel.org/r/20190910222120.16517-1-navid.emamdoost@gmail.com
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-09-13 16:55:55 -03:00
Selvin Xavier d37b1e5340 RDMA/bnxt_re: Fix stack-out-of-bounds in bnxt_qplib_rcfw_send_message
Driver copies FW commands to the HW queue as  units of 16 bytes. Some
of the command structures are not exact multiple of 16. So while copying
the data from those structures, the stack out of bounds messages are
reported by KASAN. The following error is reported.

[ 1337.530155] ==================================================================
[ 1337.530277] BUG: KASAN: stack-out-of-bounds in bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530413] Read of size 16 at addr ffff888725477a48 by task rmmod/2785

[ 1337.530540] CPU: 5 PID: 2785 Comm: rmmod Tainted: G           OE     5.2.0-rc6+ #75
[ 1337.530541] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
[ 1337.530542] Call Trace:
[ 1337.530548]  dump_stack+0x5b/0x90
[ 1337.530556]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530560]  print_address_description+0x65/0x22e
[ 1337.530568]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530575]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530577]  __kasan_report.cold.3+0x37/0x77
[ 1337.530581]  ? _raw_write_trylock+0x10/0xe0
[ 1337.530588]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530590]  kasan_report+0xe/0x20
[ 1337.530592]  memcpy+0x1f/0x50
[ 1337.530600]  bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530608]  ? bnxt_qplib_creq_irq+0xa0/0xa0 [bnxt_re]
[ 1337.530611]  ? xas_create+0x3aa/0x5f0
[ 1337.530613]  ? xas_start+0x77/0x110
[ 1337.530615]  ? xas_clear_mark+0x34/0xd0
[ 1337.530623]  bnxt_qplib_free_mrw+0x104/0x1a0 [bnxt_re]
[ 1337.530631]  ? bnxt_qplib_destroy_ah+0x110/0x110 [bnxt_re]
[ 1337.530633]  ? bit_wait_io_timeout+0xc0/0xc0
[ 1337.530641]  bnxt_re_dealloc_mw+0x2c/0x60 [bnxt_re]
[ 1337.530648]  bnxt_re_destroy_fence_mr+0x77/0x1d0 [bnxt_re]
[ 1337.530655]  bnxt_re_dealloc_pd+0x25/0x60 [bnxt_re]
[ 1337.530677]  ib_dealloc_pd_user+0xbe/0xe0 [ib_core]
[ 1337.530683]  srpt_remove_one+0x5de/0x690 [ib_srpt]
[ 1337.530689]  ? __srpt_close_all_ch+0xc0/0xc0 [ib_srpt]
[ 1337.530692]  ? xa_load+0x87/0xe0
...
[ 1337.530840]  do_syscall_64+0x6d/0x1f0
[ 1337.530843]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1337.530845] RIP: 0033:0x7ff5b389035b
[ 1337.530848] Code: 73 01 c3 48 8b 0d 2d 0b 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fd 0a 2c 00 f7 d8 64 89 01 48
[ 1337.530849] RSP: 002b:00007fff83425c28 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 1337.530852] RAX: ffffffffffffffda RBX: 00005596443e6750 RCX: 00007ff5b389035b
[ 1337.530853] RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005596443e67b8
[ 1337.530854] RBP: 0000000000000000 R08: 00007fff83424ba1 R09: 0000000000000000
[ 1337.530856] R10: 00007ff5b3902960 R11: 0000000000000206 R12: 00007fff83425e50
[ 1337.530857] R13: 00007fff8342673c R14: 00005596443e6260 R15: 00005596443e6750

[ 1337.530885] The buggy address belongs to the page:
[ 1337.530962] page:ffffea001c951dc0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
[ 1337.530964] flags: 0x57ffffc0000000()
[ 1337.530967] raw: 0057ffffc0000000 0000000000000000 ffffffff1c950101 0000000000000000
[ 1337.530970] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 1337.530970] page dumped because: kasan: bad access detected

[ 1337.530996] Memory state around the buggy address:
[ 1337.531072]  ffff888725477900: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 f2 f2 f2
[ 1337.531180]  ffff888725477980: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
[ 1337.531288] >ffff888725477a00: 00 f2 f2 f2 f2 f2 f2 00 00 00 f2 00 00 00 00 00
[ 1337.531393]                                                  ^
[ 1337.531478]  ffff888725477a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1337.531585]  ffff888725477b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1337.531691] ==================================================================

Fix this by passing the exact size of each FW command to
bnxt_qplib_rcfw_send_message as req->cmd_size. Before sending
the command to HW, modify the req->cmd_size to number of 16 byte units.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1566468170-489-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-08-22 10:52:15 -04:00
Kamal Heib 72a7720fca RDMA: Introduce ib_port_phys_state enum
In order to improve readability, add ib_port_phys_state enum to replace
the use of magic numbers.

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Andrew Boyer <aboyer@tobark.org>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Acked-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20190807103138.17219-2-kamalheib1@gmail.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-08-12 10:18:52 -04:00
Parav Pandit bda9045a20 IB/bnxt_re: Do not notifify GID change event
GID table entry operations such as add/remove/modify are triggered
by the IB core for RoCE ports.
Hence, remove GID change notification from hw driver.

Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Tested-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Link: https://lore.kernel.org/r/20190726182652.50037-1-parav@mellanox.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-07-29 13:50:48 -04:00
Selvin Xavier c56b593d2a RDMA/bnxt_re: Honor vlan_id in GID entry comparison
A GID entry consists of GID, vlan, netdev and smac.  Extend GID duplicate
check comparisons to consider vlan_id as well to support IPv6 VLAN based
link local addresses. Introduce a new structure (bnxt_qplib_gid_info) to
hold gid and vlan_id information.

The issue is discussed in the following thread
https://lore.kernel.org/r/AM0PR05MB4866CFEDCDF3CDA1D7D18AA5D1F20@AM0PR05MB4866.eurprd05.prod.outlook.com

Fixes: 823b23da71 ("IB/core: Allow vlan link local address based RoCE GIDs")
Cc: <stable@vger.kernel.org> # v5.2+
Link: https://lore.kernel.org/r/20190715091913.15726-1-selvin.xavier@broadcom.com
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Co-developed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-22 15:04:04 -03:00
Jason Gunthorpe 371bb62158 Linux 5.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0Os1seHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGtx4H/j6i482XzcGFKTBm
 A7mBoQpy+kLtoUov4EtBAR62OuwI8rsahW9di37QKndPoQrczWaKBmr3De6LCdPe
 v3pl3O6wBbvH5ru+qBPFX9PdNbDvimEChh7LHxmMxNQq3M+AjZAZVJyfpoiFnx35
 Fbge+LZaH/k8HMwZmkMr5t9Mpkip715qKg2o9Bua6dkH0AqlcpLlC8d9a+HIVw/z
 aAsyGSU8jRwhoAOJsE9bJf0acQ/pZSqmFp0rDKqeFTSDMsbDRKLGq/dgv4nW0RiW
 s7xqsjb/rdcvirRj3rv9+lcTVkOtEqwk0PVdL9WOf7g4iYrb3SOIZh8ZyViaDSeH
 VTS5zps=
 =huBY
 -----END PGP SIGNATURE-----

Merge tag 'v5.2-rc6' into rdma.git for-next

For dependencies in next patches.

Resolve conflicts:
- Use uverbs_get_cleared_udata() with new cq allocation flow
- Continue to delete nes despite SPDX conflict
- Resolve list appends in mlx5_command_str()
- Use u16 for vport_rule stuff
- Resolve list appends in struct ib_client

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-28 21:18:23 -03:00
Leon Romanovsky 836a0fbb3e RDMA: Check umem pointer validity prior to release
Update ib_umem_release() to behave similarly to kfree() and allow
submitting NULL pointer as safe input to this function.

Fixes: a52c8e2469 ("RDMA: Clean destroy CQ in drivers do not return errors")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-20 15:17:59 -04:00
Leon Romanovsky e39afe3d6d RDMA: Convert CQ allocations to be under core responsibility
Ensure that CQ is allocated and freed by IB/core and not by drivers.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Tested-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-11 16:39:49 -04:00
Leon Romanovsky a52c8e2469 RDMA: Clean destroy CQ in drivers do not return errors
Like all other destroy commands, .destroy_cq() call is not supposed
to fail. In all flows, the attempt to return earlier caused to memory
leaks.

This patch converts .destroy_cq() to do not return any errors.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-11 16:17:10 -04:00
Jason Gunthorpe 7a15414252 RDMA: Move owner into struct ib_device_ops
This more closely follows how other subsytems work, with owner being a
member of the structure containing the function pointers.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-10 16:56:03 -03:00
Jason Gunthorpe 72c6ec18eb RDMA: Move uverbs_abi_ver into struct ib_device_ops
No reason for every driver to emit code to set this, just make it part of
the driver's existing static const ops structure.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-10 16:56:02 -03:00
Jason Gunthorpe b9560a419b RDMA: Move driver_id into struct ib_device_ops
No reason for every driver to emit code to set this, just make it part of
the driver's existing static const ops structure.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-10 16:56:02 -03:00
Thomas Gleixner ec8f24b7fa treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:46 +02:00
Shiraz Saleem d85582517e RDMA/bnxt_re: Use core helpers to get aligned DMA address
Call the core helpers to retrieve the HW aligned address to use for the
MR, within a supported bnxt_re page size.

Remove checking the umem->hugtetlb flag as it is no longer required. The
new DMA block iterator will return the 2M aligned address if the MR is
backed by 2M huge pages.

Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-06 13:08:11 -03:00
Parav Pandit a70c07397f RDMA: Introduce and use GID attr helper to read RoCE L2 fields
Instead of RoCE drivers figuring out vlan, smac fields while working on
QP/AH, provide a helper routine to read the L2 fields such as vlan_id and
source mac address.

This moves logic from mlx5 driver to core for wider usage for RoCE ports.

This is a preparation patch to allow detaching netdev in subsequent patch.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-03 11:10:02 -03:00
Matthew Wilcox a7b36d5fa8 ib/bnxt: Remove mention of idr_alloc from comment
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-25 12:02:18 -03:00
Jason Gunthorpe 4b38da75e0 RDMA/drivers: Convert easy drivers to use ib_device_set_netdev()
Drivers that never change their ndev dynamically do not need to use
the get_netdev callback.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Acked-by: Adit Ranadive <aditr@vmware.com>
2019-04-09 10:14:54 -03:00
Leon Romanovsky 68e326dea1 RDMA: Handle SRQ allocations by IB/core
Convert SRQ allocation from drivers to be in the IB/core

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:25 -03:00
Leon Romanovsky d345691471 RDMA: Handle AH allocations by IB/core
Simplify drivers by ensuring lifetime of ib_ah object. The changes
in .create_ah() go hand in hand with relevant update in .destroy_ah().

We will use this opportunity and convert .destroy_ah() to don't fail, as
it was suggested a long time ago, because there is nothing to do in case
of failure during destroy.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:25 -03:00
Shamir Rabinovitch ff23dfa134 IB: Pass only ib_udata in function prototypes
Now when ib_udata is passed to all the driver's object create/destroy APIs
the ib_udata will carry the ib_ucontext for every user command. There is
no need to also pass the ib_ucontext via the functions prototypes.

Make ib_udata the only argument psssed.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01 15:00:47 -03:00
Shamir Rabinovitch c4367a2635 IB: Pass uverbs_attr_bundle down ib_x destroy path
The uverbs_attr_bundle with the ucontext is sent down to the drivers ib_x
destroy path as ib_udata. The next patch will use the ib_udata to free the
drivers destroy path from the dependency in 'uobject->context' as we
already did for the create path.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01 14:57:35 -03:00
Selvin Xavier 5aa8484080 RDMA/bnxt_re: Use correct sizing on buffers holding page DMA addresses
umem->nmap is used while allocating internal buffer for storing
page DMA addresses. This causes out of bounds array access while iterating
the umem DMA-mapped SGL with umem page combining as umem->nmap can be
less than number of system pages in umem.

Use ib_umem_num_pages() instead of umem->nmap to size the page array.
Add a new structure (bnxt_qplib_sg_info) to pass sglist, npages and nmap.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28 14:13:27 -03:00
Enrico Weigelt, metux IT consult 1a2e158327 drivers: infiniband: Fix whitespace in kconfig
Adjust the kconfig whitespace in bnxt_re/iser to match the kernel
standard.

Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-26 12:49:33 -03:00
Linus Torvalds a50243b1dd 5.1 Merge Window Pull Request
This has been a slightly more active cycle than normal with ongoing core
 changes and quite a lot of collected driver updates.
 
 - Various driver fixes for bnxt_re, cxgb4, hns, mlx5, pvrdma, rxe
 
 - A new data transfer mode for HFI1 giving higher performance
 
 - Significant functional and bug fix update to the mlx5 On-Demand-Paging MR
   feature
 
 - A chip hang reset recovery system for hns
 
 - Change mm->pinned_vm to an atomic64
 
 - Update bnxt_re to support a new 57500 chip
 
 - A sane netlink 'rdma link add' method for creating rxe devices and fixing
   the various unregistration race conditions in rxe's unregister flow
 
 - Allow lookup up objects by an ID over netlink
 
 - Various reworking of the core to driver interface:
   * Drivers should not assume umem SGLs are in PAGE_SIZE chunks
   * ucontext is accessed via udata not other means
   * Start to make the core code responsible for object memory
     allocation
   * Drivers should convert struct device to struct ib_device
     via a helper
   * Drivers have more tools to avoid use after unregister problems
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlyAJYYACgkQOG33FX4g
 mxrWwQ/+OyAx4Moru7Aix0C6GWxTJp/wKgw21CS3reZxgLai6x81xNYG/s2wCNjo
 IccObVd7mvzyqPdxOeyHBsJBbQDqWvoD6O2duH8cqGMgBRgh3CSdUep2zLvPpSAx
 2W1SvWYCLDnCuarboFrCA8c4AN3eCZiqD7z9lHyFQGjy3nTUWzk1uBaOP46uaiMv
 w89N8EMdXJ/iY6ONzihvE05NEYbMA8fuvosKLLNdghRiHIjbMQU8SneY23pvyPDd
 ZziPu9NcO3Hw9OVbkwtJp47U3KCBgvKHmnixyZKkikjiD+HVoABw2IMwcYwyBZwP
 Bic/ddONJUvAxMHpKRnQaW7znAiHARk21nDG28UAI7FWXH/wMXgicMp6LRcNKqKF
 vqXdxHTKJb0QUR4xrYI+eA8ihstss7UUpgSgByuANJ0X729xHiJtlEvPb1DPo1Dz
 9CB4OHOVRl5O8sA5Jc6PSusZiKEpvWoyWbdmw0IiwDF5pe922VLl5Nv88ta+sJ38
 v2Ll5AgYcluk7F3599Uh9D7gwp5hxW2Ph3bNYyg2j3HP4/dKsL9XvIJPXqEthgCr
 3KQS9rOZfI/7URieT+H+Mlf+OWZhXsZilJG7No0fYgIVjgJ00h3SF1/299YIq6Qp
 9W7ZXBfVSwLYA2AEVSvGFeZPUxgBwHrSZ62wya4uFeB1jyoodPk=
 =p12E
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This has been a slightly more active cycle than normal with ongoing
  core changes and quite a lot of collected driver updates.

   - Various driver fixes for bnxt_re, cxgb4, hns, mlx5, pvrdma, rxe

   - A new data transfer mode for HFI1 giving higher performance

   - Significant functional and bug fix update to the mlx5
     On-Demand-Paging MR feature

   - A chip hang reset recovery system for hns

   - Change mm->pinned_vm to an atomic64

   - Update bnxt_re to support a new 57500 chip

   - A sane netlink 'rdma link add' method for creating rxe devices and
     fixing the various unregistration race conditions in rxe's
     unregister flow

   - Allow lookup up objects by an ID over netlink

   - Various reworking of the core to driver interface:
       - drivers should not assume umem SGLs are in PAGE_SIZE chunks
       - ucontext is accessed via udata not other means
       - start to make the core code responsible for object memory
         allocation
       - drivers should convert struct device to struct ib_device via a
         helper
       - drivers have more tools to avoid use after unregister problems"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (280 commits)
  net/mlx5: ODP support for XRC transport is not enabled by default in FW
  IB/hfi1: Close race condition on user context disable and close
  RDMA/umem: Revert broken 'off by one' fix
  RDMA/umem: minor bug fix in error handling path
  RDMA/hns: Use GFP_ATOMIC in hns_roce_v2_modify_qp
  cxgb4: kfree mhp after the debug print
  IB/rdmavt: Fix concurrency panics in QP post_send and modify to error
  IB/rdmavt: Fix loopback send with invalidate ordering
  IB/iser: Fix dma_nents type definition
  IB/mlx5: Set correct write permissions for implicit ODP MR
  bnxt_re: Clean cq for kernel consumers only
  RDMA/uverbs: Don't do double free of allocated PD
  RDMA: Handle ucontext allocations by IB/core
  RDMA/core: Fix a WARN() message
  bnxt_re: fix the regression due to changes in alloc_pbl
  IB/mlx4: Increase the timeout for CM cache
  IB/core: Abort page fault handler silently during owning process exit
  IB/mlx5: Validate correct PD before prefetch MR
  IB/mlx5: Protect against prefetch of invalid MR
  RDMA/uverbs: Store PR pointer before it is overwritten
  ...
2019-03-09 15:53:03 -08:00
Devesh Sharma 0fca467e81 bnxt_re: Clean cq for kernel consumers only
Kernel space provider driver should clean the CQs belonging to kernel
space consumers only. The current implementation is doing reverse of it.

Fixing the same by avoiding the call to __clean_cq on a kernel qp during
destroy.

Fixes: c50866e285 ("bnxt_re: fix the regression due to changes in alloc_pbl")
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-04 10:51:14 -04:00
Jakub Kicinski f4b6bcc700 net: devlink: turn devlink into a built-in
Being able to build devlink as a module causes growing pains.
First all drivers had to add a meta dependency to make sure
they are not built in when devlink is built as a module.  Now
we are struggling to invoke ethtool compat code reliably.

Make devlink code built-in, users can still not build it at
all but the dynamically loadable module option is removed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 08:49:05 -08:00
Leon Romanovsky a2a074ef39 RDMA: Handle ucontext allocations by IB/core
Following the PD conversion patch, do the same for ucontext allocations.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-22 14:11:37 -07:00
Devesh Sharma c50866e285 bnxt_re: fix the regression due to changes in alloc_pbl
While adding the use of for_each_sg_dma_page iterator for Brodcom's rdma
driver, there was a regression added in the __alloc_pbl path. The change
left bnxt_re in DOA state in for-next branch.

Fixing the regression to avoid the host crash when a user space object is
created. Restricting the unconditional access to hwq.pg_arr when hwq is
initialized for user space objects.

Fixes: 161ebe2498 ("RDMA/bnxt_re: Use for_each_sg_dma_page iterator on umem SGL")
Reported-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-22 11:17:22 -07:00
Shamir Rabinovitch 8994445054 IB/{hw,sw}: Remove 'uobject->context' dependency in object creation APIs
Now when we have the udata passed to all the ib_xxx object creation APIs
and the additional macro 'rdma_udata_to_drv_context' to get the
ib_ucontext from ib_udata stored in uverbs_attr_bundle, we can finally
start to remove the dependency of the drivers in the
ib_xxx->uobject->context.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-15 15:38:38 -07:00
Colin Ian King a87145957e RDMA/bnxt_re: fix or'ing of data into an uninitialized struct member
The struct member comp_mask has not been initialized however a bit
pattern is being bitwise or'd into the member and hence other bit
fields in comp_mask may contain any garbage from the stack. Fix this
by making the bitwise or into an assignment.

Fixes: 95b86d1c91 ("RDMA/bnxt_re: Update kernel user abi to pass chip context")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-11 15:34:54 -07:00
Shiraz, Saleem 161ebe2498 RDMA/bnxt_re: Use for_each_sg_dma_page iterator on umem SGL
Use the for_each_sg_dma_page iterator variant to walk the umem DMA-mapped
SGL and get the page DMA address. This avoids the extra loop to iterate
pages in the SGE when for_each_sg iterator is used.

Additionally, purge umem->page_shift usage in the driver as its only
relevant for ODP MRs. Use system page size and shift instead.

Signed-off-by: Shiraz, Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-11 15:02:33 -07:00
Leon Romanovsky 21a428a019 RDMA: Handle PD allocations by IB/core
The PD allocations in IB/core allows us to simplify drivers and their
error flows in their .alloc_pd() paths. The changes in .alloc_pd() go hand
in had with relevant update in .dealloc_pd().

We will use this opportunity and convert .dealloc_pd() to don't fail, as
it was suggested a long time ago, failures are not happening as we have
never seen a WARN_ON print.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:51:04 -07:00
Devesh Sharma 95b86d1c91 RDMA/bnxt_re: Update kernel user abi to pass chip context
User space verbs provider library would need chip context.  Changing the
ABI to add chip version details in structure.  Furthermore, changing the
kernel driver ucontext allocation code to initialize the abi structure
with appropriate values.

As suggested by community, appended the new fields at the bottom of the
ABI structure and retaining to older fields as those were in the older
versions.

Keeping the ABI version at 1 and adding a new field in the ucontext
response structure to hold the component mask.  The user space library
should check pre-defined flags to figure out if a certain feature is
supported on not.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:49 -07:00
Devesh Sharma 37f91cff2d RDMA/bnxt_re: Add extended psn structure for 57500 adapters
The new 57500 series of adapter has bigger psn search structure.  The size
of new structure is 16B. Changing the control path memory allocation and
fast path code to accommodate the new psn structure while maintaining the
backward compatibility.

There are few additional changes listed below:
 - For 57500 chip max-sge are limited to 6 for now.
 - For 57500 chip max-receive-sge should be set to 6 for now.
 - Add driver/hardware interface structure for new chip.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Devesh Sharma 374c5285ab RDMA/bnxt_re: Enable GSI QP support for 57500 series
In the new 57500 series of adapters the GSI qp is a UD type QP unlike the
previous generation where it was a Raw Eth QP. Changing the control and
data path to support the same. Listing all the significant diffs:

 - AH creation resolve network type unconditionally
 - Add check at relevant places to distinguish from Raw Eth
   processing flow.
 - bnxt_re_process_res_ud_wc report completion with GRH flag
   when qp is GSI.
 - Change length, cfa_meta and smac to match new driver/hardware
   interface.
 - Add new driver/hardware interface.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Devesh Sharma e0387e1dd4 RDMA/bnxt_re: Skip backing store allocation for 57500 series
The backing store to keep HW context data structures is allocated and
initialized by L2 driver. For 57500 chip RoCE driver do not require to
allocate and initialize additional memory. Changing to skip duplicate
allocation and initialization for 57500 adapters. Driver continues as
before for older chips.

This patch also takes care of stats context memory alignment to 128
boundary, a requirement for 57500 series of chip. Older chips do not care
of alignment, thus the change is unconditional.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Devesh Sharma b353ce556d RDMA/bnxt_re: Add 64bit doorbells for 57500 series
The new chip series has 64 bit doorbell for notification queues. Thus,
both control and data path event queues need new routines to write 64 bit
doorbell. Adding the same. There is new doorbell interface between the
chip and driver. Changing the chip specific data structure definitions.

Additional significant changes are listed below
- bnxt_re_net_ring_free/alloc takes a new argument
- bnxt_qplib_enable_nq and enable_rcfw uses new doorbell offset
  for new chip.
- DB mapping for NQ and CREQ now maps 8 bytes.
- DBR_DBR_* macros renames to DBC_DBC_*
- store nq_db_offset in a 32bit data type.
- got rid of __iowrite64_copy, used writeq instead.
- changed the DB header initialization to simpler scheme.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Devesh Sharma ae8637e131 RDMA/bnxt_re: Add chip context to identify 57500 series
Adding setup and destroy routines for chip-context. The chip context would
be used frequently in control and data path to take execution flow
depending on the chip type.  chip context structure pointer is added to
the relevant data structures.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Leon Romanovsky 459cc69fa4 RDMA: Provide safe ib_alloc_device() function
All callers to ib_alloc_device() provide a larger size than struct
ib_device and rely on the fact that struct ib_device is embedded in their
driver specific structure as the first member.

Provide a safer variant of ib_alloc_device() that checks and enforces this
approach to make sure the drivers are using it right.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-30 15:52:30 -07:00
Jason Gunthorpe 55c293c38e Merge branch 'devx-async' into k.o/for-next
Yishai Hadas says:

Enable DEVX asynchronous query commands

This series enables querying a DEVX object in an asynchronous mode.

The userspace application won't block when calling the firmware and it will be
able to get the response back once that it will be ready.

To enable the above functionality:

- DEVX asynchronous command completion FD object was introduced.
- The applicable file operations were implemented to enable using it by
  the user application.
- Query asynchronous method was added to the DEVX object, it will call the
  firmware asynchronously and manages the response on the given input FD.
- Hot unplug support was added for the FD to work properly upon
  unbind/disassociate.
- mlx5 core fence for asynchronous commands was implemented and used to
  prevent racing upon unbind/disassociate.

This branch is based on mlx5-next & v5.0-rc2 due to dependencies, from
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

* branch 'devx-async':
  IB/mlx5: Implement DEVX hot unplug for async command FD
  IB/mlx5: Implement the file ops of DEVX async command FD
  IB/mlx5: Introduce async DEVX obj query API
  IB/mlx5: Introduce MLX5_IB_OBJECT_DEVX_ASYNC_CMD_FD

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-29 13:49:31 -07:00
Masahiro Yamada b360ce3b2b infiniband: prefix header search paths with $(srctree)/
Currently, the Kbuild core manipulates header search paths in a crazy
way [1].

To fix this mess, I want all Makefiles to add explicit $(srctree)/ to
the search paths in the srctree. Some Makefiles are already written in
that way, but not all. The goal of this work is to make the notation
consistent, and finally get rid of the gross hacks.

Having whitespaces after -I does not matter since commit 48f6e3cf5b
("kbuild: do not drop -I without parameter").

[1]: https://patchwork.kernel.org/patch/9632347/

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Parvi Kaustubhi <pkaustub@cisco.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-25 15:28:50 -07:00
Dan Carpenter 97099cc652 RDMA/bnxt_re: fix a size calculation
This is from static analysis not from testing.  Depending on the value
of rcfw->cmdq_depth, then this might not cause an issue at runtime.

The BITS_TO_LONGS() macro tells us how many longs it take to hold a
bitmap.  In other words, it divides by the number if bits per long and
rounds up.  Then we want to take that number and multiple by
sizeof(long) to get the number of bytes to allocate.

The code here does the multiplication first so the rounding up is done
in the wrong place.  So imagine we want to allocate 1 bit, then
"(1 * 8) / 64 = 1" when we round up.  But it should be
"(1 / 64) * 8 = 8".  In other words, because of the rounding difference
we might allocate up to "sizeof(long) - 1" bytes fewer than intended.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-By: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-14 14:05:54 -07:00
Parav Pandit 5474723115 RDMA: Introduce and use rdma_device_to_ibdev()
Introduce and use rdma_device_to_ibdev() API for those drivers which are
registering one sysfs group and also use in ib_core.

In subsequent patch, device->provider_ibdev one-to-one mapping is no
longer holds true during accessing sysfs entries.
Therefore, introduce an API rdma_device_to_ibdev() that provides such
information.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-14 13:12:03 -07:00
Parav Pandit ea4baf7f11 RDMA: Rename port_callback to init_port
Most provider routines are callback routines which ib core invokes.
_callback suffix doesn't convey information about when such callback is
invoked. Therefore, rename port_callback to init_port.

Additionally, store the init_port function pointer in ib_device_ops, so
that it can be accessed in subsequent patches when binding rdma device to
net namespace.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-14 13:05:14 -07:00
Jason Gunthorpe b0ea0fa543 IB/{core,hw}: Have ib_umem_get extract the ib_ucontext from ib_udata
ib_umem_get() can only be called in a method callback, which always has a
udata parameter. This allows ib_umem_get() to derive the ucontext pointer
directly from the udata without requiring the drivers to find it in some
way or another.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
2019-01-10 17:07:45 -07:00
Luis Chamberlain 750afb08ca cross-tree: phase out dma_zalloc_coherent()
We already need to zero out memory for dma_alloc_coherent(), as such
using dma_zalloc_coherent() is superflous. Phase it out.

This change was generated with the following Coccinelle SmPL patch:

@ replace_dma_zalloc_coherent @
expression dev, size, data, handle, flags;
@@

-dma_zalloc_coherent(dev, size, handle, flags)
+dma_alloc_coherent(dev, size, handle, flags)

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
[hch: re-ran the script on the latest tree]
Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-01-08 07:58:37 -05:00
Aditya Pakki 94edd87a1c infiniband: bnxt_re: qplib: Check the return value of send_message
In bnxt_qplib_map_tc2cos(), bnxt_qplib_rcfw_send_message() can return an
error value but it is lost. Propagate this error to the callers.

Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Acked-By: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-02 16:04:24 -07:00
Devesh Sharma bd1c24ccf9 RDMA/bnxt_re: Increase depth of control path command queue
Increasing the depth of control path command queue to 8K entries to handle
burst of commands. This feature needs support from FW and the driver/fw
compatibility is checked from the interface version number.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 16:37:33 -07:00
Selvin Xavier 2b827ea192 RDMA/bnxt_re: Query HWRM Interface version from FW
Get HWRM interface major, minor, build and patch version from FW for
checking the FW/Driver compatibility.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 16:37:32 -07:00
Gal Pressman 50c582de1d RDMA/bnxt_re: Make use of destroy AH sleepable flag
When in a sleepable (non-atomic) context, wait for firmware completion
instead of polling for it.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 16:28:04 -07:00
Gal Pressman 90e3edd8cc RDMA/bnxt_re: Make use of create AH sleepable flag
When in a sleepable (non-atomic) context, wait for firmware completion
instead of polling for it.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 16:28:03 -07:00
Gal Pressman 2553ba217e RDMA: Mark if destroy address handle is in a sleepable context
Introduce a 'flags' field to destroy address handle callback and add a
flag that marks whether the callback is executed in an atomic context or
not.

This will allow drivers to wait for completion instead of polling for it
when it is allowed.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 16:28:03 -07:00
Gal Pressman b090c4e3a0 RDMA: Mark if create address handle is in a sleepable context
Introduce a 'flags' field to create address handle callback and add a flag
that marks whether the callback is executed in an atomic context or not.

This will allow drivers to wait for completion instead of polling for it
when it is allowed.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 16:17:19 -07:00
Shamir Rabinovitch e00b64f7c5 RDMA: Cleanup undesired pd->uobject usage
Drivers should be using udata to determine if a method is invoked from
user space or kernel space. A pd does not necessarily say a different
objects is kernel or user.

Transforming the tests to use udata eliminates a large number of uobject
references from the drivers.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-18 19:15:48 -07:00
Kamal Heib 9615f86be9 RDMA/bnxt_re: Initialize ib_device_ops struct
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-11 15:15:07 -07:00
Selvin Xavier a6c66d6a08 RDMA/bnxt_re: Avoid accessing the device structure after it is freed
When bnxt_re_ib_reg returns failure, the device structure gets
freed. Driver tries to access the device pointer
after it is freed.

[ 4871.034744] Failed to register with netedev: 0xffffffa1
[ 4871.034765] infiniband (null): Failed to register with IB: 0xffffffea
[ 4871.046430] ==================================================================
[ 4871.046437] BUG: KASAN: use-after-free in bnxt_re_task+0x63/0x180 [bnxt_re]
[ 4871.046439] Write of size 4 at addr ffff880fa8406f48 by task kworker/u48:2/17813

[ 4871.046443] CPU: 20 PID: 17813 Comm: kworker/u48:2 Kdump: loaded Tainted: G B OE  4.20.0-rc1+ #42
[ 4871.046444] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
[ 4871.046447] Workqueue: bnxt_re bnxt_re_task [bnxt_re]
[ 4871.046449] Call Trace:
[ 4871.046454]  dump_stack+0x91/0xeb
[ 4871.046458]  print_address_description+0x6a/0x2a0
[ 4871.046461]  kasan_report+0x176/0x2d0
[ 4871.046463]  ? bnxt_re_task+0x63/0x180 [bnxt_re]
[ 4871.046466]  bnxt_re_task+0x63/0x180 [bnxt_re]
[ 4871.046470]  process_one_work+0x216/0x5b0
[ 4871.046471]  ? process_one_work+0x189/0x5b0
[ 4871.046475]  worker_thread+0x4e/0x3d0
[ 4871.046479]  kthread+0x10e/0x140
[ 4871.046480]  ? process_one_work+0x5b0/0x5b0
[ 4871.046482]  ? kthread_stop+0x220/0x220
[ 4871.046486]  ret_from_fork+0x3a/0x50

[ 4871.046492] The buggy address belongs to the page:
[ 4871.046494] page:ffffea003ea10180 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[ 4871.046495] flags: 0x57ffffc0000000()
[ 4871.046498] raw: 0057ffffc0000000 0000000000000000 ffffea003ea10188 0000000000000000
[ 4871.046500] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 4871.046501] page dumped because: kasan: bad access detected

Avoid accessing the device structure once it is freed.

Fixes: 497158aa5f ("RDMA/bnxt_re: Fix the ib_reg failure cleanup")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-21 14:12:21 -07:00
Selvin Xavier 3c4b1419c3 RDMA/bnxt_re: Fix system hang when registration with L2 driver fails
Driver doesn't release rtnl lock if registration with
L2 driver (bnxt_re_register_netdev) fais and this causes
hang while requesting for the next lock.

[  371.635416] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  371.635417] kworker/u48:1   D    0   634      2 0x80000000
[  371.635423] Workqueue: bnxt_re bnxt_re_task [bnxt_re]
[  371.635424] Call Trace:
[  371.635426]  ? __schedule+0x36b/0xbd0
[  371.635429]  schedule+0x39/0x90
[  371.635430]  schedule_preempt_disabled+0x11/0x20
[  371.635431]  __mutex_lock+0x45b/0x9c0
[  371.635433]  ? __mutex_lock+0x16d/0x9c0
[  371.635435]  ? bnxt_re_ib_reg+0x2b/0xb30 [bnxt_re]
[  371.635438]  ? wake_up_klogd+0x37/0x40
[  371.635442]  bnxt_re_ib_reg+0x2b/0xb30 [bnxt_re]
[  371.635447]  bnxt_re_task+0xfd/0x180 [bnxt_re]
[  371.635449]  process_one_work+0x216/0x5b0
[  371.635450]  ? process_one_work+0x189/0x5b0
[  371.635453]  worker_thread+0x4e/0x3d0
[  371.635455]  kthread+0x10e/0x140
[  371.635456]  ? process_one_work+0x5b0/0x5b0
[  371.635458]  ? kthread_stop+0x220/0x220
[  371.635460]  ret_from_fork+0x3a/0x50
[  371.635477] INFO: task NetworkManager:1228 blocked for more than 120 seconds.
[  371.635478]       Tainted: G    B      OE     4.20.0-rc1+ #42
[  371.635479] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Release the rtnl_lock correctly in the failure path.

Fixes: de5c95d0f5 ("RDMA/bnxt_re: Fix system crash during RDMA resource initialization")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-21 14:11:04 -07:00
Parav Pandit 508a523f6b RDMA/drivers: Use core provided API for registering device attributes
Use rdma_set_device_sysfs_group() to register device attributes and
simplify the driver.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-17 03:45:01 -06:00
Selvin Xavier 5df9509949 RDMA/bnxt_re: Avoid resource leak in case the NQ registration fails
In case the NQ alloc/enable fails, free up the already allocated/enabled
NQ before reporting failure. Also, track the alloc/enable using proper
state checking.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:51 -06:00
Selvin Xavier a08b9e9a70 RDMA/bnxt_re: Wait for delayed work to finish before device removal
Delayed work bnxt_re_worker would be still running even after
cancel_delayed_work returns. This causes crash as the driver proceeds with
device removal. To make sure that the work is finished before returning,
use cancel_delayed_work_sync.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:51 -06:00
Devesh Sharma 854a202001 RDMA/bnxt_re: Limit max_pkey to 16 bit value
Some FW versios return pkey values more than 0xFFFF. pkey_tbl_len of
ib_port_attr is 16bit value. So restricting max_pkeys to 0xFFFF.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:51 -06:00
Devesh Sharma 4c01f2e3a9 RDMA/bnxt_re: Fix qp async event reporting
Reports affiliated async event on the qp-async event channel instead of
global event channel.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Selvin Xavier 316dd2825d RDMA/bnxt_re: Report out of sequence hw counters
Expose out of sequence errors received from FW.  This counter is a 32 bit
counter and driver has to accumulate the counter. Stores the previous
value for calculating the difference in the next query.

Also, update the HW statistics structure with new fields.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Selvin Xavier 5c80c9138e RDMA/bnxt_re: Expose rx discards and drop counters
Expose the RoCE discard and drop counters from the HW statistics context

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Somnath Kotur bb22c36cba RDMA/bnxt_re: Prevent driver crash due to NULL pointer in error message print
crsqe->resp would be NULL in case the host command timed out before
getting a response from HW. Check for NULL pointer to avoid a potential
crash while printing the error message.

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Devesh Sharma f2bd4d096e RDMA/bnxt_re: Drop L2 async events silently
In some FW versions, RoCE driver also receives an async notification which
was directed to L2 driver.  RoCE driver does not handle this and print a
message to syslog.  Drop these notifications silently.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Selvin Xavier ed51efd2ce RDMA/bnxt_re: Avoid accessing nq->bar_reg_iomem in failure case
In the failure path, nq->bar_reg_iomem gets accessed without
initializing. Avoid this by calling the bnxt_qplib_nq_stop_irq only if the
initialization is complete.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Fixes: 6e04b10356 ("RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Selvin Xavier eae4ad1b0c RDMA/bnxt_re: Avoid NULL check after accessing the pointer
This is reported by smatch check.  rcfw->creq_bar_reg_iomem is accessed in
bnxt_qplib_rcfw_stop_irq and this variable check afterwards doesn't make
sense.  Also, rcfw->creq_bar_reg_iomem will never be NULL.  So Removing
this check.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 6e04b10356 ("RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Selvin Xavier 1b7042d7a5 RDMA/bnxt_re: Remove the unnecessary version macro definition
Version macro is not required as the driver is not maintaining the
version. Removing the references of this macro too.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Selvin Xavier d455f29f6d RDMA/bnxt_re: Fix recursive lock warning in debug kernel
Fix possible recursive lock warning. Its a false warning as the locks are
part of two differnt HW Queue data structure - cmdq and creq. Debug kernel
is throwing the following warning and stack trace.

[  783.914967] ============================================
[  783.914970] WARNING: possible recursive locking detected
[  783.914973] 4.19.0-rc2+ #33 Not tainted
[  783.914976] --------------------------------------------
[  783.914979] swapper/2/0 is trying to acquire lock:
[  783.914982] 000000002aa3949d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.914999]
but task is already holding lock:
[  783.915002] 00000000be73920d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x2a/0x350 [bnxt_re]
[  783.915013]
other info that might help us debug this:
[  783.915016]  Possible unsafe locking scenario:

[  783.915019]        CPU0
[  783.915021]        ----
[  783.915034]   lock(&(&hwq->lock)->rlock);
[  783.915035]   lock(&(&hwq->lock)->rlock);
[  783.915037]
 *** DEADLOCK ***

[  783.915038]  May be due to missing lock nesting notation

[  783.915039] 1 lock held by swapper/2/0:
[  783.915040]  #0: 00000000be73920d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x2a/0x350 [bnxt_re]
[  783.915044]
stack backtrace:
[  783.915046] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.19.0-rc2+ #33
[  783.915047] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
[  783.915048] Call Trace:
[  783.915049]  <IRQ>
[  783.915054]  dump_stack+0x90/0xe3
[  783.915058]  __lock_acquire+0x106c/0x1080
[  783.915061]  ? sched_clock+0x5/0x10
[  783.915063]  lock_acquire+0xbd/0x1a0
[  783.915065]  ? bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.915069]  _raw_spin_lock_irqsave+0x4a/0x90
[  783.915071]  ? bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.915073]  bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.915078]  tasklet_action_common.isra.17+0x197/0x1b0
[  783.915081]  __do_softirq+0xcb/0x3a6
[  783.915084]  irq_exit+0xe9/0x100
[  783.915085]  do_IRQ+0x6a/0x120
[  783.915087]  common_interrupt+0xf/0xf
[  783.915088]  </IRQ>

Use nested notation for the spin_lock to avoid this warning.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Selvin Xavier 5a23e0b1dd RDMA/bnxt_re: Add missing spin lock initialization
Add the missing initalization of the cq_lock and qplib.flush_lock.

Fixes: 942c9b6ca8 ("RDMA/bnxt_re: Avoid Hard lockup during error CQE processing")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Jason Gunthorpe 59bfc59a68 Merge branch 'for-rc' into rdma.git for-next
From git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git

This is required to resolve dependencies of the next series of RDMA
patches.

The code motion conflicts in drivers/infiniband/core/cache.c were
resolved.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:01:02 -06:00