linux

Commit Graph

Author	SHA1	Message	Date
Steve Wise	4ab928f692	RDMA/cxgb3: Fixes for zero STag Handling the zero STag in receive work request requires some extra logic in the driver: - Only set the QP_PRIV bit for kernel mode QPs. - Add a zero STag build function for recv wrs. The uP needs a PBL allocated and passed down in the recv WR so it can construct a HW PBL for the zero STag S/G entries. Note: we need to place a few restrictions on zero STag usage because of this: 1) all SGEs in a recv WR must either be zero STag or not. No mixing. 2) an individual SGE length cannot exceed 128MB for a zero-stag SGE. This should be OK since it's not really practical to allocate such a large chunk of pinned contiguous DMA mapped memory. - Add an optimized non-zero-STag recv wr format for kernel users. This is needed to optimize both zero and non-zero STag cracking in the recv path for kernel users. - Remove the iwch_ prefix from the static build functions. - Bump required FW version. Signed-off-by: Steve Wise <swise@opengridcomputing.com>	2008-07-14 23:48:53 -07:00
Steve Wise	96f15c0353	RDMA/core: Add local DMA L_Key support - Change the IB_DEVICE_ZERO_STAG flag to the transport-neutral name IB_DEVICE_LOCAL_DMA_LKEY, which is used by iWARP RNICs to indicate 0 STag support and IB HCAs to indicate reserved L_Key support. - Add a u32 local_dma_lkey member to struct ib_device. Drivers fill this in with the appropriate local DMA L_Key (if they support it). - Fix up the drivers using this flag. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:53 -07:00
Steve Wise	70fe1796a5	RDMA/cxgb3: Set rkey field for new memory windows in iwch_alloc_mw() Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:49 -07:00
Jon Mason	52c8084b74	RDMA/cxgb3: Propagate HW page size capabilities cxgb3 does not currently report the page size capabilities, and incorrectly reports them internally. This version changes the bit-shifting to a static value (per Steve's request). Signed-off-by: Jon Mason <jon@opengridcomputing.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:49 -07:00
Steve Wise	14cc180f7b	RDMA/cxgb3: Add support for protocol statistics - Add a new rdma ctl command called RDMA_GET_MIB to the cxgb3 low level driver to obtain the protocol mib from the rnic hardware. - Add new iw_cxgb3 provider method to get the MIB from the low level driver. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:48 -07:00
Steve Wise	97d1cc8055	RDMA/cxgb3: Fix up some ib_device_attr fields - set fw_ver - set hw_ver - set max_qp_wr to something reasonable - set max_cqe to something reasonable Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:47 -07:00
Steve Wise	e7e5582999	RDMA/cxgb3: MEM_MGT_EXTENSIONS support - set IB_DEVICE_MEM_MGT_EXTENSIONS capability bit if fw supports it. - set max_fast_reg_page_list_len device attribute. - add iwch_alloc_fast_reg_mr function. - add iwch_alloc_fastreg_pbl - add iwch_free_fastreg_pbl - adjust the WQ depth for kernel mode work queues to account for fastreg possibly taking 2 WR slots. - add fastreg_mr work request support. - add local_inv work request support. - add send_with_inv and send_with_se_inv work request support. - removed useless duplicate enums/defines for TPT/MW/MR stuff. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:45 -07:00
Steve Wise	5e19cf663b	RDMA/cxgb3: Fix regression caused by class_device -> device conversion The change to iwch_provider.c in commit `f4e91eb4` ("IB: convert struct class_device to struct device") undid the fix done in commit `7f049f2f` ("RDMA/cxgb3: Hold rtnl_lock() around ethtool get_drvinfo call"). It removed the calls to rtnl_lock() that serialized the iw_cxgb3 ethtool ops calls into the cxgb3 driver. This locking is needed to avoid messing up the internal state of the cxgb3 driver. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-08 14:40:05 -07:00
Roland Dreier	273748cc90	RDMA/cxgb3: Fix severe limit on userspace memory registration size Currently, iw_cxgb3 is severely limited on the amount of userspace memory that can be registered in a single memory region, which causes big problems for applications that expect to be able to register 100s of MB. The problem is that the driver uses a single kmalloc()ed buffer to hold the physical buffer list (PBL) for the entire memory region during registration, which means that 8 bytes of contiguous memory are required for each page of memory being registered. For example, a 64 MB registration will require 128 KB of contiguous memory with 4 KB pages, and it is unlikely that such an allocation will succeed on a busy system. This is purely a driver problem: the temporary page list buffer is not needed by the hardware, so we can fix this by writing the PBL to the hardware in page-sized chunks rather than all at once. We do this by splitting the memory registration operation up into several steps: - Allocate PBL space in adapter memory for the full registration - Copy PBL to adapter memory in chunks - Allocate STag and enable memory region This also allows several other cleanups to the __cxio_tpt_op() interface and related parts of the driver. This change leaves the reregister memory region and memory window operations broken, but they already didn't work due to other longstanding bugs, so fixing them will be left to a later patch. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-05-06 15:56:22 -07:00
Steve Wise	ccaf10d0ad	RDMA/cxgb3: Set the max_mr_size device attribute correctly cxgb3 only supports 4GB memory regions. The lustre RDMA code uses this attribute and currently has to code around our bad setting. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-29 13:46:52 -07:00
Arthur Kepner	cb9fbc5c37	IB: expand ib_umem_get() prototype Add a new parameter, dmasync, to the ib_umem_get() prototype. Use dmasync = 1 when mapping user-allocated CQs with ib_umem_get(). Signed-off-by: Arthur Kepner <akepner@sgi.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Jes Sorensen <jes@sgi.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:12 -07:00
Tony Jones	f4e91eb4a8	IB: convert struct class_device to struct device This converts the main ib_device to use struct device instead of struct class_device as class_device is going away. Signed-off-by: Tony Jones <tonyj@suse.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-04-19 19:10:30 -07:00
Roland Dreier	0f39cf3d54	IB/core: Add support for "send with invalidate" work requests Add a new IB_WR_SEND_WITH_INV send opcode that can be used to mark a "send with invalidate" work request as defined in the iWARP verbs and the InfiniBand base memory management extensions. Also put "imm_data" and a new "invalidate_rkey" member in a new "ex" union in struct ib_send_wr. The invalidate_rkey member can be used to pass in an R_Key/STag to be invalidated. Add this new union to struct ib_uverbs_send_wr. Add code to copy the invalidate_rkey field in ib_uverbs_post_send(). Fix up low-level drivers to deal with the change to struct ib_send_wr, and just remove the imm_data initialization from net/sunrpc/xprtrdma/, since that code never does any send with immediate operations. Also, move the existing IB_DEVICE_SEND_W_INV flag to a new bit, since the iWARP drivers currently in the tree set the bit. The amso1100 driver at least will silently fail to honor the IB_SEND_INVALIDATE bit if passed in as part of userspace send requests (since it does not implement kernel bypass work request queueing). Remove the flag from all existing drivers that set it until we know which ones are OK. The value chosen for the new flag is not consecutive to avoid clashing with flags defined in the XRC patches, which are not merged yet but which are already in use and are likely to be merged soon. This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:32 -07:00
Harvey Harrison	3371836383	IB: Replace remaining __FUNCTION__ occurrences with __func__ __FUNCTION__ is gcc-specific, use __func__ instead. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:01:10 -07:00
Jon Mason	4fa45725df	RDMA/cxgb3: Fix iwch_create_cq() off-by-one error The cxgb3 driver is unnecessarily decreasing the number of CQ entries by one when creating a CQ. This will cause the CQ not to have as many entries as requested by the user if the user requests a power of 2 size. Signed-off-by: Jon Mason <jon@opengridcomputing.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-03-09 13:54:12 -07:00
Jon Mason	1bab74e691	RDMA/cxgb3: Return correct max_inline_data when creating a QP Set cap.max_inline_data to the actual max inline data that the adapter supports, so that userspace apps see the right value returned. Signed-off-by: Jon Mason <jon@opengridcomputing.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-29 13:53:18 -08:00
Steve Wise	8176d297c7	RDMA/cxgb3: Fix the T3A workaround checks Correctly work around T3A issues by checking "hwtype != T3A" instead of "hwtype == T3B". This will be needed for new hardware types. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:17:47 -08:00
Steve Wise	7f049f2f42	RDMA/cxgb3: Hold rtnl_lock() around ethtool get_drvinfo call Currently the call into cxgb3 to get the driver info is not serialized. The iw_cxgb3 module needs to hold the rtnl_lock around the ethtool ops call like dev_ioctl() does. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:26 -08:00
Steve Wise	9a7666494b	RDMA/cxgb3: Set the max_qp_init_rd_atom attribute in query_device The device attribute max_qp_init_rd_atom is not getting set in cxgb3's query_device method. Version 1.0.4 of librdmacm now validates the user's requested initiator and responder resources against the max supported by the device. Since iw_cxgb3 wasn't setting this attribute (and it defaulted to 0), all rdma_connect()s fail if there are initiator resources requested by the app. Fix this by setting the correct value in iwch_query_device(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-13 15:27:00 -08:00
WANG Cong	6abb6ea80b	RDMA/cxgb3: Check return of kmalloc() in iwch_register_device() Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> [ Also remove cast from void * return of kmalloc() as suggested by Jesper Juhl <jesper.juhl@gmail.com>. ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:26 -07:00
Roland Dreier	f7c6a7b5d5	IB/uverbs: Export ib_umem_get()/ib_umem_release() to modules Export ib_umem_get()/ib_umem_release() and put low-level drivers in control of when to call ib_umem_get() to pin and DMA map userspace, rather than always calling it in ib_uverbs_reg_mr() before calling the low-level driver's reg_user_mr method. Also move these functions to be in the ib_core module instead of ib_uverbs, so that driver modules using them do not depend on ib_uverbs. This has a number of advantages: - It is better design from the standpoint of making generic code a library that can be used or overridden by device-specific code as the details of specific devices dictate. - Drivers that do not need to pin userspace memory regions do not need to take the performance hit of calling ib_umem_get(). For example, although I have not tried to implement it in this patch, the ipath driver should be able to avoid pinning memory and just use copy_{to,from}_user() to access userspace memory regions. - Buffers that need special mapping treatment can be identified by the low-level driver. For example, it may be possible to solve some Altix-specific memory ordering issues with mthca CQs in userspace by mapping CQ buffers with extra flags. - Drivers that need to pin and DMA map userspace memory for things other than memory regions can use ib_umem_get() directly, instead of hacks using extra parameters to their reg_phys_mr method. For example, the mlx4 driver that is pending being merged needs to pin and DMA map QP and CQ buffers, but it does not need to create a memory key for these buffers. So the cleanest solution is for mlx4 to call ib_umem_get() in the create_qp and create_cq methods. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-08 18:00:37 -07:00
Roland Dreier	ed23a72778	IB: Return "maybe missed event" hint from ib_req_notify_cq() The semantics defined by the InfiniBand specification say that completion events are only generated when a completion is added to a completion queue (CQ) after completion notification is requested. In other words, this means that the following race is possible: while (CQ is not empty) ib_poll_cq(CQ); // new completion is added after while loop is exited ib_req_notify_cq(CQ); // no event is generated for the existing completion To close this race, the IB spec recommends doing another poll of the CQ after requesting notification. However, it is not always possible to arrange code this way (for example, we have found that NAPI for IPoIB cannot poll after requesting notification). Also, some hardware (eg Mellanox HCAs) actually will generate an event for completions added before the call to ib_req_notify_cq() -- which is allowed by the spec, since there's no way for any upper-layer consumer to know exactly when a completion was really added -- so the extra poll of the CQ is just a waste. Motivated by this, we add a new flag "IB_CQ_REPORT_MISSED_EVENTS" for ib_req_notify_cq() so that it can return a hint about whether a completion may have been added before the request for notification. The return value of ib_req_notify_cq() is extended so: < 0 means an error occurred while requesting notification == 0 means notification was requested successfully, and if IB_CQ_REPORT_MISSED_EVENTS was passed in, then no events were missed and it is safe to wait for another event. > 0 is only returned if IB_CQ_REPORT_MISSED_EVENTS was passed in. It means that the consumer must poll the CQ again to make sure it is empty to avoid the race described above. We add a flag to enable this behavior rather than turning it on unconditionally, because checking for missed events may incur significant overhead for some low-level drivers, and consumers that don't care about the results of this test shouldn't be forced to pay for the test. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-06 21:18:11 -07:00
Michael S. Tsirkin	f4fd0b224d	IB: Add CQ comp_vector support Add a num_comp_vectors member to struct ib_device and extend ib_create_cq() to pass in a comp_vector parameter -- this parallels the userspace libibverbs API. Update all hardware drivers to set num_comp_vectors to 1 and have all ULPs pass 0 for the comp_vector value. Pass the value of num_comp_vectors to userspace rather than hard-coding a value of 1. We want multiple CQ event vector support (via MSI-X or similar for adapters that can generate multiple interrupts), but it's not clear how many vectors we want, or how we want to deal with policy issues such as how to decide which vector to use or how to set up interrupt affinity. This patch is useful for experimenting, since no core changes will be necessary when updating a driver to support multiple vectors, and we know that we want to make at least these changes anyway. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-06 21:18:11 -07:00
Steve Wise	1860cdf802	RDMA/cxgb3: Fail qp creation if the requested max_inline is too large Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-04-30 17:30:28 -07:00
Joachim Fenkes	1912ffbb88	IB: Set class_dev->dev in core for nice device symlink All RDMA drivers except ehca set class_dev->dev to their dma_device value (ehca leaves this unset). dma_device is the only value that makes any sense, so move this assignment to core/sysfs.c. This reduces the duplicated code in the rest of the drivers and gives ehca a nice /sys/class/infiniband/ehcaX/device symlink. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-04-24 21:30:38 -07:00
Steve Wise	d601347188	RDMA/cxgb3: Handle build_phys_page_list() failure in iwch_reregister_phys_mem() Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-03-22 14:40:16 -07:00
Steve Wise	e64518f373	RDMA/cxgb3: Fix MR permission problems Fix memory region permission problems: - remove useless and redundant iwch_mem_perms enum. - create ib_to_tpt_access_rights() for mapping ib access rights to T3 TPT permissions. - create ib_to_mwbind_access_rights() for mapping ib access rights to T3 MWBIND WR permissions. - fix up the mem reg code to utilize the new functions. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-03-06 12:51:02 -08:00
Steve Wise	2df50da00e	RDMA/cxgb3: Move QP to error on destroy if the state is IDLE Change iwch_destroy_qp() to always move the QP to ERROR and let iwch_modify_qp() decide what to do. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-03-06 12:50:45 -08:00
Steve Wise	aeb100e246	RDMA/cxgb3: Don't use mm after it's freed in iwch_mmap() Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-03-06 11:48:17 -08:00
Adrian Bunk	2b540355cd	RDMA/cxgb3: cleanups - don't mark static functions in C files as inline - gcc should know best whether inlining makes sense - never compile the unused cxio_dbg.c - make the following needlessly global functions static: - cxio_hal.c: cxio_hal_clear_qp_ctx() - iwch_provider.c: iwch_get_qp() - remove the following unused global functions: - cxio_hal.c: cxio_allocate_stag() - cxio_resource.: cxio_hal_get_rhdl() - cxio_resource.: cxio_hal_put_rhdl() Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-23 13:10:43 -08:00
Steve Wise	c52daa2976	RDMA/cxgb3: Remove Open Grid Computing copyrights in iw_cxgb3 driver Remove the Open Grid Computing copyright. It shouldn't be there. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-16 13:57:35 -08:00
Steve Wise	b038ced7b3	RDMA/cxgb3: Add driver for Chelsio T3 RNIC Add an RDMA/iWARP driver for the Chelsio T3 1GbE and 10GbE adapters. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-12 16:16:18 -08:00
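
The sizing argument in commit 273748cc90 (8 bytes of PBL per registered page, written to the adapter in page-sized chunks) can be illustrated with a small user-space sketch. This is not the driver's code: `pbl_bytes`, `write_pbl_chunked`, `CHUNK_ENTRIES`, and the plain array standing in for adapter memory are hypothetical names chosen for illustration, and the page addresses are synthetic (contiguous from `base`), which real pinned user memory generally is not.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PBL_ENTRY_BYTES 8u  /* one 64-bit entry per page, per the commit message */
#define CHUNK_ENTRIES   512 /* assumed 4096-byte staging page / 8-byte entries */

/* Bytes of PBL needed to describe region_bytes of memory (rounds up to whole pages). */
static size_t pbl_bytes(uint64_t region_bytes, uint32_t page_bytes)
{
    uint64_t pages = (region_bytes + page_bytes - 1) / page_bytes;
    return (size_t)(pages * PBL_ENTRY_BYTES);
}

/*
 * Build and "write" a PBL for npages pages using only a fixed, page-sized
 * staging buffer. adapter_pbl is an ordinary array standing in for adapter
 * memory. Returns the number of chunks written.
 */
static size_t write_pbl_chunked(uint64_t *adapter_pbl, uint64_t base,
                                uint32_t page_bytes, size_t npages)
{
    uint64_t staging[CHUNK_ENTRIES];
    size_t done = 0, chunks = 0;

    while (done < npages) {
        size_t n = npages - done;
        if (n > CHUNK_ENTRIES)
            n = CHUNK_ENTRIES;

        /* Fill one chunk with synthetic page addresses. */
        for (size_t i = 0; i < n; i++)
            staging[i] = base + (uint64_t)(done + i) * page_bytes;

        /* Copy just this chunk to its final location. */
        memcpy(adapter_pbl + done, staging, n * sizeof(staging[0]));
        done += n;
        chunks++;
    }
    return chunks;
}
```

With these assumptions, a 64 MB region of 4 KB pages has 16384 pages, so `pbl_bytes(64ULL << 20, 4096)` gives 131072 bytes (128 KB), matching the figure in the commit message, while `write_pbl_chunked` covers the same region in 32 chunks without ever needing more than one 4 KB staging buffer.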

32 Commits
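
The return-value convention described in commit ed23a72778 suggests a drain-and-rearm loop: poll until the CQ is empty, request notification, and poll again whenever the request returns a positive "maybe missed" hint. The sketch below models that loop in plain user-space C. `struct toy_cq`, `toy_poll_cq`, `toy_req_notify_cq`, and `drain_and_arm` are hypothetical stand-ins, not the kernel's `ib_poll_cq()` or `ib_req_notify_cq()`, and the `late_arrival` field exists only to simulate a completion landing after the poll loop exits.

```c
#include <stdio.h>

/* Toy stand-in for a completion queue; not the kernel's struct ib_cq. */
struct toy_cq {
    int pending;      /* completions waiting to be polled */
    int late_arrival; /* completions that land just before notification is armed */
    int armed;        /* nonzero once notification has been requested */
};

/* Returns 1 and consumes one completion if any is pending, else 0. */
static int toy_poll_cq(struct toy_cq *cq)
{
    if (cq->pending == 0)
        return 0;
    cq->pending--;
    return 1;
}

/*
 * Mimics the convention from the commit message:
 *   < 0  error, == 0 armed and nothing missed, > 0 a completion may have been missed.
 * The positive value is only returned when report_missed is set.
 */
static int toy_req_notify_cq(struct toy_cq *cq, int report_missed)
{
    /* Simulate the race: completions arrive after the poll loop exited. */
    cq->pending += cq->late_arrival;
    cq->late_arrival = 0;

    cq->armed = 1;
    return (report_missed && cq->pending > 0) ? 1 : 0;
}

/* Drain the CQ, arm it, and repeat while the hint says something was missed. */
static int drain_and_arm(struct toy_cq *cq)
{
    int handled = 0;
    int ret;

    do {
        while (toy_poll_cq(cq) > 0)
            handled++;
        ret = toy_req_notify_cq(cq, 1);
    } while (ret > 0);

    if (ret < 0) {
        fprintf(stderr, "notify request failed: %d\n", ret);
        return ret;
    }
    return handled;
}
```

Starting from a queue with 3 pending completions and 2 late arrivals, `drain_and_arm` handles all 5 before returning with the queue empty and armed; a consumer that ignored the positive return value would leave the 2 late completions sitting in the queue with no event to announce them.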