linux

Commit Graph

Author	SHA1	Message	Date
Bart Van Assche	e012f3639c	IB/srp: Fix srp_map_data() error paths Ensure that req->nmdesc is set correctly in srp_map_sg() if mapping fails. Avoid that mapping failure causes a memory descriptor leak. Report srp_map_sg() failure to the caller. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Laurence Oberman <loberman@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-12 14:18:55 -04:00
Bart Van Assche	77269cdfca	IB/srp: Document srp_map_data() return value Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Laurence Oberman <loberman@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-12 14:18:55 -04:00
Bart Van Assche	6ec2ba02e6	IB/srp: Fix a comment The free request list was removed through patch "IB/srp: Use block layer tags". Hence update a comment that refers to that free request list. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Laurence Oberman <loberman@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-12 14:18:55 -04:00
Bart Van Assche	1d3d98c4cf	IB/srp: Fix a spelling error in a source code comment Change one occurrence of "boundries" into "boundaries". Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Laurence Oberman <loberman@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-12 14:18:54 -04:00
Doug Ledford	94d7f1a255	Merge branches 'hfi1' and 'iw_cxgb4' into k.o/for-4.7	2016-05-05 16:42:09 -04:00
Hariprasad S	6973627968	RDMA/iw_cxgb4: remove abort_connection() usage from ep_timeout() Use c4iw_ep_disconnect() instead. This is part of getting rid of abort_connection() altogether so we properly clean up on send_abort() failures. This is the last user of abort_connection(), so remove it too. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-05 16:11:14 -04:00
Hariprasad S	c00dcbafac	RDMA/iw_cxgb4: move QP -> ERROR on fatal disconnect errors In c4iw_ep_disconnect(), if we fail to initiate a close operation, then move the qp to ERROR to disassociate the ep from the qp. Failure to do this will leak the ep resources. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-05 16:11:14 -04:00
Hariprasad S	fd6aabe48c	RDMA/iw_cxgb4: don't use abort_connection in process_mpa_request() Instead return whether the caller needs to disconnect. This is part of getting rid of abort_connection() altogether so we properly clean up on send_abort() failures. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-05 16:11:14 -04:00
Hariprasad S	eaf4c6d46a	RDMA/iw_cxgb4: remove abort_connection() usage from accept/reject Use c4iw_ep_disconnect() instead. This is part of getting rid of abort_connection() altogether so we properly clean up on send_abort() failures. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-05 16:11:14 -04:00
Hariprasad S	fef4422d00	RDMA/iw_cxgb4: free resources when send_flowc() fails Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-05 16:11:14 -04:00
Hariprasad S	f8e1e1d137	RDMA/iw_cxgb4: remove connection abort from process_mpa_reply Instead, have the caller, rx_data() handle the close/abort like it does for process_mpa_request(). This is part of getting rid of abort_connection() altogether so we properly clean up on send_abort() failures. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-05 16:11:14 -04:00
Hariprasad S	6e410d8f71	RDMA/iw_cxgb4: ensure eps don't get freed while the mutex is held In rx_data(), with the ep in FPDU_MODE, refcnt=2, if we get unexpected streaming data, we call c4iw_modify_rc_qp() and move the qp from RTS -> TERMINATE. In c4iw_modify_rc_qp(), if rdma_fini() returns an error, the ep will be dereferenced (refcnt=1). Then rx_data() calls c4iw_ep_disconnect() which starts the close operation. But if send_halfclose() fails in c4iw_ep_disconnect(), we will call release_ep_resources() derefing the ep which reduces the refcnt to 0 and frees the ep. However, we still hold the ep mutex at that point, so we have a touch-after-free bug. There is a similar issue where peer_close() calls c4iw_ep_disconnect(). The solution is to add a reference to the ep in c4iw_ep_disconnect() after acquiring the mutex, and release it after releasing the mutex. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-05 16:11:14 -04:00
Hariprasad S	88bc230dc6	RDMA/iw_cxgb4: stop ep timer on close failure In c4iw_ep_disconnect(), if we start the ep timer to begin a close, but send_halfclose() fails, we need to stop the timer and send a CLOSE event up to the IWCM before releasing the resources. Otherwise, we can crash when the ep timer fires if the ep is referencing a previous instance of the device. This can happen as part of adapter reset/recovery, for instance. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-05 16:11:14 -04:00
Hariprasad S	9dec900c20	RDMA/iw_cxgb4: release ep resources on accept arp failure If ARP fails before the CPL_PASS_ACCEPT_RPL is seen by hardware, the tid will be stuck in SYN_PEND and never released. So create an arp failure handler specifically for this message to release the endpoint resources. In pass_accept_rpl_arp_failure(), put the parent endpoint so it will be freed when destroyed. Also we don't need to call release_tid() here because _c4iw_free_ep() calls cxgb4_remove_tid() which releases the hwtid. If we get an ABORT_REQ_RSS instead of a PASS_ESTABLISH (because the peer's ACK to our SYN is never received), then put the parent as well in peer_abort(). Treat accept_cr() failures just like arp failures: put the parent ep and release the ep resources destroying the tid The ARP failure handlers are called in an atomic context, so we need to schedule some of the processing which might block. Namely _c4iw_free_ep() which needs a mutex. So create a "special" CPL opcode and handler and schedule it via sched() to be run by process_work() in a blockable context. Also rework the active open arp failure handler to make use of release_ep_resources(). This allows both the active and passive arp failure handlers to use the same deferred cleanup function. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-05 16:11:14 -04:00
Christoph Hellwig	9c674815d3	IB/iser: Fix max_sectors calculation iSER currently has a couple places that set max_sectors in either the host template or SCSI host, and all of them get it wrong. This patch instead uses a single assignment that (hopefully) gets it right: the max_sectors value must be derived from the number of segments in the FR or FMR structure, but actually be one lower than the page size multiplied by the number of sectors, as it has to handle the case of non-aligned I/O. Without this I get trivial to reproduce hangs when running xfstests (on XFS) over iSER to Linux targets. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-05 12:41:24 -04:00
Florian Westphal	4c8bb95921	RDMA/nes: don't leak skb if carrier down Alternatively one could free the skb, OTOH I don't think this test is useful so just remove it. Cc: <linux-rdma@vger.kernel.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 21:11:09 -04:00
Tatyana Nikolova	ccea5f0f01	RDMA/i40iw: Fix for removing quad hash entries Fix for removing a quad hash entry when the corresponding quad hash entry hasn't been added, which is the case in loopback connections Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:55 -04:00
Tatyana Nikolova	f8a4e76c75	RDMA/i40iw: Fix for checking if the QP is destroyed Fix for checking if the QP associated with a completion has been destroyed while processing CQ elements. If that is the case, move the CQ head to the next element and continue completion processing. Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:55 -04:00
Shiraz Saleem	6c2f76197d	RDMA/i40iw: Fix for using one sge for RDMA READ A check is added to validate the requested sge number. iWARP doesn't support multiple sg elements for RDMA READ work requests. Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:55 -04:00
Shiraz Saleem	df2d96c3d0	RDMA/i40iw: Fix for the size of kernel mode SQ Fix to calculate the SQ size based on the max frag_count, requested by the application instead of overwriting it with the max supported frag_count Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:55 -04:00
Mohammad Khan	84a4c24663	RDMA/i40iw: Fix for a NOP WQE size Fix for filling in the WQE size for NOP Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:55 -04:00
Chien Tin Tung	8e9f04a7c7	RDMA/i40iw: Correct STag mask to min of 14 bits STag index mask is calculated incorrectly, missing the 14 bits minimum requirement. Add max macro to use either # of MRs or 14 bits in the mask size calculation. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:54 -04:00
Shiraz Saleem	9510b0666e	RDMA/i40iw: Fixes for WQE alignment Invalidation after every WQE write is changed to invalidate only if required. NOPs are padded so that WQE writes are aligned to 64B boundary. Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:54 -04:00
Ismail, Mustafa	c2b75ef7dc	RDMA/i40iw: Adding queue drain functions Adding sq and rq drain functions, which block until all previously posted wr-s in the specified queue have completed. A completion object is signaled to unblock the thread, when the last cqe for the corresponding queue is processed. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:54 -04:00
Ismail, Mustafa	fa41537961	RDMA/i40iw: Fix SD calculation for initial HMC creation Correct SD calculation by using base address returned from commit FPM. This alleviates any assumptions on resource ordering and alignment requirement. Also consolidate SD estimation code into i40iw_est_sd(). Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:54 -04:00
Ismail, Mustafa	20c61f7e88	RDMA/i40iw: Fix endian issues and warnings Fix endian warnings and errors due to u32 stored to u16. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:54 -04:00
Ismail, Mustafa	b7aee855d3	RDMA/i40iw: Add base memory management extensions Implement fast register mr, Local invalidate, send with invalidate and RDMA read with invalidate. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:54 -04:00
Ismail, Mustafa	eb9b0379f8	RDMA/i40iw: Initialize max enabled vfs variable Initialize max enabled vfs to max rdma vfs instead of 0. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:54 -04:00
Ismail, Mustafa	5c1c1908c1	RDMA/i40iw: Correct return code check in add_pble_pool Move return code check to immediately after i40iw_hmc_sd_one call where it is set instead of outside the then statement. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:54 -04:00
Ismail, Mustafa	f69c333162	RDMA/i40iw: Add virtual channel message queue Queue users of virtual channel on a waitqueue until the channel is clear instead of failing the call when the channel is occupied. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:54 -04:00
Ismail, Mustafa	f606d89330	RDMA/i40iw: Remove unused code and fix warning Remove unused code and fix warning. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:54 -04:00
Ismail, Mustafa	4920dc311c	RDMA/i40iw: Populate vendor_id and vendor_part_id fields Populate PCI info fields from PCI device structure. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:53 -04:00
Ismail, Mustafa	df35630af3	RDMA/i40iw: Set vendor_err only if there is an actual error Add a check for cq_poll_info.error before setting vendor_err instead of always setting it. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:53 -04:00
Ismail, Mustafa	996abf0a52	RDMA/i40iw: Add qp table lock around AE processing QP may be freed during Async Event processing. Add a lock around QP table to prevent it. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:53 -04:00
Ismail, Mustafa	36a4793350	RDMA/i40iw: Do not set self-referencing pointer to NULL after free iwqp->allocated_buffer is a self-referencing pointer to iwqp. Do not set iwqp->allocated_buffer to NULL after freeing it. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:53 -04:00
Ismail, Mustafa	bd57aeae56	RDMA/i40iw: Correct max message size in query port Fix to correct max reported message size in query port. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:53 -04:00
Ismail, Mustafa	b3437e0d5a	RDMA/i40iw: Fix refused connections Make sure cm_node is setup before sending SYN packet and ORD/IRD negotiation. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:53 -04:00
Ismail, Mustafa	23ef48ad6c	RDMA/i40iw: Correct QP size calculation Include inline data size as part of SQ size calculation. RQ size calculation uses only number of SGEs and does not support 96 byte WQE size. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:53 -04:00
Ismail, Mustafa	6b90036587	RDMA/i40iw: Fix overflow of region length Change region_length to u64 as a region can be > 4GB. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:53 -04:00
Jubin John	d35cf74492	IB/hfi1: Serialize hrtimer function calls hrtimer functions do not guarantee serialization, so we extend the cca_timer_lock to cover the hrtimer_forward_now() in the hrtimer callback handler and the hrtimer_start() in process_becn(). This prevents races between these 2 functions to update the hrtimer state leading to problems such as: kernel BUG at kernel/hrtimer.c:1282! encountered during validation of the CCA feature. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:29 -04:00
Dean Luick	1cbaa67035	IB/hfi1: Fix MAD port poll for active cables A MAD directive to start polling must go through the normal link tuning and start steps in order to correctly handle active cables. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:29 -04:00
Dean Luick	015e91fbc9	IB/hfi1: Correctly report neighbor link down reason The code to save the link down reason for reporting to the SMA was in a location before the actual reason was read. Move the SMA link down reason assignment to a better location. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:29 -04:00
Dean Luick	feb831ddf2	IB/hfi1: Use the neighbor link down reason only when valid The 8051 uses a link down reason to inform the driver why the link went down. The neighbor planned link down reason code is only valid when a link down idle message is received by the 8051. Enhance the explanation on why the link went down. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:29 -04:00
Dean Luick	f9b5635cbe	IB/hfi1: Ignore link downgrade with 0 lanes Versions of the 8051 firmware < 0.38 may report a link failure as a link downgrade with a width of 0 followed by a link down notification. Ignore the zero width downgrade notification - the driver should follow the link down path. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:29 -04:00
Dean Luick	8f000f7f6e	IB/hfi1: Add RSM rule for user FECN handling Add a receive side mapping rule to extract expected user packets with the FECN bit set and place them in an eager buffer. This will allow user libraries to recognize that a FECN was sent when using header suppression and respond appropriately. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:29 -04:00
Dean Luick	b12349ae13	IB/hfi1: Create a routine to set a receive side mapping rule Move the rule setting code into its own routine for improved searchability and reuse. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:28 -04:00
Dean Luick	4a818bedf7	IB/hfi1: Move QOS decision logic into its own function The decision to use QOS affects other resource allocation. Move the QOS decision logic into its own function so it can be called by other interested parties. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:28 -04:00
Dean Luick	372cc85a13	IB/hfi1: Extract RSM map table init from QOS Refactor the allocation, tracking, and writing of the RSM map table into its own set of routines. This will allow the map table to be passed to multiple users to fill in as needed. Start with the original user, QOS. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:28 -04:00
Jianxin Xiong	44306f15f0	IB/hfi1: Reduce kernel context pio buffer allocation The pio buffers were pooled evenly among all kernel contexts and user contexts. However, the demand from kernel contexts is much lower than user contexts. This patch reduces the allocation for kernel contexts and thus makes more credits available for PSM, helping performance. This is especially useful on high core-count systems where large numbers of contexts are used. A new context type SC_VL15 is added to distinguish the context used for VL15 from other kernel contexts. The reason is that VL15 needs to support 2KB sized packet while other kernel contexts need only support packets up to the size determined by "piothreshold", which has a default value of 256. The new allocation method allows triple buffering of largest pio packets configured for these contexts. This is sufficient to maintain verbs performance. The largest pio packet size is 2048B for VL15 and "piothreshold" for other kernel contexts. A cap is applied to "piothreshold" to avoid excessive buffer allocation. The special case that SDMA is disabled is handled differently. In that case, the original pooling allocation is used to better support the much higher pio traffic. Notice that if adaptive pio is disabled (piothreshold==0), the pio buffer size doesn't matter for non-VL15 kernel send contexts when SDMA is enabled because pio is not used at all on these contexts and thus the new allocation is still valid. If SDMA is disabled then pooling allocation is used as mentioned in previous paragraph. Adjustment is also made to the calculation of the credit return threshold for the kernel contexts. Instead of purely based on the MTU size, a percentage based threshold is also considered and the smaller one of the two is chosen. This is necessary to ensure that with the reduced buffer allocation credits are returned in time to avoid unnecessary stall in the send path. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mark Debbage <mark.debbage@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:28 -04:00
Jubin John	0852d241f4	IB/hfi1: Change default number of user contexts Change the default number of user contexts to the number of real (non-HT) cpu cores in order to reduce the division of hfi1 hardware contexts in the case of high core counts with hyper-threading enabled. Reviewed-by: Dean Luick <dean.luick@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-04-28 16:32:28 -04:00

589664 Commits
