Commit Graph

136 Commits

Author SHA1 Message Date
Vladimir Neyelov c8c16d3bae IB/iser: Fix connection teardown race condition
Under heavy iser target(scst) start/stop stress during login/logout
on iser intitiator side happened trace call provided below.

The function iscsi_iser_slave_alloc iser_conn pointer could be NULL,
due to the fact that function iscsi_iser_conn_stop can be called before
and free iser connection. Let's protect that flow by introducing global mutex.

BUG: unable to handle kernel paging request at 0000000000001018
IP: [<ffffffffc0426f7e>] iscsi_iser_slave_alloc+0x1e/0x50 [ib_iser]
Call Trace:
? scsi_alloc_sdev+0x242/0x300
scsi_probe_and_add_lun+0x9e1/0xea0
? kfree_const+0x21/0x30
? kobject_set_name_vargs+0x76/0x90
? __pm_runtime_resume+0x5b/0x70
__scsi_scan_target+0xf6/0x250
scsi_scan_target+0xea/0x100
iscsi_user_scan_session.part.13+0x101/0x130 [scsi_transport_iscsi]
? iscsi_user_scan_session.part.13+0x130/0x130 [scsi_transport_iscsi]
iscsi_user_scan_session+0x1e/0x30 [scsi_transport_iscsi]
device_for_each_child+0x50/0x90
iscsi_user_scan+0x44/0x60 [scsi_transport_iscsi]
store_scan+0xa8/0x100
? common_file_perm+0x5d/0x1c0
dev_attr_store+0x18/0x30
sysfs_kf_write+0x37/0x40
kernfs_fop_write+0x12c/0x1c0
__vfs_write+0x18/0x40
vfs_write+0xb5/0x1a0
SyS_write+0x55/0xc0

Fixes: 318d311e8f ("iser: Accept arbitrary sg lists mapping if the device supports it")
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: Vladimir Neyelov <vladimirn@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimbeg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 11:45:25 -04:00
Linus Torvalds ac1820fb28 This is a tree wide change and has been kept separate for that reason.
Bart Van Assche noted that the ib DMA mapping code was significantly
 similar enough to the core DMA mapping code that with a few changes
 it was possible to remove the IB DMA mapping code entirely and
 switch the RDMA stack to use the core DMA mapping code.  This resulted
 in a nice set of cleanups, but touched the entire tree.  This branch
 will be submitted separately to Linus at the end of the merge window
 as per normal practice for tree wide changes like this.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYo06oAAoJELgmozMOVy/d9Z8QALedWHdu98St1L0u2c8sxnR9
 2zo/4sF5Vb9u7FpmdIX32L4SQ9s9KhPE8Qp8NtZLf9v10zlDebIRJDpXknXtKooV
 CAXxX4sxBXV27/UrhbZEfXiPrmm6ccJFyIfRnMU6NlMqh2AtAsRa5AC2/RMp8oUD
 Med97PFiF0o6TD22/UH1VFbRpX1zjaKyqm7a3as5sJfzNA+UGIZAQ7Euz8000DKZ
 xCgVLTEwS0FmOujtBkCst7xa9TjuqR1HLOB4DdGvAhP6BHdz2yamM7Qmh9NN+NEX
 0BtjsuXomtn6j6AszGC+bpipCZh3NUigcwoFAARXCYFHibBvo4DPdFeGsraFgXdy
 1+KyR8CCeQG3Aly5Vwr264RFPGkGpwMj8PsBlXgQVtrlg4rriaCzOJNmIIbfdADw
 ftqhxBOzReZw77aH2s+9p2ILRfcAmPqhynLvFGFo9LBvsik8LVso7YgZN0xGxwcI
 IjI/XGC8UskPVsIZBIYA6sl2bYzgOjtBIHiXjRrPlW3uhduIXLrvKFfLPP/5XLAG
 ehLXK+J0bfsyY9ClmlNS8oH/WdLhXAyy/KNmnj5bRRm9qg6BRJR3bsOBhZJODuoC
 XgEXFfF6/7roNESWxowff7pK0rTkRg/m/Pa4VQpeO+6NWHE7kgZhL6kyIp5nKcwS
 3e7mgpcwC+3XfA/6vU3F
 =e0Si
 -----END PGP SIGNATURE-----

Merge tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma DMA mapping updates from Doug Ledford:
 "Drop IB DMA mapping code and use core DMA code instead.

  Bart Van Assche noted that the ib DMA mapping code was significantly
  similar enough to the core DMA mapping code that with a few changes it
  was possible to remove the IB DMA mapping code entirely and switch the
  RDMA stack to use the core DMA mapping code.

  This resulted in a nice set of cleanups, but touched the entire tree
  and has been kept separate for that reason."

* tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits)
  IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it
  IB/core: Remove ib_device.dma_device
  nvme-rdma: Switch from dma_device to dev.parent
  RDS: net: Switch from dma_device to dev.parent
  IB/srpt: Modify a debug statement
  IB/srp: Switch from dma_device to dev.parent
  IB/iser: Switch from dma_device to dev.parent
  IB/IPoIB: Switch from dma_device to dev.parent
  IB/rxe: Switch from dma_device to dev.parent
  IB/vmw_pvrdma: Switch from dma_device to dev.parent
  IB/usnic: Switch from dma_device to dev.parent
  IB/qib: Switch from dma_device to dev.parent
  IB/qedr: Switch from dma_device to dev.parent
  IB/ocrdma: Switch from dma_device to dev.parent
  IB/nes: Remove a superfluous assignment statement
  IB/mthca: Switch from dma_device to dev.parent
  IB/mlx5: Switch from dma_device to dev.parent
  IB/mlx4: Switch from dma_device to dev.parent
  IB/i40iw: Remove a superfluous assignment statement
  IB/hns: Switch from dma_device to dev.parent
  ...
2017-02-25 13:45:43 -08:00
Linus Torvalds cdc194705d SCSI misc on 20170220
This update includes the usual round of major driver updates (ncr5380,
 ufs, lpfc, be2iscsi, hisi_sas, storvsc, cxlflash, aacraid,
 megaraid_sas, ).  There's also an assortment of minor fixes and the
 major update of switching a bunch of drivers to pci_alloc_irq_vectors
 from Christoph.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJYq5adAAoJEAVr7HOZEZN4bjUP/Atk7CSZVnC75pcYmncbEGCx
 ysOlEHK4uW2HhiAYk3PlYMk+pKrMHet2zsbbM9PHJfopdOHZ7Sq1+UZZVeqE1Zun
 8pe0NhON+fZx7XAnevdEvnSSULQZ+AGfjZO72iUwkJiN3ozYaFtCITOyn49l4GpR
 ra9emskBh7CQOFW2voGn1AKeDijPYGx3+TO4AUrWjVMiByR06gb1bmImx+ljiUrs
 jzRJPfrt90ORcTdpMateyN2EXxudcASMhX03SJ6fRI84hPAhMCROMbTv8RnzOTE4
 DPbnvbYUowlHt43iUhJHSwGdkRRaRBnkzQENBp1fNrNzZgF6vB7+kShxbonrYB2p
 gC4ewaJr0BNj+HsUnvTpe3WseiPOcfsnBsKilPLKBlm2dCKEXqFox/dj/T1uexxg
 HoyFrl3u8fyEqVHrzRS4M9t/njWh0NFmXxb0wBdj+lkVFTRErGSKQ8SfOqshuSGs
 P8NN88jy8vC7uqgzKBJ+UH3ehzn3qfBxasFHIC/e2awY9FqKjHGTxKMmSVpjXVxy
 wCvE2FQ3k/qEj2XSM6f7/NGytlSOlju5q1rFtHPW2M+TFSh0LJWCnmVjR/Zle9em
 pBWmtIgCv8W5b41zL2H94nLWAZbfdrrNU/XnX88l47LKnmorte/PGhpxu36NEsMS
 VCgreQmFMdMRY+WzDWl1
 =cBQx
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This update includes the usual round of major driver updates (ncr5380,
  ufs, lpfc, be2iscsi, hisi_sas, storvsc, cxlflash, aacraid,
  megaraid_sas, ...).

  There's also an assortment of minor fixes and the major update of
  switching a bunch of drivers to pci_alloc_irq_vectors from Christoph"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (188 commits)
  scsi: megaraid_sas: handle dma_addr_t right on 32-bit
  scsi: megaraid_sas: array overflow in megasas_dump_frame()
  scsi: snic: switch to pci_irq_alloc_vectors
  scsi: megaraid_sas: driver version upgrade
  scsi: megaraid_sas: Change RAID_1_10_RMW_CMDS to RAID_1_PEER_CMDS and set value to 2
  scsi: megaraid_sas: Indentation and smatch warning fixes
  scsi: megaraid_sas: Cleanup VD_EXT_DEBUG and SPAN_DEBUG related debug prints
  scsi: megaraid_sas: Increase internal command pool
  scsi: megaraid_sas: Use synchronize_irq to wait for IRQs to complete
  scsi: megaraid_sas: Bail out the driver load if ld_list_query fails
  scsi: megaraid_sas: Change build_mpt_mfi_pass_thru to return void
  scsi: megaraid_sas: During OCR, if get_ctrl_info fails do not continue with OCR
  scsi: megaraid_sas: Do not set fp_possible if TM capable for non-RW syspdIO, change fp_possible to bool
  scsi: megaraid_sas: Remove unused pd_index from megasas_build_ld_nonrw_fusion
  scsi: megaraid_sas: megasas_return_cmd does not memset IO frame to zero
  scsi: megaraid_sas: max_fw_cmds are decremented twice, remove duplicate
  scsi: megaraid_sas: update can_queue only if the new value is less
  scsi: megaraid_sas: Change max_cmd from u32 to u16 in all functions
  scsi: megaraid_sas: set pd_after_lb from MR_BuildRaidContext and initialize pDevHandle to MR_DEVHANDLE_INVALID
  scsi: megaraid_sas: latest controller OCR capability from FW before sending shutdown DCMD
  ...
2017-02-21 11:51:42 -08:00
Christoph Hellwig b6a05c823f scsi: remove eh_timed_out methods in the transport template
Instead define the timeout behavior purely based on the host_template
eh_timed_out method and wire up the existing transport implementations
in the host templates.  This also clears up the confusion that the
transport template method overrides the host template one, so some
drivers have to re-override the transport template one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-06 19:10:03 -05:00
Bart Van Assche 61118cecf2 IB/iser: Switch from dma_device to dev.parent
Prepare for removal of ib_device.dma_device.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24 12:26:17 -05:00
Max Gurtovoy 83236f0157 IB/iser: remove unused variable from iser_conn struct
max_sectors calculation was fixed in commit:
9c674815d3 ("IB/iser: Fix max_sectors calculation").
Thus, iser_conn variable scsi_max_sectors is not needed anymore.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24 11:37:45 -05:00
Max Gurtovoy 1e5db6c31a IB/iser: Fix sg_tablesize calculation
For devices that can register page list that is bigger than
USHRT_MAX, we actually take the wrong value for sg_tablesize.
E.g: for CX4 max_fast_reg_page_list_len is 65536 (bigger than USHRT_MAX)
so we set sg_tablesize to 0 by mistake. Therefore, each IO that is
bigger than 4k splitted to "< 4k" chunks that cause performance degredation.
Remove wrong sg_tablesize assignment, and use the value that was set during
address resolution handler with the needed casting.

Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24 11:37:45 -05:00
Linus Torvalds 7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
Christoph Hellwig 9c674815d3 IB/iser: Fix max_sectors calculation
iSER currently has a couple places that set max_sectors in either the host
template or SCSI host, and all of them get it wrong.

This patch instead uses a single assignment that (hopefully) gets it right:
the max_sectors value must be derived from the number of segments in the
FR or FMR structure, but actually be one lower than the page size multiplied
by the number of sectors, as it has to handle the case of non-aligned I/O.

Without this I get trivial to reproduce hangs when running xfstests
(on XFS) over iSER to Linux targets.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05 12:41:24 -04:00
Sagi Grimberg 318d311e8f iser: Accept arbitrary sg lists mapping if the device supports it
If the device support arbitrary sg list mapping (device cap
IB_DEVICE_SG_GAPS_REG set) we allocate the memory regions with
IB_MR_TYPE_SG_GAPS and allow the block layer to pass us
gaps by skip setting the queue virt_boundary.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-04 11:59:35 -05:00
Roi Dayan 08ff089b12 IB/iser: Fix module init not cleaning up on error flow
Destroy workqueue on transport register error, also
release kmem cache on workqueue allocation error.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24 00:17:33 -05:00
Or Gerlitz 4a061b287b IB/ulps: Avoid calling ib_query_device
Instead, use the cached copy of the attributes present on the device.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-22 14:39:00 -05:00
Sagi Grimberg 630c3183ce IB/iser: Enable SG clustering
iser is perfectly capable supporting SG clustering as it translates
the SG list to a page vector. Enabling SG clustering can dramatically
reduce the number of SG elements, which doesn't make much of a difference
at this point, but with arbitrary SG list support, reducing the
number of SG elements can benefit greatly as as it would reduce
the length of the HW descriptors array.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 12:26:06 -04:00
Sagi Grimberg dd0107a089 IB/iser: set block queue_virt_boundary
The block layer can reliably guarantee that SG lists won't
contain gaps (page unaligned) if a driver set the queue
virt_boundary.

With this setting the block layer will:
- refuse merges if bios are not aligned to the virtual boundary
- split bios/requests that are not aligned to the virtual boundary
- or, bounce buffer SG_IOs that are not aligned to the virtual boundary

Since iser is working in 4K page size, set the virt_boundary to
4K pages. With this setting, we can now safely remove the bounce
buffering logic in iser.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 12:26:06 -04:00
Bart Van Assche 78fc3fc4cc IB/iser: Remove an unused variable
Detected this by compiling with W=1.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-22 18:37:14 -04:00
Geliang Tang 68a5e60436 IB/iser: fix a comment typo
Just fix a typo in the code comment.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-21 16:41:54 -04:00
Sagi Grimberg 3cffd93017 IB/iser: Add module parameter for always register memory
This module parameter forces memory registration even for
a continuous memory region. It is true by default as sending
an all-physical rkey with remote permissions might be insecure.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-25 10:46:51 -04:00
Jason Gunthorpe 256b7ad273 IB/iser: Use pd->local_dma_lkey
Replace all leys with  pd->local_dma_lkey. This driver does not support
iWarp, so this is safe.

The insecure use of ib_get_dma_mr is thus isolated to an rkey, and this
looks trivially fixed by forcing the use of registration in a future
patch.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:34 -04:00
Sagi Grimberg 7332bed085 IB/iser: Chain all iser transaction send work requests
Chaning of send work requests benefits performance by
reducing the send queue lock contention (acquired in
ib_post_send) and saves us HW doorbells which is posted
only once.

Currently, in normal IO flows iser does not chain the CDB send
work request with the registration work request. Also in PI
flows, signature work requests are not chained as well.

Lets chain those and post only once.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:33 -04:00
Sagi Grimberg df749cdc45 IB/iser: Support up to 8MB data transfer in a single command
iser support up to 512KB data transfer in a single scsi command.
This means that larger IOs will split to different request. While
iser can easily saturate FDR/EDR wires, some arrays are fine tuned
for 1MB (or larger) IO sizes, hence add an option to support larger
transfers (up to 8MB) if the device allows it.

Given that a few target implementations don't support data transfers
of more than 512KB by default and the fact that larger IO sizes require
more resources, we introduce a module parameter to determine the
maximum number of 512B sectors in a single scsi command.
Users that are interested in larger transfers can change this value given
that the target supports larger transfers.

At the moment, iser works in 4K pages granularity, In a later stage
we will get it to work with system page size instead.

IO operations that consists of N pages will need a page vector
of size N+1 in case the first SG element contains an offset. Given
that some devices allocates memory regions in powers of 2, this
means that allocating a region with N+1 pages, will result in
region resources allocation of the next power of 2. Since we don't
want that to happen, in case we are in the limit of IO size supported
and the first SG element has an offset, we align the SG list using a
bounce buffer (which is OK given that this is not likely to happen a lot).

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:32 -04:00
Sagi Grimberg 8d5944d803 IB/iser: Fix possible bogus DMA unmapping
If iser_initialize_task_headers() routine failed before
dma mapping, we should not attempt to unmap in cleanup_task().

Fixes: 7414dde0a6 (IB/iser: Fix race between iser connection ...)
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:28 -04:00
Sagi Grimberg 02816a8b88 IB/iser: Get rid of un-maintained counters
We don't update those anywhere in the code and they
seem pretty useless (no one seem to care about those).

qp_tx_queue_full: We never should get this
fmr_map_not_avail: We can never get to this
eh_abort_cnt: We don't monitor aborts

Go ahead and remove them.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:27 -04:00
Sagi Grimberg 1156cc80f8 IB/iser: Remove '.' from log message
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:27 -04:00
Sagi Grimberg 74ce897b7c IB/iser: Change minor assignments and logging prints
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:26 -04:00
Jenny Falkovich db0a6cbd21 IB/iser: Change some module parameters to be RO
While we're at it, use permission defines instead
of octal values and rearrange a little bit.

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:26 -04:00
Steve Wise 7854550ae6 RDMA/iser: Limit sgs to the device fastreg depth
Currently the sg tablesize, which dictates fast register page list
depth to use, does not take into account the limits of the rdma device.
So adjust it once we discover the device fastreg max depth limit.  Also
adjust the max_sectors based on the resulting sg tablesize.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28 22:54:47 -04:00
Sagi Grimberg 5bb6e543d2 IB/iser: DIX update
Following few recent Block integrity updates, we align the iSER data
integrity offload settings with:

- Deprecate pi_guard module param
- Expose support for DIX type 0.
- Use scsi_transfer_length for the transfer length
- Get pi_interval, ref_tag, ref_remap, bg_type and
  check_mask setting from scsi_cmnd

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-12-15 18:11:46 -08:00
Sagi Grimberg f0caef6d40 IB/iser: Terminate connection before cleaning inflight tasks
When closing the connection, we should first terminate the connection
(in case it was not previously terminated) to guarantee the QP is in
error state and we are done with servicing IO. Only then go ahead with
tasks cleanup via iscsi_conn_stop.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-12-15 18:11:44 -08:00
Sagi Grimberg 7414dde0a6 IB/iser: Fix race between iser connection teardown and scsi TMFs
In certain scenarios (target kill with live IO) scsi TMFs may race
with iser RDMA teardown, which might cause NULL dereference on iser IB
device handle (which might have been freed). In this case we take a
conditional lock for TMFs and check the connection state (avoid
introducing lock contention in the IO path). This is indeed best
effort approach, but sufficient to survive multi targets sudden death
while heavy IO is inflight.

While we are on it, add a nice kernel-doc style documentation.

Reported-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-12-15 18:11:44 -08:00
Ariel Nahum 3f562a0b8f IB/iser: Fix possible NULL derefernce ib_conn->device in session_create
If rdma_cm error event comes after ep_poll but before conn_bind, we
should protect against dereferncing the device (which may have been
terminated) in session_create and conn_create (already protected)
callbacks.

Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-12-15 18:11:44 -08:00
Minh Tran f4641ef701 IB/iser: Re-adjust CQ and QP send ring sizes to HW limits
Re-adjust max CQEs per CQ and max send_wr per QP according
to the resource limits supported by underlying hardware.

Signed-off-by: Minh Tran <minhduc.tran@emulex.com>
Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>
Acked-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-12-15 18:11:44 -08:00
Christoph Hellwig db5ed4dfd5 scsi: drop reason argument from ->change_queue_depth
Drop the now unused reason argument from the ->change_queue_depth method.
Also add a return value to scsi_adjust_queue_depth, and rename it to
scsi_change_queue_depth now that it can be used as the default
->change_queue_depth implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-24 14:45:27 +01:00
Christoph Hellwig c40ecc12cf scsi: avoid ->change_queue_depth indirection for queue full tracking
All drivers use the implementation for ramping the queue up and down, so
instead of overloading the change_queue_depth method call the
implementation diretly if the driver opts into it by setting the
track_queue_depth flag in the host template.

Note that a few drivers validated the new queue depth in their
change_queue_depth method, but as we never go over the queue depth
set during slave_configure or the sysfs file this isn't nessecary
and can safely be removed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
2014-11-24 14:45:12 +01:00
Sagi Grimberg f043032ef1 IB/iser: Set IP_CSUM as default guard type
In the future this will be a per-command parameter so we can lose it,
but in the mean time IP_CSUM is a lot lighter for SW layers to
compute, set it as default.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:10:53 -07:00
Sagi Grimberg dc05ac36f7 IB/iser: Fix/add kernel-doc style description in iscsi_iser.c
This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:46 -07:00
Ariel Nahum bba0a3c9d7 IB/iser: Change iscsi_conn_stop log level to info
Match to the debug level of all functions in connect/disconnect flows.

Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:42 -07:00
Sagi Grimberg 3a940daf6f IB/iser: Protect tasks cleanup in case IB device was already released
Bailout in case a task cleanup (iscsi_iser_cleanup_task) is called
after the IB device was removed (DEVICE_REMOVAL CM event).  We also
call iscsi_conn_stop with a lock taken to prevent DEVICE_REMOVAL and
tasks cleanup from racing.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Ariel Nahum ec370e2b63 IB/iser: Unbind at conn_stop stage
Previously we didn't need to unbind the iser_conn and iscsi_conn since
we always relied on iscsi daemon to teardown the connection and never
let it finish before we cleanup all that is needed in iser.  This is
not the case anymore (for DEVICE_REMOVAL event).  So avoid any possible
chance we cause iscsi_conn dereference after iscsi_conn was freed.

We also call iser_conn_terminate (safe to call multiple times) just
for the corner case of iscsi daemon stopping an old connection before
invoking endpoint removal (might happen if it was violently killed).

Notice we are unbinding under a lock - which is required.

Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Sagi Grimberg a4ee3539f6 IB/iser: Re-introduce ib_conn
Structure that describes the RDMA relates connection objects.  Static
member of iser_conn.

This patch does not change any functionality

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Sagi Grimberg 5716af6e52 IB/iser: Rename ib_conn -> iser_conn
Two reasons why we choose to do this:

1. No point today calling struct iser_conn by another name ib_conn
2. In the next patches we will restructure iser control plane representation
   - struct iser_conn: connection logical representation
   - struct ib_conn: connection RDMA layout representation

This patch does not change any functionality.

Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Sagi Grimberg 91eb1df39a IB/iser: Allow bind only when connection state is UP
We need to fail the bind operation if the iser connection state != UP
(started teardown) and this should be done under the state lock.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:42 -07:00
Ariel Nahum 9a6d3234a1 IB/iser: Replace connection waitqueue with completion object
Instead of waiting for events and condition changes of the iser
connection state, we wait for explicit completion of connection
establishment and teardown.

Separate connection establishment wait object from the teardown object
to avoid a situation where racing connection establishment and
teardown may concurrently wakeup each other.

ep_poll will wait for up_completion invoked by
iser_connected_handler() and iser release worker will wait for
flush_completion before releasing the connection.

Bound the completion wait with a 30 seconds timeout for cases where
iscsid (the user space iscsi daemon) is too slow or gone.

Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-08-01 15:10:05 -07:00
Ariel Nahum 504130c039 IB/iser: Protect iser state machine with a mutex
The iser connection state lookups and transitions are not fully protected.

Some transitions are protected with a spinlock, and in some cases the
state is accessed unprotected due to specific assumptions of the flow.

Introduce a new mutex to protect the connection state access. We use a
mutex since we need to also include a scheduling operations executed
under the state lock.

Each state transition/condition and its corresponding action will be
protected with the state mutex.

The rdma_cm events handler acquires the mutex when handling connection
events. Since iser connection state can transition to DOWN
concurrently during connection establishment, we bailout from
addr/route resolution events when the state is not PENDING.

This addresses a scenario where ep_poll retries expire during CMA
connection establishment. In this case ep_disconnect is invoked while
CMA events keep coming (address/route resolution, connected, etc...).

Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-08-01 15:10:05 -07:00
Ariel Nahum 0a6907588a IB/iser: Seperate iser_conn and iscsi_endpoint storage space
iser connection needs asynchronous cleanup completions which are
triggered in ep_disconnect.  As a result we are keeping the
corresponding iscsi_endpoint structure hanging for no good reason. In
order to avoid that, we seperate iser_conn from iscsi_endpoint storage
space to have their destruction being independent.

iscsi_endpoint will be destroyed at ep_disconnect stage, while the
iser connection will wait for asynchronous completions to be released
in an orderly fashion.

Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-08-01 15:10:04 -07:00
Roi Dayan 96ed02d4be IB/iser: Support IPv6 address family
Replace struct sockaddr_in with struct sockaddr which supports both
IPv4 and IPv6, and print using the %pIS format directive.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-08-01 15:10:04 -07:00
Roi Dayan e7eeffa4a0 IB/iser: Add missing newlines to logging messages
Logging messages need terminating newlines to avoid possible message
interleaving.  Add them.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-05-26 08:19:48 -07:00
Ariel Nahum b73c3adabd IB/iser: Simplify connection management
iSER relies on refcounting to manage iser connections establishment
and teardown.

Following commit 39ff05dbbb ("IB/iser: Enhance disconnection logic
for multi-pathing"), iser connection maintain 3 references:

 - iscsi_endpoint (at creation stage)
 - cma_id (at connection request stage)
 - iscsi_conn (at bind stage)

We can avoid taking explicit refcounts by correctly serializing iser
teardown flows (graceful and non-graceful).

Our approach is to trigger a scheduled work to handle ordered teardown
by gracefully waiting for 2 cleanup stages to complete:

 1. Cleanup of live pending tasks indicated by iscsi_conn_stop completion
 2. Flush errors processing

Each completed stage will notify a waiting worker thread when it is
done to allow teardwon continuation.

Since iSCSI connection establishment may trigger endpoint disconnect
without a successful endpoint connect, we rely on the iscsi <-> iser
binding (.conn_bind) to learn about the teardown policy we should take
wrt cleanup stages.

Since all cleanup worker threads are scheduled (release_wq) in
.ep_disconnect it is safe to assume that when module_exit is called,
all cleanup workers are already scheduled. Thus proper module unload
shall flush all scheduled works before allowing safe exit, to
guarantee no resources got left behind.

Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-05-26 08:19:48 -07:00
Or Gerlitz 3ee07d27ac IB/iser: Update Mellanox copyright note
Update Mellanox copyrights for 2014 on the iser initiator driver.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-04-01 11:09:46 -07:00
Ariel Nahum 4667f5dfb0 IB/iser: Remove struct iscsi_iser_conn
The iscsi stack has existing mechanisms to link back and forth between
the iscsi connection and the iscsi transport (e.g iser/tcp) connection.

This is done through a dd_data pointer field in struct iscsi_conn
which can be set to point to the transport connection, etc.

The iscsi_iser_conn structure was used to get this linking done in
another way, which is uneeded and adds extra complication to the iser
code, so we just remove it.

Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-04-01 11:09:46 -07:00
Sagi Grimberg 6192d4e6bb IB/iser: Publish T10-PI support to SCSI midlayer
After allocating a scsi_host we set protection types and guard type
supported.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Alex Tabachnik <alext@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17 22:33:58 -07:00