A single fix this time: a fix for a virtqueue removal bug which only
appears to affect S390, but which results in the queue hanging forever
thus causing the machine to fail shutdown.
Signed-off-by: James E. J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIbBAABAgAGBQJYlPNHAAoJEAVr7HOZEZN4OsgP9Rlu6udoNn5If6lPly77hjsQ
ukiXZNtPNOuxeiCrUmTMBm69588/XNyuVrrpP7pccujQhX+5sv2qd2Ph4uoaXLeK
zq695/Y/ejAVRhORCJNibA+EQ6Dr4+DGEm+Iitifa1ILO/npaf5hCzNfdY7Ln3pb
cUu8FhXQkFkKOwhNovtOzkB6lXDobh3pZKBxYOsK4Ea5f1CSB+Sjdr/Xl4l141Ei
3eN+flX9VLX8pV6mJ7xQEoWCYrqjgh7l0PYSgX011S2Qniw8sgwI91XsNABZP3oJ
Ceu+COJPt3fRYcJugBYvAJB0pFUyxPh8rC0NL6nJLBcWVm5hJoaHX96/I5hgx+r2
9ZH4lLOiIyyEZQxz31qe73YzGkBe6lBNxJMjRcP3o5MXw+GDsUhZfuwqnX/Zc6EH
o7R4cW1o08HTgZcE3pKAwhTzzZ5IxMe4pkUiVBxb2TgUMvKfeX9dRBW4YStgRLKC
EHBQ89g1DSWbP15a4OX45sNYCSYPvq+HyNQCFzXXIhELVsEd7VyCyMK2i8E/ccAu
UwusYLDpX1QH56IpYNMgwoTJeCjI9HeOTGf7EWtJSMUTa/rrYSFZwcEA6xHxVPco
o3GqJMID84sg9fOCvToW8tKbl38Smkse9r24FhqBdiZRRJXsCogPCgt2Fa9ZRcPx
oNy87IN7k4K+bL6BAJw=
=1j2t
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"A single fix this time: a fix for a virtqueue removal bug which only
appears to affect S390, but which results in the queue hanging forever
thus causing the machine to fail shutdown"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: virtio_scsi: Reject commands when virtqueue is broken
In the case of a graceful set of detaches, where the virtio-scsi-ccw
disk is removed from the guest prior to the controller, the guest
behaves quite normally. Specifically, the detach gets us into
sd_sync_cache to issue a Synchronize Cache(10) command, which
immediately fails (and is retried a couple of times) because the device
has been removed. Later, the removal of the controller sees two CRWs
presented, but there's no further indication of the removal from the
guest viewpoint.
[ 17.217458] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 17.219257] sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 21.449400] crw_info : CRW reports slct=0, oflw=0, chn=1, rsc=3, anc=0, erc=4, rsid=2
[ 21.449406] crw_info : CRW reports slct=0, oflw=0, chn=0, rsc=3, anc=0, erc=4, rsid=0
However, on s390, the SCSI disks can be removed "by surprise" when an
entire controller (host) is removed and all associated disks are removed
via the loop in scsi_forget_host. The same call to sd_sync_cache is
made, but because the controller has already been removed, the
Synchronize Cache(10) command is neither issued (and then failed) nor
rejected.
That the I/O isn't returned means the guest cannot have other devices
added nor removed, and other tasks (such as shutdown or reboot) issued
by the guest will not complete either. The virtio ring has already been
marked as broken (via virtio_break_device in virtio_ccw_remove), but we
still attempt to queue the command only to have it remain there. The
calling sequence provides a bit of distinction for us:
virtscsi_queuecommand()
-> virtscsi_kick_cmd()
-> virtscsi_add_cmd()
-> virtqueue_add_sgs()
-> virtqueue_add()
if success
return 0
elseif vq->broken or vring_mapping_error()
return -EIO
else
return -ENOSPC
A return of ENOSPC is generally a temporary condition, so returning
"host busy" from virtscsi_queuecommand makes sense here, to have it
redriven in a moment or two. But the EIO return code is more of a
permanent error and so it would be wise to return the I/O itself and
allow the calling thread to finish gracefully. The result is these four
kernel messages in the guest (the fourth one does not occur prior to
this patch):
[ 22.921562] crw_info : CRW reports slct=0, oflw=0, chn=1, rsc=3, anc=0, erc=4, rsid=2
[ 22.921580] crw_info : CRW reports slct=0, oflw=0, chn=0, rsc=3, anc=0, erc=4, rsid=0
[ 22.921978] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 22.921993] sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
I opted to fill in the same response data that is returned from the more
graceful device detach, where the disk device is removed prior to the
controller device.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull SCSI target fixes from Bart Van Assche:
- two small fixes for the ibmvscsis driver
- ten patches with bug fixes for the target mode of the qla2xxx driver
- four patches that avoid that the "sparse" and "smatch" static
analyzer tools report false positives for the qla2xxx code base
* 'scsi-target-for-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux:
qla2xxx: Disable out-of-order processing by default in firmware
qla2xxx: Fix erroneous invalid handle message
qla2xxx: Reduce exess wait during chip reset
qla2xxx: Terminate exchange if corrupted
qla2xxx: Fix crash due to null pointer access
qla2xxx: Collect additional information to debug fw dump
qla2xxx: Reset reserved field in firmware options to 0
qla2xxx: Set tcm_qla2xxx version to automatically track qla2xxx version
qla2xxx: Include ATIO queue in firmware dump when in target mode
qla2xxx: Fix wrong IOCB type assumption
qla2xxx: Avoid that building with W=1 triggers complaints about set-but-not-used variables
qla2xxx: Move two arrays from header files to .c files
qla2xxx: Declare an array with file scope static
qla2xxx: Fix indentation
ibmvscsis: Fix sleeping in interrupt context
ibmvscsis: Fix max transfer length
This is a set of 12 fixes including the mpt3sas one that was causing
hangs on ATA passthrough. The others are a couple of zoned block
device fixes, a SAS device detection bug which lead to SATA drives not
being matched to bays, two qla2xxx MSI fixes, a qla2xxx req for rsp
confusion caused by cut and paste, and a few other minor fixes.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJYgY/0AAoJEAVr7HOZEZN4avAP/1WxoiNBoQV16CsAyvSlbf0A
ZQ7cR0O/cWjI7ccoLUKHxbTJ57GzMTelfmrdRsVIbNeH74NGSP4+IxzshItSwKG0
/hz0MkCqAxRs/pxliZW8DM12cEbKDIm/t4XWjDst1KBdqakluigmAIniWitKnLE3
IeA0am4/KDf/n5tQm7CI+7GOT7g0QuRqAlfAM1MFXZGAzjw2RZcSQOVkPr6vW2I+
bvDUzmWDHOZ/3g5CRrgA6pGC/1FBVbYPF5iOQicCUuMjpniR4IVvcwek5ULUkCYl
5yZnx7dg/FEXcFawp3fTQ8SbxM6Dhkqkb5/Xe/E+wjjAOwF1I3lDwdV39nQk3aBK
TwIFgeZY8CS3Z+7e9rtlRFUL2kFWwKy5uhFM/6qmY9cpXFYnPN5JsrChq/7Tgb9t
XW3/4CoS2ijcNuvxGymvyJG2FpZsLaJfLfXw4kkEUtn0W3Dzbub2PrMfPUUn7hQO
rmk3ZEuoSANvsZ9n647Gpvo7iQBbC7blNf8krSLJvoopN6SUUmLCsePXEHxorBxz
r7xAkHFqapOjBn28u4JAn5po7aVa3/fQwBuUF6UvfuVJKbvnobl4Qx0O/Fqadd7s
bix9OfCG1GMyEnUBLEWPpBg0jS2JZbLupeXVssiATks93KV2SefiQlEhYwA5hOS5
6+PykxkIEjEmU7asjQJ0
=8y3/
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a set of 12 fixes including the mpt3sas one that was causing
hangs on ATA passthrough.
The others are a couple of zoned block device fixes, a SAS device
detection bug which lead to SATA drives not being matched to bays, two
qla2xxx MSI fixes, a qla2xxx req for rsp confusion caused by cut and
paste, and a few other minor fixes"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mpt3sas: fix hang on ata passthrough commands
scsi: lpfc: Set elsiocb contexts to NULL after freeing it
scsi: sd: Ignore zoned field for host-managed devices
scsi: sd: Fix wrong DPOFUA disable in sd_read_cache_type
scsi: bfa: fix wrongly initialized variable in bfad_im_bsg_els_ct_request()
scsi: ses: Fix SAS device detection in enclosure
scsi: libfc: Fix variable name in fc_set_wwpn
scsi: lpfc: avoid double free of resource identifiers
scsi: qla2xxx: remove irq_affinity_notifier
scsi: qla2xxx: fix MSI-X vector affinity
scsi: qla2xxx: Fix apparent cut-n-paste error.
scsi: qla2xxx: Get mutex lock before checking optrom_state
mpt3sas has a firmware failure where it can only handle one pass through
ATA command at a time. If another comes in, contrary to the SAT
standard, it will hang until the first one completes (causing long
commands like secure erase to timeout). The original fix was to block
the device when an ATA command came in, but this caused a regression
with
commit 669f044170
Author: Bart Van Assche <bart.vanassche@sandisk.com>
Date: Tue Nov 22 16:17:13 2016 -0800
scsi: srp_transport: Move queuecommand() wait code to SCSI core
So fix the original fix of the secure erase timeout by properly
returning SAM_STAT_BUSY like the SAT recommends. The original patch
also had a concurrency problem since scsih_qcmd is lockless at that
point (this is fixed by using atomic bitops to set and test the flag).
[mkp: addressed feedback wrt. test_bit and fixed whitespace]
Fixes: 18f6084a98 (mpt3sas: Fix secure erase premature termination)
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reported-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Out of order(OOO) processing requires initiator, switch
and target to support OOO. In today's environment, none
of the switches support OOO. OOO requires extra buffer
space which affect performance. By turning ON this feature
in QLogic's FW, it delays error recovery because dropped
frame is treated as out of order frame. We're turning OFF
this option of speed up error recovery.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: Fixed spelling in patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Termination of Immediate Notify IOCB was using wrong
IOCB handle. IOCB completion code was unable to find
appropriate code path due to wrong handle.
Following message is seen in the logs.
"Error entry - invalid handle/queue (ffff)."
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: Fixed word order in patch title ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Soft reset and Risc reset should take 100uS to complete.
This change pad the timeout up to 400uS, which should be
plenty.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Corrupted ATIO is defined as length of fcp_header & fcp_cmd
payload is less than 0x38. It's the minimum size for a frame to
carry 8..16 bytes SCSI CDB. The exchange will be dropped or
terminated if corrupted.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: Fixed spelling in patch title ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
During code inspection, while investigating following stack trace
seen on one of the test setup, we found out there was possibility
of memory leak becuase driver was not unwinding the stack properly.
This issue has not been reproduced in a test environment or on a
customer setup.
Here's stack trace that was seen.
[1469877.797315] Call Trace:
[1469877.799940] [<ffffffffa03ab6e9>] qla2x00_mem_alloc+0xb09/0x10c0 [qla2xxx]
[1469877.806980] [<ffffffffa03ac50a>] qla2x00_probe_one+0x86a/0x1b50 [qla2xxx]
[1469877.814013] [<ffffffff813b6d01>] ? __pm_runtime_resume+0x51/0xa0
[1469877.820265] [<ffffffff8157c1f5>] ? _raw_spin_lock_irqsave+0x25/0x90
[1469877.826776] [<ffffffff8157cd2d>] ? _raw_spin_unlock_irqrestore+0x6d/0x80
[1469877.833720] [<ffffffff810741d1>] ? preempt_count_sub+0xb1/0x100
[1469877.839885] [<ffffffff8157cd0c>] ? _raw_spin_unlock_irqrestore+0x4c/0x80
[1469877.846830] [<ffffffff81319b9c>] local_pci_probe+0x4c/0xb0
[1469877.852562] [<ffffffff810741d1>] ? preempt_count_sub+0xb1/0x100
[1469877.858727] [<ffffffff81319c89>] pci_call_probe+0x89/0xb0
Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: Fixed spelling in patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
During NVRAM initialization in target mode, reset reserved
fields in firmware options to Zero (BIT 15)
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Include ATIO queue for ISP27XX when firmware dump is collected
for target mode.
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
qlt_reset is called with Immedidate Notify IOCB only.
Current code wrongly cast it as ATIO IOCB.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Remove two set-but-not-used variables and avoid that the compiler
warns about a third variable (rc).
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Christoph Hellwig <hch@lst.de>
This patch avoids that building with W=1 triggers compiler
warnings similar to the following:
drivers/scsi/qla2xxx/qla_nx2.h:538:23: warning: ‘qla8044_reg_tbl’ defined but not used [-Wunused-const-variable=]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Christoph Hellwig <hch@lst.de>
This patch avoids that building with W=1 triggers a compiler warning
about a missing declaration.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Christoph Hellwig <hch@lst.de>
Set the elsiocb contexts to NULL after freeing as others depend on it.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is no good match of the zoned field of the block device
characteristics page for host-managed devices. For these devices, the
zoning model is derived directly from the device type. So ignore the
zoned field for these drives.
[mkp: typo]
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Zoned block devices force the use of READ/WRITE(16) commands by setting
sdkp->use_16_for_rw and clearing sdkp->use_10_for_rw. This result in
DPOFUA always being disabled for these drives as the assumed use of
the deprecated READ/WRITE(6) commands only looks at sdkp->use_10_for_rw.
Strenghten the test by also checking that sdkp->use_16_for_rw is false.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 01e0e15c8b ("scsi: don't use fc_bsg_job::request and
fc_bsg_job::reply directly") introduced a typo, which causes that the
bsg_request variable in bfad_im_bsg_els_ct_request() is initialized to
itself instead of pointing to the bsg job's request.
Reported-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The call to scsi_is_sas_rphy() needs to be made on the SAS end_device,
not on the SCSI device.
Fixes: 835831c57e ("ses: use scsi_is_sas_rphy instead of is_sas_attached")
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, dma_alloc_coherent is being called with a GFP_KERNEL
flag which allows it to sleep in an interrupt context, need to
change to GFP_ATOMIC.
Cc: stable@vger.kernel.org
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Reviewed-by: Michael Cyr <mikecyr@linux.vnet.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Current code incorrectly calculates the max transfer length, since
it is assuming a 4k page table, but ppc64 all run on 64k page tables.
Cc: stable@vger.kernel.org
Reported-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Pull block fixes from Jens Axboe:
- the virtio_blk stack DMA corruption fix from Christoph, fixing and
issue with VMAP stacks.
- O_DIRECT blkbits calculation fix from Chandan.
- discard regression fix from Christoph.
- queue init error handling fixes for nbd and virtio_blk, from Omar and
Jeff.
- two small nvme fixes, from Christoph and Guilherme.
- rename of blk_queue_zone_size and bdev_zone_size to _sectors instead,
to more closely follow what we do in other places in the block layer.
This interface is new for this series, so let's get the naming right
before releasing a kernel with this feature. From Damien.
* 'for-linus' of git://git.kernel.dk/linux-block:
block: don't try to discard from __blkdev_issue_zeroout
sd: remove __data_len hack for WRITE SAME
nvme: use blk_rq_payload_bytes
scsi: use blk_rq_payload_bytes
block: add blk_rq_payload_bytes
block: Rename blk_queue_zone_size and bdev_zone_size
nvme: apply DELAY_BEFORE_CHK_RDY quirk at probe time too
nvme-rdma: fix nvme_rdma_queue_is_ready
virtio_blk: fix panic in initialization error path
nbd: blk_mq_init_queue returns an error code on failure, not NULL
virtio_blk: avoid DMA to stack for the sense buffer
do_direct_IO: Use inode->i_blkbits to compute block count to be cleaned
Now that we have the blk_rq_payload_bytes helper available to determine
the actual I/O size we don't need to mess around with __data_len for
WRITE SAME.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Without that we'll pass a wrong payload size in cmd->sdb, which
can lead to hangs with drivers that need the total transfer size.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Chris Valean <v-chvale@microsoft.com>
Reported-by: Dexuan Cui <decui@microsoft.com>
Fixes: f9d03f96 ("block: improve handling of the magic discard payload")
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The major fix is the bfa firmware, since the latest 10Gb cards fail
probing with the current firmware. The rest is a set of minor fixes:
one missed Kconfig dependency causing randconfig failures, a missed
error return on an error leg, a change for how multiqueue waits on a
blocked device and a don't reset while in reset fix.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJYeOuBAAoJEAVr7HOZEZN4TxEP/j3nlFKrs7XEhlbbZrezyHkA
RvEWAkpoSXpQiVvuqHcObOX5Jh9I8ndBMRh2MyS2zLV4PA4SzwrprgemaB5rL9+j
+5IIw7MdHY5SHP0IrvpTae+c1kOnHRfSdsLORqohB9uhqJiSbwhXgV0Q19hRzttA
qmjD5RBI4sY3a+/CjGQevaM2Y/WQqCp4+J5kvclr3AmEoTjMkAiOsjbSuclFHWX9
CxNKSMv/6z34ZgqS0FgqCrCAI5kO5UHrjznslcEtgVECIOTrNer3g5e75DhQOcD6
Mhklxbx1dfG9r3h3oush8Zy9IIJoXZFZLRNmhhX7zcKkuYUaR9LU2COUVCTOEE+s
SAtp4qdXxCZ8sFyHS+E9ajTYSeTw05fmGEbAu1dqJ8FUxkRnvsWouCiw/TI+ps94
mJSiS2UGHAYxpLdWFbnKC+2CaCG32ygXINUcZbIItmT2cdA+ERAPAO6tDegjgb0S
MoeRxAlUGfUehHReCppRV3igwjtw2vtQaY5hFq1+v8n00Ezt6kDNeeTanqobMrHr
71rOZAos9aT5wa39pmbW18M3pfUB1qeuGn6di4ArYk8f9ILAMI4qykvnDHH3INcn
BdCIY3SgRo/Zq+87vFWZr+56CcbVqq0GV38FKoRZcHglyOjDw8ibQY6elcsjBtJC
S1Z5wlkv1T9tfA+8mRX6
=kjPw
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"The major fix is the bfa firmware, since the latest 10Gb cards fail
probing with the current firmware.
The rest is a set of minor fixes: one missed Kconfig dependency
causing randconfig failures, a missed error return on an error leg, a
change for how multiqueue waits on a blocked device and a don't reset
while in reset fix"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: bfa: Increase requested firmware version to 3.2.5.1
scsi: snic: Return error code on memory allocation failure
scsi: fnic: Avoid sending reset to firmware when another reset is in progress
scsi: qedi: fix build, depends on UIO
scsi: scsi-mq: Wait for .queue_rq() if necessary
Set variables initialized in lpfc_sli4_alloc_resource_identifiers() to
NULL if an error occurred. Otherwise, lpfc_sli4_driver_resource_unset()
attempts to free the memory again.
Signed-off-by: Roberto Sassu <rsassu@suse.de>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Now that qla2xxx uses the IRQ layer affinity assignment, affinity won't
change over the life time of a device and the notifiers are useless.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The first two or three vectors in qla2xxx adapter are global and not
associated with a specific queue. They should not have IRQ affinity
assigned.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If srp_transfer_data fails within ibmvscsis_write_pending, then
the most likely scenario is that the client timed out the op and
removed the TCE mapping. Thus it will loop forever retrying the
op that is pretty much guaranteed to fail forever. A better return
code would be EIO instead of EAGAIN.
Cc: stable@vger.kernel.org
Reported-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Bryant G. Ly <bgly@us.ibm.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Commit 093df73771 ("scsi: qla2xxx: Fix Target mode handling with
Multiqueue changes.") introduces two bodies of code that look similar
but with s/req/rsp/ in the second instance. But in one case, it looks
like this conversion was missed.
Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Acked-by: Quinn Tran <Quinn.Tran@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a race condition with qla2xxx optrom functions where one thread
might modify optrom buffer, optrom_state while other thread is still
reading from it.
In couple of crashes, it was found that we had successfully passed the
following 'if' check where we confirm optrom_state to be
QLA_SREADING. But by the time we acquired mutex lock to proceed with
memory_read_from_buffer function, some other thread/process had already
modified that option rom buffer and optrom_state from QLA_SREADING to
QLA_SWAITING. Then we got ha->optrom_buffer 0x0 and crashed the system:
if (ha->optrom_state != QLA_SREADING)
return 0;
mutex_lock(&ha->optrom_mutex);
rval = memory_read_from_buffer(buf, count, &off, ha->optrom_buffer,
ha->optrom_region_size);
mutex_unlock(&ha->optrom_mutex);
With current optrom function we get following crash due to a race
condition:
[ 1479.466679] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 1479.466707] IP: [<ffffffff81326756>] memcpy+0x6/0x110
[...]
[ 1479.473673] Call Trace:
[ 1479.474296] [<ffffffff81225cbc>] ? memory_read_from_buffer+0x3c/0x60
[ 1479.474941] [<ffffffffa01574dc>] qla2x00_sysfs_read_optrom+0x9c/0xc0 [qla2xxx]
[ 1479.475571] [<ffffffff8127e76b>] read+0xdb/0x1f0
[ 1479.476206] [<ffffffff811fdf9e>] vfs_read+0x9e/0x170
[ 1479.476839] [<ffffffff811feb6f>] SyS_read+0x7f/0xe0
[ 1479.477466] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
Below patch modifies qla2x00_sysfs_read_optrom,
qla2x00_sysfs_write_optrom functions to get the mutex_lock before
checking ha->optrom_state to avoid similar crashes.
The patch was applied and tested and same crashes were no longer
observed again.
Tested-by: Milan P. Gandhi <mgandhi@redhat.com>
Signed-off-by: Milan P. Gandhi <mgandhi@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
bna & bfa firmware version 3.2.5.1 was submitted to linux-firmware on
Feb 17 19:10:20 2015 -0500 in 0ab54ff1dc ("linux-firmware: Add QLogic BR
Series Adapter Firmware").
bna was updated to use the newer firmware on Feb 19 16:02:32 2015 -0500 in
3f307c3d70 ("bna: Update the Driver and Firmware Version")
bfa was not updated. I presume this was an oversight but it broke support
for bfa+bna cards such as the following
04:00.0 Fibre Channel [0c04]: Brocade Communications Systems, Inc.
1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
04:00.1 Fibre Channel [0c04]: Brocade Communications Systems, Inc.
1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
04:00.2 Ethernet controller [0200]: Brocade Communications Systems,
Inc. 1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
04:00.3 Ethernet controller [0200]: Brocade Communications Systems,
Inc. 1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
Currently, if the bfa module is loaded first, bna fails to probe the
respective devices with
[ 215.026787] bna: QLogic BR-series 10G Ethernet driver - version: 3.2.25.1
[ 215.043707] bna 0000:04:00.2: bar0 mapped to ffffc90001fc0000, len 262144
[ 215.060656] bna 0000:04:00.2: initialization failed err=1
[ 215.073893] bna 0000:04:00.3: bar0 mapped to ffffc90002040000, len 262144
[ 215.090644] bna 0000:04:00.3: initialization failed err=1
Whereas if bna is loaded first, bfa fails with
[ 249.592109] QLogic BR-series BFA FC/FCOE SCSI driver - version: 3.2.25.0
[ 249.610738] bfa 0000:04:00.0: Running firmware version is incompatible with the driver version
[ 249.833513] bfa 0000:04:00.0: bfa init failed
[ 249.833919] scsi host6: QLogic BR-series FC/FCOE Adapter, hwpath: 0000:04:00.0 driver: 3.2.25.0
[ 249.841446] bfa 0000:04:00.1: Running firmware version is incompatible with the driver version
[ 250.045449] bfa 0000:04:00.1: bfa init failed
[ 250.045962] scsi host7: QLogic BR-series FC/FCOE Adapter, hwpath: 0000:04:00.1 driver: 3.2.25.0
Increase bfa's requested firmware version. Also increase the driver
version. I only tested that all of the devices probe without error.
Reported-by: Tim Ehlers <tehlers@gwdg.de>
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Rasesh Mody <rasesh.mody@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If a call to mempool_create_slab_pool() in snic_probe() returns NULL,
return -ENOMEM to indicate failure. mempool_creat_slab_pool() only fails
if it cannot allocate memory.
https://bugzilla.kernel.org/show_bug.cgi?id=189061
Reported-by: bianpan2010@ruc.edu.cn
Signed-off-by: Burak Ok <burak-kernel@bur0k.de>
Signed-off-by: Andreas Schaertl <andreas.schaertl@fau.de>
Acked-by: Narsimhulu Musini <nmusini@cisco.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This fix is to avoid calling fnic_fw_reset_handler through
fnic_host_reset when a finc reset is alreay in progress.
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull timer type cleanups from Thomas Gleixner:
"This series does a tree wide cleanup of types related to
timers/timekeeping.
- Get rid of cycles_t and use a plain u64. The type is not really
helpful and caused more confusion than clarity
- Get rid of the ktime union. The union has become useless as we use
the scalar nanoseconds storage unconditionally now. The 32bit
timespec alike storage got removed due to the Y2038 limitations
some time ago.
That leaves the odd union access around for no reason. Clean it up.
Both changes have been done with coccinelle and a small amount of
manual mopping up"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ktime: Get rid of ktime_equal()
ktime: Cleanup ktime_set() usage
ktime: Get rid of the union
clocksource: Use a plain u64 instead of cycle_t
Pull SMP hotplug notifier removal from Thomas Gleixner:
"This is the final cleanup of the hotplug notifier infrastructure. The
series has been reintgrated in the last two days because there came a
new driver using the old infrastructure via the SCSI tree.
Summary:
- convert the last leftover drivers utilizing notifiers
- fixup for a completely broken hotplug user
- prevent setup of already used states
- removal of the notifiers
- treewide cleanup of hotplug state names
- consolidation of state space
There is a sphinx based documentation pending, but that needs review
from the documentation folks"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/armada-xp: Consolidate hotplug state space
irqchip/gic: Consolidate hotplug state space
coresight/etm3/4x: Consolidate hotplug state space
cpu/hotplug: Cleanup state names
cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
staging/lustre/libcfs: Convert to hotplug state machine
scsi/bnx2i: Convert to hotplug state machine
scsi/bnx2fc: Convert to hotplug state machine
cpu/hotplug: Prevent overwriting of callbacks
x86/msr: Remove bogus cleanup from the error path
bus: arm-ccn: Prevent hotplug callback leak
perf/x86/intel/cstate: Prevent hotplug callback leak
ARM/imx/mmcd: Fix broken cpu hotplug handling
scsi: qedi: Convert to hotplug state machine
ktime_set(S,N) was required for the timespec storage type and is still
useful for situations where a Seconds and Nanoseconds part of a time value
needs to be converted. For anything where the Seconds argument is 0, this
is pointless and can be replaced with a simple assignment.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Install the callbacks via the state machine. No functional change.
This is the minimal fixup so we can remove the hotplug notifier mess
completely.
The real rework of this driver to use work queues is still stuck in
review/testing on the SCSI mailing list.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: linux-scsi@vger.kernel.org
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Chad Dupuis <chad.dupuis@qlogic.com>
Cc: QLogic-Storage-Upstream@qlogic.com
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20161221192111.836895753@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Install the callbacks via the state machine. No functional change.
This is the minimal fixup so we can remove the hotplug notifier mess
completely.
The real rework of this driver to use work queues is still stuck in
review/testing on the SCSI mailing list.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: linux-scsi@vger.kernel.org
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Chad Dupuis <chad.dupuis@qlogic.com>
Cc: QLogic-Storage-Upstream@qlogic.com
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20161221192111.757309869@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>