linux

Commit Graph

Author	SHA1	Message	Date
Martin K. Petersen	9c1d9c207b	sd: Reject optimal transfer length smaller than page size Eryu Guan reported that loading scsi_debug would fail. This turned out to be caused by scsi_debug reporting an optimal I/O size of 32KB which is smaller than the 64KB page size on the PowerPC system in question. Add a check to ensure that we only use the device-reported OPTIMAL TRANSFER LENGTH if it is bigger than or equal to the page cache size. Reported-by: Eryu Guan <guaneryu@gmail.com> Reported-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-12-21 21:37:18 -05:00
James Bottomley	be9e2f775f	Merge branch 'mkp-fixes' into fixes	2015-12-03 09:32:33 -08:00
Martin K. Petersen	ca369d51b3	block/sd: Fix device-imposed transfer length limits Commit `4f258a4634` ("sd: Fix maximum I/O size for BLOCK_PC requests") had the unfortunate side-effect of removing an implicit clamp to BLK_DEF_MAX_SECTORS for REQ_TYPE_FS requests in the block layer code. This caused problems for some SMR drives. Debugging this issue revealed a few problems with the existing infrastructure since the block layer didn't know how to deal with device-imposed limits, only limits set by the I/O controller. - Introduce a new queue limit, max_dev_sectors, which is used by the ULD to signal the maximum sectors for a REQ_TYPE_FS request. - Ensure that max_dev_sectors is correctly stacked and taken into account when overriding max_sectors through sysfs. - Rework sd_read_block_limits() so it saves the max_xfer and opt_xfer values for later processing. - In sd_revalidate() set the queue's max_dev_sectors based on the MAXIMUM TRANSFER LENGTH value in the Block Limits VPD. If this value is not reported, fall back to a cap based on the CDB TRANSFER LENGTH field size. - In sd_revalidate(), use OPTIMAL TRANSFER LENGTH from the Block Limits VPD--if reported and sane--to signal the preferred device transfer size for FS requests. Otherwise use BLK_DEF_MAX_SECTORS. - blk_limits_max_hw_sectors() is no longer used and can be removed. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=93581 Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: sweeneygj@gmx.com Tested-by: Arzeets <anatol.pomozov@gmail.com> Tested-by: David Eisner <david.eisner@oriel.oxon.org> Tested-by: Mario Kicherer <dev@kicherer.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-11-25 21:38:58 -05:00
Martin K. Petersen	397737223c	sd: Make discard granularity match logical block size when LBPRZ=1 A device may report an OPTIMAL UNMAP GRANULARITY and UNMAP GRANULARITY ALIGNMENT in the Block Limits VPD. These parameters describe the device's internal provisioning allocation units. By default the block layer will round and align any discard requests based on these limits. If a device reports LBPRZ=1 to guarantee zeroes after discard, however, it is imperative that the block layer does not leave out any parts of the requested block range. Otherwise the device can not do the required zeroing of any partial allocation units and this can lead to data corruption. Since the dm thinp personality relies on the block layer's current behavior and is unable to deal with partial discard blocks we work around the problem by setting the granularity to match the logical block size when LBPRZ is enabled. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-11-25 21:38:34 -05:00
Linus Torvalds	d83763f4a6	SCSI misc on 20151113 Sorry for the delay in this patch which was mostly caused by getting the merger of the mpt2/mpt3sas driver, which was seen as an essential item of maintenance work to do before the drivers diverge too much. Unfortunately, this caused a compile failure (detected by linux-next), which then had to be fixed up and incubated. In addition to the mpt2/3sas rework, there are updates from pm80xx, lpfc, bnx2fc, hpsa, ipr, aacraid, megaraid_sas, storvsc and ufs plus an assortment of changes including some year 2038 issues, a fix for a remove before detach issue in some drivers and a couple of other minor issues. Signed-off-by: James Bottomley <JBottomley@Odin.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABAgAGBQJWRnpiAAoJEDeqqVYsXL0MJd0IAIkNP1q/6ksMw/lam2UlbdxN zFsFOIhGM3xLnBFehbCx7JW3cmybA7WC5jhMjaa1OeJmYHLpwbTzvVwrLs2l/0y1 /S+G0wwxROZfIKj/1EO3oKbSPCu9N3lStxOnmNXt5PSUEzAXrTqSNTtnMf2ZGh7j bQxTWEJe+66GckgGw4ozTXJHWXqM/Zs/FsYjn2h/WzFhFv8utr7zRSzHMVjcqpLG TK/Lt03hiTGqiignKrV9W6JzdGTWf2LGIsj/njgR0dxv59cNH8PrHIjZyknSBxgn lblinsCjxDHWnP4BSfTi9MQG1lEiZiWO3Y6TKkKJTgxZ9M0Eitspc+cLOiJ1mpg= =HvQf -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull final round of SCSI updates from James Bottomley: "Sorry for the delay in this patch which was mostly caused by getting the merger of the mpt2/mpt3sas driver, which was seen as an essential item of maintenance work to do before the drivers diverge too much. Unfortunately, this caused a compile failure (detected by linux-next), which then had to be fixed up and incubated. In addition to the mpt2/3sas rework, there are updates from pm80xx, lpfc, bnx2fc, hpsa, ipr, aacraid, megaraid_sas, storvsc and ufs plus an assortment of changes including some year 2038 issues, a fix for a remove before detach issue in some drivers and a couple of other minor issues" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (141 commits) mpt3sas: fix inline markers on non inline function declarations sd: Clear PS bit before Mode Select. ibmvscsi: set max_lun to 32 ibmvscsi: display default value for max_id, max_lun and max_channel. mptfusion: don't allow negative bytes in kbuf_alloc_2_sgl() scsi: pmcraid: replace struct timeval with ktime_get_real_seconds() mvumi: 64bit value for seconds_since1970 be2iscsi: Fix bogus WARN_ON length check scsi_scan: don't dump trace when scsi_prep_async_scan() is called twice mpt3sas: Bump mpt3sas driver version to 09.102.00.00 mpt3sas: Single driver module which supports both SAS 2.0 & SAS 3.0 HBAs mpt2sas, mpt3sas: Update the driver versions mpt3sas: setpci reset kernel oops fix mpt3sas: Added OEM Gen2 PnP ID branding names mpt3sas: Refcount fw_events and fix unsafe list usage mpt3sas: Refcount sas_device objects and fix unsafe list usage mpt3sas: sysfs attribute to report Backup Rail Monitor Status mpt3sas: Ported WarpDrive product SSS6200 support mpt3sas: fix for driver fails EEH, recovery from injected pci bus error mpt3sas: Manage MSI-X vectors according to HBA device type ...	2015-11-13 20:35:54 -08:00
Gabriel Krisman Bertazi	2c5d16d6a9	sd: Clear PS bit before Mode Select. According to SPC-4, in a Mode Select, the PS bit in Mode Pages is reserved and must be set to 0 by the driver. In the sd implementation, function cache_type_store does a Mode Sense, which might set the PS bit on the read buffer, followed by a Mode Select, which receives the same buffer, without explicitly clearing the PS bit. So, in cases where target supports saving the Mode Page to a non-volatile location, we end up doing a Mode Select with the PS bit set, which could cause an illegal request error if the target is checking this. This was observed on a new firmware change, which was subsequently reverted, but this changes sd.c to be more compliant with SPC-4. This patch clears the PS bit in the buffer returned by Mode Select, right before it is used in the Mode Select command. Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-11-11 21:10:26 -05:00
Linus Torvalds	ccf21b69a8	Merge branch 'for-4.4/reservations' of git://git.kernel.dk/linux-block Pull block reservation support from Jens Axboe: "This adds support for persistent reservations, both at the core level, as well as for sd and NVMe" [ Background from the docs: "Persistent Reservations allow restricting access to block devices to specific initiators in a shared storage setup. All implementations are expected to ensure the reservations survive a power loss and cover all connections in a multi path environment" ] * 'for-4.4/reservations' of git://git.kernel.dk/linux-block: NVMe: Precedence error in nvme_pr_clear() nvme: add missing endianess annotations in nvme_pr_command NVMe: Add persistent reservation ops sd: implement the Persistent Reservation API block: add an API for Persistent Reservations block: cleanup blkdev_ioctl	2015-11-04 21:01:27 -08:00
Christoph Hellwig	924d55b063	sd: implement the Persistent Reservation API This is a mostly trivial mapping to the PERSISTENT RESERVE IN/OUT commands. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-10-21 14:46:58 -06:00
Dan Williams	9609b9942b	md, dm, scsi, nvme, libnvdimm: drop blk_integrity_unregister() at shutdown Now that the integrity profile is statically allocated there is no work to do when shutting down an integrity enabled block device. Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: James Bottomley <JBottomley@Odin.com> Acked-by: NeilBrown <neilb@suse.com> Acked-by: Keith Busch <keith.busch@intel.com> Acked-by: Vishal Verma <vishal.l.verma@intel.com> Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-10-21 14:43:37 -06:00
Linus Torvalds	1081230b74	Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block Pull core block updates from Jens Axboe: "This first core part of the block IO changes contains: - Cleanup of the bio IO error signaling from Christoph. We used to rely on the uptodate bit and passing around of an error, now we store the error in the bio itself. - Improvement of the above from myself, by shrinking the bio size down again to fit in two cachelines on x86-64. - Revert of the max_hw_sectors cap removal from a revision again, from Jeff Moyer. This caused performance regressions in various tests. Reinstate the limit, bump it to a more reasonable size instead. - Make /sys/block/<dev>/queue/discard_max_bytes writeable, by me. Most devices have huge trim limits, which can cause nasty latencies when deleting files. Enable the admin to configure the size down. We will look into having a more sane default instead of UINT_MAX sectors. - Improvement of the SGP gaps logic from Keith Busch. - Enable the block core to handle arbitrarily sized bios, which enables a nice simplification of bio_add_page() (which is an IO hot path). From Kent. - Improvements to the partition io stats accounting, making it faster. From Ming Lei. - Also from Ming Lei, a basic fixup for overflow of the sysfs pending file in blk-mq, as well as a fix for a blk-mq timeout race condition. - Ming Lin has been carrying Kents above mentioned patches forward for a while, and testing them. Ming also did a few fixes around that. - Sasha Levin found and fixed a use-after-free problem introduced by the bio->bi_error changes from Christoph. - Small blk cgroup cleanup from Viresh Kumar" * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits) blk: Fix bio_io_vec index when checking bvec gaps block: Replace SG_GAPS with new queue limits mask block: bump BLK_DEF_MAX_SECTORS to 2560 Revert "block: remove artifical max_hw_sectors cap" blk-mq: fix race between timeout and freeing request blk-mq: fix buffer overflow when reading sysfs file of 'pending' Documentation: update notes in biovecs about arbitrarily sized bios block: remove bio_get_nr_vecs() fs: use helper bio_add_page() instead of open coding on bi_io_vec block: kill merge_bvec_fn() completely md/raid5: get rid of bio_fits_rdev() md/raid5: split bio for chunk_aligned_read block: remove split code in blkdev_issue_{discard,write_same} btrfs: remove bio splitting and merge_bvec_fn() calls bcache: remove driver private bio splitting code block: simplify bio_add_page() block: make generic_make_request handle arbitrarily sized bios blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL) block: don't access bio->bi_error after bio_put() block: shrink struct bio down to 2 cache lines again ...	2015-09-02 13:10:25 -07:00
Martin K. Petersen	4f258a4634	sd: Fix maximum I/O size for BLOCK_PC requests Commit `bcdb247c6b` ("sd: Limit transfer length") clamped the maximum size of an I/O request to the MAXIMUM TRANSFER LENGTH field in the BLOCK LIMITS VPD. This had the unfortunate effect of also limiting the maximum size of non-filesystem requests sent to the device through sg/bsg. Avoid using blk_queue_max_hw_sectors() and set the max_sectors queue limit directly. Also update the comment in blk_limits_max_hw_sectors() to clarify that max_hw_sectors defines the limit for the I/O controller only. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reported-by: Brian King <brking@linux.vnet.ibm.com> Tested-by: Brian King <brking@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # 3.17+ Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-08-12 11:54:37 -07:00
Jens Axboe	2bb4cd5cc4	block: have drivers use blk_queue_max_discard_sectors() Some drivers use it now, others just set the limits field manually. But in preparation for splitting this into a hard and soft limit, ensure that they all call the proper function for setting the hw limit for discards. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-07-17 08:41:53 -06:00
Dan Carpenter	dee0586e15	sd: fix an error return in probe() If device_add() fails then it should return the error code but instead the current code returns success. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-05-25 08:46:24 -07:00
Mark Hounschell	74856fbf44	sd: Disable support for 256 byte/sector disks 256 bytes per sector support has been broken since 2.6.X, and no-one stepped up to fix this. So disable support for it. Signed-off-by: Mark Hounschell <dmarkh@cfl.rr.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-05-18 11:26:37 -07:00
Martin K. Petersen	e727c42bd5	sd: Unregister integrity profile The new integrity code did not correctly unregister the profile for SD disks. Call blk_integrity_unregister() when we release a disk. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il> Tested-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # v3.17+ Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-04-16 10:36:14 -07:00
James Bottomley	b9f28d8635	sd, mmc, virtio_blk, string_helpers: fix block size units The current string_get_size() overflows when the device size goes over 2^64 bytes because the string helper routine computes the suffix from the size in bytes. However, the entirety of Linux thinks in terms of blocks, not bytes, so this will artificially induce an overflow on very large devices. Fix this by making the function string_get_size() take blocks and the block size instead of bytes. This should allow us to keep working until the current SCSI standard overflows. Also fix virtio_blk and mmc (both of which were also artificially multiplying by the block size to pass a byte side to string_get_size()). The mathematics of this is pretty simple: we're taking a product of size in blocks (S) and block size (B) and trying to re-express this in exponential form: SB = RN^E (where N, the exponent is either 1000 or 1024) and R < N. Mathematically, S = RSN^ES and B=RBN^EB, so if RSRB < N it's easy to see that SB = RSRBN^(ES+EB). However, if RSBS > N, we can see that this can be re-expressed as RSBS = RN (where R = RSBS/N < N) so the whole exponent becomes R*N^(ES+EB+1) [jejb: fix incorrect 32 bit do_div spotted by kbuild test robot <fengguang.wu@intel.com>] Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-04-10 16:27:48 -07:00
Christoph Hellwig	3d9a1f530e	sd: don't grab a device references from driver methods The device model already takes care of races between ->remove and ->shutdown vs its other methods, and we now take care about locking them out for ->rescan as well. This is a partial revert of commit 39b7f1 ("[SCSI] sd: Fix refcounting"). Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2015-03-19 06:40:53 -07:00
Linus Torvalds	540a7c5061	SCSI misc on 20150209 This is the usual grab bag of driver updates (hpsa, storvsc, mp2sas, megaraid_sas, ses) plus an assortment of minor updates. There's also an update to ufs which adds new phy drivers and finally a new logging infrastructure for SCSI. Signed-off-by: James Bottomley <JBottomley@Parallels.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABAgAGBQJU2Ty5AAoJEDeqqVYsXL0M9rAH/1xNpAxXuxQq+dW5Z+uOaX60 5RRIu7/xA1HEfzkT5FTHrolmogDjVqawu4PZS66iHDeo05RBVUlbTA8qCK+MlRcN U6s0cLEw59eH3EaCfOGuYp/MnbhuV0eNxe0btmqJIQwuW3+gwZKGJdOq6LS2YasJ k/DyIBVmkJAVsN56vm9q2vbtcZp+Bg+ngqBS+SC4TF7vV1WCtFmS6yaUf62PYW3D +Irx37qHZntDR5wdw3dsuKDi5U8bl6myPjaVLnVJqg/WIF9RlCkjk5xpWT99AmVO NmtYQxLLBlAQ5K+sIlBUwxZe+8q1l+Aj4TTmJHAfFtyfp25s7JR9I6/QtOyC5Kw= =odol -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull first round of SCSI updates from James Bottomley: "This is the usual grab bag of driver updates (hpsa, storvsc, mp2sas, megaraid_sas, ses) plus an assortment of minor updates. There's also an update to ufs which adds new phy drivers and finally a new logging infrastructure for SCSI" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (114 commits) scsi_logging: return void for dev_printk() functions scsi: print single-character strings with seq_putc scsi: merge consecutive seq_puts calls scsi: replace seq_printf with seq_puts aha152x: replace seq_printf with seq_puts advansys: replace seq_printf with seq_puts scsi: remove SPRINTF macro sg: remove an unused variable hpsa: Use local workqueues instead of system workqueues hpsa: add in P840ar controller model name hpsa: add in gen9 controller model names hpsa: detect and report failures changing controller transport modes hpsa: shorten the wait for the CISS doorbell mode change ack hpsa: refactor duplicated scan completion code into a new routine hpsa: move SG descriptor set-up out of hpsa_scatter_gather() hpsa: do not use function pointers in fast path command submission hpsa: print CDBs instead of kernel virtual addresses for uncommon errors hpsa: do not use a void pointer for scsi_cmd field of struct CommandList hpsa: return failed from device reset/abort handlers hpsa: check for ctlr lockup after command allocation in main io path ...	2015-02-11 10:28:45 -08:00
Brian King	3a9794d329	sd: Fix max transfer length for 4k disks The following patch fixes an issue observed with 4k sector disks where the max_hw_sectors attribute was getting set too large in sd_revalidate_disk. Since sdkp->max_xfer_blocks is in units of SCSI logical blocks and queue_max_hw_sectors is in units of 512 byte blocks, on a 4k sector disk, every time we went through sd_revalidate_disk, we were taking the current value of queue_max_hw_sectors and increasing it by a factor of 8. Fix this by only shifting sdkp->max_xfer_blocks. Cc: stable@vger.kernel.org Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2015-02-02 13:46:29 +01:00
Hannes Reinecke	2104551969	scsi: use per-cpu buffer for formatting sense Convert sense buffer logging to use the per-cpu buffer to avoid line breakup. Tested-by: Robert Elliott <elliott@hp.com> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>	2015-01-09 15:44:30 +01:00
Martin K. Petersen	e461338b6c	sd: tweak discard heuristics to work around QEMU SCSI issue `7985090aa0` changed the discard heuristics to give preference to the WRITE SAME commands that (unlike UNMAP) guarantee deterministic results. Ming Lei discovered that QEMU SCSI's WRITE SAME implementation internally relied on limits that were only communicated for the UNMAP case. And therefore discard commands backed by WRITE SAME would fail. Tweak the heuristics so we still pick UNMAP in the LBPRZ=0 case and only prefer the WRITE SAME variants if the device has the LBPRZ flag set. Reported-by: Ming Lei <ming.lei@canonical.com> Tested-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-12-30 13:30:38 +01:00
Hannes Reinecke	eb846d9f14	scsi: rename SERVICE_ACTION_IN to SERVICE_ACTION_IN_16 SPC-3 defines SERVICE ACTION IN(12) and SERVICE ACTION IN(16). So rename SERVICE_ACTION_IN to SERVICE_ACTION_IN_16 to be consistent with SPC and to allow for better distinction. Signed-off-by: Hannes Reinecke <hare@suse.de> Tested-by: Robert Elliott <elliott@hp.com> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-11-24 20:01:40 +01:00
Christoph Hellwig	3af6b35261	scsi: remove scsi_driver owner field The driver core driver structure has grown an owner field and now requires it to be set for all modular drivers. Set it up for all scsi_driver instances and get rid of the now superflous scsi_driver owner field. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Shane M Seymour <shane.seymour@hp.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-11-24 20:01:28 +01:00
Christoph Hellwig	3c356bde19	scsi: stop passing a gfp_mask argument down the command setup path There is no reason for ULDs to pass in a flag on how to allocate the S/G lists. While we don't need GFP_ATOMIC for the blk-mq case because we don't hold locks, that decision can be made way down the chain without having to pass a pointless gfp_mask argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-11-24 19:56:55 +01:00
Martin K. Petersen	7985090aa0	sd: disable discard_zeroes_data for UNMAP The T10 SBC UNMAP command does not provide any hard guarantees that blocks will return zeroes on a subsequent READ. This is due to the fact that the device server is free to silently ignore all or parts of the request. The only way to ensure that a block consistently returns zeroes after being unmapped is to use WRITE SAME with the UNMAP bit set. Should the device be unable to unmap one or more blocks described by the command it is required to manually write zeroes to them. Until now we have preferred UNMAP over the WRITE SAME variants to accommodate thinly provisioned devices that predated the final SBC-3 spec. This patch changes the heuristic so that we favor WRITE SAME(16) or (10) over UNMAP if these commands are marked as supported in the Logical Block Provisioning VPD page. The patch also disables discard_zeroes_data for devices operating in UNMAP mode. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-11-12 11:19:14 +01:00
Christoph Hellwig	21a9d4c9d6	sd: fix up ->compat_ioctl No need to verify the passthrough ioctls, the real handler will take care of that. Also make sure not to block for resets on O_NONBLOCK fds. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-11-12 11:16:11 +01:00
Christoph Hellwig	906d15fbd2	scsi: split scsi_nonblockable_ioctl The calling conventions for this function are bad as it could return -ENODEV both for a device not currently online and a not recognized ioctl. Add a new scsi_ioctl_block_when_processing_errors function that wraps scsi_block_when_processing_errors with the a special case for the SG_SCSI_RESET ioctl command, and handle the SG_SCSI_RESET case itself in scsi_ioctl. All callers of scsi_ioctl now must call the above helper to check for the EH state, so that the ioctl handler itself doesn't have to. Reported-by: Robert Elliott <Elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-11-12 11:16:11 +01:00
Hannes Reinecke	ef61329db7	scsi: remove scsi_show_result() Open-code scsi_print_result in sd.c, and cleanup logging to not print duplicate informations. Also remove the call to scsi_show_result() in ufshcd.c to be consistent with other callers of scsi_execute(). With that we can remove scsi_show_result in constants.c Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-11-12 11:16:05 +01:00
Hannes Reinecke	d811b848eb	scsi: use sdev as argument for sense code printing We should be using the standard dev_printk() variants for sense code printing. [hch: remove __scsi_print_sense call in xen-scsiback, Acked by Juergen] [hch: folded bracing fix from Dan Carpenter] Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-11-12 11:15:58 +01:00
Hannes Reinecke	ad3819c09b	sd: remove scsi_print_sense() in sd_done() sd_done() was calling scsi_print_sense() for a sense code of 'NO_SENSE'. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-11-12 11:15:56 +01:00
Linus Torvalds	e75437fb93	Merge branch 'for-3.18/drivers' of git://git.kernel.dk/linux-block Pull block layer driver update from Jens Axboe: "This is the block driver pull request for 3.18. Not a lot in there this round, and nothing earth shattering. - A round of drbd fixes from the linbit team, and an improvement in asender performance. - Removal of deprecated (and unused) IRQF_DISABLED flag in rsxx and hd from Michael Opdenacker. - Disable entropy collection from flash devices by default, from Mike Snitzer. - A small collection of xen blkfront/back fixes from Roger Pau Monné and Vitaly Kuznetsov" * 'for-3.18/drivers' of git://git.kernel.dk/linux-block: block: disable entropy contributions for nonrot devices xen, blkfront: factor out flush-related checks from do_blkif_request() xen-blkback: fix leak on grant map error path xen/blkback: unmap all persistent grants when frontend gets disconnected rsxx: Remove deprecated IRQF_DISABLED block: hd: remove deprecated IRQF_DISABLED drbd: use RB_DECLARE_CALLBACKS() to define augment callbacks drbd: compute the end before rb_insert_augmented() drbd: Add missing newline in resync progress display in /proc/drbd drbd: reduce lock contention in drbd_worker drbd: Improve asender performance drbd: Get rid of the WORK_PENDING macro drbd: Get rid of the __no_warn and __cond_lock macros drbd: Avoid inconsistent locking warning drbd: Remove superfluous newline from "resync_extents" debugfs entry. drbd: Use consistent names for all the bi_end_io callbacks drbd: Use better variable names	2014-10-18 12:12:45 -07:00
Linus Torvalds	d3dc366bba	Merge branch 'for-3.18/core' of git://git.kernel.dk/linux-block Pull core block layer changes from Jens Axboe: "This is the core block IO pull request for 3.18. Apart from the new and improved flush machinery for blk-mq, this is all mostly bug fixes and cleanups. - blk-mq timeout updates and fixes from Christoph. - Removal of REQ_END, also from Christoph. We pass it through the ->queue_rq() hook for blk-mq instead, freeing up one of the request bits. The space was overly tight on 32-bit, so Martin also killed REQ_KERNEL since it's no longer used. - blk integrity updates and fixes from Martin and Gu Zheng. - Update to the flush machinery for blk-mq from Ming Lei. Now we have a per hardware context flush request, which both cleans up the code should scale better for flush intensive workloads on blk-mq. - Improve the error printing, from Rob Elliott. - Backing device improvements and cleanups from Tejun. - Fixup of a misplaced rq_complete() tracepoint from Hannes. - Make blk_get_request() return error pointers, fixing up issues where we NULL deref when a device goes bad or missing. From Joe Lawrence. - Prep work for drastically reducing the memory consumption of dm devices from Junichi Nomura. This allows creating clone bio sets without preallocating a lot of memory. - Fix a blk-mq hang on certain combinations of queue depths and hardware queues from me. - Limit memory consumption for blk-mq devices for crash dump scenarios and drivers that use crazy high depths (certain SCSI shared tag setups). We now just use a single queue and limited depth for that" * 'for-3.18/core' of git://git.kernel.dk/linux-block: (58 commits) block: Remove REQ_KERNEL blk-mq: allocate cpumask on the home node bio-integrity: remove the needless fail handle of bip_slab creating block: include func name in __get_request prints block: make blk_update_request print prefix match ratelimited prefix blk-merge: don't compute bi_phys_segments from bi_vcnt for cloned bio block: fix alignment_offset math that assumes io_min is a power-of-2 blk-mq: Make bt_clear_tag() easier to read blk-mq: fix potential hang if rolling wakeup depth is too high block: add bioset_create_nobvec() block: use bio_clone_fast() in blk_rq_prep_clone() block: misplaced rq_complete tracepoint sd: Honor block layer integrity handling flags block: Replace strnicmp with strncasecmp block: Add T10 Protection Information functions block: Don't merge requests if integrity flags differ block: Integrity checksum flag block: Relocate bio integrity flags block: Add a disk flag to block integrity profile block: Add prefix to block integrity profile flags ...	2014-10-18 11:53:51 -07:00
Mike Snitzer	b277da0a8a	block: disable entropy contributions for nonrot devices Clear QUEUE_FLAG_ADD_RANDOM in all block drivers that set QUEUE_FLAG_NONROT. Historically, all block devices have automatically made entropy contributions. But as previously stated in commit `e2e1a148` ("block: add sysfs knob for turning off disk entropy contributions"): - On SSD disks, the completion times aren't as random as they are for rotational drives. So it's questionable whether they should contribute to the random pool in the first place. - Calling add_disk_randomness() has a lot of overhead. There are more reliable sources for randomness than non-rotational block devices. From a security perspective it is better to err on the side of caution than to allow entropy contributions from unreliable "random" sources. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-10-04 10:55:32 -06:00
Martin K. Petersen	c611529e7c	sd: Honor block layer integrity handling flags A set of flags introduced in the block layer enable better control over how protection information is handled. These flags are useful for both error injection and data recovery purposes. Checking can be enabled and disabled for controller and disk, and the guard tag format is now a per-I/O property. Update sd_protect_op to communicate the relevant information to the low-level device driver via a set of flags in scsi_cmnd. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-09-30 15:17:35 -06:00
Subhash Jadavani	6fe8c1dbef	scsi: balance out autopm get/put calls in scsi_sysfs_add_sdev() SCSI Well-known logical units generally don't have any scsi driver associated with it which means no one will call scsi_autopm_put_device() on these wlun scsi devices and this would result in keeping the corresponding scsi device always active (hence LLD can't be suspended as well). Same exact problem can be seen for other scsi device representing normal logical unit whose driver is yet to be loaded. This patch fixes the above problem with this approach: - make the scsi_autopm_put_device call at the end of scsi_sysfs_add_sdev to make it balance out the get earlier in the function. - let drivers do paired get/put calls in their probe methods. Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Dolev Raviv <draviv@codeaurora.org> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-09-15 16:02:05 -07:00
Sujit Reddy Thumma	2eefd57b97	sd: Avoid sending medium write commands if device is write protected The SYNCHRONIZE_CACHE command is a medium write command and hence can fail when the device is write protected. Avoid sending such commands by making sure that write-cache-enable is disabled even though the device claim to support it. Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org> Signed-off-by: Dolev Raviv <draviv@codeaurora.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Venkatesh Srinivas <venkateshs@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-09-15 16:01:57 -07:00
K. Y. Srinivasan	26b9fd8b34	sd: fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout Commit ID: `7e660100d8` added code to derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this patch did not use the basic I/O timeout of the device. Fix this bug. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-25 17:16:42 -04:00
Martin K. Petersen	c1d40a527e	scsi: add a blacklist flag which enables VPD page inquiries Despite supporting modern SCSI features some storage devices continue to claim conformance to an older version of the SPC spec. This is done for compatibility with legacy operating systems. Linux by default will not attempt to read VPD pages on devices that claim SPC-2 or older. Introduce a blacklist flag that can be used to trigger VPD page inquiries on devices that are known to support them. Reported-by: KY Srinivasan <kys@microsoft.com> Tested-by: KY Srinivasan <kys@microsoft.com> Reviewed-by: KY Srinivasan <kys@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> CC: <stable@vger.kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-25 17:16:41 -04:00
Christoph Hellwig	fd2eb9034e	scsi: move the writeable field from struct scsi_device to struct scsi_cd We currently set the field in common code based on the device type, but then only use it in the cdrom driver which also overrides the value previously set in the generic code. Just leave this entirely to the CDROM driver to make everyones life simpler. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>	2014-07-25 17:16:41 -04:00
Christoph Hellwig	87949eee7e	sd: split sd_init_command Factor out a function to initialize regular read/write commands and leave sd_init_command as a simple dispatcher to the different prepare routines. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:16:29 +02:00
Christoph Hellwig	e4200f8ee3	sd: retry discard commands Currently cmd->allowed is initialized from rq->retries for discard commands, but retries is always 0 for non-BLOCK_PC requests. Set it to the standard number of retries instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:16:28 +02:00
Christoph Hellwig	a25ee54851	sd: retry write same commands Currently cmd->allowed is initialized from rq->retries for write same commands, but retries is always 0 for non-BLOCK_PC requests. Set it to the standard number of retries instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:16:27 +02:00
Christoph Hellwig	6a7b43985d	sd: don't use scsi_setup_blk_pc_cmnd for discard requests Simplify handling of discard requests by setting up the command directly instead of initializing request fields and then calling scsi_setup_blk_pc_cmnd to propagate the information into the command. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:16:26 +02:00
Christoph Hellwig	59b1134c5a	sd: don't use scsi_setup_blk_pc_cmnd for write same requests Simplify handling of write same requests by setting up the command directly instead of initializing request fields and then calling scsi_setup_blk_pc_cmnd to propagate the information into the command. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Webb Scales <webbnh@hp.com>	2014-07-17 22:12:47 +02:00
Christoph Hellwig	a118c6c1d9	sd: don't use scsi_setup_blk_pc_cmnd for flush requests Simplify handling of flush requests by setting up the command directly instead of initializing request fields and then calling scsi_setup_blk_pc_cmnd to propagate the information into the command. Also rename scsi_setup_flush_cmnd to sd_setup_flush_cmnd for consistency. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:12:33 +02:00
Christoph Hellwig	5158a899d8	scsi: set sc_data_direction in common code The data direction fiel in the SCSI command is derived only from the block request structure. Move setting it up into common code instead of duplicating it in the ULDs. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:11:41 +02:00
Christoph Hellwig	3868cf8ea7	scsi: restructure command initialization for TYPE_FS requests We should call the device handler prep_fn for all TYPE_FS requests, not just simple read/write calls that are handled by the disk driver. Restructure the common I/O code to call the prep_fn handler and zero out the CDB, and just leave the call to scsi_init_io to the ULDs. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:11:27 +02:00
Martin K. Petersen	bcdb247c6b	sd: Limit transfer length Until now the per-command transfer length has exclusively been gated by the max_sectors parameter in the scsi_host template. Given that the size of this parameter has been bumped to an unsigned int we have to be careful not to exceed the target device's capabilities. If the if the device specifies a Maximum Transfer Length in the Block Limits VPD we'll use that value. Otherwise we'll use 0xffffffff for devices that have use_16_for_rw set and 0xffff for the rest. We then combine the chosen disk limit with max_sectors in the host template. The smaller of the two will be used to set the max_hw_sectors queue limit. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:33 +02:00
Clément Calmels	8d964478b2	sd: bad return code of init_sd In init_sd function, if kmem_cache_create or mempool_create_slab_pools calls fail, the error will not be correclty reported because class_register previously set the value of err to 0. Signed-off-by: Clément Calmels <clement.calmels@free.fr> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:32 +02:00
Vaughan Cao	cb2fb68d06	sd: notify block layer when using temporary change to cache_type This is a fix for commit `39c60a0948` "sd: fix array cache flushing bug causing performance problems" We must notify the block layer via q->flush_flags after a temporary change of the cache_type to write through. Without this, a SYNCHRONIZE CACHE command will still be generated. This patch factors out a helper that can be called from sd_revalidate_disk and cache_type_store. Signed-off-by: Vaughan Cao <vaughan.cao@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:31 +02:00
Akinobu Mita	e430cbc8bb	sd: use READ_16 or WRITE_16 when transfer length is greater than 0xffff This change makes the scsi disk driver handle the requests whose transfer length is greater than 0xffff with READ_16 or WRITE_16. However, this is a preparation for extending the data type of max_sectors in struct Scsi_Host and scsi_host_template. So, it is impossible to happen this condition for now, because SCSI low-level drivers can not specify max_sectors greater than 0xffff due to the data type limitation. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:30 +02:00
Alan Stern	b14bf2d0c0	usb-storage/SCSI: Add broken_fua blacklist flag Some buggy JMicron USB-ATA bridges don't know how to translate the FUA bit in READs or WRITEs. This patch adds an entry in unusual_devs.h and a blacklist flag to tell the sd driver not to use FUA. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Michael Büsch <m@bues.ch> Tested-by: Michael Büsch <m@bues.ch> Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net> CC: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 22:47:18 -07:00
Linus Torvalds	1c54fc1efe	SCSI for-linus on 20140609 This patch consists of the usual driver updates (qla2xxx, qla4xxx, lpfc, be2iscsi, fnic, ufs, NCR5380) The NCR5380 is the addition to maintained status of a long neglected driver for older hardware. In addition there are a lot of minor fixes and cleanups and some more updates to make scsi mq ready. Signed-off-by: James Bottomley <JBottomley@Parallels.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJTlcwiAAoJEDeqqVYsXL0M5qsIALzVPLd4yxA16zCiaPQUeIV5 mfYmwISFlN+qW3AcUeSH4D13YgegCjEBfqaDMWvIkgouxLy/7jpxtChutq3MCzUE cDT1B9+ZrzoqBISRNHEh/gx5F1MOF2VPuqG2pe0J90wyRCNzJscB6PbtWMAo86CA 2eu7wq3K9FXxCC1qY0PzwBLXHqUcgk5GWiK9CM/k4W0NiTVeNmwPeh5i91IQnBHx E2l7NAXgNLyCf5tyeswvZ4pW0T3hlaswNmBB4qC8oJm4U6UqMN+tk4ML63Pz7uPe 4mlHG0uI8Vbdi13iv1EDUZ9Vo8iqVrzP2UAhakgP9poKSGqE4d/MD0EKNGQB2Es= =cBY8 -----END PGP SIGNATURE----- Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This patch consists of the usual driver updates (qla2xxx, qla4xxx, lpfc, be2iscsi, fnic, ufs, NCR5380) The NCR5380 is the addition to maintained status of a long neglected driver for older hardware. In addition there are a lot of minor fixes and cleanups and some more updates to make scsi mq ready" * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (130 commits) include/scsi/osd_protocol.h: remove unnecessary __constant mvsas: Recognise device/subsystem 9485/9485 as 88SE9485 Revert "be2iscsi: Fix processing cqe for cxn whose endpoint is freed" mptfusion: fix msgContext in mptctl_hp_hostinfo acornscsi: remove linked command support scsi/NCR5380: dprintk macro fusion: Remove use of DEF_SCSI_QCMD fusion: Add free msg frames to the head, not tail of list mpt2sas: Add free smids to the head, not tail of list mpt2sas: Remove use of DEF_SCSI_QCMD mpt2sas: Remove uses of serial_number mpt3sas: Remove use of DEF_SCSI_QCMD mpt3sas: Remove uses of serial_number qla2xxx: Use kmemdup instead of kmalloc + memcpy qla4xxx: Use kmemdup instead of kmalloc + memcpy qla2xxx: fix incorrect debug printk be2iscsi: Bump the driver version be2iscsi: Fix processing cqe for cxn whose endpoint is freed be2iscsi: Fix destroy MCC-CQ before MCC-EQ is destroyed be2iscsi: Fix memory corruption in MBX path ...	2014-06-09 18:54:06 -07:00
David Jeffery	2a863ba8f6	sd: medium access timeout counter fails to reset There is an error with the medium access timeout feature of the sd driver. The sdkp->medium_access_timed_out value is reset to zero in sd_done() in the wrong place. Currently it is reset to zero only when a command returns sense data. This can result in cases where the medium access check falsely triggers from timed out commands which are hours or days apart. For example, an I/O command times out and is aborted. It then retries and succeeds. But with no sense data generated and returned, the medium_access_timed_out value is not reset. If no sd command returns sense data, then the next command to time out (however far in time from the first failure) will trigger the medium access timeout and put the device offline. The resetting of sdkp->medium_access_timed_out should occur before the check for sense data. To reproduce using scsi_debug, use SCSI_DEBUG_OPT_TIMEOUT or SCSI_DEBUG_OPT_MAC_TIMEOUT to force an I/O command to timeout. Then, remove the opt value so the I/O will succeed on retry. Perform more I/O as desired. Finally, repeat the process to make a new I/O command time out. Without the patch, the device will be marked offline even though many I/O commands have succeeded between the 2 instances of timed out commands. Signed-off-by: David Jeffery <djeffery@redhat.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-05-19 12:54:21 +02:00
Christoph Hellwig	a1b73fc194	scsi: reintroduce scsi_driver.init_command Instead of letting the ULD play games with the prep_fn move back to the model of a central prep_fn with a callback to the ULD. This already cleans up and shortens the code by itself, and will be required to properly support blk-mq in the SCSI midlayer. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-05-19 12:35:09 +02:00
Jens Axboe	dc4a93078b	sd/skd: stuff discard page in request->completion_data Store the pointer to the page there, so we can always safely reference it from end_io context where ->bio may have been cleared. Signed-off-by: Jens Axboe <axboe@fb.com>	2014-04-16 21:37:30 -06:00
Jens Axboe	b4f42e2831	block: remove struct request buffer member This was used in the olden days, back when onions were proper yellow. Basically it mapped to the current buffer to be transferred. With highmem being added more than a decade ago, most drivers map pages out of a bio, and rq->buffer isn't pointing at anything valid. Convert old style drivers to just use bio_data(). For the discard payload use case, just reference the page in the bio. Signed-off-by: Jens Axboe <axboe@fb.com>	2014-04-15 14:03:02 -06:00
Linus Torvalds	b7e70ca9c7	Merge branch 'async-scsi-resume' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/isci Pull async SCSI resume support from Dan Williams: "Allow disks and other devices to resume in parallel. This provides a tangible speed up for a non-esoteric use case (laptop resume): https://01.org/suspendresume/blogs/tebrandt/2013/hard-disk-resume-optimization-simpler-approach" * 'async-scsi-resume' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/isci: scsi: async sd resume	2014-04-11 17:23:52 -07:00
Dan Williams	3c31b52f96	scsi: async sd resume async_schedule() sd resume work to allow disks and other devices to resume in parallel. This moves the entirety of scsi_device resume to an async context to ensure that scsi_device_resume() remains ordered with respect to the completion of the start/stop command. For the duration of the resume, new command submissions (that do not originate from the scsi-core) will be deferred (BLKPREP_DEFER). It adds a new ASYNC_DOMAIN_EXCLUSIVE(scsi_sd_pm_domain) as a container of these operations. Like scsi_sd_probe_domain it is flushed at sd_remove() time to ensure async ops do not continue past the end-of-life of the sdev. The implementation explicitly refrains from reusing scsi_sd_probe_domain directly for this purpose as it is flushed at the end of dpm_resume(), potentially defeating some of the benefit. Given sdevs are quiesced it is permissible for these resume operations to bleed past the async_synchronize_full() calls made by the driver core. We defer the resolution of which pm callback to call until scsi_dev_type_{suspend\|resume} time and guarantee that the callback parameter is never NULL. With this in place the type of resume operation is encoded in the async function identifier. There is a concern that async resume could trigger PSU overload. In the enterprise, storage enclosures enforce staggered spin-up regardless of what the kernel does making async scanning safe by default. Outside of that context a user can disable asynchronous scanning via a kernel command line or CONFIG_SCSI_SCAN_ASYNC. Honor that setting when deciding whether to do resume asynchronously. Inspired by Todd's analysis and initial proposal [2]: https://01.org/suspendresume/blogs/tebrandt/2013/hard-disk-resume-optimization-simpler-approach Cc: Len Brown <len.brown@intel.com> Cc: Phillip Susi <psusi@ubuntu.com> [alan: bug fix and clean up suggestion] Acked-by: Alan Stern <stern@rowland.harvard.edu> Suggested-by: Todd Brandt <todd.e.brandt@linux.intel.com> [djbw: kick all resume work to the async queue] Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2014-04-10 15:30:35 -07:00
Martin K. Petersen	b2bff6ceb6	[SCSI] sd: Quiesce mode sense error messages Messages about discovered disk properties are only printed once unless they are found to have changed. Errors encountered during mode sense, however, are printed every time we revalidate. Quiesce mode sense errors so they are only printed during the first scan. [jejb: checkpatch fixes] Bugzilla: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=733565 Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-03-27 08:26:33 -07:00
Alan Stern	7aae51347b	[SCSI] sd: don't fail if the device doesn't recognize SYNCHRONIZE CACHE Evidently some wacky USB-ATA bridges don't recognize the SYNCHRONIZE CACHE command, as shown in this email thread: http://marc.info/?t=138978356200002&r=1&w=2 The fact that we can't tell them to drain their caches shouldn't prevent the system from going into suspend. Therefore sd_sync_cache() shouldn't return an error if the device replies with an Invalid Command ASC. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Sven Neumann <s.neumann@raumfeld.com> Tested-by: Daniel Mack <zonque@gmail.com> CC: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-03-15 10:19:22 -07:00
Linus Torvalds	f568849eda	Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block Pull core block IO changes from Jens Axboe: "The major piece in here is the immutable bio_ve series from Kent, the rest is fairly minor. It was supposed to go in last round, but various issues pushed it to this release instead. The pull request contains: - Various smaller blk-mq fixes from different folks. Nothing major here, just minor fixes and cleanups. - Fix for a memory leak in the error path in the block ioctl code from Christian Engelmayer. - Header export fix from CaiZhiyong. - Finally the immutable biovec changes from Kent Overstreet. This enables some nice future work on making arbitrarily sized bios possible, and splitting more efficient. Related fixes to immutable bio_vecs: - dm-cache immutable fixup from Mike Snitzer. - btrfs immutable fixup from Muthu Kumar. - bio-integrity fix from Nic Bellinger, which is also going to stable" * 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits) xtensa: fixup simdisk driver to work with immutable bio_vecs block/blk-mq-cpu.c: use hotcpu_notifier() blk-mq: for_each_* macro correctness block: Fix memory leak in rw_copy_check_uvector() handling bio-integrity: Fix bio_integrity_verify segment start bug block: remove unrelated header files and export symbol blk-mq: uses page->list incorrectly blk-mq: use __smp_call_function_single directly btrfs: fix missing increment of bi_remaining Revert "block: Warn and free bio if bi_end_io is not set" block: Warn and free bio if bi_end_io is not set blk-mq: fix initializing request's start time block: blk-mq: don't export blk_mq_free_queue() block: blk-mq: make blk_sync_queue support mq block: blk-mq: support draining mq queue dm cache: increment bi_remaining when bi_end_io is restored block: fixup for generic bio chaining block: Really silence spurious compiler warnings block: Silence spurious compiler warnings block: Kill bio_pair_split() ...	2014-01-30 11:19:05 -08:00
Jens Axboe	b28bc9b38c	Linux 3.13-rc6 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJSwLfoAAoJEHm+PkMAQRiGi6QH/1U1B7lmHChDTw3jj1lfm9gA 189Si4QJlnxFWCKHvKEL+pcaVuACU+aMGI8+KyMYK4/JfuWVjjj5fr/SvyHH2/8m LdSK8aHMhJ46uBS4WJ/l6v46qQa5e2vn8RKSBAyKm/h4vpt+hd6zJdoFrFai4th7 k/TAwOAEHI5uzexUChwLlUBRTvbq4U8QUvDu+DeifC8cT63CGaaJ4qVzjOZrx1an eP6UXZrKDASZs7RU950i7xnFVDQu4PsjlZi25udsbeiKcZJgPqGgXz5ULf8ZH8RQ YCi1JOnTJRGGjyIOyLj7pyB01h7XiSM2+eMQ0S7g54F2s7gCJ58c2UwQX45vRWU= =/4/R -----END PGP SIGNATURE----- Merge tag 'v3.13-rc6' into for-3.14/core Needed to bring blk-mq uptodate, since changes have been going in since for-3.14/core was established. Fixup merge issues related to the immutable biovec changes. Signed-off-by: Jens Axboe <axboe@kernel.dk> Conflicts: block/blk-flush.c fs/btrfs/check-integrity.c fs/btrfs/extent_io.c fs/btrfs/scrub.c fs/logfs/dev_bdev.c	2013-12-31 09:51:02 -07:00
Geert Uytterhoeven	ef80d1e18b	[SCSI] sd: Do not call do_div() with a 64-bit divisor do_div() is meant for divisions of 64-bit number by 32-bit numbers. Passing 64-bit divisor types caused issues in the past on 32-bit platforms, cfr. commit `ea077b1b96` ("m68k: Truncate base in do_div()"). As scsi_device.sector_size is unsigned (int), factor should be unsigned int, too. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-12-19 07:39:03 -08:00
James Bottomley	2451079bc2	[SCSI] Fix erratic device offline during EH Commit `18a4d0a22e` (Handle disk devices which can not process medium access commands) was introduced to offline any device which cannot process medium access commands. However, commit `3eef6257de` (Reduce error recovery time by reducing use of TURs) reduced the number of TURs by sending it only on the first failing command, which might or might not be a medium access command. So in combination this results in an erratic device offlining during EH; if the command where the TUR was sent upon happens to be a medium access command the device will be set offline, if not everything proceeds as normal. This patch moves the check to the final test, eliminating this problem. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-12-19 07:39:02 -08:00
Martin K. Petersen	54b2b50c20	[SCSI] Disable WRITE SAME for RAID and virtual host adapter drivers Some host adapters do not pass commands through to the target disk directly. Instead they provide an emulated target which may or may not accurately report its capabilities. In some cases the physical device characteristics are reported even when the host adapter is processing commands on the device's behalf. This can lead to adapter firmware hangs or excessive I/O errors. This patch disables WRITE SAME for devices connected to host adapters that provide an emulated target. Driver writers can disable WRITE SAME by setting the no_write_same flag in the host adapter template. [jejb: fix up rejections due to eh_deadline patch] Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-11-29 08:48:39 +04:00
Kent Overstreet	a4ad39b1d1	block: Convert bio_iovec() to bvec_iter For immutable biovecs, we'll be introducing a new bio_iovec() that uses our new bvec iterator to construct a biovec, taking into account bvec_iter->bi_bvec_done - this patch updates existing users for the new usage. Some of the existing users really do need a pointer into the bvec array - those uses are all going to be removed, but we'll need the functionality from immutable to remove them - so for now rename the existing bio_iovec() -> __bio_iovec(), and it'll be removed in a couple patches. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: "Ed L. Cashin" <ecashin@coraid.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: dm-devel@redhat.com Cc: "James E.J. Bottomley" <JBottomley@parallels.com>	2013-11-23 22:33:49 -08:00
Linus Torvalds	0d522ee749	SCSI for-linus on 20131110 This patch set is driver updates for qla4xxx, scsi_debug, pm80xx, fcoe/libfc, eas2r, lpfc, be2iscsi and megaraid_sas plus some assorted bug fixes and cleanups. Signed-off-by: James Bottomley <JBottomley@Parallels.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQEcBAABAgAGBQJSfw30AAoJEDeqqVYsXL0MZPEIAK6GBHFw8JsU3NQ4SbM5hzdM ywPryTn7DO9jyj0J04i6TNtbS6om9E8tjLyr3SnmTQNiTDXGv44rIEfJyHR9ko2n E2hRu4xaGEWK4dkuktQuOqj2fuXRyeXr2maYIXjkmFI0hesLqozYKgLAeWTHvabE 2HICwG/lfCzesqVl69Y3V8n1vZvtJqAls6liwY09i9eSDRe39DynRn7bjLXzkPkc ynjJYl22CIZ7nb+PgzqQ+xEUIdXqGG890CvGaqg7+x3ZUmmOtfECaDUkCjVeiiE6 sy72V6E4ET/YMrkhRmIUKyZxGbl/tMxYPuGaBhq2fSNRx8x1R+Ajfh9UM2AZTh4= =hCHG -----END PGP SIGNATURE----- Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull first round of SCSI updates from James Bottomley: "This patch set is driver updates for qla4xxx, scsi_debug, pm80xx, fcoe/libfc, eas2r, lpfc, be2iscsi and megaraid_sas plus some assorted bug fixes and cleanups" * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (106 commits) [SCSI] scsi_error: Escalate to LUN reset if abort fails [SCSI] Add 'eh_deadline' to limit SCSI EH runtime [SCSI] remove check for 'resetting' [SCSI] dc395: Move 'last_reset' into internal host structure [SCSI] tmscsim: Move 'last_reset' into host structure [SCSI] advansys: Remove 'last_reset' references [SCSI] dpt_i2o: return SCSI_MLQUEUE_HOST_BUSY when in reset [SCSI] dpt_i2o: Remove DPTI_STATE_IOCTL [SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path [SCSI] lpfc: Fix typo on NULL assignment [SCSI] scsi_dh_alua: ALUA handler attach should succeed while TPG is transitioning [SCSI] scsi_dh_alua: ALUA check sense should retry device internal reset unit attention [SCSI] esas2r: Cleanup snprinf formatting of firmware version [SCSI] esas2r: Remove superfluous mask of pcie_cap_reg [SCSI] esas2r: Fixes for big-endian platforms [SCSI] esas2r: Directly call kernel functions for atomic bit operations [SCSI] lpfc 8.3.43: Update lpfc version to driver version 8.3.43 [SCSI] lpfc 8.3.43: Fixed not processing task management IOCB response status [SCSI] lpfc 8.3.43: Fixed spinlock hang. [SCSI] lpfc 8.3.43: Fixed invalid Total_Data_Placed value received for els and ct command responses ...	2013-11-14 12:25:38 +09:00
Jens Axboe	e37459b8e2	Merge branch 'blk-mq/core' into for-3.13/core Signed-off-by: Jens Axboe <axboe@kernel.dk> Conflicts: block/blk-timeout.c	2013-11-08 09:08:12 -07:00
Jens Axboe	5953316dbf	block: make rq->cmd_flags be 64-bit We have officially run out of flags in a 32-bit space. Extend it to 64-bit even on 32-bit archs. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-10-25 11:55:59 +01:00
James Bottomley	7e660100d8	[SCSI] Derive the FLUSH_TIMEOUT from the basic I/O timeout Rather than having a separate constant for specifying the timeout on FLUSH operations, use the basic I/O timeout value that is already configurable on a per target basis to derive the FLUSH timeout. Looking at the current definitions of these timeout values, the FLUSH operation is supposed to have a value that is twice the normal timeout value. This patch preserves this relationship while leveraging the flexibility of specifying the I/O timeout. Based on a prior patch by KY Srinivasan <kys@microsoft.com> Reviewed-by: KY Srinivasan <kys@microsoft.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-10-25 09:58:16 +01:00
Oliver Neukum	95897910a5	[SCSI] sd: Add error handling during flushing caches It makes no sense to flush the cache of a device without medium. Errors during suspend must be handled according to their causes. Errors due to missing media or unplugged devices must be ignored. Errors due to devices being offlined must also be ignored. The error returns must be modified so that the generic layer understands them. [jejb: fix up whitespace and other formatting problems] Signed-off-by: Oliver Neukum <oneukum@suse.de> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-10-25 09:58:13 +01:00
Bernd Schubert	af73623f5f	[SCSI] sd: Reduce buffer size for vpd request Somehow older areca firmware versions have issues with scsi_get_vpd_page() and a large buffer, the firmware seems to crash and the scsi error-handler will start endless recovery retries. Limiting the buf-size to 64-bytes fixes this issue with older firmware versions (<1.49 for my controller). Fixes a regression with areca controllers and older firmware versions introduced by commit: `66c28f9712` Reported-by: Nix <nix@esperi.org.uk> Tested-by: Nix <nix@esperi.org.uk> Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Cc: stable@vger.kernel.org # delay inclusion for 2 months for testing Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-10-25 09:58:12 +01:00
Aaron Lu	10c580e423	[SCSI] sd: call blk_pm_runtime_init before add_disk Sujit has found a race condition that would make q->nr_pending unbalanced, it occurs as Sujit explained: " sd_probe_async() -> add_disk() -> disk_add_event() -> schedule(disk_events_workfn) sd_revalidate_disk() blk_pm_runtime_init() return; Let's say the disk_events_workfn() calls sd_check_events() which tries to send test_unit_ready() and because of sd_revalidate_disk() trying to send another commands the test_unit_ready() might be re-queued as the tagged command queuing is disabled. So the race condition is - Thread 1 \| Thread 2 sd_revalidate_disk() \| sd_check_events() ...nr_pending = 0 as q->dev = NULL\| scsi_queue_insert() blk_runtime_pm_init() \| blk_pm_requeue_request() -> \| nr_pending = -1 since \| q->dev != NULL " The problem is, the test_unit_ready request doesn't get counted the first time it is queued, so the later decrement of q->nr_pending in blk_pm_requeue_request makes it unbalanced. Fix this by calling blk_pm_runtime_init before add_disk so that all requests initiated there will all be counted. Signed-off-by: Aaron Lu <aaron.lu@intel.com> Reported-and-tested-by: Sujit Reddy Thumma <sthumma@codeaurora.org> Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-10-23 14:09:18 +01:00
Alan Stern	984f1733fc	[SCSI] sd: Fix potential out-of-bounds access This patch fixes an out-of-bounds error in sd_read_cache_type(), found by Google's AddressSanitizer tool. When the loop ends, we know that "offset" lies beyond the end of the data in the buffer, so no Caching mode page was found. In theory it may be present, but the buffer size is limited to 512 bytes. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Dmitry Vyukov <dvyukov@google.com> CC: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-09-11 09:44:00 -07:00
Greg Kroah-Hartman	e1ea2351fb	[SCSI] sd: convert class code to use dev_groups The dev_attrs field of struct class is going away soon, dev_groups should be used instead. This converts the scsi disk class code to use the correct field. It required some functions to be moved around to place the show and store functions next to each other, the old order seemed to make no sense at all. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-08-21 10:10:48 -07:00
Ewan D. Milne	085b513f97	[SCSI] sd: fix crash when UA received on DIF enabled device sd_prep_fn will allocate a larger CDB for the command via mempool_alloc for devices using DIF type 2 protection. This CDB was being freed in sd_done, which results in a kernel crash if the command is retried due to a UNIT ATTENTION. This change moves the code to free the larger CDB into sd_unprep_fn instead, which is invoked after the request is complete. It is no longer necessary to call scsi_print_command separately for this case as the ->cmnd will no longer be NULL in the normal code path. Also removed conditional test for DIF type 2 when freeing the larger CDB because the protection_type could have been changed via sysfs while the command was executing. Signed-off-by: Ewan D. Milne <emilne@redhat.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-07-23 07:41:53 -07:00
Linus Torvalds	84cbd7222b	SCSI misc on 20130702 The patch set is mostly driver updates (usf, zfcp, lpfc, mpt2sas, megaraid_sas, bfa, ipr) and a few bug fixes. Also of note is that the Buslogic driver has been rewritten to a better coding style and 64 bit support added. We also removed the libsas limitation on 16 bytes for the command size (currently no drivers make use of this). Signed-off-by: James Bottomley <JBottomley@Parallels.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQEcBAABAgAGBQJR0ugCAAoJEDeqqVYsXL0MX2sH+gOkWuy5p3igz+VEim8TNaOA VV5EIxG1v7Q0ZiXCp/wcF6eqhgQkWvkrKSxWkaN0yzq8LEWfQeY7VmFDbGgFeVUZ XMlX5ay8+FLCIK9M76oxwhV7VAXYbeUUZafh+xX6StWCdKrl0eJbicOGoUk/pjsi ZjCBpK5BM0SW+s2gMSDQhO2eMsgMp9QrJMiCJHUF1wWPN8Yez6va1tg4b9iW39BZ dd3sJq+PuN6yDbYAJIjEpiGF9gDaaYxSE6bTKJuY+oy08+VsP/RRWjorTENs9Aev rQXZIC3nwsv26QRSX7RDSj+UE+kFV6FcPMWMU3HN2UG6ttprtOxT8tslVJf7LcA= =BxtF -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull first round of SCSI updates from James Bottomley: "The patch set is mostly driver updates (usf, zfcp, lpfc, mpt2sas, megaraid_sas, bfa, ipr) and a few bug fixes. Also of note is that the Buslogic driver has been rewritten to a better coding style and 64 bit support added. We also removed the libsas limitation on 16 bytes for the command size (currently no drivers make use of this)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (101 commits) [SCSI] megaraid: minor cut and paste error fixed. [SCSI] ufshcd-pltfrm: remove unnecessary dma_set_coherent_mask() call [SCSI] ufs: fix register address in UIC error interrupt handling [SCSI] ufshcd-pltfrm: add missing empty slot in ufs_of_match[] [SCSI] ufs: use devres functions for ufshcd [SCSI] ufs: Fix the response UPIU length setting [SCSI] ufs: rework link start-up process [SCSI] ufs: remove version check before IS reg clear [SCSI] ufs: amend interrupt configuration [SCSI] ufs: wrap the i/o access operations [SCSI] storvsc: Update the storage protocol to win8 level [SCSI] storvsc: Increase the value of scsi timeout for storvsc devices [SCSI] MAINTAINERS: Add myself as the maintainer for BusLogic SCSI driver [SCSI] BusLogic: Port driver to 64-bit. [SCSI] BusLogic: Fix style issues [SCSI] libiscsi: Added new boot entries in the session sysfs [SCSI] aacraid: Fix for arrays are going offline in the system. System hangs [SCSI] ipr: IOA Status Code(IOASC) update [SCSI] sd: Update WRITE SAME heuristics [SCSI] fnic: potential dead lock in fnic_is_abts_pending() ...	2013-07-04 12:30:30 -07:00
Kees Cook	02aa2a3763	drivers: avoid format string in dev_set_name Calling dev_set_name with a single paramter causes it to be handled as a format string. Many callers are passing potentially dynamic string content, so use "%s" in those cases to avoid any potential accidents, including wrappers like device_create*() and bdi_register(). Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-03 16:07:41 -07:00
Martin K. Petersen	66c28f9712	[SCSI] sd: Update WRITE SAME heuristics SATA drives located behind a SAS controller would incorrectly receive WRITE SAME commands. Tweak the heuristics so that: - If REPORT SUPPORTED OPERATION CODES is provided we will use that to choose between WRITE SAME(16), WRITE SAME(10) and disabled. This also fixes an issue with the old code which would issue WRITE SAME(10) despite the command not being whitelisted in REPORT SUPPORTED OPERATION CODES. - If REPORT SUPPORTED OPERATION CODES is not provided we will fall back to WRITE SAME(10) unless the device has an ATA Information VPD page. The assumption is that a SATL which is smart enough to implement WRITE SAME would also provide REPORT SUPPORTED OPERATION CODES. To facilitate the new heuristics scsi_report_opcode() has been modified to so we can distinguish between "operation not supported" and "RSOC not supported". Reported-by: H. Peter Anvin <hpa@zytor.com> Tested-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-06-26 17:56:18 -07:00
Ben Hutchings	2ee3e26c67	[SCSI] sd: Fix parsing of 'temporary ' cache mode prefix Commit `39c60a0948` '[SCSI] sd: fix array cache flushing bug causing performance problems' added temp as a pointer to "temporary " and used sizeof(temp) - 1 as its length. But sizeof(temp) is the size of the pointer, not the size of the string constant. Change temp to a static array so that sizeof() does what was intended. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-06-26 08:44:33 -07:00
Hannes Reinecke	0761df9c4b	[SCSI] sd: avoid deadlocks when running under multipath When multipathed systems run into an all-paths-down scenario all devices might be dropped, too. This causes 'del_gendisk' to be called, which will unregister the kobj_map->probe() function for all disk device numbers. When the device comes back the default ->probe() function is run which will call __request_module(), which will deadlock. As 'del_gendisk' typically does _not_ trigger a module unload the default ->probe() function is pointless anyway. This patch implements a dummy ->probe() function, which will just return NULL if the disk is not registered. This will avoid the deadlock. Plus it'll speed up device scanning. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-06-04 11:16:23 -07:00
James Bottomley	297b8a0734	Merge branch 'postmerge' into for-linus Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-05-10 07:54:01 -07:00
James Bottomley	832e77bc11	Merge branch 'misc' into for-linus Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-05-10 07:53:40 -07:00
Al Viro	db2a144bed	block_device_operations->release() should return void The value passed is 0 in all but "it can never happen" cases (and those only in a couple of drivers) and it would've been lost on the way out anyway, even if something tried to pass something meaningful. Just don't bother. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-05-07 02:16:21 -04:00
Lin Ming	6df339a51e	[SCSI] sd: change to auto suspend mode Uses block layer runtime pm helper functions in scsi_runtime_suspend/resume for devices that take advantage of it. Remove scsi_autopm_* from sd open/release path and check_events path. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-05-06 12:48:31 -07:00
Lin Ming	9b21493c45	[SCSI] sd: use REQ_PM in sd's runtime suspend operation With the introduction of REQ_PM, modify sd's runtime suspend operation functions to use that flag so that the operations to put the device into runtime suspended state(i.e. sync cache and stop device) will not affect its runtime PM status. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-05-06 12:48:17 -07:00
James Bottomley	39c60a0948	[SCSI] sd: fix array cache flushing bug causing performance problems Some arrays synchronize their full non volatile cache when the sd driver sends a SYNCHRONIZE CACHE command. Unfortunately, they can have Terrabytes of this and we send a SYNCHRONIZE CACHE for every barrier if an array reports it has a writeback cache. This leads to massive slowdowns on journalled filesystems. The fix is to allow userspace to turn off the writeback cache setting as a temporary measure (i.e. without doing the MODE SELECT to write it back to the device), so even though the device reported it has a writeback cache, the user, knowing that the cache is non volatile and all they care about is filesystem correctness, can turn that bit off in the kernel and avoid the performance ruinous (and safety irrelevant) SYNCHRONIZE CACHE commands. The way you do this is add a 'temporary' prefix when performing the usual cache setting operations, so echo temporary write through > /sys/class/scsi_disk/<disk>/cache_type Reported-by: Ric Wheeler <rwheeler@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-05-02 16:15:54 -07:00
Aaron Lu	691e3d3175	[SCSI] sd: update sd to use the new pm callbacks Update sd driver to use the callbacks defined in dev_pm_ops. sd_freeze is NULL, the bus level callback has taken care of quiescing the device so there should be nothing needs to be done here. Consequently, sd_thaw is not needed here either. suspend, poweroff and runtime suspend share the same routine sd_suspend, which will sync flush and then stop the drive, this is the same as before. resume, restore and runtime resume share the same routine sd_resume, which will start the drive by putting it into active power state, this is also the same as before. Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-11-30 09:28:34 +00:00
Aaron Lu	a01475637c	[SCSI] sd: put to stopped power state when runtime suspend When device is runtime suspended, put it to stopped power state to save some power. This will also make the behaviour consistent with what the scsi_pm.c thinks about sd as the comment says: sd treats runtime suspend, system suspend and system hibernate identical. With this patch, it is now identical. And sd_shutdown will also do nothing when it finds the device has been runtime suspended, if we do not spin down the disk in runtime suspend by putting it into stopped power state, the disk will be shut down incorrectly. And the the same problem can be solved for runtime power off after runtime suspended case by this change. With the current runtime scheme for disk, it will only be runtime suspended when no process opens the disk, so this shouldn't happen a lot, which makes it acceptable to spin down the disk when runtime suspended. If some day a more aggressive runtime scheme is used, like the 'request based runtime pm for disk' that Alan Stern and Lin Ming has been working, we can introduce some policy to control this. But for now, make it simple and correct by spinning down the disk. Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-11-30 09:22:09 +00:00
Jason J. Herne	53ad570be6	[SCSI] sd: Use SCSI read/write(16) with > 32-bit LBA drives Force large capacity (> 0xFFFFFFFF blocks) drives to use READ/WRITE(16) instead of READ/WRITE(10). Some(most/all?) USB enclosures do not like READ(10) commands when a large capacity drive is installed. This issue was reported and discussed here: http://marc.info/?l=linux-usb&m=135247705222324 Signed-off-by: Jason J. Herne <hernejj@gmail.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-11-27 09:00:38 +04:00
Joel D. Diaz	afd5e34b2b	[SCSI] sd: Reshuffle init_sd to avoid crash scsi_register_driver will register a prep_fn() function, which in turn migh need to use the sd_cdp_pool for DIF. Which hasn't been initialised at this point, leading to a crash. So reshuffle the init_sd() and exit_sd() paths to have the driver registered last. Signed-off-by: Joel D. Diaz <joeldiaz@us.ibm.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-11-27 08:59:34 +04:00
Martin K. Petersen	5db44863b6	[SCSI] sd: Implement support for WRITE SAME Implement support for WRITE SAME(10) and WRITE SAME(16) in the SCSI disk driver. - We set the default maximum to 0xFFFF because there are several devices out there that only support two-byte block counts even with WRITE SAME(16). We only enable transfers bigger than 0xFFFF if the device explicitly reports MAXIMUM WRITE SAME LENGTH in the BLOCK LIMITS VPD. - max_write_same_blocks can be overriden per-device basis in sysfs. - The UNMAP discovery heuristics remain unchanged but the discard limits are tweaked to match the "real" WRITE SAME commands. - In the error handling logic we now distinguish between WRITE SAME with and without UNMAP set. The discovery process heuristics are: - If the device reports a SCSI level of SPC-3 or greater we'll issue READ SUPPORTED OPERATION CODES to find out whether WRITE SAME(16) is supported. If that's the case we will use it. - If the device supports the block limits VPD and reports a MAXIMUM WRITE SAME LENGTH bigger than 0xFFFF we will use WRITE SAME(16). - Otherwise we will use WRITE SAME(10) unless the target LBA is beyond 0xFFFFFFFF or the block count exceeds 0xFFFF. - no_write_same is set for ATA, FireWire and USB. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-11-13 22:45:42 -08:00
Martin K. Petersen	26e85fcd15	[SCSI] sd: Permit merged discard requests Support requests with more than one bio payload for discards. The total number of bytes to be discarded is stored in req->__data_len and used in sd_done() to complete the I/O. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-11-13 21:23:53 -08:00
Martin K. Petersen	fe542396da	[SCSI] sd: Ensure we correctly disable devices with unknown protection type We set the capacity to zero when we discovered a device formatted with an unknown DIF protection type. However, the read_capacity code would override the capacity and cause the device to be enabled regardless. Make sd_read_protection_type() return an error if the protection type is unknown. Also prevent duplicate printk lines when the device is being revalidated. Reported-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-09-24 13:01:24 +04:00
Martin K. Petersen	8172499aae	[SCSI] sd: Allow protection_type to be overridden We have encountered a few devices that misbehaved when operating in T10 PI mode. Allow T10 PI protection type to be overridden from userland. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-09-24 12:10:59 +04:00
Martin K. Petersen	8c579ab69d	[SCSI] sd: Avoid remapping bad reference tags It does not make sense to translate ref tags with unexpected values. Instead we simply ignore them and let the upper layers catch the problem. Ref tags that contain the expected value are still remapped. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-09-24 12:10:59 +04:00
Namjae Jeon	b81478d82e	[SCSI] set to WCE if usb cache quirk is present. Make use of USB quirk method to identify such HDD while reading the cache status in sd_probe(). If cache quirk is present for the HDD, lets assume that cache is enabled and make WCE bit equal to 1. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-07-20 08:59:00 +01:00
Josh Hunt	9e1a15376b	[SCSI] properly initialize atomic_t Initialize atomic_t scsi_host_next_hn and ioerr_cntas per the guidelines defined in Documentation/atomic_ops.txt Signed-off-by: Josh Hunt <johunt@akamai.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-07-20 08:58:36 +01:00
Alan Stern	6a0bdffa00	SCSI & usb-storage: add try_rc_10_first flag Several bug reports have been received recently for USB mass-storage devices that don't handle READ CAPACITY(16) commands properly. They report bogus sizes, in some cases becoming unusable as a result. The bugs were triggered by commit `09b6b51b0b` (SCSI & usb-storage: add flags for VPD pages and REPORT LUNS), which caused usb-storage to stop overriding the SCSI level reported by devices. By default, the sd driver will try READ CAPACITY(16) first for any device whose level is above SCSI_SPC_2. It seems likely that any device large enough to require the use of READ CAPACITY(16) (i.e., 2 TB or more) would be able to handle READ CAPACITY(10) commands properly. Indeed, I don't know of any devices that don't handle READ CAPACITY(10) properly. Therefore this patch (as1559) adds a new flag telling the sd driver to try READ CAPACITY(10) before READ CAPACITY(16), and sets this flag for every USB mass-storage device. If a device really is larger than 2 TB, sd will fall back to READ CAPACITY(16) just as it used to. This fixes Bugzilla #43391. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Hans de Goede <hdegoede@redhat.com> CC: "James E.J. Bottomley" <JBottomley@parallels.com> CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-06-22 22:05:31 -07:00

1 2 3 4 5 ...

358 Commits