linux/drivers/scsi/scsi_sysfs.c

1513 lines
38 KiB
C
Raw Normal View History

/*
* scsi_sysfs.c
*
* SCSI sysfs interface routines.
*
* Created to pull SCSI mid layer sysfs routines into one file.
*/
#include <linux/module.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 16:04:11 +08:00
#include <linux/slab.h>
#include <linux/init.h>
#include <linux/blkdev.h>
#include <linux/device.h>
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 22:41:42 +08:00
#include <linux/pm_runtime.h>
#include <scsi/scsi.h>
#include <scsi/scsi_device.h>
#include <scsi/scsi_host.h>
#include <scsi/scsi_tcq.h>
#include <scsi/scsi_dh.h>
#include <scsi/scsi_transport.h>
#include <scsi/scsi_driver.h>
#include "scsi_priv.h"
#include "scsi_logging.h"
static struct device_type scsi_dev_type;
static const struct {
enum scsi_device_state value;
char *name;
} sdev_states[] = {
{ SDEV_CREATED, "created" },
{ SDEV_RUNNING, "running" },
{ SDEV_CANCEL, "cancel" },
{ SDEV_DEL, "deleted" },
{ SDEV_QUIESCE, "quiesce" },
{ SDEV_OFFLINE, "offline" },
{ SDEV_TRANSPORT_OFFLINE, "transport-offline" },
{ SDEV_BLOCK, "blocked" },
{ SDEV_CREATED_BLOCK, "created-blocked" },
};
const char *scsi_device_state_name(enum scsi_device_state state)
{
int i;
char *name = NULL;
for (i = 0; i < ARRAY_SIZE(sdev_states); i++) {
if (sdev_states[i].value == state) {
name = sdev_states[i].name;
break;
}
}
return name;
}
static const struct {
enum scsi_host_state value;
char *name;
} shost_states[] = {
{ SHOST_CREATED, "created" },
{ SHOST_RUNNING, "running" },
{ SHOST_CANCEL, "cancel" },
{ SHOST_DEL, "deleted" },
{ SHOST_RECOVERY, "recovery" },
{ SHOST_CANCEL_RECOVERY, "cancel/recovery" },
{ SHOST_DEL_RECOVERY, "deleted/recovery", },
};
const char *scsi_host_state_name(enum scsi_host_state state)
{
int i;
char *name = NULL;
for (i = 0; i < ARRAY_SIZE(shost_states); i++) {
if (shost_states[i].value == state) {
name = shost_states[i].name;
break;
}
}
return name;
}
#ifdef CONFIG_SCSI_DH
static const struct {
unsigned char value;
char *name;
} sdev_access_states[] = {
{ SCSI_ACCESS_STATE_OPTIMAL, "active/optimized" },
{ SCSI_ACCESS_STATE_ACTIVE, "active/non-optimized" },
{ SCSI_ACCESS_STATE_STANDBY, "standby" },
{ SCSI_ACCESS_STATE_UNAVAILABLE, "unavailable" },
{ SCSI_ACCESS_STATE_LBA, "lba-dependent" },
{ SCSI_ACCESS_STATE_OFFLINE, "offline" },
{ SCSI_ACCESS_STATE_TRANSITIONING, "transitioning" },
};
static const char *scsi_access_state_name(unsigned char state)
{
int i;
char *name = NULL;
for (i = 0; i < ARRAY_SIZE(sdev_access_states); i++) {
if (sdev_access_states[i].value == state) {
name = sdev_access_states[i].name;
break;
}
}
return name;
}
#endif
static int check_set(unsigned long long *val, char *src)
{
char *last;
if (strncmp(src, "-", 20) == 0) {
*val = SCAN_WILD_CARD;
} else {
/*
* Doesn't check for int overflow
*/
*val = simple_strtoull(src, &last, 0);
if (*last != '\0')
return 1;
}
return 0;
}
static int scsi_scan(struct Scsi_Host *shost, const char *str)
{
char s1[15], s2[15], s3[17], junk;
unsigned long long channel, id, lun;
int res;
res = sscanf(str, "%10s %10s %16s %c", s1, s2, s3, &junk);
if (res != 3)
return -EINVAL;
if (check_set(&channel, s1))
return -EINVAL;
if (check_set(&id, s2))
return -EINVAL;
if (check_set(&lun, s3))
return -EINVAL;
if (shost->transportt->user_scan)
res = shost->transportt->user_scan(shost, channel, id, lun);
else
res = scsi_scan_host_selected(shost, channel, id, lun,
SCSI_SCAN_MANUAL);
return res;
}
/*
* shost_show_function: macro to create an attr function that can be used to
* show a non-bit field.
*/
#define shost_show_function(name, field, format_string) \
static ssize_t \
show_##name (struct device *dev, struct device_attribute *attr, \
char *buf) \
{ \
struct Scsi_Host *shost = class_to_shost(dev); \
return snprintf (buf, 20, format_string, shost->field); \
}
/*
* shost_rd_attr: macro to create a function and attribute variable for a
* read only field.
*/
#define shost_rd_attr2(name, field, format_string) \
shost_show_function(name, field, format_string) \
static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL);
#define shost_rd_attr(field, format_string) \
shost_rd_attr2(field, field, format_string)
/*
* Create the actual show/store functions and data structures.
*/
static ssize_t
store_scan(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
struct Scsi_Host *shost = class_to_shost(dev);
int res;
res = scsi_scan(shost, buf);
if (res == 0)
res = count;
return res;
};
static DEVICE_ATTR(scan, S_IWUSR, NULL, store_scan);
static ssize_t
store_shost_state(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
int i;
struct Scsi_Host *shost = class_to_shost(dev);
enum scsi_host_state state = 0;
for (i = 0; i < ARRAY_SIZE(shost_states); i++) {
const int len = strlen(shost_states[i].name);
if (strncmp(shost_states[i].name, buf, len) == 0 &&
buf[len] == '\n') {
state = shost_states[i].value;
break;
}
}
if (!state)
return -EINVAL;
if (scsi_host_set_state(shost, state))
return -EINVAL;
return count;
}
static ssize_t
show_shost_state(struct device *dev, struct device_attribute *attr, char *buf)
{
struct Scsi_Host *shost = class_to_shost(dev);
const char *name = scsi_host_state_name(shost->shost_state);
if (!name)
return -EINVAL;
return snprintf(buf, 20, "%s\n", name);
}
/* DEVICE_ATTR(state) clashes with dev_attr_state for sdev */
static struct device_attribute dev_attr_hstate =
__ATTR(state, S_IRUGO | S_IWUSR, show_shost_state, store_shost_state);
static ssize_t
show_shost_mode(unsigned int mode, char *buf)
{
ssize_t len = 0;
if (mode & MODE_INITIATOR)
len = sprintf(buf, "%s", "Initiator");
if (mode & MODE_TARGET)
len += sprintf(buf + len, "%s%s", len ? ", " : "", "Target");
len += sprintf(buf + len, "\n");
return len;
}
static ssize_t
show_shost_supported_mode(struct device *dev, struct device_attribute *attr,
char *buf)
{
struct Scsi_Host *shost = class_to_shost(dev);
unsigned int supported_mode = shost->hostt->supported_mode;
if (supported_mode == MODE_UNKNOWN)
/* by default this should be initiator */
supported_mode = MODE_INITIATOR;
return show_shost_mode(supported_mode, buf);
}
static DEVICE_ATTR(supported_mode, S_IRUGO | S_IWUSR, show_shost_supported_mode, NULL);
static ssize_t
show_shost_active_mode(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct Scsi_Host *shost = class_to_shost(dev);
if (shost->active_mode == MODE_UNKNOWN)
return snprintf(buf, 20, "unknown\n");
else
return show_shost_mode(shost->active_mode, buf);
}
static DEVICE_ATTR(active_mode, S_IRUGO | S_IWUSR, show_shost_active_mode, NULL);
[SCSI] prevent stack buffer overflow in host_reset store_host_reset() has tried to re-invent the wheel to compare sysfs strings. Unfortunately it did so poorly and never bothered to check the input from userspace before overwriting stack with it, so something simple as: echo "WoopsieWoopsie" > /sys/devices/pseudo_0/adapter0/host0/scsi_host/host0/host_reset would result in: [ 316.310101] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81f5bac7 [ 316.310101] [ 316.320051] Pid: 6655, comm: sh Tainted: G W 3.7.0-rc5-next-20121114-sasha-00016-g5c9d68d-dirty #129 [ 316.320051] Call Trace: [ 316.340058] pps pps0: PPS event at 1352918752.620355751 [ 316.340062] pps pps0: capture assert seq #303 [ 316.320051] [<ffffffff83b3856b>] panic+0xcd/0x1f4 [ 316.320051] [<ffffffff81f5bac7>] ? store_host_reset+0xd7/0x100 [ 316.320051] [<ffffffff8110b996>] __stack_chk_fail+0x16/0x20 [ 316.320051] [<ffffffff81f5bac7>] store_host_reset+0xd7/0x100 [ 316.320051] [<ffffffff81e55bb3>] dev_attr_store+0x13/0x30 [ 316.320051] [<ffffffff812f7db1>] sysfs_write_file+0x101/0x170 [ 316.320051] [<ffffffff8127acc8>] vfs_write+0xb8/0x180 [ 316.320051] [<ffffffff8127ae80>] sys_write+0x50/0xa0 [ 316.320051] [<ffffffff83c03418>] tracesys+0xe1/0xe6 Fix this by uninventing whatever was going on there and just use sysfs_streq. Bug introduced by 29443691 ("[SCSI] scsi: Added support for adapter and firmware reset"). [jejb: added necessary const to prevent compile warnings] Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: <stable@vger.kernel.org> #3.2+ Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-11-16 04:51:46 +08:00
static int check_reset_type(const char *str)
{
[SCSI] prevent stack buffer overflow in host_reset store_host_reset() has tried to re-invent the wheel to compare sysfs strings. Unfortunately it did so poorly and never bothered to check the input from userspace before overwriting stack with it, so something simple as: echo "WoopsieWoopsie" > /sys/devices/pseudo_0/adapter0/host0/scsi_host/host0/host_reset would result in: [ 316.310101] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81f5bac7 [ 316.310101] [ 316.320051] Pid: 6655, comm: sh Tainted: G W 3.7.0-rc5-next-20121114-sasha-00016-g5c9d68d-dirty #129 [ 316.320051] Call Trace: [ 316.340058] pps pps0: PPS event at 1352918752.620355751 [ 316.340062] pps pps0: capture assert seq #303 [ 316.320051] [<ffffffff83b3856b>] panic+0xcd/0x1f4 [ 316.320051] [<ffffffff81f5bac7>] ? store_host_reset+0xd7/0x100 [ 316.320051] [<ffffffff8110b996>] __stack_chk_fail+0x16/0x20 [ 316.320051] [<ffffffff81f5bac7>] store_host_reset+0xd7/0x100 [ 316.320051] [<ffffffff81e55bb3>] dev_attr_store+0x13/0x30 [ 316.320051] [<ffffffff812f7db1>] sysfs_write_file+0x101/0x170 [ 316.320051] [<ffffffff8127acc8>] vfs_write+0xb8/0x180 [ 316.320051] [<ffffffff8127ae80>] sys_write+0x50/0xa0 [ 316.320051] [<ffffffff83c03418>] tracesys+0xe1/0xe6 Fix this by uninventing whatever was going on there and just use sysfs_streq. Bug introduced by 29443691 ("[SCSI] scsi: Added support for adapter and firmware reset"). [jejb: added necessary const to prevent compile warnings] Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: <stable@vger.kernel.org> #3.2+ Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-11-16 04:51:46 +08:00
if (sysfs_streq(str, "adapter"))
return SCSI_ADAPTER_RESET;
[SCSI] prevent stack buffer overflow in host_reset store_host_reset() has tried to re-invent the wheel to compare sysfs strings. Unfortunately it did so poorly and never bothered to check the input from userspace before overwriting stack with it, so something simple as: echo "WoopsieWoopsie" > /sys/devices/pseudo_0/adapter0/host0/scsi_host/host0/host_reset would result in: [ 316.310101] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81f5bac7 [ 316.310101] [ 316.320051] Pid: 6655, comm: sh Tainted: G W 3.7.0-rc5-next-20121114-sasha-00016-g5c9d68d-dirty #129 [ 316.320051] Call Trace: [ 316.340058] pps pps0: PPS event at 1352918752.620355751 [ 316.340062] pps pps0: capture assert seq #303 [ 316.320051] [<ffffffff83b3856b>] panic+0xcd/0x1f4 [ 316.320051] [<ffffffff81f5bac7>] ? store_host_reset+0xd7/0x100 [ 316.320051] [<ffffffff8110b996>] __stack_chk_fail+0x16/0x20 [ 316.320051] [<ffffffff81f5bac7>] store_host_reset+0xd7/0x100 [ 316.320051] [<ffffffff81e55bb3>] dev_attr_store+0x13/0x30 [ 316.320051] [<ffffffff812f7db1>] sysfs_write_file+0x101/0x170 [ 316.320051] [<ffffffff8127acc8>] vfs_write+0xb8/0x180 [ 316.320051] [<ffffffff8127ae80>] sys_write+0x50/0xa0 [ 316.320051] [<ffffffff83c03418>] tracesys+0xe1/0xe6 Fix this by uninventing whatever was going on there and just use sysfs_streq. Bug introduced by 29443691 ("[SCSI] scsi: Added support for adapter and firmware reset"). [jejb: added necessary const to prevent compile warnings] Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: <stable@vger.kernel.org> #3.2+ Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-11-16 04:51:46 +08:00
else if (sysfs_streq(str, "firmware"))
return SCSI_FIRMWARE_RESET;
else
return 0;
}
static ssize_t
store_host_reset(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
struct Scsi_Host *shost = class_to_shost(dev);
struct scsi_host_template *sht = shost->hostt;
int ret = -EINVAL;
int type;
[SCSI] prevent stack buffer overflow in host_reset store_host_reset() has tried to re-invent the wheel to compare sysfs strings. Unfortunately it did so poorly and never bothered to check the input from userspace before overwriting stack with it, so something simple as: echo "WoopsieWoopsie" > /sys/devices/pseudo_0/adapter0/host0/scsi_host/host0/host_reset would result in: [ 316.310101] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81f5bac7 [ 316.310101] [ 316.320051] Pid: 6655, comm: sh Tainted: G W 3.7.0-rc5-next-20121114-sasha-00016-g5c9d68d-dirty #129 [ 316.320051] Call Trace: [ 316.340058] pps pps0: PPS event at 1352918752.620355751 [ 316.340062] pps pps0: capture assert seq #303 [ 316.320051] [<ffffffff83b3856b>] panic+0xcd/0x1f4 [ 316.320051] [<ffffffff81f5bac7>] ? store_host_reset+0xd7/0x100 [ 316.320051] [<ffffffff8110b996>] __stack_chk_fail+0x16/0x20 [ 316.320051] [<ffffffff81f5bac7>] store_host_reset+0xd7/0x100 [ 316.320051] [<ffffffff81e55bb3>] dev_attr_store+0x13/0x30 [ 316.320051] [<ffffffff812f7db1>] sysfs_write_file+0x101/0x170 [ 316.320051] [<ffffffff8127acc8>] vfs_write+0xb8/0x180 [ 316.320051] [<ffffffff8127ae80>] sys_write+0x50/0xa0 [ 316.320051] [<ffffffff83c03418>] tracesys+0xe1/0xe6 Fix this by uninventing whatever was going on there and just use sysfs_streq. Bug introduced by 29443691 ("[SCSI] scsi: Added support for adapter and firmware reset"). [jejb: added necessary const to prevent compile warnings] Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: <stable@vger.kernel.org> #3.2+ Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-11-16 04:51:46 +08:00
type = check_reset_type(buf);
if (!type)
goto exit_store_host_reset;
if (sht->host_reset)
ret = sht->host_reset(shost, type);
exit_store_host_reset:
if (ret == 0)
ret = count;
return ret;
}
static DEVICE_ATTR(host_reset, S_IWUSR, NULL, store_host_reset);
static ssize_t
show_shost_eh_deadline(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct Scsi_Host *shost = class_to_shost(dev);
if (shost->eh_deadline == -1)
return snprintf(buf, strlen("off") + 2, "off\n");
return sprintf(buf, "%u\n", shost->eh_deadline / HZ);
}
static ssize_t
store_shost_eh_deadline(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
struct Scsi_Host *shost = class_to_shost(dev);
int ret = -EINVAL;
unsigned long deadline, flags;
if (shost->transportt &&
(shost->transportt->eh_strategy_handler ||
!shost->hostt->eh_host_reset_handler))
return ret;
if (!strncmp(buf, "off", strlen("off")))
deadline = -1;
else {
ret = kstrtoul(buf, 10, &deadline);
if (ret)
return ret;
if (deadline * HZ > UINT_MAX)
return -EINVAL;
}
spin_lock_irqsave(shost->host_lock, flags);
if (scsi_host_in_recovery(shost))
ret = -EBUSY;
else {
if (deadline == -1)
shost->eh_deadline = -1;
else
shost->eh_deadline = deadline * HZ;
ret = count;
}
spin_unlock_irqrestore(shost->host_lock, flags);
return ret;
}
static DEVICE_ATTR(eh_deadline, S_IRUGO | S_IWUSR, show_shost_eh_deadline, store_shost_eh_deadline);
scsi: add support for a blk-mq based I/O path. This patch adds support for an alternate I/O path in the scsi midlayer which uses the blk-mq infrastructure instead of the legacy request code. Use of blk-mq is fully transparent to drivers, although for now a host template field is provided to opt out of blk-mq usage in case any unforseen incompatibilities arise. In general replacing the legacy request code with blk-mq is a simple and mostly mechanical transformation. The biggest exception is the new code that deals with the fact the I/O submissions in blk-mq must happen from process context, which slightly complicates the I/O completion handler. The second biggest differences is that blk-mq is build around the concept of preallocated requests that also include driver specific data, which in SCSI context means the scsi_cmnd structure. This completely avoids dynamic memory allocations for the fast path through I/O submission. Due the preallocated requests the MQ code path exclusively uses the host-wide shared tag allocator instead of a per-LUN one. This only affects drivers actually using the block layer provided tag allocator instead of their own. Unlike the old path blk-mq always provides a tag, although drivers don't have to use it. For now the blk-mq path is disable by defauly and must be enabled using the "use_blk_mq" module parameter. Once the remaining work in the block layer to make blk-mq more suitable for slow devices is complete I hope to make it the default and eventually even remove the old code path. Based on the earlier scsi-mq prototype by Nicholas Bellinger. Thanks to Bart Van Assche and Robert Elliot for testing, benchmarking and various sugestions and code contributions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>
2014-01-17 19:06:53 +08:00
shost_rd_attr(use_blk_mq, "%d\n");
shost_rd_attr(unique_id, "%u\n");
shost_rd_attr(cmd_per_lun, "%hd\n");
shost_rd_attr(can_queue, "%hd\n");
shost_rd_attr(sg_tablesize, "%hu\n");
shost_rd_attr(sg_prot_tablesize, "%hu\n");
shost_rd_attr(unchecked_isa_dma, "%d\n");
shost_rd_attr(prot_capabilities, "%u\n");
shost_rd_attr(prot_guard_type, "%hd\n");
shost_rd_attr2(proc_name, hostt->proc_name, "%s\n");
static ssize_t
show_host_busy(struct device *dev, struct device_attribute *attr, char *buf)
{
struct Scsi_Host *shost = class_to_shost(dev);
return snprintf(buf, 20, "%d\n", atomic_read(&shost->host_busy));
}
static DEVICE_ATTR(host_busy, S_IRUGO, show_host_busy, NULL);
static struct attribute *scsi_sysfs_shost_attrs[] = {
scsi: add support for a blk-mq based I/O path. This patch adds support for an alternate I/O path in the scsi midlayer which uses the blk-mq infrastructure instead of the legacy request code. Use of blk-mq is fully transparent to drivers, although for now a host template field is provided to opt out of blk-mq usage in case any unforseen incompatibilities arise. In general replacing the legacy request code with blk-mq is a simple and mostly mechanical transformation. The biggest exception is the new code that deals with the fact the I/O submissions in blk-mq must happen from process context, which slightly complicates the I/O completion handler. The second biggest differences is that blk-mq is build around the concept of preallocated requests that also include driver specific data, which in SCSI context means the scsi_cmnd structure. This completely avoids dynamic memory allocations for the fast path through I/O submission. Due the preallocated requests the MQ code path exclusively uses the host-wide shared tag allocator instead of a per-LUN one. This only affects drivers actually using the block layer provided tag allocator instead of their own. Unlike the old path blk-mq always provides a tag, although drivers don't have to use it. For now the blk-mq path is disable by defauly and must be enabled using the "use_blk_mq" module parameter. Once the remaining work in the block layer to make blk-mq more suitable for slow devices is complete I hope to make it the default and eventually even remove the old code path. Based on the earlier scsi-mq prototype by Nicholas Bellinger. Thanks to Bart Van Assche and Robert Elliot for testing, benchmarking and various sugestions and code contributions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>
2014-01-17 19:06:53 +08:00
&dev_attr_use_blk_mq.attr,
&dev_attr_unique_id.attr,
&dev_attr_host_busy.attr,
&dev_attr_cmd_per_lun.attr,
&dev_attr_can_queue.attr,
&dev_attr_sg_tablesize.attr,
&dev_attr_sg_prot_tablesize.attr,
&dev_attr_unchecked_isa_dma.attr,
&dev_attr_proc_name.attr,
&dev_attr_scan.attr,
&dev_attr_hstate.attr,
&dev_attr_supported_mode.attr,
&dev_attr_active_mode.attr,
&dev_attr_prot_capabilities.attr,
&dev_attr_prot_guard_type.attr,
&dev_attr_host_reset.attr,
&dev_attr_eh_deadline.attr,
NULL
};
static struct attribute_group scsi_shost_attr_group = {
.attrs = scsi_sysfs_shost_attrs,
};
const struct attribute_group *scsi_sysfs_shost_attr_groups[] = {
&scsi_shost_attr_group,
NULL
};
static void scsi_device_cls_release(struct device *class_dev)
{
struct scsi_device *sdev;
sdev = class_to_sdev(class_dev);
put_device(&sdev->sdev_gendev);
}
2006-11-22 22:55:48 +08:00
static void scsi_device_dev_release_usercontext(struct work_struct *work)
{
struct scsi_device *sdev;
struct device *parent;
struct list_head *this, *tmp;
unsigned long flags;
2006-11-22 22:55:48 +08:00
sdev = container_of(work, struct scsi_device, ew.work);
scsi_dh_release_device(sdev);
2006-11-22 22:55:48 +08:00
parent = sdev->sdev_gendev.parent;
spin_lock_irqsave(sdev->host->host_lock, flags);
list_del(&sdev->siblings);
list_del(&sdev->same_target_siblings);
list_del(&sdev->starved_entry);
spin_unlock_irqrestore(sdev->host->host_lock, flags);
cancel_work_sync(&sdev->event_work);
list_for_each_safe(this, tmp, &sdev->event_list) {
struct scsi_event *evt;
evt = list_entry(this, struct scsi_event, node);
list_del(&evt->node);
kfree(evt);
}
blk_put_queue(sdev->request_queue);
/* NULL queue means the device can't be used */
sdev->request_queue = NULL;
kfree(sdev->vpd_pg83);
kfree(sdev->vpd_pg80);
kfree(sdev->inquiry);
kfree(sdev);
if (parent)
put_device(parent);
}
static void scsi_device_dev_release(struct device *dev)
{
struct scsi_device *sdp = to_scsi_device(dev);
2006-11-22 22:55:48 +08:00
execute_in_process_context(scsi_device_dev_release_usercontext,
&sdp->ew);
}
static struct class sdev_class = {
.name = "scsi_device",
.dev_release = scsi_device_cls_release,
};
/* all probing is done in the individual ->probe routines */
static int scsi_bus_match(struct device *dev, struct device_driver *gendrv)
{
struct scsi_device *sdp;
if (dev->type != &scsi_dev_type)
return 0;
sdp = to_scsi_device(dev);
if (sdp->no_uld_attach)
return 0;
return (sdp->inq_periph_qual == SCSI_INQ_PQ_CON)? 1: 0;
}
static int scsi_bus_uevent(struct device *dev, struct kobj_uevent_env *env)
[SCSI] modalias for scsi devices The following patch adds support for sysfs/uevent modalias attribute for scsi devices (like disks, tapes, cdroms etc), based on whatever current sd.c, sr.c, st.c and osst.c drivers supports. The modalias format is like this: scsi:type-0x04 (for TYPE_WORM, handled by sr.c now). Several comments. o This hexadecimal type value is because all TYPE_XXX constants in include/scsi/scsi.h are given in hex, but __stringify() will not convert them to decimal (so it will NOT be scsi:type-4). Since it does not really matter in which format it is, while both modalias in module and modalias attribute match each other, I descided to go for that 0x%02x format (and added a comment in include/scsi/scsi.h to keep them that way), instead of changing them all to decimal. o There was no .uevent routine for SCSI bus. It might be a good idea to add some more ueven environment variables in there. o osst.c driver handles tapes too, like st.c, but only SOME tapes. With this setup, hotplug scripts (or whatever is used by the user) will try to load both st and osst modules for all SCSI tapes found, because both modules have scsi:type-0x01 alias). It is not harmful, but one extra module is no good either. It is possible to solve this, by exporting more info in modalias attribute, including vendor and device identification strings, so that modalias becomes something like scsi:type-0x12:vendor-Adaptec LTD:device-OnStream Tape Drive and having that, match for all 3 attributes, not only device type. But oh well, vendor and device strings may be large, and they do contain spaces and whatnot. So I left them for now, awaiting for comments first. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-10-27 20:02:37 +08:00
{
struct scsi_device *sdev;
if (dev->type != &scsi_dev_type)
return 0;
sdev = to_scsi_device(dev);
[SCSI] modalias for scsi devices The following patch adds support for sysfs/uevent modalias attribute for scsi devices (like disks, tapes, cdroms etc), based on whatever current sd.c, sr.c, st.c and osst.c drivers supports. The modalias format is like this: scsi:type-0x04 (for TYPE_WORM, handled by sr.c now). Several comments. o This hexadecimal type value is because all TYPE_XXX constants in include/scsi/scsi.h are given in hex, but __stringify() will not convert them to decimal (so it will NOT be scsi:type-4). Since it does not really matter in which format it is, while both modalias in module and modalias attribute match each other, I descided to go for that 0x%02x format (and added a comment in include/scsi/scsi.h to keep them that way), instead of changing them all to decimal. o There was no .uevent routine for SCSI bus. It might be a good idea to add some more ueven environment variables in there. o osst.c driver handles tapes too, like st.c, but only SOME tapes. With this setup, hotplug scripts (or whatever is used by the user) will try to load both st and osst modules for all SCSI tapes found, because both modules have scsi:type-0x01 alias). It is not harmful, but one extra module is no good either. It is possible to solve this, by exporting more info in modalias attribute, including vendor and device identification strings, so that modalias becomes something like scsi:type-0x12:vendor-Adaptec LTD:device-OnStream Tape Drive and having that, match for all 3 attributes, not only device type. But oh well, vendor and device strings may be large, and they do contain spaces and whatnot. So I left them for now, awaiting for comments first. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-10-27 20:02:37 +08:00
add_uevent_var(env, "MODALIAS=" SCSI_DEVICE_MODALIAS_FMT, sdev->type);
[SCSI] modalias for scsi devices The following patch adds support for sysfs/uevent modalias attribute for scsi devices (like disks, tapes, cdroms etc), based on whatever current sd.c, sr.c, st.c and osst.c drivers supports. The modalias format is like this: scsi:type-0x04 (for TYPE_WORM, handled by sr.c now). Several comments. o This hexadecimal type value is because all TYPE_XXX constants in include/scsi/scsi.h are given in hex, but __stringify() will not convert them to decimal (so it will NOT be scsi:type-4). Since it does not really matter in which format it is, while both modalias in module and modalias attribute match each other, I descided to go for that 0x%02x format (and added a comment in include/scsi/scsi.h to keep them that way), instead of changing them all to decimal. o There was no .uevent routine for SCSI bus. It might be a good idea to add some more ueven environment variables in there. o osst.c driver handles tapes too, like st.c, but only SOME tapes. With this setup, hotplug scripts (or whatever is used by the user) will try to load both st and osst modules for all SCSI tapes found, because both modules have scsi:type-0x01 alias). It is not harmful, but one extra module is no good either. It is possible to solve this, by exporting more info in modalias attribute, including vendor and device identification strings, so that modalias becomes something like scsi:type-0x12:vendor-Adaptec LTD:device-OnStream Tape Drive and having that, match for all 3 attributes, not only device type. But oh well, vendor and device strings may be large, and they do contain spaces and whatnot. So I left them for now, awaiting for comments first. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-10-27 20:02:37 +08:00
return 0;
}
struct bus_type scsi_bus_type = {
.name = "scsi",
.match = scsi_bus_match,
[SCSI] modalias for scsi devices The following patch adds support for sysfs/uevent modalias attribute for scsi devices (like disks, tapes, cdroms etc), based on whatever current sd.c, sr.c, st.c and osst.c drivers supports. The modalias format is like this: scsi:type-0x04 (for TYPE_WORM, handled by sr.c now). Several comments. o This hexadecimal type value is because all TYPE_XXX constants in include/scsi/scsi.h are given in hex, but __stringify() will not convert them to decimal (so it will NOT be scsi:type-4). Since it does not really matter in which format it is, while both modalias in module and modalias attribute match each other, I descided to go for that 0x%02x format (and added a comment in include/scsi/scsi.h to keep them that way), instead of changing them all to decimal. o There was no .uevent routine for SCSI bus. It might be a good idea to add some more ueven environment variables in there. o osst.c driver handles tapes too, like st.c, but only SOME tapes. With this setup, hotplug scripts (or whatever is used by the user) will try to load both st and osst modules for all SCSI tapes found, because both modules have scsi:type-0x01 alias). It is not harmful, but one extra module is no good either. It is possible to solve this, by exporting more info in modalias attribute, including vendor and device identification strings, so that modalias becomes something like scsi:type-0x12:vendor-Adaptec LTD:device-OnStream Tape Drive and having that, match for all 3 attributes, not only device type. But oh well, vendor and device strings may be large, and they do contain spaces and whatnot. So I left them for now, awaiting for comments first. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-10-27 20:02:37 +08:00
.uevent = scsi_bus_uevent,
#ifdef CONFIG_PM
.pm = &scsi_bus_pm_ops,
#endif
};
EXPORT_SYMBOL_GPL(scsi_bus_type);
int scsi_sysfs_register(void)
{
int error;
error = bus_register(&scsi_bus_type);
if (!error) {
error = class_register(&sdev_class);
if (error)
bus_unregister(&scsi_bus_type);
}
return error;
}
void scsi_sysfs_unregister(void)
{
class_unregister(&sdev_class);
bus_unregister(&scsi_bus_type);
}
/*
* sdev_show_function: macro to create an attr function that can be used to
* show a non-bit field.
*/
#define sdev_show_function(field, format_string) \
static ssize_t \
sdev_show_##field (struct device *dev, struct device_attribute *attr, \
char *buf) \
{ \
struct scsi_device *sdev; \
sdev = to_scsi_device(dev); \
return snprintf (buf, 20, format_string, sdev->field); \
} \
/*
* sdev_rd_attr: macro to create a function and attribute variable for a
* read only field.
*/
#define sdev_rd_attr(field, format_string) \
sdev_show_function(field, format_string) \
static DEVICE_ATTR(field, S_IRUGO, sdev_show_##field, NULL);
/*
* sdev_rw_attr: create a function and attribute variable for a
* read/write field.
*/
#define sdev_rw_attr(field, format_string) \
sdev_show_function(field, format_string) \
\
static ssize_t \
sdev_store_##field (struct device *dev, struct device_attribute *attr, \
const char *buf, size_t count) \
{ \
struct scsi_device *sdev; \
sdev = to_scsi_device(dev); \
sscanf (buf, format_string, &sdev->field); \
return count; \
} \
static DEVICE_ATTR(field, S_IRUGO | S_IWUSR, sdev_show_##field, sdev_store_##field);
/* Currently we don't export bit fields, but we might in future,
* so leave this code in */
#if 0
/*
* sdev_rd_attr: create a function and attribute variable for a
* read/write bit field.
*/
#define sdev_rw_attr_bit(field) \
sdev_show_function(field, "%d\n") \
\
static ssize_t \
sdev_store_##field (struct device *dev, struct device_attribute *attr, \
const char *buf, size_t count) \
{ \
int ret; \
struct scsi_device *sdev; \
ret = scsi_sdev_check_buf_bit(buf); \
if (ret >= 0) { \
sdev = to_scsi_device(dev); \
sdev->field = ret; \
ret = count; \
} \
return ret; \
} \
static DEVICE_ATTR(field, S_IRUGO | S_IWUSR, sdev_show_##field, sdev_store_##field);
/*
* scsi_sdev_check_buf_bit: return 0 if buf is "0", return 1 if buf is "1",
* else return -EINVAL.
*/
static int scsi_sdev_check_buf_bit(const char *buf)
{
if ((buf[1] == '\0') || ((buf[1] == '\n') && (buf[2] == '\0'))) {
if (buf[0] == '1')
return 1;
else if (buf[0] == '0')
return 0;
else
return -EINVAL;
} else
return -EINVAL;
}
#endif
/*
* Create the actual show/store functions and data structures.
*/
sdev_rd_attr (type, "%d\n");
sdev_rd_attr (scsi_level, "%d\n");
sdev_rd_attr (vendor, "%.8s\n");
sdev_rd_attr (model, "%.16s\n");
sdev_rd_attr (rev, "%.4s\n");
static ssize_t
sdev_show_device_busy(struct device *dev, struct device_attribute *attr,
char *buf)
{
struct scsi_device *sdev = to_scsi_device(dev);
return snprintf(buf, 20, "%d\n", atomic_read(&sdev->device_busy));
}
static DEVICE_ATTR(device_busy, S_IRUGO, sdev_show_device_busy, NULL);
static ssize_t
sdev_show_device_blocked(struct device *dev, struct device_attribute *attr,
char *buf)
{
struct scsi_device *sdev = to_scsi_device(dev);
return snprintf(buf, 20, "%d\n", atomic_read(&sdev->device_blocked));
}
static DEVICE_ATTR(device_blocked, S_IRUGO, sdev_show_device_blocked, NULL);
/*
* TODO: can we make these symlinks to the block layer ones?
*/
static ssize_t
sdev_show_timeout (struct device *dev, struct device_attribute *attr, char *buf)
{
struct scsi_device *sdev;
sdev = to_scsi_device(dev);
return snprintf(buf, 20, "%d\n", sdev->request_queue->rq_timeout / HZ);
}
static ssize_t
sdev_store_timeout (struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
struct scsi_device *sdev;
int timeout;
sdev = to_scsi_device(dev);
sscanf (buf, "%d\n", &timeout);
blk_queue_rq_timeout(sdev->request_queue, timeout * HZ);
return count;
}
static DEVICE_ATTR(timeout, S_IRUGO | S_IWUSR, sdev_show_timeout, sdev_store_timeout);
static ssize_t
sdev_show_eh_timeout(struct device *dev, struct device_attribute *attr, char *buf)
{
struct scsi_device *sdev;
sdev = to_scsi_device(dev);
return snprintf(buf, 20, "%u\n", sdev->eh_timeout / HZ);
}
static ssize_t
sdev_store_eh_timeout(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
struct scsi_device *sdev;
unsigned int eh_timeout;
int err;
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
sdev = to_scsi_device(dev);
err = kstrtouint(buf, 10, &eh_timeout);
if (err)
return err;
sdev->eh_timeout = eh_timeout * HZ;
return count;
}
static DEVICE_ATTR(eh_timeout, S_IRUGO | S_IWUSR, sdev_show_eh_timeout, sdev_store_eh_timeout);
static ssize_t
store_rescan_field (struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
scsi_rescan_device(dev);
return count;
}
static DEVICE_ATTR(rescan, S_IWUSR, NULL, store_rescan_field);
static ssize_t
sdev_store_delete(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
if (device_remove_file_self(dev, attr))
scsi_remove_device(to_scsi_device(dev));
return count;
};
static DEVICE_ATTR(delete, S_IWUSR, NULL, sdev_store_delete);
static ssize_t
store_state_field(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
int i, ret;
struct scsi_device *sdev = to_scsi_device(dev);
enum scsi_device_state state = 0;
for (i = 0; i < ARRAY_SIZE(sdev_states); i++) {
const int len = strlen(sdev_states[i].name);
if (strncmp(sdev_states[i].name, buf, len) == 0 &&
buf[len] == '\n') {
state = sdev_states[i].value;
break;
}
}
if (!state)
return -EINVAL;
mutex_lock(&sdev->state_mutex);
ret = scsi_device_set_state(sdev, state);
mutex_unlock(&sdev->state_mutex);
return ret == 0 ? count : -EINVAL;
}
static ssize_t
show_state_field(struct device *dev, struct device_attribute *attr, char *buf)
{
struct scsi_device *sdev = to_scsi_device(dev);
const char *name = scsi_device_state_name(sdev->sdev_state);
if (!name)
return -EINVAL;
return snprintf(buf, 20, "%s\n", name);
}
static DEVICE_ATTR(state, S_IRUGO | S_IWUSR, show_state_field, store_state_field);
static ssize_t
show_queue_type_field(struct device *dev, struct device_attribute *attr,
char *buf)
{
struct scsi_device *sdev = to_scsi_device(dev);
const char *name = "none";
if (sdev->simple_tags)
name = "simple";
return snprintf(buf, 20, "%s\n", name);
}
static ssize_t
store_queue_type_field(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
struct scsi_device *sdev = to_scsi_device(dev);
if (!sdev->tagged_supported)
return -EINVAL;
sdev_printk(KERN_INFO, sdev,
"ignoring write to deprecated queue_type attribute");
return count;
}
static DEVICE_ATTR(queue_type, S_IRUGO | S_IWUSR, show_queue_type_field,
store_queue_type_field);
#define sdev_vpd_pg_attr(_page) \
static ssize_t \
show_vpd_##_page(struct file *filp, struct kobject *kobj, \
struct bin_attribute *bin_attr, \
char *buf, loff_t off, size_t count) \
{ \
struct device *dev = container_of(kobj, struct device, kobj); \
struct scsi_device *sdev = to_scsi_device(dev); \
int ret; \
if (!sdev->vpd_##_page) \
return -EINVAL; \
rcu_read_lock(); \
ret = memory_read_from_buffer(buf, count, &off, \
rcu_dereference(sdev->vpd_##_page), \
sdev->vpd_##_page##_len); \
rcu_read_unlock(); \
return ret; \
} \
static struct bin_attribute dev_attr_vpd_##_page = { \
.attr = {.name = __stringify(vpd_##_page), .mode = S_IRUGO }, \
.size = 0, \
.read = show_vpd_##_page, \
};
sdev_vpd_pg_attr(pg83);
sdev_vpd_pg_attr(pg80);
static ssize_t show_inquiry(struct file *filep, struct kobject *kobj,
struct bin_attribute *bin_attr,
char *buf, loff_t off, size_t count)
{
struct device *dev = container_of(kobj, struct device, kobj);
struct scsi_device *sdev = to_scsi_device(dev);
if (!sdev->inquiry)
return -EINVAL;
return memory_read_from_buffer(buf, count, &off, sdev->inquiry,
sdev->inquiry_len);
}
static struct bin_attribute dev_attr_inquiry = {
.attr = {
.name = "inquiry",
.mode = S_IRUGO,
},
.size = 0,
.read = show_inquiry,
};
static ssize_t
show_iostat_counterbits(struct device *dev, struct device_attribute *attr,
char *buf)
{
return snprintf(buf, 20, "%d\n", (int)sizeof(atomic_t) * 8);
}
static DEVICE_ATTR(iocounterbits, S_IRUGO, show_iostat_counterbits, NULL);
#define show_sdev_iostat(field) \
static ssize_t \
show_iostat_##field(struct device *dev, struct device_attribute *attr, \
char *buf) \
{ \
struct scsi_device *sdev = to_scsi_device(dev); \
unsigned long long count = atomic_read(&sdev->field); \
return snprintf(buf, 20, "0x%llx\n", count); \
} \
static DEVICE_ATTR(field, S_IRUGO, show_iostat_##field, NULL)
show_sdev_iostat(iorequest_cnt);
show_sdev_iostat(iodone_cnt);
show_sdev_iostat(ioerr_cnt);
[SCSI] modalias for scsi devices The following patch adds support for sysfs/uevent modalias attribute for scsi devices (like disks, tapes, cdroms etc), based on whatever current sd.c, sr.c, st.c and osst.c drivers supports. The modalias format is like this: scsi:type-0x04 (for TYPE_WORM, handled by sr.c now). Several comments. o This hexadecimal type value is because all TYPE_XXX constants in include/scsi/scsi.h are given in hex, but __stringify() will not convert them to decimal (so it will NOT be scsi:type-4). Since it does not really matter in which format it is, while both modalias in module and modalias attribute match each other, I descided to go for that 0x%02x format (and added a comment in include/scsi/scsi.h to keep them that way), instead of changing them all to decimal. o There was no .uevent routine for SCSI bus. It might be a good idea to add some more ueven environment variables in there. o osst.c driver handles tapes too, like st.c, but only SOME tapes. With this setup, hotplug scripts (or whatever is used by the user) will try to load both st and osst modules for all SCSI tapes found, because both modules have scsi:type-0x01 alias). It is not harmful, but one extra module is no good either. It is possible to solve this, by exporting more info in modalias attribute, including vendor and device identification strings, so that modalias becomes something like scsi:type-0x12:vendor-Adaptec LTD:device-OnStream Tape Drive and having that, match for all 3 attributes, not only device type. But oh well, vendor and device strings may be large, and they do contain spaces and whatnot. So I left them for now, awaiting for comments first. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-10-27 20:02:37 +08:00
static ssize_t
sdev_show_modalias(struct device *dev, struct device_attribute *attr, char *buf)
{
struct scsi_device *sdev;
sdev = to_scsi_device(dev);
return snprintf (buf, 20, SCSI_DEVICE_MODALIAS_FMT "\n", sdev->type);
}
static DEVICE_ATTR(modalias, S_IRUGO, sdev_show_modalias, NULL);
#define DECLARE_EVT_SHOW(name, Cap_name) \
static ssize_t \
sdev_show_evt_##name(struct device *dev, struct device_attribute *attr, \
char *buf) \
{ \
struct scsi_device *sdev = to_scsi_device(dev); \
int val = test_bit(SDEV_EVT_##Cap_name, sdev->supported_events);\
return snprintf(buf, 20, "%d\n", val); \
}
#define DECLARE_EVT_STORE(name, Cap_name) \
static ssize_t \
sdev_store_evt_##name(struct device *dev, struct device_attribute *attr,\
const char *buf, size_t count) \
{ \
struct scsi_device *sdev = to_scsi_device(dev); \
int val = simple_strtoul(buf, NULL, 0); \
if (val == 0) \
clear_bit(SDEV_EVT_##Cap_name, sdev->supported_events); \
else if (val == 1) \
set_bit(SDEV_EVT_##Cap_name, sdev->supported_events); \
else \
return -EINVAL; \
return count; \
}
#define DECLARE_EVT(name, Cap_name) \
DECLARE_EVT_SHOW(name, Cap_name) \
DECLARE_EVT_STORE(name, Cap_name) \
static DEVICE_ATTR(evt_##name, S_IRUGO, sdev_show_evt_##name, \
sdev_store_evt_##name);
#define REF_EVT(name) &dev_attr_evt_##name.attr
DECLARE_EVT(media_change, MEDIA_CHANGE)
DECLARE_EVT(inquiry_change_reported, INQUIRY_CHANGE_REPORTED)
DECLARE_EVT(capacity_change_reported, CAPACITY_CHANGE_REPORTED)
DECLARE_EVT(soft_threshold_reached, SOFT_THRESHOLD_REACHED_REPORTED)
DECLARE_EVT(mode_parameter_change_reported, MODE_PARAMETER_CHANGE_REPORTED)
DECLARE_EVT(lun_change_reported, LUN_CHANGE_REPORTED)
static ssize_t
sdev_store_queue_depth(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
int depth, retval;
struct scsi_device *sdev = to_scsi_device(dev);
struct scsi_host_template *sht = sdev->host->hostt;
if (!sht->change_queue_depth)
return -EINVAL;
depth = simple_strtoul(buf, NULL, 0);
if (depth < 1 || depth > sdev->host->can_queue)
return -EINVAL;
retval = sht->change_queue_depth(sdev, depth);
if (retval < 0)
return retval;
[SCSI] add queue_depth ramp up code Current FC HBA queue_depth ramp up code depends on last queue full time. The sdev already has last_queue_full_time field to track last queue full time but stored value is truncated by last four bits. So this patch updates last_queue_full_time without truncating last 4 bits to store full value and then updates its only current usages in scsi_track_queue_full to ignore last four bits to keep current usages same while also use this field in added ramp up code. Adds scsi_handle_queue_ramp_up to ramp up queue_depth on successful completion of IO. The scsi_handle_queue_ramp_up will do ramp up on all luns of a target, just same as ramp down done on all luns on a target. The ramp up is skipped in case the change_queue_depth is not supported by LLD or already reached to added max_queue_depth. Updates added max_queue_depth on every new update to default queue_depth value. The ramp up is also skipped if lapsed time since either last queue ramp up or down is less than LLD specified queue_ramp_up_period. Adds queue_ramp_up_period to sysfs but only if change_queue_depth is supported since ramp up and queue_ramp_up_period is needed only in case change_queue_depth is supported first. Initializes queue_ramp_up_period to 120HZ jiffies as initial default value, it is same as used in existing lpfc and qla2xxx. -v2 Combined all ramp code into this single patch. -v3 Moves max_queue_depth initialization after slave_configure is called from after slave_alloc calling done. Also adjusted max_queue_depth check to skip ramp up if current queue_depth is >= max_queue_depth. -v4 Changes sdev->queue_ramp_up_period unit to ms when using sysfs i/f to store or show its value. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Christof Schmitt <christof.schmitt@de.ibm.com> Tested-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-23 06:46:33 +08:00
sdev->max_queue_depth = sdev->queue_depth;
return count;
}
sdev_show_function(queue_depth, "%d\n");
static DEVICE_ATTR(queue_depth, S_IRUGO | S_IWUSR, sdev_show_queue_depth,
sdev_store_queue_depth);
static ssize_t
sdev_show_wwid(struct device *dev, struct device_attribute *attr,
char *buf)
{
struct scsi_device *sdev = to_scsi_device(dev);
ssize_t count;
count = scsi_vpd_lun_id(sdev, buf, PAGE_SIZE);
if (count > 0) {
buf[count] = '\n';
count++;
}
return count;
}
static DEVICE_ATTR(wwid, S_IRUGO, sdev_show_wwid, NULL);
#ifdef CONFIG_SCSI_DH
static ssize_t
sdev_show_dh_state(struct device *dev, struct device_attribute *attr,
char *buf)
{
struct scsi_device *sdev = to_scsi_device(dev);
if (!sdev->handler)
return snprintf(buf, 20, "detached\n");
return snprintf(buf, 20, "%s\n", sdev->handler->name);
}
static ssize_t
sdev_store_dh_state(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
struct scsi_device *sdev = to_scsi_device(dev);
int err = -EINVAL;
if (sdev->sdev_state == SDEV_CANCEL ||
sdev->sdev_state == SDEV_DEL)
return -ENODEV;
if (!sdev->handler) {
/*
* Attach to a device handler
*/
err = scsi_dh_attach(sdev->request_queue, buf);
} else if (!strncmp(buf, "activate", 8)) {
/*
* Activate a device handler
*/
if (sdev->handler->activate)
err = sdev->handler->activate(sdev, NULL, NULL);
else
err = 0;
} else if (!strncmp(buf, "detach", 6)) {
/*
* Detach from a device handler
*/
sdev_printk(KERN_WARNING, sdev,
"can't detach handler %s.\n",
sdev->handler->name);
err = -EINVAL;
}
return err < 0 ? err : count;
}
static DEVICE_ATTR(dh_state, S_IRUGO | S_IWUSR, sdev_show_dh_state,
sdev_store_dh_state);
static ssize_t
sdev_show_access_state(struct device *dev,
struct device_attribute *attr,
char *buf)
{
struct scsi_device *sdev = to_scsi_device(dev);
unsigned char access_state;
const char *access_state_name;
if (!sdev->handler)
return -EINVAL;
access_state = (sdev->access_state & SCSI_ACCESS_STATE_MASK);
access_state_name = scsi_access_state_name(access_state);
return sprintf(buf, "%s\n",
access_state_name ? access_state_name : "unknown");
}
static DEVICE_ATTR(access_state, S_IRUGO, sdev_show_access_state, NULL);
static ssize_t
sdev_show_preferred_path(struct device *dev,
struct device_attribute *attr,
char *buf)
{
struct scsi_device *sdev = to_scsi_device(dev);
if (!sdev->handler)
return -EINVAL;
if (sdev->access_state & SCSI_ACCESS_STATE_PREFERRED)
return sprintf(buf, "1\n");
else
return sprintf(buf, "0\n");
}
static DEVICE_ATTR(preferred_path, S_IRUGO, sdev_show_preferred_path, NULL);
#endif
[SCSI] add queue_depth ramp up code Current FC HBA queue_depth ramp up code depends on last queue full time. The sdev already has last_queue_full_time field to track last queue full time but stored value is truncated by last four bits. So this patch updates last_queue_full_time without truncating last 4 bits to store full value and then updates its only current usages in scsi_track_queue_full to ignore last four bits to keep current usages same while also use this field in added ramp up code. Adds scsi_handle_queue_ramp_up to ramp up queue_depth on successful completion of IO. The scsi_handle_queue_ramp_up will do ramp up on all luns of a target, just same as ramp down done on all luns on a target. The ramp up is skipped in case the change_queue_depth is not supported by LLD or already reached to added max_queue_depth. Updates added max_queue_depth on every new update to default queue_depth value. The ramp up is also skipped if lapsed time since either last queue ramp up or down is less than LLD specified queue_ramp_up_period. Adds queue_ramp_up_period to sysfs but only if change_queue_depth is supported since ramp up and queue_ramp_up_period is needed only in case change_queue_depth is supported first. Initializes queue_ramp_up_period to 120HZ jiffies as initial default value, it is same as used in existing lpfc and qla2xxx. -v2 Combined all ramp code into this single patch. -v3 Moves max_queue_depth initialization after slave_configure is called from after slave_alloc calling done. Also adjusted max_queue_depth check to skip ramp up if current queue_depth is >= max_queue_depth. -v4 Changes sdev->queue_ramp_up_period unit to ms when using sysfs i/f to store or show its value. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Christof Schmitt <christof.schmitt@de.ibm.com> Tested-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-23 06:46:33 +08:00
static ssize_t
sdev_show_queue_ramp_up_period(struct device *dev,
struct device_attribute *attr,
char *buf)
{
struct scsi_device *sdev;
sdev = to_scsi_device(dev);
return snprintf(buf, 20, "%u\n",
jiffies_to_msecs(sdev->queue_ramp_up_period));
}
static ssize_t
sdev_store_queue_ramp_up_period(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
struct scsi_device *sdev = to_scsi_device(dev);
unsigned int period;
[SCSI] add queue_depth ramp up code Current FC HBA queue_depth ramp up code depends on last queue full time. The sdev already has last_queue_full_time field to track last queue full time but stored value is truncated by last four bits. So this patch updates last_queue_full_time without truncating last 4 bits to store full value and then updates its only current usages in scsi_track_queue_full to ignore last four bits to keep current usages same while also use this field in added ramp up code. Adds scsi_handle_queue_ramp_up to ramp up queue_depth on successful completion of IO. The scsi_handle_queue_ramp_up will do ramp up on all luns of a target, just same as ramp down done on all luns on a target. The ramp up is skipped in case the change_queue_depth is not supported by LLD or already reached to added max_queue_depth. Updates added max_queue_depth on every new update to default queue_depth value. The ramp up is also skipped if lapsed time since either last queue ramp up or down is less than LLD specified queue_ramp_up_period. Adds queue_ramp_up_period to sysfs but only if change_queue_depth is supported since ramp up and queue_ramp_up_period is needed only in case change_queue_depth is supported first. Initializes queue_ramp_up_period to 120HZ jiffies as initial default value, it is same as used in existing lpfc and qla2xxx. -v2 Combined all ramp code into this single patch. -v3 Moves max_queue_depth initialization after slave_configure is called from after slave_alloc calling done. Also adjusted max_queue_depth check to skip ramp up if current queue_depth is >= max_queue_depth. -v4 Changes sdev->queue_ramp_up_period unit to ms when using sysfs i/f to store or show its value. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Christof Schmitt <christof.schmitt@de.ibm.com> Tested-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-23 06:46:33 +08:00
if (kstrtouint(buf, 10, &period))
[SCSI] add queue_depth ramp up code Current FC HBA queue_depth ramp up code depends on last queue full time. The sdev already has last_queue_full_time field to track last queue full time but stored value is truncated by last four bits. So this patch updates last_queue_full_time without truncating last 4 bits to store full value and then updates its only current usages in scsi_track_queue_full to ignore last four bits to keep current usages same while also use this field in added ramp up code. Adds scsi_handle_queue_ramp_up to ramp up queue_depth on successful completion of IO. The scsi_handle_queue_ramp_up will do ramp up on all luns of a target, just same as ramp down done on all luns on a target. The ramp up is skipped in case the change_queue_depth is not supported by LLD or already reached to added max_queue_depth. Updates added max_queue_depth on every new update to default queue_depth value. The ramp up is also skipped if lapsed time since either last queue ramp up or down is less than LLD specified queue_ramp_up_period. Adds queue_ramp_up_period to sysfs but only if change_queue_depth is supported since ramp up and queue_ramp_up_period is needed only in case change_queue_depth is supported first. Initializes queue_ramp_up_period to 120HZ jiffies as initial default value, it is same as used in existing lpfc and qla2xxx. -v2 Combined all ramp code into this single patch. -v3 Moves max_queue_depth initialization after slave_configure is called from after slave_alloc calling done. Also adjusted max_queue_depth check to skip ramp up if current queue_depth is >= max_queue_depth. -v4 Changes sdev->queue_ramp_up_period unit to ms when using sysfs i/f to store or show its value. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Christof Schmitt <christof.schmitt@de.ibm.com> Tested-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-23 06:46:33 +08:00
return -EINVAL;
sdev->queue_ramp_up_period = msecs_to_jiffies(period);
return count;
[SCSI] add queue_depth ramp up code Current FC HBA queue_depth ramp up code depends on last queue full time. The sdev already has last_queue_full_time field to track last queue full time but stored value is truncated by last four bits. So this patch updates last_queue_full_time without truncating last 4 bits to store full value and then updates its only current usages in scsi_track_queue_full to ignore last four bits to keep current usages same while also use this field in added ramp up code. Adds scsi_handle_queue_ramp_up to ramp up queue_depth on successful completion of IO. The scsi_handle_queue_ramp_up will do ramp up on all luns of a target, just same as ramp down done on all luns on a target. The ramp up is skipped in case the change_queue_depth is not supported by LLD or already reached to added max_queue_depth. Updates added max_queue_depth on every new update to default queue_depth value. The ramp up is also skipped if lapsed time since either last queue ramp up or down is less than LLD specified queue_ramp_up_period. Adds queue_ramp_up_period to sysfs but only if change_queue_depth is supported since ramp up and queue_ramp_up_period is needed only in case change_queue_depth is supported first. Initializes queue_ramp_up_period to 120HZ jiffies as initial default value, it is same as used in existing lpfc and qla2xxx. -v2 Combined all ramp code into this single patch. -v3 Moves max_queue_depth initialization after slave_configure is called from after slave_alloc calling done. Also adjusted max_queue_depth check to skip ramp up if current queue_depth is >= max_queue_depth. -v4 Changes sdev->queue_ramp_up_period unit to ms when using sysfs i/f to store or show its value. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Christof Schmitt <christof.schmitt@de.ibm.com> Tested-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-23 06:46:33 +08:00
}
static DEVICE_ATTR(queue_ramp_up_period, S_IRUGO | S_IWUSR,
sdev_show_queue_ramp_up_period,
sdev_store_queue_ramp_up_period);
[SCSI] add queue_depth ramp up code Current FC HBA queue_depth ramp up code depends on last queue full time. The sdev already has last_queue_full_time field to track last queue full time but stored value is truncated by last four bits. So this patch updates last_queue_full_time without truncating last 4 bits to store full value and then updates its only current usages in scsi_track_queue_full to ignore last four bits to keep current usages same while also use this field in added ramp up code. Adds scsi_handle_queue_ramp_up to ramp up queue_depth on successful completion of IO. The scsi_handle_queue_ramp_up will do ramp up on all luns of a target, just same as ramp down done on all luns on a target. The ramp up is skipped in case the change_queue_depth is not supported by LLD or already reached to added max_queue_depth. Updates added max_queue_depth on every new update to default queue_depth value. The ramp up is also skipped if lapsed time since either last queue ramp up or down is less than LLD specified queue_ramp_up_period. Adds queue_ramp_up_period to sysfs but only if change_queue_depth is supported since ramp up and queue_ramp_up_period is needed only in case change_queue_depth is supported first. Initializes queue_ramp_up_period to 120HZ jiffies as initial default value, it is same as used in existing lpfc and qla2xxx. -v2 Combined all ramp code into this single patch. -v3 Moves max_queue_depth initialization after slave_configure is called from after slave_alloc calling done. Also adjusted max_queue_depth check to skip ramp up if current queue_depth is >= max_queue_depth. -v4 Changes sdev->queue_ramp_up_period unit to ms when using sysfs i/f to store or show its value. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Christof Schmitt <christof.schmitt@de.ibm.com> Tested-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-23 06:46:33 +08:00
static umode_t scsi_sdev_attr_is_visible(struct kobject *kobj,
struct attribute *attr, int i)
{
struct device *dev = container_of(kobj, struct device, kobj);
struct scsi_device *sdev = to_scsi_device(dev);
if (attr == &dev_attr_queue_depth.attr &&
!sdev->host->hostt->change_queue_depth)
return S_IRUGO;
if (attr == &dev_attr_queue_ramp_up_period.attr &&
!sdev->host->hostt->change_queue_depth)
return 0;
#ifdef CONFIG_SCSI_DH
if (attr == &dev_attr_access_state.attr &&
!sdev->handler)
return 0;
if (attr == &dev_attr_preferred_path.attr &&
!sdev->handler)
return 0;
#endif
return attr->mode;
}
static umode_t scsi_sdev_bin_attr_is_visible(struct kobject *kobj,
struct bin_attribute *attr, int i)
{
struct device *dev = container_of(kobj, struct device, kobj);
struct scsi_device *sdev = to_scsi_device(dev);
if (attr == &dev_attr_vpd_pg80 && !sdev->vpd_pg80)
return 0;
if (attr == &dev_attr_vpd_pg83 && !sdev->vpd_pg83)
return 0;
return S_IRUGO;
}
/* Default template for device attributes. May NOT be modified */
static struct attribute *scsi_sdev_attrs[] = {
&dev_attr_device_blocked.attr,
&dev_attr_type.attr,
&dev_attr_scsi_level.attr,
&dev_attr_device_busy.attr,
&dev_attr_vendor.attr,
&dev_attr_model.attr,
&dev_attr_rev.attr,
&dev_attr_rescan.attr,
&dev_attr_delete.attr,
&dev_attr_state.attr,
&dev_attr_timeout.attr,
&dev_attr_eh_timeout.attr,
&dev_attr_iocounterbits.attr,
&dev_attr_iorequest_cnt.attr,
&dev_attr_iodone_cnt.attr,
&dev_attr_ioerr_cnt.attr,
&dev_attr_modalias.attr,
&dev_attr_queue_depth.attr,
&dev_attr_queue_type.attr,
&dev_attr_wwid.attr,
#ifdef CONFIG_SCSI_DH
&dev_attr_dh_state.attr,
&dev_attr_access_state.attr,
&dev_attr_preferred_path.attr,
#endif
&dev_attr_queue_ramp_up_period.attr,
REF_EVT(media_change),
REF_EVT(inquiry_change_reported),
REF_EVT(capacity_change_reported),
REF_EVT(soft_threshold_reached),
REF_EVT(mode_parameter_change_reported),
REF_EVT(lun_change_reported),
NULL
};
static struct bin_attribute *scsi_sdev_bin_attrs[] = {
&dev_attr_vpd_pg83,
&dev_attr_vpd_pg80,
&dev_attr_inquiry,
NULL
};
static struct attribute_group scsi_sdev_attr_group = {
.attrs = scsi_sdev_attrs,
.bin_attrs = scsi_sdev_bin_attrs,
.is_visible = scsi_sdev_attr_is_visible,
.is_bin_visible = scsi_sdev_bin_attr_is_visible,
};
static const struct attribute_group *scsi_sdev_attr_groups[] = {
&scsi_sdev_attr_group,
NULL
};
static int scsi_target_add(struct scsi_target *starget)
{
int error;
if (starget->state != STARGET_CREATED)
return 0;
error = device_add(&starget->dev);
if (error) {
dev_err(&starget->dev, "target device_add failed, error %d\n", error);
return error;
}
transport_add_device(&starget->dev);
starget->state = STARGET_RUNNING;
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 22:41:42 +08:00
pm_runtime_set_active(&starget->dev);
pm_runtime_enable(&starget->dev);
device_enable_async_suspend(&starget->dev);
return 0;
}
/**
* scsi_sysfs_add_sdev - add scsi device to sysfs
* @sdev: scsi_device to add
*
* Return value:
* 0 on Success / non-zero on Failure
**/
int scsi_sysfs_add_sdev(struct scsi_device *sdev)
{
int error, i;
struct request_queue *rq = sdev->request_queue;
struct scsi_target *starget = sdev->sdev_target;
error = scsi_target_add(starget);
if (error)
return error;
transport_configure_device(&starget->dev);
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 22:41:42 +08:00
device_enable_async_suspend(&sdev->sdev_gendev);
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 22:41:42 +08:00
scsi_autopm_get_target(starget);
pm_runtime_set_active(&sdev->sdev_gendev);
pm_runtime_forbid(&sdev->sdev_gendev);
pm_runtime_enable(&sdev->sdev_gendev);
scsi_autopm_put_target(starget);
scsi_autopm_get_device(sdev);
error = scsi_dh_add_device(sdev);
if (error)
/*
* device_handler is optional, so any error can be ignored
*/
sdev_printk(KERN_INFO, sdev,
"failed to add device handler: %d\n", error);
error = device_add(&sdev->sdev_gendev);
if (error) {
sdev_printk(KERN_INFO, sdev,
"failed to add device: %d\n", error);
scsi_dh_remove_device(sdev);
return error;
}
device_enable_async_suspend(&sdev->sdev_dev);
error = device_add(&sdev->sdev_dev);
if (error) {
sdev_printk(KERN_INFO, sdev,
"failed to add class device: %d\n", error);
scsi_dh_remove_device(sdev);
device_del(&sdev->sdev_gendev);
return error;
}
transport_add_device(&sdev->sdev_gendev);
sdev->is_visible = 1;
error = bsg_register_queue(rq, &sdev->sdev_gendev, NULL, NULL);
if (error)
/* we're treating error on bsg register as non-fatal,
* so pretend nothing went wrong */
sdev_printk(KERN_INFO, sdev,
"Failed to register bsg queue, errno=%d\n", error);
/* add additional host specific attributes */
if (sdev->host->hostt->sdev_attrs) {
for (i = 0; sdev->host->hostt->sdev_attrs[i]; i++) {
error = device_create_file(&sdev->sdev_gendev,
sdev->host->hostt->sdev_attrs[i]);
if (error)
return error;
}
}
scsi_autopm_put_device(sdev);
return error;
}
void __scsi_remove_device(struct scsi_device *sdev)
{
struct device *dev = &sdev->sdev_gendev;
int res;
scsi_sysfs: protect against double execution of __scsi_remove_device() On some host errors storvsc module tries to remove sdev by scheduling a job which does the following: sdev = scsi_device_lookup(wrk->host, 0, 0, wrk->lun); if (sdev) { scsi_remove_device(sdev); scsi_device_put(sdev); } While this code seems correct the following crash is observed: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC RIP: 0010:[<ffffffff81169979>] [<ffffffff81169979>] bdi_destroy+0x39/0x220 ... [<ffffffff814aecdc>] ? _raw_spin_unlock_irq+0x2c/0x40 [<ffffffff8127b7db>] blk_cleanup_queue+0x17b/0x270 [<ffffffffa00b54c4>] __scsi_remove_device+0x54/0xd0 [scsi_mod] [<ffffffffa00b556b>] scsi_remove_device+0x2b/0x40 [scsi_mod] [<ffffffffa00ec47d>] storvsc_remove_lun+0x3d/0x60 [hv_storvsc] [<ffffffff81080791>] process_one_work+0x1b1/0x530 ... The problem comes with the fact that many such jobs (for the same device) are being scheduled simultaneously. While scsi_remove_device() uses shost->scan_mutex and scsi_device_lookup() will fail for a device in SDEV_DEL state there is no protection against someone who did scsi_device_lookup() before we actually entered __scsi_remove_device(). So the whole scenario looks like that: two callers do simultaneous (or preemption happens) calls to scsi_device_lookup() ant these calls succeed for both of them, after that they try doing scsi_remove_device(). shost->scan_mutex only serializes their calls to __scsi_remove_device() and we end up doing the cleanup path twice. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2015-11-19 21:02:19 +08:00
/*
* This cleanup path is not reentrant and while it is impossible
* to get a new reference with scsi_device_get() someone can still
* hold a previously acquired one.
*/
if (sdev->sdev_state == SDEV_DEL)
return;
if (sdev->is_visible) {
/*
* If scsi_internal_target_block() is running concurrently,
* wait until it has finished before changing the device state.
*/
mutex_lock(&sdev->state_mutex);
/*
* If blocked, we go straight to DEL and restart the queue so
* any commands issued during driver shutdown (like sync
* cache) are errored immediately.
*/
res = scsi_device_set_state(sdev, SDEV_CANCEL);
if (res != 0) {
res = scsi_device_set_state(sdev, SDEV_DEL);
if (res == 0)
scsi_start_queue(sdev);
}
mutex_unlock(&sdev->state_mutex);
if (res != 0)
return;
bsg_unregister_queue(sdev->request_queue);
device_unregister(&sdev->sdev_dev);
transport_remove_device(dev);
scsi_dh_remove_device(sdev);
device_del(dev);
} else
put_device(&sdev->sdev_dev);
/*
* Stop accepting new requests and wait until all queuecommand() and
* scsi_run_queue() invocations have finished before tearing down the
* device.
*/
mutex_lock(&sdev->state_mutex);
scsi_device_set_state(sdev, SDEV_DEL);
mutex_unlock(&sdev->state_mutex);
blk_cleanup_queue(sdev->request_queue);
cancel_work_sync(&sdev->requeue_work);
if (sdev->host->hostt->slave_destroy)
sdev->host->hostt->slave_destroy(sdev);
transport_destroy_device(dev);
/*
* Paired with the kref_get() in scsi_sysfs_initialize(). We have
* remoed sysfs visibility from the device, so make the target
* invisible if this was the last device underneath it.
*/
scsi_target_reap(scsi_target(sdev));
put_device(dev);
}
/**
* scsi_remove_device - unregister a device from the scsi bus
* @sdev: scsi_device to unregister
**/
void scsi_remove_device(struct scsi_device *sdev)
{
struct Scsi_Host *shost = sdev->host;
mutex_lock(&shost->scan_mutex);
__scsi_remove_device(sdev);
mutex_unlock(&shost->scan_mutex);
}
EXPORT_SYMBOL(scsi_remove_device);
static void __scsi_remove_target(struct scsi_target *starget)
{
struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
unsigned long flags;
struct scsi_device *sdev;
spin_lock_irqsave(shost->host_lock, flags);
restart:
list_for_each_entry(sdev, &shost->__devices, siblings) {
if (sdev->channel != starget->channel ||
sdev->id != starget->id ||
[SCSI] Fix race when removing SCSI devices Removing SCSI devices through echo 1 > /sys/bus/scsi/devices/ ... /delete while the FC transport class removes the SCSI target can lead to an oops: Unable to handle kernel pointer dereference at virtual kernel address 00000000b6815000 Oops: 0011 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: sunrpc qeth_l3 binfmt_misc dm_multipath scsi_dh dm_mod ipv6 qeth ccwgroup [last unloaded: scsi_wait_scan] CPU: 1 Not tainted 2.6.35.5-45.x.20100924-s390xdefault #1 Process fc_wq_0 (pid: 861, task: 00000000b7331240, ksp: 00000000b735bac0) Krnl PSW : 0704200180000000 00000000003ff6e4 (__scsi_remove_device+0x24/0xd0) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3 Krnl GPRS: 0000000000000001 0000000000000000 00000000b6815000 00000000bc24a8c0 00000000003ff7c8 000000000056dbb8 0000000000000002 0000000000835d80 ffffffff00000000 0000000000001000 00000000b6815000 00000000bc24a7f0 00000000b68151a0 00000000b6815000 00000000b735bc20 00000000b735bbf8 Krnl Code: 00000000003ff6d6: a7840001 brc 8,3ff6d8 00000000003ff6da: a7fbffd8 aghi %r15,-40 00000000003ff6de: e3e0f0980024 stg %r14,152(%r15) >00000000003ff6e4: e31021200004 lg %r1,288(%r2) 00000000003ff6ea: a71f0000 cghi %r1,0 00000000003ff6ee: a7a40011 brc 10,3ff710 00000000003ff6f2: a7390003 lghi %r3,3 00000000003ff6f6: c0e5ffffc8b1 brasl %r14,3f8858 Call Trace: ([<0000000000001000>] 0x1000) [<00000000003ff7d2>] scsi_remove_device+0x42/0x54 [<00000000003ff8ba>] __scsi_remove_target+0xca/0xfc [<00000000003ff99a>] __remove_child+0x3a/0x48 [<00000000003e3246>] device_for_each_child+0x72/0xbc [<00000000003ff93a>] scsi_remove_target+0x4e/0x74 [<0000000000406586>] fc_rport_final_delete+0xb2/0x23c [<000000000015d080>] worker_thread+0x200/0x344 [<000000000016330c>] kthread+0xa0/0xa8 [<0000000000106c1a>] kernel_thread_starter+0x6/0xc [<0000000000106c14>] kernel_thread_starter+0x0/0xc INFO: lockdep is turned off. Last Breaking-Event-Address: [<00000000003ff7cc>] scsi_remove_device+0x3c/0x54 The function __scsi_remove_target iterates through the SCSI devices on the host, but it drops the host_lock before calling scsi_remove_device. When the SCSI device is deleted from another thread, the pointer to the SCSI device in scsi_remove_device can become invalid. Fix this by getting a reference to the SCSI device before dropping the host_lock to keep the SCSI device alive for the call to scsi_remove_device. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Cc: Stable Tree <stable@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-10-06 19:19:44 +08:00
scsi_device_get(sdev))
continue;
spin_unlock_irqrestore(shost->host_lock, flags);
scsi_remove_device(sdev);
[SCSI] Fix race when removing SCSI devices Removing SCSI devices through echo 1 > /sys/bus/scsi/devices/ ... /delete while the FC transport class removes the SCSI target can lead to an oops: Unable to handle kernel pointer dereference at virtual kernel address 00000000b6815000 Oops: 0011 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: sunrpc qeth_l3 binfmt_misc dm_multipath scsi_dh dm_mod ipv6 qeth ccwgroup [last unloaded: scsi_wait_scan] CPU: 1 Not tainted 2.6.35.5-45.x.20100924-s390xdefault #1 Process fc_wq_0 (pid: 861, task: 00000000b7331240, ksp: 00000000b735bac0) Krnl PSW : 0704200180000000 00000000003ff6e4 (__scsi_remove_device+0x24/0xd0) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3 Krnl GPRS: 0000000000000001 0000000000000000 00000000b6815000 00000000bc24a8c0 00000000003ff7c8 000000000056dbb8 0000000000000002 0000000000835d80 ffffffff00000000 0000000000001000 00000000b6815000 00000000bc24a7f0 00000000b68151a0 00000000b6815000 00000000b735bc20 00000000b735bbf8 Krnl Code: 00000000003ff6d6: a7840001 brc 8,3ff6d8 00000000003ff6da: a7fbffd8 aghi %r15,-40 00000000003ff6de: e3e0f0980024 stg %r14,152(%r15) >00000000003ff6e4: e31021200004 lg %r1,288(%r2) 00000000003ff6ea: a71f0000 cghi %r1,0 00000000003ff6ee: a7a40011 brc 10,3ff710 00000000003ff6f2: a7390003 lghi %r3,3 00000000003ff6f6: c0e5ffffc8b1 brasl %r14,3f8858 Call Trace: ([<0000000000001000>] 0x1000) [<00000000003ff7d2>] scsi_remove_device+0x42/0x54 [<00000000003ff8ba>] __scsi_remove_target+0xca/0xfc [<00000000003ff99a>] __remove_child+0x3a/0x48 [<00000000003e3246>] device_for_each_child+0x72/0xbc [<00000000003ff93a>] scsi_remove_target+0x4e/0x74 [<0000000000406586>] fc_rport_final_delete+0xb2/0x23c [<000000000015d080>] worker_thread+0x200/0x344 [<000000000016330c>] kthread+0xa0/0xa8 [<0000000000106c1a>] kernel_thread_starter+0x6/0xc [<0000000000106c14>] kernel_thread_starter+0x0/0xc INFO: lockdep is turned off. Last Breaking-Event-Address: [<00000000003ff7cc>] scsi_remove_device+0x3c/0x54 The function __scsi_remove_target iterates through the SCSI devices on the host, but it drops the host_lock before calling scsi_remove_device. When the SCSI device is deleted from another thread, the pointer to the SCSI device in scsi_remove_device can become invalid. Fix this by getting a reference to the SCSI device before dropping the host_lock to keep the SCSI device alive for the call to scsi_remove_device. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Cc: Stable Tree <stable@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-10-06 19:19:44 +08:00
scsi_device_put(sdev);
spin_lock_irqsave(shost->host_lock, flags);
goto restart;
}
spin_unlock_irqrestore(shost->host_lock, flags);
}
/**
* scsi_remove_target - try to remove a target and all its devices
* @dev: generic starget or parent of generic stargets to be removed
*
* Note: This is slightly racy. It is possible that if the user
* requests the addition of another device then the target won't be
* removed.
*/
void scsi_remove_target(struct device *dev)
{
struct Scsi_Host *shost = dev_to_shost(dev->parent);
struct scsi_target *starget;
unsigned long flags;
restart:
spin_lock_irqsave(shost->host_lock, flags);
list_for_each_entry(starget, &shost->__targets, siblings) {
scsi: fix soft lockup in scsi_remove_target() on module removal This softlockup is currently happening: [ 444.088002] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:1:29] [ 444.088002] Modules linked in: lpfc(-) qla2x00tgt(O) qla2xxx_scst(O) scst_vdisk(O) scsi_transport_fc libcrc32c scst(O) dlm configfs nfsd lockd grace nfs_acl auth_rpcgss sunrpc ed d snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device dm_mod iTCO_wdt snd_hda_codec_realtek snd_hda_codec_generic gpio_ich iTCO_vendor_support ppdev snd_hda_intel snd_hda_codec snd_hda _core snd_hwdep tg3 snd_pcm snd_timer libphy lpc_ich parport_pc ptp acpi_cpufreq snd pps_core fjes parport i2c_i801 ehci_pci tpm_tis tpm sr_mod cdrom soundcore floppy hwmon sg 8250_ fintek pcspkr i915 drm_kms_helper uhci_hcd ehci_hcd drm fb_sys_fops sysimgblt sysfillrect syscopyarea i2c_algo_bit usbcore button video usb_common fan ata_generic ata_piix libata th ermal [ 444.088002] CPU: 1 PID: 29 Comm: kworker/1:1 Tainted: G O 4.4.0-rc5-2.g1e923a3-default #1 [ 444.088002] Hardware name: FUJITSU SIEMENS ESPRIMO E /D2164-A1, BIOS 5.00 R1.10.2164.A1 05/08/2006 [ 444.088002] Workqueue: fc_wq_4 fc_rport_final_delete [scsi_transport_fc] [ 444.088002] task: f6266ec0 ti: f6268000 task.ti: f6268000 [ 444.088002] EIP: 0060:[<c07e7044>] EFLAGS: 00000286 CPU: 1 [ 444.088002] EIP is at _raw_spin_unlock_irqrestore+0x14/0x20 [ 444.088002] EAX: 00000286 EBX: f20d3800 ECX: 00000002 EDX: 00000286 [ 444.088002] ESI: f50ba800 EDI: f2146848 EBP: f6269ec8 ESP: f6269ec8 [ 444.088002] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [ 444.088002] CR0: 8005003b CR2: 08f96600 CR3: 363ae000 CR4: 000006d0 [ 444.088002] Stack: [ 444.088002] f6269eec c066b0f7 00000286 f2146848 f50ba808 f50ba800 f50ba800 f2146a90 [ 444.088002] f2146848 f6269f08 f8f0a4ed f3141000 f2146800 f2146a90 f619fa00 00000040 [ 444.088002] f6269f40 c026cb25 00000001 166c6392 00000061 f6757140 f6136340 00000004 [ 444.088002] Call Trace: [ 444.088002] [<c066b0f7>] scsi_remove_target+0x167/0x1c0 [ 444.088002] [<f8f0a4ed>] fc_rport_final_delete+0x9d/0x1e0 [scsi_transport_fc] [ 444.088002] [<c026cb25>] process_one_work+0x155/0x3e0 [ 444.088002] [<c026cde7>] worker_thread+0x37/0x490 [ 444.088002] [<c027214b>] kthread+0x9b/0xb0 [ 444.088002] [<c07e72c1>] ret_from_kernel_thread+0x21/0x40 What appears to be happening is that something has pinned the target so it can't go into STARGET_DEL via final release and the loop in scsi_remove_target spins endlessly until that happens. The fix for this soft lockup is to not keep looping over a device that we've called remove on but which hasn't gone into DEL state. This patch will retain a simplistic memory of the last target and not keep looping over it. Reported-by: Sebastian Herbszt <herbszt@gmx.de> Tested-by: Sebastian Herbszt <herbszt@gmx.de> Fixes: 40998193560dab6c3ce8d25f4fa58a23e252ef38 Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2016-02-11 00:03:26 +08:00
if (starget->state == STARGET_DEL ||
scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state The addition of the STARGET_REMOVE state had the side effect of introducing a race condition that can cause a crash. scsi_target_reap_ref_release() checks the starget->state to see if it still in STARGET_CREATED, and if so, skips calling transport_remove_device() and device_del(), because the starget->state is only set to STARGET_RUNNING after scsi_target_add() has called device_add() and transport_add_device(). However, if an rport loss occurs while a target is being scanned, it can happen that scsi_remove_target() will be called while the starget is still in the STARGET_CREATED state. In this case, the starget->state will be set to STARGET_REMOVE, and as a result, scsi_target_reap_ref_release() will take the wrong path. The end result is a panic: [ 1255.356653] Oops: 0000 [#1] SMP [ 1255.360154] Modules linked in: x86_pkg_temp_thermal kvm_intel kvm irqbypass crc32c_intel ghash_clmulni_i [ 1255.393234] CPU: 5 PID: 149 Comm: kworker/u96:4 Tainted: G W 4.11.0+ #8 [ 1255.401879] Hardware name: Dell Inc. PowerEdge R320/08VT7V, BIOS 2.0.22 11/19/2013 [ 1255.410327] Workqueue: scsi_wq_6 fc_scsi_scan_rport [scsi_transport_fc] [ 1255.417720] task: ffff88060ca8c8c0 task.stack: ffffc900048a8000 [ 1255.424331] RIP: 0010:kernfs_find_ns+0x13/0xc0 [ 1255.429287] RSP: 0018:ffffc900048abbf0 EFLAGS: 00010246 [ 1255.435123] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 1255.443083] RDX: 0000000000000000 RSI: ffffffff8188d659 RDI: 0000000000000000 [ 1255.451043] RBP: ffffc900048abc10 R08: 0000000000000000 R09: 0000012433fe0025 [ 1255.459005] R10: 0000000025e5a4b5 R11: 0000000025e5a4b5 R12: ffffffff8188d659 [ 1255.466972] R13: 0000000000000000 R14: ffff8805f55e5088 R15: 0000000000000000 [ 1255.474931] FS: 0000000000000000(0000) GS:ffff880616b40000(0000) knlGS:0000000000000000 [ 1255.483959] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1255.490370] CR2: 0000000000000068 CR3: 0000000001c09000 CR4: 00000000000406e0 [ 1255.498332] Call Trace: [ 1255.501058] kernfs_find_and_get_ns+0x31/0x60 [ 1255.505916] sysfs_unmerge_group+0x1d/0x60 [ 1255.510498] dpm_sysfs_remove+0x22/0x60 [ 1255.514783] device_del+0xf4/0x2e0 [ 1255.518577] ? device_remove_file+0x19/0x20 [ 1255.523241] attribute_container_class_device_del+0x1a/0x20 [ 1255.529457] transport_remove_classdev+0x4e/0x60 [ 1255.534607] ? transport_add_class_device+0x40/0x40 [ 1255.540046] attribute_container_device_trigger+0xb0/0xc0 [ 1255.546069] transport_remove_device+0x15/0x20 [ 1255.551025] scsi_target_reap_ref_release+0x25/0x40 [ 1255.556467] scsi_target_reap+0x2e/0x40 [ 1255.560744] __scsi_scan_target+0xaa/0x5b0 [ 1255.565312] scsi_scan_target+0xec/0x100 [ 1255.569689] fc_scsi_scan_rport+0xb1/0xc0 [scsi_transport_fc] [ 1255.576099] process_one_work+0x14b/0x390 [ 1255.580569] worker_thread+0x4b/0x390 [ 1255.584651] kthread+0x109/0x140 [ 1255.588251] ? rescuer_thread+0x330/0x330 [ 1255.592730] ? kthread_park+0x60/0x60 [ 1255.596815] ret_from_fork+0x29/0x40 [ 1255.600801] Code: 24 08 48 83 42 40 01 5b 41 5c 5d c3 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 [ 1255.621876] RIP: kernfs_find_ns+0x13/0xc0 RSP: ffffc900048abbf0 [ 1255.628479] CR2: 0000000000000068 [ 1255.632756] ---[ end trace 34a69ba0477d036f ]--- Fix this by adding another scsi_target state STARGET_CREATED_REMOVE to distinguish this case. Fixes: f05795d3d771 ("scsi: Add intermediate STARGET_REMOVE state to scsi_target_state") Reported-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Cc: <stable@vger.kernel.org> Reviewed-by: Laurence Oberman <loberman@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-28 02:55:58 +08:00
starget->state == STARGET_REMOVE ||
starget->state == STARGET_CREATED_REMOVE)
continue;
if (starget->dev.parent == dev || &starget->dev == dev) {
kref_get(&starget->reap_ref);
scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state The addition of the STARGET_REMOVE state had the side effect of introducing a race condition that can cause a crash. scsi_target_reap_ref_release() checks the starget->state to see if it still in STARGET_CREATED, and if so, skips calling transport_remove_device() and device_del(), because the starget->state is only set to STARGET_RUNNING after scsi_target_add() has called device_add() and transport_add_device(). However, if an rport loss occurs while a target is being scanned, it can happen that scsi_remove_target() will be called while the starget is still in the STARGET_CREATED state. In this case, the starget->state will be set to STARGET_REMOVE, and as a result, scsi_target_reap_ref_release() will take the wrong path. The end result is a panic: [ 1255.356653] Oops: 0000 [#1] SMP [ 1255.360154] Modules linked in: x86_pkg_temp_thermal kvm_intel kvm irqbypass crc32c_intel ghash_clmulni_i [ 1255.393234] CPU: 5 PID: 149 Comm: kworker/u96:4 Tainted: G W 4.11.0+ #8 [ 1255.401879] Hardware name: Dell Inc. PowerEdge R320/08VT7V, BIOS 2.0.22 11/19/2013 [ 1255.410327] Workqueue: scsi_wq_6 fc_scsi_scan_rport [scsi_transport_fc] [ 1255.417720] task: ffff88060ca8c8c0 task.stack: ffffc900048a8000 [ 1255.424331] RIP: 0010:kernfs_find_ns+0x13/0xc0 [ 1255.429287] RSP: 0018:ffffc900048abbf0 EFLAGS: 00010246 [ 1255.435123] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 1255.443083] RDX: 0000000000000000 RSI: ffffffff8188d659 RDI: 0000000000000000 [ 1255.451043] RBP: ffffc900048abc10 R08: 0000000000000000 R09: 0000012433fe0025 [ 1255.459005] R10: 0000000025e5a4b5 R11: 0000000025e5a4b5 R12: ffffffff8188d659 [ 1255.466972] R13: 0000000000000000 R14: ffff8805f55e5088 R15: 0000000000000000 [ 1255.474931] FS: 0000000000000000(0000) GS:ffff880616b40000(0000) knlGS:0000000000000000 [ 1255.483959] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1255.490370] CR2: 0000000000000068 CR3: 0000000001c09000 CR4: 00000000000406e0 [ 1255.498332] Call Trace: [ 1255.501058] kernfs_find_and_get_ns+0x31/0x60 [ 1255.505916] sysfs_unmerge_group+0x1d/0x60 [ 1255.510498] dpm_sysfs_remove+0x22/0x60 [ 1255.514783] device_del+0xf4/0x2e0 [ 1255.518577] ? device_remove_file+0x19/0x20 [ 1255.523241] attribute_container_class_device_del+0x1a/0x20 [ 1255.529457] transport_remove_classdev+0x4e/0x60 [ 1255.534607] ? transport_add_class_device+0x40/0x40 [ 1255.540046] attribute_container_device_trigger+0xb0/0xc0 [ 1255.546069] transport_remove_device+0x15/0x20 [ 1255.551025] scsi_target_reap_ref_release+0x25/0x40 [ 1255.556467] scsi_target_reap+0x2e/0x40 [ 1255.560744] __scsi_scan_target+0xaa/0x5b0 [ 1255.565312] scsi_scan_target+0xec/0x100 [ 1255.569689] fc_scsi_scan_rport+0xb1/0xc0 [scsi_transport_fc] [ 1255.576099] process_one_work+0x14b/0x390 [ 1255.580569] worker_thread+0x4b/0x390 [ 1255.584651] kthread+0x109/0x140 [ 1255.588251] ? rescuer_thread+0x330/0x330 [ 1255.592730] ? kthread_park+0x60/0x60 [ 1255.596815] ret_from_fork+0x29/0x40 [ 1255.600801] Code: 24 08 48 83 42 40 01 5b 41 5c 5d c3 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 [ 1255.621876] RIP: kernfs_find_ns+0x13/0xc0 RSP: ffffc900048abbf0 [ 1255.628479] CR2: 0000000000000068 [ 1255.632756] ---[ end trace 34a69ba0477d036f ]--- Fix this by adding another scsi_target state STARGET_CREATED_REMOVE to distinguish this case. Fixes: f05795d3d771 ("scsi: Add intermediate STARGET_REMOVE state to scsi_target_state") Reported-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Cc: <stable@vger.kernel.org> Reviewed-by: Laurence Oberman <loberman@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-28 02:55:58 +08:00
if (starget->state == STARGET_CREATED)
starget->state = STARGET_CREATED_REMOVE;
else
starget->state = STARGET_REMOVE;
spin_unlock_irqrestore(shost->host_lock, flags);
__scsi_remove_target(starget);
scsi_target_reap(starget);
goto restart;
}
}
spin_unlock_irqrestore(shost->host_lock, flags);
}
EXPORT_SYMBOL(scsi_remove_target);
int scsi_register_driver(struct device_driver *drv)
{
drv->bus = &scsi_bus_type;
return driver_register(drv);
}
EXPORT_SYMBOL(scsi_register_driver);
int scsi_register_interface(struct class_interface *intf)
{
intf->class = &sdev_class;
return class_interface_register(intf);
}
EXPORT_SYMBOL(scsi_register_interface);
/**
* scsi_sysfs_add_host - add scsi host to subsystem
* @shost: scsi host struct to add to subsystem
**/
int scsi_sysfs_add_host(struct Scsi_Host *shost)
{
int error, i;
/* add host specific attributes */
if (shost->hostt->shost_attrs) {
for (i = 0; shost->hostt->shost_attrs[i]; i++) {
error = device_create_file(&shost->shost_dev,
shost->hostt->shost_attrs[i]);
if (error)
return error;
}
}
transport_register_device(&shost->shost_gendev);
transport_configure_device(&shost->shost_gendev);
return 0;
}
static struct device_type scsi_dev_type = {
.name = "scsi_device",
.release = scsi_device_dev_release,
.groups = scsi_sdev_attr_groups,
};
void scsi_sysfs_device_initialize(struct scsi_device *sdev)
{
unsigned long flags;
struct Scsi_Host *shost = sdev->host;
struct scsi_target *starget = sdev->sdev_target;
device_initialize(&sdev->sdev_gendev);
sdev->sdev_gendev.bus = &scsi_bus_type;
sdev->sdev_gendev.type = &scsi_dev_type;
dev_set_name(&sdev->sdev_gendev, "%d:%d:%d:%llu",
sdev->host->host_no, sdev->channel, sdev->id, sdev->lun);
device_initialize(&sdev->sdev_dev);
sdev->sdev_dev.parent = get_device(&sdev->sdev_gendev);
sdev->sdev_dev.class = &sdev_class;
dev_set_name(&sdev->sdev_dev, "%d:%d:%d:%llu",
sdev->host->host_no, sdev->channel, sdev->id, sdev->lun);
scsi: don't store LUN bits in CDB[1] for USB mass-storage devices The SCSI specification requires that the second Command Data Byte should contain the LUN value in its high-order bits if the recipient device reports SCSI level 2 or below. Nevertheless, some USB mass-storage devices use those bits for other purposes in vendor-specific commands. Currently Linux has no way to send such commands, because the SCSI stack always overwrites the LUN bits. Testing shows that Windows 7 and XP do not store the LUN bits in the CDB when sending commands to a USB device. This doesn't matter if the device uses the Bulk-Only or UAS transports (which virtually all modern USB mass-storage devices do), as these have a separate mechanism for sending the LUN value. Therefore this patch introduces a flag in the Scsi_Host structure to inform the SCSI midlayer that a transport does not require the LUN bits to be stored in the CDB, and it makes usb-storage set this flag for all devices using the Bulk-Only transport. (UAS is handled by a separate driver, but it doesn't really matter because no SCSI-2 or lower device is at all likely to use UAS.) The patch also cleans up the code responsible for storing the LUN value by adding a bitflag to the scsi_device structure. The test for whether to stick the LUN value in the CDB can be made when the device is probed, and stored for future use rather than being made over and over in the fast path. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Tiziano Bacocco <tiziano.bacocco@gmail.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-02 23:35:50 +08:00
/*
* Get a default scsi_level from the target (derived from sibling
* devices). This is the best we can do for guessing how to set
* sdev->lun_in_cdb for the initial INQUIRY command. For LUN 0 the
* setting doesn't matter, because all the bits are zero anyway.
* But it does matter for higher LUNs.
*/
sdev->scsi_level = starget->scsi_level;
scsi: don't store LUN bits in CDB[1] for USB mass-storage devices The SCSI specification requires that the second Command Data Byte should contain the LUN value in its high-order bits if the recipient device reports SCSI level 2 or below. Nevertheless, some USB mass-storage devices use those bits for other purposes in vendor-specific commands. Currently Linux has no way to send such commands, because the SCSI stack always overwrites the LUN bits. Testing shows that Windows 7 and XP do not store the LUN bits in the CDB when sending commands to a USB device. This doesn't matter if the device uses the Bulk-Only or UAS transports (which virtually all modern USB mass-storage devices do), as these have a separate mechanism for sending the LUN value. Therefore this patch introduces a flag in the Scsi_Host structure to inform the SCSI midlayer that a transport does not require the LUN bits to be stored in the CDB, and it makes usb-storage set this flag for all devices using the Bulk-Only transport. (UAS is handled by a separate driver, but it doesn't really matter because no SCSI-2 or lower device is at all likely to use UAS.) The patch also cleans up the code responsible for storing the LUN value by adding a bitflag to the scsi_device structure. The test for whether to stick the LUN value in the CDB can be made when the device is probed, and stored for future use rather than being made over and over in the fast path. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Tiziano Bacocco <tiziano.bacocco@gmail.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-02 23:35:50 +08:00
if (sdev->scsi_level <= SCSI_2 &&
sdev->scsi_level != SCSI_UNKNOWN &&
!shost->no_scsi2_lun_in_cdb)
sdev->lun_in_cdb = 1;
transport_setup_device(&sdev->sdev_gendev);
spin_lock_irqsave(shost->host_lock, flags);
list_add_tail(&sdev->same_target_siblings, &starget->devices);
list_add_tail(&sdev->siblings, &shost->__devices);
spin_unlock_irqrestore(shost->host_lock, flags);
/*
* device can now only be removed via __scsi_remove_device() so hold
* the target. Target will be held in CREATED state until something
* beneath it becomes visible (in which case it moves to RUNNING)
*/
kref_get(&starget->reap_ref);
}
int scsi_is_sdev_device(const struct device *dev)
{
return dev->type == &scsi_dev_type;
}
EXPORT_SYMBOL(scsi_is_sdev_device);
/* A blank transport template that is used in drivers that don't
* yet implement Transport Attributes */
struct scsi_transport_template blank_transport_template = { { { {NULL, }, }, }, };