On larger installations it is useful to disable automatic LUN scanning,
and only add the required LUNs via udev rules. This can speed up bootup
dramatically.
This patch introduces a new scan module parameter value 'manual', which
works like 'none', but can be overridden by setting the 'rescan' value
from scsi_scan_target to 'SCSI_SCAN_MANUAL'. And it updates all
relevant callers to set the 'rescan' value to 'SCSI_SCAN_MANUAL' if
invoked via the 'scan' option in sysfs.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use scsi_host_{get,put}() instead of open-coding these functions.
Compile-tested only.
[mkp: Dropped CC:stable and fixed James Smart's email address]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
These speeds are to support the next generation of FCoE port speeds.
Signed-off-by: Dick Kennedy <Dick.Kennedy@Emulex.Com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Now that the ibmvstgt driver as the only user of scsi_tgt is gone, the
scsi_tgt kernel module, the CONFIG_SCSI_TGT, CONFIG_SCSI_SRP_TGT_ATTRS and
CONFIG_SCSI_FC_TGT_ATTRS kbuild variable, the scsi_host_template
transfer_response method are no longer needed.
[hch: minor updates to the current tree, changelog update]
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
The SCSI standard defines 64-bit values for LUNs, and large arrays
employing large or hierarchical LUN numbers become more and more
common.
So update the linux SCSI stack to use 64-bit LUN numbers.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
For the workqueue creation interfaces that do not expect format strings,
make sure they cannot accidently be parsed that way. Additionally, clean
up calls made with a single parameter that would be handled as a format
string. Many callers are passing potentially dynamic string content, so
use "%s" in those cases to avoid any potential accidents.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull core block IO bits from Jens Axboe:
"The most complicated part if this is the request allocation rework by
Tejun, which has been queued up for a long time and has been in
for-next ditto as well.
There are a few commits from yesterday and today, mostly trivial and
obvious fixes. So I'm pretty confident that it is sound. It's also
smaller than usual."
* 'for-3.6/core' of git://git.kernel.dk/linux-block:
block: remove dead func declaration
block: add partition resize function to blkpg ioctl
block: uninitialized ioc->nr_tasks triggers WARN_ON
block: do not artificially constrain max_sectors for stacking drivers
blkcg: implement per-blkg request allocation
block: prepare for multiple request_lists
block: add q->nr_rqs[] and move q->rq.elvpriv to q->nr_rqs_elvpriv
blkcg: inline bio_blkcg() and friends
block: allocate io_context upfront
block: refactor get_request[_wait]()
block: drop custom queue draining used by scsi_transport_{iscsi|fc}
mempool: add @gfp_mask to mempool_create_node()
blkcg: make root blkcg allocation use %GFP_KERNEL
blkcg: __blkg_lookup_create() doesn't need radix preload
This has scsi_internal_device_unblock/scsi_target_unblock take
the new state to set the devices as an argument instead of
always setting to running. The patch also converts users of these
functions.
This allows the FC and iSCSI class to transition devices from blocked
to transport-offline, so that when fast_io_fail/replacement_timeout
has fired we do not set the devices back to running. Instead, we
set them to SDEV_TRANSPORT_OFFLINE.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The libfc provides more flexibility and with that
we can monitor some more FC specific stats for
FC exches or FCP error cases, this patch add
such new FC stats.
The patch adds *only* FC specific new stats to
existing fc_host attribute container.
Added stats names are self explanatory as
existing FC stats already has, however anyway
still added commentary along their definition
to describe them.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Acked-by : Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
iscsi_remove_host() uses bsg_remove_queue() which implements custom
queue draining. fc_bsg_remove() open-codes mostly identical logic.
The draining logic isn't correct in that blk_stop_queue() doesn't
prevent new requests from being queued - it just stops processing, so
nothing prevents new requests to be queued after the logic determines
that the queue is drained.
blk_cleanup_queue() now implements proper queue draining and these
custom draining logics aren't necessary. Drop them and use
bsg_unregister_queue() + blk_cleanup_queue() instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: James Smart <james.smart@emulex.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When a rport is added back or the role is changed the fc class
will queue a scan and then call scsi_target_unblock. The problem
with this is if the devices are in the SDEV_OFFLINE state and
the scan is run before the scsi_target_unblock, then the scan
will see LUN0 as offline and the scan will fail. This patch moves
the unblock call to before the scan, so we know the device state
will be set correctly when the scan is run.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This adds FC-GS Fabric Device Management Interface
(FDMI) related attributes to fc_host_attr structure.
This is in preparation for allowing FDMI attributes
to be registered via libfc.
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Acked-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch fixes a bug where devloss is not called on fc_host teardown.
The issue is seen if the LLDD uses rport_rolechg to add the target role
to an rport.
When an rport goes away, the LLDD will call fc_remote_port_delete, which
will start the devloss timer. If the timer expires, the transport will
call the devloss callback and set the FC_RPORT_DEVLOSS_CALLBK_DONE flag.
However, the rport structure is not deleted, it is retained to store the
SCSI id mappings for the rport in case it comes back. In the scenario
where it does come back, and the driver calls fc_remote_port_add, but does
not indicate the "target" role for the rport - the create will clear the
structure, but forgets to clear FC_RPORT_DEVLOSS_CALLBK_DONE flag (which
is cleared if it's added with the target role). The secondary call, of
fc_remote_port_rolechg to add the target role also does not clear the flag.
Thus, the next time the rport goes away, the resulting devloss timer
expiration will not call the driver callback as the flag is still set.
This patch adds the FC_RPORT_DEVLOSS_CALLBK_DONE flags to the list of
those that are cleared upon reuse of the rport structure.
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (110 commits)
[SCSI] qla2xxx: Refactor call to qla2xxx_read_sfp for thermal temperature.
[SCSI] qla2xxx: Unify the read/write sfp mailbox command routines.
[SCSI] qla2xxx: Clear complete initialization control block.
[SCSI] qla2xxx: Allow an override of the registered maximum LUN.
[SCSI] qla2xxx: Add host number in reset and quiescent message logs.
[SCSI] qla2xxx: Correctly read sfp single byte mailbox register.
[SCSI] qla2xxx: Add qla82xx_rom_unlock() function.
[SCSI] qla2xxx: Log if qla82xx firmware fails to load from flash.
[SCSI] qla2xxx: Use passed in host to initialize local scsi_qla_host in queuecommand function
[SCSI] qla2xxx: Correct buffer start in edc sysfs debug print.
[SCSI] qla2xxx: Update firmware version after flash update for ISP82xx.
[SCSI] qla2xxx: Fix hang during driver unload when vport is active.
[SCSI] qla2xxx: Properly set the dsd_list_len for dsd_chaining in cmd type 6.
[SCSI] qla2xxx: Fix virtual port failing to login after chip reset.
[SCSI] qla2xxx: Fix vport delete hang when logins are outstanding.
[SCSI] hpsa: Change memset using sizeof(ptr) to sizeof(*ptr)
[SCSI] ipr: Rate limit DMA mapping errors
[SCSI] hpsa: add P2000 to list of shared SAS devices
[SCSI] hpsa: do not attempt PCI power management reset method if we know it won't work.
[SCSI] hpsa: remove superfluous sleeps around reset code
...
Creating and destroying fcoe interface in a tight loop leads to a system
deadlock with the following call traces:
Call Trace:
[<ffffffff814f4b3d>] schedule_timeout+0x1fd/0x2c0
[<ffffffff814f469f>] ? wait_for_common+0x4f/0x190
[<ffffffff814f469f>] ? wait_for_common+0x4f/0x190
[<ffffffff814f4737>] wait_for_common+0xe7/0x190
[<ffffffff81042fa0>] ? default_wake_function+0x0/0x20
[<ffffffff81082c2d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff814f48bd>] wait_for_completion+0x1d/0x20
[<ffffffff81066d90>] flush_workqueue+0x290/0x5f0
[<ffffffff81066b00>] ? flush_workqueue+0x0/0x5f0
[<ffffffff81067148>] destroy_workqueue+0x38/0x340
[<ffffffffa0260289>] fc_remove_host+0x1b9/0x1f0 [scsi_transport_fc]
[<ffffffffa02ed195>] bnx2fc_if_destroy+0xc5/0x1f0 [bnx2fc]
[<ffffffffa02ed33a>] bnx2fc_destroy+0x7a/0x100 [bnx2fc]
[<ffffffffa02c789b>] fcoe_transport_destroy+0x9b/0x1b0 [libfcoe]
[<ffffffff81069ec2>] param_attr_store+0x52/0x80
[<ffffffff81069976>] module_attr_store+0x26/0x30
[<ffffffff8119e726>] sysfs_write_file+0xe6/0x170
[<ffffffff81134710>] vfs_write+0xd0/0x1a0
[<ffffffff811348e4>] sys_write+0x54/0xa0
[<ffffffff81002e02>] system_call_fastpath+0x16/0x1b
Call Trace:
[<ffffffff81074865>] async_synchronize_cookie_domain+0x75/0x120
[<ffffffff8106caa0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff81074925>] async_synchronize_cookie+0x15/0x20
[<ffffffff8107494c>] async_synchronize_full+0x1c/0x40
[<ffffffffa0057466>] sd_remove+0x36/0xc0 [sd_mod]
[<ffffffff81358a75>] __device_release_driver+0x75/0xe0
[<ffffffff81358bef>] device_release_driver+0x2f/0x50
[<ffffffff81357aee>] bus_remove_device+0xbe/0x120
[<ffffffff813553ef>] device_del+0x12f/0x1e0
[<ffffffff8137454d>] __scsi_remove_device+0xbd/0xc0
[<ffffffff81374585>] scsi_remove_device+0x35/0x50
[<ffffffff813746a7>] __scsi_remove_target+0xe7/0x110
[<ffffffff81374730>] ? __remove_child+0x0/0x30
[<ffffffff81374753>] __remove_child+0x23/0x30
[<ffffffff81354a2c>] device_for_each_child+0x4c/0x80
[<ffffffff81374703>] scsi_remove_target+0x33/0x60
[<ffffffffa02622c6>] fc_starget_delete+0x26/0x30 [scsi_transport_fc]
[<ffffffffa026271a>] fc_rport_final_delete+0xaa/0x200 [scsi_transport_fc]
[<ffffffff8106585a>] process_one_work+0x1aa/0x540
[<ffffffff810657eb>] ? process_one_work+0x13b/0x540
[<ffffffffa0262670>] ? fc_rport_final_delete+0x0/0x200 [scsi_transport_fc]
[<ffffffff81067ac9>] worker_thread+0x179/0x410
[<ffffffff81067950>] ? worker_thread+0x0/0x410
[<ffffffff8106c546>] kthread+0xb6/0xc0
[<ffffffff8103879b>] ? finish_task_switch+0x4b/0xe0
[<ffffffff81003ca4>] kernel_thread_helper+0x4/0x10
[<ffffffff814f7994>] ? restore_args+0x0/0x30
[<ffffffff8106c490>] ? kthread+0x0/0xc0
[<ffffffff81003ca0>] ? kernel_thread_helper+0x0/0x10
fc_remove_host() waits for flushing the workqueue, but it is stuck at flushing
the first work. The first work doesnt complete, because it is waiting for async
layer to complete the IOs. The async layer cannot complete the IO as the
terminate_rport_io for the second work was not called, which will be called
only when the first work completes. Hence the deadlock. To resolve this
deadlock, the workqueue allocation has been modified from
create_singlethread_workqueue() to alloc_workqueue().
In addition, fc_terminate_rport_io() should be called before the
scsi_flush_work() to avoid the similar deadlock as above.
scsi fc alloc queue. move terminate rport io before flush
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
We are currently using this flag to check whether it's safe
to call into ->request_fn(). If it is set, we punt to kblockd.
But we get a lot of false positives and excessive punts to
kblockd, which hurts performance.
The only real abuser of this infrastructure is SCSI. So export
the async queue run and convert SCSI over to use that. There's
room for improvement in that SCSI need not always use the async
call, but this fixes our performance issue and they can fix that
up in due time.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Instead of overloading __blk_run_queue to force an offload to kblockd
add a new blk_run_queue_async helper to do it explicitly. I've kept
the blk_queue_stopped check for now, but I suspect it's not needed
as the check we do when the workqueue items runs should be enough.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So lets kill off the old plugging along with aops->sync_page().
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
__blk_run_queue() automatically either calls q->request_fn() directly
or schedules kblockd depending on whether the function is recursed.
blk-flush implementation needs to be able to explicitly choose
kblockd. Add @force_kblockd.
All the current users are converted to specify %false for the
parameter and this patch doesn't introduce any behavior change.
stable: This is prerequisite for fixing ide oops caused by the new
blk-flush implementation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
This adds a fc host dev loss sysfs file. Instead of
calling into the driver using the get_host_def_dev_loss_tmo
callback, we allow drivers to init the dev loss like is done
for other fc host params, and then the fc class will handle
updating the value if the user writes to the new sysfs file.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When an rport is "blocked" and a bsg request is received, the bsg request gets
placed on the queue but the queue stalls. If the fc object is then deleted - the
bsg queue never restarts and keeps the reference on the object, and stops the
overall teardown.
This patch restarts the bsg queue on teardown and drains any pending requests,
allowing the teardown to succeed.
Signed-off-by: Carl Lajeunesse <carl.lajeunesse@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch adds a fc_host setting to store the
default dev_loss_tmo. It is used if the driver
has a callack to get the value from the LLD. If
the callback is not set, then we use the fc class
module default value.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
If the scsi eh is running and then a FC LLD calls
fc_remote_port_delete, the SCSI commands sent from the eh will fail.
To prevent this, a FC LLD can call fc_block_scsi_eh from the eh
callback, blocking the eh thread until the dev_loss_tmo fires or the
remote port is available again.
If (e.g. for a multipathing setup) the dev_loss_tmo is set to a very
large value, thus preventing the scsi device removal , the scsi eh can
block for a long time. For multipathing, the fast_io_fail_tmo is then
set to a low value to detect path problems sooner.
This patch introduces a new return code FAST_IO_FAIL. The function
fc_block_scsi_eh now returns FAST_IO_FAIL when the fast_io_fail_tmo
fires. This indicates that the LLD terminated all pending I/O requests
and there are no more pending SCSI commands for the scsi eh to wait
for. This return code can be passed back to the scsi eh to stop the
escalation and finish the recovery process for this device.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The rport structure defines dev_loss_tmo as u32, which is
later multiplied with HZ to get the actual timeout value.
This might overflow for large dev_loss_tmo values. So we
should be better using u64 as intermediate variables here
to protect against overflow.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] qla1280: retain firmware for error recovery
[SCSI] attirbute_container: Initialize sysfs attributes with sysfs_attr_init
[SCSI] advansys: fix regression with request_firmware change
[SCSI] qla2xxx: Updated version number to 8.03.02-k2.
[SCSI] qla2xxx: Prevent sending mbx commands from sysfs during isp reset.
[SCSI] qla2xxx: Disable MSI on qla24xx chips other than QLA2432.
[SCSI] qla2xxx: Check to make sure multique and CPU affinity support is not enabled at the same time.
[SCSI] qla2xxx: Correct vp_idx checking during PORT_UPDATE processing.
[SCSI] qla2xxx: Honour "Extended BB credits" bit for CNAs.
[SCSI] scsi_transport_fc: Make sure commands are completed when rport is offline
[SCSI] libiscsi: Fix recovery slowdown regression
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
blk_end_request doesn't complete a bidi request
successfully
The unfinished request eventually triggers a panic in
timeout handling routine fc_bsg_job_timeout as
req->special is NULL
Use blk_end_request_all to end the request unconditionally
Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Acked-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The issue occur while deleting 60 virtual ports through the sys
interface /sys/class/fc_vports/vport-X/vport_delete. It happen while in
a mistake each request sent twice for the same vport. This interface is
asynchronous, entering the delete request into a work queue, allowing
more than one request to enter to the delete work queue. The result is a
NULL pointer. The first request already delete the vport, while the
second request got a pointer to the vport before the device destroyed.
Re-create vport later cause system freeze.
Solution: Check vport flags before entering the request to the work queue.
[jejb: fixed int<->long problem on spinlock flags variable]
Signed-off-by: Gal Rosen <galr@storwize.com>
Acked-by: James Smart <james.smart@emulex.com>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Currently dev_loss_tmo is capped by SCSI_DEVICE_BLOCK_MAX_TIMEOUT.
This causes problem with multipathing when the 'no_path_retry' setting
exceeds the dev_loss_tmo setting, as then the system might run into
a deadlock when all paths have been removed temporarily for longer
than dev_loss_tmo.
The principal reasons for the capping has been that we should
not allow a remote port to remain in status 'blocked' indefinitely,
so the capping is there to ensure that the port status is being reset
eventually.
However, the fast_io_fail_tmo will also move the remote port out of
the 'blocked' state, so for any HBA driver implementing both the
capping should really be on the fast_io_fail_tmo, and not on the
dev_loss_tmo.
This patch implements just that, ie the fast_io_fail_tmo is capped
to SCSI_DEVICE_BLOCK_TIMEOUT and the capping is removed from
dev_loss_tmo when fast_io_fail_tmo is set.
This allows us to synchronize the dev_loss_tmo setting to the
'no_path_retry' setting from multipathing thus avoiding the deadlock.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The hardware used with zfcp cannot abort a currently pending CT or ELS
request. Therefore we need the option to postpone the timeout
triggered request abort within the fc layer, since there is nothing
zfcp can do to stop the request at this point.
Cc: James Smart <James.Smart@emulex.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
If transport_class_register fails we should unregister any
registered classes, or we will leak memory or other
resources.
I did a quick modprobe of scsi_transport_fc to test the
patch.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (222 commits)
[SCSI] zfcp: Remove flag ZFCP_STATUS_FSFREQ_TMFUNCNOTSUPP
[SCSI] zfcp: Activate fc4s attributes for zfcp in FC transport class
[SCSI] zfcp: Block scsi_eh thread for rport state BLOCKED
[SCSI] zfcp: Update FSF error reporting
[SCSI] zfcp: Improve ELS ADISC handling
[SCSI] zfcp: Simplify handling of ct and els requests
[SCSI] zfcp: Remove ZFCP_DID_MASK
[SCSI] zfcp: Move WKA port to zfcp FC code
[SCSI] zfcp: Use common code definitions for FC CT structs
[SCSI] zfcp: Use common code definitions for FC ELS structs
[SCSI] zfcp: Update FCP protocol related code
[SCSI] zfcp: Dont fail SCSI commands when transitioning to blocked fc_rport
[SCSI] zfcp: Assign scheduled work to driver queue
[SCSI] zfcp: Remove STATUS_COMMON_REMOVE flag as it is not required anymore
[SCSI] zfcp: Implement module unloading
[SCSI] zfcp: Merge trace code for fsf requests in one function
[SCSI] zfcp: Access ports and units with container_of in sysfs code
[SCSI] zfcp: Remove suspend callback
[SCSI] zfcp: Remove global config_mutex
[SCSI] zfcp: Replace local reference counting with common kref
...
If the port state is blocked and the fast io fail tmo has
fired then this patch will fail bsg requests immediately.
This is needed if userspace is sending IOs to test the transport
like with fcping, so it will not have to wait for the dev loss tmo.
With this patch he bsg req fast io fail code behaves like the normal
and sg io/passthrough fast io fail.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-By: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Move the duplicated code from FC LLDs to SCSI FC transport class.
Acked-by: James Smart <james.smart@emulex.com>
Acked-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Acked-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
I was doing some large lun count testing with 2.6.31 and hit
a BUG_ON() in fc_timeout_deleted_rport(), and it seems like it
should have been just a matter of time before someone did.
It seems invalid to set port_state under lock, then expect it to
remain set after releasing the lock. Another thread called
fc_remote_port_add() when the lock was released, changing the
port_state.
This patch removes the BUG_ON and moves the test of the
port_state to inside the host_lock. It's been running for
several weeks now with no ill effect.
Signed-off-by: Michael Reed <mdr@sgi.com>
Acked-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add missing kernel-doc notation in scsi_transport_fc.c:
Warning(drivers/scsi/scsi_transport_fc.c:3593): No description found for parameter 'q'
Warning(drivers/scsi/scsi_transport_fc.c:3700): No description found for parameter 'q'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Change function parameter name in kernel-doc to match the function's
actual parameter name, to fix 2 kernel-doc warnings.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
BUS_ID_SIZE is being removed from the kernel.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The return value from BSG timout function should be based on the state of the
BSG job. This helps block layer to take selective actions to clean up BSG job.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Registered the softirq_done function, since this is requried iby an request
using block level request timeout functionality. This function will be called
by the block layer as part of time out clean process to release the BSG
request.
Moved some of the BSG request completion activities to softirq_done routine to
take care of both normal and timout completions.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Attached is the ELS/CT pass-thru patch for the FC Transport. The patch
creates a generic framework that lays on top of bsg and the SGIO v4 ioctl
in order to pass transaction requests to LLDD's.
The interface supports the following operations:
On an fc_host basis:
Request login to the specified N_Port_ID, creating an fc_rport.
Request logout of the specified N_Port_ID, deleting an fc_rport
Send ELS request to specified N_Port_ID w/o requiring a login, and
wait for ELS response.
Send CT request to specified N_Port_ID and wait for CT response.
Login is required, but LLDD is allowed to manage login and decide
whether it stays in place after the request is satisfied.
Vendor-Unique request. Allows a LLDD-specific request to be passed
to the LLDD, and the passing of a response back to the application.
On an fc_rport basis:
Send ELS request to nport and wait for ELS response.
Send CT request to nport and wait for CT response.
The patch also exports several headers from include/scsi such that
they can be available to user-space applications:
include/scsi/scsi.h
include/scsi/scsi_netlink.h
include/scsi/scsi_netlink_fc.h
include/scsi/scsi_bsg_fc.h
For further information, refer to the last RFC:
http://marc.info/?l=linux-scsi&m=123436574018579&w=2
Note: Documentation is still spotty and will be added later.
[bharrosh@panasas.com: update for new block API]
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Currently, netlink_broadcast() reports errors to the caller if no
messages at all were delivered:
1) If, at least, one message has been delivered correctly, returns 0.
2) Otherwise, if no messages at all were delivered due to skb_clone()
failure, return -ENOBUFS.
3) Otherwise, if there are no listeners, return -ESRCH.
With this patch, the caller knows if the delivery of any of the
messages to the listeners have failed:
1) If it fails to deliver any message (for whatever reason), return
-ENOBUFS.
2) Otherwise, if all messages were delivered OK, returns 0.
3) Otherwise, if no listeners, return -ESRCH.
In the current ctnetlink code and in Netfilter in general, we can add
reliable logging and connection tracking event delivery by dropping the
packets whose events were not successfully delivered over Netlink. Of
course, this option would be settable via /proc as this approach reduces
performance (in terms of filtered connections per seconds by a stateful
firewall) but providing reliable logging and event delivery (for
conntrackd) in return.
This patch also changes some clients of netlink_broadcast() that
may report ENOBUFS errors via printk. This error handling is not
of any help. Instead, the userspace daemons that are listening to
those netlink messages should resync themselves with the kernel-side
if they hit ENOBUFS.
BTW, netlink_broadcast() clients include those that call
cn_netlink_send(), nlmsg_multicast() and genlmsg_multicast() since they
internally call netlink_broadcast() and return its error value.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we reworked the transport for the rport lifetimes, in cases where the
rport was reused as a container for tgt id bindings, we inadvertantly
removed the callback to the driver indicating that dev_loss_tmo had fired.
This patch restores that functionality.
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
[jejb: limit ioctl to returning 20 characters to avoid overrun
on long device names and add a few more conversions]
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Pre-emptively terminate i/o on the rport if dev_loss_tmo has fired.
The desire is to terminate everything, so that the i/o is cleaned up
prior to the sdev's being unblocked, thus any outstanding timeouts/aborts
are avoided.
Also, we do this early enough such that the rport's port_id field is
still valid. FCOE libFC code needs this info to find the i/o's to
terminate.
Signed-off-by: James Smart <james.smart@emulex.com>
[michaelc@cs.wisc.edu: remove extra scsi_target_unblock call]
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When we block a rport and the driver implements the terminate
callback we will fail IO that was running quickly. However
IO that was in the scsi_device/block queue sits there until
the dev_loss_tmo fires, and this can make it look like IO is
lost because new IO will get executed but that IO stuck in
the blocked queue sits there for some time longer.
With this patch when the fast io fail tmo fires, we will
fail the blocked IO and any new IO. This patch also allows
all drivers to partially support the fast io fail tmo. If the
terminate io callback is not implemented, we will still fail blocked
IO and any new IO, so multipath can handle that.
This patch also allows the fc and iscsi classes to implement the
same behavior. The timers are just unfornately named differently.
This patch also fixes the problem where drivers were unblocking
the target in their terminate callback, which was needed for
rport removal, but for fast io fail timeout it would cause
IO to bounce arround the scsi/block layer and the LLD queuecommand.
And it for drivers that could have IO stuck but did not have
a terminate callback the unblock calls in the class will fix
them.
v2.
- fix up bit setting style to meet JamesS's pref.
- Broke out new host byte error changes to make it easier to read.
- added JamesS's ack from list.
v1
- initial patch
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (37 commits)
[SCSI] zfcp: fix double dbf id usage
[SCSI] zfcp: wait on SCSI work to be finished before proceeding with init dev
[SCSI] zfcp: fix erp list usage without using locks
[SCSI] zfcp: prevent fc_remote_port_delete calls for unregistered rport
[SCSI] zfcp: fix deadlock caused by shared work queue tasks
[SCSI] zfcp: put threshold data in hba trace
[SCSI] zfcp: Simplify zfcp data structures
[SCSI] zfcp: Simplify get_adapter_by_busid
[SCSI] zfcp: remove all typedefs and replace them with standards
[SCSI] zfcp: attach and release SAN nameserver port on demand
[SCSI] zfcp: remove unused references, declarations and flags
[SCSI] zfcp: Update message with input from review
[SCSI] zfcp: add queue_full sysfs attribute
[SCSI] scsi_dh: suppress comparison warning
[SCSI] scsi_dh: add Dell product information into rdac device handler
[SCSI] qla2xxx: remove the unused SCSI_QLOGIC_FC_FIRMWARE option
[SCSI] qla2xxx: fix printk format warnings
[SCSI] qla2xxx: Update version number to 8.02.01-k8.
[SCSI] qla2xxx: Ignore payload reserved-bits during RSCN processing.
[SCSI] qla2xxx: Additional residual-count corrections during UNDERRUN handling.
...
Right now SCSI and others do their own command timeout handling.
Move those bits to the block layer.
Instead of having a timer per command, we try to be a bit more clever
and simply have one per-queue. This avoids the overhead of having to
tear down and setup a timer for each command, so it will result in a lot
less timer fiddling.
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
There's already a fc_vport_termintate() call exported by
the transport. This patch adds a symmetric call to the API to allow
an NPIV-capable LLD to instantiate vports sans user intervention.
Additional comments/updates:
Re: scsi_fc_transport.txt
Add a function prototype for fc_vport_terminate similar to what's
done for fc_vport_create
Re: fc_vport_create
I recommend we pass the channel number in fc_vport_create rather
than fixing it at zero.
Also, ids->vport_type should be set to FC_PORTTYPE_NPIV prior to
calling fc_vport_create. The comment is also meaningless.
Added-by and
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
[jejb: fixed up a ton of missed conversions.
All of you are on notice this has happened, driver trees will now
need to be rebased]
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: SCSI List <linux-scsi@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Kobjects do not have a limit in name size since a while, so stop
pretending that they do.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Way back when, when the fc_user_scan routine was created, it kept some
of its original logic that walked the rport list and kicked off a scan.
Unfortunately, it didn't keep any of the locking around the rport list,
nor did it consider the synchronous nature of the scan invoked. The result,
there are some scan requests where the rport list changes, thus a subsequent
scan is called on a bogus rport structure and the system NMI's.
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
It's big, but there doesn't seem to be a way to split it up smaller...
Signed-off-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add Documentation/DocBook/scsi_midlayer.tmpl, add to Makefile, and update
lots of kerneldoc comments in drivers/scsi/*.
Updated with comments from Stefan Richter, Stephen M. Cameron,
James Bottomley and Randy Dunlap.
Signed-off-by: Rob Landley <rob@landley.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
In scsi module I've found some inconsistency between variable type
used in module_param_named and type passed to module_param_named as an
argument. Especially the inconsistency of `max_scsi_luns' parameter is
a bit serious because the description text says "last scsi LUN (should
be between 1 and 2^32-1)".
Signed-off-by: Masatake YAMATO <jet@gyve.org>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This adds minimum target driver support like the srp transport does:
- fc_remote_port_{rolechg,delete} calls
scsi_tgt_it_nexus_{create,destroy} for target drivers.
- add callbacks to notify target drivers of the nexus and tmf
operation results to fc_function_template.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This change has already been discussed on linux-scsi:
http://marc.info/?t=118771096400003http://marc.info/?t=118760913100005
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Acked-by: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch contains the following cleanups:
- make needlessly global functions static
- every file should #include the headers containing the prototypes for
it's global functions
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When a target scan is initiated from sysfs, we should check the
portstate prior to invoke scsi_scan_target().
Otherwise scsi_scan_target() might oops as the rport might already
been removed from the scsi host and the traversal from the rport to
the scsi_host in scsi_scan_target() will fail.
Also the portstate already told us that communication with the target
has failed, so it's quite pointless to try.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Cc: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When the vport attribute "delete" is used to delete the vport, sysfs
deadlocks waiting for the write to complete, which is waiting for the
sysfs teardown to complete. Moved this effort to a work_q element.
Took the opportunity to make some other cosmetic changes:
- removed tabs in Doc file - replaced with expanded spaces
- minor copyright text and author text updates
- removed a bunch of trailing whitespace
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch provides support for FC virtual ports based on NPIV.
For information on the interfaces and design, please read the
Documentation/scsi/scsi_fc_transport.txt file enclosed within
the patch.
The RFC was originally posted here:
http://marc.info/?l=linux-scsi&m=117226959918393&w=2
Changes from the initial RFC:
- Bug fix: needed a transport_class_unregister() for the vport class
- Create a symlink to the vport in the shost device if it is not the
parent of the vport.
- Made symbolic name writable so it can be set after creation
- Made the temporary fc_vport_identifiers struct private to the
transport.
- Deleted the vport_id field from the vport. I couldn't find any good
use for it (and symname is a good replacement).
- Made the vport_state and vport_last_state "private" attributes.
Added the fc_vport_set_state() helper function to manage state
transitions
- Updated vport_create() to allow a vport to be created in a disabled
state.
- Added INITIALIZING and FAILED vport states
- Added VPCERR_xxx defines for errors to be returned from vport_create()
- Created a Documentation/scsi/scsi_fc_transport.txt file that describes
the interfaces and expected LLDD behaviors.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Per the comment in the change - it's not always prudent to immediately
remove the rport upon first notice of a disconnect. Make all rports
wait dev_loss_tmo before being deleted (and each could have a separate
dev_loss_tmo value).
The original post was:
http://marc.info/?l=linux-scsi&m=117392196006703&w=2
The repost contains the following changes:
- Bug fix in fc_starget_delete(). Dev_loss_tmo_callbk() was called prior to
tearing down the target. The callback is to be the last thing called, as
it tells the LLDD that the rport is completely finished and can be torn
down. Rework so that terminate_rport_io() is called to terminate the
outstanding io. Isolated work so it's is simply "starget" work.
- Fix holes in original patch. There were code paths that did not expect
the dev_loss_tmo timer to be running for the non-fcp rports.
- Bug Fix: the transport wasn't protecting against a LLDD calling
fc_remote_port_delete() back-to-back. Thus, the dev_loss_tmo timer
could be restarted such that it fires after the rport had been deleted.
Validate rport state before starting the timer.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch updates the FC transport for all speeds identified in
SM-HBA. Note: it does not sync the "bit" definitions, as that is
actually insulated from user-space via the sysfs text string. (I could
do it, but it does introduce a potential binary-incompatibility).
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
After Al Viro (finally) succeeded in removing the sched.h #include in module.h
recently, it makes sense again to remove other superfluous sched.h includes.
There are quite a lot of files which include it but don't actually need
anything defined in there. Presumably these includes were once needed for
macros that used to live in sched.h, but moved to other header files in the
course of cleaning it up.
To ease the pain, this time I did not fiddle with any header files and only
removed #includes from .c-files, which tend to cause less trouble.
Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha,
arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig,
allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all
configs in arch/arm/configs on arm. I also checked that no new warnings were
introduced by the patch (actually, some warnings are removed that were emitted
by unnecessarily included header files).
Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
nlmsg_multicast now takes an extra allocation flag, so add it to
the use in the fibre channel transport class.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch adds the following functionality to the FC transport:
- dev_loss_tmo LLDD callback :
Called to essentially confirm the deletion of an rport. Thus, it is
called whenever the dev_loss_tmo fires, or when the rport is deleted
due to other circumstances (module unload, etc). It is expected that
the callback will initiate the termination of any outstanding i/o on
the rport.
- fast_io_fail_tmo and LLD callback:
There are some cases where it may take a long while to truly determine
device loss, but the system is in a multipathing configuration that if
the i/o was failed quickly (faster than dev_loss_tmo), it could be
redirected to a different path and completed sooner.
Many thanks to Mike Reed who cleaned up the initial RFC in support
of this post.
The original RFC is at:
http://marc.theaimsgroup.com/?l=linux-scsi&m=115505981027246&w=2
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
During discussions with Mike Christie, I became convinced that we needed
a larger vendor id. This patch extends the id from 32 to 64 bits.
This applies on top of the prior patches that add SCSI transport events
via netlink.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch formally adds support for the posting of FC events via netlink.
It is a followup to the original RFC at:
http://marc.theaimsgroup.com/?l=linux-scsi&m=114530667923464&w=2
and the initial posting at:
http://marc.theaimsgroup.com/?l=linux-scsi&m=115507374832500&w=2
The patch has been updated to optimize the send path, per the discussions
in the initial posting.
Per discussions at the Storage Summit and at OLS, we are to use netlink for
async events from transports. Also per discussions, to avoid a netlink
protocol per transport, I've create a single NETLINK_SCSITRANSPORT protocol,
which can then be used by all transports.
This patch:
- Creates new files scsi_netlink.c and scsi_netlink.h, which contains the
single and shared definitions for the SCSI Transport. It is tied into the
base SCSI subsystem intialization.
Contains a single interface routine, scsi_send_transport_event(), for a
transport to send an event (via multicast to a protocol specific group).
- Creates a new scsi_netlink_fc.h file, which contains the FC netlink event
messages
- Adds 3 new routines to the fc transport:
fc_get_event_number() - to get a FC event #
fc_host_post_event() - to send a simple FC event (32 bits of data)
fc_host_post_vendor_event() - to send a Vendor unique event, with
arbitrary amounts of data.
Note: the separation of event number allows for a LLD to send a standard
event, followed by vendor-specific data for the event.
Note: This patch assumes 2 prior fc transport patches have been installed:
http://marc.theaimsgroup.com/?l=linux-scsi&m=115555807316329&w=2http://marc.theaimsgroup.com/?l=linux-scsi&m=115581614930261&w=2
Sorry - next time I'll do something like making these individual
patches of the same posting when I know they'll be posted closely
together.
Signed-off-by: James Smart <James.Smart@emulex.com>
Tidy up configuration not to make SCSI always select NET
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch updates the fc transport for the following:
- Addition of a new attribute "system_hostname" which can be
used to set the fully qualified hostname that the fc_host
is attached to. The fc_host can then register this string
as the FDMI-based host name attribute.
Note: for NPIV, a fc_host could be associated with a system which
is not the local system.
- Add the inline function u64_to_wwn(), which is the inverse of the
existing wwn_to_u64() function.
- Slight reorg, just to keep dynamic attributes with each other, etc
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Original post was incorrect as it didn't realize that we already had
a self-referenc due to device_initialize(), and we were really only
missing the put on our own reference. This was hidden by the other bug
which had the midlayer reusing stargets after they were already free,
which was doing too many puts on our rport.
Updating FC transport for:
- Add put in fc_rport_final_delete(), to release the rport.
Prior, we were leaving the rport with a reference, thus the shost
with references, etc. If the driver was unloaded, shosts and rports
remained, along with work threads, etc
- Fix fc_rport_create failure path - too many put's on parent
- Add commenting to easily track ref taking.
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Updated patch to address comments from Pat Mansfield and Michael Reed:
Bumped max to 600 (10mins). Set default dev_loss_tmo to a value other
than the max (30s).
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
In a prior posting to linux-scsi on the fc transport and workq
deadlocks, we noted a second error that did not have a patch:
http://marc.theaimsgroup.com/?l=linux-scsi&m=114467847711383&w=2
- There's a deadlock where scsi_remove_target() has to sit behind
scsi_scan_target() due to contention over the scan_lock().
Subsequently we posted a request for comments about the deadlock:
http://marc.theaimsgroup.com/?l=linux-scsi&m=114469358829500&w=2
This posting resolves the second error. Here's what we now understand,
and are implementing:
If the lldd deletes the rport while a scan is active, the sdev's queue
is blocked which stops the issuing of commands associated with the scan.
At this point, the scan stalls, and does so with the shost->scan_mutex held.
If, at this point, if any scan or delete request is made on the host, it
will stall waiting for the scan_mutex.
For the FC transport, we queue all delete work to a single workq.
So, things worked fine when competing with the scan, as long as the
target blocking the scan was the same target at the top of our delete
workq, as the delete workq routine always unblocked just prior to
requesting the delete. Unfortunately, if the top of our delete workq
was for a different target, we deadlock. Additionally, if the target
blocking scan returned, we were unblocking it in the scan workq routine,
which really won't execute until the existing stalled scan workq
completes (e.g. we're re-scheduling it while it is in the midst of its
execution).
This patch moves the unblock out of the workq routines and moves it to
the context that is scheduling the work. This ensures that at some point,
we will unblock the target that is blocking scan. Please note, however,
that the deadlock condition may still occur while it waits for the
transport to timeout an unblock on a target. Worst case, this is bounded
by the transport dev_loss_tmo (default: 30 seconds).
Finally, Michael Reed deserves the credit for the bulk of this patch,
analysis, and it's testing. Thank you for your help.
Note: The request for comments statements about the gross-ness of the
scan_mutex still stand.
Signed-off-by: Michael Reed <mdr@sgi.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]) and remove
duplicates of the macro.
Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
As previously reported via Michael Reed, the FC transport took a hit
in 2.6.15 (perhaps a little earlier) when we solved a recursion error.
There are 2 deadlocks occurring:
- With scan and the delete items sharing the same workq, flushing the
workq for the delete code was getting it stalled behind a very long
running scan code path.
- There's a deadlock where scsi_remove_target() has to sit behind
scsi_scan_target() due to contention over the scan_lock().
This patch resolves the 1st deadlock and significantly reduces the
odds of the second. So far, we have only replicated the 2nd deadlock
on a highly-parallel SMP system. More on the 2nd deadlock in a following
email.
This patch reworks the transport to:
- Only use the scsi host workq for scanning
- Use 2 other workq's internally. One for deletions, the other for
scheduled deletions. Originally, we tried this with a single workq,
but the occassional flushes of the scheduled queues was hitting the
second deadlock with a slightly higher frequency. In the future, we'll
look at the LLDD's and the transport to see if we can get rid of this
extra overhead.
- When moving to the other workq's we tightened up some object states
and some lock handling.
- Properly syncs adds/deletes
- minor code cleanups
- directly reference fc_host_attrs, rather than through attribute
macros
- flush the right workq on delayed work cancel failures.
Large kudos to Michael Reed who has been working this issue for the last
month.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This moves the eh_timed_out functionality from the scsi_host_template
to the transport_template. Given that this is now a transport function,
the EH_RESET_TIMER case no longer caps the timer reschedulings. The
transport guarantees that this is not an infinite condition.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
In the past I added an host attribute but unfortunately
I forgot to increase FC_HOST_NUM_ATTRS.
This is fixed with the patch. Otherwise an fibre channel
lld might run into
BUG_ON(count > FC_HOST_NUM_ATTRS);
in fc_attach_transport().
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Change the core SCSI code to use kzalloc rather than kmalloc+memset
where possible.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Current fc_transport consumers initially register rports
with an UNKNOWN role-state and follow-up with a call to
fc_remote_port_rolechg(). Modify code in
fc_remote_port_add() to scan the fc_host_rport_bindings()
array for consistent bindings regardless of role-type.
Original code would only scan bindings array for targets,
causing duplicate fc_remote_ports/rport-X:Y-Z entries to be
created for the yet-to-be-role-changed rports.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When James Smart fixed the issue of the userspace scan atributes
crashing the system with the FC transport class he added a patch to
let the transport class check if the parent is valid for a given
transport class.
When adding support for the integrated raid of fusion sas devices
we ran into a problem with that, as it didn't allow adding virtual
raid volumes without the transport class knowing about it.
So this patch adds a user_scan attribute instead, that takes over from
scsi_scan_host_selected if the transport class sets it and thus lets
the transport class control the user-initiated scanning. As this
plugs the hole about user-initiated scanning the target_parent hook
goes away and we rely on callers of the scanning routines to do
something sensible.
For SAS this meant I had to switch from a spinlock to a mutex to
synchronize the topology linked lists, in FC they were completely
unsynchronized which seems wrong.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add fc_host attribute permanent_port_name which is
used to show the port name of the primary port -
the port that initially logged into the fabric.
For a virtual port (registered via the primary port with
FDISC command) it is useful to know not only its (virtual)
port name but also the permanent port name.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
In the scenario that a link was broken, the devloss timer for each
rport was expire at roughly the same time, causing lots of "delete"
workqueue items being queued. Depth is dependent upon the number of
rports that were on the link.
The rport target remove calls were calling flush_scheduled_work(),
which would interrupt the stream, and start the next workqueue item,
which did the same thing, and so on until recursion depth was large.
This fix stops the recursion in the initial delete path, and pushes it
off to a host-level work item that reaps the dead rports.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>