Adds support for receiving FAST_IO_FAIL from fc_block_scsi_eh
when in error recovery. This fixes cases of devices being
taken offline when they are no longer accessible on the fabric,
preventing them from coming back online when the fabric recovers.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The flags on a cancel operation are intended to indicate what,
if any, TMF will follow the cancel request. This fixes a case
where we were incorrectly setting the abort task set flag on
the cancel flag when we were cancelling an abort task set.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
No locks should be held when calling scsi_adjust_queue_depth
so drop the lock in slave_configure prior to calling it.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
CONFIG_HOTPLUG is going away as an option. As a result, the __dev*
markings need to be removed.
This change removes the use of __devinit, __devexit_p, __devinitdata,
__devinitconst, and __devexit from these drivers.
Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.
Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Adam Radford <linuxraid@lsi.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The length field in the host config packet is only 16-bit long, so
passing it 0x10000 (64K which is our standard PAGE_SIZE) doesn't
work and result in an empty config from the server.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: <stable@vger.kernel.org>
Acked-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Now that the iSeries code is gone the backend abstraction
in this driver is no longer necessary, which allows us to
consolidate the driver in one file.
The side effect is that the module name is now ibmvscsi.ko
which matches the driver hotplug name and fixes auto-load
issues.
[jejb:fix up checkpatch.pl errors]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This fixes an issues seen where a Fabric RSCN event was received
while the link was down, which resulted in repeated attempts to log back
into the fabric, which then failed, resulting in the ibmvfc driver
taking the host offline. Fix this by delaying taking any action
regarding the fabric RSCN until the link comes back up.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
If an abort request times out to the virtual fibre channel adapter,
the ibmvfc driver will kick off a reset of the adapter. This
patch ensures we wait for the both the abort request and the
request being aborted to be completed prior to exiting the
eh_abort handler. This fixes a bug where the ibmvfc driver
was erroneously returning success to the eh_abort handler
then later sending back a response to the same command.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This makes vio_register_driver() get the module owner & name at compile
time like PCI drivers do, and adds a name pointer directly in struct
vio_driver to avoid having to explicitly initialize the embedded
struct device.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David S. Miller <davem@davemloft.net>
The PowerPC legacy iSeries platform is being removed and this code is
no longer selectable. There is more clean up that can be done, but this
just gets the old code out of the way.
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If a Virtual I/O server fails in a dual virtual I/O server multipath
configuration, ensure we delete all remote ports so that path failover
can occur. For a single path configuration, the remote ports will
go into devloss state.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch fixes an issue seen where an event occurs
which causes the ibmvscsi driver to reset its CRQ. Upon
re-registering its CRQ, it receives H_CLOSED, indicating
the Virtual I/O Server is not yet ready to receive commands.
This resulted in the ibmvscsi driver essentially offlining
the adapter and not recovering. The fix is to re-enable
our interrupt so that when the Virtual I/O server is ready
and sends a CRQ init, we will be able to receive it and
resume initialization of the VSCSI adapter.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
By changing field ordering we can avoid a couple of memory holes in
the tables that use the ibmvfc_async_desc structure.
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Move the mid-layer's ->queuecommand() invocation from being locked
with the host lock to being unlocked to facilitate speeding up the
critical path for drivers who don't need this lock taken anyway.
The patch below presents a simple SCSI host lock push-down as an
equivalent transformation. No locking or other behavior should change
with this patch. All existing bugs and locking orders are preserved.
Additionally, add one parameter to queuecommand,
struct Scsi_Host *
and remove one parameter from queuecommand,
void (*done)(struct scsi_cmnd *)
Scsi_Host* is a convenient pointer that most host drivers need anyway,
and 'done' is redundant to struct scsi_cmnd->scsi_done.
Minimal code disturbance was attempted with this change. Most drivers
needed only two one-line modifications for their host lock push-down.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If a Virtual I/O server is rebooted, the client fibre channel sees a
transport event on its CRQ, which causes it to attempt to reconnect
to the CRQ. For a period of time during the VIOS reboot, the client's
attempts to register the CRQ will return H_CLOSED, indicating the server
side is not currently registered. The ibmvfc driver was not handling
this well and was taking the virtual adapter offline. Fix this by
re-enabling our interrupt and waiting for the event on our CRQ
indicating the server is back, at which point we can reconnect.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This removes the driver's get_host_def_dev_loss_tmo
callback and just has the driver set the dev loss
using the fc class fc_host_dev_loss_tmo macro like is
done for other fc params.
This patch also removes the module dev loss param.
To override the value the fc host sysfs value being
added in the fc class patch can be used instead of the driver
module param.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
By default, ibmvfc does not log any async events in order
to avoid flooding the log with them. Improve on this by
logging by default events that are not likely to flood the
log, such as link up/down. Having these events in the log
will improve the ability to debug issues with ibmvfc.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This fixes a bug where the driver was resetting the
rport dev_loss_tmo when devices were added by adding
support for the get_host_def_dev_loss_tmo callout.
Patch has only been compile tested.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The ibmvfc driver was incorrectly obtaining a scsi_target pointer
from an fc_rport. The way it is coded ensures that ibmvfc's
terminate_rport_io handler does absolutely nothing. Fix this up
to iterate through affected devices differently, sending cancel
and abort task set as appropriate. Without this patch,
fast_io_fail_tmo is broken for ibmvfc.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Commit 43c8da907c introduced a race
condition which can occur when adding/deleting rports. There are
two possible threads now that can be deleting rports in the ibmvfc
driver, which can result in list_del being called twice, resulting
in an oops. This patch adds a new state to the ibmvfc_target struct
to indicate the target has been removed from the list and is in
the process of being deleted.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add the __init and __exit macros to the module_init / module_exit
functions from drivers/scsi/ibmvscsi/ibmvstgt.c
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
If we encounter an error when sending a management datagram (i.e. non
SCSI command, such as virtual adapter initialization command), we
end up incrementing the request_limit, even though we don't decrement
it for these commands. Fix this up by doing this increment in
the error path for SRP commands only.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fixes a deadlock that can occur if we hit a command timeout
during the virtual adapter initialization. The event done
functions are written with the assumption that no locks are held,
however, when purging requests this is not true. Fix up the
purge function to drop the lock so that the done function
is not called with the lock held, which can cause a deadlock.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This fixes a softlockup seen on resume. During resume, the CRQ
must be reenabled. However, the H_ENABLE_CRQ hcall used to do
this may return H_BUSY or H_LONG_BUSY. When this happens, the
caller is expected to retry later. This patch changes a simple
loop, which was causing the softlockup, to a loop at task level
which sleeps between retries rather than simply spinning.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Adds support for fc_block_scsi_eh to block the EH handlers if
the target device is in the blocked state to ensure we don't
take devices offline.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This fixes a softlockup seen on resume. During resume, the CRQ
must be reenabled. However, the H_ENABLE_CRQ hcall used to do
this may return H_BUSY or H_LONG_BUSY. When this happens, the
caller is expected to retry later. Normally the H_ENABLE_CRQ
succeeds relatively soon. However, we have seen cases where
this can take long enough to see softlockup warnings.
This patch changes a simple loop, which was causing the
softlockup, to a loop at task level which sleeps between
retries rather than simply spinning.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
A driver needs to be ready to take an interrupt as soon as it registers
an interrupt handler. I noticed the following oops when testing kdump:
ipr: IBM Power RAID SCSI Device Driver version: 2.5.0 (February 11, 2010)
ibmvscsi 30000002: SRP_VERSION: 16.a
ibmvscsi 30000002: SRP_VERSION: 16.a
Unable to handle kernel paging request for data at address 0x00000000
...
pc: c000000004085e34: .tasklet_action+0xf4/0x1dc
...
c000000004086fe4 .__do_softirq+0x16c/0x2c0
c00000000403138c .call_do_softirq+0x14/0x24
c00000000400ee14 .do_softirq+0xa0/0x104
c00000000408690c .irq_exit+0x70/0xd0
c00000000400f190 .do_IRQ+0x214/0x2a8
c000000004004804 hardware_interrupt_entry+0x1c/0x98
--- Exception: 501 (Hardware Interrupt) at c00000000400c544 .raw_local_irq_restore+0x48/0x54
c00000000465d2a8 ._raw_spin_unlock_irqrestore+0x74/0xa0
c0000000040e7f00 .__setup_irq+0x2ec/0x3f0
c0000000040e8198 .request_threaded_irq+0x194/0x22c
c00000000446d854 .rpavscsi_init_crq_queue+0x284/0x3f0
c00000000446c764 .ibmvscsi_probe+0x688/0x710
c00000000402903c .vio_bus_probe+0x37c/0x3e4
c000000004403f10 .driver_probe_device+0xec/0x1b8
c000000004404088 .__driver_attach+0xac/0xf4
c000000004403184 .bus_for_each_dev+0x98/0x104
c000000004403c98 .driver_attach+0x40/0x60
c0000000044026f0 .bus_add_driver+0x154/0x324
c0000000044045d0 .driver_register+0xe8/0x1ac
c00000000402b2a8 .vio_register_driver+0x54/0x74
c000000004933ea4 .ibmvscsi_module_init+0x80/0xc0
c000000004009834 .do_one_initcall+0x98/0x1d8
c0000000049005b4 .kernel_init+0x27c/0x33c
c000000004031550 .kernel_thread+0x54/0x70
srp_task needs to be setup before request_irq. The patch below fixes the oops.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Merging in current state of Linus' tree to deal with merge conflicts and
build failures in vio.c after merge.
Conflicts:
drivers/i2c/busses/i2c-cpm.c
drivers/i2c/busses/i2c-mpc.c
drivers/net/gianfar.c
Also fixed up one line in arch/powerpc/kernel/vio.c to use the
correct node pointer.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This allows bin_attr->read,write,mmap callbacks to check file specific data
(such as inode owner) as part of any privilege validation.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The following structure elements duplicate the information in
'struct device.of_node' and so are being eliminated. This patch
makes all readers of these elements use device.of_node instead.
(struct of_device *)->node
(struct dev_archdata *)->prom_node (sparc)
(struct dev_archdata *)->of_node (powerpc & microblaze)
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
If a command times out resulting in EH getting invoked, we wait for the
aborted commands to come back after sending the abort. Shorten
the amount of time we wait for these responses, to ensure we don't
get stuck in EH for several minutes.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Commands which are completed by the VIOS are placed on a CRQ
in kernel memory for the ibmvfc driver to process. Each CRQ
entry is 16 bytes. The ibmvfc driver reads the first 8 bytes
to check if the entry is valid, then reads the next 8 bytes to get
the handle, which is a pointer the completed command. This fixes
an issue seen on Power 7 where the processor reordered the
loads from memory, resulting in processing command completion
with a stale handle. This could result in command timeouts,
and also early completion of commands.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
ibmvscsi uses dma_unmap_single() for buffers mapped via
dma_map_sg(). It works however it's the API violation. The DMA debug
facility complains about it:
http://marc.info/?l=linux-scsi&m=127018555013151&w=2
Reported-by: Sachin Sant <sachinp@in.ibm.com>
Tested-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Adds support for resuming from suspend for IBM VFC devices. We may have
lost an interrupt over the suspend, so we just kick the interrupt handler
to process anything that is outstanding. We expect to find a transport event
indicating we need to reestablish our CRQ.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Adds support for resuming from suspend for IBM VSCSI devices. We may have
lost an interrupt over the suspend, so we just kick the interrupt handler
to process anything that is outstanding. We expect to find a transport event
indicating we need to reestablish our CRQ.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* 'for-2.6.34' of git://git.kernel.dk/linux-2.6-block: (38 commits)
block: don't access jiffies when initialising io_context
cfq: remove 8 bytes of padding from cfq_rb_root on 64 bit builds
block: fix for "Consolidate phys_segment and hw_segment limits"
cfq-iosched: quantum check tweak
blktrace: perform cleanup after setup error
blkdev: fix merge_bvec_fn return value checks
cfq-iosched: requests "in flight" vs "in driver" clarification
cciss: Fix problem with scatter gather elements in the scsi half of the driver
cciss: eliminate unnecessary pointer use in cciss scsi code
cciss: do not use void pointer for scsi hba data
cciss: factor out scatter gather chain block mapping code
cciss: fix scatter gather chain block dma direction kludge
cciss: simplify scatter gather code
cciss: factor out scatter gather chain block allocation and freeing
cciss: detect bad alignment of scsi commands at build time
cciss: clarify command list padding calculation
cfq-iosched: rethink seeky detection for SSDs
cfq-iosched: rework seeky detection
block: remove padding from io_context on 64bit builds
block: Consolidate phys_segment and hw_segment limits
...
Except for SCSI no device drivers distinguish between physical and
hardware segment limits. Consolidate the two into a single segment
limit.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch modifies scsi_host_template->change_queue_depth so that
it takes an argument indicating why it is being called. This will be
used so that if a LLD needs to do some extra processing when
handling queue fulls or later ramp ups, it can do so.
This is a simple port of the drivers setting a change_queue_depth
callback. In the patch I just have these LLDs adjust the queue depth
if the user was requesting it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
[Vasu.Dev: v2
Also converted pmcraid_change_queue_depth and then verified
all modules compile using "make allmodconfig" for any new build
warnings on X86_64.
Updated original description after combing two original
patches from Mike to make this patch git bisectable.]
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
[jejb: fixed up 53c700]
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When issuing a Cancel to the virtual fibre channel adapter,
the interface specifies a flags field for the client to indicate
what kind of error recovery is being performed. Fix up these
flags for terminate_rport_io to indicate an abort task set
rather than a target reset.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Remove a parameter to ibmvfc_init_host which is always set to
zero by all callers.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Need to grab the host lock around the call to ibmvfc_link_down.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When processing the response to either a LUN reset,
target reset, or an abort task set, the ibmvfc driver needs to
treat as success receiving a response with a non-zero
status in the response IU along with a general transport
error with the FCP response code being zero. The VIOS
currently guarantees this cannot happen, but a future version
of VIOS may allow this to be returned, so ensure we handle
this response combination correctly for TMFs, as we already
do for SCSI commands.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
fix the following 'make includecheck' warning:
drivers/scsi/ibmvscsi/ibmvscsi.c: asm/firmware.h is included more than once.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
LKML-Reference: <1247067016.4382.78.camel@ht.satnam>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
The allocated struct is manually zeroed after allocation, so avoid using
the (broken) kzalloc mempool (which does not re-zero previously used items
when they are returned to the pool).
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fixes a regression seen in the ibmvscsi driver when using the VSCSI
server in SLES 9 and SLES 10. The VSCSI server in these releases
has a bug in it in which it does not send responses to unknown MADs.
Check the OS Type field in the adapter info response and do not send
these unsupported commands when talking to an older server.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Fixes a problem seen where sending a PRLI to a target
resulted in it sending a LOGO. This caused the ibmvfc
driver to go back through discovery again, which caused
another PRLI attempt, which caused another LOGO. Fix this
behavior by ignoring LOGO if we haven't even logged into
the target yet.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Since async events could indicate changes to link status, or
events which could affect decisions made during discovery, we should
process async events prior to command completion responses.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device. Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used. These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.
Cc: linux-scsi@vger.kernel.org
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add support to ibmvscsi for the capabilities MAD. This command gets sent
to the Virtual I/O server prior to login in order to communicate client
capabilities. Additionally it returns information regarding capabilities
that the server supports. The two main capabilities communicated in this
MAD are related to partition migration and client reserve. Client reserve
allows for SCSI-2 reservations to be sent to virtual disks which are backed
by physical LUNs and will result in the reservation being sent to the
physical LUN.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
A new mode of error reporting, fast fail, has been added to the VIOS
which allows failover to happen more quickly.
If this new fast fail mode is enabled on the VIOS and the vSCSI client
supports the mode, the VIOS will not return MEDIUM error on path failures,
but rather return VIOSRP_ADAPTER_FAIL in the crq response, which
ibmvscsi will translate to DID_ERROR.
This new mode can be enabled for single path configurations as well,
so it is the new default error reporting mode. A module parameter is
provided to disable this new behavior on the off chance it causes a
problem on some old VIOS version.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The ibmvscsi driver currently sends the SRP Login before sending the Adapter
Info MAD, which can result in commands getting sent to the virtual adapter
before we are ready for them. This results in a slight window where the target
devices may not behave as expected. Change the order and close the window.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Previously we had one timeout that was used for all types of operations.
This adds specific timeout values for different operations (init, login,
adapter info MAD, abort task, and LUN reset).
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Adds support for 16 byte CDBs to the ibmvscsi driver.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
There are several scenarios where the ibmvfc driver needs to
try to log back into a target on the fabric. Today when these events
occur, we simply go through re-discovery for all attached targets,
assuming that either the query of the name server or an ADISC will
indicate we might need to log back into the target, which doesn't
work for all scenarios. Fix this by taking note of the affected target(s)
in these conditions and ensuring we try to PLOGI back into the target.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
For certain scenarios during device rediscovery, we detect we need
to log back into a target. Currently we do just that - PLOGI/PRLI
back into the target. Change the code to delete and add the target
from the FC transport layer as well, to ensure we handle any cases
where the target may have changed.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The virtual I/O server controlling the NPIV adapter associated with
a virtual fibre channel adapter can send a HALT event to the client.
When this occurs, the client can no longer send commands until a RESUME
is received. By adding support for flush on halt, we will get all of
our outstanding commands flushed back before the Virtual I/O server
enters the halt state, eliminating potential command timeouts for
outstanding commands which might occur if we did not support this feature.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch adds support for a new command supported by the Virtual I/O
Server, NPIV Logout. The command will abort all outstanding commands
and log out of the fabric. Currently, the only way to do this is
by breaking the CRQ, which can take a fairly long time when lots of
commands are outstanding. The NPIV Logout commands provides a mechanism
to accomplish virtually the same function, but is much faster.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The ibmvfc driver currently logs errors during discovery for several
transient fabric errors, which generally get retried. If retries
do not work, we see multiple errors in the log. If retries do work,
we see errors in the log which may be confusing since the retry worked.
This patch enhances the discovery time error logging to only log errors
for command failures during discovery if all allowed retries have been
used up. The existing behavior of logging all failures can be restored
by setting the hosts log_level to a value of 3 or greater.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Use DEVICE_ATTR macro for defining device sysfs attributes.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Since target allocations can occur while resetting the virtual adapter,
we shouldn't be using GFP_KERNEL for them as it could hang. Switch to
use GFP_NOIO.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Fix an obvious bug in processing error responses for SCSI commands
which can result in successful responses being incorrectly returned
with DID_ERROR.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Bump driver version to 1.0.5.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The ibmvfc driver currently breaks the CRQ and essentially
resets the entire virtual FC adapter, killing all outstanding
ops to all attached targets, if an ADISC times out during target
discover/rediscovery. This patch adds some code to cancel the
ADISC if it times out, which prevents a single ADISC timeout from
affecting the other devices attached to the fabric.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Set show_host_maxframe_size so that maxframe_size gets exported in
sysfs for the host.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The ibmvfc driver really does not handle dynamically changing disc_threads.
To change this dynamically would cause confusion in the driver regarding
the number of event structs allocated. Fix this by simply not allowing
disc_threads to be changed at runtime.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch fixes a problem of possible dropped interrupts. Currently,
the ibmvfc driver has a race condition where after ibmvfc_interrupt
gets run, the platform code clears the interrupt. This can result in
lost interrupts and, in worst case scenarios, result in command
timeouts. Fix this by implementing a tasklet similar to what the
ibmvscsi driver does so that interrupt processing is no longer done in
the actual interrupt handler, which eliminates the race.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The ibmvfc driver currently always sets the role of all rports
to FC_PORT_ROLE_FCP_TARGET, which is not correct for other initiators.
This can cause problems if other initiators are on the fabric
when we then try to scan the rport for LUNs. Fix this by looking
at the service parameters returned in the PRLI to set the roles
appropriately. Also look at the returned service parameters to
decide whether or not we were actually able to successfully log into
the target.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
During cancel testing it has been shown that 15 seconds is not
nearly long enough for the VIOS to respond to a cancel under
loaded situations. Increasing this timeout to 60 seconds allows
time for the VIOS to cancel the outstanding commands and prevents
us from escalating to a full host reset, which can take much longer.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The ibmvfc driver has a bug in its SCN handling. If it receives
an ELS event such asn an N-Port SCN event or an unsolicited PLOGI,
or any other SCN event which causes ibmvfc_reinit_host to be called,
it is possible that we will call fc_remote_port_add for a target
that already has an rport added, which can result in duplicate
rports getting created for the same targets. Fix this by calling
fc_remote_port_rolechg in this scenario instead to report any possible
role change that may have occurred.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Currently the ibmvfc driver sets the IBMVFC_CLASS_3_ERR flag
in the VFC Frame if both the adapter and the device claim support
for Class 3. However, this bit actually refers to Class 3 Error
Recovery, which is currently not supported by the VIOS. Setting this
bit can cause lots of command timeout responses from the VIOS resulting
in general instability. Fix this by never setting this bit.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The ibmvscsi client driver is not unmapping the SCSI command after
encountering a DMA mapping error while trying to map an indirect
scattergather list for the event pool. This leads to a leak of DMA
entitlement that could result in the device failing future DMA operations
in a CMO environment.
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
There is currently a DMA mapping leak that can occur in the ibmvfc
driver if we fail to allocate a scatterlist. Fix this by unmapping
the scatterlist in the failure path. Additionally, only log an error
for a scatterlist allocation failure if the log level is greater
than the default, since this can occur when running Active Memory
Sharing and this is not considered an error.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This is a powerpc specific driver.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Convert arch/powerpc/ over to long long based u64:
-#ifdef __powerpc64__
-# include <asm-generic/int-l64.h>
-#else
-# include <asm-generic/int-ll64.h>
-#endif
+#include <asm-generic/int-ll64.h>
This will avoid reoccuring spurious warnings in core kernel code that
comes when people test on their own hardware. (i.e. x86 in ~98% of the
cases) This is what x86 uses and it generally helps keep 64-bit code
32-bit clean too.
[Adjusted to not impact user mode (from paulus) - sfr]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
In a previous patch to fix an issue with error recovery,
the behavior of the max_requests module paramater was also
changed. If, for some reason, max_requests is set to one by
the user, we will end up with a negative number for can_queue.
Fix this by making max_requests not include the two event structs
needed to do error recovery.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If a link down event is received, outstanding commands may get
returned to the ibmvfc driver with a "transaction cancelled implicit"
response. This is currently translated to DID_ABORT, which does
not get retried by SCSI core, but rather passes the failure up
the stack. This can result in I/O errors at the filesystem level.
Fix up this response a well as a few other error responses.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
[jejb: limit ioctl to returning 20 characters to avoid overrun
on long device names and add a few more conversions]
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
While doing various error injection testing, such as cable
pulls and target moves, some issues were observed in handling
these events. This patch improves the way these events are handled
by increasing the delay waiting for the fabric to settle and also
changes the behavior of Link Up to break the CRQ to ensure everything
gets cleaned up properly on the VIOS.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The ibmvscsi driver currently has a bug in it which can result
in it using up all its event structs for commands. If something
results in all those commands timing out, we won't have any resources
left to send aborts or resets. This results in escalating to a host reset
in order to recover, which is a bit heavy handed. This fixes it
by reducing can_queue by two in order to have resources to do EH.
It also changes the max_requests module parameter so that it is not
writable at runtime, since the code really does not handle it changing
at runtime.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
In order to ensure the VIOS sees a consistent command buffer, we
need to add a memory barrier after building the command buffer
but before sending the command.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Adds a delay prior to retrying a failed NPIV login. This fixes
a scenario if the backing fibre channel adapter is getting reset
due to an EEH event, NPIV login will fail. Currently, ibmvfc
retries three times very quickly, resets the CRQ and tries one
more time. If the adapter is getting reset due to EEH, this isn't
enough time. This adds a delay prior to retrying a failed NPIV
login and also increments the number of retries.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The virtual fibre channel stack can return a failure response for a command
indicating the port login has been invalidated without sending the client
an async event. Add code to handle this response and initiate a PLOGI.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The CRQs used by the ibmvfc driver are read and written by both
the client and the server. Therefore, we need to mark them volatile
so that we do not cache their contents when handling an interrupt.
This fixes a problem which can surface as occasional command timeouts.
No commands were actually timing out, but due to accessing cached data
for the CRQ in the interrupt handler, the interrupt was not processing
all command completions as it should.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Fixes an oops that can occur in the interrupt handler
if we get a lot of async events.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Stops gcc from complaining about a possible uninitialized
variable being used in ibmvfc_reset_device.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If the ibmvfc driver is in discovery attempting to log into a target
and it encounters an error, the command may get retried one or more
times, depending on the error received. If the retries are
unsuccessful such that the discovery thread gives up on discovery to
that target, the target ends up in a state where, if SCSI core had
previously known about the device, the host will get unblocked but the
host will not be logged into the target, causing any commands sent to
the target to fail. This patch fixes this so that if this occurs, the
target is deleted such that the normal dev_loss processing can occur
instead.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>