Annotate members of FC protocol and firmware dump data structures as big
endian. Annotate members of RISC control structures as little endian.
Annotate mailbox registers as little endian. Annotate the mb[] arrays as
CPU-endian because communication of the mb[] values with the hardware
happens through the readw() and writew() functions. readw() converts from
__le16 to u16 and writew() converts from u16 to __le16. Annotate 'handles'
as CPU-endian because for the firmware these are opaque values.
Link: https://lore.kernel.org/r/20200518211712.11395-15-bvanassche@acm.org
CC: Hannes Reinecke <hare@suse.de>
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Quinn Tran <qutran@marvell.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This was suggested by Daniel Wagner.
Link: https://lore.kernel.org/r/20200518211712.11395-12-bvanassche@acm.org
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Quinn Tran <qutran@marvell.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make the MMIO accessors strongly typed such that the compiler checks
whether the accessor function is used that matches the register width. Fix
those MMIO accesses where another number of bits was read or written than
the size of the register.
Link: https://lore.kernel.org/r/20200518211712.11395-11-bvanassche@acm.org
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Quinn Tran <qutran@marvell.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Instead of passing an argument to the firmware dumping functions that tells
these functions whether or not to obtain the hardware lock, obtain that
lock before calling these functions. This patch fixes the following
recently introduced C=2 build error:
CHECK drivers/scsi/qla2xxx/qla_tmpl.c
drivers/scsi/qla2xxx/qla_tmpl.c:1133:1: error: Expected ; at end of statement
drivers/scsi/qla2xxx/qla_tmpl.c:1133:1: error: got }
drivers/scsi/qla2xxx/qla_tmpl.h:247:0: error: Expected } at end of function
drivers/scsi/qla2xxx/qla_tmpl.h:247:0: error: got end-of-input
Link: https://lore.kernel.org/r/20200518211712.11395-4-bvanassche@acm.org
Fixes: cbb01c2f2f ("scsi: qla2xxx: Fix MPI failure AEN (8200) handling")
Cc: Arun Easi <aeasi@marvell.com>
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The bitfields mpi_fw_dump_reading and mpi_fw_dumped are currently signed
which is not recommended as the representation is an implementation defined
behaviour. Fix this by making the bit-fields unsigned ints.
Link: https://lore.kernel.org/r/20200428102013.1040598-1-colin.king@canonical.com
Fixes: cbb01c2f2f ("scsi: qla2xxx: Fix MPI failure AEN (8200) handling")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Today, upon an MPI failure AEN, on top of collecting an MPI dump, a regular
firmware dump is also taken and then chip reset. This is disruptive to IOs
and not required. Make the firmware dump collection, followed by chip
reset, optional (not done by default).
Firmware dump buffer and MPI dump buffer are independent of each
other with this change and each can have dump that was taken at two
different times for two different issues. The MPI dump is saved in a
separate buffer and is retrieved differently from firmware dump.
To collect full dump on MPI failure AEN, a module parameter is
introduced:
ql2xfulldump_on_mpifail (default: 0)
Link: https://lore.kernel.org/r/20200331104015.24868-2-njavali@marvell.com
Reported-by: kbuild test robot <lkp@intel.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
commit e4e3a2ce95 ("scsi: qla2xxx: Add ability to autodetect SFP
type") takes a heavy handed approach to BPM (Buffer Plus Management)
enablement:
1) During hardware initialization, if an LR-capable transceiver is
recognized, the driver schedules a disruptive post-initialization
chip-reset (ISP-ABORT) to allow the BPM settings to be sent to the
firmware. This chip-reset will result in (short-term) path-loss to
all fc-rports and their attached SCSI devices.
2) LR-detection is triggered during any link-up event, resulting in a
refresh and potential chip-reset
Based on firmware-team guidance, upon LR-capable transceiver
recognition, the driver's hardware initialization code will now
re-execute firmware with the new BPM settings, then continue on with
driver initialization. To address the second issue, the driver
performs LR-capable detection upon the driver receiving a
transceiver-insertion asynchronous event from firmware. No short-term
path loss is needed with this new semantic.
Link: https://lore.kernel.org/r/20200226224022.24518-10-hmadhani@marvell.com
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Andrew Vasquez <andrewv@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There's no point checking flags.disable_msix_handshake in the
interrupt handler hot-path. Instead perform the check during
queue-pair instantiation and use the proper interrupt handler.
Link: https://lore.kernel.org/r/20200226224022.24518-8-hmadhani@marvell.com
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Andrew Vasquez <andrewv@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch allows sparse to verify the endianness of the arguments passed
to make_handle().
Link: https://lore.kernel.org/r/20200220043441.20504-5-bvanassche@acm.org
Cc: Roman Bolshakov <r.bolshakov@yadro.com>
Cc: Daniel Wagner <dwagner@suse.de>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Quinn Tran <qutran@marvell.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Instead of changing endianness in-place, write the data in CPU endian
format in another buffer and copy that buffer back. This patch does not
change any functionality but silences some sparse endianness warnings.
Link: https://lore.kernel.org/r/20200220043441.20504-3-bvanassche@acm.org
Cc: Roman Bolshakov <r.bolshakov@yadro.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Quinn Tran <qutran@marvell.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since the SCSI core does not reuse the tag of the SCSI command that is
being aborted by .eh_abort() before .eh_abort() has finished it is not
necessary to check from inside that callback whether or not the SCSI
command has already completed. Instead, rely on the firmware to return an
error code when attempting to abort a command that has already
completed. Additionally, rely on the firmware to return an error code when
attempting to abort an already aborted command.
In qla2x00_abort_srb(), use blk_mq_request_started() instead of
sp->completed and sp->aborted.
Link: https://lore.kernel.org/r/20200220043441.20504-2-bvanassche@acm.org
Cc: Martin Wilck <mwilck@suse.com>
Cc: Quinn Tran <qutran@marvell.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch adds deferred queue for processing aborts and RDP in the driver.
Link: https://lore.kernel.org/r/20200212214436.25532-14-hmadhani@marvell.com
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch adds support for extended FDMI commands and cleans up code to
reduce duplication.
Link: https://lore.kernel.org/r/20200212214436.25532-10-hmadhani@marvell.com
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch adds RDP command support in the driver. With the help of new
ql2xsmartsan parameter, driver will use PUREX IOCB mode to send RDP command
to switch and will be able to receive various diagnostic data.
Link: https://lore.kernel.org/r/20200212214436.25532-8-hmadhani@marvell.com
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch prepares code for implementing Vendor specific extended FDMI/RDP
commands. It also addes support for MBC_GET_PORT_DATABASE and
MBC_GET_RNID_PARAMS commands.
Link: https://lore.kernel.org/r/20200212214436.25532-7-hmadhani@marvell.com
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch fixes endian warning for fc_host_stats.
Link: https://lore.kernel.org/r/20200212214436.25532-6-hmadhani@marvell.com
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch adds sysfs node to show D-Port diag data.
Link: https://lore.kernel.org/r/20200212214436.25532-4-hmadhani@marvell.com
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch provides an interface to do the following (using MBC 0x3B):
- Displays (in hex) the LED config words for all three LEDs.
- Programs the config words for one LED or for all three LEDs.
The sysfs node defined is named beacon_config.
First, to allow driver to gain LED control, do this:
# echo 1 > /sys/class/scsi_host/host#/beacon
Then, to display config words for all three LEDs do this:
# cat /sys/class/scsi_host/host#/beacon_config
To set config words for all three LEDs do this:
# echo 3 xxxx yyyy zzzz > /sys/class/scsi_host/host#/beacon_config
Or, to set config word for a specific single LED n do this:
# echo n xxxx > /sys/class/scsi_host/host#/beacon_config
where n is the LED number (0, 1, 2)
Finally, to restore LED control back to firmware, do this:
# echo 0 > /sys/class/scsi_host/host#/beacon
Link: https://lore.kernel.org/r/20200212214436.25532-2-hmadhani@marvell.com
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Declare qla_hw_data.flt as a qla_flt_header pointer instead of as a void
pointer. Add a zero-length array at the end of struct qla_flt_header to
make it clear that qla_flt_header and qla_flt_region are contiguous. This
patch removes several casts but does not change any functionality.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Cc: Quinn Tran <qutran@marvell.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Daniel Wagner <dwagner@suse.de>
Cc: Roman Bolshakov <r.bolshakov@yadro.com>
Link: https://lore.kernel.org/r/20191219004706.39039-1-bvanassche@acm.org
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Session is stuck if driver sees FW has received a PRLI. Driver allows FW to
finish with processing of PRLI by checking back with FW at a later time to
see if the PRLI has finished. Instead, driver failed to push forward after
re-checking PRLI completion.
Fixes: ce0ba496dc ("scsi: qla2xxx: Fix stuck login session")
Cc: stable@vger.kernel.org # 5.3
Link: https://lore.kernel.org/r/20191217220617.28084-9-hmadhani@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch removes unused qla2x00_async_logout_done from the code.
Link: https://lore.kernel.org/r/20191217220617.28084-5-hmadhani@marvell.com
Signed-off-by: Shyam Sundar <ssundar@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch adds a shadow variable to hold disc_state history for the fcport
and prints state transition when the logging is enabled.
Link: https://lore.kernel.org/r/20191217220617.28084-4-hmadhani@marvell.com
Signed-off-by: Shyam Sundar <ssundar@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Target makes implicit LOGO on session teardown. LOGO ELS is not send on the
wire and initiator is not aware that target no longer wants talking to
it. Initiator keeps sending I/O requests, target responds with BA_RJT, they
time out and then initiator sends ABORT TASK (ABTS-LS).
Current behaviour incurs unneeded I/O timeout and can be fixed for some
initiators by making explicit LOGO on session deletion.
Link: https://lore.kernel.org/r/20191125165702.1013-3-r.bolshakov@yadro.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Avoid an uninitialized value (0) for ha->fc4_type_priority being falsely
interpreted as NVMe priority. Not strictly needed any more after the
previous patch, but makes the fc4_type_priority handling more explicit.
Link: https://lore.kernel.org/r/20191107224839.32417-3-martin.wilck@suse.com
Tested-by: David Bond <dbond@suse.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Current code assumes abort will remove the original command from the active
list where scsi_done will not be called. Instead, the eh_abort thread will
do the scsi_done. That is not the case. Instead, we have a double
scsi_done calls triggering use after free.
Abort will tell FW to release the command from FW possesion. The original
command will return to ULP with error in its normal fashion via scsi_done.
eh_abort path would wait for the original command completion before
returning. eh_abort path will not perform the scsi_done call.
Fixes: 219d27d714 ("scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands")
Cc: stable@vger.kernel.org # 5.2
Link: https://lore.kernel.org/r/20191105150657.8092-6-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On switch, fabric and mgt command timeout, driver send Abort to tell FW to
return the original command. If abort is timeout, then return both Abort
and original command for cleanup.
Fixes: 219d27d714 ("scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands")
Cc: stable@vger.kernel.org # 5.2
Link: https://lore.kernel.org/r/20191105150657.8092-3-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some storage arrays advertise FCP LUNs and NVMe namespaces behind the same
WWN. The driver now offers a user option by way of NVRAM parameter to
allow users to choose, on a per port basis, the kind of FC-4 type they
would like to prioritize for login.
Link: https://lore.kernel.org/r/20190912180918.6436-9-hmadhani@marvell.com
Signed-off-by: Michael Hernandez <mhernandez@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix stalled link recovery for N2N with FC-NVMe connection.
Link: https://lore.kernel.org/r/20190912180918.6436-6-hmadhani@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In the case of NPIV port is being torn down, this patch will set a flag to
indicate VPORT_DELETE. This would prevent relogin to be triggered.
Link: https://lore.kernel.org/r/20190912180918.6436-5-hmadhani@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Instead of calling qla2x00_fcport_event_handler() and letting the switch
statement inside that function decide which other function to call, call
the latter function directly. Remove the event member from the event_arg
structure because it is no longer needed. Remove the
qla_handle_els_plogi_done() function because it is never called.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It is easy to mix up the QLA_* and the MBS_* status codes. Complain loudly
if that happens.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Split srb_cmd.ctx into two pointers such that the compiler can check the
type of that pointer.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Instead of allocating a struct srb dynamically from inside .queuecommand(),
set qla2xxx_driver_template.cmd_size such that struct scsi_cmnd and struct
srb are contiguous. Do not call QLA_QPAIR_MARK_BUSY() /
QLA_QPAIR_MARK_NOT_BUSY() for SRBs associated with SCSI commands. That is
safe because scsi_remove_host() is called before queue pairs are deleted
and scsi_remove_host() waits for all outstanding SCSI commands to finish.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since all pointers passed to the srb_t.done() and srb_t.free() functions
have type srb_t, change the type of the first argument of these functions
from void * into struct srb *. This allows the compiler to verify the
argument types for these functions. This patch does not change any
functionality.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Introduce the be_id_t and le_id_t data types for Fibre Channel source and
destination ID formats supported by the firmware instead of using an
uint8_t[3] array. Introduce functions for converting from and to the
port_id_t data types. This patch does not change the behavior of the
qla2xxx driver but improves source code readability and also allows the
compiler to verify the endianness of Fibre Channel IDs.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pass the output buffer size to the code that generates a PCI info string
and check the output buffer size while generating a PCI info string.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch does not change any functionality but fixes a Coverity complaint
about using a scalar as an array.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Insert a space where required, surround complex expressions in macros with
parentheses, use the UL suffix instead of the (unsigned long) cast, do not
use line continuations when not necessary and do not explicitly initialize
static variables to zero.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For any qla2xxx async command, the SRB buffer is used to send it. In
setting up the SRB buffer, the timer for this command is started before all
memory allocation has finished. Under low memory pressure, memory alloc
can go to sleep and not wake up before the timer expires. Once timer has
expired, the timer thread will access uninitialize fields resulting into
NULL pointer crash.
This patch fixes this crash by moving the start of timer after everything
is setup.
backtrace shows following
PID: 3720 TASK: ffff996928401040 CPU: 0 COMMAND: "qla2xxx_1_dpc"
0 [ffff99652751b698] __schedule at ffffffff965676c7
1 [ffff99652751b728] schedule at ffffffff96567bc9
2 [ffff99652751b738] schedule_timeout at ffffffff965655e8
3 [ffff99652751b7e0] io_schedule_timeout at ffffffff9656726d
4 [ffff99652751b810] congestion_wait at ffffffff95fd8d12
5 [ffff99652751b870] isolate_migratepages_range at ffffffff95fddaf3
6 [ffff99652751b930] compact_zone at ffffffff95fdde96
7 [ffff99652751b980] compact_zone_order at ffffffff95fde0bc
8 [ffff99652751ba20] try_to_compact_pages at ffffffff95fde481
9 [ffff99652751ba80] __alloc_pages_direct_compact at ffffffff9655cc31
10 [ffff99652751bae0] __alloc_pages_slowpath at ffffffff9655d101
11 [ffff99652751bbd0] __alloc_pages_nodemask at ffffffff95fc0e95
12 [ffff99652751bc80] dma_generic_alloc_coherent at ffffffff95e3217f
13 [ffff99652751bcc8] x86_swiotlb_alloc_coherent at ffffffff95e6b7a1
14 [ffff99652751bcf8] qla2x00_rft_id at ffffffffc055b5e0 [qla2xxx]
15 [ffff99652751bd50] qla2x00_loop_resync at ffffffffc0533e71 [qla2xxx]
16 [ffff99652751be68] qla2x00_do_dpc at ffffffffc05210ca [qla2xxx]
PID: 0 TASK: ffffffff96a18480 CPU: 0 COMMAND: "swapper/0"
0 [ffff99652fc03ae0] machine_kexec at ffffffff95e63674
1 [ffff99652fc03b40] __crash_kexec at ffffffff95f1ce12
2 [ffff99652fc03c10] crash_kexec at ffffffff95f1cf00
3 [ffff99652fc03c28] oops_end at ffffffff9656c758
4 [ffff99652fc03c50] no_context at ffffffff9655aa7e
5 [ffff99652fc03ca0] __bad_area_nosemaphore at ffffffff9655ab15
6 [ffff99652fc03cf0] bad_area_nosemaphore at ffffffff9655ac86
7 [ffff99652fc03d00] __do_page_fault at ffffffff9656f6b0
8 [ffff99652fc03d70] do_page_fault at ffffffff9656f915
9 [ffff99652fc03da0] page_fault at ffffffff9656b758
[exception RIP: unknown or invalid address]
RIP: 0000000000000000 RSP: ffff99652fc03e50 RFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff99652b79a600 RCX: ffff99652b79a760
RDX: ffff99652b79a600 RSI: ffffffffc0525ad0 RDI: ffff99652b79a600
RBP: ffff99652fc03e60 R8: ffffffff96a18a18 R9: ffffffff96ee3c00
R10: 0000000000000002 R11: ffff99652fc03de8 R12: ffff99652b79a760
R13: 0000000000000100 R14: ffffffffc0525ad0 R15: ffff99652b79a600
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
10 [ffff99652fc03e50] qla2x00_sp_timeout at ffffffffc0525af8 [qla2xxx]
11 [ffff99652fc03e68] call_timer_fn at ffffffff95ea7f58
12 [ffff99652fc03ea0] run_timer_softirq at ffffffff95eaa3bd
13 [ffff99652fc03f18] __do_softirq at ffffffff95ea0f05
14 [ffff99652fc03f88] call_softirq at ffffffff9657832c
15 [ffff99652fc03fa0] do_softirq at ffffffff95e2e675
16 [ffff99652fc03fc0] irq_exit at ffffffff95ea1285
17 [ffff99652fc03fd8] smp_apic_timer_interrupt at ffffffff965796c8
18 [ffff99652fc03ff0] apic_timer_interrupt at ffffffff96575df2
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If an abort times out, the Abort IOCB completion and Abort timer can race
against each other. This patch provides unique error code for timer path to
allow proper cleanup.
[mkp: typo]
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On session deletion, current qla code would unregister an NVMe session
before flushing IOs. This patch would move the unregistration of NVMe
session after IO flush. This way FC-NVMe layer would not have to wait for
stuck IOs. In addition, qla2xxx would stop accepting new IOs during session
deletion.
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch uses kref to protect access between fcp_abort path and nvme
command and LS command completion path. Stack trace below shows the abort
path is accessing stale memory (nvme_private->sp).
When command kref reaches 0, nvme_private & srb resource will be
disconnected from each other. Any subsequence nvme abort request will not
be able to reference the original srb.
[ 5631.003998] BUG: unable to handle kernel paging request at 00000010000005d8
[ 5631.004016] IP: [<ffffffffc087df92>] qla_nvme_abort_work+0x22/0x100 [qla2xxx]
[ 5631.004086] Workqueue: events qla_nvme_abort_work [qla2xxx]
[ 5631.004097] RIP: 0010:[<ffffffffc087df92>] [<ffffffffc087df92>] qla_nvme_abort_work+0x22/0x100 [qla2xxx]
[ 5631.004109] Call Trace:
[ 5631.004115] [<ffffffffaa4b8174>] ? pwq_dec_nr_in_flight+0x64/0xb0
[ 5631.004117] [<ffffffffaa4b9d4f>] process_one_work+0x17f/0x440
[ 5631.004120] [<ffffffffaa4bade6>] worker_thread+0x126/0x3c0
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffffc050d10c>] qla_nvme_unregister_remote_port+0x6c/0xf0 [qla2xxx]
PGD 800000084cf41067 PUD 84d288067 PMD 0
Oops: 0000 [#1] SMP
Call Trace:
[<ffffffff98abcfdf>] process_one_work+0x17f/0x440
[<ffffffff98abdca6>] worker_thread+0x126/0x3c0
[<ffffffff98abdb80>] ? manage_workers.isra.26+0x2a0/0x2a0
[<ffffffff98ac4f81>] kthread+0xd1/0xe0
[<ffffffff98ac4eb0>] ? insert_kthread_work+0x40/0x40
[<ffffffff9918ad37>] ret_from_fork_nospec_begin+0x21/0x21
[<ffffffff98ac4eb0>] ? insert_kthread_work+0x40/0x40
RIP [<ffffffffc050d10c>] qla_nvme_unregister_remote_port+0x6c/0xf0 [qla2xxx]
The crash is due to a bad entry in the nvme_rport_list. This list is not
protected, and when a remoteport_delete callback is called, driver
traverses the list and crashes.
Actually, the list could be removed and driver could traverse the main
fcport list instead. Fix does exactly that.
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch makes the code easier to read and more compact.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Cc: Giridhar Malavali <gmalavali@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Introduce two structures for the (DMA address, length) combination instead
of using separate structure members for the DMA address and length. This
patch fixes several Coverity complaints about 'cur_dsd' being used to write
outside the bounds of structure members.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Cc: Giridhar Malavali <gmalavali@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In the *_done() functions, instead of returning early if sp->ref_count >=
2, only decrement sp->ref_count. In qla2xxx_eh_abort(), instead of deciding
what to do based on the value of sp->ref_count, decide which action to take
depending on the completion status of the firmware abort. Remove srb.cwaitq
and use srb.comp instead. In qla2x00_abort_srb(), call
isp_ops->abort_command() directly instead of calling qla2xxx_eh_abort().
Cc: Himanshu Madhani <hmadhani@marvell.com>
Cc: Giridhar Malavali <gmalavali@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch reduces the size of struct srb.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Cc: Giridhar Malavali <gmalavali@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reduce the size of the qla2xxx kernel module by moving an array definition
from a .h into a .c file.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Cc: Giridhar Malavali <gmalavali@marvell.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Improve source code readability by following the Linux kernel coding style
for pointer types. This patch only changes whitespace.
Cc: Himanshu Madhani <hmadhani@marvell.com>
Cc: Giridhar Malavali <gmalavali@marvell.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>