linux

Commit Graph

Author	SHA1	Message	Date
Dongliang Mu	d98e000cc7	scsi: aacraid: Spelling fix in comment requesed -> requested Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-08 21:32:43 -04:00
Dave Carroll	7d3af7d96a	scsi: aacraid: Correct hba_send to include iu_type commit `b60710ec7d` ("scsi: aacraid: enable sending of TMFs from aac_hba_send()") allows aac_hba_send() to send scsi commands, and TMF requests, but the existing code only updates the iu_type for scsi commands. For TMF requests we are sending an unknown iu_type to firmware, which causes a fault. Include iu_type prior to determining the validity of the command Reported-by: Noah Misner <nmisner@us.ibm.com> Fixes: `b60710ec7d` ("aacraid: enable sending of TMFs from aac_hba_send()") Fixes: `423400e64d` ("aacraid: Include HBA direct interface") Tested-by: Noah Misner <nmisner@us.ibm.com> cc: stable@vger.kernel.org Signed-off-by: Dave Carroll <david.carroll@microsemi.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-01 23:27:18 -04:00
Dave Carroll	1c6b41fb92	scsi: aacraid: Insure command thread is not recursively stopped If a recursive IOP_RESET is invoked, usually due to the eh_thread handling errors after the first reset, be sure we flag that the command thread has been stopped to avoid an Oops of the form; [ 336.620256] CPU: 28 PID: 1193 Comm: scsi_eh_0 Kdump: loaded Not tainted 4.14.0-49.el7a.ppc64le #1 [ 336.620297] task: c000003fd630b800 task.stack: c000003fd61a4000 [ 336.620326] NIP: c000000000176794 LR: c00000000013038c CTR: c00000000024bc10 [ 336.620361] REGS: c000003fd61a7720 TRAP: 0300 Not tainted (4.14.0-49.el7a.ppc64le) [ 336.620395] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 22084022 XER: 20040000 [ 336.620435] CFAR: c000000000130388 DAR: 0000000000000000 DSISR: 40000000 SOFTE: 1 [ 336.620435] GPR00: c00000000013038c c000003fd61a79a0 c0000000014c7e00 0000000000000000 [ 336.620435] GPR04: 000000000000000c 000000000000000c 9000000000009033 0000000000000477 [ 336.620435] GPR08: 0000000000000477 0000000000000000 0000000000000000 c008000010f7d940 [ 336.620435] GPR12: c00000000024bc10 c000000007a33400 c0000000001708a8 c000003fe3b881d8 [ 336.620435] GPR16: c000003fe3b88060 c000003fd61a7d10 fffffffffffff000 000000000000001e [ 336.620435] GPR20: 0000000000000001 c000000000ebf1a0 0000000000000001 c000003fe3b88000 [ 336.620435] GPR24: 0000000000000003 0000000000000002 c000003fe3b88840 c000003fe3b887e8 [ 336.620435] GPR28: c000003fe3b88000 c000003fc8181788 0000000000000000 c000003fc8181700 [ 336.620750] NIP [c000000000176794] exit_creds+0x34/0x160 [ 336.620775] LR [c00000000013038c] __put_task_struct+0x8c/0x1f0 [ 336.620804] Call Trace: [ 336.620817] [c000003fd61a79a0] [c000003fe3b88000] 0xc000003fe3b88000 (unreliable) [ 336.620853] [c000003fd61a79d0] [c00000000013038c] __put_task_struct+0x8c/0x1f0 [ 336.620889] [c000003fd61a7a00] [c000000000171418] kthread_stop+0x1e8/0x1f0 [ 336.620922] [c000003fd61a7a40] [c008000010f7448c] aac_reset_adapter+0x14c/0x8d0 [aacraid] [ 336.620959] [c000003fd61a7b00] [c008000010f60174] aac_eh_host_reset+0x84/0x100 [aacraid] [ 336.621010] [c000003fd61a7b30] [c000000000864f24] scsi_try_host_reset+0x74/0x180 [ 336.621046] [c000003fd61a7bb0] [c000000000867ac0] scsi_eh_ready_devs+0xc00/0x14d0 [ 336.625165] [c000003fd61a7ca0] [c0000000008699e0] scsi_error_handler+0x550/0x730 [ 336.632101] [c000003fd61a7dc0] [c000000000170a08] kthread+0x168/0x1b0 [ 336.639031] [c000003fd61a7e30] [c00000000000b528] ret_from_kernel_thread+0x5c/0xb4 [ 336.645971] Instruction dump: [ 336.648743] 384216a0 7c0802a6 fbe1fff8 f8010010 f821ffd1 7c7f1b78 60000000 60000000 [ 336.657056] 39400000 e87f0838 f95f0838 7c0004ac <7d401828> 314affff 7d40192d 40c2fff4 [ 336.663997] -[ end trace 4640cf8d4945ad95 ]- So flag when the thread is stopped by setting the thread pointer to NULL. Signed-off-by: Dave Carroll <david.carroll@microsemi.com> Reviewed-by: Raghava Aditya Renukunta <raghavaaditya.renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-09 21:08:30 -04:00
Linus Torvalds	28bc6fb959	SCSI misc on 20180131 This is mostly updates of the usual driver suspects: arcmsr, scsi_debug, mpt3sas, lpfc, cxlflash, qla2xxx, aacraid, megaraid_sas, hisi_sas. We also have a rework of the libsas hotplug handling to make it more robust, a slew of 32 bit time conversions and fixes, and a host of the usual minor updates and style changes. The biggest potential for regressions is the libsas hotplug changes, but so far they seem stable under testing. Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWnH+5SYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishWxuAP0UvuJp MNR/yU/wv/emSzOc48Ldwd7I0xD2XxSnloGUgwD+IGZZT5yNUQA1THCbm+en4hkB WvyBieQs9qRit+2czd4= =gJMf -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This is mostly updates of the usual driver suspects: arcmsr, scsi_debug, mpt3sas, lpfc, cxlflash, qla2xxx, aacraid, megaraid_sas, hisi_sas. We also have a rework of the libsas hotplug handling to make it more robust, a slew of 32 bit time conversions and fixes, and a host of the usual minor updates and style changes. The biggest potential for regressions is the libsas hotplug changes, but so far they seem stable under testing" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (313 commits) scsi: qla2xxx: Fix logo flag for qlt_free_session_done() scsi: arcmsr: avoid do_gettimeofday scsi: core: Add VENDOR_SPECIFIC sense code definitions scsi: qedi: Drop cqe response during connection recovery scsi: fas216: fix sense buffer initialization scsi: ibmvfc: Remove unneeded semicolons scsi: hisi_sas: fix a bug in hisi_sas_dev_gone() scsi: hisi_sas: directly attached disk LED feature for v2 hw scsi: hisi_sas: devicetree: bindings: add LED feature for v2 hw scsi: megaraid_sas: NVMe passthrough command support scsi: megaraid: use ktime_get_real for firmware time scsi: fnic: use 64-bit timestamps scsi: qedf: Fix error return code in __qedf_probe() scsi: devinfo: fix format of the device list scsi: qla2xxx: Update driver version to 10.00.00.05-k scsi: qla2xxx: Add XCB counters to debugfs scsi: qla2xxx: Fix queue ID for async abort with Multiqueue scsi: qla2xxx: Fix warning for code intentation in __qla24xx_handle_gpdb_event() scsi: qla2xxx: Fix warning during port_name debug print scsi: qla2xxx: Fix warning in qla2x00_async_iocb_timeout() ...	2018-01-31 11:23:28 -08:00
Raghava Aditya Renukunta	75be67cd15	scsi: aacraid: Remove unused rescan variable Remove unused rescan variable. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	fe5237590b	scsi: aacraid: Skip schedule rescan in case of kdump There is a chance of the driver to be stuck in kdump if drives start acting up in kdump discovery process and the kernel decides to send eh resets, which would prompt rescan to be scheduled. Do not perform a rescan in kdump context, since we do not expect a hotplug event during kdump and all the devices are going to go away anyway. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	8a30e50b72	scsi: aacraid: Fix hang while scanning in eh recovery Add back the ability to scan for hotplug changes while eh was in progress. Schedule a rescan for a later time in the eh recovery code and wait for eh to complete in the rescan worker. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	a1367e4ade	scsi: aacraid: Reschedule host scan in case of failure If the driver fails to retrieve information from the fw (could happen when the fw is not fully in its senses), the driver does nothing and change is not processed correctly by the driver Schedule host rescan in case of failure. This is only for SAFW, since the information retrieval failure will happen on SAFW devices. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	8ebaa67fc2	scsi: aacraid: Use hotplug handling function in place of scsi_scan_host Driver uses scsi_scan_host to add new devices in the driver init path, which adds all the fw exposed devices. The drivers resorts to queue command checks to block out commands to _hidden_ devices. Use the hotplug handler code to add new devices during driver init and other areas, this is only for safw. For ARC scsi_scan_host will still apply. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	3395614e48	scsi: aacraid: Block concurrent hotplug event handling Currently driver will attempt to process hotplug events concurrently based on the FW interrupt. Protect safw update function with a scan mutex. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	6f44a22b2c	scsi: aacraid: Merge adapter setup with resolve luns The device hotplug events are processed only after retrieving the updated lun information from the fw. Does not make sense to keep them separate. Merge both the hotplug handling and safw adapter setup code into single function. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	3031c6565f	scsi: aacraid: Refactor resolve luns code and scsi functions Resolve luns checks the if a sdev is already present in the os to figure out if it needs to be removed. Internally the driver exposes HBA on bus 2 even though its bus 1 in the fw. Its mildly confusing. Refactor out the sdev lookup into its function to check if sdev has been added to the kernel or not. Add helper functions to add, remove and put devices based on their fw bus and target number. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	2290678fed	scsi: aacraid: Added macros to help loop through known buses and targets Added macros to loop through the MAX SUPPORTED Buses and Targets. This will make the code a bit easier to read. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	f2d2cabadb	scsi: aacraid: Process hba and container hot plug events in single function The hotplug handler code is duplicated for hba handling and container handling. Merged function to handle hba and container hot plug events into the resolve luns functions. Added a bunch of helper functions to check the validity of a given target and to check if bus, target is container device. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	1d1fec53dc	scsi: aacraid: Merge func to get container information Merge aac_get_containers to setup target function, so that information about all the present devices can be retrieved in one shot. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	fc0fdd9abc	scsi: aacraid: Add target setup helper function Add helper function to setup targets devices and create the base for the upcoming patches Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	b5a475e944	scsi: aacraid: Refactor and rename to make mirror existing changes Rename variables and functions to make bmic identify, report phy luns to make them consistent across code internal existing code bases Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	5480aa1837	scsi: aacraid: Change phy luns function to use common bmic function Edit function that retrieves phy lun information to use common bmic function Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	216ced02fa	scsi: aacraid: Move code to wait for IO completion to shutdown func Ideally driver needs to wait for IO to be submitted or responded to before shutdown. Move code to wait for IO completion into shutdown path Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	95900629fa	scsi: aacraid: Do not remove offlined devices As part of the recovery process, the drivers removes offline devices ( done by the kernel) and then tries to add them back in the rescan code. Removing the device is like taking a sledgehammer to a nail. Set the device as running if it is marked offline. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:41 -05:00
Raghava Aditya Renukunta	c5313ae8e4	scsi: aacraid: Fix hang in kdump Driver attempts to perform a device scan and device add after coming out of reset. At times when the kdump kernel loads and it tries to perform eh recovery, the device scan hangs since its commands are blocked because of the eh recovery. This should have shown up in normal eh recovery path (Should have been obvious) Remove the code that performs scanning.I can live without the rescanning support in the stable kernels but a hanging kdump/eh recovery needs to be fixed. Fixes: `a2d0321dd5` (scsi: aacraid: Reload offlined drives after controller reset) Cc: <stable@vger.kernel.org> Reported-by: Douglas Miller <dougmill@linux.vnet.ibm.com> Tested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Fixes: `a2d0321dd5` (scsi: aacraid: Reload offlined drives after controller reset) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:41 -05:00
Arnd Bergmann	d18539754d	scsi: aacraid: address UBSAN warning regression As reported by Meelis Roos, my previous patch causes an incorrect calculation of the timeout, through an undefined signed integer overflow: [ 12.228155] UBSAN: Undefined behaviour in drivers/scsi/aacraid/commsup.c:2514:49 [ 12.228229] signed integer overflow: [ 12.228283] 964297611 * 250 cannot be represented in type 'long int' The problem is that doing a multiplication with HZ first and then dividing by USEC_PER_SEC worked correctly for 32-bit microseconds, but not for 32-bit nanoseconds, which would require up to 41 bits. This reworks the calculation to first convert the nanoseconds into jiffies, which should give us the same result as before and not overflow. Unfortunately I did not understand the exact intention of the algorithm, in particular the part where we add half a second, so it's possible that there is still a preexisting problem in this function. I added a comment that this would be handled more nicely using usleep_range(), which generally works better for waking up at a particular time than the current schedule_timeout() based implementation. I did not feel comfortable trying to implement that without being sure what the intent is here though. Fixes: `820f188659` ("scsi: aacraid: use timespec64 instead of timeval") Tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-11-29 00:07:20 -05:00
Guilherme G. Piccoli	e4717292dd	scsi: aacraid: Prevent crash in case of free interrupt during scsi EH path As part of the scsi EH path, aacraid performs a reinitialization of the adapter, which encompass freeing resources and IRQs, NULLifying lots of pointers, and then initialize it all over again. We've identified a problem during the free IRQ portion of this path if CONFIG_DEBUG_SHIRQ is enabled on kernel config file. Happens that, in case this flag was set, right after free_irq() effectively clears the interrupt, it checks if it was requested as IRQF_SHARED. In positive case, it performs another call to the IRQ handler on driver. Problem is: since aacraid currently free some resources before freeing the IRQ, once free_irq() path calls the handler again (due to CONFIG_DEBUG_SHIRQ), aacraid crashes due to NULL pointer dereference with the following trace: aac_src_intr_message+0xf8/0x740 [aacraid] __free_irq+0x33c/0x4a0 free_irq+0x78/0xb0 aac_free_irq+0x13c/0x150 [aacraid] aac_reset_adapter+0x2e8/0x970 [aacraid] aac_eh_reset+0x3a8/0x5d0 [aacraid] scsi_try_host_reset+0x74/0x180 scsi_eh_ready_devs+0xc70/0x1510 scsi_error_handler+0x624/0xa20 This patch prevents the crash by changing the order of the deinitialization in this path of aacraid: first we clear the IRQ, then we free other resources. No functional change intended. Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-11-20 22:33:09 -05:00
Guilherme G. Piccoli	bd257b2f3b	scsi: aacraid: Check for PCI state of device in a generic way Commit `16ae9dd35d` ("scsi: aacraid: Fix for excessive prints on EEH") introduced checks about the state of device before any PCI operations in the driver. Basically, this prevents it to perform PCI accesses when device is in the process of recover from a PCI error. In PowerPC, such mechanism is called EEH, and the aforementioned commit introduced checks that are based on EEH-specific primitives for that. The potential problems with this approach are three: first, these checks are "locked" to powerpc only - another archs could have error recovery methods too, like AER in Intel. Also, the powerpc primitives perform expensive FW accesses to validate the precise PCI state of a device. Finally, code becomes more complicated and needs ifdef validation based on arch config being set. So, this patch makes use of generic PCI state checks, which are lightweight and non-dependent of arch configs - also, it makes the code cleaner. Fixes: `16ae9dd35d` ("scsi: aacraid: Fix for excessive prints on EEH") Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-11-20 22:29:10 -05:00
Arnd Bergmann	820f188659	scsi: aacraid: use timespec64 instead of timeval aacraid passes the current time to the firmware in one of two ways, either as year/month/day/... or as 32-bit unsigned seconds. The first one is broken on 32-bit architectures as it cannot go past year 2038. Using timespec64 here makes it behave properly on both 32-bit and 64-bit architectures, and avoids relying on signed integer overflow to pass times into the second interface. The interface used in aac_send_hosttime() however is still problematic in year 2106 when 32-bit seconds overflow. Hopefully we don't have to worry about aacraid by that time. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-11-08 18:08:01 -05:00
Hannes Reinecke	c323eab7a6	scsi: aacraid: add fib flag to mark scsi command callback To correctly identify which fib has a scsi command callback this patch implements a flag FIB_CONTEXT_FLAG_SCSI_CMD. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-07 14:04:00 -04:00
Hannes Reinecke	b60710ec7d	scsi: aacraid: enable sending of TMFs from aac_hba_send() aac_hba_send() will return FAILED for any non-SCSI command requests, failing any TMFs. This patch updates the check to allow TMFs. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-07 14:04:00 -04:00
Raghava Aditya Renukunta	395e5df79a	scsi: aacraid: Remove reference to Series-9 Remove reference to Series-9 HBA and created arc ctrl check function. Signed-off-by: Prasad B Munirathnam <prasad.munirathnam@microsemi.com> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:48:00 -04:00
Raghava Aditya Renukunta	8c41b9b798	scsi: aacraid: Make sure ioctl returns on controller reset Made sure that ioctl commands return in case of a controller reset. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:48:00 -04:00
Raghava Aditya Renukunta	9473ddb2b0	scsi: aacraid: Use correct function to get ctrl health The command thread checks the ctrl health periodically before sending updates to the controller. The function that it uses is aac_check_health which does more than get the health status. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:48:00 -04:00
Raghava Aditya Renukunta	fed820073f	scsi: aacraid: Remove reset support from check_health Check health does not need to reset the ctrl but just return the controller health status. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:47:59 -04:00
Raghava Aditya Renukunta	8105d39d0e	scsi: aacraid: Fix DMAR issues with iommu=pt The driver changed the DMA consistent map after consistent memory was allocated, this invalidated the IOMMU identity mapping. The fix was to make sure that we set the DMA consistent mask setting once depending on the controller card. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:47:59 -04:00
Linus Torvalds	8d5e72dfdf	SCSI misc on 20170503 This update includes the usual round of major driver updates (hisi_sas, ufs, fnic, cxlflash, be2iscsi, ipr, stex). There's also the usual amount of cosmetic and spelling stuff. Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJZClQkAAoJEAVr7HOZEZN4OmkP/j/JJx2ImGzTgil5S8yeSWPY 5Gqb8IK9rCQ+OJgCZYCz3JsLZZnwY4ODZ9tC1lO/3he6VfjIhcEs2/eXbTnEfsZx D3EwWEVR3wYBNZN0d4hQoudVbdCf6UuvsUvM1hDFO7by10qFEs0DqsufccpDlpG/ us96BWf7PgiNzHYSvZIlmsfEDzNDRRg7Dm1NuLOQvXw56zFGsrysCO6Tqg7/ScJm Unz/VlEe1DE7zE9QotsKNCht7xHkmn1vfuva1wqG2wMp7EHf0rKnavRYrWUrxiEy 2ig6GpR7mIHmVHS8PAMNhyS6iNxGQ3e50sAvZdqDlq42P73AEwbrOo5YhgsTJxWT vCpRAzSuHwPOPY3W2Aa1yJ10iOpoPKxXs2xSZuzpcz8XJ3RjHy+l90Y0VT4Jrvzv +dSY1cynshFccZmw2HQanlt1Ly9G3U8xmx8KIbnsIPCdSIQaQQD27H+Ip0YZ0fKt aLmMcQzffma3UP/LPmRAQ45bwx8rLi9M3DWbWOGmSkIRY3etPCXqNuDcC6h5p9TF 4W74oVcELTql/u8ATZNSbdHBsWAg3GATIkAgdqwLTk/CU/0OgGY8epILr3EM2bc6 vVbglwP9DiyVOikTLhVNJdZA97qHjZ1WXNo03eefFTBfPDcUlkZw4j2gufGuNFh2 5vA4C/aSl9uxaLInr3aC =kj7u -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This update includes the usual round of major driver updates (hisi_sas, ufs, fnic, cxlflash, be2iscsi, ipr, stex). There's also the usual amount of cosmetic and spelling stuff" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (155 commits) scsi: qla4xxx: fix spelling mistake: "Tempalate" -> "Template" scsi: stex: make S6flag static scsi: mac_esp: fix to pass correct device identity to free_irq() scsi: aacraid: pci_alloc_consistent() failures on ARM64 scsi: ufs: make ufshcd_get_lists_status() register operation obvious scsi: ufs: use MASK_EE_STATUS scsi: mac_esp: Replace bogus memory barrier with spinlock scsi: fcoe: make fcoe_e_d_tov and fcoe_r_a_tov static scsi: sd_zbc: Do not write lock zones for reset scsi: sd_zbc: Remove superfluous assignments scsi: sd: sd_zbc: Rename sd_zbc_setup_write_cmnd scsi: Improve scsi_get_sense_info_fld scsi: sd: Cleanup sd_done sense data handling scsi: sd: Improve sd_completed_bytes scsi: sd: Fix function descriptions scsi: mpt3sas: remove redundant wmb scsi: mpt: Move scsi_remove_host() out of mptscsih_remove_host() scsi: sg: reset 'res_in_use' after unlinking reserved array scsi: mvumi: remove code handling zero scsi_sg_count(scmd) case scsi: fusion: fix spelling mistake: "Persistancy" -> "Persistency" ...	2017-05-04 12:19:44 -07:00
Mahesh Rajashekhara	f481973d5e	scsi: aacraid: pci_alloc_consistent() failures on ARM64 There were pci_alloc_consistent() failures on ARM64 platform. Use dma_alloc_coherent() with GFP_KERNEL flag DMA memory allocations. Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com> [hch: tweaked indentation, removed memsets] Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-04-26 18:28:06 -04:00
Guilherme G. Piccoli	911e572e98	scsi: aacraid: fix PCI error recovery path During a PCI error recovery, if aac_check_health() is not aware that a PCI error happened and we have an offline PCI channel, it might trigger some errors (like NULL pointer dereference) and inhibit the error recovery process to complete. This patch makes the health check procedure aware of PCI channel issues, and in case of error recovery process, the function aac_adapter_check_health() returns -1 and let the recovery process to complete successfully. This patch was tested on upstream kernel v4.11-rc5 in PowerPC ppc64le architecture with adapter 9005:028d (VID:DID) - the error recovery procedure was able to recover fine. Fixes: `5c63f7f710` ("aacraid: Added EEH support") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-04-11 20:45:59 -04:00
Raghava Aditya Renukunta	e498520ede	scsi: aacraid: Fix potential null access Currently, command threads fails to return ioctls commands for older controller versions, since it returns when all the fibs have been allocated. Another issue is even all the fibs have not been allocated, the correct allocated fibs is not updated nor freed. Fixes: `113156bcea` (scsi: aacraid: Reworked aac_command_thread) Reported-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-03-15 19:00:36 -04:00
Colin Ian King	fbdab3e7fd	scsi: aacraid: remove redundant zero check on ret The check for ret being zero is redundant as a few statements earlier we break out of the while loop if ret is non-zero. Thus we can remove the zero check and also the dead-code non-zero case too. Detected by CoverityScan, CID#1411632 ("Logically Dead Code") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-27 22:19:43 -05:00
Raghava Aditya Renukunta	a56e574067	scsi: aacraid: Fixed expander hotplug for SMART family Current driver Hotplug processing code skips over Enclosure channel, therefore any addition/removal of expander enclosure is not processed. Additionally device addition code relies on older device type, which prevents the hotplug of adapter expanders. Fixed by removing code that skips over Enclosure channels and using the latest device type for addition or removal or enclosure expanders. Fixes: `6223a39fe6` (scsi: aacraid: Added support for hotplug) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-23 17:02:15 -05:00
Raghava Aditya Renukunta	d844752e18	scsi: aacraid: Fix a potential spinlock double unlock bug The driver does not unlock the reply queue spin lock after handling SMART adapter events. Instead it might attempt to unlock an already unlocked spin lock. Fixed by making sure the driver locks the spin lock before freeing it. Thank you dan for finding this issue out. Fixes: `6223a39fe6` (scsi: aacraid: Added support for hotplug) Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:42 -05:00
Raghava Aditya Renukunta	a2d0321dd5	scsi: aacraid: Reload offlined drives after controller reset During the IOP reset stress testing, it was found that the drives can be marked offline when the adapter controller crashes and IO's are running in parallel. When the controller does come back from the reset, the drive that is marked offline is not exposed. Fixed by removing and adding drives that are marked offline. In addition invoke a scsi host bus rescan to capture any additional configuration changes. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:42 -05:00
Raghava Aditya Renukunta	849ac6a591	scsi: aacraid: Skip wellness sync on controller failure aac_command_thread checks on the health of controller periodically, using aac_check_health. If the status is an error state KERNEL_PANIC or anything else. The driver will attempt to restart the adapter, but the response is not checked in aac_command_thread. This allows the periodic sync to go thru and lead the driver to a hung state. Fixed by terminating the periodic loop(intended per original design), if the controller is not restored to a healthy state. Cc: stable@vger.kernel.org Fixes: `3d77d84044` (scsi: aacraid: Added support for periodic wellness sync) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:42 -05:00
Raghava Aditya Renukunta	1bff5abca6	scsi: aacraid: Fix memory leak in fib init path aac_fib_map_free frees misaligned fib dma memory, additionally it does not free up the whole memory. Fixed by changing the code to free up the correct and full memory allocation. Cc: stable@vger.kernel.org Fixes: `e8b12f0fb8` ([SCSI] aacraid: Add new code for PMC-Sierra's SRC based controller family) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:41 -05:00
Raghava Aditya Renukunta	a0c6143e95	scsi: aacraid: Prevent E3 lockup when deleting units Arrconf management utility at times sends fibs with AdapterProcessed set in its fibs. This causes the controller to panic and lockup. Fixed by failing the commands that have AdapterProcessed set in its flag. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:41 -05:00
Raghava Aditya Renukunta	16ae9dd35d	scsi: aacraid: Fix for excessive prints on EEH This issue showed up on a kdump debug(single CPU on powerkvm), when EEH errors rendered the adapter unusable. The driver correctly detected the issue and attempted to restart the controller, in doing so the driver attempted to read the status registers of the controller. This triggered additional eeh errors which continued for a good 6 minutes. Fixed by returning without waiting when EEH error is reported. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:41 -05:00
Raghava Aditya Renukunta	1c68856e6e	scsi: aacraid: Fix camel case Replaced camel case with snake case for init supported options. Suggested-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:41 -05:00
Raghava Aditya Renukunta	f4babba0af	scsi: aacraid: Update copyrights Added new copyright messages Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Dave Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-03 10:35:04 -05:00
Raghava Aditya Renukunta	3136432956	scsi: aacraid: Added new IWBR reset Added a new IWBR soft reset type, reworked the IOP reset interface for a bit. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Dave Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-03 10:35:04 -05:00
Raghava Aditya Renukunta	423400e64d	scsi: aacraid: Include HBA direct interface Added support to send direct pasthru srb commands from management utilty to the controller. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Dave Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-03 10:35:03 -05:00
Raghava Aditya Renukunta	6223a39fe6	scsi: aacraid: Added support for hotplug Added support for drive hotplug add and removal Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Dave Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-03 10:35:03 -05:00
Raghava Aditya Renukunta	3d77d84044	scsi: aacraid: Added support for periodic wellness sync This patch adds a new functions that periodically sync the time of host to the adapter. In addition also informs the adapter that the driver is alive and kicking. Only applicable to the HBA1000 and SMARTIOC2000. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Dave Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-03 10:35:03 -05:00

1 2 3

135 Commits