linux

Commit Graph

Author	SHA1	Message	Date
Steffen Maier	2d9a2c5f58	scsi: zfcp: Fix use-after-free in request timeout handlers Before v4.15 commit `75492a5156` ("s390/scsi: Convert timers to use timer_setup()"), we intentionally only passed zfcp_adapter as context argument to zfcp_fsf_request_timeout_handler(). Since we only trigger adapter recovery, it was unnecessary to sync against races between timeout and (late) completion. Likewise, we only passed zfcp_erp_action as context argument to zfcp_erp_timeout_handler(). Since we only wakeup an ERP action, it was unnecessary to sync against races between timeout and (late) completion. Meanwhile the timeout handlers get timer_list as context argument and do a timer-specific container-of to zfcp_fsf_req which can have been freed. Fix it by making sure that any request timeout handlers, that might just have started before del_timer(), are completed by using del_timer_sync() instead. This ensures the request free happens afterwards. Space time diagram of potential use-after-free: Basic idea is to have 2 or more pending requests whose timeouts run out at almost the same time. req 1 timeout ERP thread req 2 timeout ---------------- ---------------- --------------------------------------- zfcp_fsf_request_timeout_handler fsf_req = from_timer(fsf_req, t, timer) adapter = fsf_req->adapter zfcp_qdio_siosl(adapter) zfcp_erp_adapter_reopen(adapter,...) zfcp_erp_strategy ... zfcp_fsf_req_dismiss_all list_for_each_entry_safe zfcp_fsf_req_complete 1 del_timer 1 zfcp_fsf_req_free 1 zfcp_fsf_req_complete 2 zfcp_fsf_request_timeout_handler del_timer 2 fsf_req = from_timer(fsf_req, t, timer) zfcp_fsf_req_free 2 adapter = fsf_req->adapter ^^^^^^^ already freed Link: https://lore.kernel.org/r/20200813152856.50088-1-maier@linux.ibm.com Fixes: `75492a5156` ("s390/scsi: Convert timers to use timer_setup()") Cc: <stable@vger.kernel.org> #4.15+ Suggested-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-08-17 22:12:09 -04:00
Linus Torvalds	dfdf16ecfd	SCSI misc on 20200806 This series consists of the usual driver updates (ufs, qla2xxx, tcmu, lpfc, hpsa, zfcp, scsi_debug) and minor bug fixes. We also have a huge docbook fix update like most other subsystems and no major update to the core (the few non trivial updates are either minor fixes or removing an unused feature [scsi_sdb_cache]). Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXyxq1yYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishSoAAQChZ4i8 ZqYW3pL33JO3fA8vdjvLuyC489Hj4wzIsl3/bQEAxYyM6BSLvMoLWR2Plq/JmTLm 4W/LDptarpTiDI3NuDc= =4b0W -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This consists of the usual driver updates (ufs, qla2xxx, tcmu, lpfc, hpsa, zfcp, scsi_debug) and minor bug fixes. We also have a huge docbook fix update like most other subsystems and no major update to the core (the few non trivial updates are either minor fixes or removing an unused feature [scsi_sdb_cache])" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (307 commits) scsi: scsi_transport_srp: Sanitize scsi_target_block/unblock sequences scsi: ufs-mediatek: Apply DELAY_AFTER_LPM quirk to Micron devices scsi: ufs: Introduce device quirk "DELAY_AFTER_LPM" scsi: virtio-scsi: Correctly handle the case where all LUNs are unplugged scsi: scsi_debug: Implement tur_ms_to_ready parameter scsi: scsi_debug: Fix request sense scsi: lpfc: Fix typo in comment for ULP scsi: ufs-mediatek: Prevent LPM operation on undeclared VCC scsi: iscsi: Do not put host in iscsi_set_flashnode_param() scsi: hpsa: Correct ctrl queue depth scsi: target: tcmu: Make TMR notification optional scsi: target: tcmu: Implement tmr_notify callback scsi: target: tcmu: Fix and simplify timeout handling scsi: target: tcmu: Factor out new helper ring_insert_padding scsi: target: tcmu: Do not queue aborted commands scsi: target: tcmu: Use priv pointer in se_cmd scsi: target: Add tmr_notify backend function scsi: target: Modify core_tmr_abort_task() scsi: target: iscsi: Fix inconsistent debug message scsi: target: iscsi: Fix login error when receiving ...	2020-08-06 16:50:07 -07:00
Julian Wiedmann	c3bfffa5ec	scsi: zfcp: Avoid benign overflow of the Request Queue's free-level zfcp_qdio_send() and zfcp_qdio_int_req() run concurrently, adding and completing SBALs on the Request Queue. There's a theoretical race where zfcp_qdio_int_req() completes a number of SBALs & increments the queue's free-level _before_ zfcp_qdio_send() was able to decrement it. This can cause ->req_q_free to momentarily hold a value larger than QDIO_MAX_BUFFERS_PER_Q. Luckily zfcp_qdio_send() is always called under ->req_q_lock, and all readers of the free-level also take this lock. So we can trust that zfcp_qdio_send() will clean up such a temporary overflow before anyone can actually observe it. But it's still confusing and annoying to worry about. So adjust the code to avoid this race. Link: https://lore.kernel.org/r/7f61f59a1f8db270312e64644f9173b8f1ac895f.1593780621.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-07-08 00:50:56 -04:00
Julian Wiedmann	6bcb7c171a	scsi: zfcp: Replace open-coded list move Instead of manually moving each element of the unit and port lists into our temporary on-stack lists, splice them over in one go. Link: https://lore.kernel.org/r/cacb179f49ece50fd4dce119c61252d632cdc1d4.1593780621.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-07-08 00:50:55 -04:00
Julian Wiedmann	b43cdb5ac8	scsi: zfcp: Clean up zfcp_erp_action_ready() We already maintain a pointer to act->adapter. Use it consistently to avoid any confusion about whose ->erp_ready_head and ->erp_ready_wq we are accessing. Link: https://lore.kernel.org/r/d1bb04322f240dee32f4c4a551bc93bc736f4b01.1593780621.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-07-08 00:50:55 -04:00
Julian Wiedmann	459ad085d8	scsi: zfcp: Fix an outdated comment for zfcp_qdio_send() zfcp no longer uses the qdio PCI flag, update the comment. Link: https://lore.kernel.org/r/6717c26fc986bff8776d110e27c199b523684c63.1593780621.git.bblock@linux.ibm.com Fixes: `21ddaa53f9` ("[SCSI] zfcp: Remove PCI flag") Reviewed-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-07-08 00:50:53 -04:00
George Spelvin	0cd0e57ec8	scsi: zfcp: Use prandom_u32_max() for backoff We don't need crypto-grade random numbers for randomized backoffs. Instead use prandom_u32_max(ep_ro) which generates a pseudo-random number uniformly distributed in the interval [0, ep_ro). Link: https://lore.kernel.org/r/8fc7c4c4069ff1783f4a9ccd84a923f581a09ec5.1593780621.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: George Spelvin <lkml@sdf.org> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-07-08 00:50:52 -04:00
Steffen Maier	936e6b85da	scsi: zfcp: Fix panic on ERP timeout for previously dismissed ERP action Suppose that, for unrelated reasons, FSF requests on behalf of recovery are very slow and can run into the ERP timeout. In the case at hand, we did adapter recovery to a large degree. However due to the slowness a LUN open is pending so the corresponding fc_rport remains blocked. After fast_io_fail_tmo we trigger close physical port recovery for the port under which the LUN should have been opened. The new higher order port recovery dismisses the pending LUN open ERP action and dismisses the pending LUN open FSF request. Such dismissal decouples the ERP action from the pending corresponding FSF request by setting zfcp_fsf_req->erp_action to NULL (among other things) [zfcp_erp_strategy_check_fsfreq()]. If now the ERP timeout for the pending open LUN request runs out, we must not use zfcp_fsf_req->erp_action in the ERP timeout handler. This is a problem since v4.15 commit `75492a5156` ("s390/scsi: Convert timers to use timer_setup()"). Before that we intentionally only passed zfcp_erp_action as context argument to zfcp_erp_timeout_handler(). Note: The lifetime of the corresponding zfcp_fsf_req object continues until a (late) response or an (unrelated) adapter recovery. Just like the regular response path ignores dismissed requests [zfcp_fsf_req_complete() => zfcp_fsf_protstatus_eval() => return early] the ERP timeout handler now needs to ignore dismissed requests. So simply return early in the ERP timeout handler if the FSF request is marked as dismissed in its status flags. To protect against the race where zfcp_erp_strategy_check_fsfreq() dismisses and sets zfcp_fsf_req->erp_action to NULL after our previous status flag check, return early if zfcp_fsf_req->erp_action is NULL. After all, the former ERP action does not need to be woken up as that was already done as part of the dismissal above [zfcp_erp_action_dismiss()]. This fixes the following panic due to kernel page fault in IRQ context: Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 0000000000000000 TEID: 0000000000000483 Fault in home space mode while using kernel ASCE. AS:000009859238c00b R2:00000e3e7ffd000b R3:00000e3e7ffcc007 S:00000e3e7ffd7000 P:000000000000013d Oops: 0004 ilc:2 [#1] SMP Modules linked in: ... CPU: 82 PID: 311273 Comm: stress Kdump: loaded Tainted: G E X ... Hardware name: IBM 8561 T01 701 (LPAR) Krnl PSW : 0404c00180000000 001fffff80549be0 (zfcp_erp_notify+0x40/0xc0 [zfcp]) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 Krnl GPRS: 0000000000000080 00000e3d00000000 00000000000000f0 0000000000030000 000000010028e700 000000000400a39c 000000010028e700 00000e3e7cf87e02 0000000010000000 0700098591cb67f0 0000000000000000 0000000000000000 0000033840e9a000 0000000000000000 001fffe008d6bc18 001fffe008d6bbc8 Krnl Code: 001fffff80549bd4: a7180000 lhi %r1,0 001fffff80549bd8: 4120a0f0 la %r2,240(%r10) #001fffff80549bdc: a53e0003 llilh %r3,3 >001fffff80549be0: ba132000 cs %r1,%r3,0(%r2) 001fffff80549be4: a7740037 brc 7,1fffff80549c52 001fffff80549be8: e320b0180004 lg %r2,24(%r11) 001fffff80549bee: e31020e00004 lg %r1,224(%r2) 001fffff80549bf4: 412020e0 la %r2,224(%r2) Call Trace: [<001fffff80549be0>] zfcp_erp_notify+0x40/0xc0 [zfcp] [<00000985915e26f0>] call_timer_fn+0x38/0x190 [<00000985915e2944>] expire_timers+0xfc/0x190 [<00000985915e2ac4>] run_timer_softirq+0xec/0x218 [<0000098591ca7c4c>] __do_softirq+0x144/0x398 [<00000985915110aa>] do_softirq_own_stack+0x72/0x88 [<0000098591551b58>] irq_exit+0xb0/0xb8 [<0000098591510c6a>] do_IRQ+0x82/0xb0 [<0000098591ca7140>] ext_int_handler+0x128/0x12c [<0000098591722d98>] clear_subpage.constprop.13+0x38/0x60 ([<000009859172ae4c>] clear_huge_page+0xec/0x250) [<000009859177e7a2>] do_huge_pmd_anonymous_page+0x32a/0x768 [<000009859172a712>] __handle_mm_fault+0x88a/0x900 [<000009859172a860>] handle_mm_fault+0xd8/0x1b0 [<0000098591529ef6>] do_dat_exception+0x136/0x3e8 [<0000098591ca6d34>] pgm_check_handler+0x1c8/0x220 Last Breaking-Event-Address: [<001fffff80549c88>] zfcp_erp_timeout_handler+0x10/0x18 [zfcp] Kernel panic - not syncing: Fatal exception in interrupt Link: https://lore.kernel.org/r/20200623140242.98864-1-maier@linux.ibm.com Fixes: `75492a5156` ("s390/scsi: Convert timers to use timer_setup()") Cc: <stable@vger.kernel.org> #4.15+ Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-06-24 00:01:09 -04:00
Benjamin Block	d0dff2ac98	scsi: zfcp: Move allocation of the shost object to after xconf- and xport-data At the moment we allocate and register the Scsi_Host object corresponding to a zfcp adapter (FCP device) very early in the life cycle of the adapter - even before we fully discover and initialize the underlying firmware/hardware. This had the advantage that we could already use the Scsi_Host object, and fill in all its information during said discover and initialize. Due to commit `737eb78e82` ("block: Delay default elevator initialization") (first released in v5.4), we noticed a regression that would prevent us from using any storage volume if zfcp is configured with support for DIF or DIX (zfcp.dif=1 \|\| zfcp.dix=1). Doing so would result in an illegal memory access as soon as the first request is sent with such an configuration. As example for a crash resulting from this: scsi host0: scsi_eh_0: sleeping scsi host0: zfcp qdio: 0.0.1900 ZFCP on SC 4bd using AI:1 QEBSM:0 PRI:1 TDD:1 SIGA: W AP scsi 0:0:0:0: scsi scan: INQUIRY pass 1 length 36 Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 0000000000000000 TEID: 0000000000000483 Fault in home space mode while using kernel ASCE. AS:0000000035c7c007 R3:00000001effcc007 S:00000001effd1000 P:000000000000003d Oops: 0004 ilc:3 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: ... CPU: 1 PID: 783 Comm: kworker/u760:5 Kdump: loaded Not tainted 5.6.0-rc2-bb-next+ #1 Hardware name: ... Workqueue: scsi_wq_0 fc_scsi_scan_rport [scsi_transport_fc] Krnl PSW : 0704e00180000000 000003ff801fcdae (scsi_queue_rq+0x436/0x740 [scsi_mod]) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 Krnl GPRS: 0fffffffffffffff 0000000000000000 0000000187150120 0000000000000000 000003ff80223d20 000000000000018e 000000018adc6400 0000000187711000 000003e0062337e8 00000001ae719000 0000000187711000 0000000187150000 00000001ab808100 0000000187150120 000003ff801fcd74 000003e0062336a0 Krnl Code: 000003ff801fcd9e: e310a35c0012 lt %r1,860(%r10) 000003ff801fcda4: a7840010 brc 8,000003ff801fcdc4 #000003ff801fcda8: e310b2900004 lg %r1,656(%r11) >000003ff801fcdae: d71710001000 xc 0(24,%r1),0(%r1) 000003ff801fcdb4: e310b2900004 lg %r1,656(%r11) 000003ff801fcdba: 41201018 la %r2,24(%r1) 000003ff801fcdbe: e32010000024 stg %r2,0(%r1) 000003ff801fcdc4: b904002b lgr %r2,%r11 Call Trace: [<000003ff801fcdae>] scsi_queue_rq+0x436/0x740 [scsi_mod] ([<000003ff801fcd74>] scsi_queue_rq+0x3fc/0x740 [scsi_mod]) [<00000000349c9970>] blk_mq_dispatch_rq_list+0x390/0x680 [<00000000349d1596>] blk_mq_sched_dispatch_requests+0x196/0x1a8 [<00000000349c7a04>] __blk_mq_run_hw_queue+0x144/0x160 [<00000000349c7ab6>] __blk_mq_delay_run_hw_queue+0x96/0x228 [<00000000349c7d5a>] blk_mq_run_hw_queue+0xd2/0xe0 [<00000000349d194a>] blk_mq_sched_insert_request+0x192/0x1d8 [<00000000349c17b8>] blk_execute_rq_nowait+0x80/0x90 [<00000000349c1856>] blk_execute_rq+0x6e/0xb0 [<000003ff801f8ac2>] __scsi_execute+0xe2/0x1f0 [scsi_mod] [<000003ff801fef98>] scsi_probe_and_add_lun+0x358/0x840 [scsi_mod] [<000003ff8020001c>] __scsi_scan_target+0xc4/0x228 [scsi_mod] [<000003ff80200254>] scsi_scan_target+0xd4/0x100 [scsi_mod] [<000003ff802d8b96>] fc_scsi_scan_rport+0x96/0xc0 [scsi_transport_fc] [<0000000034245ce8>] process_one_work+0x458/0x7d0 [<00000000342462a2>] worker_thread+0x242/0x448 [<0000000034250994>] kthread+0x15c/0x170 [<0000000034e1979c>] ret_from_fork+0x30/0x38 INFO: lockdep is turned off. Last Breaking-Event-Address: [<000003ff801fbc36>] scsi_add_cmd_to_list+0x9e/0xa8 [scsi_mod] Kernel panic - not syncing: Fatal exception: panic_on_oops While this issue is exposed by the commit named above, this is only by accident. The real issue exists for longer already - basically since it's possible to use blk-mq via scsi-mq, and blk-mq pre-allocates all requests for a tag-set during initialization of the same. For a given Scsi_Host object this is done when adding the object to the midlayer (`scsi_add_host()` and such). In `scsi_mq_setup_tags()` the midlayer calculates how much memory is required for a single scsi_cmnd, and its additional data, which also might include space for additional protection data - depending on whether the Scsi_Host has any form of protection capabilities (`scsi_host_get_prot()`). The problem is now thus, because zfcp does this step before we actually know whether the firmware/hardware has these capabilities, we don't set any protection capabilities in the Scsi_Host object. And so, no space is allocated for additional protection data for requests in the Scsi_Host tag-set. Once we go through discover and initialize the FCP device firmware/hardware fully (this is done via the firmware commands "Exchange Config Data" and "Exchange Port Data") we find out whether it actually supports DIF and DIX, and we set the corresponding capabilities in the Scsi_Host object (in `zfcp_scsi_set_prot()`). Now the Scsi_Host potentially has protection capabilities, but the already allocated requests in the tag-set don't have any space allocated for that. When we then trigger target scanning or add scsi_devices manually, the midlayer will use requests from that tag-set, and before sending most requests, it will also call `scsi_mq_prep_fn()`. To prepare the scsi_cmnd this function will check again whether the used Scsi_Host has any protection capabilities - and now it potentially has - and if so, it will try to initialize the assumed to be preallocated structures and thus it causes the crash, like shown above. Before delaying the default elevator initialization with the commit named above, we always would also allocate an elevator for any scsi_device before ever sending any requests - in contrast to now, where we do it after device-probing. That elevator in turn would have its own tag-set, and that is initialized after we went through discovery and initialization of the underlying firmware/hardware. So requests from that tag-set can be allocated properly, and if used - unless the user changes/disabled the default elevator - this would hide the underlying issue. To fix this for any configuration - with or without an elevator - we move the allocation and registration of the Scsi_Host object for a given FCP device to after the first complete discovery and initialization of the underlying firmware/hardware. By doing that we can make all basic properties of the Scsi_Host known to the midlayer by the time we call `scsi_add_host()`, including whether we have any protection capabilities. To do that we have to delay all the accesses that we would have done in the past during discovery and initialization, and do them instead once we are finished with it. The previous patches ramp up to this by fencing and factoring out all these accesses, and make it possible to re-do them later on. In addition we make also use of the diagnostic buffers we recently added with commit `92953c6e0a` ("scsi: zfcp: signal incomplete or error for sync exchange config/port data") commit `7e418833e6` ("scsi: zfcp: diagnostics buffer caching and use for exchange port data") commit `088210233e` ("scsi: zfcp: add diagnostics buffer for exchange config data") (first released in v5.5), because these already cache all the information we need for that "re-do operation" - the information cached are always updated during xconf or xport data, so it won't be stale. In addition to the move and re-do, this patch also updates the function-documentation of `zfcp_scsi_adapter_register()` and changes how it reports if a Scsi_Host object already exists. In that case future recovery-operations can skip this step completely and behave much like they would do in the past - zfcp does not release a once allocated Scsi_Host object unless the corresponding FCP device is deconstructed completely. Link: https://lore.kernel.org/r/030dd6da318bbb529f0b5268ec65cebcd20fc0a3.1588956679.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-11 23:19:52 -04:00
Benjamin Block	71159b6ecb	scsi: zfcp: Fence early sysfs interfaces for accesses of shost objects When setting an adapter online for the first time, we also create a couple of entries for it in the sysfs device tree. This is also true even if the adapter has not yet ever gone successfully through exchange config and exchange port data. When moving the scsi host object allocation and registration to after the first exchange config and exchange port data, this make the `port_rescan` attribute susceptible to invalid pointer-dereferences of the shost field before the adapter is fully initialized. When written to, it schedules a `scan_work` item that will in turn make use of the associated fibre channel host object to check the topology used for this FCP device. Because scanning for remote ports can't be done successfully without completing exchange config and exchange port data first, we can simply fence `port_rescan`, and so prevent the illegal access. As with cases where we can't get a reference to the adapter, we also return -ENODEV here. Applications need to handle that errno today already. After a successful allocation of the scsi host object nothing changes in the work flow. Link: https://lore.kernel.org/r/ef65366d309993ca91b6917727590ca7ca166c8f.1588956679.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-11 23:19:50 -04:00
Benjamin Block	971f2abb4c	scsi: zfcp: Fence adapter status propagation for common statuses Common status flags that all main objects - adapter, port, and unit - support are propagated to sub-objects when set or cleared. For instance, when setting the status ZFCP_STATUS_COMMON_ERP_INUSE for an adapter object, we will propagate this to all its child ports and units - same for when clearing a common status flag. Units of an adapter object are enumerated via __shost_for_each_device() over the scsi host object of the corresponding adapter. Once we move the scsi host object allocation and registration to after the first exchange config and exchange port data, this won't be possible for cases where we set or clear common statuses during the very first adapter recovery. But since we won't have any port or unit objects yet at that point of time, we can just fence the status propagation for cases where the scsi host object is not yet set in the adapter object. It won't change any effective status propagations, but will prevent us from dereferencing invalid pointers. For any later point in the work flow the scsi host object will be set and thus nothing is changed then. Link: https://lore.kernel.org/r/f51fe5f236a1e3d1ce53379c308777561bfe35e1.1588956679.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-11 23:19:50 -04:00
Benjamin Block	ac007adc4d	scsi: zfcp: Move p-t-p port allocation to after xport data When doing the very first adapter recovery - initialization - for a FCP device in a point-to-point topology we also allocate the port object corresponding to the attached remote port, and trigger a port recovery for it that will run after the adapter recovery finished. Right now this happens right after we finished with the exchange config data command, and uses the fibre channel host object corresponding to the FCP device to determine whether a point-to-point topology is used. When moving the scsi host object allocation and registration - and thus also the fibre channel host object allocation - to after the first exchange config and exchange port data, this use of the fc_host object is not possible anymore at that point in the work flow. But the allocation and recovery trigger doesn't have notable side-effects on the following exchange port data processing, so we can move those to after xport data, and thus also to after the scsi host object allocation, once we move it. Then the fc_host object can be used again, like it is now. For any further adapter recoveries this doesn't change anything, because at that point the port object already exists and recovery is triggered elsewhere for existing port objects. Link: https://lore.kernel.org/r/73e5d4ac21e2b37bf0c3ca8e530bc5a5c6e74f8f.1588956679.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-11 23:19:49 -04:00
Benjamin Block	990486f3a8	scsi: zfcp: Fence fc_host updates during link-down handling When receiving a notification that a FCP device lost its local link we usually update the fibre channel host object which represents that FCP device to reflect that. This notification/information can also surface when the FCP device is running through adapter recovery (exchange config and exchange port data return incomplete). When moving the scsi host object allocation and registration - and thus also the fibre channel host object allocation - to after the first exchange config and exchange port data, and this happens during the very first adapter recovery, these updates can not be done until after the scsi host object is allocated. Reorder the fc_host updates in zfcp_fsf_fc_host_link_down() so that they only happen after a check of whether the scsi host object is already allocated or not. During the first adapter recovery this will cause the skip of these updates if a link-down condition is detected, but we can repeat them after we allocated the scsi host object, if necessary. For any further link-down handling the only changes in the work flow are the slightly reordered assignments in zfcp_fsf_fc_host_link_down(). Link: https://lore.kernel.org/r/f841f2cda61dcd7b8549910c44e1831927459edf.1588956679.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-11 23:19:48 -04:00
Benjamin Block	52e61fde5e	scsi: zfcp: Move fc_host updates during xport data handling into fenced function When executing exchange port data for a FCP device for the first time, or after an adapter recovery, we update several properties of the fibre channel host object which represents that FCP device. When moving the scsi host object allocation and registration - and thus also the fibre channel host object allocation - to after the first exchange config and exchange port data, this is not possible for the former case. Move all these update into separate, and fenced function that first checks whether the scsi host object already exists or not, before making the updates. During the first ever exchange port data in the adapter life cycle this will make the exchange port data handler skip over this update step, but we can repeat it later, after we allocated the scsi host object. For any further recovery of that adapter the work flow is only changed slightly because then the scsi host object already exists and we don't free it until we release the adapter completely at the end of its life cycle. Link: https://lore.kernel.org/r/ae454c2dc6da0b02907c489af91d0b211d331825.1588956679.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-11 23:19:48 -04:00
Benjamin Block	bd1684817d	scsi: zfcp: Move shost updates during xconfig data handling into fenced function When executing exchange config data for a FCP device for the first time, or after an adapter recovery, we update several properties of the scsi host or fibre channel host object that represent that FCP device. When moving the scsi host object allocation and registration - and thus also the fibre channel host object allocation - to after the first exchange config and exchange port data, this is not possible for the former case. Move all these update into separate, and fenced function that first checks whether the scsi host object already exists or not, before making the updates. During the first ever exchange config data in the adapter life cycle this will make the exchange config data handler skip over this update step, but we can repeat it later, after we allocated the scsi host object. For any further recovery of that adapter the work flow is only changed slightly because then the scsi host object already exists and we don't free it until we release the adapter completely at the end of its life cycle. Link: https://lore.kernel.org/r/5fc3f4d38d4334f7aa595497c6f7865fb1102e0f.1588956679.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-11 23:19:47 -04:00
Benjamin Block	978857c7e3	scsi: zfcp: Move shost modification after QDIO (re-)open into fenced function When establishing and activating the QDIO queue pair for a FCP device for the first time, or after an adapter recovery, we publish some of its characteristics to the scsi host object representing that FCP device. When moving the scsi host object allocation and registration to after the first exchange config and exchange port data, this is not possible for the former case - QDIO open for the first time - because that happens before exchange config and exchange port data. Move the scsi host object update into a fenced function that checks whether the object already exists or not. This way we can repeat that step later, once we are past the allocation. Once the first recovery succeeds we don't release the scsi host object anymore, so further recoveries do work as before. Link: https://lore.kernel.org/r/a214ebf508f71e3690113e3e90edab1cea0e24e3.1588956679.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-11 23:19:46 -04:00
Linus Torvalds	93f3321f65	SCSI misc on 20200410 This is a batch of changes that didn't make it in the initial pull request because the lpfc series had to be rebased to redo an incorrect split. It's basically driver updates to lpfc, target, bnx2fc and ufs with the rest being minor updates except the sr_block_release one which fixes a use after free introduced by the removal of the global mutex in the first patch set. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXpC3hSYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishRTaAP9umhxu 8rRnJ5hsxXRmxOUzO5BGe403ffcBeAiEKQ2n3gEAjeoxZAaqKuDDDRfXyRnBpt9Z QuBrgpm1gdXrJT5DDj4= =+4Qg -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull more SCSI updates from James Bottomley: "This is a batch of changes that didn't make it in the initial pull request because the lpfc series had to be rebased to redo an incorrect split. It's basically driver updates to lpfc, target, bnx2fc and ufs with the rest being minor updates except the sr_block_release one which fixes a use after free introduced by the removal of the global mutex in the first patch set" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (35 commits) scsi: core: Add DID_ALLOC_FAILURE and DID_MEDIUM_ERROR to hostbyte_table scsi: ufs: Use ufshcd_config_pwr_mode() when scaling gear scsi: bnx2fc: fix boolreturn.cocci warnings scsi: zfcp: use fallthrough; scsi: aacraid: do not overwrite retval in aac_reset_adapter() scsi: sr: Fix sr_block_release() scsi: aic7xxx: Remove more FreeBSD-specific code scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug scsi: ufs: set device as active power mode after resetting device scsi: iscsi: Report unbind session event when the target has been removed scsi: lpfc: Change default SCSI LUN QD to 64 scsi: libfc: rport state move to PLOGI if all PRLI retry exhausted scsi: libfc: If PRLI rejected, move rport to PLOGI state scsi: bnx2fc: Update the driver version to 2.12.13 scsi: bnx2fc: Fix SCSI command completion after cleanup is posted scsi: bnx2fc: Process the RQE with CQE in interrupt context scsi: target: use the stack for XCOPY passthrough cmds scsi: target: increase XCOPY I/O size scsi: target: avoid per-loop XCOPY buffer allocations scsi: target: drop xcopy DISK BLOCK LENGTH debug ...	2020-04-10 12:21:11 -07:00
Julian Wiedmann	1da1092dbf	s390/qdio: remove cdev from init_data It's no longer needed. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2020-04-06 13:13:50 +02:00
Julian Wiedmann	d8564e19da	s390/qdio: allow for non-contiguous SBAL array in init_data Upper-layer drivers allocate their SBALs by calling qdio_alloc_buffers() for each individual queue. But when later passing the SBAL addresses to qdio_establish(), they need to be in a single array of pointers. So if the driver uses multiple Input or Output queues, it needs to allocate a temporary array just to present all its SBAL pointers in this layout. This patch slightly changes the format of the QDIO initialization data, so that drivers can pass a per-queue array where each element points to a queue's SBAL array. zfcp doesn't use multiple queues, so the impact there is trivial. For qeth this brings a nice reduction in complexity, and removes a page-sized allocation. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2020-04-06 13:13:50 +02:00
Julian Wiedmann	ad96401cdb	zfcp: inline zfcp_qdio_setup_init_data() In preparation for a subsequent patch, move the setup of init_data into the only caller. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2020-04-06 13:13:50 +02:00
Julian Wiedmann	3db1db93e3	s390/qdio: cleanly split alloc and establish All that qdio_allocate() actually uses from the init_data is the cdev, and the number of Input and Output Queues. Have the driver pass those as parameters, and defer the init_data processing into qdio_establish(). This includes writing per-device(!) trace entries, and most of the sanity checks. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2020-04-06 13:13:50 +02:00
Linus Torvalds	79f51b7b9c	SCSI misc on 20200402 update changing all our txt files to rst ones. Excluding that, we have the usual driver updates (qla2xxx, ufs, lpfc, zfcp, ibmvfc, pm80xx, aacraid), a treewide update for scnprintf and some other minor updates. The major core update is Hannes moving functions out of the aacraid driver and into the core. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXoYKiyYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishSasAP4iGwSB Y8tFaZgWadu76+wj5MdqTBoXdhnIuFF0rZG3pQEAiIKdsfQlbSFdm75+gUtx5hG/ GOilX/pJczTRJDCGNis= =g7Sk -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This series has a huge amount of churn because it pulls in Mauro's doc update changing all our txt files to rst ones. Excluding that, we have the usual driver updates (qla2xxx, ufs, lpfc, zfcp, ibmvfc, pm80xx, aacraid), a treewide update for scnprintf and some other minor updates. The major core change is Hannes moving functions out of the aacraid driver and into the core" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (223 commits) scsi: aic7xxx: aic97xx: Remove FreeBSD-specific code scsi: ufs: Do not rely on prefetched data scsi: dc395x: remove dc395x_bios_param scsi: libiscsi: Fix error count for active session scsi: hpsa: correct race condition in offload enabled scsi: message: fusion: Replace zero-length array with flexible-array member scsi: qedi: Add PCI shutdown handler support scsi: qedi: Add MFW error recovery process scsi: ufs: Enable block layer runtime PM for well-known logical units scsi: ufs-qcom: Override devfreq parameters scsi: ufshcd: Let vendor override devfreq parameters scsi: ufshcd: Update the set frequency to devfreq scsi: ufs: Resume ufs host before accessing ufs device scsi: ufs-mediatek: customize the delay for enabling host scsi: ufs: make HCE polling more compact to improve initialization latency scsi: ufs: allow custom delay prior to host enabling scsi: ufs-mediatek: use common delay function scsi: ufs: introduce common and flexible delay function scsi: ufs: use an enum for host capabilities scsi: ufs: fix uninitialized tx_lanes in ufshcd_disable_tx_lcc() ...	2020-04-02 17:03:53 -07:00
Joe Perches	cec9cbac52	scsi: zfcp: use fallthrough; Convert the various uses of fallthrough comments to fallthrough; Done via script Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe.com/ Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> [bblock@linux.ibm.com: resolved merge conflict with recently upstream-sent patch "zfcp: expose fabric name as common fc_host sysfs attribute"] Link: https://lore.kernel.org/r/d14669a67a17392490d3184117941123765db1a4.1585663010.git.bblock@linux.ibm.com Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-31 22:24:02 -04:00
Jens Remus	42cabdaf10	scsi: zfcp: log FC Endpoint Security errors Log any FC Endpoint Security errors to the kernel ring buffer with rate- limiting. Link: https://lore.kernel.org/r/20200312174505.51294-11-maier@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-17 13:12:42 -04:00
Jens Remus	e53d92856e	scsi: zfcp: enhance handling of FC Endpoint Security errors Enable for explicit FCP channel FC Endpoint Security error reporting and handle any FSF security errors according to specification. Take the following recovery actions when a FSF_SECURITY_ERROR is reported for the specified FSF commands: - Open Port: Retry the command if possible - Send FCP : Physically close the remote port and reopen For Open Port the command status is set to error, which triggers a retry. For Send FCP the command status is set to error and recovery is triggered to physically reopen the remote port. Link: https://lore.kernel.org/r/20200312174505.51294-10-maier@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-17 13:12:41 -04:00
Jens Remus	616da39e00	scsi: zfcp: trace FC Endpoint Security of FCP devices and connections Trace changes in Fibre Channel Endpoint Security capabilities of FCP devices as well as changes in Fibre Channel Endpoint Security state of their connections to FC remote ports as FC Endpoint Security changes with trace level 3 in HBA DBF. A change in FC Endpoint Security capabilities of FCP devices is traced as response to FSF command FSF_QTCB_EXCHANGE_PORT_DATA with a trace tag of "fsfcesa" and a WWPN of ZFCP_DBF_INVALID_WWPN = 0x0000000000000000 (see FC-FS-4 §18 "Name_Identifier Formats", NAA field). A change in FC Endpoint Security state of connections between FCP devices and FC remote ports is traced as response to FSF command FSF_QTCB_OPEN_PORT_WITH_DID with a trace tag of "fsfcesp". Example trace record of FC Endpoint Security capability change of FCP device formatted with zfcpdbf from s390-tools: Timestamp : ... Area : HBA Subarea : 00 Level : 3 Exception : - CPU ID : ... Caller : 0x... Record ID : 5 ZFCP_DBF_HBA_FCES Tag : fsfcesa FSF FC Endpoint Security adapter Request ID : 0x... Request status : 0x00000010 FSF cmnd : 0x0000000e FSF_QTCB_EXCHANGE_PORT_DATA FSF sequence no: 0x... FSF issued : ... FSF stat : 0x00000000 FSF_GOOD FSF stat qual : n/a Prot stat : n/a Prot stat qual : n/a Port handle : 0x00000000 none (invalid) LUN handle : n/a WWPN : 0x0000000000000000 ZFCP_DBF_INVALID_WWPN FCES old : 0x00000000 old FC Endpoint Security FCES new : 0x00000007 new FC Endpoint Security Example trace record of FC Endpoint Security change of connection to FC remote port formatted with zfcpdbf from s390-tools: Timestamp : ... Area : HBA Subarea : 00 Level : 3 Exception : - CPU ID : ... Caller : 0x... Record ID : 5 ZFCP_DBF_HBA_FCES Tag : fsfcesp FSF FC Endpoint Security port Request ID : 0x... Request status : 0x00000010 FSF cmnd : 0x00000005 FSF_QTCB_OPEN_PORT_WITH_DID FSF sequence no: 0x... FSF issued : ... FSF stat : 0x00000000 FSF_GOOD FSF stat qual : n/a Prot stat : n/a Prot stat qual : n/a Port handle : 0x... WWPN : 0x500507630401120c WWPN FCES old : 0x00000000 old FC Endpoint Security FCES new : 0x00000004 new FC Endpoint Security Link: https://lore.kernel.org/r/20200312174505.51294-9-maier@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-17 13:12:40 -04:00
Jens Remus	f0d26ae847	scsi: zfcp: log FC Endpoint Security of connections Log the usage of and subsequent changes in FC Endpoint Security of connections between FCP devices and FC remote ports to the kernel ring buffer. Activation of FC Endpoint Security is logged as informational. Change and deactivation are logged as warning. No logging takes place, if FC Endpoint Security is not used (i.e. never activated) on a connection or if it does not change during reopen of a port (e.g. due to adapter or port recovery). Link: https://lore.kernel.org/r/20200312174505.51294-8-maier@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com> Signed-off-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-17 13:12:39 -04:00
Jens Remus	a17c784600	scsi: zfcp: report FC Endpoint Security in sysfs Add an interface to read Fibre Channel Endpoint Security information of FCP channels and their connections to FC remote ports. It comes in the form of new sysfs attributes that are attached to the CCW device representing the FCP device and its zfcp port objects. The read-only sysfs attribute "fc_security" of a CCW device representing a FCP device shows the FC Endpoint Security capabilities of the device. Possible values are: "unknown", "unsupported", "none", or a comma- separated list of one or more mnemonics and/or one hexadecimal value representing the supported FC Endpoint Security: Authentication: Authentication supported Encryption : Encryption supported The read-only sysfs attribute "fc_security" of a zfcp port object shows the FC Endpoint Security used on the connection between its parent FCP device and the FC remote port. Possible values are: "unknown", "unsupported", "none", or a mnemonic or hexadecimal value representing the FC Endpoint Security used: Authentication: Connection has been authenticated Encryption : Connection is encrypted Both sysfs attributes may return hexadecimal values instead of mnemonics, if the mnemonic lookup table does not contain an entry for the FC Endpoint Security reported by the FCP device. Link: https://lore.kernel.org/r/20200312174505.51294-7-maier@linux.ibm.com Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-17 13:12:38 -04:00
Jens Remus	185f2d2d59	scsi: zfcp: auto variables for dereferenced structs in open port handler Introduce automatic variables for adapter and QTCB bottom in zfcp_fsf_open_port_handler(). This facilitates subsequent changes to meet the 80 character per line limit. Link: https://lore.kernel.org/r/20200312174505.51294-6-maier@linux.ibm.com Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-17 13:12:36 -04:00
Steffen Maier	7e0e4e0958	scsi: zfcp: fix fc_host attributes that should be unknown on local link down When we get an unsolicited notification on local link went down, zfcp_fsf_status_read_link_down() calls zfcp_fsf_link_down_info_eval(). This only blocks rports, and sets ZFCP_STATUS_ADAPTER_LINK_UNPLUGGED and ZFCP_STATUS_COMMON_ERP_FAILED. Only the fc_host port_state changes to "Linkdown", because zfcp_scsi_get_host_port_state() is an active callback and uses the adapter status. Other fc_host attributes model, port_id, port_type, speed, fabric_name (and zfcp device attributes card_version, peer_wwpn, peer_wwnn, peer_d_id) which depend on a local link, continued to show their last known "good" value. Only if something triggered an exchange config data, some values were updated to their unknown equivalent via case FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE due to local link down. Triggers for exchange config data are adapter recovery, or reading any of the following zfcp-specific scsi host sysfs attributes "requests", "megabytes", or "seconds_active" in /sys/devices/css/../../host/scsi_host/host/. The other fc_host attributes active_fc4s and permanent_port_name continued to show their last known "good" value. Only if something triggered an exchange port data, some values changed. Active_fc4s became all zeros as unknown equivalent during link down. Permanent_port_name does not depend on a local link. But for non-NPIV FCP devices, permanent_port_name erroneously became whatever value fc_host port_name had at that point in time (see previous paragraph). Triggers for exchange port data are the zfcp-specific scsi host sysfs attribute "utilization", or [{reset,get}_fc_host_stats] write anything into "reset_statistics" or read any of the other attributes under /sys/devices/css/../../host/fc_host/host/statistics/. (cf. v4.9 commit `bd77befa5b` ("zfcp: fix fc_host port_type with NPIV")) This is particularly confusing when using "lszfcp -b <fcpdevbusid> -Ha" or dbginfo.sh which read fc_host attributes and also scsi_host attributes. After link down, the first invocation produces (abbreviated): Class = "fc_host" active_fc4s = "0x00 0x00 0x01 0x00 ..." ... fabric_name = "0x10000027f8e04c49" ... permanent_port_name = "0xc05076e4588059c1" port_id = "0x244800" port_state = "Linkdown" port_type = "NPort (fabric via point-to-point)" ... speed = "16 Gbit" Class = "scsi_host" ... megabytes = "0 0" ... requests = "0 0 0" seconds_active = "37" ... utilization = "0 0 0" The second and next invocations produce (abbreviated): Class = "fc_host" active_fc4s = "0x00 0x00 0x00 0x00 ..." ... fabric_name = "0x0" ... permanent_port_name = "0x0" port_id = "0x000000" port_state = "Linkdown" port_type = "Unknown" ... speed = "unknown" Class = "scsi_host" ... megabytes = "0 0" ... requests = "0 0 0" seconds_active = "38" ... utilization = "0 0 0" Factor out the resetting of local link dependent fc_host attributes from zfcp_fsf_exchange_config_data_handler() case FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE into a new helper function zfcp_fsf_fc_host_link_down(). All code places that detect local link down (SRB, FSF_PROT_LINK_DOWN, xconf data/port incomplete) call zfcp_fsf_link_down_info_eval(). Call the new helper from there. This works because zfcp_fsf_link_down_info_eval() and thus the helper is called before zfcp_fsf_exchange_{config,port}_evaluate(). Port_name and node_name are always valid, so never reset them. Get the permanent_port_name from exchange port data unconditionally as it always has a valid known good value, even during link down. Note: Rather than hardcode in zfcp_fsf_exchange_config_evaluate(), fc_host supported_classes could theoretically get its value from fsf_qtcb_bottom_port.class_of_service in zfcp_fsf_exchange_port_evaluate(). When the link comes back, we get a different notification, perform adapter recovery, and this triggers an implicit exchange config data followed by exchange port data filling in the link dependent fc_host attributes with known good values again. Link: https://lore.kernel.org/r/20200312174505.51294-5-maier@linux.ibm.com Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-17 13:12:36 -04:00
Steffen Maier	538c6e910b	scsi: zfcp: wire previously driver-specific sysfs attributes also to fc_host Manufacturer, HBA model, firmware version, and hardware version. Use the same value format as for the driver-specific attributes. Keep the driver-specific attributes for stable user space sysfs API. Link: https://lore.kernel.org/r/20200312174505.51294-4-maier@linux.ibm.com Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-17 13:12:35 -04:00
Steffen Maier	e05a10a055	scsi: zfcp: expose fabric name as common fc_host sysfs attribute FICON Express8S or older, as well as card features newer than FICON Express16S+ have no certain firmware level requirement. FICON Express16S or FICON Express16S+ have the following minimum firmware level requirements to show a proper fabric name value: z13 machine FICON Express16S , MCL P08424.005 , LIC version 0x00000721 z14 machine FICON Express16S , MCL P42611.008 , LIC version 0x10200069 FICON Express16S+ , MCL P42625.010 , LIC version 0x10300147 Otherwise, the read value is not the fabric name. Each FCP channel of these card features might need one SAN fabric re-login after concurrent microcode update in order to show the proper fabric name. Possible ways to trigger a SAN fabric re-login are one of: Pull fibres between FCP channel port and SAN switch port on either side and re-plug, disable SAN switch port adjacent to FCP channel port and re-enable switch port, or at Service Element toggle off all CHPIDs of FCP channel over all LPARs and toggle CHPIDs on again. Zfcp operating subchannels (FCP devices) on such FCP channel recovers a fabric re-login. Initialize fabric name for any topology and have it an invalid WWPN 0x0 for anything but fabric topology. Otherwise for e.g. point-to-point topology one could see the initial -1 from fc_host_setup() and after a link unplug our fabric name would turn to 0x0 (with subsequent commit ("zfcp: fix fc_host attributes that should be unknown on local link down") and stay 0x0 on link replug. I did not initialize to 0x0 somewhere even earlier in the code path such that it would not flap from real to 0x0 to real on e.g. an exchange config data with fabric topology. Link: https://lore.kernel.org/r/20200312174505.51294-3-maier@linux.ibm.com Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-17 13:12:34 -04:00
Steffen Maier	819732be9f	scsi: zfcp: fix missing erp_lock in port recovery trigger for point-to-point v2.6.27 commit `cc8c282963` ("[SCSI] zfcp: Automatically attach remote ports") introduced zfcp automatic port scan. Before that, the user had to use the sysfs attribute "port_add" of an FCP device (adapter) to add and open remote (target) ports, even for the remote peer port in point-to-point topology. That code path did a proper port open recovery trigger taking the erp_lock. Since above commit, a new helper function zfcp_erp_open_ptp_port() performed an UNlocked port open recovery trigger. This can race with other parallel recovery triggers. In zfcp_erp_action_enqueue() this could corrupt e.g. adapter->erp_total_count or adapter->erp_ready_head. As already found for fabric topology in v4.17 commit `fa89adba19` ("scsi: zfcp: fix infinite iteration on ERP ready list"), there was an endless loop during tracing of rport (un)block. A subsequent v4.18 commit `9e156c54ac` ("scsi: zfcp: assert that the ERP lock is held when tracing a recovery trigger") introduced a lockdep assertion for that case. As a side effect, that lockdep assertion now uncovered the unlocked code path for PtP. It is from within an adapter ERP action: zfcp_erp_strategy[1479] intentionally DROPs erp lock around zfcp_erp_strategy_do_action() zfcp_erp_strategy_do_action[1441] NO erp lock zfcp_erp_adapter_strategy[876] NO erp lock zfcp_erp_adapter_strategy_open[855] NO erp lock zfcp_erp_adapter_strategy_open_fsf[806]NO erp lock zfcp_erp_adapter_strat_fsf_xconf[772] erp lock only around zfcp_erp_action_to_running(), BUT _not_ around zfcp_erp_enqueue_ptp_port() zfcp_erp_enqueue_ptp_port[728] BUG: _not_ taking erp lock _zfcp_erp_port_reopen[432] assumes to be called with erp lock zfcp_erp_action_enqueue[314] assumes to be called with erp lock zfcp_dbf_rec_trig[288] _checks_ to be called with erp lock: lockdep_assert_held(&adapter->erp_lock); It causes the following lockdep warning: WARNING: CPU: 2 PID: 775 at drivers/s390/scsi/zfcp_dbf.c:288 zfcp_dbf_rec_trig+0x16a/0x188 no locks held by zfcperp0.0.17c0/775. Fix this by using the proper locked recovery trigger helper function. Link: https://lore.kernel.org/r/20200312174505.51294-2-maier@linux.ibm.com Fixes: `cc8c282963` ("[SCSI] zfcp: Automatically attach remote ports") Cc: <stable@vger.kernel.org> #v2.6.27+ Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-17 13:12:33 -04:00
Linus Torvalds	7557c1b3f7	SCSI fixes on 20200229 Four small fixes. Three are in drivers for fairly obvious bugs. The fourth is a set of regressions introduced by the compat_ioctl changes because some of the compat updates wrongly replaced .ioctl instead of .compat_ioctl. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXlpxDCYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishSXsAPwOGPkU ObFbUs75Tdmk1M7jqtxgBsNhuNta0S8d7dJ3aAEA/YBtGGQWoeEGivUKwzwA4cwL 1w1GbhPEblpMNO8keVA= =I7qk -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four small fixes. Three are in drivers for fairly obvious bugs. The fourth is a set of regressions introduced by the compat_ioctl changes because some of the compat updates wrongly replaced .ioctl instead of .compat_ioctl" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: compat_ioctl: cdrom: Replace .ioctl with .compat_ioctl in four appropriate places scsi: zfcp: fix wrong data and display format of SFP+ temperature scsi: sd_sbc: Fix sd_zbc_report_zones() scsi: libfc: free response frame from GPN_ID	2020-02-29 09:58:47 -06:00
Benjamin Block	a3fd4bfe85	scsi: zfcp: fix wrong data and display format of SFP+ temperature When implementing support for retrieval of local diagnostic data from the FCP channel, the wrong data format was assumed for the temperature of the local SFP+ connector. The Fibre Channel Link Services (FC-LS-3) specification is not clear on the format of the stored integer, and only after consulting the SNIA specification SFF-8472 did we realize it is stored as two's complement. Thus, the used data and display format is wrong, and highly misleading for users when the temperature should drop below 0°C (however unlikely that may be). To fix this, change the data format in `struct fsf_qtcb_bottom_port` from unsigned to signed, and change the printf format string used to generate `zfcp_sysfs_adapter_diag_sfp_temperature_show()` from `%hu` to `%hd`. Link: https://lore.kernel.org/r/d6e3be5428da5c9490cfff4df7cae868bc9f1a7e.1582039501.git.bblock@linux.ibm.com Fixes: `a10a61e807` ("scsi: zfcp: support retrieval of SFP Data via Exchange Port Data") Fixes: `6028f7c4cd` ("scsi: zfcp: introduce sysfs interface for diagnostics of local SFP transceiver") Cc: <stable@vger.kernel.org> # 5.5+ Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-02-24 12:51:15 -05:00
Julian Wiedmann	2db01da8d2	s390/qdio: fill SBALEs with absolute addresses sbale->addr holds an absolute address (or for some FCP usage, an opaque request ID), and should only be used with proper virt/phys translation. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2020-02-19 17:26:32 +01:00
Steffen Maier	100843f176	scsi: zfcp: trace channel log even for FCP command responses While v2.6.26 commit `b75db73159` ("[SCSI] zfcp: Add qtcb dump to hba debug trace") is right that we don't want to flood the (payload) trace ring buffer, we don't trace successful FCP command responses by default. So we can include the channel log for problem determination with failed responses of any FSF request type. Fixes: `b75db73159` ("[SCSI] zfcp: Add qtcb dump to hba debug trace") Fixes: `a54ca0f62f` ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.") Cc: <stable@vger.kernel.org> #2.6.38+ Link: https://lore.kernel.org/r/e37597b5c4ae123aaa85fd86c23a9f71e994e4a9.1572018132.git.bblock@linux.ibm.com Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-28 22:16:15 -04:00
Steffen Maier	e76acc5194	scsi: zfcp: proper indentation to reduce confusion in zfcp_erp_required_act No functional change. The unary not operator only applies to the sub expression before the logical or. So we return early if (not running) or failed. Link: https://lore.kernel.org/r/df4f897f6e83eaa528465d0858d5a22daac47a2f.1572018132.git.bblock@linux.ibm.com Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-28 22:16:15 -04:00
Benjamin Block	48910f8c35	scsi: zfcp: move maximum age of diagnostic buffers into a per-adapter variable Replace the static define (ZFCP_DIAG_MAX_AGE) with a per-adapter variable (${adapter}->diagnostics->max_age). This new variable is exported via sysfs, along with other, already existing adapter variables, and can both be read and written. This way users can choose how much time should pass between refreshes of diagnostic buffers. The default value for the age remains to be five seconds. By setting this new variable to 0, the caching of diagnostic buffers for userspace accesses can also be completely removed. All diagnostic buffers of a given adapter are subject to this setting in the same way. Link: https://lore.kernel.org/r/b1d0977cc884b16dd4ca6418e4320c56a4c31d63.1572018132.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-28 22:16:15 -04:00
Benjamin Block	8a72db70b5	scsi: zfcp: implicitly refresh config-data diagnostics when reading sysfs Adds implicit updates of cached diagnostics via Exchange Config Data when reading sysfs attributes interfacing them. Right now this only affects the new B2B-Credit diagnostic attribute. This uses the same mechanism previously also used for cached diagnostics of Exchange Port Data. Link: https://lore.kernel.org/r/60a94f55f2630b74b468fed5f39880208abb2679.1572018132.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-28 22:16:15 -04:00
Benjamin Block	5a2876f0d1	scsi: zfcp: introduce sysfs interface to read the local B2B-Credit In addition to the diagnostic data from the local SFP transceiver this patch adds an interface to read the advertised buffer-to-buffer credit from the local FC_Port. With this patch the userspace-interface will only read data stored in the corresponding "diagnostic buffer" (that was stored during completion of a previous Exchange Config Data command). Implicit updating will follow later in this series. Link: https://lore.kernel.org/r/8a53aef87b53c50cfb1a3425b799bacb6f82b832.1572018132.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-28 22:16:15 -04:00
Benjamin Block	8155eb0785	scsi: zfcp: implicitly refresh port-data diagnostics when reading sysfs This patch adds implicit updates to the sysfs entries that read the diagnostic data stored in the "caching buffer" for Exchange Port Data. An update is triggered once the buffer is older than ZFCP_DIAG_MAX_AGE milliseconds (5s). This entails sending an Exchange Port Data command to the FCP-Channel, and during its ingress path updating the cached data and the timestamp. To prevent multiple concurrent userspace-applications from triggering this update in parallel we synchronize all of them using a wait-queue (waiting threads are interruptible; the updating thread is not). Link: https://lore.kernel.org/r/c145b5cfc99a63b6a018b1184fbd27bb09c955f5.1572018132.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-28 22:16:15 -04:00
Benjamin Block	6028f7c4cd	scsi: zfcp: introduce sysfs interface for diagnostics of local SFP transceiver This adds an interface to read the diagnostics of the local SFP transceiver of an FCP-Channel from userspace. This comes in the form of new sysfs entries that are attached to the CCW device representing the FCP device. Each type of data gets its own sysfs entry; the whole collection of entries is pooled into a new child-directory of the CCW device node: "diagnostics". Adds sysfs entries for: * sfp_invalid: boolean value evaluating to whether the following 5 fields are invalid; {0, 1}; 1 - invalid * temperature: transceiver temp.; unit 1/256°C; range [-128°C, +128°C] * vcc: supply voltage; unit 100μV; range [0, 6.55V] * tx_bias: transmitter laser bias current; unit 2μA; range [0, 131mA] * tx_power: coupled TX output power; unit 0.1μW; range [0, 6.5mW] * rx_power: received optical power; unit 0.1μW; range [0, 6.5mW] * optical_port: boolean value evaluating to whether the FCP-Channel has an optical port; {0, 1}; 1 - optical * fec_active: boolean value evaluating to whether 16G FEC is active; {0, 1}; 1 - active * port_tx_type: nibble describing the port type; {0, 1, 2, 3}; 0 - unknown, 1 - short wave, 2 - long wave LC 1310nm, 3 - long wave LL 1550nm * connector_type: two bits describing the connector type; {0, 1}; 0 - unknown, 1 - SFP+ This is only supported if the FCP-Channel in turn supports reporting the SFP Diagnostic Data, otherwise read() on these new entries will return EOPNOTSUPP (this affects only adapters older than FICON Express8S, on Mainframe generations older than z14). Other possible errors for read() include ENOLINK, ENODEV and ENOMEM. With this patch the userspace-interface will only read data stored in the corresponding "diagnostic buffer" (that was stored during completion of an previous Exchange Port Data command). Implicit updating will follow later in this series. Link: https://lore.kernel.org/r/1f9cce7c829c881e7d71a3f10c5b57f3dd84ab32.1572018132.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-28 22:16:15 -04:00
Benjamin Block	a10a61e807	scsi: zfcp: support retrieval of SFP Data via Exchange Port Data A new FCP channel feature allows us to read the diagnostics from our local SFP transceivers. To make use of that add a flag (FSF_FEATURE_REQUEST_SFP_DATA) to the feature-set we request from the FCP channel. Whether the channel actually implements this can be determined via an other new flag (FSF_FEATURE_REPORT_SFP_DATA), that is set in the adapter_features field of the adapter structure after Exchange Config Data finished. Also add the corresponding definitions in the QTCB Bottom for Exchange Port Data. These new definitions are only valid, if FSF_FEATURE_REPORT_SFP_DATA is set. Link: https://lore.kernel.org/r/ee1eba4de71eb06b4d82207ad4f428429346156f.1572018132.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-28 22:16:15 -04:00
Benjamin Block	088210233e	scsi: zfcp: add diagnostics buffer for exchange config data In the same vein as the previous patch, add diagnostic data capture for the Exchange Config Data command. Link: https://lore.kernel.org/r/7d8ac0a6cad403fa8f8b888693476a84e80a277b.1572018131.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-28 22:16:14 -04:00
Benjamin Block	7e418833e6	scsi: zfcp: diagnostics buffer caching and use for exchange port data The FCP channel exposes two central interfaces to receive information about the local FCP-Adapter/-Port: Exchange Port and Exchange Config Data. Using these commands can negatively impact the adapter if we allow them to be sent at a very high rate. The later parts of this patchset will introduce new user-interfaces to receive more diagnostics from the adapter. To prevent any negative impact from using those, this patch adds a simple caching-mechanism that will prevent a malicious/faulty userspace-application from generating an abnormal high amount of Exchange Port/Config Data traffic. Relevant diagnostic data that is received via Exchange Config/Port Data is cached in buffers associated with the corresponding adapter-struct. Each buffer is associated with a timestamp that signals how old the data is, and, added via a following patch in this series, lets userspace-interfaces determine when the data is too old and needs to be updated. Buffer-updates are made during the normal response path of the corresponding command. With this patch only the output of the Exchange Port Data command is captured. Link: https://lore.kernel.org/r/054ca020ce0a53dc0d9176428bea373898944e6a.1572018130.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-28 22:16:14 -04:00
Benjamin Block	92953c6e0a	scsi: zfcp: signal incomplete or error for sync exchange config/port data Adds a new FSF-Request status flag (ZFCP_STATUS_FSFREQ_XDATAINCOMPLETE) that signal that the data received using Exchange Config Data or Exchange Port Data was incomplete. This new flags is set in the respective handlers during the response path. With this patch, only the synchronous FSF-functions for each command got support for the new flag, otherwise it is transparent. Together with this new flag and already existing status flags the synchronous FSF-functions are extended to now detect whether the received data is complete, incomplete or completely invalid (this includes cases where a command ran into a timeout). This is now signaled back to the caller, where previously only failures on the request path would result in a bad return-code. For complete data the return-code remains 0. For incomplete data a new return-code -EAGAIN is added to the function-interface. For completely invalid data the already existing return-code -EIO is reused - formerly this was used to signal failures on the request path. Existing callers of the FSF-functions are adjusted so that they behave as before for return-code 0 and -EAGAIN, to not change the user-interface. As -EIO existed all along, it was already exposed to the user - and needed handling - and will now also be exposed in this new special case. Link: https://lore.kernel.org/r/e14f0702fa2b00a4d1f37c7981a13f2dd1ea2c83.1572018130.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-28 22:16:14 -04:00
Steffen Maier	2190168aae	scsi: zfcp: fix reaction on bit error threshold notification On excessive bit errors for the FCP channel ingress fibre path, the channel notifies us. Previously, we only emitted a kernel message and a trace record. Since performance can become suboptimal with I/O timeouts due to bit errors, we now stop using an FCP device by default on channel notification so multipath on top can timely failover to other paths. A new module parameter zfcp.ber_stop can be used to get zfcp old behavior. User explanation of new kernel message: * Description: * The FCP channel reported that its bit error threshold has been exceeded. * These errors might result from a problem with the physical components * of the local fibre link into the FCP channel. * The problem might be damage or malfunction of the cable or * cable connection between the FCP channel and * the adjacent fabric switch port or the point-to-point peer. * Find details about the errors in the HBA trace for the FCP device. * The zfcp device driver closed down the FCP device * to limit the performance impact from possible I/O command timeouts. * User action: * Check for problems on the local fibre link, ensure that fibre optics are * clean and functional, and all cables are properly plugged. * After the repair action, you can manually recover the FCP device by * writing "0" into its "failed" sysfs attribute. * If recovery through sysfs is not possible, set the CHPID of the device * offline and back online on the service element. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: <stable@vger.kernel.org> #2.6.30+ Link: https://lore.kernel.org/r/20191001104949.42810-1-maier@linux.ibm.com Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-03 21:45:53 -04:00
Linus Torvalds	f65420df91	SCSI fixes on 20190720 This is the final round of mostly small fixes in our initial submit. It's mostly minor fixes and driver updates. The only change of note is adding a virt_boundary_mask to the SCSI host and host template to parametrise this for NVMe devices instead of having them do a call in slave_alloc. It's a fairly straightforward conversion except in the two NVMe handling drivers that didn't set it who now have a virtual infinity parameter added. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXTJS/yYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQTNAQCsTdkA IN1BvDBbE+KO8mvL5DuRxLtnDU6Pq5K6fkrE3gD/a1GkqyPPaJIuspq7fQY87DH/ o7VsJd/5uGphIE2Ls+M= =38XV -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is the final round of mostly small fixes in our initial submit. It's mostly minor fixes and driver updates. The only change of note is adding a virt_boundary_mask to the SCSI host and host template to parametrise this for NVMe devices instead of having them do a call in slave_alloc. It's a fairly straightforward conversion except in the two NVMe handling drivers that didn't set it who now have a virtual infinity parameter added" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits) scsi: megaraid_sas: set an unlimited max_segment_size scsi: mpt3sas: set an unlimited max_segment_size for SAS 3.0 HBAs scsi: IB/srp: set virt_boundary_mask in the scsi host scsi: IB/iser: set virt_boundary_mask in the scsi host scsi: storvsc: set virt_boundary_mask in the scsi host template scsi: ufshcd: set max_segment_size in the scsi host template scsi: core: take the DMA max mapping size into account scsi: core: add a host / host template field for the virt boundary scsi: core: Fix race on creating sense cache scsi: sd_zbc: Fix compilation warning scsi: libfc: fix null pointer dereference on a null lport scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized scsi: zfcp: fix request object use-after-free in send path causing wrong traces scsi: zfcp: fix request object use-after-free in send path causing seqno errors scsi: megaraid_sas: Update driver version to 07.710.50.00 scsi: megaraid_sas: Add module parameter for FW Async event logging scsi: megaraid_sas: Enable msix_load_balance for Invader and later controllers scsi: megaraid_sas: Fix calculation of target ID scsi: lpfc: reduce stack size with CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE scsi: devinfo: BLIST_TRY_VPD_PAGES for SanDisk Cruzer Blade ...	2019-07-20 10:04:58 -07:00
Benjamin Block	4846470888	scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized GCC v9 emits this warning: CC drivers/s390/scsi/zfcp_erp.o drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_action_enqueue': drivers/s390/scsi/zfcp_erp.c:217:26: warning: 'erp_action' may be used uninitialized in this function [-Wmaybe-uninitialized] 217 \| struct zfcp_erp_action erp_action; \| ^~~~~~~~~~ This is a possible false positive case, as also documented in the GCC documentations: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wmaybe-uninitialized The actual code-sequence is like this: Various callers can invoke the function below with the argument "want" being one of: ZFCP_ERP_ACTION_REOPEN_ADAPTER, ZFCP_ERP_ACTION_REOPEN_PORT_FORCED, ZFCP_ERP_ACTION_REOPEN_PORT, or ZFCP_ERP_ACTION_REOPEN_LUN. zfcp_erp_action_enqueue(want, ...) ... need = zfcp_erp_required_act(want, ...) need = want ... maybe: need = ZFCP_ERP_ACTION_REOPEN_PORT maybe: need = ZFCP_ERP_ACTION_REOPEN_ADAPTER ... return need ... zfcp_erp_setup_act(need, ...) struct zfcp_erp_action erp_action; // <== line 217 ... switch(need) { case ZFCP_ERP_ACTION_REOPEN_LUN: ... erp_action = &zfcp_sdev->erp_action; WARN_ON_ONCE(erp_action->port != port); // <== access ... break; case ZFCP_ERP_ACTION_REOPEN_PORT: case ZFCP_ERP_ACTION_REOPEN_PORT_FORCED: ... erp_action = &port->erp_action; WARN_ON_ONCE(erp_action->port != port); // <== access ... break; case ZFCP_ERP_ACTION_REOPEN_ADAPTER: ... erp_action = &adapter->erp_action; WARN_ON_ONCE(erp_action->port != NULL); // <== access ... break; } ... WARN_ON_ONCE(erp_action->adapter != adapter); // <== access When zfcp_erp_setup_act() is called, 'need' will never be anything else than one of the 4 possible enumeration-names that are used in the switch-case, and 'erp_action' is initialized for every one of them, before it is used. Thus the warning is a false positive, as documented. We introduce the extra if{} in the beginning to create an extra code-flow, so the compiler can be convinced that the switch-case will never see any other value. BUG_ON()/BUG() is intentionally not used to not crash anything, should this ever happen anyway - right now it's impossible, as argued above; and it doesn't introduce a 'default:' switch-case to retain warnings should 'enum zfcp_erp_act_type' ever be extended and no explicit case be introduced. See also v5.0 commit `399b6c8bc9` ("scsi: zfcp: drop old default switch case which might paper over missing case"). Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-07-11 21:04:23 -04:00

1 2 3 4 5 ...

709 Commits