linux

Commit Graph

Author	SHA1	Message	Date
Bart Van Assche	c48e5594d0	target: Move a declaration of a global variable into a header file This patch avoids that sparse reports the following warning: drivers/target/target_core_configfs.c:2267:33: warning: symbol 'target_core_dev_item_ops' was not declared. Should it be static? Fixes: `c17cd24959` ("target/configfs: Kill se_device->dev_link_magic") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Christie <mchristi@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 15:15:30 -07:00
Mike Christie	0d44374c1a	tcmu: fix double se_cmd completion If cmd_time_out != 0, then tcmu_queue_cmd_ring could end up sleeping waiting for ring space, timing out and then returning failure to lio, and tcmu_check_expired_cmd could also detect the timeout and call target_complete_cmd on the cmd. This patch just delays setting up the deadline value and adding the cmd to the udev->commands idr until we have allocated ring space and are about to send the cmd to userspace. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 15:01:55 -07:00
Mike Christie	a271eac46a	target: return SAM_STAT_TASK_SET_FULL for TCM_OUT_OF_RESOURCES TCM_OUT_OF_RESOURCES is getting translated to TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE which seems like a heavy error when we just cannot allocate a resource that may be allocatable later. This has us translate TCM_OUT_OF_RESOURCES to SAM_STAT_TASK_SET_FULL instead. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 15:01:49 -07:00
David Disseldorp	55435badda	target: fix ALUA state file path truncation A sufficiently long Unit Serial string, dbroot path, and/or ALUA target portal group name may result in truncation of the ALUA state file path prior to usage. Fix this by using kasprintf() instead. Fixes: `fdddf93226` ("target: use new "dbroot" target attribute") Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 15:00:30 -07:00
David Disseldorp	bdc79f0ed1	target: fix PR state file path truncation If an LIO backstore is configured with a sufficiently long Unit Serial string, alongside a similarly long dbroot path, then a truncated Persistent Reservation APTPL state file path will be used. This truncation can unintentionally lead to two LUs with differing serial numbers sharing PR state file. Fixes: `fdddf93226` ("target: use new "dbroot" target attribute") Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 15:00:17 -07:00
Varun Prakash	1ae01724ae	cxgbit: Abort the TCP connection in case of data out timeout If DDP is programmed for a WRITE cmd and data out timer gets expired then abort the TCP connection before freeing the cmd to avoid any possibility of DDP after freeing the cmd. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 14:54:40 -07:00
Kenjiro Nakayama	b849b45675	target: Add netlink command reply supported option for each device Currently netlink command reply support option (TCMU_ATTR_SUPP_KERN_CMD_REPLY) can be enabled only on module scope. Because of that, once an application enables the netlink command reply support, all applications using target_core_user.ko would be expected to support the netlink reply. To make matters worse, users will not be able to add a device via configfs manually. To fix these issues, this patch adds an option to make netlink command reply disabled on each device through configfs. Original TCMU_ATTR_SUPP_KERN_CMD_REPLY is still enabled on module scope to keep backward-compatibility and used by default, however once users set nl_reply_supported=<NAGATIVE_VALUE> via configfs for a particular device, the device disables the netlink command reply support. Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 14:54:07 -07:00
Kenjiro Nakayama	b5ab697c62	target/tcmu: Use macro to call container_of in tcmu_cmd_time_out_show This patch makes a tiny change that using TCMU_DEV in tcmu_cmd_time_out_show so it is consistent with other functions. Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 14:52:39 -07:00
Xiubo Li	c22adc0b0c	tcmu: fix crash when removing the tcmu device Before the nl REMOVE msg has been sent to the userspace, the ring's and other resources have been released, but the userspace maybe still using them. And then we can see the crash messages like: ring broken, not handling completions BUG: unable to handle kernel paging request at ffffffffffffffd0 IP: tcmu_handle_completions+0x134/0x2f0 [target_core_user] PGD 11bdc0c067 P4D 11bdc0c067 PUD 11bdc0e067 PMD 0 Oops: 0000 [#1] SMP cmd_id not found, ring is broken RIP: 0010:tcmu_handle_completions+0x134/0x2f0 [target_core_user] RSP: 0018:ffffb8a2d8983d88 EFLAGS: 00010296 RAX: 0000000000000000 RBX: ffffb8a2aaa4e000 RCX: 00000000ffffffff RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000220 R10: 0000000076c71401 R11: ffff8d2e76c713f0 R12: ffffb8a2aad56bc0 R13: 000000000000001c R14: ffff8d2e32c90000 R15: ffff8d2e76c713f0 FS: 00007f411ffff700(0000) GS:ffff8d1e7fdc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd0 CR3: 0000001027070000 CR4: 00000000001406e0 Call Trace: ? tcmu_irqcontrol+0x2a/0x40 [target_core_user] ? uio_write+0x7b/0xc0 [uio] ? __vfs_write+0x37/0x150 ? __getnstimeofday64+0x3b/0xd0 ? vfs_write+0xb2/0x1b0 ? syscall_trace_enter+0x1d0/0x2b0 ? SyS_write+0x55/0xc0 ? do_syscall_64+0x67/0x150 ? entry_SYSCALL64_slow_path+0x25/0x25 Code: 41 5d 41 5e 41 5f 5d c3 83 f8 01 0f 85 cf 01 00 00 48 8b 7d d0 e8 dd 5c 1d f3 41 0f b7 74 24 04 48 8b 7d c8 31 d2 e8 5c c7 1b f3 <48> 8b 7d d0 49 89 c7 c6 07 00 0f 1f 40 00 4d 85 ff 0f 84 82 01 RIP: tcmu_handle_completions+0x134/0x2f0 [target_core_user] RSP: ffffb8a2d8983d88 CR2: ffffffffffffffd0 And the crash also could happen in tcmu_page_fault and other places. Signed-off-by: Zhang Zhuoyu <zhangzhuoyu@cmss.chinamobile.com> Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 14:48:21 -07:00
tangwenji	a0884d489e	iscsi-target: fix memory leak in iscsit_release_discovery_tpg() Need to release the param_list for tpg in iscsi_release_discovery_tpg function, this is also required before the iscsit_load_discovery_tpg function exits abnormally. Signed-off-by: tangwenji <tang.wenji@zte.com.cn> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 14:47:04 -07:00
tangwenji	12d5a43b2d	iscsi-target: fix memory leak in lio_target_tiqn_addtpg() tpg must free when call core_tpg_register() return fail Signed-off-by: tangwenji <tang.wenji@zte.com.cn> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 14:45:25 -07:00
tangwenji	24528f089d	target:fix condition return in core_pr_dump_initiator_port() When is pr_reg->isid_present_at_reg is false,this function should return. This fixes a regression originally introduced by: commit `d2843c173e` Author: Andy Grover <agrover@redhat.com> Date: Thu May 16 10:40:55 2013 -0700 target: Alter core_pr_dump_initiator_port for ease of use Signed-off-by: tangwenji <tang.wenji@zte.com.cn> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 14:45:24 -07:00
tangwenji	a2db857bf9	target: fix match_token option in target_core_configfs.c The match_token function does not recognize the option 'l', so that both the mapped_lun and target_lun parameters can not be resolved correctly. And parsed u64-type parameters should use match_u64(). (Use %u instead of %s for Opt_mapped_lun + Opt_target_lun - nab) Signed-off-by: tangwenji <tang.wenji@zte.com.cn> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 14:45:23 -07:00
tangwenji	79dd6f2fd1	target: add sense code INSUFFICIENT REGISTRATION RESOURCES If a PERSISTENT RESERVE OUT command with a REGISTER service action or a REGISTER AND IGNORE EXISTING KEY service action or REGISTER AND MOVE service action is attempted, but there are insufficient device server resources to complete the operation, then the command shall be terminated with CHECK CONDITION status, with the sense key set to ILLEGAL REQUEST,and the additonal sense code set to INSUFFICIENT REGISTRATION RESOURCES. Signed-off-by: tangwenji <tang.wenji@zte.com.cn> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 14:45:23 -07:00
tangwenji	e437fa3e5d	target: fix double unmap data sg in core_scsi3_emulate_pro_register_and_move() Signed-off-by: tangwenji <tang.wenji@zte.com.cn> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 14:45:22 -07:00
tangwenji	c58a252beb	target: fix buffer offset in core_scsi3_pri_read_full_status When at least two initiators register pr on the same LUN, the target returns the exception data due to buffer offset error, therefore the initiator executes command 'sg_persist -s' may cause the initiator to appear segfault error. This fixes a regression originally introduced by: commit `a85d667e58` Author: Bart Van Assche <bart.vanassche@sandisk.com> Date: Tue May 23 16:48:27 2017 -0700 target: Use {get,put}_unaligned_be*() instead of open coding these functions Signed-off-by: tangwenji <tang.wenji@zte.com.cn> Cc: stable@vger.kernel.org # 4.13+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 14:45:21 -07:00
tangwenji	88fb2fa7db	target: fix null pointer regression in core_tmr_drain_tmr_list The target system kernel crash when the initiator executes the sg_persist -A command,because of the second argument to be set to NULL when core_tmr_lun_reset is called in core_scsi3_pro_preempt function. This fixes a regression originally introduced by: commit `51ec502a32` Author: Bart Van Assche <bart.vanassche@sandisk.com> Date: Tue Feb 14 16:25:54 2017 -0800 target: Delete tmr from list before processing Signed-off-by: tangwenji <tang.wenji@zte.com.cn> Cc: stable@vger.kernel.org # 4.11+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 14:45:20 -07:00
Jiang Yi	594e25e734	target/file: Do not return error for UNMAP if length is zero The function fd_execute_unmap() in target_core_file.c calles ret = file->f_op->fallocate(file, mode, pos, len); Some filesystems implement fallocate() to return error if length is zero (e.g. btrfs) but according to SCSI Block Commands spec UNMAP should return success for zero length. Signed-off-by: Jiang Yi <jiangyilism@gmail.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-11-04 14:45:16 -07:00
Greg Kroah-Hartman	b24413180f	License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a /uapi/ one with no licensing information in it, - file was a /uapi/ one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non /uapi/ files that summary was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a /uapi/ path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the /uapi/ ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------\|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-11-02 11:10:55 +01:00
Kees Cook	f7c9564a7c	target/iscsi: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Includes a fix for correcting an on-stack timer usage. Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org> Cc: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Jiang Yi <jiangyilism@gmail.com> Cc: Varun Prakash <varun@chelsio.com> Cc: linux-scsi@vger.kernel.org Cc: target-devel@vger.kernel.org Reviewed-and-Tested-by: Bart Van Assche <Bart.VanAssche@wdc.com> Signed-off-by: Kees Cook <keescook@chromium.org>	2017-11-01 11:44:50 -07:00
Bart Van Assche	8a47aa9dc6	target/iscsi: Simplify timer manipulation code Move timer initialization from before add_timer() to the context where the containing object is initialized. Use setup_timer() and mod_timer() instead of open coding these. Use 'jiffies' instead of get_jiffies_64() when calculating expiry times because expiry times have type unsigned long, just like 'jiffies'. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Kees Cook <keescook@chromium.org>	2017-11-01 11:44:49 -07:00
Mark Rutland	6aa7de0591	locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE() Please do not apply this to mainline directly, instead please re-run the coccinelle script shown below and apply its output. For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't harmful, and changing them results in churn. However, for some features, the read/write distinction is critical to correct operation. To distinguish these cases, separate read/write accessors must be used. This patch migrates (most) remaining ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following coccinelle script: ---- // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and // WRITE_ONCE() // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch virtual patch @ depends on patch @ expression E1, E2; @@ - ACCESS_ONCE(E1) = E2 + WRITE_ONCE(E1, E2) @ depends on patch @ expression E; @@ - ACCESS_ONCE(E) + READ_ONCE(E) ---- Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: viro@zeniv.linux.org.uk Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-10-25 11:01:08 +02:00
Bhumika Goyal	ece550b575	target: make config_item_type const Make these structures const as they are either passed to the functions having the argument as const or stored as a reference in the "ci_type" const field of a config_item structure. Done using Coccinelle. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-10-19 16:15:17 +02:00
Linus Torvalds	581bfce969	Merge branch 'work.set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more set_fs removal from Al Viro: "Christoph's 'use kernel_read and friends rather than open-coding set_fs()' series" * 'work.set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: unexport vfs_readv and vfs_writev fs: unexport vfs_read and vfs_write fs: unexport __vfs_read/__vfs_write lustre: switch to kernel_write gadget/f_mass_storage: stop messing with the address limit mconsole: switch to kernel_read btrfs: switch write_buf to kernel_write net/9p: switch p9_fd_read to kernel_write mm/nommu: switch do_mmap_private to kernel_read serial2002: switch serial2002_tty_write to kernel_{read/write} fs: make the buf argument to __kernel_write a void pointer fs: fix kernel_write prototype fs: fix kernel_read prototype fs: move kernel_read to fs/read_write.c fs: move kernel_write to fs/read_write.c autofs4: switch autofs4_write to __kernel_write ashmem: switch to ->read_iter	2017-09-14 18:13:32 -07:00
Linus Torvalds	a0725ab0c7	Merge branch 'for-4.14/block' of git://git.kernel.dk/linux-block Pull block layer updates from Jens Axboe: "This is the first pull request for 4.14, containing most of the code changes. It's a quiet series this round, which I think we needed after the churn of the last few series. This contains: - Fix for a registration race in loop, from Anton Volkov. - Overflow complaint fix from Arnd for DAC960. - Series of drbd changes from the usual suspects. - Conversion of the stec/skd driver to blk-mq. From Bart. - A few BFQ improvements/fixes from Paolo. - CFQ improvement from Ritesh, allowing idling for group idle. - A few fixes found by Dan's smatch, courtesy of Dan. - A warning fixup for a race between changing the IO scheduler and device remova. From David Jeffery. - A few nbd fixes from Josef. - Support for cgroup info in blktrace, from Shaohua. - Also from Shaohua, new features in the null_blk driver to allow it to actually hold data, among other things. - Various corner cases and error handling fixes from Weiping Zhang. - Improvements to the IO stats tracking for blk-mq from me. Can drastically improve performance for fast devices and/or big machines. - Series from Christoph removing bi_bdev as being needed for IO submission, in preparation for nvme multipathing code. - Series from Bart, including various cleanups and fixes for switch fall through case complaints" * 'for-4.14/block' of git://git.kernel.dk/linux-block: (162 commits) kernfs: checking for IS_ERR() instead of NULL drbd: remove BIOSET_NEED_RESCUER flag from drbd_{md_,}io_bio_set drbd: Fix allyesconfig build, fix recent commit drbd: switch from kmalloc() to kmalloc_array() drbd: abort drbd_start_resync if there is no connection drbd: move global variables to drbd namespace and make some static drbd: rename "usermode_helper" to "drbd_usermode_helper" drbd: fix race between handshake and admin disconnect/down drbd: fix potential deadlock when trying to detach during handshake drbd: A single dot should be put into a sequence. drbd: fix rmmod cleanup, remove _all_ debugfs entries drbd: Use setup_timer() instead of init_timer() to simplify the code. drbd: fix potential get_ldev/put_ldev refcount imbalance during attach drbd: new disk-option disable-write-same drbd: Fix resource role for newly created resources in events2 drbd: mark symbols static where possible drbd: Send P_NEG_ACK upon write error in protocol != C drbd: add explicit plugging when submitting batches drbd: change list_for_each_safe to while(list_first_entry_or_null) drbd: introduce drbd_recv_header_maybe_unplug ...	2017-09-07 11:59:42 -07:00
Christoph Hellwig	e13ec939e9	fs: fix kernel_write prototype Make the position an in/out argument like all the other read/write helpers and and make the buf argument a void pointer. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-09-04 19:05:15 -04:00
Christoph Hellwig	74d46992e0	block: replace bi_bdev with a gendisk pointer and partitions index This way we don't need a block_device structure to submit I/O. The block_device has different life time rules from the gendisk and request_queue and is usually only available when the block device node is open. Other callers need to explicitly create one (e.g. the lightnvm passthrough code, or the new nvme multipathing code). For the actual I/O path all that we need is the gendisk, which exists once per block device. But given that the block layer also does partition remapping we additionally need a partition index, which is used for said remapping in generic_make_request. Note that all the block drivers generally want request_queue or sometimes the gendisk, so this removes a layer of indirection all over the stack. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-08-23 12:49:55 -06:00
Nicholas Bellinger	6f48655fac	target: Fix node_acl demo-mode + uncached dynamic shutdown regression This patch fixes a generate_node_acls = 1 + cache_dynamic_acls = 0 regression, that was introduced by commit `01d4d67355` Author: Nicholas Bellinger <nab@linux-iscsi.org> Date: Wed Dec 7 12:55:54 2016 -0800 which originally had the proper list_del_init() usage, but was dropped during list review as it was thought unnecessary by HCH. However, list_del_init() usage is required during the special generate_node_acls = 1 + cache_dynamic_acls = 0 case when transport_free_session() does a list_del(&se_nacl->acl_list), followed by target_complete_nacl() doing the same thing. This was manifesting as a general protection fault as reported by Justin: kernel: general protection fault: 0000 [#1] SMP kernel: Modules linked in: kernel: CPU: 0 PID: 11047 Comm: iscsi_ttx Not tainted 4.13.0-rc2.x86_64.1+ #20 kernel: Hardware name: Intel Corporation S5500BC/S5500BC, BIOS S5500.86B.01.00.0064.050520141428 05/05/2014 kernel: task: ffff88026939e800 task.stack: ffffc90007884000 kernel: RIP: 0010:target_put_nacl+0x49/0xb0 kernel: RSP: 0018:ffffc90007887d70 EFLAGS: 00010246 kernel: RAX: dead000000000200 RBX: ffff8802556ca000 RCX: 0000000000000000 kernel: RDX: dead000000000100 RSI: 0000000000000246 RDI: ffff8802556ce028 kernel: RBP: ffffc90007887d88 R08: 0000000000000001 R09: 0000000000000000 kernel: R10: ffffc90007887df8 R11: ffffea0009986900 R12: ffff8802556ce020 kernel: R13: ffff8802556ce028 R14: ffff8802556ce028 R15: ffffffff88d85540 kernel: FS: 0000000000000000(0000) GS:ffff88027fc00000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 00007fffe36f5f94 CR3: 0000000009209000 CR4: 00000000003406f0 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 kernel: Call Trace: kernel: transport_free_session+0x67/0x140 kernel: transport_deregister_session+0x7a/0xc0 kernel: iscsit_close_session+0x92/0x210 kernel: iscsit_close_connection+0x5f9/0x840 kernel: iscsit_take_action_for_connection_exit+0xfe/0x110 kernel: iscsi_target_tx_thread+0x140/0x1e0 kernel: ? wait_woken+0x90/0x90 kernel: kthread+0x124/0x160 kernel: ? iscsit_thread_get_cpumask+0x90/0x90 kernel: ? kthread_create_on_node+0x40/0x40 kernel: ret_from_fork+0x22/0x30 kernel: Code: 00 48 89 fb 4c 8b a7 48 01 00 00 74 68 4d 8d 6c 24 08 4c 89 ef e8 e8 28 43 00 48 8b 93 20 04 00 00 48 8b 83 28 04 00 00 4c 89 ef <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 83 20 kernel: RIP: target_put_nacl+0x49/0xb0 RSP: ffffc90007887d70 kernel: ---[ end trace f12821adbfd46fed ]--- To address this, go ahead and use proper list_del_list() for all cases of se_nacl->acl_list deletion. Reported-by: Justin Maggard <jmaggard01@gmail.com> Tested-by: Justin Maggard <jmaggard01@gmail.com> Cc: Justin Maggard <jmaggard01@gmail.com> Cc: stable@vger.kernel.org # 4.1+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-08-09 20:55:19 -07:00
Nicholas Bellinger	978d13d60c	iscsi-target: Fix iscsi_np reset hung task during parallel delete This patch fixes a bug associated with iscsit_reset_np_thread() that can occur during parallel configfs rmdir of a single iscsi_np used across multiple iscsi-target instances, that would result in hung task(s) similar to below where configfs rmdir process context was blocked indefinately waiting for iscsi_np->np_restart_comp to finish: [ 6726.112076] INFO: task dcp_proxy_node_:15550 blocked for more than 120 seconds. [ 6726.119440] Tainted: G W O 4.1.26-3321 #2 [ 6726.125045] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 6726.132927] dcp_proxy_node_ D ffff8803f202bc88 0 15550 1 0x00000000 [ 6726.140058] ffff8803f202bc88 ffff88085c64d960 ffff88083b3b1ad0 ffff88087fffeb08 [ 6726.147593] ffff8803f202c000 7fffffffffffffff ffff88083f459c28 ffff88083b3b1ad0 [ 6726.155132] ffff88035373c100 ffff8803f202bca8 ffffffff8168ced2 ffff8803f202bcb8 [ 6726.162667] Call Trace: [ 6726.165150] [<ffffffff8168ced2>] schedule+0x32/0x80 [ 6726.170156] [<ffffffff8168f5b4>] schedule_timeout+0x214/0x290 [ 6726.176030] [<ffffffff810caef2>] ? __send_signal+0x52/0x4a0 [ 6726.181728] [<ffffffff8168d7d6>] wait_for_completion+0x96/0x100 [ 6726.187774] [<ffffffff810e7c80>] ? wake_up_state+0x10/0x10 [ 6726.193395] [<ffffffffa035d6e2>] iscsit_reset_np_thread+0x62/0xe0 [iscsi_target_mod] [ 6726.201278] [<ffffffffa0355d86>] iscsit_tpg_disable_portal_group+0x96/0x190 [iscsi_target_mod] [ 6726.210033] [<ffffffffa0363f7f>] lio_target_tpg_store_enable+0x4f/0xc0 [iscsi_target_mod] [ 6726.218351] [<ffffffff81260c5a>] configfs_write_file+0xaa/0x110 [ 6726.224392] [<ffffffff811ea364>] vfs_write+0xa4/0x1b0 [ 6726.229576] [<ffffffff811eb111>] SyS_write+0x41/0xb0 [ 6726.234659] [<ffffffff8169042e>] system_call_fastpath+0x12/0x71 It would happen because each iscsit_reset_np_thread() sets state to ISCSI_NP_THREAD_RESET, sends SIGINT, and then blocks waiting for completion on iscsi_np->np_restart_comp. However, if iscsi_np was active processing a login request and more than a single iscsit_reset_np_thread() caller to the same iscsi_np was blocked on iscsi_np->np_restart_comp, iscsi_np kthread process context in __iscsi_target_login_thread() would flush pending signals and only perform a single completion of np->np_restart_comp before going back to sleep within transport specific iscsit_transport->iscsi_accept_np code. To address this bug, add a iscsi_np->np_reset_count and update __iscsi_target_login_thread() to keep completing np->np_restart_comp until ->np_reset_count has reached zero. Reported-by: Gary Guo <ghg@datera.io> Tested-by: Gary Guo <ghg@datera.io> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-08-06 14:41:41 -07:00
Varun Prakash	d96adb9b07	cxgbit: fix sg_nents calculation The current logic of calculating sg_nents can fail if data_offset % PAGE_SIZE is not zero. For example - PAGE_SIZE = 4096 data_len = 3072 data_offset = 3072 As per current logic sg_nents = max(1UL, DIV_ROUND_UP(data_len, PAGE_SIZE)); sg_nents = max(1UL, DIV_ROUND_UP(3072, 4096)); sg_nents = 1 But as data_offset % PAGE_SIZE = 3072 we should skip 3072 bytes skip = 3K sg_nents = max(1UL, DIV_ROUND_UP(3K(skip) + 3K(data_len), 4K(PAGE_SIZE)); sg_nents = 2; This patch fixes this issue by adding skip to data_len. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-30 15:35:34 -07:00
Varun Prakash	310d40a973	iscsi-target: fix invalid flags in text response In case of multiple text responses iscsi-target sets both 'F' and 'C' bit for the final text response pdu, this issue happens because hdr->flags is not zeroed out before ORing with 'F' bit. This patch removes the \| operator to fix this issue. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-30 15:32:48 -07:00
Varun Prakash	ea8dc5b4cd	iscsi-target: fix memory leak in iscsit_setup_text_cmd() On receiving text request iscsi-target allocates buffer for payload in iscsit_handle_text_cmd() and assigns buffer pointer to cmd->text_in_ptr, this buffer is currently freed in iscsit_release_cmd(), if iscsi-target sets 'C' bit in text response then it will receive another text request from the initiator with ttt != 0xffffffff in this case iscsi-target will find cmd using itt and call iscsit_setup_text_cmd() which will set cmd->text_in_ptr to NULL without freeing previously allocated buffer. This patch fixes this issue by calling kfree(cmd->text_in_ptr) in iscsit_setup_text_cmd() before assigning NULL to it. For the first text request cmd->text_in_ptr is NULL as cmd is memset to 0 in iscsit_allocate_cmd(). Signed-off-by: Varun Prakash <varun@chelsio.com> Cc: <stable@vger.kernel.org> # 4.0+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-30 15:31:34 -07:00
Varun Prakash	66b59f9b1f	cxgbit: add missing __kfree_skb() Call __kfree_skb() after processing skb to avoid memory leak. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-30 15:26:00 -07:00
Bryant G. Ly	ededd039d1	tcmu: free old string on reconfig On initial tcmu_configure_device call the info->name would have already been allocated and set, so on the second call make sure to free it first. Reported-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-30 15:23:21 -07:00
Xiubo Li	c542942cb4	tcmu: Fix possible to/from address overflow when doing the memcpy For most case the sg->length equals to PAGE_SIZE, so this bug won't be triggered. Otherwise this will crash the kernel, for example when all segments' sg->length equal to 1K. Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-30 15:12:32 -07:00
Linus Torvalds	52f6c588c7	Add wait_for_random_bytes() and get_random__wait() functions so that callers can more safely get random bytes if they can block until the CRNG is initialized. Also print a warning if get_random_() is called before the CRNG is initialized. By default, only one single-line warning will be printed per boot. If CONFIG_WARN_ALL_UNSEEDED_RANDOM is defined, then a warning will be printed for each function which tries to get random bytes before the CRNG is initialized. This can get spammy for certain architecture types, so it is not enabled by default. -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAllqXNUACgkQ8vlZVpUN gaPtAgf/aUbXZuWYsDQzslHsbzEWi+qz4QgL885/w4L00pEImTTp91Q06SDxWhtB KPvGnZHS3IofxBh2DC+6AwN6dPMoWDCfYhhO6po3FSz0DiPRIQCTuvOb8fhKY1X7 rTdDq2xtDxPGxJ25bMJtlrgzH2XlXPpVyPUeoc9uh87zUK5aesXpUn9kBniRexoz ume+M/cDzPKkwNQpbLq8vzhNjoWMVv0FeW2akVvrjkkWko8nZLZ0R/kIyKQlRPdG LZDXcz0oTHpDS6+ufEo292ZuWm2IGer2YtwHsKyCAsyEWsUqBz2yurtkSj3mAVyC hHafyS+5WNaGdgBmg0zJxxwn5qxxLg== =ua7p -----END PGP SIGNATURE----- Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull random updates from Ted Ts'o: "Add wait_for_random_bytes() and get_random__wait() functions so that callers can more safely get random bytes if they can block until the CRNG is initialized. Also print a warning if get_random_() is called before the CRNG is initialized. By default, only one single-line warning will be printed per boot. If CONFIG_WARN_ALL_UNSEEDED_RANDOM is defined, then a warning will be printed for each function which tries to get random bytes before the CRNG is initialized. This can get spammy for certain architecture types, so it is not enabled by default" * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: random: reorder READ_ONCE() in get_random_uXX random: suppress spammy warnings about unseeded randomness random: warn when kernel uses unseeded randomness net/route: use get_random_int for random counter net/neighbor: use get_random_u32 for 32-bit hash random rhashtable: use get_random_u32 for hash_rnd ceph: ensure RNG is seeded before using iscsi: ensure RNG is seeded before use cifs: use get_random_u32 for 32-bit lock random random: add get_random_{bytes,u32,u64,int,long,once}_wait family random: add wait_for_random_bytes() API	2017-07-15 12:44:02 -07:00
Linus Torvalds	48ea2cedde	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "It's been usually busy for summer, with most of the efforts centered around TCMU developments and various target-core + fabric driver bug fixing activities. Not particularly large in terms of LoC, but lots of smaller patches from many different folks. The highlights include: - ibmvscsis logical partition manager support (Michael Cyr + Bryant Ly) - Convert target/iblock WRITE_SAME to blkdev_issue_zeroout (hch + nab) - Add support for TMR percpu LUN reference counting (nab) - Fix a potential deadlock between EXTENDED_COPY and iscsi shutdown (Bart) - Fix COMPARE_AND_WRITE caw_sem leak during se_cmd quiesce (Jiang Yi) - Fix TMCU module removal (Xiubo Li) - Fix iser-target OOPs during login failure (Andrea Righi + Sagi) - Breakup target-core free_device backend driver callback (mnc) - Perform TCMU add/delete/reconfig synchronously (mnc) - Fix TCMU multiple UIO open/close sequences (mnc) - Fix TCMU CHECK_CONDITION sense handling (mnc) - Fix target-core SAM_STAT_BUSY + TASK_SET_FULL handling (mnc + nab) - Introduce TYPE_ZBC support in PSCSI (Damien Le Moal) - Fix possible TCMU memory leak + OOPs when recalculating cmd base size (Xiubo Li + Bryant Ly + Damien Le Moal + mnc) - Add login_keys_workaround attribute for non RFC initiators (Robert LeBlanc + Arun Easi + nab)" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (68 commits) iscsi-target: Add login_keys_workaround attribute for non RFC initiators Revert "qla2xxx: Fix incorrect tcm_qla2xxx_free_cmd use during TMR ABORT" tcmu: clean up the code and with one small fix tcmu: Fix possbile memory leak / OOPs when recalculating cmd base size target: export lio pgr/alua support as device attr target: Fix return sense reason in target_scsi3_emulate_pr_out target: Fix cmd size for PR-OUT in passthrough_parse_cdb tcmu: Fix dev_config_store target: pscsi: Introduce TYPE_ZBC support target: Use macro for WRITE_VERIFY_32 operation codes target: fix SAM_STAT_BUSY/TASK_SET_FULL handling target: remove transport_complete pscsi: finish cmd processing from pscsi_req_done tcmu: fix sense handling during completion target: add helper to copy sense to se_cmd buffer target: do not require a transport_complete for SCF_TRANSPORT_TASK_SENSE target: make device_mutex and device_list static tcmu: Fix flushing cmd entry dcache page tcmu: fix multiple uio open/close sequences tcmu: drop configured check in destroy ...	2017-07-13 14:27:32 -07:00
Michal Hocko	dcda9b0471	mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic __GFP_REPEAT was designed to allow retry-but-eventually-fail semantic to the page allocator. This has been true but only for allocations requests larger than PAGE_ALLOC_COSTLY_ORDER. It has been always ignored for smaller sizes. This is a bit unfortunate because there is no way to express the same semantic for those requests and they are considered too important to fail so they might end up looping in the page allocator for ever, similarly to GFP_NOFAIL requests. Now that the whole tree has been cleaned up and accidental or misled usage of __GFP_REPEAT flag has been removed for !costly requests we can give the original flag a better name and more importantly a more useful semantic. Let's rename it to __GFP_RETRY_MAYFAIL which tells the user that the allocator would try really hard but there is no promise of a success. This will work independent of the order and overrides the default allocator behavior. Page allocator users have several levels of guarantee vs. cost options (take GFP_KERNEL as an example) - GFP_KERNEL & ~__GFP_RECLAIM - optimistic allocation without _any_ attempt to free memory at all. The most light weight mode which even doesn't kick the background reclaim. Should be used carefully because it might deplete the memory and the next user might hit the more aggressive reclaim - GFP_KERNEL & ~__GFP_DIRECT_RECLAIM (or GFP_NOWAIT)- optimistic allocation without any attempt to free memory from the current context but can wake kswapd to reclaim memory if the zone is below the low watermark. Can be used from either atomic contexts or when the request is a performance optimization and there is another fallback for a slow path. - (GFP_KERNEL\|__GFP_HIGH) & ~__GFP_DIRECT_RECLAIM (aka GFP_ATOMIC) - non sleeping allocation with an expensive fallback so it can access some portion of memory reserves. Usually used from interrupt/bh context with an expensive slow path fallback. - GFP_KERNEL - both background and direct reclaim are allowed and the _default_ page allocator behavior is used. That means that !costly allocation requests are basically nofail but there is no guarantee of that behavior so failures have to be checked properly by callers (e.g. OOM killer victim is allowed to fail currently). - GFP_KERNEL \| __GFP_NORETRY - overrides the default allocator behavior and all allocation requests fail early rather than cause disruptive reclaim (one round of reclaim in this implementation). The OOM killer is not invoked. - GFP_KERNEL \| __GFP_RETRY_MAYFAIL - overrides the default allocator behavior and all allocation requests try really hard. The request will fail if the reclaim cannot make any progress. The OOM killer won't be triggered. - GFP_KERNEL \| __GFP_NOFAIL - overrides the default allocator behavior and all allocation requests will loop endlessly until they succeed. This might be really dangerous especially for larger orders. Existing users of __GFP_REPEAT are changed to __GFP_RETRY_MAYFAIL because they already had their semantic. No new users are added. __alloc_pages_slowpath is changed to bail out for __GFP_RETRY_MAYFAIL if there is no progress and we have already passed the OOM point. This means that all the reclaim opportunities have been exhausted except the most disruptive one (the OOM killer) and a user defined fallback behavior is more sensible than keep retrying in the page allocator. [akpm@linux-foundation.org: fix arch/sparc/kernel/mdesc.c] [mhocko@suse.com: semantic fix] Link: http://lkml.kernel.org/r/20170626123847.GM11534@dhcp22.suse.cz [mhocko@kernel.org: address other thing spotted by Vlastimil] Link: http://lkml.kernel.org/r/20170626124233.GN11534@dhcp22.suse.cz Link: http://lkml.kernel.org/r/20170623085345.11304-3-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Alex Belits <alex.belits@cavium.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: David Daney <david.daney@cavium.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@suse.de> Cc: NeilBrown <neilb@suse.com> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-07-12 16:26:03 -07:00
Linus Torvalds	130568d5ea	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull more block updates from Jens Axboe: "This is a followup for block changes, that didn't make the initial pull request. It's a bit of a mixed bag, this contains: - A followup pull request from Sagi for NVMe. Outside of fixups for NVMe, it also includes a series for ensuring that we properly quiesce hardware queues when browsing live tags. - Set of integrity fixes from Dmitry (mostly), fixing various issues for folks using DIF/DIX. - Fix for a bug introduced in cciss, with the req init changes. From Christoph. - Fix for a bug in BFQ, from Paolo. - Two followup fixes for lightnvm/pblk from Javier. - Depth fix from Ming for blk-mq-sched. - Also from Ming, performance fix for mtip32xx that was introduced with the dynamic initialization of commands" * 'for-linus' of git://git.kernel.dk/linux-block: (44 commits) block: call bio_uninit in bio_endio nvmet: avoid unneeded assignment of submit_bio return value nvme-pci: add module parameter for io queue depth nvme-pci: compile warnings in nvme_alloc_host_mem() nvmet_fc: Accept variable pad lengths on Create Association LS nvme_fc/nvmet_fc: revise Create Association descriptor length lightnvm: pblk: remove unnecessary checks lightnvm: pblk: control I/O flow also on tear down cciss: initialize struct scsi_req null_blk: fix error flow for shared tags during module_init block: Fix __blkdev_issue_zeroout loop nvme-rdma: unconditionally recycle the request mr nvme: split nvme_uninit_ctrl into stop and uninit virtio_blk: quiesce/unquiesce live IO when entering PM states mtip32xx: quiesce request queues to make sure no submissions are inflight nbd: quiesce request queues to make sure no submissions are inflight nvme: kick requeue list when requeueing a request instead of when starting the queues nvme-pci: quiesce/unquiesce admin_q instead of start/stop its hw queues nvme-loop: quiesce/unquiesce admin_q instead of start/stop its hw queues nvme-fc: quiesce/unquiesce admin_q instead of start/stop its hw queues ...	2017-07-11 15:36:52 -07:00
Nicholas Bellinger	138d351eef	iscsi-target: Add login_keys_workaround attribute for non RFC initiators This patch re-introduces part of a long standing login workaround that was recently dropped by: commit `1c99de981f` Author: Nicholas Bellinger <nab@linux-iscsi.org> Date: Sun Apr 2 13:36:44 2017 -0700 iscsi-target: Drop work-around for legacy GlobalSAN initiator Namely, the workaround for FirstBurstLength ended up being required by Mellanox Flexboot PXE boot ROMs as reported by Robert. So this patch re-adds the work-around for FirstBurstLength within iscsi_check_proposer_for_optional_reply(), and makes the key optional to respond when the initiator does not propose, nor respond to it. Also as requested by Arun, this patch introduces a new TPG attribute named 'login_keys_workaround' that controls the use of both the FirstBurstLength workaround, as well as the two other existing workarounds for gPXE iSCSI boot client. By default, the workaround is enabled with login_keys_workaround=1, since Mellanox FlexBoot requires it, and Arun has verified the Qlogic MSFT initiator already proposes FirstBurstLength, so it's uneffected by this re-adding this part of the original work-around. Reported-by: Robert LeBlanc <robert@leblancnet.us> Cc: Robert LeBlanc <robert@leblancnet.us> Reviewed-by: Arun Easi <arun.easi@cavium.com> Cc: <stable@vger.kernel.org> # 3.1+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-11 10:56:39 -07:00
Xiubo Li	daf78c3051	tcmu: clean up the code and with one small fix Remove useless blank line and code and at the same time add one error path to catch the errors. Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-11 10:48:07 -07:00
Xiubo Li	b3743c71b7	tcmu: Fix possbile memory leak / OOPs when recalculating cmd base size For all the entries allocated from the ring cmd area, the memory is something like the stack memory, which will always reserve the old data, so the entry->req.iov_bidi_cnt maybe none zero. On some environments, the crash could be reproduce very easy and some not. The following is the crash core trace as reported by Damien: [ 240.143969] CPU: 0 PID: 1285 Comm: iscsi_trx Not tainted 4.12.0-rc1+ #3 [ 240.150607] Hardware name: ASUS All Series/H87-PRO, BIOS 2104 10/28/2014 [ 240.157331] task: ffff8807de4f5800 task.stack: ffffc900047dc000 [ 240.163270] RIP: 0010:memcpy_erms+0x6/0x10 [ 240.167377] RSP: 0018:ffffc900047dfc68 EFLAGS: 00010202 [ 240.172621] RAX: ffffc9065db85540 RBX: ffff8807f7980000 RCX: 0000000000000010 [ 240.179771] RDX: 0000000000000010 RSI: ffff8807de574fe0 RDI: ffffc9065db85540 [ 240.186930] RBP: ffffc900047dfd30 R08: ffff8807de41b000 R09: 0000000000000000 [ 240.194088] R10: 0000000000000040 R11: ffff8807e9b726f0 R12: 00000006565726b0 [ 240.201246] R13: ffffc90007612ea0 R14: 000000065657d540 R15: 0000000000000000 [ 240.208397] FS: 0000000000000000(0000) GS:ffff88081fa00000(0000) knlGS:0000000000000000 [ 240.216510] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 240.222280] CR2: ffffc9065db85540 CR3: 0000000001c0f000 CR4: 00000000001406f0 [ 240.229430] Call Trace: [ 240.231887] ? tcmu_queue_cmd+0x83c/0xa80 [ 240.235916] ? target_check_reservation+0xcd/0x6f0 [ 240.240725] __target_execute_cmd+0x27/0xa0 [ 240.244918] target_execute_cmd+0x232/0x2c0 [ 240.249124] ? __local_bh_enable_ip+0x64/0xa0 [ 240.253499] iscsit_execute_cmd+0x20d/0x270 [ 240.257693] iscsit_sequence_cmd+0x110/0x190 [ 240.261985] iscsit_get_rx_pdu+0x360/0xc80 [ 240.267565] ? iscsi_target_rx_thread+0x54/0xd0 [ 240.273571] iscsi_target_rx_thread+0x9a/0xd0 [ 240.279413] kthread+0x113/0x150 [ 240.284120] ? iscsi_target_tx_thread+0x1e0/0x1e0 [ 240.290297] ? kthread_create_on_node+0x40/0x40 [ 240.296297] ret_from_fork+0x2e/0x40 [ 240.301332] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 [ 240.321751] RIP: memcpy_erms+0x6/0x10 RSP: ffffc900047dfc68 [ 240.328838] CR2: ffffc9065db85540 [ 240.333667] ---[ end trace b7e5354cfb54d08b ]--- To fix this, just memset all the entry memory before using it, and also to be more readable we adjust the bidi code. Fixed: fe25cc34795(tcmu: Recalculate the tcmu_cmd size to save cmd area memories) Reported-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Tested-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reported-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Cc: <stable@vger.kernel.org> # 4.12+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-11 10:47:58 -07:00
Mike Christie	c17d5d5f51	target: export lio pgr/alua support as device attr Older kernels could crash or hang if the user write/read some ALUA files with pscsi and tcmu backends. This patch exports if LIO supports executing PGR and ALUA scsi commands/checks for the se_device, so userspace can easily test. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-10 20:05:27 -07:00
Tang Wenji	7ee0031786	target: Fix return sense reason in target_scsi3_emulate_pr_out The sense reason should be TCM_PARAMETER_LIST_LENGTH_ERROR when parmeter length error. Also the cdb[1] & 0x1f has been assigned to local variable sa, so use sa instead of it. Signed-off-by: Tang Wenji <tang.wenji@zte.com.cn> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-09 20:59:17 -07:00
Tang Wenji	388fe6996b	target: Fix cmd size for PR-OUT in passthrough_parse_cdb The cmd size should be 4bytes form byte5 to byte8 when CDB opcode is PERSISTENT_RESERVE_OUT in SPC3 and SPC4 (Also fix up the same in spc_parse_cdb - MNC) Signed-off-by: Tang Wenji <tang.wenji@zte.com.cn> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-09 20:58:49 -07:00
Bryant G. Ly	de8c5221aa	tcmu: Fix dev_config_store Currently when there is a reconfig, the uio_info->name does not get updated to reflect the change in the dev_config name change. On restart tcmu-runner there will be a mismatch between the dev_config string in uio and the tcmu structure that contains the string. When this occurs it'll reload the one in uio and you lose the reconfigured device path. v2: Created a helper function for the updating of uio_info Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-09 20:57:56 -07:00
Damien Le Moal	016a5fec19	target: pscsi: Introduce TYPE_ZBC support TYPE_ZBC host managed zoned block devices are also block devices despite the non-standard device type (14h). Handle them similarly to regular TYPE_DISK devices. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:49 -07:00
Damien Le Moal	e5dc9a7055	target: Use macro for WRITE_VERIFY_32 operation codes Add WRITE_VERIFY_32 definition to scsi prototypes and use this macro definition isntead of the hard coded value. (Drop WRITE_VERIFY_16 that's already part of another patch - nab) Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:49 -07:00
Mike Christie	402242c904	target: fix SAM_STAT_BUSY/TASK_SET_FULL handling If the scsi status was not SAM_STAT_GOOD or there was no transport sense, we would ignore the scsi status and do a generic not ready LUN communication failure check condition failure. The problem is that LUN COMM failure is treated as a hard error sometimes and will cause apps to get IO errors instead of the OS's SCSI layer retrying. For example, the tcmu daemon will return SAM_STAT_QUEUE_FULL when memory runs low and can still make progress but wants the initiator to reduce the work load. Windows will fail this error directly the app instead of retrying. This patch is based on Nick's "target/iblock: Use -EAGAIN/-ENOMEM to propigate SAM BUSY/TASK_SET_FULL" patch here: http://comments.gmane.org/gmane.linux.scsi.target.devel/11301 but instead of only setting SAM_STAT_GOOD, SAM_STAT_TASK_SET_FULL and SAM_STAT_BUSY as success, it sets all non check condition status as success so they are passed back to the initiator, so passthrough type backends can return all SCSI status codes. Since only passthrough uses this, I was not sure if we wanted to add checks for non-passthrough and specific codes. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:48 -07:00
Mike Christie	1a44417548	target: remove transport_complete transport_complete is no longer used, so drop the code. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:47 -07:00
Mike Christie	dce6ce8cfb	pscsi: finish cmd processing from pscsi_req_done This patch performs the pscsi_transport_complete operations from pscsi_req_done. It looks like the only difference the transport_complete callout provides is that it is called under t_state_lock which seems to only be needed for the SCF_TRANSPORT_TASK_SENSE bit handling. We can now use transport_copy_sense_to_cmd to handle the se_cmd sense bits, and we can then drop the code where we have to copy the request info to the pscsi_plugin_task for transport_complete use. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:46 -07:00
Mike Christie	406f74c20d	tcmu: fix sense handling during completion We were just copying the sense to the cmd sense_buffer and did not implement a transport_complete or set the SCF_TRANSPORT_TASK_SENSE, so the sense was ignored. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:46 -07:00
Mike Christie	c6d66aba98	target: add helper to copy sense to se_cmd buffer This adds a helper to copy sense from backend module buffer to the se_cmd's sense buffer. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:45 -07:00
Mike Christie	9fe3698450	target: do not require a transport_complete for SCF_TRANSPORT_TASK_SENSE tcmu needs to pass raw sense to target_complete_cmd, but a a transport_complete callout is akward to implement for it. This moves the check for SCF_TRANSPORT_TASK_SENSE so any backend can pass sense. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:44 -07:00
Colin Ian King	c82ff239ec	target: make device_mutex and device_list static Variables device_mutex and device_list static are local to the source, so make them static. Cleans up sparse warnings: "symbol 'device_list' was not declared. Should it be static?" "symbol 'device_mutex' was not declared. Should it be static?" Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:44 -07:00
Xiubo Li	9d62bc0e6d	tcmu: Fix flushing cmd entry dcache page When feeding the tcmu's cmd ring, we need to flush the dcache page for the cmd entry to make sure these kernel stores are visible to user space mappings of that page. For the none PAD cmd entry, this will be flushed at the end of the tcmu_queue_cmd_ring(). Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:43 -07:00
Mike Christie	9260695d65	tcmu: fix multiple uio open/close sequences If the uio device is open and closed multiple times, the kref count will be off due to tcmu_release getting called multiple times for each close. This patch integrates Wenji Tang's patch to add a kref_get on open that now matches the kref_put done on tcmu_release and adds a kref_put in tcmu_destroy_device to match the kref_get done in succesful tcmu_configure_device calls. Signed-off-by: Mike Christie <mchristi@redhat.com> Cc: Wenji Tang <tang.wenji@zte.com.cn> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:42 -07:00
Mike Christie	531283ff75	tcmu: drop configured check in destroy destroy_device is only called if we have successfully run configure_device, so drop the duplicate tcmu_dev_configured check. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:41 -07:00
Mike Christie	be50f538e9	target: remove g_device_list g_device_list is no longer needed because we now use the idr code for lookups and seaches. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:41 -07:00
Mike Christie	6906d008b4	xcopy: loop over devices using idr helper This converts the xcopy code to use the idr helper. The next patch will drop the g_device_list and make g_device_mutex local to the target_core_device.c file. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:40 -07:00
Mike Christie	b1943fd454	target: add helper to iterate over devices This adds a wrapper around idr_for_each so the xcopy code can loop over the devices in the next patch. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:39 -07:00
Mike Christie	b3af66e243	tcmu: perfom device add, del and reconfig synchronously This makes the device add, del reconfig operations sync. It fixes the issue where for add and reconfig, we do not know if userspace successfully completely the operation, so we leave invalid kernel structs or report incorrect status for the config/reconfig operations. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:39 -07:00
Mike Christie	85441e6b8c	target: add helper to find se_device by dev_index This adds a helper to find a se_device by dev_index. It will be used in the next patches so tcmu's netlink interface can execute commands on specific devices. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:38 -07:00
Mike Christie	0a5eee647b	target: use idr for se_device dev index In the next patches we will add tcmu netlink support that allows userspace to send commands to target_core_user. To execute operations on a se_device/tcmu_dev we need to be able to look up a dev by any old id. This patch replaces the se_device->dev_index with a idr created id. The next patches will also remove the g_device_list and replace it with the idr. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:37 -07:00
Mike Christie	926347061e	target: break up free_device callback With this patch free_device is now used to free what is allocated in the alloc_device callback and destroy_device tears down the resources that are setup in the configure_device callback. This patch will be needed in the next patch where tcmu needs to be able to look up the device in the destroy callback. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:37 -07:00
Mike Christie	2d76443e02	tcmu: reconfigure netlink attr changes 1. TCMU_ATTR_TYPE is too generic when it describes only the reconfiguration type, so rename to TCMU_ATTR_RECONFIG_TYPE. 2. Only return the reconfig type when it is a TCMU_CMD_RECONFIG_DEVICE command. 3. CONFIG_* type is not needed. We can pass the value along with an ATTR to userspace, so it does not need to read sysfs/configfs. 4. Fix leak in tcmu_dev_path_store and rename to dev_config to reflect it is more than just a path that can be changed. 6. Don't update kernel struct value if netlink sending fails. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: "Bryant G. Ly" <bryantly@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:36 -07:00
Colin Ian King	5821783bca	tcmu: make array tcmu_attrib_attrs static const The array tcmu_attrib_attrs does not need to be in global scope, so make it static. Cleans up sparse warning: "symbol 'tcmu_attrib_attrs' was not declared. Should it be static?" Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:34 -07:00
Xiubo Li	07932a023a	tcmu: Fix module removal due to stuck unmap_thread thread again Because the unmap code just after the schdule() returned may take a long time and if the kthread_stop() is fired just when in this routine, the module removal maybe stuck too. Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:33 -07:00
Jiang Yi	1d6ef27659	target: Fix COMPARE_AND_WRITE caw_sem leak during se_cmd quiesce This patch addresses a COMPARE_AND_WRITE se_device->caw_sem leak, that would be triggered during normal se_cmd shutdown or abort via __transport_wait_for_tasks(). This would occur because target_complete_cmd() would catch this early and do complete_all(&cmd->t_transport_stop_comp), but since target_complete_ok_work() or target_complete_failure_work() are never called to invoke se_cmd->transport_complete_callback(), the COMPARE_AND_WRITE specific callbacks never release caw_sem. To address this special case, go ahead and release caw_sem directly from target_complete_cmd(). (Remove '&& success' from check, to release caw_sem regardless of scsi_status - nab) Signed-off-by: Jiang Yi <jiangyilism@gmail.com> Cc: <stable@vger.kernel.org> # 3.14+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:32 -07:00
Bryant G. Ly	8a45885c15	tcmu: Add Type of reconfig into netlink This patch adds more info about the attribute being changed, so that usersapce can easily figure out what is happening. Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reviewed-By: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:32 -07:00
Bryant G. Ly	ee01825220	tcmu: Make dev_config configurable This allows for userspace to change the device path after it has been created. Thus giving the user the ability to change the path. The use case for this is to allow for virtual optical to have media change. Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reviewed-By: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:31 -07:00
Bryant G. Ly	801fc54d5d	tcmu: Make dev_size configurable via userspace Allow tcmu backstores to be able to set the device size after it has been configured via set attribute. Part of support in userspace to support certain backstores changing device size. Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reviewed-By: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:30 -07:00
Bryant G. Ly	1068be7bd4	tcmu: Add netlink for device reconfiguration This gives tcmu the ability to handle events that can cause reconfiguration, such as resize, path changes, write_cache, etc... Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reviewed-By: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:30 -07:00
Bryant G. Ly	9a8bb60650	tcmu: Support emulate_write_cache This will enable the toggling of write_cache in tcmu through targetcli-fb Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reviewed-By: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:29 -07:00
Bart Van Assche	8fa4011e0d	target/iscsi: Remove dead code from iscsit_process_scsi_cmd() If an iSCSI command is rejected before iscsit_process_scsi_cmd() is called, .reject_reason is set but iscsit_process_scsi_cmd() is not called. This means that the "if (cmd->reject_reason) ..." code in this function can be removed without changing the behavior of this function. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:28 -07:00
Bart Van Assche	d1c26857cd	target/iscsi: Simplify iscsit_free_cmd() Since .se_tfo is only set if a command has been submitted to the LIO core, check .se_tfo instead of .iscsi_opcode. Since __iscsit_free_cmd() only affects SCSI commands but not TMFs, calling that function for TMFs does not change behavior. This patch does not change the behavior of iscsit_free_cmd(). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:27 -07:00
Bart Van Assche	4412a67131	target/iscsi: Remove second argument of __iscsit_free_cmd() Initialize .data_direction to DMA_NONE in iscsit_allocate_cmd() such that the second argument of __iscsit_free_cmd() can be left out. Note: this patch causes the first part of __iscsit_free_cmd() no longer to be skipped for TMFs. That's fine since no data segments are associated with TMFs. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:26 -07:00
Bart Van Assche	4c1f0e6539	target/tcm_loop: Make TMF processing slightly faster Target drivers must guarantee that struct se_cmd and struct se_tmr_req exist as long as target_tmr_work() is in progress. This is why the tcm_loop driver today passes 1 as second argument to transport_generic_free_cmd() from inside the TMF code. Instead of making the TMF code wait, make the TMF code obtain two references (SCF_ACK_KREF) and drop one reference from inside the .check_stop_free() callback. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:25 -07:00
Bart Van Assche	75f141aaf4	target/tcm_loop: Use target_submit_tmr() instead of open-coding this function Use target_submit_tmr() instead of open-coding this function. The only functional change is that TMFs are now added to sess_cmd_list, something the current code does not do. This behavior change is a bug fix because it makes LUN RESETs wait for other TMFs that are in progress for the same LUN. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:25 -07:00
Bart Van Assche	d17203c411	target/tcm_loop: Replace a waitqueue and a counter by a completion This patch simplifies the implementation of the tcm_loop driver but does not change its behavior. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:24 -07:00
Bart Van Assche	4d3895d5ea	target/tcm_loop: Merge struct tcm_loop_cmd and struct tcm_loop_tmr This patch simplifies the tcm_loop implementation but does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 23:11:23 -07:00
Bart Van Assche	c00e622023	target: Introduce a function that shows the command state Introduce target_show_cmd() and use it where appropriate. If transport_wait_for_tasks() takes too long, make it show the state of the command it is waiting for. (Add missing brackets around multi-line conditions - nab) Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 22:58:04 -07:00
Nicholas Bellinger	03db016a1b	iscsi-target: Kill left-over iscsi_target_do_cleanup With commit `25cdda95fd` in place to address the initial login PDU asynchronous socket close OOPs, go ahead and kill off the left-over iscsi_target_do_cleanup() and ->login_cleanup_work. Reported-by: Mike Christie <mchristi@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 22:58:04 -07:00
Bart Van Assche	d877d7275b	target: Fix a deadlock between the XCOPY code and iSCSI session shutdown Move the code for parsing an XCOPY command from the context of the iSCSI receiver thread to the context of the XCOPY workqueue. Keep the simple XCOPY checks in the context of the iSCSI receiver thread. Move the code for allocating and freeing struct xcopy_op from the code that parses an XCOPY command to its caller. This patch fixes the following deadlock: ====================================================== [ INFO: possible circular locking dependency detected ] 4.10.0-rc7-dbg+ #1 Not tainted ------------------------------------------------------- rmdir/13321 is trying to acquire lock: (&sess->cmdsn_mutex){+.+.+.}, at: [<ffffffffa02cb47d>] iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod] but task is already holding lock: (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff811c6e20>] vfs_rmdir+0x50/0x140 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&sb->s_type->i_mutex_key#14){++++++}: lock_acquire+0x71/0x90 down_write+0x3f/0x70 configfs_depend_item+0x3a/0xb0 [configfs] target_depend_item+0x13/0x20 [target_core_mod] target_xcopy_locate_se_dev_e4+0xdd/0x1a0 [target_core_mod] target_do_xcopy+0x34b/0x970 [target_core_mod] __target_execute_cmd+0x22/0xa0 [target_core_mod] target_execute_cmd+0x233/0x2c0 [target_core_mod] iscsit_execute_cmd+0x208/0x270 [iscsi_target_mod] iscsit_sequence_cmd+0x10b/0x190 [iscsi_target_mod] iscsit_get_rx_pdu+0x37d/0xcd0 [iscsi_target_mod] iscsi_target_rx_thread+0x6e/0xa0 [iscsi_target_mod] kthread+0x102/0x140 ret_from_fork+0x31/0x40 -> #0 (&sess->cmdsn_mutex){+.+.+.}: __lock_acquire+0x10e6/0x1260 lock_acquire+0x71/0x90 mutex_lock_nested+0x5f/0x670 iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod] iscsit_close_session+0xac/0x200 [iscsi_target_mod] lio_tpg_close_session+0x9f/0xb0 [iscsi_target_mod] target_shutdown_sessions+0xc3/0xd0 [target_core_mod] core_tpg_del_initiator_node_acl+0x91/0x140 [target_core_mod] target_fabric_nacl_base_release+0x20/0x30 [target_core_mod] config_item_release+0x5a/0xc0 [configfs] config_item_put+0x1d/0x1f [configfs] configfs_rmdir+0x1a6/0x300 [configfs] vfs_rmdir+0xb7/0x140 do_rmdir+0x1f4/0x200 SyS_rmdir+0x11/0x20 entry_SYSCALL_64_fastpath+0x23/0xc6 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sb->s_type->i_mutex_key#14); lock(&sess->cmdsn_mutex); lock(&sb->s_type->i_mutex_key#14); lock(&sess->cmdsn_mutex); * DEADLOCK * 3 locks held by rmdir/13321: #0: (sb_writers#10){.+.+.+}, at: [<ffffffff811e1aff>] mnt_want_write+0x1f/0x50 #1: (&default_group_class[depth - 1]#2/1){+.+.+.}, at: [<ffffffff811cc8ce>] do_rmdir+0x15e/0x200 #2: (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff811c6e20>] vfs_rmdir+0x50/0x140 stack backtrace: CPU: 2 PID: 13321 Comm: rmdir Not tainted 4.10.0-rc7-dbg+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0x86/0xc3 print_circular_bug+0x1c7/0x220 __lock_acquire+0x10e6/0x1260 lock_acquire+0x71/0x90 mutex_lock_nested+0x5f/0x670 iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod] iscsit_close_session+0xac/0x200 [iscsi_target_mod] lio_tpg_close_session+0x9f/0xb0 [iscsi_target_mod] target_shutdown_sessions+0xc3/0xd0 [target_core_mod] core_tpg_del_initiator_node_acl+0x91/0x140 [target_core_mod] target_fabric_nacl_base_release+0x20/0x30 [target_core_mod] config_item_release+0x5a/0xc0 [configfs] config_item_put+0x1d/0x1f [configfs] configfs_rmdir+0x1a6/0x300 [configfs] vfs_rmdir+0xb7/0x140 do_rmdir+0x1f4/0x200 SyS_rmdir+0x11/0x20 entry_SYSCALL_64_fastpath+0x23/0xc6 Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 22:58:00 -07:00
Bart Van Assche	a85d667e58	target: Use {get,put}_unaligned_be() instead of open coding these functions Introduce the function get_unaligned_be24(). Use {get,put}_unaligned_be() where appropriate. This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Andy Grover <agrover@redhat.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 22:57:59 -07:00
Bart Van Assche	f2b72d6a8e	target: Fix transport_init_se_cmd() Avoid that aborting a command before it has been submitted onto a workqueue triggers the following warning: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 3 PID: 46 Comm: kworker/u8:1 Not tainted 4.12.0-rc2-dbg+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Workqueue: tmr-iblock target_tmr_work [target_core_mod] Call Trace: dump_stack+0x86/0xcf register_lock_class+0xe8/0x570 __lock_acquire+0xa1/0x11d0 lock_acquire+0x59/0x80 flush_work+0x42/0x2b0 __cancel_work_timer+0x10c/0x180 cancel_work_sync+0xb/0x10 core_tmr_lun_reset+0x352/0x740 [target_core_mod] target_tmr_work+0xd6/0x130 [target_core_mod] process_one_work+0x1ca/0x3f0 worker_thread+0x49/0x3b0 kthread+0x109/0x140 ret_from_fork+0x31/0x40 Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 22:57:59 -07:00
Bart Van Assche	9f2f342892	target: Remove se_device.dev_list The last user of se_device.dev_list was removed through commit `0fd97ccf45` ("target: kill struct se_subsystem_dev"). Hence also remove se_device.dev_list. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 22:57:58 -07:00
Bart Van Assche	3e182db787	target: Use symbolic value for WRITE_VERIFY_16 Now that a symbolic value has been introduced for WRITE_VERIFY_16, use it. This patch does not change any functionality. References: commit `c2d26f18dc` ("target: Add WRITE_VERIFY_16") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Cc: Andy Grover <agrover@redhat.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 22:57:57 -07:00
Nicholas Bellinger	5465e7d3b9	target: Add TARGET_SCF_LOOKUP_LUN_FROM_TAG support for ABORT_TASK This patch introduces support in target_submit_tmr() for locating a unpacked_lun from an existing se_cmd->tag during ABORT_TASK. When TARGET_SCF_LOOKUP_LUN_FROM_TAG is set, target_submit_tmr() will do the extra lookup via target_lookup_lun_from_tag() and subsequently invoke transport_lookup_tmr_lun() so a proper percpu se_lun->lun_ref is taken before workqueue dispatch into se_device->tmr_wq happens. Aside from the extra target_lookup_lun_from_tag(), the existing code-path remains unchanged. Reviewed-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Quinn Tran <quinn.tran@cavium.com> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 22:57:56 -07:00
Nicholas Bellinger	eeb64d239e	target: Add support for TMR percpu reference counting This patch introduces TMR percpu reference counting using se_lun->lun_ref in transport_lookup_tmr_lun(), following how existing non TMR per se_lun reference counting works within transport_lookup_cmd_lun(). It also adds explicit transport_lun_remove_cmd() calls to drop the reference in the three tmr related locations that invoke transport_cmd_check_stop_to_fabric(); - target_tmr_work() during normal ->queue_tm_rsp() - target_complete_tmr_failure() during error ->queue_tm_rsp() - transport_generic_handle_tmr() during early failure Also, note the exception paths in transport_generic_free_cmd() and transport_cmd_finish_abort() already check SCF_SE_LUN_CMD, and will invoke transport_lun_remove_cmd() when necessary. Reviewed-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Quinn Tran <quinn.tran@cavium.com> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 22:57:55 -07:00
Jiang Yi	12f66e4a0f	target: reject COMPARE_AND_WRITE if emulate_caw is not set In struct se_dev_attrib, there is a field emulate_caw exposed as a /sys/kernel/config/target/core/$HBA/$DEV/attrib/. If this field is set zero, it means the corresponding struct se_device does not support the scsi cmd COMPARE_AND_WRITE In function sbc_parse_cdb(), go ahead and reject scsi COMPARE_AND_WRITE if emulate_caw is not set, because it has been explicitly disabled from user-space. (Make pr_err ratelimited - nab) Signed-off-by: Jiang Yi <jiangyilism@gmail.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-07-06 22:57:54 -07:00
Linus Torvalds	89fbf5384d	Merge branch 'work.read_write' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull read/write updates from Al Viro: "Christoph's fs/read_write.c series - consolidation and cleanups" * 'work.read_write' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: nfsd: remove nfsd_vfs_read nfsd: use vfs_iter_read/write fs: implement vfs_iter_write using do_iter_write fs: implement vfs_iter_read using do_iter_read fs: move more code into do_iter_read/do_iter_write fs: remove __do_readv_writev fs: remove do_compat_readv_writev fs: remove do_readv_writev	2017-07-05 14:35:57 -07:00
Linus Torvalds	5518b69b76	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Reasonably busy this cycle, but perhaps not as busy as in the 4.12 merge window: 1) Several optimizations for UDP processing under high load from Paolo Abeni. 2) Support pacing internally in TCP when using the sch_fq packet scheduler for this is not practical. From Eric Dumazet. 3) Support mutliple filter chains per qdisc, from Jiri Pirko. 4) Move to 1ms TCP timestamp clock, from Eric Dumazet. 5) Add batch dequeueing to vhost_net, from Jason Wang. 6) Flesh out more completely SCTP checksum offload support, from Davide Caratti. 7) More plumbing of extended netlink ACKs, from David Ahern, Pablo Neira Ayuso, and Matthias Schiffer. 8) Add devlink support to nfp driver, from Simon Horman. 9) Add RTM_F_FIB_MATCH flag to RTM_GETROUTE queries, from Roopa Prabhu. 10) Add stack depth tracking to BPF verifier and use this information in the various eBPF JITs. From Alexei Starovoitov. 11) Support XDP on qed device VFs, from Yuval Mintz. 12) Introduce BPF PROG ID for better introspection of installed BPF programs. From Martin KaFai Lau. 13) Add bpf_set_hash helper for TC bpf programs, from Daniel Borkmann. 14) For loads, allow narrower accesses in bpf verifier checking, from Yonghong Song. 15) Support MIPS in the BPF selftests and samples infrastructure, the MIPS eBPF JIT will be merged in via the MIPS GIT tree. From David Daney. 16) Support kernel based TLS, from Dave Watson and others. 17) Remove completely DST garbage collection, from Wei Wang. 18) Allow installing TCP MD5 rules using prefixes, from Ivan Delalande. 19) Add XDP support to Intel i40e driver, from Björn Töpel 20) Add support for TC flower offload in nfp driver, from Simon Horman, Pieter Jansen van Vuuren, Benjamin LaHaise, Jakub Kicinski, and Bert van Leeuwen. 21) IPSEC offloading support in mlx5, from Ilan Tayari. 22) Add HW PTP support to macb driver, from Rafal Ozieblo. 23) Networking refcount_t conversions, From Elena Reshetova. 24) Add sock_ops support to BPF, from Lawrence Brako. This is useful for tuning the TCP sockopt settings of a group of applications, currently via CGROUPs" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1899 commits) net: phy: dp83867: add workaround for incorrect RX_CTRL pin strap dt-bindings: phy: dp83867: provide a workaround for incorrect RX_CTRL pin strap cxgb4: Support for get_ts_info ethtool method cxgb4: Add PTP Hardware Clock (PHC) support cxgb4: time stamping interface for PTP nfp: default to chained metadata prepend format nfp: remove legacy MAC address lookup nfp: improve order of interfaces in breakout mode net: macb: remove extraneous return when MACB_EXT_DESC is defined bpf: add missing break in for the TCP_BPF_SNDCWND_CLAMP case bpf: fix return in load_bpf_file mpls: fix rtm policy in mpls_getroute net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t net, ax25: convert ax25_route.refcount from atomic_t to refcount_t net, ax25: convert ax25_uid_assoc.refcount from atomic_t to refcount_t net, sctp: convert sctp_ep_common.refcnt from atomic_t to refcount_t net, sctp: convert sctp_transport.refcnt from atomic_t to refcount_t net, sctp: convert sctp_chunk.refcnt from atomic_t to refcount_t net, sctp: convert sctp_datamsg.refcnt from atomic_t to refcount_t net, sctp: convert sctp_auth_bytes.refcnt from atomic_t to refcount_t ...	2017-07-05 12:31:59 -07:00
Dmitry Monakhov	128b6f9fdd	t10-pi: Move opencoded contants to common header Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-07-03 16:56:25 -06:00
Linus Torvalds	c6b1e36c8f	Merge branch 'for-4.13/block' of git://git.kernel.dk/linux-block Pull core block/IO updates from Jens Axboe: "This is the main pull request for the block layer for 4.13. Not a huge round in terms of features, but there's a lot of churn related to some core cleanups. Note this depends on the UUID tree pull request, that Christoph already sent out. This pull request contains: - A series from Christoph, unifying the error/stats codes in the block layer. We now use blk_status_t everywhere, instead of using different schemes for different places. - Also from Christoph, some cleanups around request allocation and IO scheduler interactions in blk-mq. - And yet another series from Christoph, cleaning up how we handle and do bounce buffering in the block layer. - A blk-mq debugfs series from Bart, further improving on the support we have for exporting internal information to aid debugging IO hangs or stalls. - Also from Bart, a series that cleans up the request initialization differences across types of devices. - A series from Goldwyn Rodrigues, allowing the block layer to return failure if we will block and the user asked for non-blocking. - Patch from Hannes for supporting setting loop devices block size to that of the underlying device. - Two series of patches from Javier, fixing various issues with lightnvm, particular around pblk. - A series from me, adding support for write hints. This comes with NVMe support as well, so applications can help guide data placement on flash to improve performance, latencies, and write amplification. - A series from Ming, improving and hardening blk-mq support for stopping/starting and quiescing hardware queues. - Two pull requests for NVMe updates. Nothing major on the feature side, but lots of cleanups and bug fixes. From the usual crew. - A series from Neil Brown, greatly improving the bio rescue set support. Most notably, this kills the bio rescue work queues, if we don't really need them. - Lots of other little bug fixes that are all over the place" * 'for-4.13/block' of git://git.kernel.dk/linux-block: (217 commits) lightnvm: pblk: set line bitmap check under debug lightnvm: pblk: verify that cache read is still valid lightnvm: pblk: add initialization check lightnvm: pblk: remove target using async. I/Os lightnvm: pblk: use vmalloc for GC data buffer lightnvm: pblk: use right metadata buffer for recovery lightnvm: pblk: schedule if data is not ready lightnvm: pblk: remove unused return variable lightnvm: pblk: fix double-free on pblk init lightnvm: pblk: fix bad le64 assignations nvme: Makefile: remove dead build rule blk-mq: map all HWQ also in hyperthreaded system nvmet-rdma: register ib_client to not deadlock in device removal nvme_fc: fix error recovery on link down. nvmet_fc: fix crashes on bad opcodes nvme_fc: Fix crash when nvme controller connection fails. nvme_fc: replace ioabort msleep loop with completion nvme_fc: fix double calls to nvme_cleanup_cmd() nvme-fabrics: verify that a controller returns the correct NQN nvme: simplify nvme_dev_attrs_are_visible ...	2017-07-03 10:34:51 -07:00
David S. Miller	b079115937	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net A set of overlapping changes in macvlan and the rocker driver, nothing serious. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-30 12:43:08 -04:00
Christoph Hellwig	abbb65899a	fs: implement vfs_iter_write using do_iter_write De-dupliate some code and allow for passing the flags argument to vfs_iter_write. Additionally it now properly updates timestamps. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-29 17:49:23 -04:00
Christoph Hellwig	18e9710ee5	fs: implement vfs_iter_read using do_iter_read De-dupliate some code and allow for passing the flags argument to vfs_iter_read. Additional it properly updates atime now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-29 17:49:23 -04:00
Bart Van Assche	ca18d6f769	block: Make most scsi_req_init() calls implicit Instead of explicitly calling scsi_req_init() after blk_get_request(), call that function from inside blk_get_request(). Add an .initialize_rq_fn() callback function to the block drivers that need it. Merge the IDE .init_rq_fn() function into .initialize_rq_fn() because it is too small to keep it as a separate function. Keep the scsi_req_init() call in ide_prep_sense() because it follows a blk_rq_init() call. References: commit `82ed4db499` ("block: split scsi_request out of struct request") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-06-20 19:27:14 -06:00
yuan linyu	de77b966ce	net: introduce __skb_put_[zero, data, u8] follow Johannes Berg, semantic patch file as below, @@ identifier p, p2; expression len; expression skb; type t, t2; @@ ( -p = __skb_put(skb, len); +p = __skb_put_zero(skb, len); \| -p = (t)__skb_put(skb, len); +p = __skb_put_zero(skb, len); ) ... when != p ( p2 = (t2)p; -memset(p2, 0, len); \| -memset(p, 0, len); ) @@ identifier p; expression len; expression skb; type t; @@ ( -t p = __skb_put(skb, len); +t p = __skb_put_zero(skb, len); ) ... when != p ( -memset(p, 0, len); ) @@ type t, t2; identifier p, p2; expression skb; @@ t p; ... ( -p = __skb_put(skb, sizeof(t)); +p = __skb_put_zero(skb, sizeof(t)); \| -p = (t )__skb_put(skb, sizeof(t)); +p = __skb_put_zero(skb, sizeof(t)); ) ... when != p ( p2 = (t2)p; -memset(p2, 0, sizeof(p)); \| -memset(p, 0, sizeof(p)); ) @@ expression skb, len; @@ -memset(__skb_put(skb, len), 0, len); +__skb_put_zero(skb, len); @@ expression skb, len, data; @@ -memcpy(__skb_put(skb, len), data, len); +__skb_put_data(skb, data, len); @@ expression SKB, C, S; typedef u8; identifier fn = {__skb_put}; fresh identifier fn2 = fn ## "_u8"; @@ - (u8 )fn(SKB, S) = C; + fn2(SKB, C); Signed-off-by: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-20 13:30:14 -04:00
Jason A. Donenfeld	6787ab81b2	iscsi: ensure RNG is seeded before use It's not safe to use weak random data here, especially for the challenge response randomness. Since we're always in process context, it's safe to simply wait until we have enough randomness to carry out the authentication correctly. While we're at it, we clean up a small memleak during an error condition. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org> Cc: Lee Duncan <lduncan@suse.com> Cc: Chris Leech <cleech@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2017-06-19 22:06:28 -04:00
NeilBrown	011067b056	blk: replace bioset_create_nobvec() with a flags arg to bioset_create() "flags" arguments are often seen as good API design as they allow easy extensibility. bioset_create_nobvec() is implemented internally as a variation in flags passed to __bioset_create(). To support future extension, make the internal structure part of the API. i.e. add a 'flags' argument to bioset_create() and discard bioset_create_nobvec(). Note that the bio_split allocations in drivers/md/raid* do not need the bvec mempool - they should have used bioset_create_nobvec(). Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-06-18 12:40:59 -06:00
Johannes Berg	d58ff35122	networking: make skb_push & __skb_push return void pointers It seems like a historic accident that these return unsigned char , and in many places that means casts are required, more often than not. Make these functions return void and remove all the casts across the tree, adding a (u8 ) cast only where the unsigned char pointer was used directly, all done with the following spatch: @@ expression SKB, LEN; typedef u8; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; @@ - (fn(SKB, LEN)) + (u8 )fn(SKB, LEN) @@ expression E, SKB, LEN; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; type T; @@ - E = ((T )(fn(SKB, LEN))) + E = fn(SKB, LEN) @@ expression SKB, LEN; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; @@ - fn(SKB, LEN)[0] + (u8 )fn(SKB, LEN) Note that the last part there converts from push(...)[0] to the more idiomatic (u8 *)push(...). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-16 11:48:40 -04:00
Johannes Berg	4df864c1d9	networking: make skb_put & friends return void pointers It seems like a historic accident that these return unsigned char , and in many places that means casts are required, more often than not. Make these functions (skb_put, __skb_put and pskb_put) return void and remove all the casts across the tree, adding a (u8 ) cast only where the unsigned char pointer was used directly, all done with the following spatch: @@ expression SKB, LEN; typedef u8; identifier fn = { skb_put, __skb_put }; @@ - (fn(SKB, LEN)) + (u8 )fn(SKB, LEN) @@ expression E, SKB, LEN; identifier fn = { skb_put, __skb_put }; type T; @@ - E = ((T *)(fn(SKB, LEN))) + E = fn(SKB, LEN) which actually doesn't cover pskb_put since there are only three users overall. A handful of stragglers were converted manually, notably a macro in drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many instances in net/bluetooth/hci_sock.c. In the former file, I also had to fix one whitespace problem spatch introduced. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-16 11:48:39 -04:00
Jens Axboe	8f66439eec	Linux 4.12-rc5 -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJZPdbLAAoJEHm+PkMAQRiGx4wH/1nCjfnl6fE8oJ24/1gEAOUh biFdqJkYZmlLYHVtYfLm4Ueg4adJdg0wx6qM/4RaAzmQVvLfDV34bc1qBf1+P95G kVF+osWyXrZo5cTwkwapHW/KNu4VJwAx2D1wrlxKDVG5AOrULH1pYOYGOpApEkZU 4N+q5+M0ce0GJpqtUZX+UnI33ygjdDbBxXoFKsr24B7eA0ouGbAJ7dC88WcaETL+ 2/7tT01SvDMo0jBSV0WIqlgXwZ5gp3yPGnklC3F4159Yze6VFrzHMKS/UpPF8o8E W9EbuzwxsKyXUifX2GY348L1f+47glen/1sedbuKnFhP6E9aqUQQJXvEO7ueQl4= =m2Gx -----END PGP SIGNATURE----- Merge tag 'v4.12-rc5' into for-4.13/block We've already got a few conflicts and upcoming work depends on some of the changes that have gone into mainline as regression fixes for this series. Pull in 4.12-rc5 to resolve these conflicts and make it easier on down stream trees to continue working on 4.13 changes. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-06-12 08:30:13 -06:00
Christoph Hellwig	4e4cbee93d	block: switch bios to blk_status_t Replace bi_error with a new bi_status to allow for a clear conversion. Note that device mapper overloaded bi_error with a private value, which we'll have to keep arround at least for now and thus propagate to a proper blk_status_t value. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-06-09 09:27:32 -06:00
Christoph Hellwig	2a842acab1	block: introduce new block status code type Currently we use nornal Linux errno values in the block layer, and while we accept any error a few have overloaded magic meanings. This patch instead introduces a new blk_status_t value that holds block layer specific status codes and explicitly explains their meaning. Helpers to convert from and to the previous special meanings are provided for now, but I suspect we want to get rid of them in the long run - those drivers that have a errno input (e.g. networking) usually get errnos that don't know about the special block layer overloads, and similarly returning them to userspace will usually return somethings that strictly speaking isn't correct for file system operations, but that's left as an exercise for later. For now the set of errors is a very limited set that closely corresponds to the previous overloaded errno values, but there is some low hanging fruite to improve it. blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse typechecking, so that we can easily catch places passing the wrong values. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-06-09 09:27:32 -06:00
Nicholas Bellinger	eceb4459df	iscsi-target: Avoid holding ->tpg_state_lock during param update As originally reported by Jia-Ju, iscsit_tpg_enable_portal_group() holds iscsi_portal_group->tpg_state_lock while updating AUTHMETHOD via iscsi_update_param_value(), which performs a GFP_KERNEL allocation. However, since iscsit_tpg_enable_portal_group() is already protected by iscsit_get_tpg() -> iscsi_portal_group->tpg_access_lock in it's parent caller, ->tpg_state_lock only needs to be held when setting TPG_STATE_ACTIVE. Reported-by: Jia-Ju Bai <baijiaju1990@163.com> Reviewed-by: Jia-Ju Bai <baijiaju1990@163.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-06-08 23:26:38 -07:00
Nicholas Bellinger	9ae0e9ade5	target/configfs: Kill se_lun->lun_link_magic Instead of using a hardcoded magic value in se_lun when verifying a target config_item symlink source during target_fabric_mappedlun_link(), go ahead and use target_fabric_port_item_ops directly instead. Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-06-08 23:26:38 -07:00
Nicholas Bellinger	c17cd24959	target/configfs: Kill se_device->dev_link_magic Instead of using a hardcoded magic value in se_device when verifying a target config_item symlink source during target_fabric_port_link(), go ahead and use target_core_dev_item_ops directly instead. Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-06-08 23:26:38 -07:00
Nicholas Bellinger	2237498f0b	target/iblock: Convert WRITE_SAME to blkdev_issue_zeroout The people who are actively using iblock_execute_write_same_direct() are doing so in the context of ESX VAAI BlockZero, together with EXTENDED_COPY and COMPARE_AND_WRITE primitives. In practice though I've not seen any users of IBLOCK WRITE_SAME for anything other than VAAI BlockZero, so just using blkdev_issue_zeroout() when available, and falling back to iblock_execute_write_same() if the WRITE_SAME buffer contains anything other than zeros should be OK. (Hook up max_write_zeroes_sectors to signal LBPRZ feature bit in target_configure_unmap_from_queue - nab) Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-06-08 23:26:37 -07:00
Gustavo A. R. Silva	fb418240ec	target: remove dead code Local variable _ret_ is assigned to a constant value and it is never updated again. Remove this variable and the dead code it guards. Addresses-Coverity-ID: 140761 Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-06-08 23:26:37 -07:00
Nicholas Bellinger	abb85a9b51	iscsi-target: Reject immediate data underflow larger than SCSI transfer length When iscsi WRITE underflow occurs there are two different scenarios that can happen. Normally in practice, when an EDTL vs. SCSI CDB TRANSFER LENGTH underflow is detected, the iscsi immediate data payload is the smaller SCSI CDB TRANSFER LENGTH. That is, when a host fabric LLD is using a fixed size EDTL for a specific control CDB, the SCSI CDB TRANSFER LENGTH and actual SCSI payload ends up being smaller than EDTL. In iscsi, this means the received iscsi immediate data payload matches the smaller SCSI CDB TRANSFER LENGTH, because there is no more SCSI payload to accept beyond SCSI CDB TRANSFER LENGTH. However, it's possible for a malicous host to send a WRITE underflow where EDTL is larger than SCSI CDB TRANSFER LENGTH, but incoming iscsi immediate data actually matches EDTL. In the wild, we've never had a iscsi host environment actually try to do this. For this special case, it's wrong to truncate part of the control CDB payload and continue to process the command during underflow when immediate data payload received was larger than SCSI CDB TRANSFER LENGTH, so go ahead and reject and drop the bogus payload as a defensive action. Note this potential bug was originally relaxed by the following for allowing WRITE underflow in MSFT FCP host environments: commit `c72c525022` Author: Roland Dreier <roland@purestorage.com> Date: Wed Jul 22 15:08:18 2015 -0700 target: allow underflow/overflow for PR OUT etc. commands Cc: Roland Dreier <roland@purestorage.com> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: <stable@vger.kernel.org> # v4.3+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-06-08 22:25:29 -07:00
Nicholas Bellinger	105fa2f44e	iscsi-target: Fix delayed logout processing greater than SECONDS_FOR_LOGOUT_COMP This patch fixes a BUG() in iscsit_close_session() that could be triggered when iscsit_logout_post_handler() execution from within tx thread context was not run for more than SECONDS_FOR_LOGOUT_COMP (15 seconds), and the TCP connection didn't already close before then forcing tx thread context to automatically exit. This would manifest itself during explicit logout as: [33206.974254] 1 connection(s) still exist for iSCSI session to iqn.1993-08.org.debian:01:3f5523242179 [33206.980184] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 2100.772 msecs [33209.078643] ------------[ cut here ]------------ [33209.078646] kernel BUG at drivers/target/iscsi/iscsi_target.c:4346! Normally when explicit logout attempt fails, the tx thread context exits and iscsit_close_connection() from rx thread context does the extra cleanup once it detects conn->conn_logout_remove has not been cleared by the logout type specific post handlers. To address this special case, if the logout post handler in tx thread context detects conn->tx_thread_active has already been cleared, simply return and exit in order for existing iscsit_close_connection() logic from rx thread context do failed logout cleanup. Reported-by: Bart Van Assche <bart.vanassche@sandisk.com> Tested-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.14+ Tested-by: Gary Guo <ghg@datera.io> Tested-by: Chu Yuan Lin <cyl@datera.io> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-06-08 22:25:14 -07:00
Nicholas Bellinger	73d4e580cc	target: Fix kref->refcount underflow in transport_cmd_finish_abort This patch fixes a se_cmd->cmd_kref underflow during CMD_T_ABORTED when a fabric driver drops it's second reference from below the target_core_tmr.c based callers of transport_cmd_finish_abort(). Recently with the conversion of kref to refcount_t, this bug was manifesting itself as: [705519.601034] refcount_t: underflow; use-after-free. [705519.604034] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 20116.512 msecs [705539.719111] ------------[ cut here ]------------ [705539.719117] WARNING: CPU: 3 PID: 26510 at lib/refcount.c:184 refcount_sub_and_test+0x33/0x51 Since the original kref atomic_t based kref_put() didn't check for underflow and only invoked the final callback when zero was reached, this bug did not manifest in practice since all se_cmd memory is using preallocated tags. To address this, go ahead and propigate the existing return from transport_put_cmd() up via transport_cmd_finish_abort(), and change transport_cmd_finish_abort() + core_tmr_handle_tas_abort() callers to only do their local target_put_sess_cmd() if necessary. Reported-by: Bart Van Assche <bart.vanassche@sandisk.com> Tested-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.14+ Tested-by: Gary Guo <ghg@datera.io> Tested-by: Chu Yuan Lin <cyl@datera.io> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-06-08 22:24:18 -07:00
Ganesh Goudar	1dec4cec9f	cxgb4: Fix tids count for ipv6 offload connection the adapter consumes two tids for every ipv6 offload connection be it active or passive, calculate tid usage count accordingly. Also change the signatures of relevant functions to get the address family. Signed-off-by: Rizwan Ansari <rizwana@chelsio.com> Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-07 15:14:34 -04:00
Jiang Yi	5e0cf5e6c4	iscsi-target: Always wait for kthread_should_stop() before kthread exit There are three timing problems in the kthread usages of iscsi_target_mod: - np_thread of struct iscsi_np - rx_thread and tx_thread of struct iscsi_conn In iscsit_close_connection(), it calls send_sig(SIGINT, conn->tx_thread, 1); kthread_stop(conn->tx_thread); In conn->tx_thread, which is iscsi_target_tx_thread(), when it receive SIGINT the kthread will exit without checking the return value of kthread_should_stop(). So if iscsi_target_tx_thread() exit right between send_sig(SIGINT...) and kthread_stop(...), the kthread_stop() will try to stop an already stopped kthread. This is invalid according to the documentation of kthread_stop(). (Fix -ECONNRESET logout handling in iscsi_target_tx_thread and early iscsi_target_rx_thread failure case - nab) Signed-off-by: Jiang Yi <jiangyilism@gmail.com> Cc: <stable@vger.kernel.org> # v3.12+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-31 15:12:57 -07:00
Nicholas Bellinger	25cdda95fd	iscsi-target: Fix initial login PDU asynchronous socket close OOPs This patch fixes a OOPs originally introduced by: commit `bb048357da` Author: Nicholas Bellinger <nab@linux-iscsi.org> Date: Thu Sep 5 14:54:04 2013 -0700 iscsi-target: Add sk->sk_state_change to cleanup after TCP failure which would trigger a NULL pointer dereference when a TCP connection was closed asynchronously via iscsi_target_sk_state_change(), but only when the initial PDU processing in iscsi_target_do_login() from iscsi_np process context was blocked waiting for backend I/O to complete. To address this issue, this patch makes the following changes. First, it introduces some common helper functions used for checking socket closing state, checking login_flags, and atomically checking socket closing state + setting login_flags. Second, it introduces a LOGIN_FLAGS_INITIAL_PDU bit to know when a TCP connection has dropped via iscsi_target_sk_state_change(), but the initial PDU processing within iscsi_target_do_login() in iscsi_np context is still running. For this case, it sets LOGIN_FLAGS_CLOSED, but doesn't invoke schedule_delayed_work(). The original NULL pointer dereference case reported by MNC is now handled by iscsi_target_do_login() doing a iscsi_target_sk_check_close() before transitioning to FFP to determine when the socket has already closed, or iscsi_target_start_negotiation() if the login needs to exchange more PDUs (eg: iscsi_target_do_login returned 0) but the socket has closed. For both of these cases, the cleanup up of remaining connection resources will occur in iscsi_target_start_negotiation() from iscsi_np process context once the failure is detected. Finally, to handle to case where iscsi_target_sk_state_change() is called after the initial PDU procesing is complete, it now invokes conn->login_work -> iscsi_target_do_login_rx() to perform cleanup once existing iscsi_target_sk_check_close() checks detect connection failure. For this case, the cleanup of remaining connection resources will occur in iscsi_target_do_login_rx() from delayed workqueue process context once the failure is detected. Reported-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Tested-by: Mike Christie <mchristi@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Reported-by: Hannes Reinecke <hare@suse.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Varun Prakash <varun@chelsio.com> Cc: <stable@vger.kernel.org> # v3.12+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-31 15:12:31 -07:00
Mike Christie	f3cdbe39b2	tcmu: fix crash during device removal We currently do tcmu_free_device ->tcmu_netlink_event(TCMU_CMD_REMOVED_DEVICE) -> uio_unregister_device -> kfree(tcmu_dev). The problem is that the kernel does not wait for userspace to do the close() on the uio device before freeing the tcmu_dev. We can then hit a race where the kernel frees the tcmu_dev before userspace does close() and so when close() -> release -> tcmu_release is done, we try to access a freed tcmu_dev. This patch made over the target-pending master branch moves the freeing of the tcmu_dev to when the last reference has been dropped. This also fixes a leak where if tcmu_configure_device was not called on a device we did not free udev->name which was allocated at tcmu_alloc_device time. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-23 19:50:49 -07:00
Nicholas Bellinger	4ff83daa02	target: Re-add check to reject control WRITEs with overflow data During v4.3 when the overflow/underflow check was relaxed by commit c72c525022: commit `c72c525022` Author: Roland Dreier <roland@purestorage.com> Date: Wed Jul 22 15:08:18 2015 -0700 target: allow underflow/overflow for PR OUT etc. commands to allow underflow/overflow for Windows compliance + FCP, a consequence was to allow control CDBs to process overflow data for iscsi-target with immediate data as well. As per Roland's original change, continue to allow underflow cases for control CDBs to make Windows compliance + FCP happy, but until overflow for control CDBs is supported tree-wide, explicitly reject all control WRITEs with overflow following pre v4.3.y logic. Reported-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Roland Dreier <roland@purestorage.com> Cc: <stable@vger.kernel.org> # v4.3+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-15 20:20:29 -07:00
Linus Torvalds	bd1286f964	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "Things were a lot more calm than previously expected. It's primarily fixes in various areas, with most of the new functionality centering around TCMU backend driver work that Xiubo Li has been driving. Here's the summary on the feature side: - Make T10-PI verify configurable for emulated (FILEIO + RD) backends (Dmitry Monakhov) - Allow target-core/TCMU pass-through to use in-kernel SPC-PR logic (Bryant Ly + MNC) - Add TCMU support for growing ring buffer size (Xiubo Li + MNC) - Add TCMU support for global block data pool (Xiubo Li + MNC) and on the bug-fix side: - Fix COMPARE_AND_WRITE non GOOD status handling for READ phase failures (Gary Guo + nab) - Fix iscsi-target hang with explicitly changing per NodeACL CmdSN number depth with concurrent login driven session reinstatement. (Gary Guo + nab) - Fix ibmvscsis fabric driver ABORT task handling (Bryant Ly) - Fix target-core/FILEIO zero length handling (Bart Van Assche) Also, there was an OOPs introduced with the WRITE_VERIFY changes that I ended up reverting at the last minute, because as not unusual Bart and I could not agree on the fix in time for -rc1. Since it's specific to a conformance test, it's been reverted for now. There is a separate patch in the queue to address the underlying control CDB write overflow regression in >= v4.3 separate from the WRITE_VERIFY revert here, that will be pushed post -rc1" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (30 commits) Revert "target: Fix VERIFY and WRITE VERIFY command parsing" IB/srpt: Avoid that aborting a command triggers a kernel warning IB/srpt: Fix abort handling target/fileio: Fix zero-length READ and WRITE handling ibmvscsis: Do not send aborted task response tcmu: fix module removal due to stuck thread target: Don't force session reset if queue_depth does not change iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement target: Fix compare_and_write_callback handling for non GOOD status tcmu: Recalculate the tcmu_cmd size to save cmd area memories tcmu: Add global data block pool support tcmu: Add dynamic growing data area feature support target: fixup error message in target_tg_pt_gp_tg_pt_gp_id_store() target: fixup error message in target_tg_pt_gp_alua_access_type_store() target/user: PGR Support target: Add WRITE_VERIFY_16 Documentation/target: add an example script to configure an iSCSI target target: Use kmalloc_array() in transport_kmap_data_sg() target: Use kmalloc_array() in compare_and_write_callback() target: Improve size determinations in two functions ...	2017-05-12 11:44:13 -07:00
Nicholas Bellinger	984a9d4c40	Revert "target: Fix VERIFY and WRITE VERIFY command parsing" This reverts commit `0e2eb7d12e` Author: Bart Van Assche <bart.vanassche@sandisk.com> Date: Thu Mar 30 10:12:39 2017 -0700 target: Fix VERIFY and WRITE VERIFY command parsing This patch broke existing behaviour for WRITE_VERIFY because it dropped the original SCF_SCSI_DATA_CDB assignment for bytchk = 0 so target_cmd_size_check() no longer rejected this case, allowing an overflow case to trigger an OOPs in iscsi-target. Since the short term and long term fixes are still being discussed, revert it for now since it's late in the merge window and try again in v4.13-rc1. Conflicts: drivers/target/target_core_sbc.c Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-11 01:01:05 -07:00
Bart Van Assche	59ac9c0781	target/fileio: Fix zero-length READ and WRITE handling This patch fixes zero-length READ and WRITE handling in target/FILEIO, which was broken a long time back by: Since: commit `d81cb44726` Author: Paolo Bonzini <pbonzini@redhat.com> Date: Mon Sep 17 16:36:11 2012 -0700 target: go through normal processing for all zero-length commands which moved zero-length READ and WRITE completion out of target-core, to doing submission into backend driver code. To address this, go ahead and invoke target_complete_cmd() for any non negative return value in fd_do_rw(). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: David Disseldorp <ddiss@suse.de> Cc: <stable@vger.kernel.org> # v3.7+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-07 16:05:16 -07:00
Mike Christie	d906d8af28	tcmu: fix module removal due to stuck thread We need to do a kthread_should_stop to check when kthread_stop has been called. This was a regression added in `b6df4b79a5` tcmu: Add global data block pool support so not sure if you wanted to merge it in with that patch or what. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-04 20:01:41 -07:00
Nicholas Bellinger	46861cdd80	target: Don't force session reset if queue_depth does not change Keeping in the idempotent nature of target_core_fabric_configfs.c, if a queue_depth value is set and it's the same as the existing value, don't attempt to force session reinstatement. Reported-by: Raghu Krishnamurthy <rk@datera.io> Cc: Raghu Krishnamurthy <rk@datera.io> Tested-by: Gary Guo <ghg@datera.io> Cc: Gary Guo <ghg@datera.io> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-04 20:01:40 -07:00
Nicholas Bellinger	197b806ae5	iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement While testing modification of per se_node_acl queue_depth forcing session reinstatement via lio_target_nacl_cmdsn_depth_store() -> core_tpg_set_initiator_node_queue_depth(), a hung task bug triggered when changing cmdsn_depth invoked session reinstatement while an iscsi login was already waiting for session reinstatement to complete. This can happen when an outstanding se_cmd descriptor is taking a long time to complete, and session reinstatement from iscsi login or cmdsn_depth change occurs concurrently. To address this bug, explicitly set session_fall_back_to_erl0 = 1 when forcing session reinstatement, so session reinstatement is not attempted if an active session is already being shutdown. This patch has been tested with two scenarios. The first when iscsi login is blocked waiting for iscsi session reinstatement to complete followed by queue_depth change via configfs, and second when queue_depth change via configfs us blocked followed by a iscsi login driven session reinstatement. Note this patch depends on commit `d36ad77f70` to handle multiple sessions per se_node_acl when changing cmdsn_depth, and for pre v4.5 kernels will need to be included for stable as well. Reported-by: Gary Guo <ghg@datera.io> Tested-by: Gary Guo <ghg@datera.io> Cc: Gary Guo <ghg@datera.io> Cc: <stable@vger.kernel.org> # v4.1+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-04 20:01:39 -07:00
Nicholas Bellinger	a71a5dc7f8	target: Fix compare_and_write_callback handling for non GOOD status Following the bugfix for handling non SAM_STAT_GOOD COMPARE_AND_WRITE status during COMMIT phase in commit `9b2792c3da`, the same bug exists for the READ phase as well. This would manifest first as a lost SCSI response, and eventual hung task during fabric driver logout or re-login, as existing shutdown logic waited for the COMPARE_AND_WRITE se_cmd->cmd_kref to reach zero. To address this bug, compare_and_write_callback() has been changed to set post_ret = 1 and return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE as necessary to signal failure status. Reported-by: Bill Borsari <wgb@datera.io> Cc: Bill Borsari <wgb@datera.io> Tested-by: Gary Guo <ghg@datera.io> Cc: Gary Guo <ghg@datera.io> Cc: <stable@vger.kernel.org> # v4.1+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-04 20:01:39 -07:00
Xiubo Li	fe25cc3479	tcmu: Recalculate the tcmu_cmd size to save cmd area memories For the "struct tcmu_cmd_entry" in cmd area, the minimum size will be sizeof(struct tcmu_cmd_entry) == 112 Bytes. And it could fill about (sizeof(struct rsp) - sizeof(struct req)) / sizeof(struct iovec) == 68 / 16 ~= 4 data regions(iov[4]) by default. For most tcmu_cmds, the data block indexes allocated from the data area will be continuous. And for the continuous blocks they will be merged into the same region using only one iovec. For the current code, it will always allocates the same number of iovecs with blocks for each tcmu_cmd, and it will wastes much memories. For example, when the block size is 4K and the DATA_OUT buffer size is 64K, and the regions needed is less than 5(on my environment is almost 99.7%). The current code will allocate about 16 iovecs, and there will be (16 - 4) * sizeof(struct iovec) = 192 Bytes cmd area memories wasted. Here adds two helpers to calculate the base size and full size of the tcmu_cmd. And will recalculate them again when it make sure how many iovs is needed before insert it to cmd area. Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Acked-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-02 20:39:38 -07:00
Xiubo Li	b6df4b79a5	tcmu: Add global data block pool support For each target there will be one ring, when the target number grows larger and larger, it could eventually runs out of the system memories. In this patch for each target ring, currently for the cmd area the size will be fixed to 8MB and for the data area the size will grow from 0 to max 256K * PAGE_SIZE(1G for 4K page size). For all the targets' data areas, they will get empty blocks from the "global data block pool", which has limited to 512K * PAGE_SIZE(2G for 4K page size) for now. When the "global data block pool" has been used up, then any target could wake up the unmap thread routine to shrink other targets' data area memories. And the unmap thread routine will always try to truncate the ring vma from the last using block offset. When user space has touched the data blocks out of tcmu_cmd iov[], the tcmu_page_fault() will try to return one zeroed blocks. Here we move the timeout's tcmu_handle_completions() into unmap thread routine, that's to say when the timeout fired, it will only do the tcmu_check_expired_cmd() and then wake up the unmap thread to do the completions() and then try to shrink its idle memories. Then the cmdr_lock could be a mutex and could simplify this patch because the unmap_mapping_range() or zap_* may go to sleep. Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Signed-off-by: Jianfei Hu <hujianfei@cmss.chinamobile.com> Acked-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:58:04 -07:00
Xiubo Li	141685a391	tcmu: Add dynamic growing data area feature support Currently for the TCMU, the ring buffer size is fixed to 64K cmd area + 1M data area, and this will be bottlenecks for high iops. The struct tcmu_cmd_entry {} size is fixed about 112 bytes with iovec[N] & N <= 4, and the size of struct iovec is about 16 bytes. If N == 0, the ratio will be sizeof(cmd entry) : sizeof(datas) == 112Bytes : (N * 4096)Bytes = 28 : 0, no data area is need. If 0 < N <=4, the ratio will be sizeof(cmd entry) : sizeof(datas) == 112Bytes : (N * 4096)Bytes = 28 : (N * 1024), so the max will be 28 : 1024. If N > 4, the sizeof(cmd entry) will be [(N - 4) 16 + 112] bytes, and its corresponding data size will be [N 4096], so the ratio of sizeof(cmd entry) : sizeof(datas) == [(N - 4) * 16 + 112)Bytes : (N * 4096)Bytes == 4/1024 - 12/(N * 1024), so the max is about 4 : 1024. When N is bigger, the ratio will be smaller. As the initial patch, we will set the cmd area size to 2M, and the cmd area size to 32M. The TCMU will dynamically grows the data area from 0 to max 32M size as needed. The cmd area memory will be allocated through vmalloc(), and the data area's blocks will be allocated individually later when needed. The allocated data area block memory will be managed via radix tree. For now the bitmap still be the most efficient way to search and manage the block index, this could be update later. Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Signed-off-by: Jianfei Hu <hujianfei@cmss.chinamobile.com> Acked-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:57:36 -07:00
Hannes Reinecke	3d035237a5	target: fixup error message in target_tg_pt_gp_tg_pt_gp_id_store() When setting up an ALUA target port group with an invalid ID the error message kstrtoul() returned -22 for tg_pt_gp_id is displayed, which is not really helpful. Convert it to something sane. And while we're at it, join the messages onto a single line. Signed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart van Assche <bart.vanassche@sandisk.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:21:53 -07:00
Hannes Reinecke	c0dcafd8c5	target: fixup error message in target_tg_pt_gp_alua_access_type_store() When setting up a target the error message: Unable to do set ##_name ALUA state on non valid tg_pt_gp ID: 0 is displayed. Apparently concatenation doesn't work in a string; one should be using implicit string concatenation here. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart van Assche <bart.vanassche@sandisk.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:21:49 -07:00
Bryant G. Ly	4ec5bf0ea8	target/user: PGR Support This adds initial PGR support for just TCMU, since tcmu doesn't have the necessary IT_NEXUS info to process PGR in userspace, so have those commands be processed in kernel. HA support is not available yet, we will work on it if this patch is acceptable. Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:21:45 -07:00
Bryant G. Ly	c2d26f18dc	target: Add WRITE_VERIFY_16 This patch addresses clients who needs write_verify_16 for large volume groups such as AIX. Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:21:40 -07:00
Markus Elfring	df6751f340	target: Use kmalloc_array() in transport_kmap_data_sg() A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "kmalloc_array". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:21:33 -07:00
Markus Elfring	f318aef55f	target: Use kmalloc_array() in compare_and_write_callback() * A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "kmalloc_array". This issue was detected by using the Coccinelle software. * Replace the specification of a data structure by a pointer dereference to make the corresponding size determination a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:21:32 -07:00
Markus Elfring	5d68fb72d3	target: Improve size determinations in two functions Replace the specification of two data structures by pointer dereferences as the parameter for the operator "sizeof" to make the corresponding size determinations a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:21:30 -07:00
Markus Elfring	fbc4040bf6	target: Delete error messages for failed memory allocations The script "checkpatch.pl" pointed information out like the following. WARNING: Possible unnecessary 'out of memory' message Thus remove such statements here. Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:21:28 -07:00
Markus Elfring	55ec409202	target: Use kcalloc() in two functions * Multiplications for the size determination of memory allocations indicated that array data structures should be processed. Thus use the corresponding function "kcalloc". This issue was detected by using the Coccinelle software. * Replace the specification of data structures by pointer dereferences to make the corresponding size determination a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:21:27 -07:00
Markus Elfring	3829f38169	iscsi-target: Improve size determinations in four functions Replace the specification of four data structures by pointer dereferences as the parameter for the operator "sizeof" to make the corresponding size determinations a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:21:08 -07:00
Markus Elfring	c46e22f106	iscsi-target: Delete error messages for failed memory allocations The script "checkpatch.pl" pointed information out like the following. WARNING: Possible unnecessary 'out of memory' message Thus remove such statements here. Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:21:07 -07:00
Markus Elfring	f172511032	iscsi-target: Use kcalloc() in iscsit_allocate_iovecs() * A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "kcalloc". This issue was detected by using the Coccinelle software. * Replace the specification of a data structure by a pointer dereference to make the corresponding size determination a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:21:05 -07:00
Dmitry Monakhov	056e8924a0	tcm: make pi data verification configurable Currently ramdisk and fileio always perform PI verification before and after backend IO. This approach is not very flexible. Because some one may want to postpone this work to other layers in IO stack. For example if we want to test blk_integrity_profile testcase: `dee408c868` Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:20:58 -07:00
Dmitry Monakhov	f11b55d135	tcm_fileio: Prevent information leak for short reads If we failed to read data from backing file (probably because some one truncate file under us), we must zerofill cmd's data, otherwise it will be returned as is. Most likely cmd's data are unitialized pages from page cache. This result in information leak. (Change BUG_ON into -EINVAL se_cmd failure - nab) testcase: `e11a1b7b90` Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:20:57 -07:00
Bart Van Assche	0e2eb7d12e	target: Fix VERIFY and WRITE VERIFY command parsing Use the value of the BYTCHK field to determine the size of the Data-Out buffer. For VERIFY, honor the VRPROTECT, DPO and FUA fields. This patch avoids that LIO complains about a mismatch between the expected transfer length and the SCSI CDB length if the value of the BYTCHK field is 0. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Max Lohrmann <post@wickenrode.com> Cc: <stable@vger.kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:20:49 -07:00
Zhu Lingshan	d65e655706	target/pr: update PR out action code table This commit updated persistent revervation out service action code table in SPC-5 for development. Signed-off-by: Zhu Lingshan <lszhu@suse.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:20:44 -07:00
Elena Reshetova	5981c245a8	target/iblock: convert iblock_req.pending from atomic_t to refcount_t refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-05-01 22:20:43 -07:00
Linus Torvalds	694752922b	Merge branch 'for-4.12/block' of git://git.kernel.dk/linux-block Pull block layer updates from Jens Axboe: - Add BFQ IO scheduler under the new blk-mq scheduling framework. BFQ was initially a fork of CFQ, but subsequently changed to implement fairness based on B-WF2Q+, a modified variant of WF2Q. BFQ is meant to be used on desktop type single drives, providing good fairness. From Paolo. - Add Kyber IO scheduler. This is a full multiqueue aware scheduler, using a scalable token based algorithm that throttles IO based on live completion IO stats, similary to blk-wbt. From Omar. - A series from Jan, moving users to separately allocated backing devices. This continues the work of separating backing device life times, solving various problems with hot removal. - A series of updates for lightnvm, mostly from Javier. Includes a 'pblk' target that exposes an open channel SSD as a physical block device. - A series of fixes and improvements for nbd from Josef. - A series from Omar, removing queue sharing between devices on mostly legacy drivers. This helps us clean up other bits, if we know that a queue only has a single device backing. This has been overdue for more than a decade. - Fixes for the blk-stats, and improvements to unify the stats and user windows. This both improves blk-wbt, and enables other users to register a need to receive IO stats for a device. From Omar. - blk-throttle improvements from Shaohua. This provides a scalable framework for implementing scalable priotization - particularly for blk-mq, but applicable to any type of block device. The interface is marked experimental for now. - Bucketized IO stats for IO polling from Stephen Bates. This improves efficiency of polled workloads in the presence of mixed block size IO. - A few fixes for opal, from Scott. - A few pulls for NVMe, including a lot of fixes for NVMe-over-fabrics. From a variety of folks, mostly Sagi and James Smart. - A series from Bart, improving our exposed info and capabilities from the blk-mq debugfs support. - A series from Christoph, cleaning up how handle WRITE_ZEROES. - A series from Christoph, cleaning up the block layer handling of how we track errors in a request. On top of being a nice cleanup, it also shrinks the size of struct request a bit. - Removal of mg_disk and hd (sorry Linus) by Christoph. The former was never used by platforms, and the latter has outlived it's usefulness. - Various little bug fixes and cleanups from a wide variety of folks. * 'for-4.12/block' of git://git.kernel.dk/linux-block: (329 commits) block: hide badblocks attribute by default blk-mq: unify hctx delay_work and run_work block: add kblock_mod_delayed_work_on() blk-mq: unify hctx delayed_run_work and run_work nbd: fix use after free on module unload MAINTAINERS: bfq: Add Paolo as maintainer for the BFQ I/O scheduler blk-mq-sched: alloate reserved tags out of normal pool mtip32xx: use runtime tag to initialize command header scsi: Implement blk_mq_ops.show_rq() blk-mq: Add blk_mq_ops.show_rq() blk-mq: Show operation, cmd_flags and rq_flags names blk-mq: Make blk_flags_show() callers append a newline character blk-mq: Move the "state" debugfs attribute one level down blk-mq: Unregister debugfs attributes earlier blk-mq: Only unregister hctxs for which registration succeeded blk-mq-debugfs: Rename functions for registering and unregistering the mq directory blk-mq: Let blk_mq_debugfs_register() look up the queue name blk-mq: Register <dev>/queue/mq after having registered <dev>/queue ide-pm: always pass 0 error to ide_complete_rq in ide_do_devset ide-pm: always pass 0 error to __blk_end_request_all ..	2017-05-01 10:39:57 -07:00
Christoph Hellwig	17d5363b83	scsi: introduce a result field in struct scsi_request This passes on the scsi_cmnd result field to users of passthrough requests. Currently we abuse req->errors for this purpose, but that field will go away in its current form. Note that the old IDE code abuses the errors field in very creative ways and stores all kinds of different values in it. I didn't dare to touch this magic, so the abuses are brought forward 1:1. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-20 12:16:10 -06:00
Christoph Hellwig	48920ff2a5	block: remove the discard_zeroes_data flag Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can kill this hack. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-08 11:25:38 -06:00
Christoph Hellwig	64c7f1d157	block, scsi: move the retries field to struct scsi_request Instead of bloating the generic struct request with it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-05 12:05:08 -06:00
Sagi Grimberg	4459e04297	iscsi-target: use generic inet_pton_with_scope Instead of parsing address strings, use a generic helper. Acked-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-04 09:48:23 -06:00
Xiubo Li	a5d68ba858	tcmu: Skip Data-Out blocks before gathering Data-In buffer for BIDI case For the bidirectional case, the Data-Out buffer blocks will always at the head of the tcmu_cmd's bitmap, and before gathering the Data-In buffer, first of all it should skip the Data-Out ones, or the device supporting BIDI commands won't work. Fixed: `26418649ee` ("target/user: Introduce data_bitmap, replace data_length/data_head/data_tail") Reported-by: Ilias Tsitsimpis <iliastsi@arrikto.com> Tested-by: Ilias Tsitsimpis <iliastsi@arrikto.com> Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Cc: stable@vger.kernel.org # 4.6+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-04-02 16:18:51 -07:00
Nicholas Bellinger	1c99de981f	iscsi-target: Drop work-around for legacy GlobalSAN initiator Once upon a time back in 2009, a work-around was added to support the GlobalSAN iSCSI initiator v3.3 for MacOSX, which during login did not propose nor respond to MaxBurstLength, FirstBurstLength, DefaultTime2Wait and DefaultTime2Retain keys. The work-around in iscsi_check_proposer_for_optional_reply() allowed the missing keys to be proposed, but did not require waiting for a response before moving to full feature phase operation. This allowed GlobalSAN v3.3 to work out-of-the box, and for many years we didn't run into login interopt issues with any other initiators.. Until recently, when Martin tried a QLogic 57840S iSCSI Offload HBA on Windows 2016 which completed login, but subsequently failed with: Got unknown iSCSI OpCode: 0x43 The issue was QLogic MSFT side did not propose DefaultTime2Wait + DefaultTime2Retain, so LIO proposes them itself, and immediately transitions to full feature phase because of the GlobalSAN hack. However, the QLogic MSFT side still attempts to respond to DefaultTime2Retain + DefaultTime2Wait, even though LIO has set ISCSI_FLAG_LOGIN_NEXT_STAGE3 + ISCSI_FLAG_LOGIN_TRANSIT in last login response. So while the QLogic MSFT side should have been proposing these two keys to start, it was doing the correct thing per RFC-3720 attempting to respond to proposed keys before transitioning to full feature phase. All that said, recent versions of GlobalSAN iSCSI (v5.3.0.541) does correctly propose the four keys during login, making the original work-around moot. So in order to allow QLogic MSFT to run unmodified as-is, go ahead and drop this long standing work-around. Reported-by: Martin Svec <martin.svec@zoner.cz> Cc: Martin Svec <martin.svec@zoner.cz> Cc: Himanshu Madhani <Himanshu.Madhani@cavium.com> Cc: Arun Easi <arun.easi@cavium.com> Cc: stable@vger.kernel.org # 3.1+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-04-02 14:10:16 -07:00
Mike Christie	d19c4643a5	target: Fix ALUA transition state race between multiple initiators Multiple threads could be writing to alua_access_state at the same time, or there could be multiple STPGs in flight (different initiators sending them or one initiator sending them to different ports), or a combo of both and the core_alua_do_transition_tg_pt calls will race with each other. Because from the last patches we no longer delay running core_alua_do_transition_tg_pt_work, there does not seem to be any point in running that in a workqueue. And, we always wait for it to complete one way or another, so we can sleep in this code path. So, this patch made over target-pending just adds a mutex and does the work core_alua_do_transition_tg_pt_work was doing in core_alua_do_transition_tg_pt. There is also no need to use an atomic for the tg_pt_gp_alua_access_state. In core_alua_do_transition_tg_pt we will test and set it under the transition mutex. And, it is a int/32 bits so in the other places where it is read, we will never see it partially updated. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-30 23:12:40 -07:00
Nicholas Bellinger	a4467018c2	iscsi-target: Propigate queue_data_in + queue_status errors This patch changes iscsi-target to propagate iscsit_transport ->iscsit_queue_data_in() and ->iscsit_queue_status() callback errors, back up into target-core. This allows target-core to retry failed iscsit_transport callbacks using internal queue-full logic. Reported-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com> Tested-by: Potnuri Bharat Teja <bharat@chelsio.com> Cc: Potnuri Bharat Teja <bharat@chelsio.com> Reported-by: Steve Wise <swise@opengridcomputing.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-30 20:34:58 -07:00
Nicholas Bellinger	fa7e25cf13	target: Fix unknown fabric callback queue-full errors This patch fixes a set of queue-full response handling bugs, where outgoing responses are leaked when a fabric driver is propagating non -EAGAIN or -ENOMEM errors to target-core. It introduces TRANSPORT_COMPLETE_QF_ERR state used to signal when CHECK_CONDITION status should be generated, when fabric driver ->write_pending(), ->queue_data_in(), or ->queue_status() callbacks fail with non -EAGAIN or -ENOMEM errors, and data-transfer should not be retried. Note all fabric driver -EAGAIN and -ENOMEM errors are still retried indefinately with associated data-transfer callbacks, following existing queue-full logic. Also fix two missing ->queue_status() queue-full cases related to CMD_T_ABORTED w/ TAS status handling. Reported-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com> Tested-by: Potnuri Bharat Teja <bharat@chelsio.com> Cc: Potnuri Bharat Teja <bharat@chelsio.com> Reported-by: Steve Wise <swise@opengridcomputing.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-30 20:34:31 -07:00
Xiubo Li	abe342a5b4	tcmu: Fix wrongly calculating of the base_command_size The t_data_nents and t_bidi_data_nents are the numbers of the segments, but it couldn't be sure the block size equals to size of the segment. For the worst case, all the blocks are discontiguous and there will need the same number of iovecs, that's to say: blocks == iovs. So here just set the number of iovs to block count needed by tcmu cmd. Tested-by: Ilias Tsitsimpis <iliastsi@arrikto.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Cc: stable@vger.kernel.org # 3.18+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-30 01:36:53 -07:00
Xiubo Li	ab22d2604c	tcmu: Fix possible overwrite of t_data_sg's last iov[] If there has BIDI data, its first iov[] will overwrite the last iov[] for se_cmd->t_data_sg. To fix this, we can just increase the iov pointer, but this may introuduce a new memory leakage bug: If the se_cmd->data_length and se_cmd->t_bidi_data_sg->length are all not aligned up to the DATA_BLOCK_SIZE, the actual length needed maybe larger than just sum of them. So, this could be avoided by rounding all the data lengthes up to DATA_BLOCK_SIZE. Reviewed-by: Mike Christie <mchristi@redhat.com> Tested-by: Ilias Tsitsimpis <iliastsi@arrikto.com> Reviewed-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Cc: stable@vger.kernel.org # 3.18+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-30 01:36:52 -07:00
Nicholas Bellinger	49cb77e297	target: Avoid mappedlun symlink creation during lun shutdown This patch closes a race between se_lun deletion during configfs unlink in target_fabric_port_unlink() -> core_dev_del_lun() -> core_tpg_remove_lun(), when transport_clear_lun_ref() blocks waiting for percpu_ref RCU grace period to finish, but a new NodeACL mappedlun is added before the RCU grace period has completed. This can happen in target_fabric_mappedlun_link() because it only checks for se_lun->lun_se_dev, which is not cleared until after transport_clear_lun_ref() percpu_ref RCU grace period finishes. This bug originally manifested as NULL pointer dereference OOPsen in target_stat_scsi_att_intr_port_show_attr_dev() on v4.1.y code, because it dereferences lun->lun_se_dev without a explicit NULL pointer check. In post v4.1 code with target-core RCU conversion, the code in target_stat_scsi_att_intr_port_show_attr_dev() no longer uses se_lun->lun_se_dev, but the same race still exists. To address the bug, go ahead and set se_lun>lun_shutdown as early as possible in core_tpg_remove_lun(), and ensure new NodeACL mappedlun creation in target_fabric_mappedlun_link() fails during se_lun shutdown. Reported-by: James Shen <jcs@datera.io> Cc: James Shen <jcs@datera.io> Tested-by: James Shen <jcs@datera.io> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-30 01:36:52 -07:00
Nicholas Bellinger	efb2ea770b	iscsi-target: Fix TMR reference leak during session shutdown This patch fixes a iscsi-target specific TMR reference leak during session shutdown, that could occur when a TMR was quiesced before the hand-off back to iscsi-target code via transport_cmd_check_stop_to_fabric(). The reference leak happens because iscsit_free_cmd() was incorrectly skipping the final target_put_sess_cmd() for TMRs when transport_generic_free_cmd() returned zero because the se_cmd->cmd_kref did not reach zero, due to the missing se_cmd assignment in original code. The result was iscsi_cmd and it's associated se_cmd memory would be freed once se_sess->sess_cmd_map where released, but the associated se_tmr_req was leaked and remained part of se_device->dev_tmr_list. This bug would manfiest itself as kernel paging request OOPsen in core_tmr_lun_reset(), when a left-over se_tmr_req attempted to dereference it's se_cmd pointer that had already been released during normal session shutdown. To address this bug, go ahead and treat ISCSI_OP_SCSI_CMD and ISCSI_OP_SCSI_TMFUNC the same when there is an extra se_cmd->cmd_kref to drop in iscsit_free_cmd(), and use op_scsi to signal __iscsit_free_cmd() when the former needs to clear any further iscsi related I/O state. Reported-by: Rob Millner <rlm@daterainc.com> Cc: Rob Millner <rlm@daterainc.com> Reported-by: Chu Yuan Lin <cyl@datera.io> Cc: Chu Yuan Lin <cyl@datera.io> Tested-by: Chu Yuan Lin <cyl@datera.io> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-30 01:36:51 -07:00
Nicholas Bellinger	740372b76e	tcmu: Allow cmd_time_out to be set to zero (disabled) The new cmd_time_out configfs attribute for TCMU is allowed to be disabled, so go ahead and drop the tcmu_cmd_time_out_store() check. Reported-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-30 01:36:44 -07:00
Nicholas Bellinger	7d7a743543	tcmu: Convert cmd_time_out into backend device attribute Instead of putting cmd_time_out under ../target/core/user_0/foo/control, which has historically been used by parameters needed for initial backend device configuration, go ahead and move cmd_time_out into a backend device attribute. In order to do this, tcmu_module_init() has been updated to create a local struct configfs_attribute tcmu_attrs, that is based upon the existing passthrough_attrib_attrs along with the new cmd_time_out attribute. Once tcm_attrs has been setup, go ahead and point it at tcmu_ops->tb_dev_attrib_attrs so it's picked up by target-core. Also following MNC's previous change, ->cmd_time_out is stored in milliseconds but exposed via configfs in seconds. Also, note this patch restricts the modification of ->cmd_time_out to before + after the TCMU device has been configured, but not while it has active fabric exports. Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-18 16:32:30 -07:00
Mike Christie	af980e46a2	tcmu: make cmd timeout configurable A single daemon could implement multiple types of devices using multuple types of real devices that may not support restarting from crashes and/or handling tcmu timeouts. This makes the cmd timeout configurable, so handlers that do not support it can turn if off for now. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-18 16:32:27 -07:00
Mike Christie	972c7f1679	tcmu: add helper to check if dev was configured This adds a helper to check if the dev was configured. It will be used in the next patch to prevent updates to some config settings after the device has been setup. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-18 16:32:19 -07:00
Mike Christie	760bf578ed	target: fix race during implicit transition work flushes This fixes the following races: 1. core_alua_do_transition_tg_pt could have read tg_pt_gp_alua_access_state and gone into this if chunk: if (!explicit && atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state) == ALUA_ACCESS_STATE_TRANSITION) { and then core_alua_do_transition_tg_pt_work could update the state. core_alua_do_transition_tg_pt would then only set tg_pt_gp_alua_pending_state and the tg_pt_gp_alua_access_state would not get updated with the second calls state. 2. core_alua_do_transition_tg_pt could be setting tg_pt_gp_transition_complete while the tg_pt_gp_transition_work is already completing. core_alua_do_transition_tg_pt then waits on the completion that will never be called. To handle these issues, we just call flush_work which will return when core_alua_do_transition_tg_pt_work has completed so there is no need to do the complete/wait. And, if core_alua_do_transition_tg_pt_work was running, instead of trying to sneak in the state change, we just schedule up another core_alua_do_transition_tg_pt_work call. Note that this does not handle a possible race where there are multiple threads call core_alua_do_transition_tg_pt at the same time. I think we need a mutex in target_tg_pt_gp_alua_access_state_store. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-18 14:47:29 -07:00
Mike Christie	1ca4d4fa3b	target: allow userspace to set state to transitioning Userspace target_core_user handlers like tcmu-runner may want to set the ALUA state to transitioning while it does implicit transitions. This patch allows that state when set from configfs. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-18 14:47:28 -07:00
Mike Christie	d7175373f2	target: fix ALUA transition timeout handling The implicit transition time tells initiators the min time to wait before timing out a transition. We currently schedule the transition to occur in tg_pt_gp_implicit_trans_secs seconds so there is no room for delays. If core_alua_do_transition_tg_pt_work->core_alua_update_tpg_primary_metadata needs to write out info to a remote file, then the initiator can easily time out the operation. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-18 14:47:28 -07:00
Mike Christie	207ee84133	target: Use system workqueue for ALUA transitions If tcmu-runner is processing a STPG and needs to change the kernel's ALUA state then we cannot use the same work queue for task management requests and ALUA transitions, because we could deadlock. The problem occurs when a STPG times out before tcmu-runner is able to call into target_tg_pt_gp_alua_access_state_store-> core_alua_do_port_transition -> core_alua_do_transition_tg_pt -> queue_work. In this case, the tmr is on the work queue waiting for the STPG to complete, but the STPG transition is now queued behind the waiting tmr. Note: This bug will also be fixed by this patch: http://www.spinics.net/lists/target-devel/msg14560.html which switches the tmr code to use the system workqueues. For both, I am not sure if we need a dedicated workqueue since it is not a performance path and I do not think we need WQ_MEM_RECLAIM to make forward progress to free up memory like the block layer does. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-18 14:47:27 -07:00
Mike Christie	0a41457298	target: fail ALUA transitions for pscsi We do not setup the LU group for pscsi devices, so if you write a state to alua_access_state that will cause a transition you will get a NULL pointer dereference. This patch will fail attempts to try and transition the path for backend devices that set the TRANSPORT_FLAG_PASSTHROUGH_ALUA flag. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-18 14:47:26 -07:00
Mike Christie	530c6891b1	target: allow ALUA setup for some passthrough backends This patch allows passthrough backends to use the core/base LIO ALUA setup and state checks, but still handle the execution of commands. This will allow the target_core_user module to execute STPG and RTPG in userspace, and not have to duplicate the ALUA state checks, path information (needed so we can check if command is executable on specific paths) and setup (rtslib sets/updates the configfs ALUA interface like it does for iblock or file). For STPG, the target_core_user userspace daemon, tcmu-runner will still execute the STPG, and to update the core/base LIO state it will use the existing configfs interface. For RTPG, tcmu-runner will loop over configfs and/or cache the state. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-18 14:47:25 -07:00
Mike Christie	2579325ca0	tcmu: return on first Opt parse failure We only were returing failure if the last opt to be parsed failed. This has a return failure when we first detect a failure. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-18 14:47:25 -07:00
Mike Christie	3abaa2bfdb	tcmu: allow hw_max_sectors greater than 128 tcmu hard codes the hw_max_sectors to 128 which is a litle small. Userspace uses the max_sectors to report the optimal IO size and some initiators perform better with larger IOs (open-iscsi seems to do better with 256 to 512 depending on the test). (Fix do not display hw max sectors twice - MNC) Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-18 14:46:52 -07:00
Nicholas Bellinger	9c28ca4ff8	target: Drop pointless tfo->check_stop_free check All in-tree fabric drivers provide a tfo->check_stop_free(), so there is no need to do the extra check within existing transport_cmd_check_stop_to_fabric() code. Just to be sure, add a check in target_fabric_tf_ops_check() to notify any out-of-tree drivers that might be missing it. Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-18 14:42:50 -07:00
Max Lohrmann	13603685c1	target: Fix VERIFY_16 handling in sbc_parse_cdb As reported by Max, the Windows 2008 R2 chkdsk utility expects VERIFY_16 to be supported, and does not handle the returned CHECK_CONDITION properly, resulting in an infinite loop. The kernel will log huge amounts of this error: kernel: TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x8f, sending CHECK_CONDITION. Signed-off-by: Max Lohrmann <post@wickenrode.com> Cc: <stable@vger.kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-07 22:15:23 -08:00
Nicholas Bellinger	a04e54f2c3	target/pscsi: Fix TYPE_TAPE + TYPE_MEDIMUM_CHANGER export The following fixes a divide by zero OOPs with TYPE_TAPE due to pscsi_tape_read_blocksize() failing causing a zero sd->sector_size being propigated up via dev_attrib.hw_block_size. It also fixes another long-standing bug where TYPE_TAPE and TYPE_MEDIMUM_CHANGER where using pscsi_create_type_other(), which does not call scsi_device_get() to take the device reference. Instead, rename pscsi_create_type_rom() to pscsi_create_type_nondisk() and use it for all cases. Finally, also drop a dump_stack() in pscsi_get_blocks() for non TYPE_DISK, which in modern target-core can get invoked via target_sense_desc_format() during CHECK_CONDITION. Reported-by: Malcolm Haak <insanemal@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-03-07 22:15:19 -08:00
Linus Torvalds	1827adb11a	Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull sched.h split-up from Ingo Molnar: "The point of these changes is to significantly reduce the <linux/sched.h> header footprint, to speed up the kernel build and to have a cleaner header structure. After these changes the new <linux/sched.h>'s typical preprocessed size goes down from a previous ~0.68 MB (~22K lines) to ~0.45 MB (~15K lines), which is around 40% faster to build on typical configs. Not much changed from the last version (-v2) posted three weeks ago: I eliminated quirks, backmerged fixes plus I rebased it to an upstream SHA1 from yesterday that includes most changes queued up in -next plus all sched.h changes that were pending from Andrew. I've re-tested the series both on x86 and on cross-arch defconfigs, and did a bisectability test at a number of random points. I tried to test as many build configurations as possible, but some build breakage is probably still left - but it should be mostly limited to architectures that have no cross-compiler binaries available on kernel.org, and non-default configurations" * 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (146 commits) sched/headers: Clean up <linux/sched.h> sched/headers: Remove #ifdefs from <linux/sched.h> sched/headers: Remove the <linux/topology.h> include from <linux/sched.h> sched/headers, hrtimer: Remove the <linux/wait.h> include from <linux/hrtimer.h> sched/headers, x86/apic: Remove the <linux/pm.h> header inclusion from <asm/apic.h> sched/headers, timers: Remove the <linux/sysctl.h> include from <linux/timer.h> sched/headers: Remove <linux/magic.h> from <linux/sched/task_stack.h> sched/headers: Remove <linux/sched.h> from <linux/sched/init.h> sched/core: Remove unused prefetch_stack() sched/headers: Remove <linux/rculist.h> from <linux/sched.h> sched/headers: Remove the 'init_pid_ns' prototype from <linux/sched.h> sched/headers: Remove <linux/signal.h> from <linux/sched.h> sched/headers: Remove <linux/rwsem.h> from <linux/sched.h> sched/headers: Remove the runqueue_is_locked() prototype sched/headers: Remove <linux/sched.h> from <linux/sched/hotplug.h> sched/headers: Remove <linux/sched.h> from <linux/sched/debug.h> sched/headers: Remove <linux/sched.h> from <linux/sched/nohz.h> sched/headers: Remove <linux/sched.h> from <linux/sched/stat.h> sched/headers: Remove the <linux/gfp.h> include from <linux/sched.h> sched/headers: Remove <linux/rtmutex.h> from <linux/sched.h> ...	2017-03-03 10:16:38 -08:00
Linus Torvalds	69fd110eb6	Merge branch 'work.sendmsg' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs sendmsg updates from Al Viro: "More sendmsg work. This is a fairly separate isolated stuff (there's a continuation around lustre, but that one was too late to soak in -next), thus the separate pull request" * 'work.sendmsg' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ncpfs: switch to sock_sendmsg() ncpfs: don't mess with manually advancing iovec on send ncpfs: sendmsg does not bugger iovec these days ceph_tcp_sendpage(): use ITER_BVEC sendmsg afs_send_pages(): use ITER_BVEC rds: remove dead code ceph: switch to sock_recvmsg() usbip_recv(): switch to sock_recvmsg() iscsi_target: deal with short writes on the tx side [nbd] pass iov_iter to nbd_xmit() [nbd] switch sock_xmit() to sock_{send,recv}msg() [drbd] use sock_sendmsg()	2017-03-02 15:16:38 -08:00
Linus Torvalds	821fd6f6cb	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "The highlights this round include: - enable dual mode (initiator + target) qla2xxx operation. (Quinn + Himanshu) - add a framework for qla2xxx async fabric discovery. (Quinn + Himanshu) - enable iscsi PDU DDP completion offload in cxgbit/T6 NICs. (Varun) - fix target-core handling of aborted failed commands. (Bart) - fix a long standing target-core issue NULL pointer dereference with active I/O LUN shutdown. (Rob Millner + Bryant + nab)" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (44 commits) target: Add counters for ABORT_TASK success + failure iscsi-target: Fix early login failure statistics misses target: Fix NULL dereference during LUN lookup + active I/O shutdown target: Delete tmr from list before processing target: Fix handling of aborted failed commands uapi: fix linux/target_core_user.h userspace compilation errors target: export protocol identifier qla2xxx: Fix a warning reported by the "smatch" static checker target/iscsi: Fix unsolicited data seq_end_offset calculation target/cxgbit: add T6 iSCSI DDP completion feature target/cxgbit: Enable DDP for T6 only if data sequence and pdu are in order target/cxgbit: Use T6 specific macros to get ETH/IP hdr len target/cxgbit: use cxgb4_tp_smt_idx() to get smt idx target/iscsi: split iscsit_check_dataout_hdr() target: Remove command flag CMD_T_DEV_ACTIVE target: Remove command flag CMD_T_BUSY target: Move session check from target_put_sess_cmd() into target_release_cmd_kref() target: Inline transport_cmd_check_stop() target: Remove an overly chatty debug message target: Stop execution if CMD_T_STOP has been set ...	2017-03-02 14:52:05 -08:00
Ingo Molnar	174cd4b1e5	sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:32 +01:00
Ingo Molnar	3f07c01441	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/signal.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:29 +01:00
Linus Torvalds	cf393195c3	Merge branch 'idr-4.11' of git://git.infradead.org/users/willy/linux-dax Pull IDR rewrite from Matthew Wilcox: "The most significant part of the following is the patch to rewrite the IDR & IDA to be clients of the radix tree. But there's much more, including an enhancement of the IDA to be significantly more space efficient, an IDR & IDA test suite, some improvements to the IDR API (and driver changes to take advantage of those improvements), several improvements to the radix tree test suite and RCU annotations. The IDR & IDA rewrite had a good spin in linux-next and Andrew's tree for most of the last cycle. Coupled with the IDR test suite, I feel pretty confident that any remaining bugs are quite hard to hit. 0-day did a great job of watching my git tree and pointing out problems; as it hit them, I added new test-cases to be sure not to be caught the same way twice" Willy goes on to expand a bit on the IDR rewrite rationale: "The radix tree and the IDR use very similar data structures. Merging the two codebases lets us share the memory allocation pools, and results in a net deletion of 500 lines of code. It also opens up the possibility of exposing more of the features of the radix tree to users of the IDR (and I have some interesting patches along those lines waiting for 4.12) It also shrinks the size of the 'struct idr' from 40 bytes to 24 which will shrink a fair few data structures that embed an IDR" * 'idr-4.11' of git://git.infradead.org/users/willy/linux-dax: (32 commits) radix tree test suite: Add config option for map shift idr: Add missing __rcu annotations radix-tree: Fix __rcu annotations radix-tree: Add rcu_dereference and rcu_assign_pointer calls radix tree test suite: Run iteration tests for longer radix tree test suite: Fix split/join memory leaks radix tree test suite: Fix leaks in regression2.c radix tree test suite: Fix leaky tests radix tree test suite: Enable address sanitizer radix_tree_iter_resume: Fix out of bounds error radix-tree: Store a pointer to the root in each node radix-tree: Chain preallocated nodes through ->parent radix tree test suite: Dial down verbosity with -v radix tree test suite: Introduce kmalloc_verbose idr: Return the deleted entry from idr_remove radix tree test suite: Build separate binaries for some tests ida: Use exceptional entries for small IDAs ida: Move ida_bitmap to a percpu variable Reimplement IDR and IDA using the radix tree radix-tree: Add radix_tree_iter_delete ...	2017-02-28 20:29:41 -08:00
Nicholas Bellinger	c87ba9c49c	target: Add counters for ABORT_TASK success + failure This patch introduces two counters for ABORT_TASK success + failure under: /sys/kernel/config/target/core/$HBA/$DEV/statistics/scsi_tgt_dev/ that are useful for diagnosing various backend device latency and front fabric issues. Normally when folks see alot of aborts_complete happening, it means the backend device I/O completion latency is high, and not returning completions fast enough before host side timeouts trigger. And normally when folks see alot of aborts_no_task, it means completions are being posted by target-core into fabric driver code, but the responses aren't making it back to the host. Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-02-26 16:21:06 -08:00
Nicholas Bellinger	17c61ad66f	iscsi-target: Fix early login failure statistics misses Due to the long standing checks in iscsit_snmp_get_tiqn() that assume conn->sess->tpg dereference of tpg->tpg_tiqn for iscsit_collect_login_stats() usage, some of the early login failure cases like ISCSI_LOGIN_STATUS_TGT_FORBIDDEN where not getting incremented, due to sess->tpg assignment happening later in iscsi_login_zero_tsih_s2(). Instead, use the earlier conn->tpg assignment done by iscsi_target_locate_portal() -> iscsit_get_tpg_from_np() so the existing counters are incremented correctly for the various early login failure cases. Also, go ahead and drop the old rate limiting check in iscsit_collect_login_stats(), so we get the true number of failed login attempts in the existing statistics. Reported-by: Ryan Stiles <ras@datera.io> Cc: Ryan Stiles <ras@datera.io> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-02-26 16:18:22 -08:00
Nicholas Bellinger	bd4e2d2907	target: Fix NULL dereference during LUN lookup + active I/O shutdown When transport_clear_lun_ref() is shutting down a se_lun via configfs with new I/O in-flight, it's possible to trigger a NULL pointer dereference in transport_lookup_cmd_lun() due to the fact percpu_ref_get() doesn't do any __PERCPU_REF_DEAD checking before incrementing lun->lun_ref.count after lun->lun_ref has switched to atomic_t mode. This results in a NULL pointer dereference as LUN shutdown code in core_tpg_remove_lun() continues running after the existing ->release() -> core_tpg_lun_ref_release() callback completes, and clears the RCU protected se_lun->lun_se_dev pointer. During the OOPs, the state of lun->lun_ref in the process which triggered the NULL pointer dereference looks like the following on v4.1.y stable code: struct se_lun { lun_link_magic = 4294932337, lun_status = TRANSPORT_LUN_STATUS_FREE, ..... lun_se_dev = 0x0, lun_sep = 0x0, ..... lun_ref = { count = { counter = 1 }, percpu_count_ptr = 3, release = 0xffffffffa02fa1e0 <core_tpg_lun_ref_release>, confirm_switch = 0x0, force_atomic = false, rcu = { next = 0xffff88154fa1a5d0, func = 0xffffffff8137c4c0 <percpu_ref_switch_to_atomic_rcu> } } } To address this bug, use percpu_ref_tryget_live() to ensure once __PERCPU_REF_DEAD is visable on all CPUs and ->lun_ref has switched to atomic_t, all new I/Os will fail to obtain a new lun->lun_ref reference. Also use an explicit percpu_ref_kill_and_confirm() callback to block on ->lun_ref_comp to allow the first stage and associated RCU grace period to complete, and then block on ->lun_ref_shutdown waiting for the final percpu_ref_put() to drop the last reference via transport_lun_remove_cmd() before continuing with core_tpg_remove_lun() shutdown. Reported-by: Rob Millner <rlm@daterainc.com> Tested-by: Rob Millner <rlm@daterainc.com> Cc: Rob Millner <rlm@daterainc.com> Tested-by: Vaibhav Tandon <vst@datera.io> Cc: Vaibhav Tandon <vst@datera.io> Tested-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> # v3.14+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-02-26 16:08:44 -08:00
Dave Jiang	11bac80004	mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmf ->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to take a vma and vmf parameter when the vma already resides in vmf. Remove the vma parameter to simplify things. [arnd@arndb.de: fix ARM build] Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-24 17:46:54 -08:00
Linus Torvalds	4cc4b9323f	First set of updates for 4.11 kernel merge window - Add new Broadcom bnxt_re RoCE driver - rxe driver updates - ioctl cleanups - ETH_P_IBOE declaration cleanup - IPoIB changes - Add port state cache - Allow srpt driver to accept guids as port names in config - Update to hfi1 driver - Update to srp driver - Lots of misc. minor changes all over -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYrfewAAoJELgmozMOVy/dFnEP/2Qe7NqXRqxLS0ZqsQseFHgQ jd236E7R/XtQQTE3PTcrWL0mq0DRF6tMEjfhUASKTbZVfCBTniJAoXYrvWhN/STq LxAdigdV/0SPbxO3r9B1Xvk2v5BySaIBkaUDvcEXzT4e7UVQwZgxDkhhsYeY0Z/r 9bNB5760PzW8uO5cctXccNcWztZnW0IUZuAHVfQCPjZ7svoGwLnNDW6YQx+FsEkW tbPdzMXX8VKHlC5RcKbfOOBjdNyrUpWl+uvWEc/7mazKscp4yKVFZL7PcxqPJSfd aKdfqXYawhjZZpyws8Kn0rhkfT7xWKD/y9G5STykRJPj9/n1BDScFkmyDQhtP5bJ GANzdgH0z7Dt9LkcAs86A8EVBbIdbdT2cpPVu7t0uWEIsJw/O5ThKpgjnrrTm6m+ 89tgqLZooifTEsdj4UkZoyktrD3J9LSNZkgVmWtRn01W3oYFOPbdM4TmBZtg+/Yl VGmOJEHMEsNuJBcJcOuRJ1MVz2LebXmPUcB0RXzgmHHgulZ/DqoOtlpg5JNmJcr5 wpw/yppkBop4V4+etJBlzDsZNmZZlX+AY0ZLqQJsDHNszDjwXgAy5Rn5FYIdMyk4 ff0FKb5dzASSxHRDxAsu2uoGaREM0NkpA0UYiIZbepGLSO8PuFG2ScQ6qzU47vqu 9SEzOaaQY2S2uqFFFnYp =ugNm -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "First set of updates for 4.11 kernel merge window - Add new Broadcom bnxt_re RoCE driver - rxe driver updates - ioctl cleanups - ETH_P_IBOE declaration cleanup - IPoIB changes - Add port state cache - Allow srpt driver to accept guids as port names in config - Update to hfi1 driver - Update to srp driver - Lots of misc minor changes all over" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (114 commits) RDMA/bnxt_re: fix for "bnxt_en: Update to firmware interface spec 1.7.0." rdma_cm: fail iwarp accepts w/o connection params IB/srp: Drain the send queue before destroying a QP IB/core: Add support for draining IB_POLL_DIRECT completion queues IB/srp: Improve an error path IB/srp: Make a diagnostic message more informative IB/srp: Document locking conventions IB/srp: Fix race conditions related to task management IB/srp: Avoid that duplicate responses trigger a kernel bug IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS RDMA/qedr: Fix some error handling RDMA/bnxt_re: add DCB dependency IB/hns: include linux/module.h IB/vmw_pvrdma: Expose vendor error to ULPs vmw_pvrdma: switch to pci_alloc_irq_vectors IB/hfi1: use size_t for passing array length IB/ipoib: Remove redudant label IB/ipoib: remove the unnecessary memory free IB/mthca: switch to pci_alloc_irq_vectors IB/hfi1: Code reuse with memdup_copy ...	2017-02-23 08:27:57 -08:00
Linus Torvalds	3051bf36c2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Highlights: 1) Support TX_RING in AF_PACKET TPACKET_V3 mode, from Sowmini Varadhan. 2) Simplify classifier state on sk_buff in order to shrink it a bit. From Willem de Bruijn. 3) Introduce SIPHASH and it's usage for secure sequence numbers and syncookies. From Jason A. Donenfeld. 4) Reduce CPU usage for ICMP replies we are going to limit or suppress, from Jesper Dangaard Brouer. 5) Introduce Shared Memory Communications socket layer, from Ursula Braun. 6) Add RACK loss detection and allow it to actually trigger fast recovery instead of just assisting after other algorithms have triggered it. From Yuchung Cheng. 7) Add xmit_more and BQL support to mvneta driver, from Simon Guinot. 8) skb_cow_data avoidance in esp4 and esp6, from Steffen Klassert. 9) Export MPLS packet stats via netlink, from Robert Shearman. 10) Significantly improve inet port bind conflict handling, especially when an application is restarted and changes it's setting of reuseport. From Josef Bacik. 11) Implement TX batching in vhost_net, from Jason Wang. 12) Extend the dummy device so that VF (virtual function) features, such as configuration, can be more easily tested. From Phil Sutter. 13) Avoid two atomic ops per page on x86 in bnx2x driver, from Eric Dumazet. 14) Add new bpf MAP, implementing a longest prefix match trie. From Daniel Mack. 15) Packet sample offloading support in mlxsw driver, from Yotam Gigi. 16) Add new aquantia driver, from David VomLehn. 17) Add bpf tracepoints, from Daniel Borkmann. 18) Add support for port mirroring to b53 and bcm_sf2 drivers, from Florian Fainelli. 19) Remove custom busy polling in many drivers, it is done in the core networking since 4.5 times. From Eric Dumazet. 20) Support XDP adjust_head in virtio_net, from John Fastabend. 21) Fix several major holes in neighbour entry confirmation, from Julian Anastasov. 22) Add XDP support to bnxt_en driver, from Michael Chan. 23) VXLAN offloads for enic driver, from Govindarajulu Varadarajan. 24) Add IPVTAP driver (IP-VLAN based tap driver) from Sainath Grandhi. 25) Support GRO in IPSEC protocols, from Steffen Klassert" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1764 commits) Revert "ath10k: Search SMBIOS for OEM board file extension" net: socket: fix recvmmsg not returning error from sock_error bnxt_en: use eth_hw_addr_random() bpf: fix unlocking of jited image when module ronx not set arch: add ARCH_HAS_SET_MEMORY config net: napi_watchdog() can use napi_schedule_irqoff() tcp: Revert "tcp: tcp_probe: use spin_lock_bh()" net/hsr: use eth_hw_addr_random() net: mvpp2: enable building on 64-bit platforms net: mvpp2: switch to build_skb() in the RX path net: mvpp2: simplify MVPP2_PRS_RI_* definitions net: mvpp2: fix indentation of MVPP2_EXT_GLOBAL_CTRL_DEFAULT net: mvpp2: remove unused register definitions net: mvpp2: simplify mvpp2_bm_bufs_add() net: mvpp2: drop useless fields in mvpp2_bm_pool and related code net: mvpp2: remove unused 'tx_skb' field of 'struct mvpp2_tx_queue' net: mvpp2: release reference to txq_cpu[] entry after unmapping net: mvpp2: handle too large value in mvpp2_rx_time_coal_set() net: mvpp2: handle too large value handling in mvpp2_rx_pkts_coal_set() net: mvpp2: remove useless arguments in mvpp2_rx_{pkts, time}_coal_set ...	2017-02-22 10:15:09 -08:00
Linus Torvalds	772c8f6f3b	for-4.11/linus-merge-signed -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCAAGBQJYqeb8AAoJEPfTWPspceCmB3UP/3UtcPrzEm8w2cxB9MaWhZN3 J+jiwlO4vaqhm2HVzQtoJqfaqRlud/iDx5cIXE2S7FnIM54ZKs3CANbKu8X+b1zm eJije3zMI8A8qyftigbz6a/Y2kWE4ZqFEc9WU5CWawfTl3ImCVUi8+F5X0wOLU/h r50zAQOEyURH4G5usNl9q0olF6FonJ82AcYm1iJ0QP2wYWZRJauC0rRn8IT93tyK bZPHnGKdkd7km8yi3zr2GNWOfuZZuA0HWAaF4qfrHPZQ883gITFAUIlFb1f+2TNl DkQzRrBB2wPWPnlbfb9KejMkvL94hflzsLb5rHt835DyVXFRyjxsgyAI8A+LPGSz vqZ3rsbWj6H4F9z2CkZ+T+AP/ZSWDNjwc0RXPm9HYdR5CDeTxIUVvnFQ44YNsmTv Xd5BKrUJ2oKegAxQG6zcuFx23p8JzhT70l+mNrMdtyeKnDD9FRdDvhKG9AHeTipn o/DnGivhS3UMQoQ7D68KOO+kuhLDeo7my5XGsnjzMO/iHqg++7IP2HyYYs/Ba4qZ cYaCtSDQW71Zt0vsqa6dvPuXBveu4h8Qh8R7uAGjSGS9IAFFb4Cab2tiUdISE6PE YnMWzY+G6pT8imlLVOL5/QFuo2Q4pUsaL0AHpXMCN9TZnQtbqXa8eqwnKnQ0m2KN 7ut0IYYEPaYUX5xFn1K6 =z7AL -----END PGP SIGNATURE----- Merge tag 'for-4.11/linus-merge-signed' of git://git.kernel.dk/linux-block Pull block layer updates from Jens Axboe: - blk-mq scheduling framework from me and Omar, with a port of the deadline scheduler for this framework. A port of BFQ from Paolo is in the works, and should be ready for 4.12. - Various fixups and improvements to the above scheduling framework from Omar, Paolo, Bart, me, others. - Cleanup of the exported sysfs blk-mq data into debugfs, from Omar. This allows us to export more information that helps debug hangs or performance issues, without cluttering or abusing the sysfs API. - Fixes for the sbitmap code, the scalable bitmap code that was migrated from blk-mq, from Omar. - Removal of the BLOCK_PC support in struct request, and refactoring of carrying SCSI payloads in the block layer. This cleans up the code nicely, and enables us to kill the SCSI specific parts of struct request, shrinking it down nicely. From Christoph mainly, with help from Hannes. - Support for ranged discard requests and discard merging, also from Christoph. - Support for OPAL in the block layer, and for NVMe as well. Mainly from Scott Bauer, with fixes/updates from various others folks. - Error code fixup for gdrom from Christophe. - cciss pci irq allocation cleanup from Christoph. - Making the cdrom device operations read only, from Kees Cook. - Fixes for duplicate bdi registrations and bdi/queue life time problems from Jan and Dan. - Set of fixes and updates for lightnvm, from Matias and Javier. - A few fixes for nbd from Josef, using idr to name devices and a workqueue deadlock fix on receive. Also marks Josef as the current maintainer of nbd. - Fix from Josef, overwriting queue settings when the number of hardware queues is updated for a blk-mq device. - NVMe fix from Keith, ensuring that we don't repeatedly mark and IO aborted, if we didn't end up aborting it. - SG gap merging fix from Ming Lei for block. - Loop fix also from Ming, fixing a race and crash between setting loop status and IO. - Two block race fixes from Tahsin, fixing request list iteration and fixing a race between device registration and udev device add notifiations. - Double free fix from cgroup writeback, from Tejun. - Another double free fix in blkcg, from Hou Tao. - Partition overflow fix for EFI from Alden Tondettar. * tag 'for-4.11/linus-merge-signed' of git://git.kernel.dk/linux-block: (156 commits) nvme: Check for Security send/recv support before issuing commands. block/sed-opal: allocate struct opal_dev dynamically block/sed-opal: tone down not supported warnings block: don't defer flushes on blk-mq + scheduling blk-mq-sched: ask scheduler for work, if we failed dispatching leftovers blk-mq: don't special case flush inserts for blk-mq-sched blk-mq-sched: don't add flushes to the head of requeue queue blk-mq: have blk_mq_dispatch_rq_list() return if we queued IO or not block: do not allow updates through sysfs until registration completes lightnvm: set default lun range when no luns are specified lightnvm: fix off-by-one error on target initialization Maintainers: Modify SED list from nvme to block Move stack parameters for sed_ioctl to prevent oversized stack with CONFIG_KASAN uapi: sed-opal fix IOW for activate lsp to use correct struct cdrom: Make device operations read-only elevator: fix loading wrong elevator type for blk-mq devices cciss: switch to pci_irq_alloc_vectors block/loop: fix race between I/O and set_status blk-mq-sched: don't hold queue_lock when calling exit_icq block: set make_request_fn manually in blk_mq_update_nr_hw_queues ...	2017-02-21 10:57:33 -08:00
Bart Van Assche	51ec502a32	target: Delete tmr from list before processing This patch does an explicit list_del_init(tmr->tmr_list) in core_tmr_drain_tmr_list() before starting processing of outstanding TMRs to abort, instead of explicitly checking which TMR descriptor matches the caller. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-02-20 14:37:19 -08:00
Bart Van Assche	e3b88ee95b	target: Fix handling of aborted failed commands If a target driver (e.g. tcm_qla2xxx) calls transport_generic_request_failure() to report that receiving data has failed and that SCSI command has already been aborted by the initiator, ensure that the SCSI status ABORTED is sent back to the initiator instead of the sense code provided by the target driver. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-02-20 13:28:09 -08:00
Linus Torvalds	42e1b14b6e	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle were: - Implement wraparound-safe refcount_t and kref_t types based on generic atomic primitives (Peter Zijlstra) - Improve and fix the ww_mutex code (Nicolai Hähnle) - Add self-tests to the ww_mutex code (Chris Wilson) - Optimize percpu-rwsems with the 'rcuwait' mechanism (Davidlohr Bueso) - Micro-optimize the current-task logic all around the core kernel (Davidlohr Bueso) - Tidy up after recent optimizations: remove stale code and APIs, clean up the code (Waiman Long) - ... plus misc fixes, updates and cleanups" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) fork: Fix task_struct alignment locking/spinlock/debug: Remove spinlock lockup detection code lockdep: Fix incorrect condition to print bug msgs for MAX_LOCKDEP_CHAIN_HLOCKS lkdtm: Convert to refcount_t testing kref: Implement 'struct kref' using refcount_t refcount_t: Introduce a special purpose refcount type sched/wake_q: Clarify queue reinit comment sched/wait, rcuwait: Fix typo in comment locking/mutex: Fix lockdep_assert_held() fail locking/rtmutex: Flip unlikely() branch to likely() in __rt_mutex_slowlock() locking/rwsem: Reinit wake_q after use locking/rwsem: Remove unnecessary atomic_long_t casts jump_labels: Move header guard #endif down where it belongs locking/atomic, kref: Implement kref_put_lock() locking/ww_mutex: Turn off __must_check for now locking/atomic, kref: Avoid more abuse locking/atomic, kref: Use kref_get_unless_zero() more locking/atomic, kref: Kill kref_sub() locking/atomic, kref: Add kref_read() locking/atomic, kref: Add KREF_INIT() ...	2017-02-20 13:23:30 -08:00
Mike Christie	0ab8ac6f50	target: export protocol identifier I think the transport statistics device file was supposed to show scsiTransportProtocolType. It instead shows the fabric name which is normally closer to the driver name. I was thinking I cannot change from fabric name to protocol type name incase people are expecting the driver name, so this patch adds another file proto_id that exports the SCSI protocol identifier ID. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-02-18 21:36:50 -08:00
Varun Prakash	4d65491c26	target/iscsi: Fix unsolicited data seq_end_offset calculation In case of unsolicited data for the first sequence seq_end_offset must be set to minimum of total data length and FirstBurstLength, so do not add cmd->write_data_done to the min of total data length and FirstBurstLength. This patch avoids that with ImmediateData=Yes, InitialR2T=No, MaxXmitDataSegmentLength < FirstBurstLength that a WRITE command with IO size above FirstBurstLength triggers sequence error messages, for example Set following parameters on target (linux-4.8.12) ImmediateData = Yes InitialR2T = No MaxXmitDataSegmentLength = 8k FirstBurstLength = 64k Log in from Open iSCSI initiator and execute dd if=/dev/zero of=/dev/sdb bs=128k count=1 oflag=direct Error messages on target Command ITT: 0x00000035 with Offset: 65536, Length: 8192 outside of Sequence 73728:131072 while DataSequenceInOrder=Yes. Command ITT: 0x00000035, received DataSN: 0x00000001 higher than expected 0x00000000. Unable to perform within-command recovery while ERL=0. Signed-off-by: Varun Prakash <varun@chelsio.com> [ bvanassche: Use min() instead of open-coding it / edited patch description ] Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-02-18 21:24:21 -08:00
Varun Prakash	79e57cfe00	target/cxgbit: add T6 iSCSI DDP completion feature Chelsio T6 adapters reduce number of completion to host by generating single completion for all directly placed(DDP) iSCSI pdus in a sequence, completion contains iSCSI hdr of the last pdu in a sequence. On receiving DDP completion cxgbit driver finds iSCSI cmd using iscsit_find_cmd_from_itt_or_dump(), then updates cmd->write_data_done, cmd->next_burst_len, cmd->data_sn and calls __iscsit_check_dataout_hdr() to validate iSCSI hdr. (Update __iscsit_check_dataout_hdr parameter usage - nab) Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-02-18 21:24:21 -08:00
Varun Prakash	5248788ec3	target/cxgbit: Enable DDP for T6 only if data sequence and pdu are in order Enable DDP for T6 only if DataSequenceInOrder=YES and DataPDUInOrder=YES to ensure inorder delivery of iSCSI pdus. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-02-18 21:24:20 -08:00
Varun Prakash	d88b504e41	target/cxgbit: Use T6 specific macros to get ETH/IP hdr len Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-02-18 21:24:19 -08:00
Varun Prakash	3da1110fd1	target/cxgbit: use cxgb4_tp_smt_idx() to get smt idx cxgb4_tp_smt_idx() returns smt idx for T4,T5,T6 adapters. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-02-18 21:24:19 -08:00
Varun Prakash	9a584bf9bf	target/iscsi: split iscsit_check_dataout_hdr() Split iscsit_check_dataout_hdr() into two functions 1. __iscsit_check_dataout_hdr() - This function validates data out hdr. 2. iscsit_check_dataout_hdr() - This function finds iSCSI cmd using iscsit_find_cmd_from_itt_or_dump(), then it calls __iscsit_check_dataout_hdr() to validate iSCSI hdr. This split is required to support Chelsio T6 iSCSI DDP completion feature. T6 adapters reduce number of completions to host by generating single completion for all directly placed(DDP) iSCSI pdus in a sequence, DDP completion contains iSCSI hdr of the last pdu in a sequence. On receiving DDP completion cxgbit driver will first find iSCSI cmd using iscsit_find_cmd_from_itt_or_dump() then updates cmd->write_data_done, cmd->next_burst_len, cmd->data_sn and calls __iscsit_check_dataout_hdr() to validate iSCSI hdr. (Move XRDSL check ahead of itt lookup / dump - nab) Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2017-02-18 21:24:15 -08:00
Jens Axboe	818551e2b2	Merge branch 'for-4.11/next' into for-4.11/linus-merge Signed-off-by: Jens Axboe <axboe@fb.com>	2017-02-17 14:08:19 -07:00

... 2 3 4 5 6 ...

1866 Commits