PREEMPT (and PREEMPT AND ABORT) should return CONFLICT iff a specified
SERVICE ACTION RESERVATION KEY is specified and matches no existing
persistent reservation.
Without this patch, a PREEMPT will return CONFLICT if either all
reservations are held by the initiator (self preemption) or there is
nothing to preempt. According to the spec, both of these cases should
succeed.
Signed-off-by: Steven Allen <steven.allen@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The fact that a target is published on the any address has no bearing on
which port(s) it is published. SendTargets should always send the
portal's port, not the port used for discovery.
Signed-off-by: Steven Allen <steven.allen@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
If an initiator sends a zero-length command (e.g. TEST UNIT READY) but
sets the transfer direction in the transport layer to indicate a
data-out phase, we still shouldn't try to transfer data. At best it's
a NOP, and depending on the transport, we might crash on an
uninitialized sg list.
Reported-by: Craig Watson <craig.watson@vanguard-rugged.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Cc: <stable@vger.kernel.org> # 3.1
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Pull SCSI target updates from Nicholas Bellinger:
"Here are the target updates for v3.18-rc2 code. These where
originally destined for -rc1, but due to the combination of travel
last week for KVM Forum and my mistake of taking the three week merge
window literally, the pull request slipped.. Apologies for that.
Things where reasonably quiet this round. The highlights include:
- New userspace backend driver (target_core_user.ko) by Shaohua Li
and Andy Grover
- A number of cleanups in target, iscsi-taret and qla_target code
from Joern Engel
- Fix an OOPs related to queue full handling with CHECK_CONDITION
status from Quinn Tran
- Fix to disable TX completion interrupt coalescing in iser-target,
that was causing problems on some hardware
- Fix for PR APTPL metadata handling with demo-mode ACLs
I'm most excited about the new backend driver that uses UIO + shared
memory ring to dispatch I/O and control commands into user-space.
This was probably the most requested feature by users over the last
couple of years, and opens up a new area of development + porting of
existing user-space storage applications to LIO. Thanks to Shaohua +
Andy for making this happen.
Also another honorable mention, a new Xen PV SCSI driver was merged
via the xen/tip.git tree recently, which puts us now at 10 target
drivers in upstream! Thanks to David Vrabel + Juergen Gross for their
work to get this code merged"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (40 commits)
target/file: fix inclusive vfs_fsync_range() end
iser-target: Disable TX completion interrupt coalescing
target: Add force_pr_aptpl device attribute
target: Fix APTPL metadata handling for dynamic MappedLUNs
qla_target: don't delete changed nacls
target/user: Recalculate pad size inside is_ring_space_avail()
tcm_loop: Fixup tag handling
iser-target: Fix smatch warning
target/user: Fix up smatch warnings in tcmu_netlink_event
target: Add a user-passthrough backstore
target: Add documentation on the target userspace pass-through driver
uio: Export definition of struct uio_device
target: Remove unneeded check in sbc_parse_cdb
target: Fix queue full status NULL pointer for SCF_TRANSPORT_TASK_SENSE
qla_target: rearrange struct qla_tgt_prm
qla_target: improve qlt_unmap_sg()
qla_target: make some global functions static
qla_target: remove unused parameter
target: simplify core_tmr_abort_task
target: encapsulate smp_mb__after_atomic()
...
Pull core block layer changes from Jens Axboe:
"This is the core block IO pull request for 3.18. Apart from the new
and improved flush machinery for blk-mq, this is all mostly bug fixes
and cleanups.
- blk-mq timeout updates and fixes from Christoph.
- Removal of REQ_END, also from Christoph. We pass it through the
->queue_rq() hook for blk-mq instead, freeing up one of the request
bits. The space was overly tight on 32-bit, so Martin also killed
REQ_KERNEL since it's no longer used.
- blk integrity updates and fixes from Martin and Gu Zheng.
- Update to the flush machinery for blk-mq from Ming Lei. Now we
have a per hardware context flush request, which both cleans up the
code should scale better for flush intensive workloads on blk-mq.
- Improve the error printing, from Rob Elliott.
- Backing device improvements and cleanups from Tejun.
- Fixup of a misplaced rq_complete() tracepoint from Hannes.
- Make blk_get_request() return error pointers, fixing up issues
where we NULL deref when a device goes bad or missing. From Joe
Lawrence.
- Prep work for drastically reducing the memory consumption of dm
devices from Junichi Nomura. This allows creating clone bio sets
without preallocating a lot of memory.
- Fix a blk-mq hang on certain combinations of queue depths and
hardware queues from me.
- Limit memory consumption for blk-mq devices for crash dump
scenarios and drivers that use crazy high depths (certain SCSI
shared tag setups). We now just use a single queue and limited
depth for that"
* 'for-3.18/core' of git://git.kernel.dk/linux-block: (58 commits)
block: Remove REQ_KERNEL
blk-mq: allocate cpumask on the home node
bio-integrity: remove the needless fail handle of bip_slab creating
block: include func name in __get_request prints
block: make blk_update_request print prefix match ratelimited prefix
blk-merge: don't compute bi_phys_segments from bi_vcnt for cloned bio
block: fix alignment_offset math that assumes io_min is a power-of-2
blk-mq: Make bt_clear_tag() easier to read
blk-mq: fix potential hang if rolling wakeup depth is too high
block: add bioset_create_nobvec()
block: use bio_clone_fast() in blk_rq_prep_clone()
block: misplaced rq_complete tracepoint
sd: Honor block layer integrity handling flags
block: Replace strnicmp with strncasecmp
block: Add T10 Protection Information functions
block: Don't merge requests if integrity flags differ
block: Integrity checksum flag
block: Relocate bio integrity flags
block: Add a disk flag to block integrity profile
block: Add prefix to block integrity profile flags
...
Both of the file target's calls to vfs_fsync_range() got the end offset
off by one. The range is inclusive, not exclusive. It would sync a bit
more data than was required.
The sync path already tested the length of the range and fell back to
LLONG_MAX so I copied that pattern in the rw path.
This is untested. I found the errors by inspection while following other
code.
Signed-off-by: Zach Brown <zab@zabbo.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds a force_pr_aptpl device attribute used to force SPC-3 PR
Activate Persistence across Target Power Loss (APTPL) operation. This
makes PR metadata write-out occur during state change regardless if new
PERSISTENT_RESERVE_OUT CDBs have their APTPL feature bit set.
This is useful during H/A failover in active/passive setups where all PR
state is being re-created on a different node, driven by configfs backend
device + export layout and pre-loaded $DEV/pr/res_aptpl_metadata.
Cc: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a bug in handling of SPC-3 PR Activate Persistence
across Target Power Loss (APTPL) logic where re-creation of state for
MappedLUNs from dynamically generated NodeACLs did not occur during
I_T Nexus establishment.
It adds the missing core_scsi3_check_aptpl_registration() call during
core_tpg_check_initiator_node_acl() -> core_tpg_add_node_to_devs() in
order to replay any pre-loaded APTPL metadata state associated with
the newly connected SCSI Initiator Port.
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
If more than one thread is waiting for command ring space that includes
a PAD, then if the first one finishes (inserts a PAD and a CMD at the
start of the cmd ring) then the second one will incorrectly think it still
needs to insert a PAD (i.e. cmdr_space_needed is now wrong.) This will
lead to it asking for more space than it actually needs, and then inserting
a PAD somewhere else than at the end -- not what we want.
This patch moves the pad calculation inside is_ring_space_available() so
in the above scenario the second thread would then ask for space not
including a PAD. The patch also inserts a PAD op based upon an up-to-date
cmd_head, instead of the potentially stale value.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The SCSI command tag is set to the tag assigned from the block
layer, not the SCSI-II tag message. So we need to convert
it into the correct SCSI-II tag message based on the
device flags, not the tag value itself.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes up the following unused return smatch warnings:
drivers/target/target_core_user.c:778 tcmu_netlink_event warn: unused return: ret = nla_put_string()
drivers/target/target_core_user.c:780 tcmu_netlink_event warn: unused `return: ret = nla_put_u32()
(Fix up missing semicolon: grover)
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Add a LIO storage engine that presents commands to userspace for execution.
This would allow more complex backstores to be implemented out-of-kernel,
and also make experimentation a-la FUSE (but at the SCSI level -- "SUSE"?)
possible.
It uses a mmap()able UIO device per LUN to share a command ring and data
area. The commands are raw SCSI CDBs and iovs for in/out data. The command
ring is also reused for returning scsi command status and optional sense
data.
This implementation is based on Shaohua Li's earlier version but heavily
modified. Differences include:
* Shared memory allocated by kernel, not locked-down user pages
* Single ring for command request and response
* Offsets instead of embedded pointers
* Generic SCSI CDB passthrough instead of per-cmd specialization in ring
format.
* Uses UIO device instead of anon_file passed in mailbox.
* Optional in-kernel handling of some commands.
The main reason for these differences is to permit greater resiliency
if the user process dies or hangs.
Things not yet implemented (on purpose):
* Zero copy. The data area is flexible enough to allow page flipping or
backend-allocated pages to be used by fabrics, but it's not clear these
are performance wins. Can come later.
* Out-of-order command completion by userspace. Possible to add by just
allowing userspace to change cmd_id in rsp cmd entries, but currently
not supported.
* No locks between kernel cmd submission and completion routines. Sounds
like it's possible, but this can come later.
* Sparse allocation of mmaped area. Current code vmallocs the whole thing.
If the mapped area was larger and not fully mapped then the driver would
have more freedom to change cmd and data area sizes based on demand.
Current code open issues:
* The use of idrs may be overkill -- we maybe can replace them with a
simple counter to generate cmd_ids, and a hash table to get a cmd_id's
associated pointer.
* Use of a free-running counter for cmd ring instead of explicit modulo
math. This would require power-of-2 cmd ring size.
(Add kconfig depends NET - Randy)
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The check of SCF_SCSI_DATA_CDB seems to be a remnant from before hch's
refactoring of this function. There are no places where that flag is set
that cmd->execute_cmd isn't also set.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
During temporary resource starvation at lower transport layer, command
is placed on queue full retry path, which expose this problem. The TCM
queue full handling of SCF_TRANSPORT_TASK_SENSE currently sends the same
cmd twice to lower layer. The 1st time led to cmd normal free path.
The 2nd time cause Null pointer access.
This regression bug was originally introduced v3.1-rc code in the
following commit:
commit e057f53308
Author: Christoph Hellwig <hch@infradead.org>
Date: Mon Oct 17 13:56:41 2011 -0400
target: remove the transport_qf_callback se_cmd callback
Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Cc: <stable@vger.kernel.org> # v3.1+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
list_for_each_entry_safe is necessary if list objects are deleted from
the list while traversing it. Not the case here, so we can use the base
list_for_each_entry variant.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The target code has a rather generous helping of smp_mb__after_atomic()
throughout the code base. Most atomic operations were followed by one
and none were preceded by smp_mb__before_atomic(), nor accompanied by a
comment explaining the need for a barrier.
Instead of trying to prove for every case whether or not it is needed,
this patch introduces atomic_inc_mb() and atomic_dec_mb(), which
explicitly include the memory barriers before and after the atomic
operation. For now they are defined in a target header, although they
could be of general use.
Most of the existing atomic/mb combinations were replaced by the new
helpers. In a few cases the atomic was sandwiched in
spin_lock/spin_unlock and I simply removed the barrier.
I suspect that in most cases the correct conversion would have been to
drop the barrier. I also suspect that a few cases exist where a) the
barrier was necessary and b) a second barrier before the atomic would
have been necessary and got added by this patch.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
atomic_inc_return() already does an implicit memory barrier and the
second case was moved from an atomic to a plain flag operation. If a
barrier were needed in the second case, it would have to be smp_mb(),
not a variant optimized away for x86 and other architectures.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
And while at it, do minimal coding style fixes in the area.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Simple and just called from one place.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Remove core_tpg_pre_dellun entirely, since we don't need to get/check
a pointer we already have.
Nothing else can return an error, so core_dev_del_lun can return void.
Rename core_tpg_post_dellun to remove_lun - a clearer name, now that
pre_dellun is gone.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nothing in it can raise an error.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
With the recent addition of percpu_ref_reinit(), percpu_ref now can be
used as a persistent switch which can be turned on and off repeatedly
where turning off maps to killing the ref and waiting for it to drain;
however, there currently isn't a way to initialize a percpu_ref in its
off (killed and drained) state, which can be inconvenient for certain
persistent switch use cases.
Similarly, percpu_ref_switch_to_atomic/percpu() allow dynamic
selection of operation mode; however, currently a newly initialized
percpu_ref is always in percpu mode making it impossible to avoid the
latency overhead of switching to atomic mode.
This patch adds @flags to percpu_ref_init() and implements the
following flags.
* PERCPU_REF_INIT_ATOMIC : start ref in atomic mode
* PERCPU_REF_INIT_DEAD : start ref killed and drained
These flags should be able to serve the above two use cases.
v2: target_core_tpg.c conversion was missing. Fixed.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
This is to receive 0a30288da1 ("blk-mq, percpu_ref: implement a
kludge for SCSI blk-mq stall during probe") which implements
__percpu_ref_kill_expedited() to work around SCSI blk-mq stall. The
commit reverted and patches to implement proper fix will be added.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>
Clearly a right-shift was meant. Effectively doesn't make a difference,
as add_len is hard-coded to 8 and the high byte will be zero no matter
which way you shift. But I hate leaving bad examples for others to
copy.
Found by coverity.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch correctly handles match_int() errors in FILEIO + PSCSI
backend parameter parsing, which can potentially fail due to a
memory allocation failure or invalid argument.
Found by coverity.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Old code in iscsi_parse_pr_out_transport_id() was obviously buggy
and effectively ignored the high byte.
Found by coverity.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Coverity complained that lun_cg has been dereferenced in all paths
leading to NULL check. It didn't mention that only a single path could
lead there and the code can be simplified even further.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a memory leak on error in target_fabric_make_mappedlun(),
where se_lun_acl memory does not get released on exit.
Found by coverity.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Each case of match_strdup could leak memory if the same argument was
present before. I am not too concerned, as it would require a
non-sensical combination like "target_lun=foo target_lun=bar", done
with root privileges and even then leak just a few bytes per instance.
But arg_p is different, as it will always leak memory. Let's plug that
one. And while at it, replace some &args[0] with args.
Found by coverity.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
last_intr_fail_name is a fixed-size array and could theoretically
overflow. In reality intrname->value doesn't seem to depend on
untrusted input or be anywhere near 224 characters, so the overflow is
pretty theoretical. But strlcpy is cheap enough.
Found by coverity.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
In iscsi_copy_param_list() a failed iscsi_param_list memory allocation
currently invokes iscsi_release_param_list() to cleanup, and will promptly
trigger a NULL pointer dereference.
Instead, go ahead and return for the first iscsi_copy_param_list()
failure case.
Found by coverity.
Signed-off-by: Joern Engel <joern@logfs.org>
Cc: <stable@vger.kernel.org> # v3.1+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a bug in iscsit_logout_post_handler_diffcid() where
a pointer used as storage for list_for_each_entry() was incorrectly
being used to determine if no matching entry had been found.
This patch changes iscsit_logout_post_handler_diffcid() to key off
bool conn_found to determine if the function needs to exit early.
Reported-by: Joern Engel <joern@logfs.org>
Cc: <stable@vger.kernel.org> # v3.1+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Found by coverity. At this point sock is non-NULL, so the check
to unnecessary.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch drops the now duplicate + unnecessary check for -ENODEV from
iscsi_transport->iscsit_accept_np() for jumping to out:, or immediately
returning 1 in __iscsi_target_login_thread() code.
Since commit 81a9c5e72b the jump to out: and returning 1 have the same
effect, and end up hitting the ISCSI_NP_THREAD_SHUTDOWN check regardless
at the top of __iscsi_target_login_thread() during next loop iteration.
So that said, it's safe to go ahead and remove this duplicate check.
Reported-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The return statement cannot be reached without either recovery or dump
being set to 1. Therefore the condition always evaluates to true and
recovery and dump are useless variables.
Found by Coverity.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Found by coverity. InitiatorName and InitiatorAlias are static arrays
and therefore always non-NULL. At some point in the past they may have
been dynamically allocated, but for current code the condition is
useless. If the intent was to check InitiatorName[0] instead, I cannot
find a use for that either. Let's get rid of it.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Last user of buf was removed with c6037cc546. While at it,
free_cpumask_var() handles a NULL argument just fine, so remove the
conditionals.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Fix inverted logic in SE_DEV_ALUA_SUPPORT_STATE_STORE for setting
the supported ALUA access states via configfs, originally introduced
in commit b0a382c5.
A value of 1 should enable the support, not disable it.
Signed-off-by: Sebastian Herbszt <herbszt@gmx.de>
Cc: <stable@vger.kernel.org> # v3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes an apparent cut and paste error in spc_emulate_evpd_b3(),
where lba_map_segment_size was being used twice for the Referrals VPD.
Go ahead and set the correct user data segment multiplier instead of
user data segment size.
Signed-off-by: Sebastian Herbszt <herbszt@gmx.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Clearly the file was meant to contain an include guard, but it was
missing the #define part.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The use of "rcu_assign_pointer()" is NULLing out the pointer.
According to RCU_INIT_POINTER()'s block comment:
"1. This use of RCU_INIT_POINTER() is NULLing out the pointer"
it is better to use it instead of rcu_assign_pointer() because it has a
smaller overhead.
The following Coccinelle semantic patch was used:
@@
@@
- rcu_assign_pointer
+ RCU_INIT_POINTER
(..., NULL)
Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch removes the null test on lun_cg. lun_cg is initialized
at the beginning of the function to &lun->lun_group. Since lun_cg is
dereferenced prior to the null test, it must be a valid pointer.
The following Coccinelle script is used for detecting the change:
@r@
expression e,f;
identifier g,y;
statement S1,S2;
@@
*e = &f->g
<+...
f->y
...+>
*if (e != NULL || ...)
S1 else S2
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds a explicit check in iscsit_find_cmd_from_itt_or_dump()
to ignore commands with ICF_GOT_LAST_DATAOUT set. This is done to
address the case where an ITT is being reused for DataOUT, but the
previous command with the same ITT has not yet been acknowledged by
ExpStatSN and removed from the per connection command list.
This issue was originally manifesting itself by referencing the
previous command during ITT lookup, and subsequently hitting the
check in iscsit_check_dataout_hdr() for ICF_GOT_LAST_DATAOUT, that
resulted in the DataOUT PDU + associated payload being silently
dumped.
Reported-by: Arshad Hussain <arshad.hussain@calsoftinc.com>
Tested-by: Arshad Hussain <arshad.hussain@calsoftinc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Percpu allocator now supports allocation mask. Add @gfp to
percpu_ref_init() so that !GFP_KERNEL allocation masks can be used
with percpu_refs too.
This patch doesn't make any functional difference.
v2: blk-mq conversion was missing. Updated.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
Cc: Jens Axboe <axboe@kernel.dk>
The blk_get_request function may fail in low-memory conditions or during
device removal (even if __GFP_WAIT is set). To distinguish between these
errors, modify the blk_get_request call stack to return the appropriate
ERR_PTR. Verify that all callers check the return status and consider
IS_ERR instead of a simple NULL pointer check.
For consistency, make a similar change to the blk_mq_alloc_request leg
of blk_get_request. It may fail if the queue is dead, or the caller was
unwilling to wait.
Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Jiri Kosina <jkosina@suse.cz> [for pktdvd]
Acked-by: Boaz Harrosh <bharrosh@panasas.com> [for osd]
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This patch set consists of the usual driver updates (ufs, storvsc, pm8001
hpsa). It also has removal of the user space target driver code (everyone is
using LIO now), a partial PCI MSI-X update, more multi-queue updates,
conversion to 64 bit LUNs (so we could theoretically cope with any LUN
returned by a device) and placeholder support for the ZBC device type (Shingle
drives), plus an assortment of minor updates and bug fixes.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJT4mS9AAoJEDeqqVYsXL0Mq34H/2AeXiM8GEVO3PIsBtF3TFZ9
poJvAyb8t//+VwAIVLHU9wrssIrIcyvNQmNHH/InGt5rOaXwGQRsnEc73bBtot4b
aC1t+hAnp2Ddvu6phmyUg7iY2GmQhAoZmeaj7krGIu2XgtLGiPg26eSsgk4Yv/U9
cuULEuOc/UnTj3w5VK8SvpyXMybVF6oQhSrS1slOglfFwPTlTI/NHU9xo7Wc3qHT
VifHXNphIvye5EH8zwtKX5p8qCrFW0pevJwyfPz7Hp2CTA9XYKx3SoeOh+n9F9ez
udBBggg7Vb1tb4mPKUoZ78UrtCVdFSCmesBU/RJe7cIh8daKaO5MVr3WPSx2JhM=
=yGai
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This patch set consists of the usual driver updates (ufs, storvsc,
pm8001 hpsa). It also has removal of the user space target driver
code (everyone is using LIO now), a partial PCI MSI-X update, more
multi-queue updates, conversion to 64 bit LUNs (so we could
theoretically cope with any LUN returned by a device) and placeholder
support for the ZBC device type (Shingle drives), plus an assortment
of minor updates and bug fixes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (143 commits)
scsi: do not issue SCSI RSOC command to Promise Vtrak E610f
vmw_pvscsi: Use pci_enable_msix_exact() instead of pci_enable_msix()
pm8001: Fix invalid return when request_irq() failed
lpfc: Remove superfluous call to pci_disable_msix()
isci: Use pci_enable_msix_exact() instead of pci_enable_msix()
bfa: Use pci_enable_msix_exact() instead of pci_enable_msix()
bfa: Cleanup bfad_setup_intr() function
bfa: Do not call pci_enable_msix() after it failed once
fnic: Use pci_enable_msix_exact() instead of pci_enable_msix()
scsi: use short driver name for per-driver cmd slab caches
scsi_debug: support scsi-mq, queues and locks
Drivers: add blist flags
scsi: ufs: fix endianness sparse warnings
scsi: ufs: make undeclared functions static
bnx2i: Update driver version to 2.7.10.1
pm8001: fix a memory leak in nvmd_resp
pm8001: fix update_flash
pm8001: fix a memory leak in flash_update
pm8001: Cleaning up uninitialized variables
pm8001: Fix to remove null pointer checks that could never happen
...
Pull percpu updates from Tejun Heo:
- Major reorganization of percpu header files which I think makes
things a lot more readable and logical than before.
- percpu-refcount is updated so that it requires explicit destruction
and can be reinitialized if necessary. This was pulled into the
block tree to replace the custom percpu refcnting implemented in
blk-mq.
- In the process, percpu and percpu-refcount got cleaned up a bit
* 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (21 commits)
percpu-refcount: implement percpu_ref_reinit() and percpu_ref_is_zero()
percpu-refcount: require percpu_ref to be exited explicitly
percpu-refcount: use unsigned long for pcpu_count pointer
percpu-refcount: add helpers for ->percpu_count accesses
percpu-refcount: one bit is enough for REF_STATUS
percpu-refcount, aio: use percpu_ref_cancel_init() in ioctx_alloc()
workqueue: stronger test in process_one_work()
workqueue: clear POOL_DISASSOCIATED in rebind_workers()
percpu: Use ALIGN macro instead of hand coding alignment calculation
percpu: invoke __verify_pcpu_ptr() from the generic part of accessors and operations
percpu: preffity percpu header files
percpu: use raw_cpu_*() to define __this_cpu_*()
percpu: reorder macros in percpu header files
percpu: move {raw|this}_cpu_*() definitions to include/linux/percpu-defs.h
percpu: move generic {raw|this}_cpu_*_N() definitions to include/asm-generic/percpu.h
percpu: only allow sized arch overrides for {raw|this}_cpu_*() ops
percpu: reorganize include/linux/percpu-defs.h
percpu: move accessors from include/linux/percpu.h to percpu-defs.h
percpu: include/asm-generic/percpu.h should contain only arch-overridable parts
percpu: introduce arch_raw_cpu_ptr()
...
The SCSI standard defines 64-bit values for LUNs, and large arrays
employing large or hierarchical LUN numbers become more and more
common.
So update the linux SCSI stack to use 64-bit LUN numbers.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Currently, a percpu_ref undoes percpu_ref_init() automatically by
freeing the allocated percpu area when the percpu_ref is killed.
While seemingly convenient, this has the following niggles.
* It's impossible to re-init a released reference counter without
going through re-allocation.
* In the similar vein, it's impossible to initialize a percpu_ref
count with static percpu variables.
* We need and have an explicit destructor anyway for failure paths -
percpu_ref_cancel_init().
This patch removes the automatic percpu counter freeing in
percpu_ref_kill_rcu() and repurposes percpu_ref_cancel_init() into a
generic destructor now named percpu_ref_exit(). percpu_ref_destroy()
is considered but it gets confusing with percpu_ref_kill() while
"exit" clearly indicates that it's the counterpart of
percpu_ref_init().
All percpu_ref_cancel_init() users are updated to invoke
percpu_ref_exit() instead and explicit percpu_ref_exit() calls are
added to the destruction path of all percpu_ref users.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
Cc: Li Zefan <lizefan@huawei.com>
On uniprocessor preemptible kernel, target core deadlocks on unload. The
following events happen:
* iscsit_del_np is called
* it calls send_sig(SIGINT, np->np_thread, 1);
* the scheduler switches to the np_thread
* the np_thread is woken up, it sees that kthread_should_stop() returns
false, so it doesn't terminate
* the np_thread clears signals with flush_signals(current); and goes back
to sleep in iscsit_accept_np
* the scheduler switches back to iscsit_del_np
* iscsit_del_np calls kthread_stop(np->np_thread);
* the np_thread is waiting in iscsit_accept_np and it doesn't respond to
kthread_stop
The deadlock could be resolved if the administrator sends SIGINT signal to
the np_thread with killall -INT iscsi_np
The reproducible deadlock was introduced in commit
db6077fd0b, but the thread-stopping code was
racy even before.
This patch fixes the problem. Using kthread_should_stop to stop the
np_thread is unreliable, so we test np_thread_state instead. If
np_thread_state equals ISCSI_NP_THREAD_SHUTDOWN, the thread exits.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch changes iscsit_check_dataout_hdr() to dump the incoming
Data-Out payload when the received ITT is not associated with a
WRITE, instead of calling iscsit_reject_cmd() for the non WRITE
ITT descriptor.
This addresses a bug where an initiator sending an Data-Out for
an ITT associated with a READ would end up generating a reject
for the READ, eventually resulting in list corruption.
Reported-by: Santosh Kulkarni <santosh.kulkarni@calsoftinc.com>
Reported-by: Arshad Hussain <arshad.hussain@calsoftinc.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a tcm_loop_cmd descriptor memory leak in the
tcm_loop_submission_work() error path, and would result in
warnings about leaked tcm_loop_cmd_cache objects at module
unload time.
Go ahead and invoke kmem_cache_free() to release tl_cmd back to
tcm_loop_cmd_cache before calling sc->scsi_done().
Reported-by: Sebastian Herbszt <herbszt@gmx.de>
Tested-by: Sebastian Herbszt <herbszt@gmx.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds a explicit memset to the login response PDU
exception path in iscsit_tx_login_rsp().
This addresses a regression bug introduced in commit baa4d64b
where the initiator would end up not receiving the login
response and associated status class + detail, before closing
the login connection.
Reported-by: Christophe Vu-Brugier <cvubrugier@yahoo.fr>
Tested-by: Christophe Vu-Brugier <cvubrugier@yahoo.fr>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a left-over se_lun->lun_sep pointer OOPs when one
of the /sys/kernel/config/target/$FABRIC/$WWPN/$TPGT/lun/$LUN/alua*
attributes is accessed after the $DEVICE symlink has been removed.
To address this bug, go ahead and clear se_lun->lun_sep memory in
core_dev_unexport(), so that the existing checks for show/store
ALUA attributes in target_core_fabric_configfs.c work as expected.
Reported-by: Sebastian Herbszt <herbszt@gmx.de>
Tested-by: Sebastian Herbszt <herbszt@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds a check in chap_server_compute_md5() to enforce a
1024 byte maximum for the CHAP_C key value following the requirement
in RFC-3720 Section 11.1.4:
"..., C and R are large-binary-values and their binary length (not
the length of the character string that represents them in encoded
form) MUST not exceed 1024 bytes."
Reported-by: rahul.rane <rahul.rane@calsoftinc.com>
Tested-by: rahul.rane <rahul.rane@calsoftinc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch converts chap_server_compute_md5() from simple_strtoul() to
kstrtoul usage().
This addresses the case where a empty 'CHAP_I=' key value received during
mutual authentication would be converted to a '0' by simple_strtoul(),
instead of failing the login attempt.
Reported-by: Tejas Vaykole <tejas.vaykole@calsoftinc.com>
Tested-by: Tejas Vaykole <tejas.vaykole@calsoftinc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Pull SCSI target updates from Nicholas Bellinger:
"The highlights this round include:
- Add support for T10 PI pass-through between vhost-scsi +
virtio-scsi (MST + Paolo + MKP + nab)
- Add support for T10 PI in qla2xxx target mode (Quinn + MKP + hch +
nab, merged through scsi.git)
- Add support for percpu-ida pre-allocation in qla2xxx target code
(Quinn + nab)
- A number of iser-target fixes related to hardening the network
portal shutdown path (Sagi + Slava)
- Fix response length residual handling for a number of control CDBs
(Roland + Christophe V.)
- Various iscsi RFC conformance fixes in the CHAP authentication path
(Tejas and Calsoft folks + nab)
- Return TASK_SET_FULL status for tcm_fc(FCoE) DataIn + Response
failures (Vasu + Jun + nab)
- Fix long-standing ABORT_TASK + session reset hang (nab)
- Convert iser-initiator + iser-target to include T10 bytes into EDTL
(Sagi + Or + MKP + Mike Christie)
- Fix NULL pointer dereference regression related to XCOPY introduced
in v3.15 + CC'ed to v3.12.y (nab)"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (34 commits)
target: Fix NULL pointer dereference for XCOPY in target_put_sess_cmd
vhost-scsi: Include prot_bytes into expected data transfer length
TARGET/sbc,loopback: Adjust command data length in case pi exists on the wire
libiscsi, iser: Adjust data_length to include protection information
scsi_cmnd: Introduce scsi_transfer_length helper
target: Report correct response length for some commands
target/sbc: Check that the LBA and number of blocks are correct in VERIFY
target/sbc: Remove sbc_check_valid_sectors()
Target/iscsi: Fix sendtargets response pdu for iser transport
Target/iser: Fix a wrong dereference in case discovery session is over iser
iscsi-target: Fix ABORT_TASK + connection reset iscsi_queue_req memory leak
target: Use complete_all for se_cmd->t_transport_stop_comp
target: Set CMD_T_ACTIVE bit for Task Management Requests
target: cleanup some boolean tests
target/spc: Simplify INQUIRY EVPD=0x80
tcm_fc: Generate TASK_SET_FULL status for response failures
tcm_fc: Generate TASK_SET_FULL status for DataIN failures
iscsi-target: Reject mutual authentication with reflected CHAP_C
iscsi-target: Remove no-op from iscsit_tpg_del_portal_group
iscsi-target: Fix CHAP_A parameter list handling
...
This patch fixes a NULL pointer dereference regression bug that was
introduced with:
commit 1e1110c43b
Author: Mikulas Patocka <mpatocka@redhat.com>
Date: Sat May 17 06:49:22 2014 -0400
target: fix memory leak on XCOPY
Now that target_put_sess_cmd() -> kref_put_spinlock_irqsave() is
called with a valid se_cmd->cmd_kref, a NULL pointer dereference
is triggered because the XCOPY passthrough commands don't have
an associated se_session pointer.
To address this bug, go ahead and checking for a NULL se_sess pointer
within target_put_sess_cmd(), and call se_cmd->se_tfo->release_cmd()
to release the XCOPY's xcopy_pt_cmd memory.
Reported-by: Thomas Glanzmann <thomas@glanzmann.de>
Cc: Thomas Glanzmann <thomas@glanzmann.de>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
In various areas of the code, it is assumed that
se_cmd->data_length describes pure data. In case
that protection information exists over the wire
(protect bits is are on) the target core re-calculates
the data length from the CDB and the backed device
block size (instead of each transport peeking in the cdb).
Modify loopback device to include protection information
in the transferred data length (like other scsi transports).
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.15+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
When an initiator sends an allocation length bigger than what its
command consumes, the target should only return the actual response data
and set the residual length to the unused part of the allocation length.
Add a helper function that command handlers (INQUIRY, READ CAPACITY,
etc) can use to do this correctly, and use this code to get the correct
residual for commands that don't use the full initiator allocation in the
handlers for READ CAPACITY, READ CAPACITY(16), INQUIRY, MODE SENSE and
REPORT LUNS.
This addresses a handful of failures as reported by Christophe with
the Windows Certification Kit:
http://permalink.gmane.org/gmane.linux.scsi.target.devel/6515
Signed-off-by: Roland Dreier <roland@purestorage.com>
Tested-by: Christophe Vu-Brugier <cvubrugier@yahoo.fr>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch extracts LBA + sectors for VERIFY, and adds a goto check_lba
to perform the end-of-device checking.
(Update patch to drop lba_check usage - nab)
Signed-off-by: Christophe Vu-Brugier <cvubrugier@yahoo.fr>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
A similar check is performed at the end of sbc_parse_cdb() and is now
enforced if the SYNCHRONIZE CACHE command's backend supports
->execute_sync_cache().
(Add check_lba goto to avoid *_max_sectors checks - nab)
Signed-off-by: Christophe Vu-Brugier <cvubrugier@yahoo.fr>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
In case the transport is iser we should not include the
iscsi target info in the sendtargets text response pdu.
This causes sendtargets response to include the target
info twice.
Modify iscsit_build_sendtargets_response to filter
transport types that don't match.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reported-by: Slava Shwartsman <valyushash@gmail.com>
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Pull block layer fixes from Jens Axboe:
"Final small batch of fixes to be included before -rc1. Some general
cleanups in here as well, but some of the blk-mq fixes we need for the
NVMe conversion and/or scsi-mq. The pull request contains:
- Support for not merging across a specified "chunk size", if set by
the driver. Some NVMe devices perform poorly for IO that crosses
such a chunk, so we need to support it generically as part of
request merging avoid having to do complicated split logic. From
me.
- Bump max tag depth to 10Ki tags. Some scsi devices have a huge
shared tag space. Before we failed with EINVAL if a too large tag
depth was specified, now we truncate it and pass back the actual
value. From me.
- Various blk-mq rq init fixes from me and others.
- A fix for enter on a dying queue for blk-mq from Keith. This is
needed to prevent oopsing on hot device removal.
- Fixup for blk-mq timer addition from Ming Lei.
- Small round of performance fixes for mtip32xx from Sam Bradshaw.
- Minor stack leak fix from Rickard Strandqvist.
- Two __init annotations from Fabian Frederick"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: add __init to blkcg_policy_register
block: add __init to elv_register
block: ensure that bio_add_page() always accepts a page for an empty bio
blk-mq: add timer in blk_mq_start_request
blk-mq: always initialize request->start_time
block: blk-exec.c: Cleaning up local variable address returnd
mtip32xx: minor performance enhancements
blk-mq: ->timeout should be cleared in blk_mq_rq_ctx_init()
blk-mq: don't allow queue entering for a dying queue
blk-mq: bump max tag depth to 10K tags
block: add blk_rq_set_block_pc()
block: add notion of a chunk size for request merging
This patch fixes a iscsi_queue_req memory leak when ABORT_TASK response
has been queued by TFO->queue_tm_rsp() -> lio_queue_tm_rsp() after a
long standing I/O completes, but the connection has already reset and
waiting for cleanup to complete in iscsit_release_commands_from_conn()
-> transport_generic_free_cmd() -> transport_wait_for_tasks() code.
It moves iscsit_free_queue_reqs_for_conn() after the per-connection command
list has been released, so that the associated se_cmd tag can be completed +
released by target-core before freeing any remaining iscsi_queue_req memory
for the connection generated by lio_queue_tm_rsp().
Cc: Thomas Glanzmann <thomas@glanzmann.de>
Cc: Charalampos Pournaris <charpour@gmail.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a bug where multiple waiters on ->t_transport_stop_comp
occurs due to a concurrent ABORT_TASK and session reset both invoking
transport_wait_for_tasks(), while waiting for the associated se_cmd
descriptor backend processing to complete.
For this case, complete_all() should be invoked in order to wake up
both waiters in core_tmr_abort_task() + transport_generic_free_cmd()
process contexts.
Cc: Thomas Glanzmann <thomas@glanzmann.de>
Cc: Charalampos Pournaris <charpour@gmail.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a bug where se_cmd descriptors associated with a
Task Management Request (TMR) where not setting CMD_T_ACTIVE before
being dispatched into target_tmr_work() process context.
This is required in order for transport_generic_free_cmd() ->
transport_wait_for_tasks() to wait on se_cmd->t_transport_stop_comp
if a session reset event occurs while an ABORT_TASK is outstanding
waiting for another I/O to complete.
Cc: Thomas Glanzmann <thomas@glanzmann.de>
Cc: Charalampos Pournaris <charpour@gmail.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Now that 3.15 is released, this merges the 'next' branch into 'master',
bringing us to the normal situation where my 'master' branch is the
merge window.
* accumulated work in next: (6809 commits)
ufs: sb mutex merge + mutex_destroy
powerpc: update comments for generic idle conversion
cris: update comments for generic idle conversion
idle: remove cpu_idle() forward declarations
nbd: zero from and len fields in NBD_CMD_DISCONNECT.
mm: convert some level-less printks to pr_*
MAINTAINERS: adi-buildroot-devel is moderated
MAINTAINERS: add linux-api for review of API/ABI changes
mm/kmemleak-test.c: use pr_fmt for logging
fs/dlm/debug_fs.c: replace seq_printf by seq_puts
fs/dlm/lockspace.c: convert simple_str to kstr
fs/dlm/config.c: convert simple_str to kstr
mm: mark remap_file_pages() syscall as deprecated
mm: memcontrol: remove unnecessary memcg argument from soft limit functions
mm: memcontrol: clean up memcg zoneinfo lookup
mm/memblock.c: call kmemleak directly from memblock_(alloc|free)
mm/mempool.c: update the kmemleak stack trace for mempool allocations
lib/radix-tree.c: update the kmemleak stack trace for radix tree allocations
mm: introduce kmemleak_update_trace()
mm/kmemleak.c: use %u to print ->checksum
...
Convert "x == true" to "x" and "x == false" to "!x".
Signed-off-by: Christophe Vu-Brugier <cvubrugier@yahoo.fr>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch changes ft_queue_status() to set SAM_STAT_TASK_SET_FULL
status upon lport->tt.seq_send( failure, and return -EAGAIN to notify
target-core to attempt to requeue the response.
It also does the same for a fc_frame_alloc() failures, in order to
signal the initiator that it should try to reduce it's current
queue_depth, to lower the number of outstanding I/Os on the wire.
Reported-by: Vasu Dev <vasu.dev@linux.intel.com>
Reviewed-by: Vasu Dev <vasu.dev@linux.intel.com>
Cc: Jun Wu <jwu@stormojo.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch changes ft_queue_data_in() to set SAM_STAT_TASK_SET_FULL
status upon a lport->tt.seq_send() failure, where it will now stop
sending subsequent DataIN, and immediately attempt to send the
response with exception status.
Sending a response with SAM_STAT_TASK_SET_FULL status is useful in
order to signal the initiator that it should try to reduce it's
current queue_depth, to lower the number of outstanding I/Os on
the wire.
Also, add a check to skip sending DataIN if TASK_SET_FULL status
has already been set due to a response lport->tt.seq_send()
failure, that has asked target-core to requeue a response.
Reported-by: Vasu Dev <vasu.dev@linux.intel.com>
Reviewed-by: Vasu Dev <vasu.dev@linux.intel.com>
Cc: Jun Wu <jwu@stormojo.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
With the optimizations around not clearing the full request at alloc
time, we are leaving some of the needed init for REQ_TYPE_BLOCK_PC
up to the user allocating the request.
Add a blk_rq_set_block_pc() that sets the command type to
REQ_TYPE_BLOCK_PC, and properly initializes the members associated
with this type of request. Update callers to use this function instead
of manipulating rq->cmd_type directly.
Includes fixes from Christoph Hellwig <hch@lst.de> for my half-assed
attempt.
Signed-off-by: Jens Axboe <axboe@fb.com>
This patch fixes a OOPs where an attempt to write to the per-device
alua_access_state configfs attribute at:
/sys/kernel/config/target/core/$HBA/$DEV/alua/$TG_PT_GP/alua_access_state
results in an NULL pointer dereference when the backend device has not
yet been configured.
This patch adds an explicit check for DF_CONFIGURED, and fails with
-ENODEV to avoid this case.
Reported-by: Chris Boot <crb@tiger-computing.co.uk>
Reported-by: Philip Gaw <pgaw@darktech.org.uk>
Cc: Chris Boot <crb@tiger-computing.co.uk>
Cc: Philip Gaw <pgaw@darktech.org.uk>
Cc: stable@vger.kernel.org # 3.8+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch allows READ_CAPACITY + SAI_READ_CAPACITY_16 opcode
processing to occur while the associated ALUA group is in Standby
access state.
This is required to avoid host side LUN probe failures during the
initial scan if an ALUA group has already implicitly changed into
Standby access state.
This addresses a bug reported by Chris + Philip using dm-multipath
+ ESX hosts configured with ALUA multipath.
Reported-by: Chris Boot <crb@tiger-computing.co.uk>
Reported-by: Philip Gaw <pgaw@darktech.org.uk>
Cc: Chris Boot <crb@tiger-computing.co.uk>
Cc: Philip Gaw <pgaw@darktech.org.uk>
Cc: Hannes Reinecke <hare@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds an explicit check in chap_server_compute_md5() to ensure
the CHAP_C value received from the initiator during mutual authentication
does not match the original CHAP_C provided by the target.
This is in line with RFC-3720, section 8.2.1:
Originators MUST NOT reuse the CHAP challenge sent by the Responder
for the other direction of a bidirectional authentication.
Responders MUST check for this condition and close the iSCSI TCP
connection if it occurs.
Reported-by: Tejas Vaykole <tejas.vaykole@calsoftinc.com>
Cc: stable@vger.kernel.org # 3.1+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch removes a no-op iscsit_clear_tpg_np_login_threads() call
in iscsit_tpg_del_portal_group(), which is unnecessary because
iscsit_tpg_del_portal_group() can only ever be removed from configfs
once all of the child network portals have been released.
Also, go ahed and make iscsit_clear_tpg_np_login_threads() declared
as static.
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The target is failing to handle list of CHAP_A key-value pair form
initiator.The target is expecting CHAP_A=5 always. In other cases,
where initiator sends list (for example) CHAP_A=6,5 target is failing
the security negotiation. Which is incorrect.
This patch handles the case (RFC 3720 section 11.1.4).
where in the initiator may send list of CHAP_A values and target replies
with appropriate CHAP_A value in response
(Drop whitespaces + rename to chap_check_algorithm + save original
pointer + add explicit check for CHAP_A key - nab)
Signed-off-by: Tejas Vaykole <tejas.vaykole@calsoftinc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
If the message "Unable to allocate…" pops up, it's useful to know
whether the problem is that the system is genuinely out of memory, or
that some bug has led to a crazy allocation length.
In particular this helped debug a corruption of login headers in
iscsi_login_non_zero_tsih_s1().
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch changes iscsi_target_handle_csg_zero() to explicitly
reject login requests in SecurityNegotiation with a zero-length
payload, following the language in RFC-3720 Section 8.2:
Whenever an iSCSI target gets a response whose keys, or their
values, are not according to the step definition, it MUST answer
with a Login reject with the "Initiator Error" or "Missing Parameter"
status.
Previously when a zero-length login request in CSG=0 was received,
the target would send a login response with CSG=0 + T_BIT=0 asking
the initiator to complete authentication, and not fail the login
until MAX_LOGIN_PDUS was reached. This change will now immediately
fail the login attempt with ISCSI_STATUS_CLS_INITIATOR_ERR status.
Reported-by: Tejas Vaykole <tejas.vaykole@calsoftinc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a iser-target specific regression introduced in
v3.15-rc6 with:
commit 14f4b54fe3
Author: Sagi Grimberg <sagig@mellanox.com>
Date: Tue Apr 29 13:13:47 2014 +0300
Target/iscsi,iser: Avoid accepting transport connections during stop stage
where the change to set iscsi_np->enabled = false within
iscsit_clear_tpg_np_login_thread() meant that a iscsi_np with
two iscsi_tpg_np exports would have it's parent iscsi_np set
to a disabled state, even if other iscsi_tpg_np exports still
existed.
This patch changes iscsit_clear_tpg_np_login_thread() to only
set iscsi_np->enabled = false when shutdown = true, and also
changes iscsit_del_np() to set iscsi_np->enabled = true when
iscsi_np->np_exports is non zero.
Cc: Sagi Grimberg <sagig@dev.mellanox.co.il>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
In non-leading connection login, iscsi_login_non_zero_tsih_s1() calls
iscsi_change_param_value() with the buffer it uses to hold the login
PDU, not a temporary buffer. This leads to the login header getting
corrupted and login failing for non-leading connections in MC/S.
Fix this by adding a wrapper iscsi_change_param_sprintf() that handles
the temporary buffer itself to avoid confusion. Also handle sending a
reject in case of failure in the wrapper, which lets the calling code
get quite a bit smaller and easier to read.
Finally, bump the size of the temporary buffer from 32 to 64 bytes to be
safe, since "MaxRecvDataSegmentLength=" by itself is 25 bytes; with a
trailing NUL, a value >= 1M will lead to a buffer overrun. (This isn't
the default but we don't need to run right at the ragged edge here)
Reported-by: Santosh Kulkarni <santosh.kulkarni@calsoftinc.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
On each processed XCOPY command, two "kmalloc-512" memory objects are
leaked. These represent two allocations of struct xcopy_pt_cmd in
target_core_xcopy.c.
The reason for the memory leak is that the cmd_kref field is not
initialized (thus, it is zero because the allocations were done with
kzalloc). When we decrement zero kref in target_put_sess_cmd, the result
is not zero, thus target_release_cmd_kref is not called.
This patch fixes the bug by moving kref initialization from
target_get_sess_cmd to transport_init_se_cmd (this function is called from
target_core_xcopy.c, so it will correctly initialize cmd_kref). It can be
easily verified that all code that calls target_get_sess_cmd also calls
transport_init_se_cmd earlier, thus moving kref_init shouldn't introduce
any new problems.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Just like for pSCSI, if the transport sets get_write_cache, then it is
not valid to enable write cache emulation for it. Return an error.
see https://bugzilla.redhat.com/show_bug.cgi?id=1082675
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Andy Grover <agrover@redhat.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch explicitly disables Immediate + Unsolicited Data for ISER
connections during login in iscsi_login_zero_tsih_s2() when protection
has been enabled for the session by the underlying hardware.
This is currently required because protection / signature memory regions
(MRs) expect T10 PI to occur on RDMA READs + RDMA WRITEs transfers, and
not on a immediate data payload associated with ISCSI_OP_SCSI_CMD, or
unsolicited data-out associated with a ISCSI_OP_SCSI_DATA_OUT.
v2 changes:
- Add TARGET_PROT_DOUT_INSERT check (Sagi)
- Add pr_debug noisemaker (Sagi)
- Add goto to avoid early return from MRDSL check (nab)
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a free-after-use regression in ft_free_cmd(), where
ft_sess_put() is called with cmd->sess after percpu_ida_free() has
already released the tag.
Fix this bug by saving the ft_sess pointer ahead of percpu_ida_free(),
and pass it directly to ft_sess_put().
The regression was originally introduced in v3.13-rc1 commit:
commit 5f544cfac9
Author: Nicholas Bellinger <nab@daterainc.com>
Date: Mon Sep 23 12:12:42 2013 -0700
tcm_fc: Convert to per-cpu command map pre-allocation of ft_cmd
Reported-by: Jun Wu <jwu@stormojo.com>
Cc: Mark Rustad <mark.d.rustad@intel.com>
Cc: Robert Love <robert.w.love@intel.com>
Cc: <stable@vger.kernel.org> #3.13+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch changes an incorrect use of BUG_ON to instead generate a
REJECT + PROTOCOL_ERROR in iscsit_process_nop_out() code. This case
can occur with traditional TCP where a flood of zeros in the data
stream can reach this block for what is presumed to be a NOP-OUT with
a solicited reply, but without a valid iscsi_cmd pointer.
This incorrect BUG_ON was introduced during the v3.11-rc timeframe
with the following commit:
commit 778de36896
Author: Nicholas Bellinger <nab@linux-iscsi.org>
Date: Fri Jun 14 16:07:47 2013 -0700
iscsi/isert-target: Refactor ISCSI_OP_NOOP RX handling
Reported-by: Arshad Hussain <arshad.hussain@calsoftinc.com>
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
When the target is in stop stage, iSER transport initiates RDMA disconnects.
The iSER initiator may wish to establish a new connection over the
still existing network portal. In this case iSER transport should not
accept and resume new RDMA connections. In order to learn that, iscsi_np
is added with enabled flag so the iSER transport can check when deciding
weather to accept and resume a new connection request.
The iscsi_np is enabled after successful transport setup, and disabled
before iscsi_np login threads are cleaned up.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Userspace tools assume if a value is read from configfs, it is valid
and will not cause an error if the same value is written back. The only
valid value for pi_prot_type for backends not supporting DIF is 0, so allow
this particular value to be set without returning an error.
Reported-by: Krzysztof Chojnowski <frirajder@gmail.com>
Signed-off-by: Andy Grover <agrover@redhat.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Pull yet more networking updates from David Miller:
1) Various fixes to the new Redpine Signals wireless driver, from
Fariya Fatima.
2) L2TP PPP connect code takes PMTU from the wrong socket, fix from
Dmitry Petukhov.
3) UFO and TSO packets differ in whether they include the protocol
header in gso_size, account for that in skb_gso_transport_seglen().
From Florian Westphal.
4) If VLAN untagging fails, we double free the SKB in the bridging
output path. From Toshiaki Makita.
5) Several call sites of sk->sk_data_ready() were referencing an SKB
just added to the socket receive queue in order to calculate the
second argument via skb->len. This is dangerous because the moment
the skb is added to the receive queue it can be consumed in another
context and freed up.
It turns out also that none of the sk->sk_data_ready()
implementations even care about this second argument.
So just kill it off and thus fix all these use-after-free bugs as a
side effect.
6) Fix inverted test in tcp_v6_send_response(), from Lorenzo Colitti.
7) pktgen needs to do locking properly for LLTX devices, from Daniel
Borkmann.
8) xen-netfront driver initializes TX array entries in RX loop :-) From
Vincenzo Maffione.
9) After refactoring, some tunnel drivers allow a tunnel to be
configured on top itself. Fix from Nicolas Dichtel.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits)
vti: don't allow to add the same tunnel twice
gre: don't allow to add the same tunnel twice
drivers: net: xen-netfront: fix array initialization bug
pktgen: be friendly to LLTX devices
r8152: check RTL8152_UNPLUG
net: sun4i-emac: add promiscuous support
net/apne: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
net: ipv6: Fix oif in TCP SYN+ACK route lookup.
drivers: net: cpsw: enable interrupts after napi enable and clearing previous interrupts
drivers: net: cpsw: discard all packets received when interface is down
net: Fix use after free by removing length arg from sk_data_ready callbacks.
Drivers: net: hyperv: Address UDP checksum issues
Drivers: net: hyperv: Negotiate suitable ndis version for offload support
Drivers: net: hyperv: Allocate memory for all possible per-pecket information
bridge: Fix double free and memory leak around br_allowed_ingress
bonding: Remove debug_fs files when module init fails
i40evf: program RSS LUT correctly
i40evf: remove open-coded skb_cow_head
ixgb: remove open-coded skb_cow_head
igbvf: remove open-coded skb_cow_head
...
Because it doesn't always create, if there's an existing one it just
returns it.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
These functions are not adding or deleting an lport. They are adding a
wwn that may match with an lport that is present on the system.
Renaming ft_del_lport also means we won't have functions named
both ft_del_lport and ft_lport_del any more.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Rename struct ft_lport_acl to ft_lport_wwn. "acl" is associated with
something different in LIO terms. Really, ft_lport_wwn is the
fabric-specific wrapper for the struct se_wwn.
Rename "lacl" local variables to "ft_wwn" as well.
Rename list_heads used as list members to make it clear they're nodes, not
heads.
Rename lport_node to ft_wwn_node.
Rename ft_lport_list to ft_wwn_list
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
tcm_fc doesn't support multiple TPGs per wwn. For proof, see
ft_lport_find_tpg. Enforce this in the code.
Replace ft_lport_wwn.tpg_list with a single pointer. We can't fold ft_tpg
into ft_lport_wwn because they can have different lifetimes.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
ft_del_tpg checks tpg->tport is set before unlinking the tpg from the
tport when the tpg is being removed. Set this pointer in ft_tport_create,
or the unlinking won't happen in ft_del_tpg and tport->tpg will reference
a deleted object.
This patch sets tpg->tport in ft_tport_create, because that's what
ft_del_tpg checks, and is the only way to get back to the tport to
clear tport->tpg.
The bug was occuring when:
- lport created, tport (our per-lport, per-provider context) is
allocated.
tport->tpg = NULL
- tpg created
- a PRLI is received. ft_tport_create is called, tpg is found and
tport->tpg is set
- tpg removed. ft_tpg is freed in ft_del_tpg. Since tpg->tport was not
set, tport->tpg is not cleared and points at freed memory
- Future calls to ft_tport_create return tport via first conditional,
instead of searching for new tpg by calling ft_lport_find_tpg.
tport->tpg is still invalid, and will access freed memory.
see https://bugzilla.redhat.com/show_bug.cgi?id=1071340
Cc: stable@vger.kernel.org # 3.0+
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch addresses an issue that occurs when an ABTS is received
for an se_cmd that completes just before the sess_cmd_list is searched
in core_tmr_abort_task(). When the sess_cmd_list is searched, since
the ABTS and the FCP_CMND being aborted (that just completed) both
have the same OXID, TFO->get_task_tag(TMR) returns a value that
matches tmr->ref_task_tag (from TFO->get_task_tag(FCP_CMND)), and
the Abort Task tries to abort itself. When this occurs,
transport_wait_for_tasks() hangs forever since the TMR is waiting
for itself to finish.
This patch adds a check to core_tmr_abort_task() to make sure the
TMR does not attempt to abort itself.
Signed-off-by: Alex Leung <alex.leung@emulex.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Several spots in the kernel perform a sequence like:
skb_queue_tail(&sk->s_receive_queue, skb);
sk->sk_data_ready(sk, skb->len);
But at the moment we place the SKB onto the socket receive queue it
can be consumed and freed up. So this skb->len access is potentially
to freed up memory.
Furthermore, the skb->len can be modified by the consumer so it is
possible that the value isn't accurate.
And finally, no actual implementation of this callback actually uses
the length argument. And since nobody actually cared about it's
value, lots of call sites pass arbitrary values in such as '0' and
even '1'.
So just remove the length argument from the callback, that way there
is no confusion whatsoever and all of these use-after-free cases get
fixed as a side effect.
Based upon a patch by Eric Dumazet and his suggestion to audit this
issue tree-wide.
Signed-off-by: David S. Miller <davem@davemloft.net>