Thus spake David Howells:
The code that follows this:
remaining = total_data_size - data_in_this_rsp;
if (remaining == 0)
return 0;
else if (remaining < 0) {
generates better code if you drop the 'remaining' variable and compare
the values directly.
Clean it up per his recommendation...
Reported-and-acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Commit 522440ed made cifs set backing_dev_info on the mapping attached
to new inodes. This change caused a fairly significant read performance
regression, as cifs started doing page-sized reads exclusively.
By virtue of the fact that they're allocated as part of cifs_sb_info by
kzalloc, the ra_pages on cifs BDIs get set to 0, which prevents any
readahead. This forces the normal read codepaths to use readpage instead
of readpages causing a four-fold increase in the number of read calls
with the default rsize.
Fix it by setting ra_pages in the BDI to the same value as that in the
default_backing_dev_info.
Fixes https://bugzilla.kernel.org/show_bug.cgi?id=31662
Cc: stable@kernel.org
Reported-and-Tested-by: Till <till2.schaefer@uni-dortmund.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The BCC is still __le16 at this point, and in any case we need to
use the get_bcc_le macro to make sure we don't hit alignment
problems.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Currently, we skip doing the is_path_accessible check in cifs_mount if
there is no prefixpath. I have a report of at least one server however
that allows a TREE_CONNECT to a share that has a DFS referral at its
root. The reporter in this case was using a UNC that had no prefixpath,
so the is_path_accessible check was not triggered and the box later hit
a BUG() because we were chasing a DFS referral on the root dentry for
the mount.
This patch fixes this by removing the check for a zero-length
prefixpath. That should make the is_path_accessible check be done in
this situation and should allow the client to chase the DFS referral at
mount time instead.
Cc: stable@kernel.org
Reported-and-Tested-by: Yogesh Sharma <ysharma@cymer.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
make modules C=2 M=fs/cifs CF=-D__CHECK_ENDIAN__
Found for example:
CHECK fs/cifs/cifssmb.c
fs/cifs/cifssmb.c:728:22: warning: incorrect type in assignment (different base types)
fs/cifs/cifssmb.c:728:22: expected unsigned short [unsigned] [usertype] Tid
fs/cifs/cifssmb.c:728:22: got restricted __le16 [usertype] <noident>
fs/cifs/cifssmb.c:1883:45: warning: incorrect type in assignment (different base types)
fs/cifs/cifssmb.c:1883:45: expected long long [signed] [usertype] fl_start
fs/cifs/cifssmb.c:1883:45: got restricted __le64 [usertype] start
fs/cifs/cifssmb.c:1884:54: warning: restricted __le64 degrades to integer
fs/cifs/cifssmb.c:1885:58: warning: restricted __le64 degrades to integer
fs/cifs/cifssmb.c:1886:43: warning: incorrect type in assignment (different base types)
fs/cifs/cifssmb.c:1886:43: expected unsigned int [unsigned] fl_pid
fs/cifs/cifssmb.c:1886:43: got restricted __le32 [usertype] pid
In checking new smb2 code for missing endian conversions, I noticed
some endian errors had crept in over the last few releases into the
cifs code (symlink, ntlmssp, posix lock, and also a less problematic warning
in fscache). A followon patch will address a few smb2 endian
problems.
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Ports are __be16 not unsigned short int
Eliminates the remaining fixable endian warnings:
~/cifs-2.6$ make modules C=1 M=fs/cifs CF=-D__CHECK_ENDIAN__
CHECK fs/cifs/connect.c
fs/cifs/connect.c:2408:23: warning: incorrect type in assignment (different base types)
fs/cifs/connect.c:2408:23: expected unsigned short *sport
fs/cifs/connect.c:2408:23: got restricted __be16 *<noident>
fs/cifs/connect.c:2410:23: warning: incorrect type in assignment (different base types)
fs/cifs/connect.c:2410:23: expected unsigned short *sport
fs/cifs/connect.c:2410:23: got restricted __be16 *<noident>
fs/cifs/connect.c:2416:24: warning: incorrect type in assignment (different base types)
fs/cifs/connect.c:2416:24: expected unsigned short [unsigned] [short] <noident>
fs/cifs/connect.c:2416:24: got restricted __be16 [usertype] <noident>
fs/cifs/connect.c:2423:24: warning: incorrect type in assignment (different base types)
fs/cifs/connect.c:2423:24: expected unsigned short [unsigned] [short] <noident>
fs/cifs/connect.c:2423:24: got restricted __be16 [usertype] <noident>
fs/cifs/connect.c:2326:23: warning: incorrect type in assignment (different base types)
fs/cifs/connect.c:2326:23: expected unsigned short [unsigned] sport
fs/cifs/connect.c:2326:23: got restricted __be16 [usertype] sin6_port
fs/cifs/connect.c:2330:23: warning: incorrect type in assignment (different base types)
fs/cifs/connect.c:2330:23: expected unsigned short [unsigned] sport
fs/cifs/connect.c:2330:23: got restricted __be16 [usertype] sin_port
fs/cifs/connect.c:2394:22: warning: restricted __be16 degrades to integer
Signed-off-by: Steve French <sfrench@us.ibm.com>
We artificially limited the user name to 32 bytes, but modern servers handle
larger. Set the maximum length to a reasonable 256, and make the user name
string dynamically allocated rather than a fixed size in session structure.
Also clean up old checkpatch warning.
Signed-off-by: Steve French <sfrench@us.ibm.com>
This flag currently only affects whether we allow "zero-copy" writes
with signing enabled. Typically we map pages in the pagecache directly
into the write request. If signing is enabled however and the contents
of the page change after the signature is calculated but before the
write is sent then the signature will be wrong. Servers typically
respond to this by closing down the socket.
Still, this can provide a performance benefit so the "Experimental" flag
was overloaded to allow this. That's really not a good place for this
option however since it's not clear what that flag does.
Move that flag instead to a new module parameter that better describes
its purpose. That's also better since it can be set at module insertion
time by configuring modprobe.d.
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs_close doesn't check that the filp->private_data is non-NULL before
trying to put it. That can cause an oops in certain error conditions
that can occur on open or lookup before the private_data is set.
Reported-by: Ben Greear <greearb@candelatech.com>
CC: Stable <stable@kernel.org>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So lets kill off the old plugging along with aops->sync_page().
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
LANMAN response length was changed to 16 bytes instead of 24 bytes.
Revert it back to 24 bytes.
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
CC: stable@kernel.org
Signed-off-by: Steve French <sfrench@us.ibm.com>
The code finds, the '%' sign in an ipv6 address and copies that to a
buffer allocated on the stack. It then ignores that buffer, and passes
'pct' to simple_strtoul(), which doesn't work right because we're
comparing 'endp' against a completely different string.
Fix it by passing the correct pointer. While we're at it, this is a
good candidate for conversion to strict_strtoul as well.
Cc: stable@kernel.org
Cc: David Howells <dhowells@redhat.com>
Reported-by: Björn JACKE <bj@sernet.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Slight revision to this patch...use min_t() instead of conditional
assignment. Also, remove the FIXME comment and replace it with the
explanation that Steve gave earlier.
After receiving a packet, we currently check the header. If it's no
good, then we toss it out and continue the loop, leaving the caller
waiting on that response.
In cases where the packet has length inconsistencies, but the MID is
valid, this leads to unneeded delays. That's especially problematic now
that the client waits indefinitely for responses.
Instead, don't immediately discard the packet if checkSMB fails. Try to
find a matching mid_q_entry, mark it as having a malformed response and
issue the callback.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Follow-on patch to 7e90d705 which is already in Steve's tree...
The check for tcpStatus == CifsGood is not meaningful since it doesn't
indicate whether the NEGOTIATE request has been done. Also, clarify
why we're checking for maxBuf == 0.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
In order to determine whether an SMBEcho request can be sent
we need to know that the socket is established (server tcpStatus == CifsGood)
AND that an SMB NegotiateProtocol has been sent (server maxBuf != 0).
Without the second check we can send an Echo request during reconnection
before the server can accept it.
CC: JG <jg@cms.ac>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
ses->status is never set to CifsExiting, so these checks are
always false.
Tested-by: JG <jg@cms.ac>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Exit from parse_dacl if no memory returned from the call to kmalloc.
Signed-off-by: Stanislav Fomichev <kernel@fomichev.me>
Signed-off-by: Steve French <sfrench@us.ibm.com>
When the socket to the server is disconnected, the client more or less
immediately calls cifs_reconnect to reconnect the socket. The NegProt
and SessSetup however are not done until an actual call needs to be
made.
With the addition of the SMB echo code, it's possible that the server
will initiate a disconnect on an idle socket. The client will then
reconnect the socket but no NegotiateProtocol request is done. The
SMBEcho workqueue job will then eventually pop, and an SMBEcho will be
sent on the socket. The server will then reject it since no NegProt was
done.
The ideal fix would be to either have the socket not be reconnected
until we plan to use it, or to immediately do a NegProt when the
reconnect occurs. The code is not structured for this however. For now
we must just settle for not sending any echoes until the NegProt is
done.
Reported-by: JG <jg@cms.ac>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs_sign_smb only generates a signature if the correct Flags2 bit is
set. Make sure that it gets set correctly if we're sending an async
call.
This patch fixes:
https://bugzilla.kernel.org/show_bug.cgi?id=28142
Reported-and-Tested-by: JG <jg@cms.ac>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Updating extended statistics here can cause slab memory corruption
if a callback function frees slab memory (mid_entry).
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Variable 'i' should be unsigned long as it's used in circle with num_pages,
and bytes_read/total_written should be ssize_t according to return value.
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
length at this point is the length returned by the last kernel_recvmsg
call. total_read is the length of all of the data read so far. length
is more or less meaningless at this point, so use total_read for
everything.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The cERROR message in checkSMB when the calculated length doesn't match
the RFC1001 length is incorrect in many cases. It always says that the
RFC1001 length is bigger than the SMB, even when it's actually the
reverse.
Fix the error message to say the reverse of what it does now when the
SMB length goes beyond the end of the received data. Also, clarify the
error message when the RFC length is too big. Finally, clarify the
comments to show that the 512 byte limit on extra data at the end of
the packet is arbitrary.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Missed one change as per earlier suggestion.
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
New compiler warnings that I noticed when building a patchset based
on recent Fedora kernel:
fs/cifs/cifssmb.c: In function 'CIFSSMBSetFileSize':
fs/cifs/cifssmb.c:4813:8: warning: variable 'data_offset' set but not used
[-Wunused-but-set-variable]
fs/cifs/file.c: In function 'cifs_open':
fs/cifs/file.c:349:24: warning: variable 'pCifsInode' set but not used
[-Wunused-but-set-variable]
fs/cifs/file.c: In function 'cifs_partialpagewrite':
fs/cifs/file.c:1149:23: warning: variable 'cifs_sb' set but not used
[-Wunused-but-set-variable]
fs/cifs/file.c: In function 'cifs_iovec_write':
fs/cifs/file.c:1740:9: warning: passing argument 6 of 'CIFSSMBWrite2' from
incompatible pointer type [enabled by default]
fs/cifs/cifsproto.h:337:12: note: expected 'unsigned int *' but argument is
of type 'size_t *'
fs/cifs/readdir.c: In function 'cifs_readdir':
fs/cifs/readdir.c:767:23: warning: variable 'cifs_sb' set but not used
[-Wunused-but-set-variable]
fs/cifs/cifs_dfs_ref.c: In function 'cifs_dfs_d_automount':
fs/cifs/cifs_dfs_ref.c:342:2: warning: 'rc' may be used uninitialized in
this function [-Wuninitialized]
fs/cifs/cifs_dfs_ref.c:278:6: note: 'rc' was declared here
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Recently CIFS was changed to use the kernel crypto API for MD4 hashes,
but the Kconfig dependencies were not changed to reflect this.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reported-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Currently, we allow the pending_mid_q to grow without bound with
SIGKILL'ed processes. This could eventually be a DoS'able problem. An
unprivileged user could a process that does a long-running call and then
SIGKILL it.
If he can also intercept the NT_CANCEL calls or the replies from the
server, then the pending_mid_q could grow very large, possibly even to
2^16 entries which might leave GetNextMid in an infinite loop. Fix this
by imposing a hard limit of 32k calls per server. If we cross that
limit, set the tcpStatus to CifsNeedReconnect to force cifsd to
eventually reconnect the socket and clean out the pending_mid_q.
While we're at it, clean up the function a bit and eliminate an
unnecessary NULL pointer check.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
If we kill the process while it's sending on a socket then the
kernel_sendmsg will return -EINTR. This is normal. No need to spam the
ring buffer with this info.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
...just cleanup. There should be no behavior change.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Use the new send_nt_cancel function to send an NT_CANCEL when the
process is delivered a fatal signal. This is a "best effort" enterprise
however, so don't bother to check the return code. There's nothing we
can reasonably do if it fails anyway.
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Currently, when a request is cancelled via signal, we delete the mid
immediately. If the request was already transmitted however, the client
is still likely to receive a response. When it does, it won't recognize
it however and will pop a printk.
It's also a little dangerous to just delete the mid entry like this. We
may end up reusing that mid. If we do then we could potentially get the
response from the first request confused with the later one.
Prevent the reuse of mids by marking them as cancelled and keeping them
on the pending_mid_q list. If the reply comes in, we'll delete it from
the list then. If it never comes, then we'll delete it at reconnect
or when cifsd comes down.
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
fs/cifs/link.c: In function ‘symlink_hash’:
fs/cifs/link.c:58:3: warning: ‘rc’ may be used uninitialized in this
function [-Wuninitialized]
fs/cifs/smbencrypt.c: In function ‘mdfour’:
fs/cifs/smbencrypt.c:61:3: warning: ‘rc’ may be used uninitialized in this
function [-Wuninitialized]
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Replaced md4 hashing function local to cifs module with kernel crypto APIs.
As a result, md4 hashing function and its supporting functions in
file md4.c are not needed anymore.
Cleaned up function declarations, removed forward function declarations,
and removed a header file that is being deleted from being included.
Verified that sec=ntlm/i, sec=ntlmv2/i, and sec=ntlmssp/i work correctly.
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Use for switching on strict cache mode. In this mode the
client reads from the cache all the time it has Oplock Level II,
otherwise - read from the server. As for write - the client stores
a data in the cache in Exclusive Oplock case, otherwise - write
directly to the server.
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
If we don't have Exclusive oplock we write a data to the server.
Also set invalidate_mapping flag on the inode if we wrote something
to the server. Add cifs_iovec_write to let the client write iovec
buffers through CIFSSMBWrite2.
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Replace remaining use of md5 hash functions local to cifs module
with kernel crypto APIs.
Remove header and source file containing those local functions.
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Teach cifs about network namespaces, so mounting uses adresses/routing
visible from the container rather than from init context.
A container is a chroot on steroids that changes more than just the root
filesystem the new processes see. One thing containers can isolate is
"network namespaces", meaning each container can have its own set of
ethernet interfaces, each with its own own IP address and routing to the
outside world. And if you open a socket in _userspace_ from processes
within such a container, this works fine.
But sockets opened from within the kernel still use a single global
networking context in a lot of places, meaning the new socket's address
and routing are correct for PID 1 on the host, but are _not_ what
userspace processes in the container get to use.
So when you mount a network filesystem from within in a container, the
mount code in the CIFS driver uses the host's networking context and not
the container's networking context, so it gets the wrong address, uses
the wrong routing, and may even try to go out an interface that the
container can't even access... Bad stuff.
This patch copies the mount process's network context into the CIFS
structure that stores the rest of the server information for that mount
point, and changes the socket open code to use the saved network context
instead of the global network context. I.E. "when you attempt to use
these addresses, do so relative to THIS set of network interfaces and
routing rules, not the old global context from back before we supported
containers".
The big long HOWTO sets up a test environment on the assumption you've
never used ocntainers before. It basically says:
1) configure and build a new kernel that has container support
2) build a new root filesystem that includes the userspace container
control package (LXC)
3) package/run them under KVM (so you don't have to mess up your host
system in order to play with containers).
4) set up some containers under the KVM system
5) set up contradictory routing in the KVM system and the container so
that the host and the container see different things for the same address
6) try to mount a CIFS share from both contexts so you can both force it
to work and force it to fail.
For a long drawn out test reproduction sequence, see:
http://landley.livejournal.com/47024.htmlhttp://landley.livejournal.com/47205.htmlhttp://landley.livejournal.com/47476.html
Signed-off-by: Rob Landley <rlandley@parallels.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
In fs/cifs/cifs_dfs_ref.c::cifs_dfs_do_automount() we have this code:
...
mnt = ERR_PTR(-EINVAL);
if (IS_ERR(tlink)) {
mnt = ERR_CAST(tlink);
goto free_full_path;
}
ses = tlink_tcon(tlink)->ses;
rc = get_dfs_path(xid, ses, full_path + 1, cifs_sb->local_nls,
&num_referrals, &referrals,
cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
cifs_put_tlink(tlink);
mnt = ERR_PTR(-ENOENT);
...
The assignment of 'mnt = ERR_PTR(-EINVAL);' is completely pointless. If we
take the 'if (IS_ERR(tlink))' branch we'll set 'mnt' again and we'll also
do so if we do not take the branch. There is no way we'll ever use 'mnt'
with the assigned 'ERR_PTR(-EINVAL)' value, so we may as well just remove
the pointless assignment.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Make sure that CIFSSMBEcho can handle unaligned fields. Also fix a minor
bug that causes this warning:
fs/cifs/cifssmb.c: In function 'CIFSSMBEcho':
fs/cifs/cifssmb.c:740: warning: large integer implicitly truncated to unsigned type
...WordCount is u8, not __le16, so no need to convert it.
This patch should apply cleanly on top of the rest of the patchset to
clean up unaligned access.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Move cifsConvertToUCS to cifs_unicode.c where all of the other unicode
related functions live. Have it store mapped characters in 'temp' and
then use put_unaligned_le16 to copy it to the target buffer. Also fix
the comments to match kernel coding style.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Make sure we use get/put_unaligned routines when accessing wide
character strings.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
...and clean up function to reduce indentation.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
It's possible that when we access the ByteCount that the alignment
will be off. Most CPUs deal with that transparently, but there's
usually some performance impact. Some CPUs raise an exception on
unaligned accesses.
Fix this by accessing the byte count using the get_unaligned and
put_unaligned inlined functions. While we're at it, fix the types
of some of the variables that end up getting returns from these
functions.
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Remove fields that are completely unused, and rearrange struct
according to recommendations by "pahole".
Before:
/* size: 1112, cachelines: 18, members: 49 */
/* sum members: 1086, holes: 8, sum holes: 26 */
/* bit holes: 1, sum bit holes: 7 bits */
/* last cacheline: 24 bytes */
After:
/* size: 1072, cachelines: 17, members: 42 */
/* sum members: 1065, holes: 3, sum holes: 7 */
/* last cacheline: 48 bytes */
...savings of 40 bytes per struct on x86_64. 21 bytes by field removal,
and 19 by reorganizing to eliminate holes.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Read from the cache if we have at least Level II oplock - otherwise
read from the server. Add cifs_user_readv to let the client read into
iovec buffers.
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Invalidate inode mapping if we don't have at least Level II oplock.
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Invalidate inode mapping if we don't have at least Level II oplock in
cifs_strict_fsync. Also remove filemap_write_and_wait call from cifs_fsync
because it is previously called from vfs_fsync_range. Add file operations'
structures for strict cache mode.
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
On strict cache mode when we close the last file handle of the inode we
should set invalid_mapping flag on this inode to prevent data coherency
problem when we open it again but it has been modified on the server.
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The NT_CANCEL command looks just like the original command, except for a
few small differences. The send_nt_cancel function however currently takes
a tcon, which we don't have in SendReceive and SendReceive2.
Instead of "respinning" the entire header for an NT_CANCEL, just mangle
the existing header by replacing just the fields we need. This means we
don't need a tcon and allows us to call it from other places.
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Since we don't time out individual requests anymore, remove the code
that we used to use for setting timeouts on different requests.
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
If the server isn't responding to echoes, we don't want to leave tasks
hung waiting for it to reply. At that point, we'll want to reconnect
so that soft mounts can return an error to userspace quickly.
If the client hasn't received a reply after a specified number of echo
intervals, assume that the transport is down and attempt to reconnect
the socket.
The number of echo_intervals to wait before attempting to reconnect is
tunable via a module parameter. Setting it to 0, means that the client
will never attempt to reconnect. The default is 5.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Add a function that will send a request, and set up the mid for an
async reply.
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
In order to incorporate async requests, we need to allow for a more
general way to do things on receive, rather than just waking up a
process.
Turn the task pointer in the mid_q_entry into a callback function and a
generic data pointer. When a response comes in, or the socket is
reconnected, cifsd can call the callback function in order to wake up
the process.
The default is to just wake up the current process which should mean no
change in behavior for existing code.
Also, clean up the locking in cifs_reconnect. There doesn't seem to be
any need to hold both the srv_mutex and GlobalMid_Lock when walking the
list of mids.
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Make it use a switch statement based on the value of the midStatus. If
the resp_buf is set, then MID_RESPONSE_RECEIVED is too.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
We only want to force a reconnect to the server under very limited and
specific circumstances. Now that we have processes waiting indefinitely
for responses, we shouldn't reach this point unless a reconnect is
already in process. Thus, there's no reason to re-mark the server for
reconnect here.
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The client should not be timing out on individual SMB requests. Too much
of the state between client and server is tied to the state of the
socket. If we time out requests and issue spurious disconnects then that
comprimises data integrity.
Instead of doing this complicated dance where we try to decide how long
to wait for a response for particular requests, have the client instead
wait indefinitely for a response. Also, use a TASK_KILLABLE sleep here
so that fatal signals will break out of this waiting.
Later patches will add support for detecting dead peers and forcing
reconnects based on that.
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
If a DACL has entries for ACEs for SID Everyone and Authenticated Users,
factor in mask in respective entries during calculation of permissions
for all three, user, group, and other.
http://technet.microsoft.com/en-us/library/bb463216.aspx
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
NTLM response length was changed to 16 bytes instead of 24 bytes
that are sent in Tree Connection Request during share-level security
share mounts. Revert it back to 24 bytes.
Reported-and-Tested-by: Grzegorz Ozanski <grzegorz.ozanski@intel.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Acked-by: Suresh Jayaraman <sjayaraman@suse.de>
Cc: stable@kernel.org
Signed-off-by: Steve French <sfrench@us.ibm.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
In later patches, we're going to need to have finer-grained control
over the addition and removal of these structs from the pending_mid_q
and we'll need to be able to call the destructor while holding the
spinlock. Move the locked sections out of both routines and into
the callers. Fix up current callers of DeleteMidQEntry to call a new
routine that dequeues the entry and then destroys it.
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
It's an atomic_t and the code accesses the "counter" field in it directly
instead of using atomic_read(). It also is sometimes accessed under a
spinlock and sometimes not. Move it out of the spinlock since we don't need
belt-and-suspenders for something that's just informational.
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The cifsSesInfo pointer is only used to get at the server.
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The TCP_Server_Info is refcounted and every SMB session holds a
reference to it. Thus, smb_ses_list is always going to be empty when
cifsd is coming down. This is dead code.
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
If CIFSSMBWrite2 returns -EAGAIN, then the error should be considered
temporary. CIFS should retry the write instead of setting an error on
the mapping and returning.
For WB_SYNC_ALL, just retry the write immediately. In the WB_SYNC_NONE
case, call redirty_page_for_writeback on all of the pages that didn't
get written out and then move on.
Also, fix up the handling of a short write with a successful return
code. MS-CIFS says that 0 bytes_written means ENOSPC or EFBIG. It
doesn't mention what a short, but non-zero write means, so for now
treat it as we would an -EAGAIN return.
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
When we get oplock break notification we should set the appropriate
value of OplockLevel field in oplock break acknowledge according to
the oplock level held by the client in this time. As we only can have
level II oplock or no oplock in the case of oplock break, we should be
aware only about clientCanCacheRead field in cifsInodeInfo structure.
Also fix bug connected with wrong interpretation of OplockLevel field
during oplock break notification processing.
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: add cruid= mount option
cifs: cFYI the entire error code in map_smb_to_linux_error
Unexport do_add_mount() and make ->d_automount() return the vfsmount to be
added rather than calling do_add_mount() itself. follow_automount() will then
do the addition.
This slightly complicates things as ->d_automount() normally wants to add the
new vfsmount to an expiration list and start an expiration timer. The problem
with that is that the vfsmount will be deleted if it has a refcount of 1 and
the timer will not repeat if the expiration list is empty.
To this end, we require the vfsmount to be returned from d_automount() with a
refcount of (at least) 2. One of these refs will be dropped unconditionally.
In addition, follow_automount() must get a 3rd ref around the call to
do_add_mount() lest it eat a ref and return an error, leaving the mount we
have open to being expired as we would otherwise have only 1 ref on it.
d_automount() should also add the the vfsmount to the expiration list (by
calling mnt_set_expiry()) and start the expiration timer before returning, if
this mechanism is to be used. The vfsmount will be unlinked from the
expiration list by follow_automount() if do_add_mount() fails.
This patch also fixes the call to do_add_mount() for AFS to propagate the mount
flags from the parent vfsmount.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Make CIFS use the new d_automount() dentry operation rather than abusing
follow_link() on directories.
[NOTE: THIS IS UNTESTED!]
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Steve French <sfrench@samba.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add a dentry op (d_manage) to permit a filesystem to hold a process and make it
sleep when it tries to transit away from one of that filesystem's directories
during a pathwalk. The operation is keyed off a new dentry flag
(DCACHE_MANAGE_TRANSIT).
The filesystem is allowed to be selective about which processes it holds and
which it permits to continue on or prohibits from transiting from each flagged
directory. This will allow autofs to hold up client processes whilst letting
its userspace daemon through to maintain the directory or the stuff behind it
or mounted upon it.
The ->d_manage() dentry operation:
int (*d_manage)(struct path *path, bool mounting_here);
takes a pointer to the directory about to be transited away from and a flag
indicating whether the transit is undertaken by do_add_mount() or
do_move_mount() skipping through a pile of filesystems mounted on a mountpoint.
It should return 0 if successful and to let the process continue on its way;
-EISDIR to prohibit the caller from skipping to overmounted filesystems or
automounting, and to use this directory; or some other error code to return to
the user.
->d_manage() is called with namespace_sem writelocked if mounting_here is true
and no other locks held, so it may sleep. However, if mounting_here is true,
it may not initiate or wait for a mount or unmount upon the parameter
directory, even if the act is actually performed by userspace.
Within fs/namei.c, follow_managed() is extended to check with d_manage() first
on each managed directory, before transiting away from it or attempting to
automount upon it.
follow_down() is renamed follow_down_one() and should only be used where the
filesystem deliberately intends to avoid management steps (e.g. autofs).
A new follow_down() is added that incorporates the loop done by all other
callers of follow_down() (do_add/move_mount(), autofs and NFSD; whilst AFS, NFS
and CIFS do use it, their use is removed by converting them to use
d_automount()). The new follow_down() calls d_manage() as appropriate. It
also takes an extra parameter to indicate if it is being called from mount code
(with namespace_sem writelocked) which it passes to d_manage(). follow_down()
ignores automount points so that it can be used to mount on them.
__follow_mount_rcu() is made to abort rcu-walk mode if it hits a directory with
DCACHE_MANAGE_TRANSIT set on the basis that we're probably going to have to
sleep. It would be possible to enter d_manage() in rcu-walk mode too, and have
that determine whether to abort or not itself. That would allow the autofs
daemon to continue on in rcu-walk mode.
Note that DCACHE_MANAGE_TRANSIT on a directory should be cleared when it isn't
required as every tranist from that directory will cause d_manage() to be
invoked. It can always be set again when necessary.
==========================
WHAT THIS MEANS FOR AUTOFS
==========================
Autofs currently uses the lookup() inode op and the d_revalidate() dentry op to
trigger the automounting of indirect mounts, and both of these can be called
with i_mutex held.
autofs knows that the i_mutex will be held by the caller in lookup(), and so
can drop it before invoking the daemon - but this isn't so for d_revalidate(),
since the lock is only held on _some_ of the code paths that call it. This
means that autofs can't risk dropping i_mutex from its d_revalidate() function
before it calls the daemon.
The bug could manifest itself as, for example, a process that's trying to
validate an automount dentry that gets made to wait because that dentry is
expired and needs cleaning up:
mkdir S ffffffff8014e05a 0 32580 24956
Call Trace:
[<ffffffff885371fd>] :autofs4:autofs4_wait+0x674/0x897
[<ffffffff80127f7d>] avc_has_perm+0x46/0x58
[<ffffffff8009fdcf>] autoremove_wake_function+0x0/0x2e
[<ffffffff88537be6>] :autofs4:autofs4_expire_wait+0x41/0x6b
[<ffffffff88535cfc>] :autofs4:autofs4_revalidate+0x91/0x149
[<ffffffff80036d96>] __lookup_hash+0xa0/0x12f
[<ffffffff80057a2f>] lookup_create+0x46/0x80
[<ffffffff800e6e31>] sys_mkdirat+0x56/0xe4
versus the automount daemon which wants to remove that dentry, but can't
because the normal process is holding the i_mutex lock:
automount D ffffffff8014e05a 0 32581 1 32561
Call Trace:
[<ffffffff80063c3f>] __mutex_lock_slowpath+0x60/0x9b
[<ffffffff8000ccf1>] do_path_lookup+0x2ca/0x2f1
[<ffffffff80063c89>] .text.lock.mutex+0xf/0x14
[<ffffffff800e6d55>] do_rmdir+0x77/0xde
[<ffffffff8005d229>] tracesys+0x71/0xe0
[<ffffffff8005d28d>] tracesys+0xd5/0xe0
which means that the system is deadlocked.
This patch allows autofs to hold up normal processes whilst the daemon goes
ahead and does things to the dentry tree behind the automouter point without
risking a deadlock as almost no locks are held in d_manage() and none in
d_automount().
Signed-off-by: David Howells <dhowells@redhat.com>
Was-Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
In commit 3e4b3e1f we separated the "uid" mount option such that it
no longer determined the owner of the credential cache by default. When
we did this, we added a new option to cifs.upcall (--legacy-uid) to
try to make it so that it would behave the same was as it did before.
This ignored a rather important point -- the kernel has no way to know
what options are being passed to cifs.upcall, so it doesn't know what
uid it should use to determine whether to match an existing krb5 session.
The simplest solution is to simply add a new "cruid=" mount option that
only governs the uid owner of the credential cache for the mount.
Unfortunately, this means that the --legacy-uid option in cifs.upcall was
ill-considered and is now useless, but I don't see a better way to deal
with this.
A patch for the mount.cifs manpage will follow once this patch has been
accepted.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Reduce false inode collisions by using the CreationTime like an
i_generation field. This way, even if the server ends up reusing
a uniqueid after a delete/create cycle, we can avoid matching
the inode incorrectly.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
We call CIFSSMBUnixSetPathInfo in these functions, but we have a
filehandle since an open was just done. Switch these functions to
use CIFSSMBUnixSetFileInfo instead.
In practice, these codepaths are only used if posix opens are broken.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
...and while we're at it, reduce the number of calls into the seq_*
functions by prepending spaces to strings.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
I see no real need to leave these sorts of options under an
EXPERIMENTAL ifdef. Since you need a mount option to turn this code
on, that only blows out the testing matrix.
local_leases has been under the EXPERIMENTAL tag for some time, but
it's only the mount option that's under this label. Move it out
from under this tag.
The NTLMSSP code is also under EXPERIMENTAL, but it needs a mount
option to turn it on, and in the future any distro will reasonably
want this enabled. Go ahead and move it out from under the
EXPERIMENTAL tag.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
A number of places that deal with RFC1001/1002 negotiations have bare
"15" or "16" values. Replace them with RFC_1001_NAME_LEN and
RFC_1001_NAME_LEN_WITH_NULL.
The patch also cleans up some checkpatch warnings for code surrounding
the changes. This should apply cleanly on top of the patch to remove
Local_System_Name.
Reported-and-Reviwed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The server->sequence_number is already protected by the srv_mutex. The
GlobalMid_lock is unneeded here.
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
dcache_inode_lock can be replaced with per-inode locking. Use existing
inode->i_lock for this. This is slightly non-trivial because we sometimes
need to find the inode from the dentry, which requires d_inode to be
stabilised (either with refcount or d_lock).
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Require filesystems be aware of .d_revalidate being called in rcu-walk
mode (nd->flags & LOOKUP_RCU). For now do a simple push down, returning
-ECHILD from all implementations.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Reduce some branches and memory accesses in dcache lookup by adding dentry
flags to indicate common d_ops are set, rather than having to check them.
This saves a pointer memory access (dentry->d_op) in common path lookup
situations, and saves another pointer load and branch in cases where we
have d_op but not the particular operation.
Patched with:
git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
RCU free the struct inode. This will allow:
- Subsequent store-free path walking patch. The inode must be consulted for
permissions when walking, so an RCU inode reference is a must.
- sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
to take i_lock no longer need to take sb_inode_list_lock to walk the list in
the first place. This will simplify and optimize locking.
- Could remove some nested trylock loops in dcache code
- Could potentially simplify things a bit in VM land. Do not need to take the
page lock to follow page->mapping.
The downsides of this is the performance cost of using RCU. In a simple
creat/unlink microbenchmark, performance drops by about 10% due to inability to
reuse cache-hot slab objects. As iterations increase and RCU freeing starts
kicking over, this increases to about 20%.
In cases where inode lifetimes are longer (ie. many inodes may be allocated
during the average life span of a single inode), a lot of this cache reuse is
not applicable, so the regression caused by this patch is smaller.
The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
however this adds some complexity to list walking and store-free path walking,
so I prefer to implement this at a later date, if it is shown to be a win in
real situations. I haven't found a regression in any non-micro benchmark so I
doubt it will be a problem.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Add a new lock, dcache_inode_lock, to protect the inode's i_dentry list
from concurrent modification. d_alias is also protected by d_lock.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Change d_hash so it may be called from lock-free RCU lookups. See similar
patch for d_compare for details.
For in-tree filesystems, this is just a mechanical change.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Change d_compare so it may be called from lock-free RCU lookups. This
does put significant restrictions on what may be done from the callback,
however there don't seem to have been any problems with in-tree fses.
If some strange use case pops up that _really_ cannot cope with the
rcu-walk rules, we can just add new rcu-unaware callbacks, which would
cause name lookup to drop out of rcu-walk mode.
For in-tree filesystems, this is just a mechanical change.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Use vfat's method for dealing with negative dentries to preserve case,
rather than overwrite dentry name in d_revalidate, which is a bit ugly
and also gets in the way of doing lock-free path walking.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Make the code more general for use in posix and non-posix open.
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Delete cifs_open_inode_helper and move non-posix open related things
to cifs_nt_open function.
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
If we have a share mounted by non-standard port and try to mount another share
on the same host with standard port, we connect to the first share again -
that's wrong. This patch fixes this bug.
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Make connect logic more ip-protocol independent and move RFC1001 stuff into
a separate function. Also replace union addr in TCP_Server_Info structure
with sockaddr_storage.
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-and-Tested-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Indicate to the server a capability of NTLM2 session security (NTLM2 Key)
during ntlmssp protocol exchange in one of the bits of the flags field.
If server supports this capability, send NTLM2 key even if signing is not
required on the server.
If the server requires signing, the session keys exchanged for NTLMv2
and NTLM2 session security in auth packet of the nlmssp exchange are same.
Send the same flags in authenticate message (type 3) that client sent in
negotiate message (type 1).
Remove function setup_ntlmssp_neg_req
Make sure ntlmssp negotiate and authenticate messages are zero'ed
before they are built.
Reported-and-Tested-by: Robbert Kouprie <robbert@exx.nl>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Instead, use fatfs's method for dealing with negative dentries to
preserve case, rather than overwrite dentry name in d_revalidate, which
is a bit ugly and also gets in the way of doing lock-free path walking.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Steve French <sfrench@us.ibm.com>
As the FIXME points out correctly, now filldir() itself returns -EOVERFLOW if
it not possible to represent the inode number supplied by the filesystem in
the field provided by userspace.
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
It's possible that cifs_mount will call cifs_build_path_to_root on a
newly instantiated cifs_sb. In that case, it's likely that the
master_tlink pointer has not yet been instantiated.
Fix this by having cifs_build_path_to_root take a cifsTconInfo pointer
as well, and have the caller pass that in.
Reported-and-Tested-by: Robbert Kouprie <robbert@exx.nl>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
This function will return 0 if everything went ok. Commit 9d002df4
however added a block of code after the following check for
rc == -EREMOTE. With that change and when rc == 0, doing the
"goto mount_fail_check" here skips that code, leaving the tlink_tree
and master_tlink pointer unpopulated. That causes an oops later
in cifs_root_iget.
Reported-and-Tested-by: Robbert Kouprie <robbert@exx.nl>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
...this string is zeroed out and nothing ever changes it.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Some of the code under CONFIG_CIFS_ACL is dependent upon code under
CONFIG_CIFS_EXPERIMENTAL, but the Kconfig options don't reflect that
dependency. Move more of the ACL code out from under
CONFIG_CIFS_EXPERIMENTAL and under CONFIG_CIFS_ACL.
Also move find_readable_file out from other any sort of Kconfig
option and make it a function normally compiled in.
Reported-and-Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Currently, the attribute cache timeout for CIFS is hardcoded to 1 second. This
means that the client might have to issue a QPATHINFO/QFILEINFO call every 1
second to verify if something has changes, which seems too expensive. On the
other hand, if the timeout is hardcoded to a higher value, workloads that
expect strict cache coherency might see unexpected results.
Making attribute cache timeout as a tunable will allow us to make a tradeoff
between performance and cache metadata correctness depending on the
application/workload needs.
Add 'actimeo' tunable that can be used to tune the attribute cache timeout.
The default timeout is set to 1 second. Also, display actimeo option value in
/proc/mounts.
It appears to me that 'actimeo' and the proposed (but not yet merged)
'strictcache' option cannot coexist, so care must be taken that we reset the
other option if one of them is set.
Changes since last post:
- fix option parsing and handle possible values correcly
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The DFS referral parsing code does a memchr() call to find the '\\'
delimiter that separates the hostname in the referral UNC from the
sharename. It then uses that value to set the length of the hostname via
pointer subtraction. Instead of subtracting the start of the hostname
however, it subtracts the start of the UNC, which causes the code to
pass in a hostname length that is 2 bytes too long.
Regression introduced in commit 1a4240f4.
Reported-and-Tested-by: Robbert Kouprie <robbert@exx.nl>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: Wang Lei <wang840925@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Steve French <sfrench@us.ibm.com>
Currently, if CONFIG_CIFS_FSCACHE is set, fscache is enabled on files opened
as read-only irrespective of the 'fsc' mount option. Fix this by enabling
fscache only if 'fsc' mount option is specified explicitly.
Remove an extraneous cFYI debug message and fix a typo while at it.
Reported-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Currently, it is possible to specify 'fsc' mount option even if
CONFIG_CIFS_FSCACHE has not been set. The option is being ignored silently
while the user fscache functionality to work. Fix this by raising error when
the CONFIG option is not set.
Reported-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Add extended attribute name system.cifs_acl
Get/generate cifs/ntfs acl blob and hand over to the invoker however
it wants to parse/process it under experimental configurable option CIFS_ACL.
Do not get CIFS/NTFS ACL for xattr for attribute system.posix_acl_access
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Change the name of function mode_to_acl to mode_to_cifs_acl.
Handle return code in functions mode_to_cifs_acl and
cifs_acl_to_fattr.
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Only the callers check whether the invalid_mapping flag is set and not
cifs_invalidate_mapping().
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
linux-2.6.37-rc1: I compiled a kernel with CIFS which subsequently
failed with an error indicating it couldn't initialize crypto module
"hmacmd5". CONFIG_CRYPTO_HMAC=y fixed the problem.
This patch makes CIFS depend on CRYPTO_HMAC in kconfig.
Signed-off-by: Jody Bruchon<jody@nctritech.com>
CC: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
It's possible for initiate_cifs_search to be called on a filp that
already has private_data attached. If this happens, we'll end up
calling cifs_sb_tlink, taking an extra reference to the tlink and
attaching that to the cifsFileInfo. This leads to refcount leaks
that manifest as a "stuck" cifsd at umount time.
Fix this by only looking up the tlink for the cifsFile on the filp's
first pass through this function. When called on a filp that already
has cifsFileInfo associated with it, just use the tlink reference
that it already owns.
This patch fixes samba.org bug 7792:
https://bugzilla.samba.org/show_bug.cgi?id=7792
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Modify get/set_cifs_acl* calls to reutrn error code and percolate the
error code up to the caller.
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs_root_iget allocates full_path through
cifs_build_path_to_root, but fails to kfree it upon
cifs_get_inode_info* failure.
Make all failure exit paths traverse clean up
handling at the end of the function.
Signed-off-by: Oskar Schirmer <oskar@scara.com>
Reviewed-by: Jesper Juhl <jj@chaosbits.net>
Cc: stable@kernel.org
Signed-off-by: Steve French <sfrench@us.ibm.com>
cfile may very well be freed after the cifsFileInfo_put. Make sure we
have a valid pointer to the superblock for cifs_sb_deactive.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Commit 13cfb7334e made cifs_ioctl use the tlink attached to the
cifsFileInfo for a filp. This ignores the case of an open directory
however, which in CIFS can have a NULL private_data until a readdir
is done on it.
This patch re-adds the NULL pointer checks that were removed in commit
50ae28f01 and moves the setting of tcon and "caps" variables lower.
Long term, a better fix would be to establish a f_op->open routine for
directories that populates that field at open time, but that requires
some other changes to how readdir calls are handled.
Reported-by: Kjell Rune Skaaraas <kjella79@yahoo.no>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
All the callers already have a pointer to struct cifsInodeInfo. Use it.
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
This patch is based on Dan's original patch. His original description is
below:
Smatch complained about a couple checking for NULL after dereferencing
bugs. I'm not super familiar with the code so I did the conservative
thing and move the dereferences after the checks.
The dereferences in cifs_lock() and cifs_fsync() were added in
ba00ba64cf "cifs: make various routines use the cifsFileInfo->tcon
pointer". The dereference in find_writable_file() was added in
6508d904e6 "cifs: have find_readable/writable_file filter by fsuid".
The comments there say it's possible to trigger the NULL dereference
under stress.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Noticed while reviewing (late) the rbtree conversion patchset (which has been merged
already).
Cc: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Radix trees are ideal when you want to track a bunch of pointers and
can't embed a tracking structure within the target of those pointers.
The tradeoff is an increase in memory, particularly if the tree is
sparse.
In CIFS, we use the tlink_tree to track tcon_link structs. A tcon_link
can never be in more than one tlink_tree, so there's no impediment to
using a rb_tree here instead of a radix tree.
Convert the new multiuser mount code to use a rb_tree instead. This
should reduce the memory required to manage the tlink_tree.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
This is the second version of this patch, the only difference between
it and the first one is that this explicitly makes cifs_sb_master_tlink
a static inline.
Instead of keeping a tag on the master tlink in the tree, just keep a
pointer to the master in the superblock. That eliminates the need for
using the radix tree to look up a tagged entry.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Simplify many places when we need to set oplock level on an inode.
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Stanse found that pSMBFile in cifs_ioctl and file->f_path.dentry in
cifs_user_write are dereferenced prior their test to NULL.
The alternative is not to dereference them before the tests. The patch is
to point out the problem, you have to decide.
While at it we cache the inode in cifs_user_write to a local variable
and use all over the function.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Steve French <sfrench@samba.org>
Cc: linux-cifs@vger.kernel.org
Cc: Jeff Layton <jlayton@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The caller allocated it, the caller should free it.
The only issue so far is that we could change the flp pointer even on an
error return if the fl_change callback failed. But we can simply move
the flp assignment after the fl_change invocation, as the callers don't
care about the flp return value if the setlease call failed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We modified setlease to require the caller to allocate the new lease in
the case of creating a new lease, but forgot to fix up the filesystem
methods.
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Steve French <sfrench@samba.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: Cleanup and thus reduce smb session structure and fields used during authentication
NTLM auth and sign - Use appropriate server challenge
cifs: add kfree() on error path
NTLM auth and sign - minor error corrections and cleanup
NTLM auth and sign - Use kernel crypto apis to calculate hashes and smb signatures
NTLM auth and sign - Define crypto hash functions and create and send keys needed for key exchange
cifs: cifs_convert_address() returns zero on error
NTLM auth and sign - Allocate session key/client response dynamically
cifs: update comments - [s/GlobalSMBSesLock/cifs_file_list_lock/g]
cifs: eliminate cifsInodeInfo->write_behind_rc (try #6)
[CIFS] Fix checkpatch warnings and bump cifs version number
cifs: wait for writeback to complete in cifs_flush
cifs: convert cifsFileInfo->count to non-atomic counter
Removed following fields from smb session structure
cryptkey, ntlmv2_hash, tilen, tiblob
and ntlmssp_auth structure is allocated dynamically only if the auth mech
in NTLMSSP.
response field within a session_key structure is used to initially store the
target info (either plucked from type 2 challenge packet in case of NTLMSSP
or fabricated in case of NTLMv2 without extended security) and then to store
Message Authentication Key (mak) (session key + client response).
Server challenge or cryptkey needed during a NTLMSSP authentication
is now part of ntlmssp_auth structure which gets allocated and freed
once authenticaiton process is done.
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Need to have cryptkey or server challenge in smb connection
(struct TCP_Server_Info) for ntlm and ntlmv2 auth types for which
cryptkey (Encryption Key) is supplied just once in Negotiate Protocol
response during an smb connection setup for all the smb sessions over
that smb connection.
For ntlmssp, cryptkey or server challenge is provided for every
smb session in type 2 packet of ntlmssp negotiation, the cryptkey
provided during Negotiation Protocol response before smb connection
does not count.
Rename cryptKey to cryptkey and related changes.
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
We leak 256 bytes here on this error path.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Minor cleanup - Fix spelling mistake, make meaningful (goto) label
In function setup_ntlmv2_rsp(), do not return 0 and leak memory,
let the tiblob get freed.
For function find_domain_name(), pass already available nls table pointer
instead of loading and unloading the table again in this function.
For ntlmv2, the case sensitive password length is the length of the
response, so subtract session key length (16 bytes) from the .len.
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
This removes more dead code that was somehow missed by commit 0d99519efe
(writeback: remove unused nonblocking and congestion checks). There are
no behavior change except for the removal of two entries from one of the
ext4 tracing interface.
The nonblocking checks in ->writepages are no longer used because the
flusher now prefer to block on get_request_wait() than to skip inodes on
IO congestion. The latter will lead to more seeky IO.
The nonblocking checks in ->writepage are no longer used because it's
redundant with the WB_SYNC_NONE check.
We no long set ->nonblocking in VM page out and page migration, because
a) it's effectively redundant with WB_SYNC_NONE in current code
b) it's old semantic of "Don't get stuck on request queues" is mis-behavior:
that would skip some dirty inodes on congestion and page out others, which
is unfair in terms of LRU age.
Inspired by Christoph Hellwig. Thanks!
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: David Howells <dhowells@redhat.com>
Cc: Sage Weil <sage@newdream.net>
Cc: Steve French <sfrench@samba.org>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use kernel crypto sync hash apis insetead of cifs crypto functions.
The calls typically corrospond one to one except that insead of
key init, setkey is used.
Use crypto apis to generate smb signagtures also.
Use hmac-md5 to genereate ntlmv2 hash, ntlmv2 response, and HMAC (CR1 of
ntlmv2 auth blob.
User crypto apis to genereate signature and to verify signature.
md5 hash is used to calculate signature.
Use secondary key to calculate signature in case of ntlmssp.
For ntlmv2 within ntlmssp, during signature calculation, only 16 bytes key
(a nonce) stored within session key is used. during smb signature calculation.
For ntlm and ntlmv2 without extended security, 16 bytes key
as well as entire response (24 bytes in case of ntlm and variable length
in case of ntlmv2) is used for smb signature calculation.
For kerberos, there is no distinction between key and response.
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Mark dependency on crypto modules in Kconfig.
Defining per structures sdesc and cifs_secmech which are used to store
crypto hash functions and contexts. They are stored per smb connection
and used for all auth mechs to genereate hash values and signatures.
Allocate crypto hashing functions, security descriptiors, and respective
contexts when a smb/tcp connection is established.
Release them when a tcp/smb connection is taken down.
md5 and hmac-md5 are two crypto hashing functions that are used
throught the life of an smb/tcp connection by various functions that
calcualte signagure and ntlmv2 hash, HMAC etc.
structure ntlmssp_auth is defined as per smb connection.
ntlmssp_auth holds ciphertext which is genereated by rc4/arc4 encryption of
secondary key, a nonce using ntlmv2 session key and sent in the session key
field of the type 3 message sent by the client during ntlmssp
negotiation/exchange
A key is exchanged with the server if client indicates so in flags in
type 1 messsage and server agrees in flag in type 2 message of ntlmssp
negotiation. If both client and agree, a key sent by client in
type 3 message of ntlmssp negotiation in the session key field.
The key is a ciphertext generated off of secondary key, a nonce, using
ntlmv2 hash via rc4/arc4.
Signing works for ntlmssp in this patch. The sequence number within
the server structure needs to be zero until session is established
i.e. till type 3 packet of ntlmssp exchange of a to be very first
smb session on that smb connection is sent.
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The cifs_convert_address() returns zero on error but this caller is
testing for negative returns.
Btw. "i" is unsigned here, so it's never negative.
Reviewed-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Start calculating auth response within a session. Move/Add pertinet
data structures like session key, server challenge and ntlmv2_hash in
a session structure. We should do the calculations within a session
before copying session key and response over to server data
structures because a session setup can fail.
Only after a very first smb session succeeds, it copy/make its
session key, session key of smb connection. This key stays with
the smb connection throughout its life.
sequence_number within server is set to 0x2.
The authentication Message Authentication Key (mak) which consists
of session key followed by client response within structure session_key
is now dynamic. Every authentication type allocates the key + response
sized memory within its session structure and later either assigns or
frees it once the client response is sent and if session's session key
becomes connetion's session key.
ntlm/ntlmi authentication functions are rearranged. A function
named setup_ntlm_resp(), similar to setup_ntlmv2_resp(), replaces
function cifs_calculate_session_key().
size of CIFS_SESS_KEY_SIZE is changed to 16, to reflect the byte size
of the key it holds.
Reviewed-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
GlobalSMBSesLock is now cifs_file_list_lock. Update comments to reflect this.
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
write_behind_rc is redundant and just adds complexity to the code. What
we really want to do instead is to use mapping_set_error to reset the
flags on the mapping when we find a writeback error and can't report it
to userspace yet.
For cifs_flush and cifs_fsync, we shouldn't reset the flags since errors
returned there do get reported to userspace.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>