Commit Graph

4613 Commits

Author SHA1 Message Date
Trond Myklebust ee6625a948 pNFS: Fix a reference leak in _pnfs_return_layout
IF NFS_LAYOUT_RETURN_REQUESTED is not set, then we currently exit
without freeing the list of invalidated layout segments, leading
to a reference leak.

Reported-by: Olga Kornievskaia <aglo@umich.edu>
Fixes: 24408f5282 ("pNFS: Fix bugs in _pnfs_return_layout")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-01-26 15:50:41 -05:00
Chuck Lever 406dab8450 nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED"
Lock sequence IDs are bumped in decode_lock by calling
nfs_increment_seqid(). nfs_increment_sequid() does not use the
seqid_mutating_err() function fixed in commit 059aa73482 ("Don't
increment lock sequence ID after NFS4ERR_MOVED").

Fixes: 059aa73482 ("Don't increment lock sequence ID after ...")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Xuan Qi <xuan.qi@oracle.com>
Cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-01-26 15:25:03 -05:00
Benjamin Coddington a430607b2e NFSv4.0: always send mode in SETATTR after EXCLUSIVE4
Some nfsv4.0 servers may return a mode for the verifier following an open
with EXCLUSIVE4 createmode, but this does not mean the client should skip
setting the mode in the following SETATTR.  It should only do that for
EXCLUSIVE4_1 or UNGAURDED createmode.

Fixes: 5334c5bdac ("NFS: Send attributes in OPEN request for NFS4_CREATE_EXCLUSIVE4_1")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-01-24 12:52:34 -05:00
Trond Myklebust 8ac092519a NFSv4.1: Fix a deadlock in layoutget
We cannot call nfs4_handle_exception() without first ensuring that the
slot has been freed. If not, we end up deadlocking with the process
waiting for recovery to complete, and recovery waiting for the slot
table to drain.

Fixes: 2e80dbe7ac ("NFSv4.1: Close callback races for OPEN, LAYOUTGET...")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-01-23 22:44:12 -05:00
Trond Myklebust c6180a6237 NFSv4: Fix client recovery when server reboots multiple times
If the server reboots multiple times, the client should rely on the
server to tell it that it cannot reclaim state as per section 9.6.3.4
in RFC7530 and section 8.4.2.1 in RFC5661.
Currently, the client is being to conservative, and is assuming that
if the server reboots while state recovery is in progress, then it must
ignore state that was not recovered before the reboot.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-01-13 13:31:32 -05:00
Trond Myklebust d3129ef672 NFSv4: update_changeattr should update the attribute timestamp
Otherwise, the attribute cache remains marked as being expired.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-01-12 15:51:19 -05:00
Trond Myklebust c40d52fe1c NFSv4: Don't call update_changeattr() unless the unlink is successful
If the unlink wasn't successful, then the directory has presumably not
changed.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-01-12 15:51:18 -05:00
Trond Myklebust c733c49c32 NFSv4: Don't apply change_info4 twice on rename within a directory
If a file is renamed, but stays in the same directory, we will still receive
2 change_info4 structures describing the change to that directory, but we
only want to apply it once.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-01-12 15:51:18 -05:00
Trond Myklebust 2dfc617364 NFSv4: Call update_changeattr() from _nfs4_proc_open only if a file was created
We don't want to invalidate the directory attribute and data cache unless we
know that a file was created, or the change attribute differs from the one
in our cache.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-01-12 15:51:17 -05:00
Benjamin Coddington 4b09ec4b14 nfs: Don't take a reference on fl->fl_file for LOCK operation
I have reports of a crash that look like __fput() was called twice for
a NFSv4.0 file.  It seems possible that the state manager could try to
reclaim a lock and take a reference on the fl->fl_file at the same time the
file is being released if, during the close(), a signal interrupts the wait
for outstanding IO while removing locks which then skips the removal
of that lock.

Since 83bfff23e9 ("nfs4: have do_vfs_lock take an inode pointer") has
removed the need to traverse fl->fl_file->f_inode in nfs4_lock_done(),
taking that reference is no longer necessary.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-01-12 12:51:29 -05:00
Thomas Gleixner 1f3a8e49d8 ktime: Get rid of ktime_equal()
No point in going through loops and hoops instead of just comparing the
values.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
2016-12-25 17:21:23 +01:00
Thomas Gleixner 2456e85535 ktime: Get rid of the union
ktime is a union because the initial implementation stored the time in
scalar nanoseconds on 64 bit machine and in a endianess optimized timespec
variant for 32bit machines. The Y2038 cleanup removed the timespec variant
and switched everything to scalar nanoseconds. The union remained, but
become completely pointless.

Get rid of the union and just keep ktime_t as simple typedef of type s64.

The conversion was done with coccinelle and some manual mopping up.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
2016-12-25 17:21:22 +01:00
Linus Torvalds 7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
Linus Torvalds bc1ecd626b NFS client updates for Linux 4.10
Highlights include:
 
 - Further attribute cache improvements to make revalidation more fine grained
 - NFSv4 locking improvements
 
 Bugfixes:
 - nfs4_fl_prepare_ds must be careful about reporting success in files layout
 - pNFS/flexfiles: Instead of marking a device inactive, remove it from the cache
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYWoWvAAoJEGcL54qWCgDydigQAMFTUZTjHGx34l3yqie3uV0D
 P0Ws6qfQJwN6xwgbhLMt6pbECadN7d55KszIM71gu/Iqa1UbtHZht2Slf/Tw5wY8
 vjptzLkvOie5g9sGIhhQqkgrXnG3A96aEuaknIj9JXWALJb01M94ZfC0STgUjmil
 sp100cDmmOonqflk7Tgh/jmfiUucBw9HnyNRY59sUaR0lAmHRrpJuqftmarYKRrq
 1rTCeGx6MyM2kVyqGsY7FP1prQcNcIJXwti/EXrOywXbHZxmjl+l0WLu6iUOEj/I
 zvfWCeMFyMXeVsTrRLYeM1vQ/maUbSev6gms2RshbmDITE6Ir9nHfBTCfOmh95wV
 paKnAzuEKsUR+LhEbsORk03efQYkeDQuJW3QqhdVuqGnEwhaboTiP4GRJLjEo287
 dHy6sEGEbAcieUsolXhDIaxggoF39gGihOT3+m+WgIlXRd38W2mvlQC9oaq89ehw
 Ipetm3swnkY8bTavsNYFW1x+V5zrGH9Dlu2/mpnqVAIwGT+ByolD8tYDAHf4zj3b
 cy9XhFzdLDRy4V95l4hzZFDCdjy4dZZu7cJEO5zqUB3236qhiZbGLjL232ZEHFi3
 wNHn13nADV+DzoOuB5HJwfhKyOOBQsQpBIvzAgt1n7muZwxa35nX/2waX1xt2HX5
 J+s+e02xJnMStcr/2O5v
 =8gLp
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull more NFS client updates from Trond Myklebust:
 "Highlights include:

   - further attribute cache improvements to make revalidation more fine
     grained

   - NFSv4 locking improvements

  Bugfixes:

   - nfs4_fl_prepare_ds must be careful about reporting success in files
     layout

   - pNFS/flexfiles: Instead of marking a device inactive, remove it
     from the cache"

* tag 'nfs-for-4.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4: Retry the DELEGRETURN if the embedded GETATTR is rejected with EACCES
  NFS: Retry the CLOSE if the embedded GETATTR is rejected with EACCES
  NFSv4: Place the GETATTR operation before the CLOSE
  NFSv4: Also ask for attributes when downgrading to a READ-only state
  NFS: Don't abuse NFS_INO_REVAL_FORCED in nfs_post_op_update_inode_locked()
  pNFS: Return RW layouts on OPEN_DOWNGRADE
  NFSv4: Add encode/decode of the layoutreturn op in OPEN_DOWNGRADE
  NFS: Don't disconnect open-owner on NFS4ERR_BAD_SEQID
  NFSv4: ensure __nfs4_find_lock_state returns consistent result.
  NFSv4.1: nfs4_fl_prepare_ds must be careful about reporting success.
  pNFS/flexfiles: delete deviceid, don't mark inactive
  NFS: Clean up nfs_attribute_timeout()
  NFS: Remove unused function nfs_revalidate_inode_rcu()
  NFS: Fix and clean up the access cache validity checking
  NFS: Only look at the change attribute cache state in nfs_weak_revalidate()
  NFS: Clean up cache validity checking
  NFS: Don't revalidate the file on close if we hold a delegation
  NFSv4: Don't discard the attributes returned by asynchronous DELEGRETURN
  NFSv4: Update the attribute cache info in update_changeattr
2016-12-21 10:40:30 -08:00
Trond Myklebust 8ac2b42238 NFSv4: Retry the DELEGRETURN if the embedded GETATTR is rejected with EACCES
If our DELEGRETURN RPC call is rejected with an EACCES call, then we should
remove the GETATTR call from the compound RPC and retry.
This could potentially happen when there is a conflict between an
ACL denying attribute reads and our use of SP4_MACH_CRED.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:30:03 -05:00
Trond Myklebust f07d4a31cc NFS: Retry the CLOSE if the embedded GETATTR is rejected with EACCES
If our CLOSE RPC call is rejected with an EACCES call, then we should
remove the GETATTR call from the compound RPC and retry.
This could potentially happen when there is a conflict between an
ACL denying attribute reads and our use of SP4_MACH_CRED.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:30:01 -05:00
Trond Myklebust d8d849835e NFSv4: Place the GETATTR operation before the CLOSE
In order to benefit from the DENY share lock protection, we should
put the GETATTR operation before the CLOSE. Otherwise, we might race
with a Windows machine that thinks it is now safe to modify the file.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:30:00 -05:00
Trond Myklebust 9413a1a1bf NFSv4: Also ask for attributes when downgrading to a READ-only state
If we're downgrading from a READ+WRITE mode to a READ-only mode, then
ask for cache consistency attributes so that we avoid the revalidation
in nfs_close_context()

Fixes: 3947b74d0f ("NFSv4: Don't request a GETATTR on open_downgrade.")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:58 -05:00
Trond Myklebust a5f925bccc NFS: Don't abuse NFS_INO_REVAL_FORCED in nfs_post_op_update_inode_locked()
The NFS_INO_REVAL_FORCED flag now really only has meaning for the
case when we've just been handed a delegation for a file that was already
cached, and we're unsure about that cache.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:56 -05:00
Trond Myklebust e71708d4df pNFS: Return RW layouts on OPEN_DOWNGRADE
If the client holds no more writeable open state, and does not hold a
write delegation, then send a layoutreturn as part of the OPEN_DOWNGRADE.

We do this only for writes, since some layout drivers may require you to
also hold a read layout if you are doing a R/W workload.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:55 -05:00
Trond Myklebust b6808145ad NFSv4: Add encode/decode of the layoutreturn op in OPEN_DOWNGRADE
While we do not need to return the RW layout when downgrading from a
read/write open state to read-only, we might want to do so in order
to reduce the burden on the metadataserver so that it does not need
to check for changed data when responding to GETATTR requests.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:52 -05:00
NeilBrown 86cfb04185 NFS: Don't disconnect open-owner on NFS4ERR_BAD_SEQID
When an NFS4ERR_BAD_SEQID is received the open-owner is removed from
the ->state_owners rbtree so that it will no longer be used.

If any stateids attached to this open-owner are still in use, and if a
request using one gets an NFS4ERR_BAD_STATEID reply, this can for bad.

The state is marked as needing recovery and the nfs4_state_manager()
is scheduled to clean up.  nfs4_state_manager() finds states to be
recovered by walking the state_owners rbtree.  As the open-owner is
not in the rbtree, the bad state is not found so nfs4_state_manager()
completes having done nothing.  The request is then retried, with a
predicatable result (indefinite retries).

If the stateid is for a delegation, this open_owner will be used
to open files when the delegation is returned.  For that to work,
a new open-owner needs to be presented to the server.

This patch changes NFS4ERR_BAD_SEQID handling to leave the open-owner
in the rbtree but updates the 'create_time' so it looks like a new
open-owner.  With this the indefinite retries no longer happen.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:51 -05:00
NeilBrown 3f8f25489f NFSv4: ensure __nfs4_find_lock_state returns consistent result.
If a file has both flock locks and OFD locks, then it is possible that
two different nfs4 lock states could apply to file accesses from a
single process.

It is not possible to know, efficiently, which one is "correct".
Presumably the state which represents a lock that covers the region
undergoing IO would be the "correct" one to use, but finding that has
a non-trivial cost and would provide miniscule value.

Currently we just return whichever is first in the list, which could
result in inconsistent behaviour if an application ever put it self in
this position.  As consistent behaviour is preferable (when perfectly
correct behaviour is not available), change the search to return a
consistent result in this circumstance.
Specifically: if there is both a flock and OFD lock state, always return
the flock one.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:50 -05:00
NeilBrown cfd278c280 NFSv4.1: nfs4_fl_prepare_ds must be careful about reporting success.
Various places assume that if nfs4_fl_prepare_ds() turns a non-NULL 'ds',
then ds->ds_clp will also be non-NULL.

This is not necessasrily true in the case when the process received a fatal signal
while nfs4_pnfs_ds_connect is waiting in nfs4_wait_ds_connect().
In that case ->ds_clp may not be set, and the devid may not recently have been marked
unavailable.

So add a test for ds_clp == NULL and return NULL in that case.

Fixes: c23266d532 ("NFS4.1 Fix data server connection race")
Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: Olga Kornievskaia <aglo@umich.edu>
Acked-by: Adamson, Andy <William.Adamson@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:48 -05:00
Weston Andros Adamson 1c48cee83b pNFS/flexfiles: delete deviceid, don't mark inactive
Instead of marking a device inactive, remove it from the cache entirely.

Flexfiles has a way to report errors back to the server, so we don't want
to stop devices from being tried again for 120 seconds.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:45 -05:00
Trond Myklebust 187e593d27 NFS: Clean up nfs_attribute_timeout()
It can be made static.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:44 -05:00
Trond Myklebust 3f642a1335 NFS: Remove unused function nfs_revalidate_inode_rcu()
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:41 -05:00
Trond Myklebust 21c3ba7e5d NFS: Fix and clean up the access cache validity checking
The access cache needs to check whether or not the mode bits, ownership,
or ACL has changed or the cache has timed out.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:39 -05:00
Trond Myklebust 9cdd1d3f1a NFS: Only look at the change attribute cache state in nfs_weak_revalidate()
Just like in nfs_check_verifier(), we want to use
nfs_mapping_need_revalidate_inode() to check our knowledge of the
change attribute is up to date.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:37 -05:00
Trond Myklebust 61540bf6bb NFS: Clean up cache validity checking
Consolidate the open-coded checking of NFS_I(inode)->cache_validity
into a couple of helper functions.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:35 -05:00
Trond Myklebust 58ff41842c NFS: Don't revalidate the file on close if we hold a delegation
If we're holding a delegation, we can skip sending the close-to-open
GETATTR until we're returning that delegation.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:32 -05:00
Trond Myklebust 0bc2c9b4dc NFSv4: Don't discard the attributes returned by asynchronous DELEGRETURN
DELEGRETURN will always carry a reference to the inode except when
the latter is being freed, so let's ensure that we always use that
inode information to ensure close-to-open cache consistency, even
when the DELEGRETURN call is asynchronous.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:29 -05:00
Trond Myklebust e603a4c1b5 NFSv4: Update the attribute cache info in update_changeattr
If we successfully updated the change attribute, we should timestamp the
cache. While we do know that the other attributes are not completely up
to date, we have the NFS_INO_INVALID_ATTR flag that let us know that,
so it is valid to say that the cache has not timed out.
We can also clear NFS_INO_REVAL_PAGECACHE, since our change attribute
is now known to be valid.

Conversely, if the change attribute did not match, we should make sure to
also revalidate the access and ACL caches.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:27 -05:00
Linus Torvalds 231753ef78 Merge uncontroversial parts of branch 'readlink' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull partial readlink cleanups from Miklos Szeredi.

This is the uncontroversial part of the readlink cleanup patch-set that
simplifies the default readlink handling.

Miklos and Al are still discussing the rest of the series.

* git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  vfs: make generic_readlink() static
  vfs: remove ".readlink = generic_readlink" assignments
  vfs: default to generic_readlink()
  vfs: replace calling i_op->readlink with vfs_readlink()
  proc/self: use generic_readlink
  ecryptfs: use vfs_get_link()
  bad_inode: add missing i_op initializers
2016-12-17 19:16:12 -08:00
Linus Torvalds 0110c350c8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 "In this pile:

   - autofs-namespace series
   - dedupe stuff
   - more struct path constification"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (40 commits)
  ocfs2: implement the VFS clone_range, copy_range, and dedupe_range features
  ocfs2: charge quota for reflinked blocks
  ocfs2: fix bad pointer cast
  ocfs2: always unlock when completing dio writes
  ocfs2: don't eat io errors during _dio_end_io_write
  ocfs2: budget for extent tree splits when adding refcount flag
  ocfs2: prohibit refcounted swapfiles
  ocfs2: add newlines to some error messages
  ocfs2: convert inode refcount test to a helper
  simple_write_end(): don't zero in short copy into uptodate
  exofs: don't mess with simple_write_{begin,end}
  9p: saner ->write_end() on failing copy into non-uptodate page
  fix gfs2_stuffed_write_end() on short copies
  fix ceph_write_end()
  nfs_write_end(): fix handling of short copies
  vfs: refactor clone/dedupe_file_range common functions
  fs: try to clone files first in vfs_copy_file_range
  vfs: misc struct path constification
  namespace.c: constify struct path passed to a bunch of primitives
  quota: constify struct path in quota_on
  ...
2016-12-17 18:44:00 -08:00
Linus Torvalds 73e2e0c9b1 NFS client updates for Linux 4.10
Highlights include:
 
 Stable bugfixes:
  - Fix a pnfs deadlock between read resends and layoutreturn
  - Don't invalidate the layout stateid while a layout return is outstanding
  - Don't schedule a layoutreturn if the layout stateid is marked as invalid
  - On a pNFS error, do not send LAYOUTGET until the LAYOUTRETURN is complete
  - SUNRPC: fix refcounting problems with auth_gss messages.
 
 Features:
  - Add client support for the NFSv4 umask attribute.
  - NFSv4: Correct support for flock() stateids.
  - Add a LAYOUTRETURN operation to CLOSE and DELEGRETURN when return-on-close
    is specified
  - Allow the pNFS/flexfiles layoutstat information to piggyback on LAYOUTRETURN
  - Optimise away redundant GETATTR calls when doing state recovery and/or
    when not required by cache revalidation rules or close-to-open cache
    consistency.
  - Attribute cache improvements
  - RPC/RDMA support for SG_GAP devices
 
 Bugfixes:
  - NFS: Fix performance regressions in readdir
  - pNFS/flexfiles: Fix a deadlock on LAYOUTGET
  - NFSv4: Add missing nfs_put_lock_context()
  - NFSv4.1: Fix regression in callback retry handling
  - Fix false positive NFSv4.0 trunking detection.
  - pNFS/flexfiles: Only send layoutstats updates for mirrors that were updated
  - Various layout stateid related bugfixes
  - RPC/RDMA bugfixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYUyemAAoJEGcL54qWCgDy96wP/Ry86cknfLUqLKJCbFVV4nV8
 HdovCY8if8JQO0HUPDJ25ITvoRJNVRRwJMWnVq5XHRrPUHletDks6/UYfa63UDMv
 umHvGST1cQPU1G+vBIQ3sdkVi1X1GeyBY4rU8aDWxLyKWwyeNptCK12i80ifyaGV
 GZIIxuKVDOFS15M7NwMPRkrBacF8TyVK6S7275z6ZNmhFtvYwMAbvMxLabTwWAe8
 4A03m4RDBTYhQIc2xLJbHfOTYoHi34l90wrn3C7Wv0I2zp8EJlzCY2tSbYKhfPg7
 0HVKNdruRL+cHwLwJEcjFbxOg9MArgRxyup3dwAYQq7Ivsf9oR8/D61CDhanXAzy
 cAWyrCyxaAoPWCOb8k4OFRh6jOF9LBGb5WTNpXRi1LoGrbvi6/WLlJccV60325wd
 gmSAiwIE7aLG8pFk54J0Et86VaQ6qQNBUtJY/4m87uf1FSv3yzQvh7qDr7s+t8ZQ
 kmSTZJzMWZLEEeyvEPZCfjygFu7n4PuTePJu31217styvat39TpY2p0HaaMhgC0V
 /Y0ygGH7VlGp0oaVQ70CtBzGsCWTKU2DU8di7nvsCKg6iLv89QBILIJhVeP42tKd
 juNCWVw4bpW1Zex7HXKecKfMXkDJ4qSDLFzGWj6Ue85f/rCOSKQH01jfvwBlvtBc
 3E6fk85ExTw2+siHWiGy
 =MapM
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.10-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "Highlights include:

  Stable bugfixes:
   - Fix a pnfs deadlock between read resends and layoutreturn
   - Don't invalidate the layout stateid while a layout return is
     outstanding
   - Don't schedule a layoutreturn if the layout stateid is marked as
     invalid
   - On a pNFS error, do not send LAYOUTGET until the LAYOUTRETURN is
     complete
   - SUNRPC: fix refcounting problems with auth_gss messages.

  Features:
   - Add client support for the NFSv4 umask attribute.
   - NFSv4: Correct support for flock() stateids.
   - Add a LAYOUTRETURN operation to CLOSE and DELEGRETURN when
     return-on-close is specified
   - Allow the pNFS/flexfiles layoutstat information to piggyback on
     LAYOUTRETURN
   - Optimise away redundant GETATTR calls when doing state recovery
     and/or when not required by cache revalidation rules or
     close-to-open cache consistency.
   - Attribute cache improvements
   - RPC/RDMA support for SG_GAP devices

  Bugfixes:
   - NFS: Fix performance regressions in readdir
   - pNFS/flexfiles: Fix a deadlock on LAYOUTGET
   - NFSv4: Add missing nfs_put_lock_context()
   - NFSv4.1: Fix regression in callback retry handling
   - Fix false positive NFSv4.0 trunking detection.
   - pNFS/flexfiles: Only send layoutstats updates for mirrors that were
     updated
   - Various layout stateid related bugfixes
   - RPC/RDMA bugfixes"

* tag 'nfs-for-4.10-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (82 commits)
  SUNRPC: fix refcounting problems with auth_gss messages.
  nfs: add support for the umask attribute
  pNFS/flexfiles: Ensure we have enough buffer for layoutreturn
  pNFS/flexfiles: Remove a redundant parameter in ff_layout_encode_ioerr()
  pNFS/flexfiles: Fix a deadlock on LAYOUTGET
  pNFS: Layoutreturn must free the layout after the layout-private data
  pNFS/flexfiles: Fix ff_layout_add_ds_error_locked()
  NFSv4: Add missing nfs_put_lock_context()
  pNFS: Release NFS_LAYOUT_RETURN when invalidating the layout stateid
  NFSv4.1: Don't schedule lease recovery in nfs4_schedule_session_recovery()
  NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION replies to OP_SEQUENCE
  NFS: Only look at the change attribute cache state in nfs_check_verifier
  NFS: Fix incorrect size revalidation when holding a delegation
  NFS: Fix incorrect mapping revalidation when holding a delegation
  pNFS/flexfiles: Support sending layoutstats in layoutreturn
  pNFS/flexfiles: Minor refactoring before adding iostats to layoutreturn
  NFS: Fix up read of mirror stats
  pNFS/flexfiles: Clean up layoutstats
  pNFS/flexfiles: Refactor encoding of the layoutreturn payload
  pNFS: Add a layoutreturn callback to performa layout-private setup
  ...
2016-12-15 18:47:31 -08:00
Andreas Gruenbacher dff25ddb48 nfs: add support for the umask attribute
Clients can set the umask attribute when creating files to cause the
server to apply it always except when inheriting permissions from the
parent directory.  That way, the new files will end up with the same
permissions as files created locally.

See https://tools.ietf.org/html/draft-ietf-nfsv4-umask-02 for more details.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-09 23:47:10 -05:00
Al Viro c0cf3ef5e0 nfs_write_end(): fix handling of short copies
What matters when deciding if we should make a page uptodate is
not how much we _wanted_ to copy, but how much we actually have
copied.  As it is, on architectures that do not zero tail on
short copy we can leave uninitialized data in page marked uptodate.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-09 22:41:47 -05:00
Trond Myklebust d9152114f7 pNFS/flexfiles: Ensure we have enough buffer for layoutreturn
The flexfiles client can piggyback both layout errors and layoutstats
as part of the layoutreturn. Both these payloads can get large, with
20 layout error entries taking up about 1.2K, and 4 layoutstats entries
taking up another 1K.
This patch allows a maximum payload of 4k by allocating a full page.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-09 20:26:59 -05:00
Trond Myklebust 5ba6a09e92 pNFS/flexfiles: Remove a redundant parameter in ff_layout_encode_ioerr()
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-09 20:26:58 -05:00
Miklos Szeredi dfeef68862 vfs: remove ".readlink = generic_readlink" assignments
If .readlink == NULL implies generic_readlink().

Generated by:

to_del="\.readlink.*=.*generic_readlink"
for i in `git grep -l $to_del`; do sed -i "/$to_del"/d $i; done

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-09 16:45:04 +01:00
Fred Isaman 65990d1afb pNFS/flexfiles: Fix a deadlock on LAYOUTGET
We encountered a deadlock where the SEQUENCE that accompanied the
LAYOUTGET triggered a session drain, while ff_layout_alloc_lseg
triggered a GETDEVICEINFO.  The GETDEVICEINFO hung waiting for the
session drain, while the LAYOUTGET held the slot waiting for
alloc_lseg to finish.
  Avoid this by moving the call to nfs4_find_get_deviceid out of
ff_layout_alloc_lseg and into nfs4_ff_layout_prepare_ds.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
[dros@primarydata.com: pNFS/flexfiles: fix races in ff_layout_mirror_valid]
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-08 21:49:57 -05:00
Trond Myklebust 2f065ddb64 pNFS: Layoutreturn must free the layout after the layout-private data
The layout-private data may depend on the layout and/or the inode
still existing when it does post-processing and frees its data, so we
need to free them after calling lrp->ld_private.ops->free().

This fixes a mirror list corruption issue in the flexfiles driver.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-07 13:41:59 -05:00
Trond Myklebust cb06793517 pNFS/flexfiles: Fix ff_layout_add_ds_error_locked()
When we're merging an old entry into our new entry, we want to ensure that
we add the list entry in the correct place.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-07 13:41:58 -05:00
NeilBrown 7a0566b38c NFSv4: Add missing nfs_put_lock_context()
Otherwise the lock context won't be freed when we're done with it.

From: NeilBrown <neilb@suse.com>
Fixes: 5bd3f817 ("NFSv4: change nfs4_select_rw_stateid to take a lock_context inplace of lock_owner")
Signed-off-by: Anna Schumaker <Anna.Schumaker@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-07 13:41:58 -05:00
Trond Myklebust 362fb578a5 pNFS: Release NFS_LAYOUT_RETURN when invalidating the layout stateid
Ensure we release the NFS_LAYOUT_RETURN lock when we invalidate the
layout stateid, so that processes and RPC tasks that are waiting on
the layout return can continue.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-05 22:52:01 -05:00
Trond Myklebust d94cbf6c73 NFSv4.1: Don't schedule lease recovery in nfs4_schedule_session_recovery()
If the session has an error, then we want to start by recovering the
session, as any SEQUENCE we send is going to fail with a session
error.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-04 19:34:38 -05:00
Trond Myklebust 2cf10cdd48 NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION replies to OP_SEQUENCE
In the case where SEQUENCE receives a NFS4ERR_BADSESSION or
NFS4ERR_DEADSESSION error, we just want to report the session as needing
recovery, and then we want to retry the operation.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-04 19:26:40 -05:00
Trond Myklebust 1cd9cb05f9 NFS: Only look at the change attribute cache state in nfs_check_verifier
When looking at whether or not our dcache is valid, we really don't care
about the general state of the directory attribute cache. Instead, we
we only care about the state of the change attribute.

This fixes a performance issue when the client is responsible for
changing the directory contents; a number of NFSv4 operations will
atomically update the directory change attribute, but may not return
all the other attributes.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-04 18:34:34 -05:00
Trond Myklebust 9310b224f2 NFS: Fix incorrect size revalidation when holding a delegation
We should only care about checking the attributes if the page cache
is marked as dubious (using NFS_INO_REVAL_PAGECACHE) and the
NFS_INO_REVAL_FORCED flag is set.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-04 18:08:40 -05:00