Commit Graph

5652 Commits

Author SHA1 Message Date
Olga Kornievskaia d826e5b827 NFSv4.x recover from pre-mature loss of openstateid
Ever since the commit 0e0cb35b41, it's possible to lose an open stateid
while retrying a CLOSE due to ERR_OLD_STATEID. Once that happens,
operations that require openstateid fail with EAGAIN which is propagated
to the application then tests like generic/446 and generic/168 fail with
"Resource temporarily unavailable".

Instead of returning this error, initiate state recovery when possible to
recover the open stateid and then try calling nfs4_select_rw_stateid()
again.

Fixes: 0e0cb35b41 ("NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:33 -05:00
Olga Kornievskaia 62a1573fcf NFSv4 fix acl retrieval over krb5i/krb5p mounts
For the krb5i and krb5p mount, it was problematic to truncate the
received ACL to the provided buffer because an integrity check
could not be preformed.

Instead, provide enough pages to accommodate the largest buffer
bounded by the largest RPC receive buffer size.

Note: I don't think it's possible for the ACL to be truncated now.
Thus NFS4_ACL_TRUNC flag and related code could be possibly
removed but since I'm unsure, I'm leaving it.

v2: needs +1 page.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:33 -05:00
Trond Myklebust c74dfe97c1 NFS: Add mount option 'softreval'
Add a mount option 'softreval' that allows attribute revalidation 'getattr'
calls to time out, and causes them to fall back to using the cached
attributes.
The use case for this option is for ensuring that we can still (slowly)
traverse paths and use cached information even when the server is down.
Once the server comes back up again, the getattr calls start succeeding,
and the caches will revalidate as usual.

The 'softreval' mount option is automatically enabled if you have
specified 'softerr'.  It can be turned off using the options
'nosoftreval', or 'hard'.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:33 -05:00
Trond Myklebust 5c965db86e NFS: Trust cached access if we've already revalidated the inode once
If we've already revalidated the inode once then don't distrust the
access cache unless the NFS_INO_INVALID_ACCESS flag is actually set.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:33 -05:00
Trond Myklebust 4daaeba938 NFS: Fix nfs_direct_write_reschedule_io()
The 'hdr->good_bytes' is defined as the number of bytes we expect to
read or write starting at offset hdr->io_start. In the case of a partial
read/write we may end up adjusting hdr->args.offset and hdr->args.count
to skip I/O for data that was already read/written, and so we must ensure
the calculation takes that into account.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:33 -05:00
Trond Myklebust 8c9cb71491 NFS: When resending after a short write, reset the reply count to zero
If we're resending a write due to a short read or write, ensure we
reset the reply count to zero.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:33 -05:00
Trond Myklebust e8194b7dd3 NFS: Improve tracing of permission calls
On exit from nfs_do_access(), record the mask representing the requested
permissions, as well as the server-supplied set of access rights for
this user.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:33 -05:00
Trond Myklebust 088f3e68d8 pNFS/flexfiles: Add tracing for layout errors
Trace layout errors for pNFS/flexfiles on read/write/commit operations.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:33 -05:00
Trond Myklebust 7bdd297ea6 NFS: Clean up generic file commit tracepoint
Clean up the generic file commit tracepoints to use a 64-bit value
for the verifier, and to display the pNFS filehandle, if it exists.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:33 -05:00
Trond Myklebust 5bb2a7cb9f NFS: Clean up generic writeback tracepoints
Clean up the generic writeback tracepoints so they do pass the
full structures as arguments. Also ensure we report the number
of bytes actually written.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:32 -05:00
Trond Myklebust 2343172d34 NFS: Clean up generic file read tracepoints
Clean up the generic file read tracepoints so they do pass the
full structures as arguments. Also ensure we report the number
of bytes actually read.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:32 -05:00
Trond Myklebust 0722dc9fea pNFS/flexfiles: Record resend attempts on I/O failure
If the attempt to do pNFS fails, then record what action we
take to recover (resend, reset to pnfs or reset to mds).

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:32 -05:00
Trond Myklebust 118b629219 NFS: Fix fix of show_nfs_errors
Casting a negative value to an unsigned long is not the same as
converting it to its absolute value.

Fixes: 96650e2eff ("NFS: Fix show_nfs_errors macros again")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:32 -05:00
Trond Myklebust 25925b00a9 NFSv4: Improve read/write/commit tracing
Ensure we always return the number of bytes read/written. Also display
the pnfs filehandle if it is in use.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:32 -05:00
Trond Myklebust 221203ce64 NFS/pnfs: Fix pnfs_generic_prepare_to_resend_writes()
Instead of making assumptions about the commit verifier contents, change
the commit code to ensure we always check that the verifier was set
by the XDR code.

Fixes: f54bcf2ece ("pnfs: Prepare for flexfiles by pulling out common code")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:32 -05:00
Trond Myklebust 2197e9b06c NFS: Fix up fsync() when the server rebooted
Don't clear the NFS_CONTEXT_RESEND_WRITES flag until after calling
nfs_commit_inode(). Otherwise, if nfs_commit_inode() returns an
error, we end up with dirty pages in the page cache, but no tag
to tell us that those pages need resending.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:32 -05:00
Trond Myklebust b8946d7bfb NFS: Revalidate the file mapping on all fatal writeback errors
If a write or commit failed, and the mapping sees a fatal error, we
need to revalidate the contents of that mapping.

Fixes: 06c9fdf3b9 ("NFS: On fatal writeback errors, we need to call nfs_inode_remove_request()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:32 -05:00
Trond Myklebust 0df68ced55 NFS: Revalidate the file size on a fatal write error
If we suffer a fatal error upon writing a file, which causes us to
need to revalidate the entire mapping, then we should also revalidate
the file size.

Fixes: d2ceb7e570 ("NFS: Don't use page_file_mapping after removing the page")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:32 -05:00
Colin Ian King e0b27d98bf NFS: Add missing null check for failed allocation
Currently the allocation of buf is not being null checked and
a null pointer dereference can occur when the memory allocation fails.
Fix this by adding a check and returning -ENOMEM.

Addresses-Coverity: ("Dereference null return")
Fixes: 6d972518b821 ("NFS: Add fs_context support.")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:31 -05:00
Geert Uytterhoeven 474c4f306e nfs: NFS_SWAP should depend on SWAP
If CONFIG_SWAP=n, it does not make much sense to offer the user the
option to enable support for swapping over NFS, as that will still fail
at run time:

    # swapon /swap
    swapon: /swap: swapon failed: Function not implemented

Fix this by adding a dependency on CONFIG_SWAP.

Fixes: a564b8f039 ("nfs: enable swap on NFS")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:31 -05:00
Murphy Zhou bd89bc67f6 fs/nfs, swapon: check holes in swapfile
swapon over NFS does not go through generic_swapfile_activate
code path when setting up extents. This makes holes in NFS
swapfiles possible which is not expected for swapon.

Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:31 -05:00
Chuck Lever 2bb50aabb6 NFS4: Report callback authentication errors
This seems to be a somewhat common issue with Kerberos NFSv4.0
set-ups.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:31 -05:00
Chuck Lever 861e1671bc NFS: Introduce trace events triggered by page writeback errors
Try to capture the reason for the writeback path tagging an error on
a page.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:31 -05:00
zhengbin 6ed2144a80 NFS: move dprintk after nfs_alloc_fattr in nfs3_proc_lookup
In nfs3_proc_lookup, if nfs_alloc_fattr fails, will only print
"NFS call lookup". This may be confusing, move dprintk after
nfs_alloc_fattr.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:31 -05:00
zhengbin 8b98a53248 NFS4: Remove unneeded semicolon
Fixes coccicheck warning:

fs/nfs/nfs4state.c:1138:2-3: Unneeded semicolon
fs/nfs/nfs4proc.c:6862:2-3: Unneeded semicolon
fs/nfs/nfs4proc.c:8629:2-3: Unneeded semicolon

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:31 -05:00
Arnd Bergmann a3167dacba nfs: encode nfsv4 timestamps as 64-bit
On 32-bit architectures, xdr_encode_nfstime4() needlessly
truncates timestamps to a 32-bit value in the range between
year 1902 and 2038.

Change it to use 'struct timespec64' to allow the entire range
of values supported by the server.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:30 -05:00
Arnd Bergmann e5189e9a51 nfs: remove timespec from xdr_encode_nfstime
For NFSv2 and NFSv3, timestamps are stored using 32-bit entities
and overflow in y2038. For historic reasons we truncate the
64-bit timestamps by converting from a timespec64 to a timespec
first.

Remove this unnecessary conversion step and do the truncation
in the final functions that take a timestamp.

This is transparent to users, but avoids one of the last uses
of 'timespec' and lets us remove it later.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:30 -05:00
Arnd Bergmann bc35b6b0cf nfs: fscache: use timespec64 in inode auxdata
nfs currently behaves differently on 32-bit and 64-bit kernels regarding
the on-disk format of nfs_fscache_inode_auxdata.

That format should really be the same on any kernel, and we should avoid
the 'timespec' type in order to remove that from the kernel later on.

Using plain 'timespec64' would not be good here, since that includes
implied padding and would possibly leak kernel stack data to the on-disk
format on 32-bit architectures.

struct __kernel_timespec would work as a replacement, but open-coding
the two struct members in nfs_fscache_inode_auxdata makes it more
obvious what's going on here, and keeps the current format for 64-bit
architectures.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:30 -05:00
Arnd Bergmann ae08483cdd nfs: use timespec64 in nfs_fattr
Push down the use of timespec64 into NFS nfs_fattr, to avoid needless
conversions, and get closer to having 64-bit time_t support on 32-bit
NFSv4 and removing some old interfaces from the kernel.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:54:30 -05:00
Scott Mayhew ce8866f091 NFS: Attach supplementary error information to fs_context.
Split out from commit "NFS: Add fs_context support."

Add wrappers nfs_errorf(), nfs_invalf(), and nfs_warnf() which log error
information to the fs_context.  Convert some printk's to use these new
wrappers instead.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
Scott Mayhew 62a55d088c NFS: Additional refactoring for fs_context conversion
Split out from commit "NFS: Add fs_context support."

This patch adds additional refactoring for the conversion of NFS to use
fs_context, namely:

 (*) Merge nfs_mount_info and nfs_clone_mount into nfs_fs_context.
     nfs_clone_mount has had several fields removed, and nfs_mount_info
     has been removed altogether.
 (*) Various functions now take an fs_context as an argument instead
     of nfs_mount_info, nfs_fs_context, etc.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
David Howells f2aedb713c NFS: Add fs_context support.
Add filesystem context support to NFS, parsing the options in advance and
attaching the information to struct nfs_fs_context.  The highlights are:

 (*) Merge nfs_mount_info and nfs_clone_mount into nfs_fs_context.  This
     structure represents NFS's superblock config.

 (*) Make use of the VFS's parsing support to split comma-separated lists

 (*) Pin the NFS protocol module in the nfs_fs_context.

 (*) Attach supplementary error information to fs_context.  This has the
     downside that these strings must be static and can't be formatted.

 (*) Remove the auxiliary file_system_type structs since the information
     necessary can be conveyed in the nfs_fs_context struct instead.

 (*) Root mounts are made by duplicating the config for the requested mount
     so as to have the same parameters.  Submounts pick up their parameters
     from the parent superblock.

[AV -- retrans is u32, not string]
[SM -- Renamed cfg to ctx in a few functions in an earlier patch]
[SM -- Moved fs_context mount option parsing to an earlier patch]
[SM -- Moved fs_context error logging to a later patch]
[SM -- Fixed printks in nfs4_try_get_tree() and nfs4_get_referral_tree()]
[SM -- Added is_remount_fc() helper]
[SM -- Deferred some refactoring to a later patch]
[SM -- Fixed referral mounts, which were broken in the original patch]
[SM -- Fixed leak of nfs_fattr when fs_context is freed]

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
Scott Mayhew e38bb238ed NFS: Convert mount option parsing to use functionality from fs_parser.h
Split out from commit "NFS: Add fs_context support."

Convert existing mount option definitions to fs_parameter_enum's and
fs_parameter_spec's.  Parse mount options using fs_parse() and
lookup_constant().

Notes:

1) Fixed a typo in the udp6 definition in nfs_xprt_protocol_tokens
from the original commit.

2) fs_parse() expects an fs_context as the first arg so that any
errors can be logged to the fs_context.  We're passing NULL for the
fs_context (this will change in commit "NFS: Add fs_context support.")
which is okay as it will cause logfc() to do a printk() instead.

3) fs_parse() expects an fs_paramter as the third arg.  We're
building an fs_parameter manually in nfs_fs_context_parse_option(),
which will go away in commit "NFS: Add fs_context support.".

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
Scott Mayhew 38465f5d1a NFS: rename nfs_fs_context pointer arg in a few functions
Split out from commit "NFS: Add fs_context support."

Rename cfg to ctx in nfs_init_server(), nfs_verify_authflavors(),
and nfs_request_mount().  No functional changes.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
David Howells e558100fda NFS: Do some tidying of the parsing code
Do some tidying of the parsing code, including:

 (*) Returning 0/error rather than true/false.

 (*) Putting the nfs_fs_context pointer first in some arg lists.

 (*) Unwrap some lines that will now fit on one line.

 (*) Provide unioned sockaddr/sockaddr_storage fields to avoid casts.

 (*) nfs_parse_devname() can paste its return values directly into the
     nfs_fs_context struct as that's where the caller puts them.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
David Howells 48be8a66cf NFS: Add a small buffer in nfs_fs_context to avoid string dup
Add a small buffer in nfs_fs_context to avoid string duplication when
parsing numbers.  Also make the parsing function wrapper place the parsed
integer directly in the appropriate nfs_fs_context struct member.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
David Howells cbd071b5da NFS: Deindent nfs_fs_context_parse_option()
Deindent nfs_fs_context_parse_option().

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
David Howells f8ee01e3e2 NFS: Split nfs_parse_mount_options()
Split nfs_parse_mount_options() to move the prologue, list-splitting and
epilogue into one function and the per-option processing into another.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
David Howells 5eb005caf5 NFS: Rename struct nfs_parsed_mount_data to struct nfs_fs_context
Rename struct nfs_parsed_mount_data to struct nfs_fs_context and rename
pointers to it to "ctx".  At some point this will be pointed to by an
fs_context struct's fs_private pointer.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
David Howells e0a626b124 NFS: Constify mount argument match tables
The mount argument match tables should never be altered so constify them.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
David Howells 9954bf92c0 NFS: Move mount parameterisation bits into their own file
Split various bits relating to mount parameterisation out from
fs/nfs/super.c into their own file to form the basis of filesystem context
handling for NFS.

No other changes are made to the code beyond removing 'static' qualifiers.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
Al Viro adf2314fe6 nfs: get rid of ->set_security()
it's always either nfs_set_sb_security() or nfs_clone_sb_security(),
the choice being controlled by mount_info->cloned != NULL.  No need
to add methods, especially when both instances live right next to
the caller and are never accessed anywhere else.

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro ba8b614806 nfs_clone_sb_security(): simplify the check for server bogosity
We used to check ->i_op for being nfs_dir_inode_operations.  With
separate inode_operations for v3 and v4 that became bogus, but
rather than going for protocol-dependent comparison we could've
just checked ->i_fop instead; _that_ is the same for all protocol
versions.

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro ab88dca311 nfs: get rid of mount_info ->fill_super()
The only possible values are nfs_fill_super and nfs_clone_super.  The
latter is used only when crossing into a submount and it is almost
identical to the former; the only differences are
	* ->s_time_gran unconditionally set to 1 (even for v2 mounts).
Regression dating back to 2012, actually.
	* ->s_blocksize/->s_blocksize_bits set to that of parent.

Rather than messing with the method, stash ->s_blocksize_bits in
mount_info in submount case and after the (now unconditional)
call of nfs_fill_super() override ->s_blocksize/->s_blocksize_bits
if that has been set.

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro 0c38f2131d nfs: don't pass nfs_subversion to ->create_server()
pick it from mount_info

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro 1bc3a2cbf2 nfs: unexport nfs_fs_mount_common()
Make it static, even.  And remove a stale extern of (long-gone)
nfs_xdev_mount_common() from internal.h, while we are at it.

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro 82eaed2bee nfs: merge xdev and remote file_system_type
they are identical now...

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro a55d3297be nfs: don't bother passing nfs_subversion to ->try_mount() and nfs_fs_mount_common()
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro 6a3f7a399e nfs: stash nfs_subversion reference into nfs_mount_info
That will allow to get rid of passing those references around in
quite a few places.  Moreover, that will allow to merge xdev and
remote file_system_type.

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro 250d69f6a4 nfs: lift setting mount_info from nfs_xdev_mount()
Do it in nfs_do_submount() instead.  As a side benefit, nfs_clone_data
doesn't need ->fh and ->fattr anymore.

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro 4e357761bd nfs4: fold nfs_do_root_mount/nfs_follow_remote_path
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro 6654f8e246 nfs: don't bother setting/restoring export_path around do_nfs_root_mount()
nothing in it will be looking at that thing anyway

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro 15a9c4eff6 nfs: fold nfs4_remote_fs_type and nfs4_remote_referral_fs_type
They are identical now.

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro 7643c12e95 nfs: lift setting mount_info from nfs4_remote{,_referral}_mount
Do that (fhandle allocation, setting struct server up) in
nfs4_referral_mount() and nfs4_try_mount() resp. and pass the
server and pointer to mount_info into nfs_do_root_mount() so that
nfs4_remote_referral_mount()/nfs_remote_mount() could be merged.

Since we are moving stuff from ->mount() instances to the points
prior to vfs_kern_mount() that would trigger those, we need to
make sure that do_nfs_root_mount() will do the corresponding
cleanup itself if it doesn't trigger those ->mount() instances.

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro d0b779d47c nfs: stash server into struct nfs_mount_info
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro 444a52960c saner calling conventions for nfs_fs_mount_common()
Allow it to take ERR_PTR() for server and return ERR_CAST() of it in
such case.  All callers used to open-code that...

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Al Viro c64cd6e34e reimplement path_mountpoint() with less magic
... and get rid of a bunch of bugs in it.  Background:
the reason for path_mountpoint() is that umount() really doesn't
want attempts to revalidate the root of what it's trying to umount.
The thing we want to avoid actually happen from complete_walk();
solution was to do something parallel to normal path_lookupat()
and it both went overboard and got the boilerplate subtly
(and not so subtly) wrong.

A better solution is to do pretty much what the normal path_lookupat()
does, but instead of complete_walk() do unlazy_walk().  All it takes
to avoid that ->d_weak_revalidate() call...  mountpoint_last() goes
away, along with everything it got wrong, and so does the magic around
LOOKUP_NO_REVAL.

Another source of bugs is that when we traverse mounts at the final
location (and we need to do that - umount . expects to get whatever's
overmounting ., if any, out of the lookup) we really ought to take
care of ->d_manage() - as it is, manual umount of autofs automount
in progress can lead to unpleasant surprises for the daemon.  Easily
solved by using handle_lookup_down() instead of follow_mount().

Tested-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-15 01:36:06 -05:00
Arnd Bergmann 6e31ded689 nfs: fscache: use timespec64 in inode auxdata
nfs currently behaves differently on 32-bit and 64-bit kernels regarding
the on-disk format of nfs_fscache_inode_auxdata.

That format should really be the same on any kernel, and we should avoid
the 'timespec' type in order to remove that from the kernel later on.

Using plain 'timespec64' would not be good here, since that includes
implied padding and would possibly leak kernel stack data to the on-disk
format on 32-bit architectures.

struct __kernel_timespec would work as a replacement, but open-coding
the two struct members in nfs_fscache_inode_auxdata makes it more
obvious what's going on here, and keeps the current format for 64-bit
architectures.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-12-18 18:07:33 +01:00
Arnd Bergmann 057f184b12 nfs: fix timstamp debug prints
Starting in v5.5, the timestamps are correctly passed down as
64-bit seconds with NFSv4 on 32-bit machines, but some debug
statements still truncate them to 'long'.

Fixes: e86d5a0287 ("NFS: Convert struct nfs_fattr to use struct timespec64")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-12-18 18:07:32 +01:00
Chuck Lever 21f86d2d63 NFS4: Trace lock reclaims
One of the most frustrating messages our sustaining team sees is
the "Lock reclaim failed!" message. Add some observability in the
client's lock reclaim logic so we can capture better data the
first time a problem occurs.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-18 11:04:32 +01:00
Chuck Lever 511ba52e4c NFS4: Trace state recovery operation
Add a trace point in the main state manager loop to observe state
recovery operation. Help track down state recovery bugs.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-18 10:58:39 +01:00
Olga Kornievskaia f751c54525 NFSv4.2 fix memory leak in nfs42_ssc_open
Static analysis with Coverity detected a memory leak

Reported-by: Colin King <colin.king@canonical.com>
Fixes: ec4b092508 ("NFS: inter ssc open")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-18 10:50:41 +01:00
Olga Kornievskaia 66588abe2d NFSv4.2 fix kfree in __nfs42_copy_file_range
This is triggering problems with static analysis with Coverity

Reported-by: Colin King <colin.king@netapp.com>
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-18 10:50:30 +01:00
YueHaibing 843aa17a35 NFS: remove duplicated include from nfs4file.c
Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-18 10:47:39 +01:00
YueHaibing 0003010424 NFSv4: Make _nfs42_proc_copy_notify() static
Fix sparse warning:

fs/nfs/nfs42proc.c:527:5: warning:
 symbol '_nfs42_proc_copy_notify' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-18 10:47:38 +01:00
Anna Schumaker 913eca1aea NFS: Fallocate should use the nfs4_fattr_bitmap
Changing a sparse file could have an effect not only on the file size,
but also on the number of blocks used by the file in the underlying
filesystem. The server's cache_consistency_bitmap doesn't update the
SPACE_USED attribute, so let's switch to the nfs4_fattr_bitmap to catch
this update whenever we do an ALLOCATE or DEALLOCATE.

This patch fixes xfstests generic/568, which tests that fallocating an
unaligned range allocates all blocks touched by that range. Without this
patch, `stat` reports 0 bytes used immediately after the fallocate.
Adding a `sleep 5` to the test also catches the update, but it's better
to do so when we know something has changed.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-18 10:47:05 +01:00
Anna Schumaker 89658c4d04 NFS: Return -ETXTBSY when attempting to write to a swapfile
My understanding is that -EBUSY refers to the underlying device, and
that -ETXTBSY is used when attempting to access a file in use by the
kernel (like a swapfile). Changing this return code helps us pass
xfstests generic/569

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-18 10:43:24 +01:00
Saurav Girepunje 0e96322b24 fs: nfs: sysfs: Remove NULL check before kfree
Remove NULL check before kfree, NULL check is taken care
on kfree.

Signed-off-by: Saurav Girepunje <saurav.girepunje@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-18 10:38:04 +01:00
YueHaibing 9c91fa36b6 NFS: remove unneeded semicolon
remove unneeded semicolon.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-18 10:37:58 +01:00
Ben Dooks d49dd11753 NFSv4: add declaration of current_stateid
The current_stateid is exported from nfs4state.c but not
declared in any of the headers. Add to nfs4_fs.h to
remove the following warning:

fs/nfs/nfs4state.c:80:20: warning: symbol 'current_stateid' was not declared. Should it be static?

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-18 10:36:45 +01:00
Trond Myklebust 5326de9e94 NFSv4.x: Drop the slot if nfs4_delegreturn_prepare waits for layoutreturn
If nfs4_delegreturn_prepare needs to wait for a layoutreturn to complete
then make sure we drop the sequence slot if we hold it.

Fixes: 1c5bd76d17 ("pNFS: Enable layoutreturn operation for return-on-close")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-13 16:37:17 +01:00
Trond Myklebust 5c441544f0 NFSv4.x: Handle bad/dead sessions correctly in nfs41_sequence_process()
If the server returns a bad or dead session error, the we don't want
to update the session slot number, but just immediately schedule
recovery and allow it to proceed.

We can/should then remove handling in other places

Fixes: 3453d5708b ("NFSv4.1: Avoid false retries when RPC calls are interrupted")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-13 16:37:17 +01:00
Trond Myklebust 807ce06c24 Merge branch 'linux-ssc-for-5.5' 2019-11-06 08:55:23 -05:00
Trond Myklebust 43622eab8d NFS: Add a tracepoint in nfs_fh_to_dentry()
Add a tracepoint in nfs_fh_to_dentry() for debugging issues with bad
userspace filehandles.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:46 -05:00
Trond Myklebust 70d136b2dc NFSv4: Don't retry the GETATTR on old stateid in nfs4_delegreturn_done()
If the server returns NFS4ERR_OLD_STATEID, then just skip retrying the
GETATTR when replaying the delegreturn compound. We know nothing will
have changed on the server.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:46 -05:00
Trond Myklebust 246afc0aa5 NFSv4: Handle NFS4ERR_OLD_STATEID in delegreturn
If the server returns NFS4ERR_OLD_STATEID in response to our delegreturn,
we want to sync to the most recent seqid for the delegation stateid. However
if we are already at the most recent, we have two possibilities:

- an OPEN reply is still outstanding and will return a new seqid
- an earlier OPEN reply was dropped on the floor due to a timeout.

In the latter case, we may end up unable to complete the delegreturn,
so we want to bump the seqid to a value greater than the cached value.
While this may cause us to lose the delegation in the former case,
it should now be safe to assume that the client will replay the OPEN
if necessary in order to get a new valid stateid.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:46 -05:00
Trond Myklebust ee05f45677 NFSv4: Fix races between open and delegreturn
If the server returns the same delegation in an open that we just used
in a delegreturn, we need to ensure we don't apply that stateid if
the delegreturn has freed it on the server.
To do so, we ensure that we do not free the storage for the delegation
until either it is replaced by a new one, or we throw the inode out of
cache.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:46 -05:00
Trond Myklebust 42c304c34e NFS: nfs_inode_find_state_and_recover() fix stateid matching
In nfs_inode_find_state_and_recover() we want to mark for recovery
only those stateids that match or are older than the supplied
stateid parameter.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:46 -05:00
Trond Myklebust 3887ce1aac NFSv4: Fix nfs4_inode_make_writeable()
Fix the checks in nfs4_inode_make_writeable() to ignore the case where
we hold no delegations. Currently, in such a case, we automatically
flush writes.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:46 -05:00
Trond Myklebust 40e6aa10aa NFSv4: nfs4_return_incompatible_delegation() should check delegation validity
Ensure that we check that the delegation is valid in
nfs4_return_incompatible_delegation() before we try to return it.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:46 -05:00
Trond Myklebust 1deed57235 NFSv4: Don't reclaim delegations that have been returned or revoked
If the delegation has already been revoked, we want to avoid reclaiming
it on reboot.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:46 -05:00
Trond Myklebust af20b7b850 NFSv4: Ignore requests to return the delegation if it was revoked
If the delegation was revoked, or is already being returned, just
clear the NFS_DELEGATION_RETURN and NFS_DELEGATION_RETURN_IF_CLOSED
flags and keep going.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:46 -05:00
Trond Myklebust d51f91d262 NFSv4: Revoke the delegation on success in nfs4_delegreturn_done()
If the delegation was successfully returned, then mark it as revoked.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:46 -05:00
Trond Myklebust f2d47b5502 NFSv4: Update the stateid seqid in nfs_revoke_delegation()
If we revoke a delegation, but the stateid's seqid is newer, then
ensure we update the seqid when marking the delegation as revoked.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:46 -05:00
Trond Myklebust ae084a32ee NFSv4: Clear the NFS_DELEGATION_REVOKED flag in nfs_update_inplace_delegation()
If the server sent us a new delegation stateid that is more recent than
the one that got revoked, then clear the NFS_DELEGATION_REVOKED flag.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:45 -05:00
Trond Myklebust e0f07896af NFSv4: Hold the delegation spinlock when updating the seqid
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:45 -05:00
Trond Myklebust f9e0cc9c97 NFSv4: Don't remove the delegation from the super_list more than once
Add a check to ensure that we haven't already removed the delegation
from the inode after we take all the relevant locks.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:45 -05:00
Trond Myklebust b47e0e478c NFS: Rename nfs_inode_return_delegation_noreclaim()
Rename nfs_inode_return_delegation_noreclaim() to
nfs_inode_evict_delegation(), which better describes what it
does.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:45 -05:00
Trond Myklebust b57562087b NFSv4: fail nfs4_refresh_delegation_stateid() when the delegation was revoked
If the delegation was revoked, we don't want to retry the delegreturn.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:45 -05:00
Trond Myklebust 457a50424b NFSv4: Delegation recalls should not find revoked delegations
If we're processsing a delegation recall, ignore the delegations that
have already been revoked or returned.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:45 -05:00
Trond Myklebust 5decae1623 NFSv4: nfs4_callback_getattr() should ignore revoked delegations
If the delegation has been revoked, ignore it.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:45 -05:00
Trond Myklebust 333ac786a1 NFSv4: Fix delegation handling in update_open_stateid()
If the delegation is marked as being revoked, then don't use it in
the open state structure.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:45 -05:00
Trond Myklebust e6237b6feb NFSv4.1: Don't rebind to the same source port when reconnecting to the server
NFSv2, v3 and NFSv4 servers often have duplicate replay caches that look
at the source port when deciding whether or not an RPC call is a replay
of a previous call. This requires clients to perform strange TCP gymnastics
in order to ensure that when they reconnect to the server, they bind
to the same source port.

NFSv4.1 and NFSv4.2 have sessions that provide proper replay semantics,
that do not look at the source port of the connection. This patch therefore
ensures they can ignore the rebind requirement.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:45 -05:00
Trond Myklebust 52f98f1a2d NFS/pnfs: Separate NFSv3 DS and MDS traffic
If a NFSv3 server is being used as both a DS and as a regular NFSv3 server,
we may want to keep the IO traffic on a separate TCP connection, since
it will typically have very different timeout characteristics.

This patch therefore sets up a flag to separate the two modes of operation
for the nfs_client.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:45 -05:00
Trond Myklebust c6eb58435b pNFS: nfs3_set_ds_client should set NFS_CS_NOPING
Connecting to the DS is a non-interactive, asynchronous task, so there is
no reason to fire up an extra RPC null ping in order to ensure that the
server is up.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:45 -05:00
Trond Myklebust 4b1b69cedf NFS: Add a flag to tell nfs_client to set RPC_CLNT_CREATE_NOPING
Add a flag to tell the nfs_client it should set RPC_CLNT_CREATE_NOPING when
creating the rpc client.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:44 -05:00
Trond Myklebust d0372b679c NFS: Use non-atomic bit ops when initialising struct nfs_client_initdata
We don't need atomic bit ops when initialising a local structure on the
stack.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:44 -05:00
Trond Myklebust 6430b323ae NFSv3: Clean up timespec encode
Simplify the struct iattr timestamp encoding by skipping the step of
an intermediate struct timespec.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:44 -05:00
Trond Myklebust c9dbfd961b NFSv2: Clean up timespec encode
Simplify the struct iattr timestamp encoding by skipping the step of
an intermediate struct timespec.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:44 -05:00
Trond Myklebust ad97a995d8 NFSv2: Fix a typo in encode_sattr()
Encode the mtime correctly.

Fixes: 95582b0083 ("vfs: change inode times to use struct timespec64")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:44 -05:00