Commit Graph

99 Commits

Author SHA1 Message Date
Scott Mayhew c156618e15 nfs: fix a deadlock in nfs client initialization
The following deadlock can occur between a process waiting for a client
to initialize in while walking the client list during nfsv4 server trunking
detection and another process waiting for the nfs_clid_init_mutex so it
can initialize that client:

Process 1                               Process 2
---------                               ---------
spin_lock(&nn->nfs_client_lock);
list_add_tail(&CLIENTA->cl_share_link,
        &nn->nfs_client_list);
spin_unlock(&nn->nfs_client_lock);
                                        spin_lock(&nn->nfs_client_lock);
                                        list_add_tail(&CLIENTB->cl_share_link,
                                                &nn->nfs_client_list);
                                        spin_unlock(&nn->nfs_client_lock);
                                        mutex_lock(&nfs_clid_init_mutex);
                                        nfs41_walk_client_list(clp, result, cred);
                                        nfs_wait_client_init_complete(CLIENTA);
(waiting for nfs_clid_init_mutex)

Make sure nfs_match_client() only evaluates clients that have completed
initialization in order to prevent that deadlock.

This patch also fixes v4.0 trunking behavior by not marking the client
NFS_CS_READY until the clientid has been confirmed.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-12-15 14:31:49 -05:00
Thomas Meyer 6089dd0d73 NFS: Fix bool initialization/comparison
Bool initializations should use true and false. Bool tests don't need
comparisons.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-11-17 16:43:43 -05:00
Elena Reshetova 212bf41d88 fs, nfs: convert nfs_client.cl_count from atomic_t to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable nfs_client.cl_count is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-11-17 13:48:01 -05:00
Trond Myklebust d9cb73300a NFSv4: Fix double frees in nfs4_test_session_trunk()
rpc_clnt_add_xprt() expects the callback function to be synchronous, and
expects to release the transport and switch references itself.

Fixes: 04fa2c6bb5 ("NFS pnfs data server multipath session trunking")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-08-02 09:45:55 -04:00
Chuck Lever 8dcbec6d20 NFSv4.1: Handle EXCHGID4_FLAG_CONFIRMED_R during NFSv4.1 migration
Transparent State Migration copies a client's lease state from the
server where a filesystem used to reside to the server where it now
resides. When an NFSv4.1 client first contacts that destination
server, it uses EXCHANGE_ID to detect trunking relationships.

The lease that was copied there is returned to that client, but the
destination server sets EXCHGID4_FLAG_CONFIRMED_R when replying to
the client. This is because the lease was confirmed on the source
server (before it was copied).

Normally, when CONFIRMED_R is set, a client purges the lease and
creates a new one. However, that throws away the entire benefit of
Transparent State Migration.

Therefore, the client must not purge that lease when it is possible
that Transparent State Migration has occurred.

Reported-by: Xuan Qi <xuan.qi@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Xuan Qi <xuan.qi@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 16:00:12 -04:00
Trond Myklebust b49c15f97c NFSv4.0: Fix a lock leak in nfs40_walk_client_list
Xiaolong Ye's kernel test robot detected the following Oops:
[  299.158991] BUG: scheduling while atomic: mount.nfs/9387/0x00000002
[  299.169587] 2 locks held by mount.nfs/9387:
[  299.176165]  #0:  (nfs_clid_init_mutex){......}, at: [<ffffffff8130cc92>] nfs4_discover_server_trunking+0x47/0x1fc
[  299.201802]  #1:  (&(&nn->nfs_client_lock)->rlock){......}, at: [<ffffffff813125fa>] nfs40_walk_client_list+0x2e9/0x338
[  299.221979] CPU: 0 PID: 9387 Comm: mount.nfs Not tainted 4.11.0-rc7-00021-g14d1bbb #45
[  299.235584] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[  299.251176] Call Trace:
[  299.255192]  dump_stack+0x61/0x7e
[  299.260416]  __schedule_bug+0x65/0x74
[  299.266208]  __schedule+0x5d/0x87c
[  299.271883]  schedule+0x89/0x9a
[  299.276937]  schedule_timeout+0x232/0x289
[  299.283223]  ? detach_if_pending+0x10b/0x10b
[  299.289935]  schedule_timeout_uninterruptible+0x2a/0x2c
[  299.298266]  ? put_rpccred+0x3e/0x115
[  299.304327]  ? schedule_timeout_uninterruptible+0x2a/0x2c
[  299.312851]  msleep+0x1e/0x22
[  299.317612]  nfs4_discover_server_trunking+0x102/0x1fc
[  299.325644]  nfs4_init_client+0x13f/0x194

It looks as if we recently added a spin_lock() leak to
nfs40_walk_client_list() when cleaning up the code.

Reported-by: kernel test robot <xiaolong.ye@intel.com>
Fixes: 14d1bbb0ca ("NFS: Create a common nfs4_match_client() function")
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-05-24 08:05:16 -04:00
Anna Schumaker 4fe6b366d9 NFS: Remove extra dprintk()s from nfs4client.c
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:35 -04:00
Anna Schumaker 1073d9b49a NFS: Clean up nfs4_init_server()
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker 2dc42c0d60 NFS: Clean up nfs4_set_client()
If we cut out the dprintk()s, then we can return error codes directly
and cut out the goto.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker 8da0f93438 NFS: Clean up nfs4_check_server_scope()
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker ddfa0d4860 NFS: Clean up nfs4_check_serverowner_major_id()
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker 14d1bbb0ca NFS: Create a common nfs4_match_client() function
This puts all the common code in a single place for the
walk_client_list() functions.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker 5b6d3ff605 NFS: Clean up nfs4_check_serverowner_minor_id()
Once again, we can remove the function and compare integer values
directly.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker f251fd9e71 NFS: Clean up nfs4_match_clientids()
If we cut out the dprintk()s, then we don't even need this to be a
separate function.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Olga Kornievskaia 033853325f NFSv4.1 respect server's max size in CREATE_SESSION
Currently client doesn't respect max sizes server returns in CREATE_SESSION.
nfs4_session_set_rwsize() gets called and server->rsize, server->wsize are 0
so they never get set to the sizes returned by the server.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-03-17 16:07:03 -04:00
Anna Schumaker 7d38de3ffa NFS: Remove unused authflavour parameter from nfs_get_client()
This parameter hasn't been used since f8407299 (Linux 3.11-rc2), so
let's remove it from this function and callers.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-01 17:46:32 -05:00
J. Bruce Fields ced85a7568 nfs: fix false positives in nfs40_walk_client_list()
It's possible that two different servers can return the same (clientid,
verifier) pair purely by coincidence.  Both are 64-bit values, but
depending on the server implementation, they can be highly predictable
and collisions may be quite likely, especially when there are lots of
servers.

So, check for this case.  If the clientid and verifier both match, then
we actually know they *can't* be the same server, since a new
SETCLIENTID to an already-known server should have changed the verifier.

This helps fix a bug that could cause the client to mount a filesystem
from the wrong server.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Yongcheng Yang <yoyang@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-01 17:43:31 -05:00
Jeff Layton a1d617d8f1 nfs: allow blocking locks to be awoken by lock callbacks
Add a waitqueue head to the client structure. Have clients set a wait
on that queue prior to requesting a lock from the server. If the lock
is blocked, then we can use that to wait for wakeups.

Note that we do need to do this "manually" since we need to set the
wait on the waitqueue prior to requesting the lock, but requesting a
lock can involve activities that can block.

However, only do that for NFSv4.1 locks, either by compiling out
all of the waitqueue handling when CONFIG_NFS_V4_1 is disabled, or
skipping all of it at runtime if we're dealing with v4.0, or v4.1
servers that don't send lock callbacks.

Note too that even when we expect to get a lock callback, RFC5661
section 20.11.4 is pretty clear that we still need to poll for them,
so we do still sleep on a timeout. We do however always poll at the
longest interval in that case.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
[Anna: nfs4_retry_setlk() "status" should default to -ERESTARTSYS]
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-22 15:54:27 -04:00
Andy Adamson ad0849a7ef NFS test session trunking with exchange id
Use an async exchange id call to test for session trunking

To conform with RFC 5661 section 18.35.4, the Non-Update on
Existing Clientid case, save the exchange id verifier in
cl_confirm and use it for the session trunking exhange id test.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19 13:08:36 -04:00
Andy Adamson ba84db96aa NFS detect session trunking
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19 13:08:36 -04:00
Andy Adamson e7b7cbf662 NFS refactor nfs4_check_serverowner_major_id
For session trunking, to compare nfs41_exchange_id_res with
existing nfs_client

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19 13:08:36 -04:00
Andy Adamson 8e548edb40 NFS refactor nfs4_match_clientids
For session trunking, to compare nfs41_exchange_id_res with
exiting nfs_client.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19 13:08:36 -04:00
Benjamin Coddington 52442f9b11 NFS4: Avoid migration loops
If a server returns itself as a location while migrating, the client may
end up getting stuck attempting to migrate twice to the same server.  Catch
this by checking if the nfs_client found is the same as the existing
client.  For the other two callers to nfs4_set_client, the nfs_client will
always be ERR_PTR(-EINVAL).

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-08-30 09:26:32 -04:00
Tigran Mkrtchyan 3fc75f1208 nfs4: clnt: respect noresvport when establishing connections to DSes
result:

$ mount -o vers=4.1 dcache-lab007:/ /pnfs
$ cp /etc/profile /pnfs
tcp        0      0 131.169.185.68:1005     131.169.191.141:32049   ESTABLISHED
tcp        0      0 131.169.185.68:751      131.169.191.144:2049    ESTABLISHED
$

$ mount -o vers=4.1,noresvport dcache-lab007:/ /pnfs
$ cp /etc/profile /pnfs
tcp        0      0 131.169.185.68:34894    131.169.191.141:32049   ESTABLISHED
tcp        0      0 131.169.185.68:35722    131.169.191.144:2049    ESTABLISHED
$

Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-19 16:23:25 -04:00
Trond Myklebust 5c6e5b60aa NFS: Fix an Oops in the pNFS files and flexfiles connection setup to the DS
Chris Worley reports:
 RIP: 0010:[<ffffffffa0245f80>]  [<ffffffffa0245f80>] rpc_new_client+0x2a0/0x2e0 [sunrpc]
 RSP: 0018:ffff880158f6f548  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff880234f8bc00 RCX: 000000000000ea60
 RDX: 0000000000074cc0 RSI: 000000000000ea60 RDI: ffff880234f8bcf0
 RBP: ffff880158f6f588 R08: 000000000001ac80 R09: ffff880237003300
 R10: ffff880201171000 R11: ffffea0000d75200 R12: ffffffffa03afc60
 R13: ffff880230c18800 R14: 0000000000000000 R15: ffff880158f6f680
 FS:  00007f0e32673740(0000) GS:ffff88023fc40000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000008 CR3: 0000000234886000 CR4: 00000000001406e0
 Stack:
  ffffffffa047a680 0000000000000000 ffff880158f6f598 ffff880158f6f680
  ffff880158f6f680 ffff880234d11d00 ffff88023357f800 ffff880158f6f7d0
  ffff880158f6f5b8 ffffffffa024660a ffff880158f6f5b8 ffffffffa02492ec
 Call Trace:
  [<ffffffffa024660a>] rpc_create_xprt+0x1a/0xb0 [sunrpc]
  [<ffffffffa02492ec>] ? xprt_create_transport+0x13c/0x240 [sunrpc]
  [<ffffffffa0246766>] rpc_create+0xc6/0x1a0 [sunrpc]
  [<ffffffffa038e695>] nfs_create_rpc_client+0xf5/0x140 [nfs]
  [<ffffffffa038f31a>] nfs_init_client+0x3a/0xd0 [nfs]
  [<ffffffffa038f22f>] nfs_get_client+0x25f/0x310 [nfs]
  [<ffffffffa025cef8>] ? rpc_ntop+0xe8/0x100 [sunrpc]
  [<ffffffffa047512c>] nfs3_set_ds_client+0xcc/0x100 [nfsv3]
  [<ffffffffa041fa10>] nfs4_pnfs_ds_connect+0x120/0x400 [nfsv4]
  [<ffffffffa03d41c7>] nfs4_ff_layout_prepare_ds+0xe7/0x330 [nfs_layout_flexfiles]
  [<ffffffffa03d1b1b>] ff_layout_pg_init_write+0xcb/0x280 [nfs_layout_flexfiles]
  [<ffffffffa03a14dc>] __nfs_pageio_add_request+0x12c/0x490 [nfs]
  [<ffffffffa03a1fa2>] nfs_pageio_add_request+0xc2/0x2a0 [nfs]
  [<ffffffffa03a0365>] ? nfs_pageio_init+0x75/0x120 [nfs]
  [<ffffffffa03a5b50>] nfs_do_writepage+0x120/0x270 [nfs]
  [<ffffffffa03a5d31>] nfs_writepage_locked+0x61/0xc0 [nfs]
  [<ffffffff813d4115>] ? __percpu_counter_add+0x55/0x70
  [<ffffffffa03a6a9f>] nfs_wb_single_page+0xef/0x1c0 [nfs]
  [<ffffffff811ca4a3>] ? __dec_zone_page_state+0x33/0x40
  [<ffffffffa0395b21>] nfs_launder_page+0x41/0x90 [nfs]
  [<ffffffff811baba0>] invalidate_inode_pages2_range+0x340/0x3a0
  [<ffffffff811bac17>] invalidate_inode_pages2+0x17/0x20
  [<ffffffffa039960e>] nfs_release+0x9e/0xb0 [nfs]
  [<ffffffffa0399570>] ? nfs_open+0x60/0x60 [nfs]
  [<ffffffffa0394dad>] nfs_file_release+0x3d/0x60 [nfs]
  [<ffffffff81226e6c>] __fput+0xdc/0x1e0
  [<ffffffff81226fbe>] ____fput+0xe/0x10
  [<ffffffff810bf2e4>] task_work_run+0xc4/0xe0
  [<ffffffff810a4188>] do_exit+0x2e8/0xb30
  [<ffffffff8102471c>] ? do_audit_syscall_entry+0x6c/0x70
  [<ffffffff811464e6>] ? __audit_syscall_exit+0x1e6/0x280
  [<ffffffff810a4a5f>] do_group_exit+0x3f/0xa0
  [<ffffffff810a4ad4>] SyS_exit_group+0x14/0x20
  [<ffffffff8179b76e>] system_call_fastpath+0x12/0x71

Which seems to be due to a call to utsname() when in a task exit context
in order to determine the hostname to set in rpc_new_client().

In reality, what we want here is not the hostname of the current task, but
the hostname that was used to set up the metadata server.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-06-30 15:29:56 -04:00
Benjamin Coddington c68a027c05 nfs4: start callback_ident at idr 1
If clp->cl_cb_ident is zero, then nfs_cb_idr_remove_locked() skips removing
it when the nfs_client is freed.  A decoding or server bug can then find
and try to put that first nfs_client which would lead to a crash.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: d687031265 ("nfs4client: convert to idr_alloc()")
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-23 21:59:42 -05:00
Anna Schumaker d8efa4e625 NFS: Use RPC functions for matching sockaddrs
They already exist and do the exact same thing.  Let's save ourselves
several lines of code!

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-17 13:29:51 -05:00
Kinglong Mee bc4da2a2be nfs: Drop bad comment in nfs41_walk_client_list()
Commit 7b1f1fd184 "NFSv4/4.1: Fix bugs in nfs4[01]_walk_client_list"
have change the logical of the list_for_each_entry().

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-07-01 11:30:59 -04:00
Linus Torvalds 59953fba87 NFS client updates for Linux 4.1
Highlights include:
 
 Stable patches:
 - Fix a regression in /proc/self/mountstats
 - Fix the pNFS flexfiles O_DIRECT support
 - Fix high load average due to callback thread sleeping
 
 Bugfixes:
 - Various patches to fix the pNFS layoutcommit support
 - Do not cache pNFS deviceids unless server notifications are enabled
 - Fix a SUNRPC transport reconnection regression
 - make debugfs file creation failure non-fatal in SUNRPC
 - Another fix for circular directory warnings on NFSv4 "junctioned" mountpoints
 - Fix locking around NFSv4.2 fallocate() support
 - Truncating NFSv4 file opens should also sync O_DIRECT writes
 - Prevent infinite loop in rpcrdma_ep_create()
 
 Features:
 - Various improvements to the RDMA transport code's handling of memory
   registration
 - Various code cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVOmT6AAoJEGcL54qWCgDyrhYQAMPKXB55jrdOR/7UVSF/xPML
 7OjMGHvBnTn/y0pNIyLyS1PjTZZsD/WZjoW9EFGpTv727qQNVoFxFRLNUcgi3NoL
 1YledCkLf7Q32aqod93SRRFPc9hzBoKhOZpOzBuWaAviyAB3KLi70DWAq9qRReYM
 prXUQQjpW5FLU+B2ifaVc2RCnu/rZ2c02YdR2XdtkBaAJxuhB2vR8IY1evwjCv3R
 5zyLDd9zSDDoArdpUzM97cxZPcYRSqbOwgTKvaaRnDDq/mKbKMZaqmEfjblwzNFt
 b43FbveJzZ3hlPADIpmaiMHjRTbxWjIKc9K1sOF2FPfcuPe2yM3DMAxDegUkEveS
 7fkbv/qRZ30NqfchGanX/pmBlLOcdI76qe/bwhN19wCnw48O1eeHi1HK8rWGhU+E
 qcrRZ3ZS2ufP/MVBuhauy0qU9Q4wcEtm7NGGP1231ZtmfjHKyBa4pLirNfG1AlJt
 dK7tBrknVx+WVm/UddJp/fEsxbP0+fki6TwzioHUSWcz8rDVYF6PFT/QPM54SX2h
 0oqwvu6d/uShpkVRm+fbje8FHmUxKdgqDsCYX2fNjWskh1oXSPsItvjqmTmTlE0i
 EBmBwVwI0uB1ZQ3PrJLadhRcO3ZJmLQ5gNj456dstvWy6UQds1xyIQ/DgvmlzxWO
 E9t0l18xHGRwbndsDa8f
 =j5dP
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "Another set of mainly bugfixes and a couple of cleanups.  No new
  functionality in this round.

  Highlights include:

  Stable patches:
   - Fix a regression in /proc/self/mountstats
   - Fix the pNFS flexfiles O_DIRECT support
   - Fix high load average due to callback thread sleeping

  Bugfixes:
   - Various patches to fix the pNFS layoutcommit support
   - Do not cache pNFS deviceids unless server notifications are enabled
   - Fix a SUNRPC transport reconnection regression
   - make debugfs file creation failure non-fatal in SUNRPC
   - Another fix for circular directory warnings on NFSv4 "junctioned"
     mountpoints
   - Fix locking around NFSv4.2 fallocate() support
   - Truncating NFSv4 file opens should also sync O_DIRECT writes
   - Prevent infinite loop in rpcrdma_ep_create()

  Features:
   - Various improvements to the RDMA transport code's handling of
     memory registration
   - Various code cleanups"

* tag 'nfs-for-4.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (55 commits)
  fs/nfs: fix new compiler warning about boolean in switch
  nfs: Remove unneeded casts in nfs
  NFS: Don't attempt to decode missing directory entries
  Revert "nfs: replace nfs_add_stats with nfs_inc_stats when add one"
  NFS: Rename idmap.c to nfs4idmap.c
  NFS: Move nfs_idmap.h into fs/nfs/
  NFS: Remove CONFIG_NFS_V4 checks from nfs_idmap.h
  NFS: Add a stub for GETDEVICELIST
  nfs: remove WARN_ON_ONCE from nfs_direct_good_bytes
  nfs: fix DIO good bytes calculation
  nfs: Fetch MOUNTED_ON_FILEID when updating an inode
  sunrpc: make debugfs file creation failure non-fatal
  nfs: fix high load average due to callback thread sleeping
  NFS: Reduce time spent holding the i_mutex during fallocate()
  NFS: Don't zap caches on fallocate()
  xprtrdma: Make rpcrdma_{un}map_one() into inline functions
  xprtrdma: Handle non-SEND completions via a callout
  xprtrdma: Add "open" memreg op
  xprtrdma: Add "destroy MRs" memreg op
  xprtrdma: Add "reset MRs" memreg op
  ...
2015-04-26 17:33:59 -07:00
Anna Schumaker 40c64c26a4 NFS: Move nfs_idmap.h into fs/nfs/
This file is only used internally to the NFS v4 module, so it doesn't
need to be in the global include path.  I also renamed it from
nfs_idmap.h to nfs4idmap.h to emphasize that it's an NFSv4-only include
file.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23 15:16:14 -04:00
David Howells 2b0143b5c9 VFS: normal filesystems (and lustre): d_inode() annotations
that's the bulk of filesystem drivers dealing with inodes of their own

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-15 15:06:57 -04:00
Trond Myklebust 48d66b9749 NFSv4: Fix a race in NFSv4.1 server trunking discovery
We do not want to allow a race with another NFS mount to cause
nfs41_walk_client_list() to establish a lease on our nfs_client before
we're done checking for trunking.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-03 20:42:23 -05:00
Trond Myklebust e2c63e091e Merge branch 'flexfiles'
* flexfiles: (53 commits)
  pnfs: lookup new lseg at lseg boundary
  nfs41: .init_read and .init_write can be called with valid pg_lseg
  pnfs: Update documentation on the Layout Drivers
  pnfs/flexfiles: Add the FlexFile Layout Driver
  nfs: count DIO good bytes correctly with mirroring
  nfs41: wait for LAYOUTRETURN before retrying LAYOUTGET
  nfs: add a helper to set NFS_ODIRECT_RESCHED_WRITES to direct writes
  nfs41: add NFS_LAYOUT_RETRY_LAYOUTGET to layout header flags
  nfs/flexfiles: send layoutreturn before freeing lseg
  nfs41: introduce NFS_LAYOUT_RETURN_BEFORE_CLOSE
  nfs41: allow async version layoutreturn
  nfs41: add range to layoutreturn args
  pnfs: allow LD to ask to resend read through pnfs
  nfs: add nfs_pgio_current_mirror helper
  nfs: only reset desc->pg_mirror_idx when mirroring is supported
  nfs41: add a debug warning if we destroy an unempty layout
  pnfs: fail comparison when bucket verifier not set
  nfs: mirroring support for direct io
  nfs: add mirroring support to pgio layer
  pnfs: pass ds_commit_idx through the commit path
  ...

Conflicts:
	fs/nfs/pnfs.c
	fs/nfs/pnfs.h
2015-02-03 16:01:27 -05:00
Peng Tao 30626f9c32 nfs41: allow LD to choose DS connection version/minor_version
flexfile layout may need to set such when making DS connections.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
2015-02-03 11:06:34 -08:00
Peng Tao 064172f345 nfs41: allow LD to choose DS connection auth flavor
flexfile layout may use different auth flavor as specified by MDS.

Reviewed-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
2015-02-03 11:06:33 -08:00
Trond Myklebust 3175e1dcec NFSv4.1: Fix an Oops in nfs41_walk_client_list
If we start state recovery on a client that failed to initialise correctly,
then we are very likely to Oops.

Reported-by: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de>
Link: http://lkml.kernel.org/r/130621862.279655.1421851650684.JavaMail.zimbra@desy.de
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-01-21 17:10:09 -05:00
Chuck Lever 7a01edf005 NFS: Ignore transport protocol when detecting server trunking
Detect server trunking across transport protocols. Otherwise, an
RDMA mount and a TCP mount of the same server will end up with
separate nfs_clients using the same clientid4.

Reported-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-01-05 19:40:54 -08:00
Trond Myklebust 55b9df93dd NFSv4/v4.1: Verify the client owner id during trunking detection
While we normally expect the NFSv4 client to always send the same client
owner to all servers, there are a couple of situations where that is not
the case:
 1) In NFSv4.0, switching between use of '-omigration' and not will cause
    the kernel to switch between using the non-uniform and uniform client
    strings.
 2) In NFSv4.1, or NFSv4.0 when using uniform client strings, if the
    uniquifier string is suddenly changed.

This patch will catch those situations by checking the client owner id
in the trunking detection code, and will do the right thing if it notices
that the strings differ.

Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-01-05 19:40:53 -08:00
Trond Myklebust ceb3a16c07 NFSv4: Cache the NFSv4/v4.1 client owner_id in the struct nfs_client
Ensure that we cache the NFSv4/v4.1 client owner_id so that we can
verify it when we're doing trunking detection.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-01-05 19:40:53 -08:00
Trond Myklebust 1fc0703af3 NFSv4.1: Fix client id trunking on Linux
Currently, our trunking code will check for session trunking, but will
fail to detect client id trunking. This is a problem, because it means
that the client will fail to recognise that the two connections represent
shared state, even if they do not permit a shared session.
By removing the check for the server minor id, and only checking the
major id, we will end up doing the right thing in both cases: we close
down the new nfs_client and fall back to using the existing one.

Fixes: 05f4c350ee ("NFS: Discover NFSv4 server trunking when mounting")
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org # 3.7.x
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-01-05 19:40:53 -08:00
Trond Myklebust 1702562db4 NFS: Generic client side changes from Chuck
These patches fixes for iostats and SETCLIENTID in addition to cleaning
 up the nfs4_init_callback() function.
 
 Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJUdfhQAAoJENfLVL+wpUDr9x4QANWmG6xjEU7RuIBLalOoxit6
 eXnEDNZlwYp6NCkYktHVWaTqXdwq7fdGX+3p4eiwNg3C3SrJHkpWvjeFd4KT/SyP
 B/w3vYG3o5H01i3Mb5kCD2uW0gbS9soZXpIg+uHHOl43yZzneC0vPTLhQ/h+9zPc
 B32XcgQhvLIHR3LWrC/+uolsa31lcyya0W45PX+iCpzHF7i9qRBrNLODkTlw1hNQ
 eEnYIvVy5oW00zlHJUYiTHP3e+0EJn5PAngdYbqiboJ9mK7DbB0QDwqyvJbIT7ql
 WAip6cNcJnSv1eiVYqDwlR1ok8drK5X7yQCT3lcLzAMDznLsSAL1Itu0h2Ay3z61
 f8XCyTwI0izq0DbdrMcPoqPSitqyM8nkPElnOuitXwzEroPaG40OF67yss3+ixbl
 JeQZ+u35pnpCkKUaZdCK3Pn83StxmUaBcFx8eg30NBc0SN13Eiz6aZcGEperClrR
 RwMLDUhUtAMcMRunRRxiN9lHafPqqeDeJre7uky0p0sU9CsH+1n5qIKLmk+Ber2d
 ZS29TobdR7ktfjQ52XazMAIFzI7r1v4Zn7ziH/WRbvKoKw9ICcyoUGNy9CS5LyWu
 BxWCZAJmby5H1i7V7ituJK8TImb81L4aPW06hHX9k+0SoTrgFjas//4jZYfysJ9N
 PvR0MRNfMFhuBlsTj7Gy
 =PMdS
 -----END PGP SIGNATURE-----

Merge tag 'nfs-cel-for-3.19' of git://git.linux-nfs.org/projects/anna/nfs-rdma into linux-next

Pull pull additional NFS client changes for 3.19 from Anna Schumaker:
  "NFS: Generic client side changes from Chuck

  These patches fixes for iostats and SETCLIENTID in addition to cleaning
  up the nfs4_init_callback() function.

  Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>"

* tag 'nfs-cel-for-3.19' of git://git.linux-nfs.org/projects/anna/nfs-rdma:
  NFS: Clean up nfs4_init_callback()
  NFS: SETCLIENTID XDR buffer sizes are incorrect
  SUNRPC: serialize iostats updates
2014-11-26 17:34:14 -05:00
Chuck Lever c2ef47b7f5 NFS: Clean up nfs4_init_callback()
nfs4_init_callback() is never invoked for NFS versions other than 4.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-11-25 16:22:16 -05:00
Markus Elfring fe0bf1185d NFS: Deletion of unnecessary checks before the function call "nfs_put_client"
The nfs_put_client() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-11-24 20:07:27 -05:00
Steve Dickson 080af20cc9 NFSv4: nfs4_state_manager() vs. nfs_server_remove_lists()
There is a race between nfs4_state_manager() and
nfs_server_remove_lists() that happens during a nfsv3 mount.

The v3 mount notices there is already a supper block so
nfs_server_remove_lists() called which uses the nfs_client_lock
spin lock to synchronize access to the client list.

At the same time nfs4_state_manager() is running through
the client list looking for work to do, using the same
lock. When nfs4_state_manager() wins the race to the
list, a v3 client pointer is found and not ignored
properly which causes the panic.

Moving some protocol checks before the state checking
avoids the panic.

CC: Stable Tree <stable@vger.kernel.org>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-18 13:04:21 -04:00
Peng Tao a363e32e94 nfsv4: set hostname when creating nfsv4 ds connection
We reference cl_hostname in many places for debugging purpose.
So make it useful by setting hostname when calling nfs_get_client.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-07-08 14:30:03 -04:00
Trond Myklebust f9b7ebdf7e NFSv4: Schedule recovery if nfs40_walk_client_list() is interrupted
If a timeout or a signal interrupts the NFSv4 trunking discovery
SETCLIENTID_CONFIRM call, then we don't know whether or not the
server has changed the callback identifier on us.
Assume that it did, and schedule a 'path down' recovery...

Tested-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-03-19 08:34:20 -04:00
Trond Myklebust 292f503cad NFSv4: Use the correct net namespace in nfs4_update_server
We need to use the same net namespace that was used to resolve
the hostname and sockaddr arguments.

Fixes: 32e62b7c3e (NFS: Add nfs4_update_server)
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-02-17 14:15:46 -05:00
Trond Myklebust 20b9a90245 NFSv4.1: nfs4_destroy_session must call rpc_destroy_waitqueue
There may still be timers active on the session waitqueues. Make sure
that we kill them before freeing the memory.

Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-02-01 15:13:39 -05:00
Jeff Layton 0ea9de0ea6 sunrpc: turn warn_gssd() log message into a dprintk()
The original printk() made sense when the GSSAPI codepaths were called
only when sec=krb5* was explicitly requested. Now however, in many cases
the nfs client will try to acquire GSSAPI credentials by default, even
when it's not requested.

Since we don't have a great mechanism to distinguish between the two
cases, just turn the pr_warn into a dprintk instead. With this change we
can also get rid of the ratelimiting.

We do need to keep the EXPORT_SYMBOL(gssd_running) in place since
auth_gss.ko needs it and sunrpc.ko provides it. We can however,
eliminate the gssd_running call in the nfs code since that's a bit of a
layering violation.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-27 15:36:05 -05:00
Weston Andros Adamson abad2fa5ba nfs4: fix discover_server_trunking use after free
If clp is new (cl_count = 1) and it matches another client in
nfs4_discover_server_trunking, the nfs_put_client will free clp before
->cl_preserve_clid is set.

Cc: stable@vger.kernel.org # 3.7+
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-20 16:08:06 -07:00