linux

Commit Graph

Author	SHA1	Message	Date
James Morris	6c8ff877cd	Merge commit 'v3.16' into next	2014-10-01 00:44:04 +10:00
Anna Schumaker	24bab49122	NFSD: Implement SEEK This patch adds server support for the NFS v4.2 operation SEEK, which returns the position of the next hole or data segment in a file. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-29 14:35:20 -04:00
Anna Schumaker	87a15a8090	NFSD: Add generic v4.2 infrastructure It's cleaner to introduce everything at once and have the server reply with "not supported" than it would be to introduce extra operations when implementing a specific one in the middle of the list. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-29 14:35:19 -04:00
Christoph Hellwig	0162ac2b97	nfsd: introduce nfsd4_callback_ops Add a higher level abstraction than the rpc_ops for callback operations. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-26 16:29:29 -04:00
Christoph Hellwig	f0b5de1b6b	nfsd: split nfsd4_callback initialization and use Split out initializing the nfs4_callback structure from using it. For the NULL callback this gets rid of tons of pointless re-initializations. Note that I don't quite understand what protects us from running multiple NULL callbacks at the same time, but at least this change doesn't make it worse. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-26 16:29:28 -04:00
Christoph Hellwig	326129d02a	nfsd: introduce a generic nfsd4_cb Add a helper to queue up a callback. CB_NULL has a bit of special casing because it is special in the specification, but all other new callback operations will be able to share code with this and a few more changes to refactor the callback code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-26 16:29:27 -04:00
Christoph Hellwig	2faf3b4350	nfsd: remove nfsd4_callback.cb_op We can always get at the private data by using container_of, no need for a void pointer. Also introduce a little to_delegation helper to avoid opencoding the container_of everywhere. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-26 16:29:26 -04:00
Benny Halevy	341b51df1f	nfsd: do not clear rpc_resp in nfsd4_cb_done_sequence This is incorrect when a callback has to be restarted, in which case the XDR decoding of the second iteration will see a NULL cb argument. [hch: updated description] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-26 16:29:25 -04:00
Christoph Hellwig	444b6e910d	nfsd: fix nfsd4_cb_recall_done error handling For any error that is not EBADHANDLE or NFS4ERR_BAD_STATEID, nfsd4_cb_recall_done first marks the connection down, then retries until dl_retries hits zero, then marks the connection down again and sets cb_done. This changes the code to only retry for EBADHANDLE or NFS4ERR_BAD_STATEID, and factors setting cb_done into a single point in the function. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-26 16:29:25 -04:00
Kirill Tkhai	f139caf2e8	sched, cleanup, treewide: Remove set_current_state(TASK_RUNNING) after schedule() schedule(), io_schedule() and schedule_timeout() always return with TASK_RUNNING state set, so one more setting is unnecessary. (All places in patch are visible good, only exception is kiblnd_scheduler() from: drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c Its schedule() is one line above standard 3 lines of unified diff) No places where set_current_state() is used for mb(). Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1410529254.3569.23.camel@tkhai Cc: Alasdair Kergon <agk@redhat.com> Cc: Anil Belur <askb23@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Kleikamp <shaggy@kernel.org> Cc: David Airlie <airlied@linux.ie> Cc: David Howells <dhowells@redhat.com> Cc: Dmitry Eremin <dmitry.eremin@intel.com> Cc: Frank Blaschka <blaschka@linux.vnet.ibm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Isaac Huang <he.huang@intel.com> Cc: James E.J. Bottomley <JBottomley@parallels.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Laura Abbott <lauraa@codeaurora.org> Cc: Liang Zhen <liang.zhen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Masaru Nomura <massa.nomura@gmail.com> Cc: Michael Opdenacker <michael.opdenacker@free-electrons.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Neil Brown <neilb@suse.de> Cc: Oleg Drokin <green@linuxhacker.ru> Cc: Peng Tao <bergwolf@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: Robert Love <robert.w.love@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Ursula Braun <ursula.braun@de.ibm.com> Cc: Zi Shen Lim <zlim.lnx@gmail.com> Cc: devel@driverdev.osuosl.org Cc: dm-devel@redhat.com Cc: dri-devel@lists.freedesktop.org Cc: fcoe-devel@open-fcoe.org Cc: jfs-discussion@lists.sourceforge.net Cc: linux390@de.ibm.com Cc: linux-afs@lists.infradead.org Cc: linux-cris-kernel@axis.com Cc: linux-kernel@vger.kernel.org Cc: linux-nfs@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: linux-raid@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linux-scsi@vger.kernel.org Cc: qla2xxx-upstream@qlogic.com Cc: user-mode-linux-devel@lists.sourceforge.net Cc: user-mode-linux-user@lists.sourceforge.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-09-19 12:35:17 +02:00
J. Bruce Fields	70b2823535	nfsd4: clarify how grace period ends The grace period is ended in two steps--first userland is notified that the grace period is now long enough that any clients who have not yet reclaimed can be safely forgotten, then we flip the switch that forbids reclaims and allows new opens. I had to think a bit to convince myself that the ordering was right here. Document it. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-17 16:33:19 -04:00
J. Bruce Fields	bea57fe45b	nfsd4: stop grace_time update at end of grace period The attempt to automatically set a new grace period time at the end of the grace period isn't really helpful. We'll probably shut down and reboot before we actually make use of the new grace period time anyway. So may as well leave it up to the init system to get this right. This just confuses people when they see /proc/fs/nfsd/nfsv4gracetime change from what they set it to. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-17 16:33:18 -04:00
Jeff Layton	65decb650a	nfsd: skip subsequent UMH "create" operations after the first one for v4.0 clients In the case of v4.0 clients, we may call into the "create" client tracking operation multiple times (once for each openowner). Upcalling for each one of those is wasteful and slow however. We can skip doing further "create" operations after the first one if we know that one has already been done. v4.1+ clients generally only call into this function once (on RECLAIM_COMPLETE), and we can't skip upcalling on the create even if the STABLE bit is set. Doing so would make it impossible for nfsdcltrack to lift the grace period early since the timestamp has a different meaning in the case where the client is expected to issue a RECLAIM_COMPLETE. Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-17 16:33:17 -04:00
Jeff Layton	788a7914ad	nfsd: set and test NFSD4_CLIENT_STABLE bit to reduce nfsdcltrack upcalls The nfsdcltrack upcall doesn't utilize the NFSD4_CLIENT_STABLE flag, which basically results in an upcall every time we call into the client tracking ops. Change it to set this bit on a successful "check" or "create" request, and clear it on a "remove" request. Also, check to see if that bit is set before upcalling on a "check" or "remove" request, and skip upcalling appropriately, depending on its state. Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-17 16:33:17 -04:00
Jeff Layton	d682e750ce	nfsd: serialize nfsdcltrack upcalls for a particular client In a later patch, we want to add a flag that will allow us to reduce the need for upcalls. In order to handle that correctly, we'll need to ensure that racing upcalls for the same client can't occur. In practice it should be rare for this to occur with a well-behaved client, but it is possible. Convert one of the bits in the cl_flags field to be an upcall bitlock, and use it to ensure that upcalls for the same client are serialized. Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-17 16:33:16 -04:00
Jeff Layton	d4318acd5d	nfsd: pass extra info in env vars to upcalls to allow for early grace period end In order to support lifting the grace period early, we must tell nfsdcltrack what sort of client the "create" upcall is for. We can't reliably tell if a v4.0 client has completed reclaiming, so we can only lift the grace period once all the v4.1+ clients have issued a RECLAIM_COMPLETE and if there are no v4.0 clients. Also, in order to lift the grace period, we have to tell userland when the grace period started so that it can tell whether a RECLAIM_COMPLETE has been issued for each client since then. Since this is all optional info, we pass it along in environment variables to the "init" and "create" upcalls. By doing this, we don't need to revise the upcall format. The UMH upcall can simply make use of this info if it happens to be present. If it's not then it can just avoid lifting the grace period early. Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-17 16:33:15 -04:00
Jeff Layton	7f5ef2e900	nfsd: add a v4_end_grace file to /proc/fs/nfsd Allow a privileged userland process to end the v4 grace period early. Writing "Y", "y", or "1" to the file will cause the v4 grace period to be lifted. The basic idea with this will be to allow the userland client tracking program to lift the grace period once it knows that no more clients will be reclaiming state. Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-17 16:33:14 -04:00
Jeff Layton	3b3e7b7223	nfsd: reject reclaim request when client has already sent RECLAIM_COMPLETE As stated in RFC 5661, section 18.51.3: Once a RECLAIM_COMPLETE is done, there can be no further reclaim operations for locks whose scope is defined as having completed recovery. Once the client sends RECLAIM_COMPLETE, the server will not allow the client to do subsequent reclaims of locking state for that scope and, if these are attempted, will return NFS4ERR_NO_GRACE. Ensure that we enforce that requirement. Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-17 16:33:13 -04:00
Jeff Layton	919b8049f0	nfsd: remove redundant boot_time parm from grace_done client tracking op Since it's stored in nfsd_net, we don't need to pass it in separately. Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-17 16:33:12 -04:00
Jeff Layton	f779002965	lockd: move lockd's grace period handling into its own module Currently, all of the grace period handling is part of lockd. Eventually though we'd like to be able to build v4-only servers, at which point we'll need to put all of this elsewhere. Move the code itself into fs/nfs_common and have it build a grace.ko module. Then, rejigger the Kconfig options so that both nfsd and lockd enable it automatically. Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-17 16:33:11 -04:00
Christoph Hellwig	f0c63124a6	nfsd: update mtime on truncate This fixes a failure in xfstests generic/313 because nfs doesn't update mtime on a truncate. The protocol requires this to be done implicitly for a size changing setattr. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-11 11:12:16 -04:00
Kinglong Mee	aef9583b23	NFSD: Get reference of lockowner when coping file_lock v5: using nfs4_get_stateowner() instead of an inline function v3: Update based on Jeff's comments v2: Fix bad using of struct file_lock_operations for handle the owner Acked-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-09 16:01:09 -04:00
Kinglong Mee	b5971afa0b	NFSD: New helper nfs4_get_stateowner() for atomic_inc sop reference v5: same as the first version Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-09 16:01:09 -04:00
Dmitry Kasatkin	3034a14682	ima: pass 'opened' flag to identify newly created files Empty files and missing xattrs do not guarantee that a file was just created. This patch passes FILE_CREATED flag to IMA to reliably identify new files. Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> 3.14+	2014-09-09 10:28:43 -04:00
J. Bruce Fields	aee3776441	nfsd4: fix rd_dircount enforcement Commit `3b29970909` "nfsd4: enforce rd_dircount" totally misunderstood rd_dircount; it refers to total non-attribute bytes returned, not number of directory entries returned. Bring the code into agreement with RFC 3530 section 14.2.24. Cc: stable@vger.kernel.org Fixes: `3b29970909` "nfsd4: enforce rd_dircount" Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-08 12:02:03 -04:00
Kinglong Mee	027bc41a3e	NFSD: Put export if prepare_creds() fail Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-03 17:43:04 -04:00
Kinglong Mee	13c82e8eb5	NFSD: Full checking of authentication name Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-03 17:43:03 -04:00
Kinglong Mee	48c348b09c	NFSD: Fix bad using of return value from qword_get Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-03 17:43:02 -04:00
Kinglong Mee	15d176c195	NFSD: Fix a memory leak if nfsd4_recdir_load fail Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-03 17:43:01 -04:00
Kinglong Mee	c2236f141e	NFSD: Reset creds after mnt_want_write_file() fail Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-03 17:43:01 -04:00
Kinglong Mee	8519f994e5	NFSD: Put file after ima_file_check fail in nfsd_open() Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-03 17:43:00 -04:00
J. Bruce Fields	ccad7dad86	nfsd4: remove labeled NFS warning from config help The working group appears committed to keeping the protocol stable, the code has gotten some use and seems to work OK. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-28 16:00:07 -04:00
Anna Schumaker	2b8941b962	NFSD: Update some as-yet unused 4.2 error codes Recent NFS v4.2 drafts have removed NFS4ERR_METADATA_NOTSUPP and reassigned the error code to NFS4ERR_UNION_NOTSUPP. I also add in the NFS4ERR_OFFLOAD_NO_REQS error code. We're not using any of these yet, so there's no harm done. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-28 16:00:01 -04:00
Kinglong Mee	6cd906627b	NFSD: Remove duplicate initialization of file_lock locks_alloc_lock() has initialized struct file_lock, no need to re-initialize it here. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-28 15:58:35 -04:00
Rajesh Ghanekar	18c01ab302	nfsd: allow turning off nfsv3 readdir_plus One of our customer's applications only needs file names, not file attributes. With directories having 10K+ inodes (assuming buffer cache has directory blocks cached having file names, but inode cache is limited and hence need eviction of older cached inodes), older inodes are evicted periodically. So if they keep on doing readdir(2) from NFS client on multiple directories, some directory's files are periodically removed from inode cache and hence new readdir(2) on same directory requires disk access to bring back inodes again to inode cache. As READDIRPLUS request fetches attributes also, doing getattr on each file on server, it causes unnecessary disk accesses. If READDIRPLUS on NFS client is returned with -ENOTSUPP, NFS client uses READDIR request which just gets the names of the files in a directory, not attributes, hence avoiding disk accesses on server. There's already a corresponding client-side mount option, but an export option reduces the need for configuration across multiple clients. This flag affects NFSv3 only. If it turns out it's needed for NFSv4 as well then we may have to figure out how to extend the behavior to NFSv4, but it's not currently obvious how to do that. Signed-off-by: Rajesh Ghanekar <rajesh_ghanekar@symantec.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-18 15:12:14 -04:00
J. Bruce Fields	f7b43d0c99	nfsd4: reserve adequate space for LOCK op As of `8c7424cff6` "nfsd4: don't try to encode conflicting owner if low on space", we permit the server to process a LOCK operation even if there might not be space to return the conflicting lockowner, because we've made returning the conflicting lockowner optional. However, the rpc server still wants to know the most we might possibly return, so we need to take into account the possible conflicting lockowner in the svc_reserve_space() call here. Symptoms were log messages like "RPC request reserved 88 but used 108". Fixes: `8c7424cff6` "nfsd4: don't try to encode conflicting owner if low on space" Reported-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-17 12:00:14 -04:00
J. Bruce Fields	1383bf37ce	nfsd4: remove obsolete comment We do what Neil suggests now. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-17 12:00:14 -04:00
Ross Lagerwall	63bab0651b	nfsd3: Check write permission after checking existence When creating a file that already exists in a read-only directory with O_EXCL, the NFSv3 server returns EACCES rather than EEXIST (which local files and the NFSv4 server return). Fix this by checking the MAY_CREATE permission only if the file does not exist. Since this already happens in do_nfsd_create, the check in nfsd3_proc_create can simply be removed. Signed-off-by: Ross Lagerwall <rosslagerwall@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-17 12:00:14 -04:00
Jeff Layton	afbda402a0	nfsd: call nfs4_put_deleg_lease outside of state_lock Currently, we hold the state_lock when releasing the lease. That's potentially problematic in the future if we allow for setlease methods that can sleep. Move the nfs4_put_deleg_lease call out of the delegation unhashing routine (which was always a bit goofy anyway), and into the unlocked sections of the callers of unhash_delegation_locked. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-17 12:00:14 -04:00
Jeff Layton	6bcc034eac	nfsd: protect lease-related nfs4_file fields with fi_lock Currently these fields are protected with the state_lock, but that doesn't really make a lot of sense. These fields are "private" to the nfs4_file, and can be protected with the more granular fi_lock. The fi_lock is already held when setting these fields. Make the code hold the fp->fi_lock when clearing the lease-related fields in the nfs4_file, and no longer require that the state_lock be held when calling into this function. To prevent lock inversion with the i_lock, we also move the vfs_setlease and fput calls outside of the fi_lock. This also sets us up for allowing vfs_setlease calls to block in the future. Finally, remove a redundant NULL pointer check. unhash_delegation_locked locks the fp->fi_lock prior to that check, so fp in that function must never be NULL. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-17 12:00:13 -04:00
Trond Myklebust	ef9b16dc6d	nfsd: Reorder nfsd_cache_match to check more powerful discriminators first We would normally expect the xid and the checksum to be the best discriminators. Check them before looking at the procedure number, etc. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-17 12:00:13 -04:00
Trond Myklebust	89a26b3d29	nfsd: split DRC global spinlock into per-bucket locks Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-17 12:00:13 -04:00
Trond Myklebust	31e60f5222	nfsd: convert num_drc_entries to an atomic_t ...so we can remove the spinlocking around it. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-17 12:00:12 -04:00
Trond Myklebust	11acf6ef3b	nfsd: Remove the cache_hash list Now that the lru list is per-bucket, we don't need a second list for searches. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-17 12:00:12 -04:00
Trond Myklebust	bedd4b61a4	nfsd: convert the lru list into a per-bucket thing Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-17 12:00:12 -04:00
Trond Myklebust	7142b98d9f	nfsd: Clean up drc cache in preparation for global spinlock elimination Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-17 12:00:12 -04:00
Linus Torvalds	0d10c2c170	Merge branch 'for-3.17' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "This includes a major rewrite of the NFSv4 state code, which has always depended on a single mutex. As an example, open creates are no longer serialized, fixing a performance regression on NFSv3->NFSv4 upgrades. Thanks to Jeff, Trond, and Benny, and to Christoph for review. Also some RDMA fixes from Chuck Lever and Steve Wise, and miscellaneous fixes from Kinglong Mee and others" * 'for-3.17' of git://linux-nfs.org/~bfields/linux: (167 commits) svcrdma: remove rdma_create_qp() failure recovery logic nfsd: add some comments to the nfsd4 object definitions nfsd: remove the client_mutex and the nfs4_lock/unlock_state wrappers nfsd: remove nfs4_lock_state: nfs4_state_shutdown_net nfsd: remove nfs4_lock_state: nfs4_laundromat nfsd: Remove nfs4_lock_state(): reclaim_complete() nfsd: Remove nfs4_lock_state(): setclientid, setclientid_confirm, renew nfsd: Remove nfs4_lock_state(): exchange_id, create/destroy_session() nfsd: Remove nfs4_lock_state(): nfsd4_open and nfsd4_open_confirm nfsd: Remove nfs4_lock_state(): nfsd4_delegreturn() nfsd: Remove nfs4_lock_state(): nfsd4_open_downgrade + nfsd4_close nfsd: Remove nfs4_lock_state(): nfsd4_lock/locku/lockt() nfsd: Remove nfs4_lock_state(): nfsd4_release_lockowner nfsd: Remove nfs4_lock_state(): nfsd4_test_stateid/nfsd4_free_stateid nfsd: Remove nfs4_lock_state(): nfs4_preprocess_stateid_op() nfsd: remove old fault injection infrastructure nfsd: add more granular locking to *_delegations fault injectors nfsd: add more granular locking to forget_openowners fault injector nfsd: add more granular locking to forget_locks fault injector nfsd: add a list_head arg to nfsd_foreach_client_lock ...	2014-08-09 14:31:18 -07:00
Jeff Layton	14a571a8ec	nfsd: add some comments to the nfsd4 object definitions Add some comments that describe what each of these objects is, and how they relate to one another. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 16:09:20 -04:00
Jeff Layton	b687f6863e	nfsd: remove the client_mutex and the nfs4_lock/unlock_state wrappers Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 15:00:54 -04:00
Jeff Layton	74cf76df0f	nfsd: remove nfs4_lock_state: nfs4_state_shutdown_net Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:20 -04:00
Jeff Layton	dab6ef2415	nfsd: remove nfs4_lock_state: nfs4_laundromat Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:20 -04:00
Trond Myklebust	05149dd4dc	nfsd: Remove nfs4_lock_state(): reclaim_complete() Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:19 -04:00
Trond Myklebust	cb86fb1428	nfsd: Remove nfs4_lock_state(): setclientid, setclientid_confirm, renew Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:18 -04:00
Trond Myklebust	3974552dce	nfsd: Remove nfs4_lock_state(): exchange_id, create/destroy_session() Also destroy_clientid and bind_conn_to_session. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:17 -04:00
Trond Myklebust	3234975f47	nfsd: Remove nfs4_lock_state(): nfsd4_open and nfsd4_open_confirm Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:16 -04:00
Trond Myklebust	084d4d4549	nfsd: Remove nfs4_lock_state(): nfsd4_delegreturn() Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:15 -04:00
Trond Myklebust	36626a2ecf	nfsd: Remove nfs4_lock_state(): nfsd4_open_downgrade + nfsd4_close Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:14 -04:00
Trond Myklebust	2dd7f2ad4e	nfsd: Remove nfs4_lock_state(): nfsd4_lock/locku/lockt() Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:13 -04:00
Trond Myklebust	51f5e78355	nfsd: Remove nfs4_lock_state(): nfsd4_release_lockowner Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:12 -04:00
Trond Myklebust	e7d5dc19ce	nfsd: Remove nfs4_lock_state(): nfsd4_test_stateid/nfsd4_free_stateid Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:12 -04:00
Trond Myklebust	c2d1d6a8f0	nfsd: Remove nfs4_lock_state(): nfs4_preprocess_stateid_op() Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:11 -04:00
Jeff Layton	285abdee53	nfsd: remove old fault injection infrastructure Remove the old nfsd_for_n_state function and move nfsd_find_client higher up into the file to get rid of forward declaration. Remove the struct nfsd_fault_inject_op arguments from the operations as they are no longer needed by any of them. Finally, remove the old "standard" get and set routines, which also eliminates the client_mutex from this code. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:10 -04:00
Jeff Layton	98d5c7c5bd	nfsd: add more granular locking to *_delegations fault injectors ...instead of relying on the client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:09 -04:00
Jeff Layton	82e05efaec	nfsd: add more granular locking to forget_openowners fault injector ...instead of relying on the client_mutex. Also, fix up the printk output that is generated when the file is read. It currently says that it's reporting the number of open files, but it's actually reporting the number of openowners. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:08 -04:00
Jeff Layton	016200c373	nfsd: add more granular locking to forget_locks fault injector ...instead of relying on the client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:07 -04:00
Jeff Layton	3738d50e7f	nfsd: add a list_head arg to nfsd_foreach_client_lock In a later patch, we'll want to collect the locks onto a list for later destruction. If "func" is defined and "collect" is defined, then we'll add the lock stateid to the list. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:06 -04:00
Jeff Layton	69fc9edf98	nfsd: add nfsd_inject_forget_clients ...which uses the client_lock for protection instead of client_mutex. Also remove nfsd_forget_client as there are no more callers. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:05 -04:00
Jeff Layton	a0926d1527	nfsd: add a forget_client set_clnt routine ...that relies on the client_lock instead of client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:04 -04:00
Jeff Layton	7ec0e36f1a	nfsd: add a forget_clients "get" routine with proper locking Add a new "get" routine for forget_clients that relies on the client_lock instead of the client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:04 -04:00
Jeff Layton	c96223d3b6	nfsd: abstract out the get and set routines into the fault injection ops Now that we've added more granular locking in other places, it's time to address the fault injection code. This code is currently quite reliant on the client_mutex for protection. Start to change this by adding a new set of fault injection op vectors. For now they all use the legacy ones. In later patches we'll add new routines that can deal with more granular locking. Also, move some of the printk routines into the callers to make the results of the operations more uniform. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:02 -04:00
Jeff Layton	294ac32e99	nfsd: protect clid and verifier generation with client_lock The clid counter is a global counter currently. Move it to be a per-net property so that it can be properly protected by the nn->client_lock instead of relying on the client_mutex. The verifier generator is also potentially racy if there are two simultaneous callers. Generate the verifier when we generate the clid value, so it's also created under the client_lock. With this, there's no need to keep two counters as they'd always be in sync anyway, so just use the clientid_counter for both. As Trond points out, what would be best is to eventually move this code to use IDR instead of the hash tables. That would also help ensure uniqueness, but that's probably best done as a separate project. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:02 -04:00
Jeff Layton	fd699b8a48	nfsd: don't destroy clients that are busy It's possible that we'll have an in-progress call on some of the clients while a rogue EXCHANGE_ID or DESTROY_CLIENTID call comes in. Be sure to try and mark the client expired first, so that the refcount is respected. This will only be a problem once the client_mutex is removed. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:01 -04:00
Kinglong Mee	fb94d766af	NFSD: Put the reference of nfs4_file when freeing stid After testing nfs4 lock, I restart the nfsd service, got messages as, [ 5677.403419] nfsd: last server has exited, flushing export cache [ 5677.463728] ============================================================================= [ 5677.463942] BUG nfsd4_files (Tainted: G B OE): Objects remaining in nfsd4_files on kmem_cache_close() [ 5677.464055] ----------------------------------------------------------------------------- [ 5677.464203] INFO: Slab 0xffffea0000233400 objects=28 used=1 fp=0xffff880008cd3d98 flags=0x3ffc0000004080 [ 5677.464318] CPU: 0 PID: 3772 Comm: rmmod Tainted: G B OE 3.16.0-rc2+ #29 [ 5677.464420] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013 [ 5677.464538] 0000000000000000 0000000036af2c9f ffff88000ce97d68 ffffffff816eacfa [ 5677.464643] ffffea0000233400 ffff88000ce97e40 ffffffff811cda44 ffffffff00000020 [ 5677.464774] ffff88000ce97e50 ffff88000ce97e00 656a624f00000008 616d657220737463 [ 5677.464875] Call Trace: [ 5677.464925] [<ffffffff816eacfa>] dump_stack+0x45/0x56 [ 5677.464983] [<ffffffff811cda44>] slab_err+0xb4/0xe0 [ 5677.465040] [<ffffffff811d0457>] ? __kmalloc+0x117/0x290 [ 5677.465099] [<ffffffff81100eec>] ? on_each_cpu_cond+0xac/0xf0 [ 5677.465158] [<ffffffff811d1bc0>] ? kmem_cache_close+0x110/0x2e0 [ 5677.465218] [<ffffffff811d1be0>] kmem_cache_close+0x130/0x2e0 [ 5677.465279] [<ffffffff8135a0c1>] ? kobject_cleanup+0x91/0x1b0 [ 5677.465338] [<ffffffff811d22be>] __kmem_cache_shutdown+0xe/0x10 [ 5677.465399] [<ffffffff8119bd28>] kmem_cache_destroy+0x48/0x100 [ 5677.465466] [<ffffffffa05ef78d>] nfsd4_free_slabs+0x2d/0x50 [nfsd] [ 5677.465530] [<ffffffffa05fa987>] exit_nfsd+0x34/0x6ad [nfsd] [ 5677.465589] [<ffffffff81104ac2>] SyS_delete_module+0x162/0x200 [ 5677.465649] [<ffffffff81013b69>] ? do_notify_resume+0x59/0x90 [ 5677.465759] [<ffffffff816f2369>] system_call_fastpath+0x16/0x1b [ 5677.465822] INFO: Object 0xffff880008cd0000 @offset=0 [ 5677.465882] INFO: Allocated in nfsd4_process_open1+0x61/0x350 [nfsd] age=7599 cpu=0 pid=3253 [ 5677.466115] __slab_alloc+0x3b0/0x4b1 [ 5677.466166] kmem_cache_alloc+0x1e4/0x240 [ 5677.466220] nfsd4_process_open1+0x61/0x350 [nfsd] [ 5677.466276] nfsd4_open+0xee/0x860 [nfsd] [ 5677.466329] nfsd4_proc_compound+0x4d7/0x7f0 [nfsd] [ 5677.466384] nfsd_dispatch+0xbb/0x200 [nfsd] [ 5677.466447] svc_process_common+0x453/0x6f0 [sunrpc] [ 5677.466506] svc_process+0x103/0x170 [sunrpc] [ 5677.466559] nfsd+0x117/0x190 [nfsd] [ 5677.466609] kthread+0xd8/0xf0 [ 5677.466656] ret_from_fork+0x7c/0xb0 [ 5677.466775] kmem_cache_destroy nfsd4_files: Slab cache still has objects [ 5677.466839] CPU: 0 PID: 3772 Comm: rmmod Tainted: G B OE 3.16.0-rc2+ #29 [ 5677.466937] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013 [ 5677.467049] 0000000000000000 0000000036af2c9f ffff88000ce97eb0 ffffffff816eacfa [ 5677.467150] ffff880020bb2d00 ffff88000ce97ed0 ffffffff8119bdd9 0000000000000000 [ 5677.467250] ffffffffa06065c0 ffff88000ce97ee0 ffffffffa05ef78d ffff88000ce97ef0 [ 5677.467351] Call Trace: [ 5677.467397] [<ffffffff816eacfa>] dump_stack+0x45/0x56 [ 5677.467454] [<ffffffff8119bdd9>] kmem_cache_destroy+0xf9/0x100 [ 5677.467516] [<ffffffffa05ef78d>] nfsd4_free_slabs+0x2d/0x50 [nfsd] [ 5677.467579] [<ffffffffa05fa987>] exit_nfsd+0x34/0x6ad [nfsd] [ 5677.467639] [<ffffffff81104ac2>] SyS_delete_module+0x162/0x200 [ 5677.467765] [<ffffffff81013b69>] ? do_notify_resume+0x59/0x90 [ 5677.467826] [<ffffffff816f2369>] system_call_fastpath+0x16/0x1b Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Jeff Layton <jlayton@primarydata.com> Fixes: `11b9164ada` "nfsd: Add a struct nfs4_file field to struct nfs4_stid" Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:53:36 -04:00
Jeff Layton	7abea1e8e8	nfsd: don't destroy client if mark_client_expired_locked fails If it fails, it means that the client is in use and so destroying it would be bad. Currently, the client_mutex prevents this from happening but once we remove it, we won't be able to do this. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-01 16:28:26 -04:00
Jeff Layton	97403d95e1	nfsd: move unhash_client_locked call into mark_client_expired_locked All the callers except for the fault injection code call it directly afterward, and in the fault injection case it won't hurt to do so anyway. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-01 16:28:25 -04:00
Jeff Layton	217526e7ec	nfsd: protect the close_lru list and oo_last_closed_stid with client_lock Currently, it's protected by the client_mutex. Move it so that the list and the fields in the openowner are protected by the client_lock. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-01 16:28:24 -04:00
Trond Myklebust	0a880a28f8	nfsd: Add lockdep assertions to document the nfs4_client/session locking Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-01 16:28:23 -04:00
Trond Myklebust	3e339f964b	nfsd: Ensure lookup_clientid() takes client_lock Ensure that the client lookup is done safely under the client_lock, so we're not relying on the client_mutex. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-01 16:28:23 -04:00
Trond Myklebust	6b10ad193d	nfsd: Protect nfsd4_destroy_clientid using client_lock ...instead of relying on the client_mutex. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-01 16:28:22 -04:00
Jeff Layton	d20c11d86d	nfsd: Protect session creation and client confirm using client_lock In particular, we want to ensure that the move_to_confirmed() is protected by the nn->client_lock spin lock, so that we can use that when looking up the clientid etc. instead of relying on the client_mutex. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-01 16:28:21 -04:00
Trond Myklebust	3dbacee6e1	nfsd: Protect unconfirmed client creation using client_lock ...instead of relying on the client_mutex. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-01 16:28:20 -04:00
Trond Myklebust	5cc40fd7b6	nfsd: Move create_client() call outside the lock For efficiency reasons, and because we want to use spin locks instead of relying on the client_mutex. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-01 16:28:20 -04:00
Trond Myklebust	425510f5c8	nfsd: Don't require client_lock in free_client The struct nfs_client is supposed to be invisible and unreferenced before it gets here. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-01 16:28:19 -04:00
Trond Myklebust	4864af97e0	nfsd: Ensure that the laundromat unhashes the client before releasing locks If we leave the client on the confirmed/unconfirmed tables, and leave the sessions visible on the sessionid_hashtbl, then someone might find them before we've had a chance to destroy them. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-01 16:28:18 -04:00
Trond Myklebust	4beb345b37	nfsd: Ensure struct nfs4_client is unhashed before we try to destroy it When we remove the client_mutex protection, we will need to ensure that it can't be found by other threads while we're destroying it. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-01 16:28:17 -04:00
J. Bruce Fields	83e452fee8	nfsd4: fix out of date comment Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-01 16:28:16 -04:00
Kinglong Mee	d9499a9571	NFSD: Decrease nfsd_users in nfsd_startup_generic fail A memory allocation failure could cause nfsd_startup_generic to fail, in which case nfsd_users would be incorrectly left elevated. After nfsd restarts nfsd_startup_generic will then succeed without doing anything--the first consequence is likely nfs4_start_net finding a bad laundry_wq and crashing. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Fixes: `4539f14981` "nfsd: replace boolean nfsd_up flag by users counter" Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-01 16:26:09 -04:00
Jeff Layton	4ae098d327	nfsd: rename unhash_generic_stateid to unhash_ol_stateid ...to better match other functions that deal with open/lock stateids. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:31 -04:00
Jeff Layton	d83017f94c	nfsd: don't thrash the cl_lock while freeing an open stateid When we remove the client_mutex, we'll have a potential race between FREE_STATEID and CLOSE. The root of the problem is that we are walking the st_locks list, dropping the spinlock and then trying to release the persistent reference to the lockstateid. In between, a FREE_STATEID call can come along and take the lock, find the stateid and then try to put the reference. That leads to a double put. Fix this by not releasing the cl_lock in order to release each lock stateid. Use put_generic_stateid_locked to unhash them and gather them onto a list, and free_ol_stateid_reaplist to free any that end up on the list. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:31 -04:00
Jeff Layton	2c41beb0e5	nfsd: reduce cl_lock thrashing in release_openowner Releasing an openowner is a bit inefficient as it can potentially thrash the cl_lock if you have a lot of stateids attached to it. Once we remove the client_mutex, it'll also potentially be dangerous to do this. Add some functions to make it easier to defer the part of putting a generic stateid reference that needs to be done outside the cl_lock while doing the parts that must be done while holding it under a single lock. First we unhash each open stateid. Then we call put_generic_stateid_locked which will put the reference to an nfs4_ol_stateid. If it turns out to be the last reference, it'll go ahead and remove the stid from the IDR tree and put it onto the reaplist using the st_locks list_head. Then, after dropping the lock we'll call free_ol_stateid_reaplist to walk the list of stateids that are fully unhashed and ready to be freed, and free each of them. This function can sleep, so it must be done outside any spinlocks. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:30 -04:00
Jeff Layton	fc5a96c3b7	nfsd: close potential race in nfsd4_free_stateid Once we remove the client_mutex, it'll be possible for the sc_type of a lock stateid to change after it's found and checked, but before we can go to destroy it. If that happens, we can end up putting the persistent reference to the stateid more than once, and unhash it more than once. Fix this by unhashing the lock stateid prior to dropping the cl_lock but after finding it. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:29 -04:00
Jeff Layton	3c1c995cc2	nfsd: optimize destroy_lockowner cl_lock thrashing Reduce the cl_lock thrashing in destroy_lockowner. Unhash all of the lockstateids on the lockowner's list. Put the reference under the lock and see if it was the last one. If so, then add it to a private list to be destroyed after we drop the lock. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:28 -04:00
Jeff Layton	a819ecc1bb	nfsd: add locking to stateowner release Once we remove the client_mutex, we'll need to properly protect the stateowner reference counts using the cl_lock. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:27 -04:00
Jeff Layton	882e9d25e1	nfsd: clean up and reorganize release_lockowner Do more within the main loop, and simplify the function a bit. Also, there's no need to take a stateowner reference unless we're going to call release_lockowner. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:27 -04:00
Trond Myklebust	d4f0489f38	nfsd: Move the open owner hash table into struct nfs4_client Preparation for removing the client_mutex. Convert the open owner hash table into a per-client table and protect it using the nfs4_client->cl_lock spin lock. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:26 -04:00
Trond Myklebust	c58c6610ec	nfsd: Protect adding/removing lock owners using client_lock Once we remove client mutex protection, we'll need to ensure that stateowner lookup and creation are atomic between concurrent compounds. Ensure that alloc_init_lock_stateowner checks the hashtable under the client_lock before adding a new element. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:25 -04:00
Trond Myklebust	7ffb588086	nfsd: Protect adding/removing open state owners using client_lock Once we remove client mutex protection, we'll need to ensure that stateowner lookup and creation are atomic between concurrent compounds. Ensure that alloc_init_open_stateowner checks the hashtable under the client_lock before adding a new element. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:24 -04:00
Jeff Layton	b401be22b5	nfsd: don't allow CLOSE to proceed until refcount on stateid drops Once we remove client_mutex protection, it'll be possible to have an in-flight operation using an openstateid when a CLOSE call comes in. If that happens, we can't just put the sc_file reference and clear its pointer without risking an oops. Fix this by ensuring that v4.0 CLOSE operations wait for the refcount to drop before proceeding to do so. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:23 -04:00
Jeff Layton	d3134b1049	nfsd: make openstateids hold references to their openowners Change it so that only openstateids hold persistent references to openowners. References can still be held by compounds in progress. With this, we can get rid of NFS4_OO_NEW. It's possible that we will create a new openowner in the process of doing the open, but something later fails. In the meantime, another task could find that openowner and start using it on a successful open. If that occurs we don't necessarily want to tear it down, just put the reference that the failing compound holds. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:23 -04:00
Jeff Layton	5adfd8850b	nfsd: clean up refcounting for lockowners Ensure that lockowner references are only held by lockstateids and operations that are in-progress. With this, we can get rid of release_lockowner_if_empty, which will be racy once we remove client_mutex protection. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:22 -04:00
Trond Myklebust	e4f1dd7fc2	nfsd: Make lock stateid take a reference to the lockowner A necessary step toward client_mutex removal. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:21 -04:00
Jeff Layton	8f4b54c53f	nfsd: add an operation for unhashing a stateowner Allow stateowners to be unhashed and destroyed when the last reference is put. The unhashing must be idempotent. In a future patch, we'll add some locking around it, but for now it's only protected by the client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:20 -04:00
Jeff Layton	5db1c03feb	nfsd: clean up lockowner refcounting when finding them Ensure that when finding or creating a lockowner, we get a reference to it. For now, we also take an extra reference when a lockowner is created that can be put when release_lockowner is called, but we'll remove that in a later patch once we change how references are held. Since we no longer destroy lockowners in the event of an error in nfsd4_lock, we must change how the seqid gets bumped in the lk_is_new case. Instead of doing so on creation, do it manually in nfsd4_lock. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:20 -04:00
Jeff Layton	58fb12e6a4	nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache We don't want to rely on the client_mutex for protection in the case of NFSv4 open owners. Instead, we add a mutex that will only be taken for NFSv4.0 state mutating operations, and that will be released once the entire compound is done. Also, ensure that nfsd4_cstate_assign_replay/nfsd4_cstate_clear_replay take a reference to the stateowner when they are using it for NFSv4.0 open and lock replay caching. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:19 -04:00
Jeff Layton	6b180f0b57	nfsd: Add reference counting to state owners The way stateowners are managed today is somewhat awkward. They need to be explicitly destroyed, even though the stateids reference them. This will be particularly problematic when we remove the client_mutex. We may create a new stateowner and attempt to open a file or set a lock, and have that fail. In the meantime, another RPC may come in that uses that same stateowner and succeed. We can't have the first task tearing down the stateowner in that situation. To fix this, we need to change how stateowners are tracked altogether. Refcount them and only destroy them once all stateids that reference them have been destroyed. This patch starts by adding the refcounting necessary to do that. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:18 -04:00
Trond Myklebust	2d3f96689f	nfsd: Migrate the stateid reference into nfs4_find_stateid_by_type() Allow nfs4_find_stateid_by_type to take the stateid reference, while still holding the &cl->cl_lock. Necessary step toward client_mutex removal. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:17 -04:00
Trond Myklebust	fd9110113c	nfsd: Migrate the stateid reference into nfs4_lookup_stateid() Allow nfs4_lookup_stateid to take the stateid reference, instead of having all the callers do so. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:16 -04:00
Trond Myklebust	4cbfc9f704	nfsd: Migrate the stateid reference into nfs4_preprocess_seqid_op Allow nfs4_preprocess_seqid_op to take the stateid reference, instead of having all the callers do so. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:15 -04:00
Trond Myklebust	0667b1e9d8	nfsd: Add reference counting to nfs4_preprocess_confirmed_seqid_op Ensure that all the callers put the open stateid after use. Necessary step toward client_mutex removal. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:15 -04:00
Trond Myklebust	2585fc7958	nfsd: nfsd4_open_confirm() must reference the open stateid Ensure that nfsd4_open_confirm() keeps a reference to the open stateid until it is done working with it. Necessary step toward client_mutex removal. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:14 -04:00
Trond Myklebust	8a0b589d8f	nfsd: Prepare nfsd4_close() for open stateid referencing Prepare nfsd4_close for a future where nfs4_preprocess_seqid_op() hands it a fully referenced open stateid. Necessary step toward client_mutex removal. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:13 -04:00
Trond Myklebust	d6f2bc5dcf	nfsd: nfsd4_process_open2() must reference the open stateid Ensure that nfsd4_process_open2() keeps a reference to the open stateid until it is done working with it. Necessary step toward client_mutex removal. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:12 -04:00
Trond Myklebust	dcd94cc2e7	nfsd: nfsd4_process_open2() must reference the delegation stateid Ensure that nfsd4_process_open2() keeps a reference to the delegation stateid until it is done working with it. Necessary step toward client_mutex removal. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:11 -04:00
Trond Myklebust	67cb1279be	nfsd: Ensure that nfs4_open_delegation() references the delegation stateid Ensure that nfs4_open_delegation() keeps a reference to the delegation stateid until it is done working with it. Necessary step toward client_mutex removal. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:11 -04:00
Trond Myklebust	858cc57336	nfsd: nfsd4_locku() must reference the lock stateid Ensure that nfsd4_locku() keeps a reference to the lock stateid until it is done working with it. Necessary step toward client_mutex removal. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:10 -04:00
Trond Myklebust	3d0fabd5a4	nfsd: Add reference counting to lock stateids Ensure that nfsd4_lock() references the lock stateid while it is manipulating it. Not currently necessary, but will be once the client_mutex is removed. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:09 -04:00
Jeff Layton	1af71cc801	nfsd: ensure atomicity in nfsd4_free_stateid and nfsd4_validate_stateid Hold the cl_lock over the bulk of these functions. In addition to ensuring that they aren't freed prematurely, this will also help prevent a potential race that could be introduced later. Once we remove the client_mutex, it'll be possible for FREE_STATEID and CLOSE to race and for both to try to put the "persistent" reference to the stateid. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:08 -04:00
Jeff Layton	356a95ece7	nfsd: clean up races in lock stateid searching and creation Preparation for removal of the client_mutex. Currently, no lock aside from the client_mutex is held when calling find_lock_state. Ensure that the cl_lock is held by adding a lockdep assertion. Once we remove the client_mutex, it'll be possible for another thread to race in and insert a lock state for the same file after we search but before we insert a new one. Ensure that doesn't happen by redoing the search after allocating a new stid that we plan to insert. If one is found just put the one that was allocated. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:07 -04:00
Jeff Layton	1c755dc1ad	nfsd: Add locking to protect the state owner lists Change to using the clp->cl_lock for this. For now, there's a lot of cl_lock thrashing, but in later patches we'll eliminate that and close the potential races that can occur when releasing the cl_lock while walking the lists. For now, the client_mutex prevents those races. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:07 -04:00
Jeff Layton	b49e084d8c	nfsd: do filp_close in sc_free callback for lock stateids Releasing locks when we unhash the stateid instead of doing so only when the stateid is actually released will be problematic in later patches when we need to protect the unhashing with spinlocks. Move it into the sc_free operation instead. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:19:50 -04:00
Jeff Layton	4770d72201	nfsd4: use cl_lock to synchronize all stateid idr calls Currently, this is serialized by the client_mutex, which is slated for removal. Add finer-grained locking here. Also, do some cleanup around find_stateid to prepare for taking references. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Benny Halevy <bhalevy@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:19:25 -04:00
Trond Myklebust	11b9164ada	nfsd: Add a struct nfs4_file field to struct nfs4_stid All stateids are associated with a nfs4_file. Let's consolidate. Replace delegation->dl_file with the dl_stid.sc_file, and nfs4_ol_stateid->st_file with st_stid.sc_file. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 12:51:34 -04:00
Trond Myklebust	6011695da2	nfsd: Add reference counting to the lock and open stateids When we remove the client_mutex, we'll need to be able to ensure that these objects aren't destroyed while we're not holding locks. Add a ->free() callback to the struct nfs4_stid, so that we can release a reference to the stid without caring about the contents. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 12:43:53 -04:00
Jeff Layton	b3fbfe0e7a	nfsd: print status when nfsd4_open fails to open file it just created It's possible for nfsd to fail opening a file that it has just created. When that happens, we throw a WARN but it doesn't include any info about the error code. Print the status code to give us a bit more info. Our QA group hit some of these warnings under some very heavy stress testing. My suspicion is that they hit the file-max limit, but it's hard to know for sure. Go ahead and add a -ENFILE mapping to nfserr_serverfault to make the error more distinct (and correct). Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-29 23:08:38 -04:00
Jeff Layton	650ecc8f8f	nfsd: remove dl_fh field from struct nfs4_delegation Now that the nfs4_file has a filehandle in it, we no longer need to keep a per-delegation copy of it. Switch to using the one in the nfs4_file instead. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-29 14:49:58 -04:00
Jeff Layton	f54fe962b8	nfsd: give block_delegation and delegation_blocked its own spinlock The state lock can be fairly heavily contended, and there's no reason that nfs4_file lookups and delegation_blocked should be mutually exclusive. Let's give the new block_delegation code its own spinlock. It does mean that we'll need to take a different lock in the delegation break code, but that's not generally as critical to performance. Cc: Neil Brown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-29 14:49:57 -04:00
Jeff Layton	0b26693c56	nfsd: clean up nfs4_set_delegation Move the alloc_init_deleg call into nfs4_set_delegation and change the function to return a pointer to the delegation or an IS_ERR return. This allows us to skip allocating a delegation if the file has already experienced a lease conflict. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-29 14:49:56 -04:00
Jeff Layton	4cf59221c7	nfsd: clean up arguments to nfs4_open_delegation No need to pass in a net pointer since we can derive that. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-29 14:49:55 -04:00
Jeff Layton	f9416e281e	nfsd: drop unused stp arg to alloc_init_deleg Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-29 14:49:54 -04:00
Trond Myklebust	02a3508dba	nfsd: Convert delegation counter to an atomic_long_t type We want to convert to an atomic type so that we don't need to lock across the call to alloc_init_deleg(). Then convert to a long type so that we match the size of 'max_delegations'. None of this is a problem today, but it will be once we remove client_mutex protection. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-29 14:49:54 -04:00
Jeff Layton	2d4a532d38	nfsd: ensure that clp->cl_revoked list is protected by clp->cl_lock Currently, both destroy_revoked_delegation and revoke_delegation manipulate the cl_revoked list without any locking aside from the client_mutex. Ensure that the clp->cl_lock is held when manipulating it, except for the list walking in destroy_client. At that point, the client should no longer be in use, and so it should be safe to walk the list without any locking. That also means that we don't need to do the list_splice_init there either. Also, the fact that revoke_delegation deletes dl_recall_lru list_head without any locking makes it difficult to know whether it's doing so safely in all cases. Move the list_del_init calls into the callers, and add a WARN_ON in the event that it's passed a delegation that has a non-empty list_head. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-29 14:49:53 -04:00
Jeff Layton	4269067696	nfsd: fully unhash delegations when revoking them Ensure that the delegations cannot be found by the laundromat etc once we add them to the various 'revoke' lists. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-29 14:49:52 -04:00
Trond Myklebust	f83388341b	nfsd: simplify stateid allocation and file handling Don't allow stateids to clear the open file pointer until they are being destroyed. In later patches we'll want to rely on the fact that we have a valid file pointer when dealing with the stateid and this will save us from having to do a lot of NULL pointer checks before doing so. Also, move to allocating stateids with kzalloc and get rid of the explicit zeroing of fields. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-29 14:49:51 -04:00
Jeff Layton	f9c00c3ab4	nfsd: Do not let nfs4_file pin the struct inode Remove the fi_inode field in struct nfs4_file in order to remove the possibility of struct nfs4_file pinning the inode when it does not have any open state. The only place we still need to get to an inode is in check_for_locks, so change it to use find_any_file and use the inode from any that it finds. If it doesn't find one, then just assume there aren't any. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-23 16:35:24 -04:00
Trond Myklebust	b07c54a4a3	nfsd: nfs4_check_fh - make it actually check the filehandle ...instead of just checking the inode that corresponds to it. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-23 16:35:24 -04:00
Trond Myklebust	ca94321783	nfsd: Use the filehandle to look up the struct nfs4_file instead of inode This makes more sense anyway since an inode pointer value can change even when the filehandle doesn't. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-23 16:35:24 -04:00
Trond Myklebust	e2cf80d73f	nfsd: Store the filehandle with the struct nfs4_file For use when we may not have a struct inode. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-23 16:35:23 -04:00
Himangi Saraogi	fc8e5a644c	nfsd4: convert comma to semicolon Replace a comma between expression statements by a semicolon. This changes the semantics of the code, but given the current indentation appears to be what is intended. A simplified version of the Coccinelle semantic patch that performs this transformation is as follows: // <smpl> @r@ expression e1,e2; @@ e1 -, +; e2; // </smpl> Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-23 14:20:49 -04:00
Jeff Layton	2f6ce8e73c	nfsd: ensure that st_access_bmap and st_deny_bmap are initialized to 0 Open stateids must be initialized with the st_access_bmap and st_deny_bmap set to 0, so that nfs4_get_vfs_file can properly record their state in old_access_bmap and old_deny_bmap. This bug was introduced in commit `baeb4ff0e5` (nfsd: make deny mode enforcement more efficient and close races in it) and was causing the refcounts to end up incorrect when nfs4_get_vfs_file returned an error after bumping the refcounts. This made it impossible to unmount the underlying filesystem after running pynfs tests that involve deny modes. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-23 14:20:47 -04:00
Kinglong Mee	f98bac5a30	NFSD: Fix crash encoding lock reply on 32-bit Commit `8c7424cff6` "nfsd4: don't try to encode conflicting owner if low on space" forgot to free conf->data in nfsd4_encode_lockt and before assigning NULL to conf->data in nfsd4_encode_lock_denied, causing a leak. Worse, kfree() can be called on an uninitialized pointer in the case of a successful lock (or one that fails for a reason other than a conflict). (Note that lock->lk_denied.ld_owner.data appears it should be zero here, until you notice that it's one arm of a union the other arm of which is written to in the successful case by the memcpy(&lock->lk_resp_stateid, &lock_stp->st_stid.sc_stateid, sizeof(stateid_t)); in nfsd4_lock(). In the 32-bit case this overwrites ld_owner.data.) Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Fixes: `8c7424cff6` "nfsd4: don't try to encode conflicting owner if low on space" Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-23 10:31:56 -04:00
Jeff Layton	d55a166c96	nfsd: bump dl_time when unhashing delegation There's a potential race between a lease break and DELEGRETURN call. Suppose a lease break comes in and queues the workqueue job for a delegation, but it doesn't run just yet. Then, a DELEGRETURN comes in, finds the delegation, and calls destroy_delegation on it to unhash it and put its primary reference. Next, the workqueue job runs and queues the delegation back onto the del_recall_lru list, issues the CB_RECALL and puts the final reference. With that, the final reference to the delegation is put, but it's still on the LRU list. When we go to unhash a delegation, it's because we intend to get rid of it soon afterward, so we don't want lease breaks to mess with it once that occurs. Fix this by bumping the dl_time whenever we unhash a delegation, to ensure that lease breaks don't monkey with it. I believe this is a regression due to commit `02e1215f9f` (nfsd: Avoid taking state_lock while holding inode lock in nfsd_break_one_deleg). Prior to that, the state_lock was held in the lm_break callback itself, and that would have prevented this race. Cc: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-22 15:34:47 -04:00
Trond Myklebust	72c0b0fb9f	nfsd: Move the delegation reference counter into the struct nfs4_stid We will want to add reference counting to the lock stateid and open stateids too in later patches. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-21 17:03:00 -04:00
Jeff Layton	417c6629b2	nfsd: fix race that grants unrecallable delegation If nfs4_setlease successfully acquires a new delegation, then another task breaks the delegation before we reach hash_delegation_locked, then the breaking task will see an empty fi_delegations list and do nothing. The client will receive an open reply incorrectly granting a delegation and will never receive a recall. Move more of the delegation fields to be protected by the fi_lock. It's more granular than the state_lock and in later patches we'll want to be able to rely on it in addition to the state_lock. Attempt to acquire a delegation. If that succeeds, take the spinlocks and then check to see if the file has had a conflict show up since then. If it has, then we assume that the lease is no longer valid and that we shouldn't hand out a delegation. There's also one more potential (but very unlikely) problem. If the lease is broken before the delegation is hashed, then it could leak. In the event that the fi_delegations list is empty, reset the fl_break_time to jiffies so that it's cleaned up ASAP by the normal lease handling code. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-21 16:31:17 -04:00
J. Bruce Fields	57a3714421	nfsd4: CREATE_SESSION should update backchannel immediately nfsd4_probe_callback kicks off some work that will eventually run nfsd4_process_cb_update and update the session flags. In theory we could process a following SEQUENCE call before that update happens resulting in flags that don't accurately represent, for example, the lack of a backchannel. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-21 12:30:50 -04:00
Chuck Lever	3c45ddf823	svcrdma: Select NFSv4.1 backchannel transport based on forward channel The current code always selects XPRT_TRANSPORT_BC_TCP for the back channel, even when the forward channel was not TCP (eg, RDMA). When a 4.1 mount is attempted with RDMA, the server panics in the TCP BC code when trying to send CB_NULL. Instead, construct the transport protocol number from the forward channel transport or'd with XPRT_TRANSPORT_BC. Transports that do not support bi-directional RPC will not have registered a "BC" transport, causing create_backchannel_client() to fail immediately. Fixes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=265 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-18 11:35:45 -04:00
J. Bruce Fields	5d6031ca74	nfsd4: zero op arguments beyond the 8th compound op The first 8 ops of the compound are zeroed since they're a part of the argument that's zeroed by the memset(rqstp->rq_argp, 0, procp->pc_argsize); in svc_process_common(). But we handle larger compounds by allocating the memory on the fly in nfsd4_decode_compound(). Other than code recently fixed by `01529e3f81` "NFSD: Fix memory leak in encoding denied lock", I don't know of any examples of code depending on this initialization. But it definitely seems possible, and I'd rather be safe. Compounds this long are unusual so I'm much more worried about failure in these poorly tested cases than about an insignificant performance hit. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-17 16:20:39 -04:00
Jeff Layton	ae4b884fc6	nfsd: silence sparse warning about accessing credentials sparse says: fs/nfsd/auth.c:31:38: warning: incorrect type in argument 1 (different address spaces) fs/nfsd/auth.c:31:38: expected struct cred const *cred fs/nfsd/auth.c:31:38: got struct cred const [noderef] <asn:4>*real_cred Add a new accessor for the ->real_cred and use that to fetch the pointer. Accessing current->real_cred directly is actually quite safe since we know that they can't go away so this is mostly a cosmetic fixup to silence sparse. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-17 16:15:35 -04:00
Trond Myklebust	b0fc29d6fc	nfsd: Ensure stateids remain unique until they are freed Add an extra delegation state to allow the stateid to remain in the idr tree until the last reference has been released. This will be necessary to ensure uniqueness once the client_mutex is removed. [jlayton: reset the sc_type under the state_lock in unhash_delegation] Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-16 21:39:51 -04:00
Jeff Layton	d564fbec7a	nfsd: nfs4_alloc_init_lease should take a nfs4_file arg No need to pass the delegation pointer in here as it's only used to get the nfs4_file pointer. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-16 21:35:25 -04:00
Jeff Layton	02e1215f9f	nfsd: Avoid taking state_lock while holding inode lock in nfsd_break_one_deleg state_lock is a heavily contended global lock. We don't want to grab that while simultaneously holding the inode->i_lock. Add a new per-nfs4_file lock that we can use to protect the per-nfs4_file delegation list. Hold that while walking the list in the break_deleg callback and queue the workqueue job for each one. The workqueue job can then take the state_lock and do the list manipulations without the i_lock being held prior to starting the rpc call. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-16 21:06:12 -04:00

2214 Commits