linux

Commit Graph

Author	SHA1	Message	Date
J. Bruce Fields	4ea41e2de5	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs into for-2.6.34-incoming Resolve merge conflict in fs/xfs/linux-2.6/xfs_export.c.	2010-03-04 12:04:51 -05:00
Linus Torvalds	0f2cc4ecd8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits) init: Open /dev/console from rootfs mqueue: fix typo "failues" -> "failures" mqueue: only set error codes if they are really necessary mqueue: simplify do_open() error handling mqueue: apply mathematics distributivity on mq_bytes calculation mqueue: remove unneeded info->messages initialization mqueue: fix mq_open() file descriptor leak on user-space processes fix race in d_splice_alias() set S_DEAD on unlink() and non-directory rename() victims vfs: add NOFOLLOW flag to umount(2) get rid of ->mnt_parent in tomoyo/realpath hppfs can use existing proc_mnt, no need for do_kern_mount() in there Mirror MS_KERNMOUNT in ->mnt_flags get rid of useless vfsmount_lock use in put_mnt_ns() Take vfsmount_lock to fs/internal.h get rid of insanity with namespace roots in tomoyo take check for new events in namespace (guts of mounts_poll()) to namespace.c Don't mess with generic_permission() under ->d_lock in hpfs sanitize const/signedness for udf nilfs: sanitize const/signedness in dealing with ->d_name.name ... Fix up fairly trivial (famous last words...) conflicts in drivers/infiniband/core/uverbs_main.c and security/tomoyo/realpath.c	2010-03-04 08:15:33 -08:00
Al Viro	f694869709	a couple of mntget+dget -> path_get in nfs4proc Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 14:07:56 -05:00
Al Viro	6eae7974d0	Switch alloc_nfs_open_context() to struct path Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 14:07:56 -05:00
Linus Torvalds	0a135ba14d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: add __percpu sparse annotations to what's left percpu: add __percpu sparse annotations to fs percpu: add __percpu sparse annotations to core kernel subsystems local_t: Remove leftover local.h this_cpu: Remove pageset_notifier this_cpu: Page allocator conversion percpu, x86: Generic inc / dec percpu instructions local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c module: Use this_cpu_xx to dynamically allocate counters local_t: Remove cpu_local_xx macros percpu: refactor the code in pcpu_[de]populate_chunk() percpu: remove compile warnings caused by __verify_pcpu_ptr() percpu: make accessors check for percpu pointer in sparse percpu: add __percpu for sparse. percpu: make access macros universal percpu: remove per_cpu__ prefix.	2010-03-03 07:34:18 -08:00
Andy Adamson	180b62a3d8	nfs41 fix NFS4ERR_CLID_INUSE for exchange id Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-03-02 13:45:33 -05:00
Trond Myklebust	ebed9203b6	NFS: Fix an allocation-under-spinlock bug sunrpc_cache_update() will always call detail->update() from inside the detail->hash_lock, so it cannot allocate memory. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org	2010-03-02 13:06:22 -05:00
Trond Myklebust	0f79fd6f5c	NFSv4.1: Various fixes to the sequence flag error handling Ensure that we change the EXCHANGE_ID verifier (i.e. clp->cl_boot_time) when we want to reset all state. This is mainly needed when the server tells us that it is revoking our open or lock stateids. Handle revoking of recallable state by expiring the delegations. Handle callback path issues by expiring the delegations and then resetting the session. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-03-02 13:06:21 -05:00
Alexandros Batsakis	0851de0617	nfs4: renewd renew operations should take/put a client reference renewd sends RENEW requests to the NFS server in order to renew state. As the request is asynchronous, renewd should take a reference to the nfs_client to prevent concurrent umounts from freeing the client Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-03-02 13:00:03 -05:00
Alexandros Batsakis	7135840fc7	nfs41: renewd sequence operations should take/put client reference renewd sends SEQUENCE requests to the NFS server in order to renew state. As the request is asynchronous, renewd should take a reference to the nfs_client to prevent concurrent umounts from freeing the session/client Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-03-02 12:54:30 -05:00
Alexandros Batsakis	dc96aef96a	nfs: prevent backlogging of renewd requests If the renewd send queue gets backlogged (e.g., if the server goes down), we will keep filling the queue with periodic RENEW/SEQUENCE requests. This patch schedules a new renewd request if and only if the previous one returns (either success or failure) Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> [Trond.Myklebust@netapp.com: moved nfs4_schedule_state_renewal() into separate nfs4_renew_release() and nfs41_sequence_release() callbacks to ensure correct behaviour on call setup failure] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-03-02 12:44:07 -05:00
Alexandros Batsakis	888ef2e3f8	nfs: kill renewd before clearing client minor version renewd should be synchronously killed before we destroy the session in nfs4_clear_minor_version Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> [Trond.Myklebust@netapp.com: clean up to remove 'unused function warning when !CONFIG_NFS_V4] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-03-02 12:16:12 -05:00
Christian Kujau	4912002fff	Remove EXPERIMENTAL from NFS_FSCACHE There's currently an open Ubuntu bug[0], with the intent to compile NFS_FSCACHE (and possibly AFS_FSCACHE, 9P_FSCACHE) into the standard Ubuntu kernel. However, since *_FSCACHE still depends on EXPERIMENTAL, this won't happen. As Arjan van de Ven pointed out[1], the EXPERIMENTAL flag doesn't mean that much any more, I propose the following patch to fs/nfs/Kconfig. I'd do the same for fs/9p/Kconfig and fs/afs/Kconfig, but as I did not test 9p or AFS, I feel it would not be appropriate for me to remove the flag. [0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/440522/comments/5 [1] http://lkml.org/lkml/2010/1/23/145 Signed-off-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-02-26 17:22:35 -08:00
Tejun Heo	003cb608a2	percpu: add __percpu sparse annotations to fs Add __percpu sparse annotations to fs. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Alex Elder <aelder@sgi.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk>	2010-02-17 11:17:38 +09:00
Chuck Lever	65d269538a	NFS: Too many GETATTR and ACCESS calls after direct I/O The cached read and write paths initialize fattr->time_start in their setup procedures. The value of fattr->time_start is propagated to read_cache_jiffies by nfs_update_inode(). Subsequent calls to nfs_attribute_timeout() will then use a good time stamp when computing the attribute cache timeout, and squelch unneeded GETATTR calls. Since the direct I/O paths erroneously leave the inode's fattr->time_start field set to zero, read_cache_jiffies for that inode is set to zero after any direct read or write operation. This triggers an otw GETATTR or ACCESS call to update the file's attribute and access caches properly, even when the NFS READ or WRITE replies have usable post-op attributes. Make sure the direct read and write setup code performs the same fattr initialization as the cached I/O paths to prevent unnecessary GETATTR calls. This was likely introduced by commit `0e574af1` in 2.6.15, which appears to add new nfs_fattr_init() call sites in the cached read and write paths, but not in the equivalent places in fs/nfs/direct.c. A subsequent commit in the same series, `33801147`, introduces the fattr->time_start field. Interestingly, the direct write reschedule path already has a call to nfs_fattr_init() in the right place. Reported-by: Quentin Barnes <qbarnes@yahoo-inc.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-02-15 19:53:43 -08:00
Chuck Lever	f895c53f8a	NFS: Make close(2) asynchronous when closing NFS O_DIRECT files For NFSv2 and v3: O_DIRECT writes are always synchronous, and aren't cached, so nothing should be flushed when closing an NFS O_DIRECT file descriptor. Thus there are no write errors to report on close(2). In addition, there's no cached data to verify on the next open(2), so we don't need clean GETATTR results at close time to compare with. Thus, there's no need for the nfs_revalidate_inode() call when closing an NFS O_DIRECT file. This reduces the number of synchronous on-the-wire requests for a simple open-write-close of an NFS O_DIRECT file by roughly 20%. For NFSv4: Call nfs4_do_close() with wait set to zero when closing an NFS O_DIRECT file. The CLOSE will go on the wire, but the application won't wait for it to complete. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:31:05 -05:00
Chuck Lever	7e381172cf	NFS: Improve NFS iostat byte count accuracy for writes The bytes counted by the performance counters for NFS writes should reflect write and sync errors. If the write(2) system call reports an error, the bytes should not be counted. And, if the write is short, the actual number of bytes that was written should be counted, not the number of bytes that was requested. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:31:04 -05:00
Chuck Lever	aa2f1ef10e	NFS: Account for NFS bytes read via the splice API Bytes read via the splice API should be accounted for in the NFS performance statistics. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:31:03 -05:00
Chuck Lever	4184dcf2db	NFS: Fix byte accounting for generic NFS reads Currently, the NFS I/O counters count the number of bytes requested by applications, rather than the number of bytes actually read by the system calls. The number of bytes requested for reads is actually not that useful, because the value is usually a buffer size for reads. That is, that requested number is usually a maximum, and frequently doesn't reflect the actual number of bytes read. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:31:03 -05:00
Chuck Lever	c2459dc462	NFS: Proper accounting for NFS VFS calls Nit: The VFSOPEN and VFSFLUSH counters are function call counters. Count every call to these routines. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:31:02 -05:00
Andy Adamson	9733f0d928	nfs41: cleanup callback code to use __be32 type Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:31:01 -05:00
Andy Adamson	41f54a5548	nfs41: clear NFS4CLNT_RECALL_SLOT bit on session reset Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:31:00 -05:00
Andy Adamson	bae0ac0ee1	nfs41: fix nfs4_callback_recallslot Return NFS4_OK if target high slotid equals enforced high slotid. Fix nfs_client reference leak. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:31:00 -05:00
Andy Adamson	104aeba484	nfs41: resize slot table in reset When session is reset, client can renegotiate slot table size. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:30:59 -05:00
Andy Adamson	b9efa1b27e	nfs41: implement cb_recall_slot Drain the fore channel and reset the max_slots to the new value. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:30:59 -05:00
Andy Adamson	4911096f1a	nfs41: back channel drc minimal implementation For now the back channel ca_maxresponsesize_cached is 0 and there is no backchannel DRC. Return NFS4ERR_REP_TOO_BIG_TO_CACHE when a cb_sequence cachethis is true. When it is false, return NFS4ERR_RETRY_UNCACHED_REP as the next operation error. Remember the replay error accross compound operation processing. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:30:58 -05:00
Andy Adamson	b2f28bd783	nfs41: prepare for back channel drc Make all cb_sequence arguments available to verify_seqid which will make replay decisions. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:30:58 -05:00
Andy Adamson	e95e60daee	nfs41: remove uneeded checks in callback processing All callback operations have arguments to decode and require processing. The preprocess_nfs4X_op functions catch unsupported or illegal ops so decode_args and process_op pointers are always non NULL. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:30:57 -05:00
Andy Adamson	b92b301900	nfs41: directly encode back channel error Skip all other processing when error is encountered. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:30:56 -05:00
Andy Adamson	31d2b4356b	nfs41: fix wrong error on callback header xdr overflow Set NFS4ERR_RESOURCE as CB_COMPOUND status and do not return an op on decode_op_hdr or encode_op_hdr buffer overflow. NFS4ERR_RESOURCE is correct for v4.0. Will fix the return for v4.1 along with all the other NFS4ERR_RESOURCE errors in a later patch. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:30:56 -05:00
Mike Sager	72ce2b3c06	nfs41: Process callback's referring call list If a CB_SEQUENCE referring call triple matches a slot table entry, the client is still waiting for a response to the original request. In this case, return NFS4ERR_DELAY as the response to the callback. Signed-off-by: Mike Sager <sager@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:30:55 -05:00
Mike Sager	a7989c3e47	nfs41: Check slot table for referring calls Traverse a list of referring calls and look for a session/slot/seq number match. Signed-off-by: Mike Sager <sager@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:30:55 -05:00
Mike Sager	8e0d46e138	nfs41: Adjust max cache response size value For the CREATE_SESSION attribute ca_maxresponsesize_cached, calculate the value based on the rpc reply header size plus the maximum nfs compound reply size. Signed-off-by: Mike Sager <sager@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:30:54 -05:00
Jeff Layton	97cefcc6d0	nfs: handle NFSv2 -EKEYEXPIRED returns from RPC layer appropriately Add a wrapper around rpc_call_sync that handles -EKEYEXPIRED errors from the RPC layer as it would an -EJUKEBOX error if NFSv2 had such a thing. Also, add a handler for that error for async calls that makes it resubmit the RPC on -EKEYEXPIRED. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:30:52 -05:00
Jeff Layton	b68d69b8c6	nfs: handle NFSv3 -EKEYEXPIRED errors as we would -EJUKEBOX We're using -EKEYEXPIRED to indicate that a krb5 credcache contains an expired ticket and that we should have the NFS layer retry the RPC call instead of returning an error back to the caller. Handle this as we would an -EJUKEBOX error return. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:30:51 -05:00
Jeff Layton	2c6434888c	nfs4: handle -EKEYEXPIRED errors from RPC layer If a KRB5 TGT ticket expires, we don't want to return an error immediatel. If someone has a long running job and just forgets to run "kinit" in time then this will make it fail. Instead, we want to treat this situation as we would NFS4ERR_DELAY and retry the upcall after delaying a bit with an exponential backoff. This patch just makes any place that would handle NFS4ERR_DELAY also handle -EKEYEXPIRED the same way. In the future, we may want to be more sophisticated however and handle hard vs. soft mounts differently, or specify some upper limit on how long we'll wait for a new TGT to be acquired. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-10 08:30:50 -05:00
Trond Myklebust	fdcb45777a	NFS: Fix the mapping of the NFSERR_SERVERFAULT error It was recently pointed out that the NFSERR_SERVERFAULT error, which is designed to inform the user of a serious internal error on the server, was being mapped to an error value that is internal to the kernel. This patch maps it to the error EREMOTEIO, which is exported to userland through errno.h. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org	2010-02-09 14:29:29 -05:00
Trond Myklebust	7549ad5f9b	NFS: Remove a redundant check for PageFsCache in nfs_migrate_page() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: David Howells <dhowells@redhat.com>	2010-02-09 14:29:21 -05:00
Trond Myklebust	2c1740098c	NFS: Fix a bug in nfs_fscache_release_page() Not having an fscache cookie is perfectly valid if the user didn't mount with the fscache option. This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=15234 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: David Howells <dhowells@redhat.com> Cc: stable@kernel.org	2010-02-09 14:29:10 -05:00
Trond Myklebust	9b4b351346	NFS: Don't clobber the attribute type in nfs_update_inode() If the NFS_ATTR_FATTR_TYPE field isn't set in fattr->valid, then we should not set the S_IFMT part of inode->i_mode. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-02-03 08:27:35 -05:00
Trond Myklebust	387c149b54	NFS: Fix a umount race Ensure that we unregister the bdi before kill_anon_super() calls ida_remove() on our device name. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org	2010-02-03 08:27:35 -05:00
Trond Myklebust	9f557cd807	NFS: Fix an Oops when truncating a file The VM/VFS does not allow mapping->a_ops->invalidatepage() to fail. Unfortunately, nfs_wb_page_cancel() may fail if a fatal signal occurs. Since the NFS code assumes that the page stays mapped for as long as the writeback is active, we can end up Oopsing (among other things). The only safe fix here is to convert nfs_wait_on_request(), so as to make it uninterruptible (as is already the case with wait_on_page_writeback()). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org	2010-02-03 08:27:22 -05:00
Chuck Lever	d6783b2b6c	SUNRPC: Bury "#ifdef IPV6" in svc_create_xprt() Clean up: Bruce observed we have more or less common logic in each of svc_create_xprt()'s callers: the check to create an IPv6 RPC listener socket only if CONFIG_IPV6 is set. I'm about to add another case that does just the same. If we move the ifdefs into __svc_xpo_create(), then svc_create_xprt() call sites can get rid of the "#ifdef" ugliness, and can use the same logic with or without IPv6 support available in the kernel. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-26 17:56:43 -05:00
Trond Myklebust	a2c0b9e291	NFS: Ensure that we handle NFS4ERR_STALE_STATEID correctly Even if the server is crazy, we should be able to mark the stateid as being bad, to ensure it gets recovered. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com>	2010-01-26 15:42:47 -05:00
Trond Myklebust	03391693a9	NFSv4.1: Don't call nfs4_schedule_state_recovery() unnecessarily Currently, nfs4_handle_exception() will call it twice if called with an error of -NFS4ERR_STALE_CLIENTID, -NFS4ERR_STALE_STATEID or -NFS4ERR_EXPIRED. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com>	2010-01-26 15:42:38 -05:00
Trond Myklebust	8e469ebd6d	NFSv4: Don't allow posix locking against servers that don't support it Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org Reviewed-by: Chuck Lever <chuck.lever@oracle.com>	2010-01-26 15:42:30 -05:00
Trond Myklebust	2bee72a6aa	NFSv4: Ensure that the NFSv4 locking can recover from stateid errors In most cases, we just want to mark the lock_stateid sequence id as being uninitialised. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org Reviewed-by: Chuck Lever <chuck.lever@oracle.com>	2010-01-26 15:42:21 -05:00
David Howells	b0706ca415	NFS: Avoid warnings when CONFIG_NFS_V4=n Avoid the following warnings when CONFIG_NFS_V4=n: fs/nfs/sysctl.c:19: warning: unused variable `nfs_set_port_max' fs/nfs/sysctl.c:18: warning: unused variable `nfs_set_port_min' by making those variables contingent on NFSv4 being configured. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com>	2010-01-26 15:42:11 -05:00
H Hartley Sweeten	0aa05887af	NFS: Make nfs_commitdata_release static The symbol nfs_commitdata_release is only used locally in this file. Make it static to prevent the following sparse warning: warning: symbol 'nfs_commitdata_release' was not declared. Should it be static? Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com>	2010-01-26 15:42:03 -05:00
Trond Myklebust	82be934a59	NFS: Try to commit unstable writes in nfs_release_page() If someone calls nfs_release_page(), we presumably already know that the page is clean, however it may be holding an unstable write. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org Reviewed-by: Chuck Lever <chuck.lever@oracle.com>	2010-01-26 15:41:53 -05:00
Trond Myklebust	c9edda7140	NFS: Fix a reference leak in nfs_wb_cancel_page() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org Reviewed-by: Chuck Lever <chuck.lever@oracle.com>	2010-01-26 15:41:34 -05:00
OGAWA Hirofumi	56335936de	nfs: fix oops in nfs_rename() Recent change is missing to update "rehash". With that change, it will become the cause of adding dentry to hash twice. This explains the reason of Oops (dereference the freed dentry in __d_lookup()) on my machine. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reported-by: Marvin <marvin24@gmx.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-01-06 18:48:26 -05:00
Linus Torvalds	a2770d86b3	Revert "fix mismerge with Trond's stuff (create_mnt_ns() export is gone now)" This reverts commit `e9496ff46a`. Quoth Al: "it's dependent on a lot of other stuff not currently in mainline and badly broken with current fs/namespace.c. Sorry, badly out-of-order cherry-pick from old queue. PS: there's a large pending series reworking the refcounting and lifetime rules for vfsmounts that will, among other things, allow to rip a subtree away _without_ dissolving connections in it, to be garbage-collected when all active references are gone. It's considerably saner wrt "is the subtree busy" logics, but it's nowhere near being ready for merge at the moment; this changeset is one of the things becoming possible with that sucker, but it certainly shouldn't have been picked during this cycle. My apologies..." Noticed-by: Eric Paris <eparis@redhat.com> Requested-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-17 12:51:05 -08:00
Linus Torvalds	bac5e54c29	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (38 commits) direct I/O fallback sync simplification ocfs: stop using do_sync_mapping_range cleanup blockdev_direct_IO locking make generic_acl slightly more generic sanitize xattr handler prototypes libfs: move EXPORT_SYMBOL for d_alloc_name vfs: force reval of target when following LAST_BIND symlinks (try #7) ima: limit imbalance msg Untangling ima mess, part 3: kill dead code in ima Untangling ima mess, part 2: deal with counters Untangling ima mess, part 1: alloc_file() O_TRUNC open shouldn't fail after file truncation ima: call ima_inode_free ima_inode_free IMA: clean up the IMA counts updating code ima: only insert at inode creation time ima: valid return code from ima_inode_alloc fs: move get_empty_filp() deffinition to internal.h Sanitize exec_permission_lite() Kill cached_lookup() and real_lookup() Kill path_lookup_open() ... Trivial conflicts in fs/direct-io.c	2009-12-16 12:04:02 -08:00
Linus Torvalds	e4bdda1bc3	Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFSv4: Fix a regression in the NFSv4 state manager NFSv4: Release the sequence id before restarting a CLOSE rpc call nfs41: fix session fore channel negotiation nfs41: do not zero seqid portion of stateid on close nfs: run state manager in privileged mode nfs: make recovery state manager operations privileged nfs: enforce FIFO ordering of operations trying to acquire slot rpc: add a new priority in RPC task nfs: remove rpc_task argument from nfs4_find_slot rpc: add rpc_queue_empty function nfs: change nfs4_do_setlk params to identify recovery type nfs: do not do a LOOKUP after open nfs: minor cleanup of session draining	2009-12-16 10:47:44 -08:00
Linus Torvalds	37c24b37fb	Merge branch 'for-2.6.33' of git://linux-nfs.org/~bfields/linux * 'for-2.6.33' of git://linux-nfs.org/~bfields/linux: (42 commits) nfsd: remove pointless paths in file headers nfsd: move most of nfsfh.h to fs/nfsd nfsd: remove unused field rq_reffh nfsd: enable V4ROOT exports nfsd: make V4ROOT exports read-only nfsd: restrict filehandles accepted in V4ROOT case nfsd: allow exports of symlinks nfsd: filter readdir results in V4ROOT case nfsd: filter lookup results in V4ROOT case nfsd4: don't continue "under" mounts in V4ROOT case nfsd: introduce export flag for v4 pseudoroot nfsd: let "insecure" flag vary by pseudoflavor nfsd: new interface to advertise export features nfsd: Move private headers to source directory vfs: nfsctl.c un-used nfsd #includes lockd: Remove un-used nfsd headers #includes s390: remove un-used nfsd #includes sparc: remove un-used nfsd #includes parsic: remove un-used nfsd #includes compat.c: Remove dependence on nfsd private headers ...	2009-12-16 10:43:34 -08:00
Al Viro	e9496ff46a	fix mismerge with Trond's stuff (create_mnt_ns() export is gone now) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-16 12:16:44 -05:00
Trond Myklebust	380454126f	NFSv4: Fix a regression in the NFSv4 state manager Commit `5601a00d67` (nfs: run state manager in privileged mode) introduces a regression in the NFSv4 code when compiled with CONFIG_NFS_V4_1. The calls to nfs4_end_drain_session() from the main loop in nfs4_state_manager() Oops due to the lack of an NFSv4.1 session when running NFSv4.0. The fix is to move those two calls back into nfs41_init_clientid() and nfs4_reset_session(). The calls to nfs4_end_drain_session() that remain inside nfs4_state_manager() are safe, since the NFSv4.0 code will never set the NFS4CLNT_SESSION_DRAINING bit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-15 17:36:57 -05:00
Trond Myklebust	72211dbe72	NFSv4: Release the sequence id before restarting a CLOSE rpc call If the CLOSE or OPEN_DOWNGRADE call triggers a state recovery, and has to be resent, then we must release the seqid. Otherwise the open recovery will wait for the close to finish, which causes a deadlock. This is mainly a NFSv4.1 problem, although it can theoretically happen with NFSv4.0 too, in a OPEN_DOWNGRADE situation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-15 14:47:36 -05:00
Andy Adamson	68bf05efb7	nfs41: fix session fore channel negotiation If the rsize or wsize is not set on the mount command, negotiate the highest supported rsize and wsize in session creation. Fixes a bug where the client negotiated nfs41_maxwrite_overhead as ca_maxrequestsize and nfs41_maxread_overhead as ca_maxresponsesize resulting in NFS4ERR_REQ_TOO_BIG errors on writes. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-15 13:58:42 -05:00
Andy Adamson	a5523b84c4	nfs41: do not zero seqid portion of stateid on close Remove code left over from a previous minorversion draft. which specified zeroing seqid portions of stateid's. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-15 13:58:36 -05:00
Alexandros Batsakis	5601a00d67	nfs: run state manager in privileged mode Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-15 13:58:23 -05:00
Alexandros Batsakis	b257957e50	nfs: make recovery state manager operations privileged Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-15 13:58:07 -05:00
Alexandros Batsakis	689cf5c15b	nfs: enforce FIFO ordering of operations trying to acquire slot Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-15 13:55:18 -05:00
Alexandros Batsakis	40ead580ae	nfs: remove rpc_task argument from nfs4_find_slot Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-15 13:51:22 -05:00
Alexandros Batsakis	afe6c27ccb	nfs: change nfs4_do_setlk params to identify recovery type Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-15 13:50:32 -05:00
Alexandros Batsakis	0f7e720694	nfs: do not do a LOOKUP after open Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-15 13:50:21 -05:00
Alexandros Batsakis	3bfb0fc591	nfs: minor cleanup of session draining Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-15 13:50:01 -05:00
Linus Torvalds	3f86ce72cf	Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (75 commits) NFS: Fix nfs_migrate_page() rpc: remove unneeded function parameter in gss_add_msg() nfs41: Invoke RECLAIM_COMPLETE on all new client ids SUNRPC: IS_ERR/PTR_ERR confusion NFSv41: Fix a potential state leakage when restarting nfs4_close_prepare nfs41: Handle NFSv4.1 session errors in the delegation recall code nfs41: Retry delegation return if it failed with session error nfs41: Handle session errors during delegation return nfs41: Mark stateids in need of reclaim if state manager gets stale clientid NFS: Fix up the declaration of nfs4_restart_rpc when NFSv4 not configured nfs41: Don't clear DRAINING flag on NFS4ERR_STALE_CLIENTID nfs41: nfs41_setup_state_renewal NFSv41: More cleanups NFSv41: Fix up some bugs in the NFS4CLNT_SESSION_DRAINING code NFSv41: Clean up slot table management NFSv41: Fix nfs4_proc_create_session nfs41: Invoke RECLAIM_COMPLETE nfs41: RECLAIM_COMPLETE functionality nfs41: RECLAIM_COMPLETE XDR functionality Cleanup some NFSv4 XDR decode comments ...	2009-12-14 10:00:24 -08:00
Linus Torvalds	d0316554d3	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits) m68k: rename global variable vmalloc_end to m68k_vmalloc_end percpu: add missing per_cpu_ptr_to_phys() definition for UP percpu: Fix kdump failure if booted with percpu_alloc=page percpu: make misc percpu symbols unique percpu: make percpu symbols in ia64 unique percpu: make percpu symbols in powerpc unique percpu: make percpu symbols in x86 unique percpu: make percpu symbols in xen unique percpu: make percpu symbols in cpufreq unique percpu: make percpu symbols in oprofile unique percpu: make percpu symbols in tracer unique percpu: make percpu symbols under kernel/ and mm/ unique percpu: remove some sparse warnings percpu: make alloc_percpu() handle array types vmalloc: fix use of non-existent percpu variable in put_cpu_var() this_cpu: Use this_cpu_xx in trace_functions_graph.c this_cpu: Use this_cpu_xx for ftrace this_cpu: Use this_cpu_xx in nmi handling this_cpu: Use this_cpu operations in RCU this_cpu: Use this_cpu ops for VM statistics ... Fix up trivial (famous last words) global per-cpu naming conflicts in arch/x86/kvm/svm.c mm/slab.c	2009-12-14 09:58:24 -08:00
Trond Myklebust	52c9948b1f	Merge branch 'nfs-for-2.6.33'	2009-12-13 13:56:27 -05:00
Trond Myklebust	190f38e5ce	NFS: Fix nfs_migrate_page() The call to migrate_page() will cause the page->private field to be cleared. Also fix up the locking around the page->private transfer, so that we ensure that calls to nfs_page_find_request() don't end up racing. Finally, fix up a double free bug: nfs_unlock_request() already calls nfs_release_request() for us... Reported-by: Wu Fengguang <fengguang.wu@intel.com> Tested-by: Andi Kleen <andi@firstfloor.org> Cc: stable@kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-10 09:05:55 -05:00
Christoph Hellwig	6b2f3d1f76	vfs: Implement proper O_SYNC semantics While Linux provided an O_SYNC flag basically since day 1, it took until Linux 2.4.0-test12pre2 to actually get it implemented for filesystems, since that day we had generic_osync_around with only minor changes and the great "For now, when the user asks for O_SYNC, we'll actually give O_DSYNC" comment. This patch intends to actually give us real O_SYNC semantics in addition to the O_DSYNC semantics. After Jan's O_SYNC patches which are required before this patch it's actually surprisingly simple, we just need to figure out when to set the datasync flag to vfs_fsync_range and when not. This patch renames the existing O_SYNC flag to O_DSYNC while keeping it's numerical value to keep binary compatibility, and adds a new real O_SYNC flag. To guarantee backwards compatiblity it is defined as expanding to both the O_DSYNC and the new additional binary flag (__O_SYNC) to make sure we are backwards-compatible when compiled against the new headers. This also means that all places that don't care about the differences can just check O_DSYNC and get the right behaviour for O_SYNC, too - only places that actuall care need to check __O_SYNC in addition. Drivers and network filesystems have been updated in a fail safe way to always do the full sync magic if O_DSYNC is set. The few places setting O_SYNC for lower layers are kept that way for now to stay failsafe. We enforce that O_DSYNC is set when __O_SYNC is set early in the open path to make sure we always get these sane options. Note that parisc really screwed up their headers as they already define a O_DSYNC that has always been a no-op. We try to repair it by using it for the new O_DSYNC and redefinining O_SYNC to send both the traditional O_SYNC numerical value _and_ the O_DSYNC one. Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger@sun.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Acked-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-10 15:02:50 +01:00
Ricardo Labiaga	7cab89b275	nfs41: Invoke RECLAIM_COMPLETE on all new client ids The NFSv4.1 spec indicates RECLAIM_COMPLETE is to be issued whenever a client establishes a new client id, not only after detecting the server has rebooted. Set the NFS4CLNT_RECLAIM_REBOOT bit after every new client id has been established. This enables us to issue RECLAIM_COMPLETE during the wrap up of the NFS4CLNT_RECLAIM_REBOOT state. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-08 14:35:28 -05:00
Linus Torvalds	6035ccd8e9	Merge branch 'for-2.6.33' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.33' of git://git.kernel.dk/linux-2.6-block: (113 commits) cfq-iosched: Do not access cfqq after freeing it block: include linux/err.h to use ERR_PTR cfq-iosched: use call_rcu() instead of doing grace period stall on queue exit blkio: Allow CFQ group IO scheduling even when CFQ is a module blkio: Implement dynamic io controlling policy registration blkio: Export some symbols from blkio as its user CFQ can be a module block: Fix io_context leak after failure of clone with CLONE_IO block: Fix io_context leak after clone with CLONE_IO cfq-iosched: make nonrot check logic consistent io controller: quick fix for blk-cgroup and modular CFQ cfq-iosched: move IO controller declerations to a header file cfq-iosched: fix compile problem with !CONFIG_CGROUP blkio: Documentation blkio: Wait on sync-noidle queue even if rq_noidle = 1 blkio: Implement group_isolation tunable blkio: Determine async workload length based on total number of queues blkio: Wait for cfq queue to get backlogged if group is empty blkio: Propagate cgroup weight updation to cfq groups blkio: Drop the reference to queue once the task changes cgroup blkio: Provide some isolation between groups ...	2009-12-08 08:19:16 -08:00
Linus Torvalds	1557d33007	Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6: (43 commits) security/tomoyo: Remove now unnecessary handling of security_sysctl. security/tomoyo: Add a special case to handle accesses through the internal proc mount. sysctl: Drop & in front of every proc_handler. sysctl: Remove CTL_NONE and CTL_UNNUMBERED sysctl: kill dead ctl_handler definitions. sysctl: Remove the last of the generic binary sysctl support sysctl net: Remove unused binary sysctl code sysctl security/tomoyo: Don't look at ctl_name sysctl arm: Remove binary sysctl support sysctl x86: Remove dead binary sysctl support sysctl sh: Remove dead binary sysctl support sysctl powerpc: Remove dead binary sysctl support sysctl ia64: Remove dead binary sysctl support sysctl s390: Remove dead sysctl binary support sysctl frv: Remove dead binary sysctl support sysctl mips/lasat: Remove dead binary sysctl support sysctl drivers: Remove dead binary sysctl support sysctl crypto: Remove dead binary sysctl support sysctl security/keys: Remove dead binary sysctl support sysctl kernel: Remove binary sysctl logic ...	2009-12-08 07:38:50 -08:00
Trond Myklebust	88069f77e1	NFSv41: Fix a potential state leakage when restarting nfs4_close_prepare Currently, if the call to nfs4_setup_sequence() in nfs4_close_prepare fails, any later retries will fail to launch an RPC call, due to the fact that the &state->flags will have been cleared. Ditto if nfs4_close_done() triggers a call to the NFSv4.1 version of nfs_restart_rpc(). We therefore move the actual clearing of the state->flags to nfs4_close_done(), when we know that the RPC call was successful. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-08 08:33:16 -05:00
Ricardo Labiaga	74e7bb73a3	nfs41: Handle NFSv4.1 session errors in the delegation recall code Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-07 09:48:30 -05:00
Ricardo Labiaga	7970886118	nfs41: Retry delegation return if it failed with session error Update nfs4_delegreturn_done() to retry the operation after setting the NFS4CLNT_SESSION_SETUP bit to indicate the need to reset the session. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-07 09:23:21 -05:00
Ricardo Labiaga	bcfa49f6f9	nfs41: Handle session errors during delegation return Add session error handling to nfs4_open_delegation_recall() Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-07 09:22:29 -05:00
Ricardo Labiaga	f455848a11	nfs41: Mark stateids in need of reclaim if state manager gets stale clientid The state manager was not marking the stateids as needing to be reclaimed after reestablishing the clientid. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-07 09:16:09 -05:00
Trond Myklebust	0110ee152b	NFS: Fix up the declaration of nfs4_restart_rpc when NFSv4 not configured Also rename it: it is used in generic code, and so should not have a 'nfs4' prefix. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-07 09:00:24 -05:00
Ricardo Labiaga	9dfdf404c9	nfs41: Don't clear DRAINING flag on NFS4ERR_STALE_CLIENTID If CREATE_SESSION fails with NFS4ERR_STALE_CLIENTID, don't clear the NFS4CLNT_SESSION_DRAINING flag and don't wake RPCs waiting for the session to be reestablished. We don't have a session yet, so there is no reason to wake other RPCs. This avoids sending spurious compounds with bogus sequenceID during session and state recovery. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [Trond.Myklebust@netapp.com: cleaned up patch by adding the nfs41_begin/end_drain_session() helpers] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-06 12:57:34 -05:00
Ricardo Labiaga	9430fb6b53	nfs41: nfs41_setup_state_renewal Move call to get the lease time and the setup of the state renewal out of nfs4_create_session so that it can be called after clearing the DRAINING flag. We use the getattr RPC to obtain the lease time, which requires a sequence slot. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-06 12:23:46 -05:00
Trond Myklebust	bcb56164ce	NFSv41: More cleanups Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 19:32:19 -05:00
Trond Myklebust	35dc1d74a8	NFSv41: Fix up some bugs in the NFS4CLNT_SESSION_DRAINING code Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 19:32:19 -05:00
Trond Myklebust	d61e612a72	NFSv41: Clean up slot table management We no longer need to maintain a distinction between nfs41_sequence_done and nfs41_sequence_free_slot. This fixes a number of slot table leakages in the NFSv4.1 code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 19:32:19 -05:00
Trond Myklebust	f26468fb93	NFSv41: Fix nfs4_proc_create_session We should not assume that nfs41_init_clientid() will always want to initialise the session. If it is being called due to a server reboot, then we just want to reset the session after re-establishing the clientid. Fix this by getting rid of the 'reset' parameter in nfs4_proc_create_session(), and instead relying on whether or not the session slot table pointer is non-NULL. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 19:32:11 -05:00
Ricardo Labiaga	da6ebfe34a	nfs41: Invoke RECLAIM_COMPLETE This patch invokes RECLAIM_COMPLETE after the client is done reclaiming state. There are interpretations of the spec that suggest that RECLAIM_COMPLETE should also be issued after a new clientid has been obtained from the server and even if there is no state to reclaim. This tells the server that the client has no state to reclaim even if the client isn't aware the server may have rebooted. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 16:08:41 -05:00
Ricardo Labiaga	fce5c838e1	nfs41: RECLAIM_COMPLETE functionality Implements RECLAIM_COMPLETE as an asynchronous RPC. NFS4ERR_DELAY is retried, NFS4ERR_DEADSESSION invokes the error handling but does not result in a retry, since we don't want to have a lingering RECLAIM_COMPLETE call sent in the middle of a possible new state recovery cycle. If a session reset occurs, a new wave of reclaim operations will follow, containing their own RECLAIM_COMPLETE call. We don't want a retry to get on the way of recovery by incorrectly indicating to the server that we're done reclaiming state. A subsequent patch invokes the functionality. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 16:08:41 -05:00
Ricardo Labiaga	180197536b	nfs41: RECLAIM_COMPLETE XDR functionality XDR encoding and decoding for RECLAIM_COMPLETE. Implements the necessary encoding to indicate reclaim complete for the entire client. In the future, it can be extended to provide reclaim complete functionality for a single file system after migration. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 16:08:40 -05:00
Ricardo Labiaga	8b173218bd	Cleanup some NFSv4 XDR decode comments Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 16:08:39 -05:00
Trond Myklebust	0556d1a695	NFSv41: nfs4_reset_session must always set NFS4CLNT_SESSION_DRAINING Otherwise we have no guarantees that other processes won't start another RPC call while we're resetting the session. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 15:03:20 -05:00
Alexandros Batsakis	2597641dea	nfs41: v2 fix cb_recall bug in NFSv4.1 the seqid part of a stateid in CB_RECALL must be 0 Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 13:48:55 -05:00
Alexandros Batsakis	0629e370dd	nfs41: check SEQUENCE status flag the server can indicate a number of error conditions by setting the appropriate bits in the SEQUENCE operation. The client re-establishes state with the server when it receives one of those, with the action depending on the specific case. Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 13:46:14 -05:00
Alexandros Batsakis	2449ea2e19	nfs41: V2 adjust max_rqst_sz, max_resp_sz w.r.t to rsize, wsize The v4.1 client should take into account the desired rsize, wsize when negotiating the max size in CREATE_SESSION. Accordingly, it should use rsize, wsize that are smaller than the session negotiated values. Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 13:36:55 -05:00
Alexandros Batsakis	7b183d0d43	nfs41: remove server-only EXCHGID4_FLAG_CONFIRMED_R flag from exchange_id Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 13:33:25 -05:00
Alexandros Batsakis	4882ef72cd	nfs41: add support for the exclusive create flags In v4.1 the client MUST/SHOULD use the EXCLUSIVE4_1 flag instead of EXCLUSIVE4, and GUARDED when the server supports persistent sessions. For now (and until we support suppattr_exclcreat), we don't send any attributes with EXCLUSIVE4_1 relying in the subsequent SETATTR as in v4.0 Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 13:30:21 -05:00
Alexandros Batsakis	d8cb1a7ce3	nfs41: check if session exists and if it is persistent Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 13:29:53 -05:00
Alexandros Batsakis	31f0960778	nfs41: V2 initial support for CB_RECALL_ANY For now the clients returns _all_ the delegations of the specificed type it holds Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 13:27:02 -05:00
Alexandros Batsakis	c79571a508	nfs4: V2 return/expire delegations depending on their type Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 13:20:52 -05:00
Alexandros Batsakis	b4a6f4966e	nfs4: minor delegation cleaning Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 13:19:11 -05:00
Alexandros Batsakis	07bccc2dd4	nfs41: add support for callback with RPC version number 4 The NFSv4.1 spec-29 (18.36.3) says that the server MUST use an ONC RPC (program) version number equal to 4 in callbacks sent to the client. For now we allow both versions 1 and 4. Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-05 13:19:01 -05:00
Andy Adamson	0b9e2d41f1	nfs41: only state manager sets NFS4CLNT_SESSION_SETUP Replace sync and async handlers setting of the NFS4CLNT_SESSION_SETUP bit with setting NFS4CLNT_CHECK_LEASE, and let the state manager decide to reset the session. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-04 16:02:14 -05:00
Andy Adamson	691daf3b0c	nfs41: drain session cleanup Do not wake up the next slot_tbl_waitq task in nfs4_free_slot because we may be draining the slot. Either signal the state manager that the session is drained (the state manager wakes up tasks) OR wake up the next task. In nfs41_sequence_done, the slot dereference is only needed in the sequence operation success case. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-04 15:55:39 -05:00
Andy Adamson	ea028ac925	nfs41: nfs41: fix state manager deadlock in session reset If the session is reset during state recovery, the state manager thread can sleep on the slot_tbl_waitq causing a deadlock. Add a completion framework to the session. Have the state manager thread set a new session state (NFS4CLNT_SESSION_DRAINING) and wait for the session slot table to drain. Signal the state manager thread in nfs41_sequence_free_slot when the NFS4CLNT_SESSION_DRAINING bit is set and the session is drained. Reported-by: Trond Myklebust <trond@netapp.com> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-04 15:55:38 -05:00
Andy Adamson	05f0d23647	nfs41: remove nfs4_recover_session nfs4_recover_session can put rpciod to sleep. Just use nfs4_schedule_recovery. Reported-by: Trond Myklebust <trond.myklebust@netapp.com> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-04 15:55:37 -05:00
Andy Adamson	2628eddff1	nfs41: don't clear tk_action on success Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-04 15:55:35 -05:00
Andy Adamson	8ba9bf8e51	nfs41: fix switch in nfs4_recovery_handle_error Do not fall through and set NFS4CLNT_SESSION_RESET bit on NFS4ERR_EXPIRED Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-04 15:55:34 -05:00
Andy Adamson	b9179237e2	nfs41: fix switch in nfs4_handle_exception Do not fall through and call nfs4_delay on session error handling. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-04 15:55:32 -05:00
Andy Adamson	36bbe34239	nfs41: free the slot on unhandled read errors nfs4_read_done returns zero on unhandled errors. nfs_readpage_result will return on a negative tk_status without freeing the slot. Call nfs4_sequence_free_slot on unhandled errors in nfs4_read_done. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-04 15:55:30 -05:00
Andy Adamson	e608e79f1b	nfs41: call free slot from nfs4_restart_rpc nfs41_sequence_free_slot can be called multiple times on SEQUENCE operation errors. No reason to inline nfs4_restart_rpc Reported-by: Trond Myklebust <trond.myklebust@netapp.com> nfs_writeback_done and nfs_readpage_retry call nfs4_restart_rpc outside the error handler, and the slot is not freed prior to restarting in the rpc_prepare state during session reset. Fix this by moving the call to nfs41_sequence_free_slot from the error path of nfs41_sequence_done into nfs4_restart_rpc, and by removing the test for NFS4CLNT_SESSION_SETUP. Always free slot and goto the rpc prepare state on async errors. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-04 15:55:29 -05:00
Andy Adamson	1d9ddde94a	nfs41: nfs4_get_lease_time will never session reset Make this clear by calling rpc_restart-call. Prepare for nfs4_restart_rpc() to free slots. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-04 15:55:27 -05:00
Andy Adamson	6df08189ff	nfs41: rename cl_state session SETUP bit to RESET The bit is no longer used for session setup, only for session reset. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-04 15:55:05 -05:00
Andy Adamson	4d643d1dfa	nfs41: add create session into establish_clid Reported-by: Trond Myklebust <trond.myklebust@netapp.com> Resetting the clientid from the state manager could result in not confirming the clientid due to create session not being called. Move the create session call from the NFS4CLNT_SESSION_SETUP state manager initialize session case into the NFS4CLNT_LEASE_EXPIRED case establish_clid call. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-04 15:52:24 -05:00
Trond Myklebust	7285f2d2ff	Merge branch 'devel' into linux-next	2009-12-03 21:27:36 -05:00
NeilBrown	44ed3556ba	NFS4ERR_FILE_OPEN handling in Linux/NFS NFS4ERR_FILE_OPEN is return by the server when an operation cannot be performed because the file is currently open and local (to the server) semantics prohibit the operation while the file is open. A typical case is a RENAME operation on an MS-Windows platform, which prevents rename while the file is open. While it is possible that such a condition is transitory, it is also very possible that the file will be held open for an extended period of time thus preventing the operation. The current behaviour of Linux/NFS is to retry the operation indefinitely. This is not appropriate - we do not expect a rename to take an arbitrary amount of time to complete. Rather, and error should be returned. The most obvious error code would be EBUSY, which is a legal at least for 'rename' and 'unlink', and accurately captures the reason for the error. This patch allows a few retries until about 2 seconds have elapsed, then returns EBUSY. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 21:26:36 -05:00
Miklos Szeredi	24e93025ee	nfs: clean up sillyrenaming in nfs_rename() The d_instantiate(new_dentry, NULL) is superfluous, the dentry is already negative. Rehashing this dummy dentry isn't needed either, d_move() works fine on an unhashed target. The re-checking for busy after a failed nfs_sillyrename() is bogus too: new_dentry->d_count < 2 would be a bug here. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
Miklos Szeredi	27226104e6	nfs: dont unhash target if renaming a directory Move unhashing the target to after the check for existence and being a non-directory. If renaming a directory then the VFS already unhashes the target if it is not busy. If it's busy then acquiring more references during the rename makes no difference. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
Miklos Szeredi	28f79a1a69	nfs: fix comments in nfs_rename() Comments are wrong or out of date. In particular d_drop() doesn't free the inode it just unhashes the dentry. And if target is a directory then it is not checked for being busy. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
Miklos Szeredi	e48de5ec25	nfs: remove unnecessary check from nfs_rename() VFS already checks if both source and target are directories. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
Trond Myklebust	9c4c761a62	NFSv4.1: Handle NFSv4.1 session errors in the lock recovery code Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
Chuck Lever	dd47f96c07	NFS: Revert default r/wsize behavior When the "rsize=" or "wsize=" mount options are not specified, text-based mounts have slightly different behavior than legacy binary mounts. Text-based mounts use the smaller of the server's maximum and the client's maximum, but binary mounts use the smaller of the server's _preferred_ size and the client's maximum. This difference is actually pretty subtle. Most servers advertise the same value as their maximum and their preferred transfer size, so the end result is the same in most cases. The reason for this difference is that for text-based mounts, if r/wsize are not specified, they are set to the largest value supported by the client. For legacy mounts, the values are set to zero if these options are not specified. nfs_server_set_fsinfo() can negotiate the transfer size defaults correctly in any case. There's no need to specify any particular value as default in the text-based option parsing logic. Note that nfs4 doesn't use nfs_server_set_fsinfo(), but the mount.nfs4 command does set rsize and wsize to 0 if the user didn't specify these options. So, make the same change for text-based NFSv4 mounts. Thanks to James Pearson <james-p@moving-picture.com> for reporting and diagnosing the problem. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
Chuck Lever	d250e190fb	NFS: Display compressed (shorthand) IPv6 in /proc/mounts Recent changes to snprintf() introduced the %pI6c formatter, which can display an IPv6 address with standard shorthanding. Use this new formatter when displaying IPv6 server addresses in /proc/mounts. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
Jeff Layton	ee671b016f	NFS: convert proto= option to use netids rather than a protoname Solaris uses netids as values for the proto= option, so that when someone specifies "tcp6" they get traffic over TCP + IPv6. Until recently, this has never really been an issue for Linux since it didn't support NFS over IPv6. The netid and the protocol name were generally always the same (modulo any strange configuration in /etc/netconfig). The solaris manpage documents their proto= option as: proto= _netid_ \| rdma This patch is intended to bring Linux closer to how the Solaris proto= option works, by declaring a static netid mapping in the kernel and converting the proto= and mountproto= options to follow it and display the proper values in /proc/mounts. Much of this functionality will need to be provided by a userspace mount.nfs patch. Chuck Lever has a patch to change mount.nfs in the same way. In principle, we could do all of this in userspace but that would mean that the options in /proc/mounts may not match the options used by userspace. The alternative to the static mapping here is to add a mechanism to upcall to userspace for netid's. I'm not opposed to that option, but it'll probably mean more overhead (and quite a bit more code). Rather than shoot for that at first, I figured it was probably better to start simply. Comments welcome. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
J. Bruce Fields	d4e935bd67	The rpc server does not require that service threads take the BKL. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:33 -05:00
Trond Myklebust	1185a552e3	NFSv4: Ensure nfs4_close_context() is declared as static Fix another 'sparse' warning in fs/nfs/nfs4proc.c Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:54:02 -05:00
Trond Myklebust	0a6566ecd3	NFSv4: Ensure nfs_dns_lookup() and nfs_dns_update() are declared static Fix two 'sparse' warnings in fs/nfs/dns_resolve.c Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:54:01 -05:00
Trond Myklebust	b6d408ba8c	NFSv4: Fix up error handling in the state manager main loop. The nfs4_state_manager should not be looking at the error values when deciding whether or not to loop round in order to handle a higher priority state recovery task. It should rather be looking at the clp->cl_state. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:53:22 -05:00
Trond Myklebust	a9ed2e2583	NFSv4: Handle NFS4ERR_GRACE when recovering an expired lease. If our lease expires, and the server reboots while we're recovering, we need to be able to wait until the grace period is over. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:53:21 -05:00
Trond Myklebust	c8b7ae3d32	NFSv4: Ensure the state manager handles NFS4ERR_NO_GRACE correctly Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:53:21 -05:00
Trond Myklebust	4f7cdf18e1	NFSv4: The state manager shouldn't exit on errors that were handled nfs4_recovery_handle_error() will correctly handle errors such as NFS4ERR_CB_PATH_DOWN, however because they are still passed back to the main loop in nfs4_state_manager(), they can cause the latter to exit prematurely. Fix this by letting nfs4_recovery_handle_error() change the error value in cases where there is no action required by the caller. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:53:20 -05:00
Trond Myklebust	e345e88a77	NFSv4: Fix up the callers of nfs4_state_end_reclaim_reboot In practice, we need to ensure that we call nfs4_state_end_reclaim_reboot in 2 cases: - If we lose the lease while we were reclaiming state OR - After we're done with reboot recovery Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:52:41 -05:00
Trond Myklebust	d18cc1fda2	NFSv4: Fix a potential state manager deadlock when returning delegations The nfsv4 state manager could potentially deadlock inside __nfs_inode_return_delegation() if the server reboots, so that the calls to nfs_msync_inode() end up waiting on state recovery to complete. Also ensure that if a server reboot or network partition causes us to have to stop returning delegations, that NFS4CLNT_DELEGRETURN is set so that the state manager can resume any outstanding delegation returns after it has dealt with the state recovery situation. Finally, ensure that the state manager doesn't wait for the DELEGRETURN call to complete. It doesn't need to, and that too can cause a deadlock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 08:10:17 -05:00
J. Bruce Fields	d327cf7449	Re: acl trouble after upgrading ubuntu Subject: [PATCH] nfs: fix acl decoding Commit `28f566942c` "NFS: use dynamically computed compound_hdr.replen for xdr_inline_pages offset" accidentally changed the amount of space to allow for the acl reply, resulting in an IO error on attempts to get an acl. Reported-by: Paul Rudin <paul@rudin.co.uk> Cc: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 08:10:17 -05:00
Trond Myklebust	96f287b0cf	NFS: BKL removal from the mount code... None of the code in nfs_umount_begin() or nfs_remount() has any BKL dependency. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 08:09:56 -05:00
Wu Fengguang	b17621fed6	writeback: introduce wbc.for_background It will lower the flush priority for NFS, and maybe more in future. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-12-03 13:54:25 +01:00
Linus Torvalds	6e80133f7f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-fscache * git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-fscache: (31 commits) FS-Cache: Provide nop fscache_stat_d() if CONFIG_FSCACHE_STATS=n SLOW_WORK: Fix GFS2 to #include <linux/module.h> before using THIS_MODULE SLOW_WORK: Fix CIFS to pass THIS_MODULE to slow_work_register_user() CacheFiles: Don't log lookup/create failing with ENOBUFS CacheFiles: Catch an overly long wait for an old active object CacheFiles: Better showing of debugging information in active object problems CacheFiles: Mark parent directory locks as I_MUTEX_PARENT to keep lockdep happy CacheFiles: Handle truncate unlocking the page we're reading CacheFiles: Don't write a full page if there's only a partial page to cache FS-Cache: Actually requeue an object when requested FS-Cache: Start processing an object's operations on that object's death FS-Cache: Make sure FSCACHE_COOKIE_LOOKING_UP cleared on lookup failure FS-Cache: Add a retirement stat counter FS-Cache: Handle pages pending storage that get evicted under OOM conditions FS-Cache: Handle read request vs lookup, creation or other cache failure FS-Cache: Don't delete pending pages from the page-store tracking tree FS-Cache: Fix lock misorder in fscache_write_op() FS-Cache: The object-available state can't rely on the cookie to be available FS-Cache: Permit cache retrieval ops to be interrupted in the initial wait phase FS-Cache: Use radix tree preload correctly in tracking of pages to be stored ...	2009-11-30 13:33:48 -08:00
J. Bruce Fields	9b8b317d58	Merge commit 'v2.6.32-rc8' into HEAD	2009-11-23 12:34:58 -05:00
David Howells	201a15428b	FS-Cache: Handle pages pending storage that get evicted under OOM conditions Handle netfs pages that the vmscan algorithm wants to evict from the pagecache under OOM conditions, but that are waiting for write to the cache. Under these conditions, vmscan calls the releasepage() function of the netfs, asking if a page can be discarded. The problem is typified by the following trace of a stuck process: kslowd005 D 0000000000000000 0 4253 2 0x00000080 ffff88001b14f370 0000000000000046 ffff880020d0d000 0000000000000007 0000000000000006 0000000000000001 ffff88001b14ffd8 ffff880020d0d2a8 000000000000ddf0 00000000000118c0 00000000000118c0 ffff880020d0d2a8 Call Trace: [<ffffffffa00782d8>] __fscache_wait_on_page_write+0x8b/0xa7 [fscache] [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34 [<ffffffffa0078240>] ? __fscache_check_page_write+0x63/0x70 [fscache] [<ffffffffa00b671d>] nfs_fscache_release_page+0x4e/0xc4 [nfs] [<ffffffffa00927f0>] nfs_release_page+0x3c/0x41 [nfs] [<ffffffff810885d3>] try_to_release_page+0x32/0x3b [<ffffffff81093203>] shrink_page_list+0x316/0x4ac [<ffffffff8109372b>] shrink_inactive_list+0x392/0x67c [<ffffffff813532fa>] ? __mutex_unlock_slowpath+0x100/0x10b [<ffffffff81058df0>] ? trace_hardirqs_on_caller+0x10c/0x130 [<ffffffff8135330e>] ? mutex_unlock+0x9/0xb [<ffffffff81093aa2>] shrink_list+0x8d/0x8f [<ffffffff81093d1c>] shrink_zone+0x278/0x33c [<ffffffff81052d6c>] ? ktime_get_ts+0xad/0xba [<ffffffff81094b13>] try_to_free_pages+0x22e/0x392 [<ffffffff81091e24>] ? isolate_pages_global+0x0/0x212 [<ffffffff8108e743>] __alloc_pages_nodemask+0x3dc/0x5cf [<ffffffff81089529>] grab_cache_page_write_begin+0x65/0xaa [<ffffffff8110f8c0>] ext3_write_begin+0x78/0x1eb [<ffffffff81089ec5>] generic_file_buffered_write+0x109/0x28c [<ffffffff8103cb69>] ? current_fs_time+0x22/0x29 [<ffffffff8108a509>] __generic_file_aio_write+0x350/0x385 [<ffffffff8108a588>] ? generic_file_aio_write+0x4a/0xae [<ffffffff8108a59e>] generic_file_aio_write+0x60/0xae [<ffffffff810b2e82>] do_sync_write+0xe3/0x120 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34 [<ffffffff810b18e1>] ? __dentry_open+0x1a5/0x2b8 [<ffffffff810b1a76>] ? dentry_open+0x82/0x89 [<ffffffffa00e693c>] cachefiles_write_page+0x298/0x335 [cachefiles] [<ffffffffa0077147>] fscache_write_op+0x178/0x2c2 [fscache] [<ffffffffa0075656>] fscache_op_execute+0x7a/0xd1 [fscache] [<ffffffff81082093>] slow_work_execute+0x18f/0x2d1 [<ffffffff8108239a>] slow_work_thread+0x1c5/0x308 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34 [<ffffffff810821d5>] ? slow_work_thread+0x0/0x308 [<ffffffff8104be91>] kthread+0x7a/0x82 [<ffffffff8100beda>] child_rip+0xa/0x20 [<ffffffff8100b87c>] ? restore_args+0x0/0x30 [<ffffffff8102ef83>] ? tg_shares_up+0x171/0x227 [<ffffffff8104be17>] ? kthread+0x0/0x82 [<ffffffff8100bed0>] ? child_rip+0x0/0x20 In the above backtrace, the following is happening: (1) A page storage operation is being executed by a slow-work thread (fscache_write_op()). (2) FS-Cache farms the operation out to the cache to perform (cachefiles_write_page()). (3) CacheFiles is then calling Ext3 to perform the actual write, using Ext3's standard write (do_sync_write()) under KERNEL_DS directly from the netfs page. (4) However, for Ext3 to perform the write, it must allocate some memory, in particular, it must allocate at least one page cache page into which it can copy the data from the netfs page. (5) Under OOM conditions, the memory allocator can't immediately come up with a page, so it uses vmscan to find something to discard (try_to_free_pages()). (6) vmscan finds a clean netfs page it might be able to discard (possibly the one it's trying to write out). (7) The netfs is called to throw the page away (nfs_release_page()) - but it's called with __GFP_WAIT, so the netfs decides to wait for the store to complete (__fscache_wait_on_page_write()). (8) This blocks a slow-work processing thread - possibly against itself. The system ends up stuck because it can't write out any netfs pages to the cache without allocating more memory. To avoid this, we make FS-Cache cancel some writes that aren't in the middle of actually being performed. This means that some data won't make it into the cache this time. To support this, a new FS-Cache function is added fscache_maybe_release_page() that replaces what the netfs releasepage() functions used to do with respect to the cache. The decisions fscache_maybe_release_page() makes are counted and displayed through /proc/fs/fscache/stats on a line labelled "VmScan". There are four counters provided: "nos=N" - pages that weren't pending storage; "gon=N" - pages that were pending storage when we first looked, but weren't by the time we got the object lock; "bsy=N" - pages that we ignored as they were actively being written when we looked; and "can=N" - pages that we cancelled the storage of. What I'd really like to do is alter the behaviour of the cancellation heuristics, depending on how necessary it is to expel pages. If there are plenty of other pages that aren't waiting to be written to the cache that could be ejected first, then it would be nice to hold up on immediate cancellation of cache writes - but I don't see a way of doing that. Signed-off-by: David Howells <dhowells@redhat.com>	2009-11-19 18:11:35 +00:00
Eric W. Biederman	6d4561110a	sysctl: Drop & in front of every proc_handler. For consistency drop & in front of every proc_handler. Explicity taking the address is unnecessary and it prevents optimizations like stubbing the proc_handlers to NULL. Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-18 08:37:40 -08:00
Eric W. Biederman	ab09203e30	sysctl fs: Remove dead binary sysctl support Now that sys_sysctl is a generic wrapper around /proc/sys .ctl_name and .strategy members of sysctl tables are dead code. Remove them. Cc: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-12 02:04:55 -08:00
Trond Myklebust	96d25e5322	NFSv4: Fix a cache validation bug which causes getcwd() to return ENOENT Changeset `a65318bf3a` (NFSv4: Simplify some cache consistency post-op GETATTRs) incorrectly changed the getattr bitmap for readdir(). This causes the readdir() function to fail to return a fileid/inode number, which again exposed a bug in the NFS readdir code that causes spurious ENOENT errors to appear in applications (see http://bugzilla.kernel.org/show_bug.cgi?id=14541). The immediate band aid is to revert the incorrect bitmap change, but more long term, we should change the NFS readdir code to cope with the fact that NFSv4 servers are not required to support fileids/inode numbers. Reported-by: Daniel J Blueman <daniel.blueman@gmail.com> Cc: stable@kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-11-11 16:15:42 +09:00
J. Bruce Fields	dc7a08166f	nfs: new subdir Documentation/filesystems/nfs We're adding enough nfs documentation that it may as well have its own subdirectory. Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-10-27 19:34:04 -04:00
Trond Myklebust	9a3936aac1	NFSv4: The link() operation should return any delegation on the file Otherwise, we have to wait for the server to recall it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-10-26 08:09:46 -04:00
Trond Myklebust	141aeb9f26	NFSv4: Fix two unbalanced put_rpccred() issues. Commits `29fba38b` (nfs41: lease renewal) and `fc01cea9` (nfs41: sequence operation) introduce a couple of put_rpccred() calls on credentials for which there is no corresponding get_rpccred(). See http://bugzilla.kernel.org/show_bug.cgi?id=14249 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-10-26 08:09:46 -04:00
Trond Myklebust	52567b03ca	NFSv4: Fix a bug when the server returns NFS4ERR_RESOURCE RFC 3530 states that when we recieve the error NFS4ERR_RESOURCE, we are not supposed to bump the sequence number on OPEN, LOCK, LOCKU, CLOSE, etc operations. The problem is that we map that error into EREMOTEIO in the XDR layer, and so the NFSv4 middle-layer routines like seqid_mutating_err(), and nfs_increment_seqid() don't recognise it. The fix is to defer the mapping until after the middle layers have processed the error. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-10-23 14:46:42 -04:00
Terry Loftin	a8b40bc7e6	nfs: Panic when commit fails Actually pass the NFS_FILE_SYNC option to the server to avoid a Panic in nfs_direct_write_complete() when a commit fails. At the end of an nfs write, if the nfs commit fails, all the writes will be rescheduled. They are supposed to be rescheduled as NFS_FILE_SYNC writes, but the rpc_task structure is not completely intialized and so the option is not passed. When the rescheduled writes complete, the return indicates that they are NFS_UNSTABLE and we try to do another commit. This leads to a Panic because the commit data structure pointer was set to null in the initial (failed) commit attempt. Signed-off-by: Terry Loftin <terry.loftin@hp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-10-23 14:16:30 -04:00
Yinghai Lu	4223a4a155	nfs: Fix nfs_parse_mount_options() kfree() leak Fix a (small) memory leak in one of the error paths of the NFS mount options parsing code. Regression introduced in 2.6.30 by commit `a67d18f` (NFS: load the rpc/rdma transport module automatically). Reported-by: Yinghai Lu <yinghai@kernel.org> Reported-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-22 08:15:23 +09:00
Stefan Richter	a1be9eee29	NFS: suppress a build warning struct sockaddr_storage * can safely be used as struct sockaddr *. Suppress an "incompatible pointer type" warning. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-12 10:25:12 -07:00
Trond Myklebust	3050141bae	NFSv4: Kill nfs4_renewd_prepare_shutdown() The NFSv4 renew daemon is shared between all active super blocks that refer to a particular NFS server, so it is wrong to be shutting it down in nfs4_kill_super every time a super block is destroyed. This patch therefore kills nfs4_renewd_prepare_shutdown altogether, and leaves it up to nfs4_shutdown_client() to also shut down the renew daemon by means of the existing call to nfs4_kill_renewd(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-10-08 11:50:55 -04:00
Trond Myklebust	517be09def	NFSv4: Fix the referral mount code Fix a typo which causes try_location() to use the wrong length argument when calling nfs_parse_server_name(). This again, causes the initialisation of the mount's sockaddr structure to fail. Also ensure that if nfs4_pathname_string() returns an error, then we pass that error back up the stack instead of ENOENT. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-10-06 15:42:20 -04:00
Ben Hutchings	f4373bf9e6	nfs: Avoid overrun when copying client IP address string As seen in <http://bugs.debian.org/549002>, nfs4_init_client() can overrun the source string when copying the client IP address from nfs_parsed_mount_data::client_address to nfs_client::cl_ipaddr. Since these are both treated as null-terminated strings elsewhere, the copy should be done with strlcpy() not memcpy(). Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-10-06 15:42:18 -04:00
Trond Myklebust	bcd2ea17da	NFS: Fix port initialisation in nfs_remount() The recent changeset `53a0b9c4c9` (NFS: Replace nfs_parse_ip_address() with rpc_pton()) broke nfs_remount, since the call to rpc_pton() will zero out the port number in data->nfs_server.address. This is actually due to a bug in nfs_remount: it should be looking at the port number in nfs_server.port instead... This fixes bug http://bugzilla.kernel.org/show_bug.cgi?id=14276 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-10-06 15:41:22 -04:00
Trond Myklebust	f5855fecda	NFS: Fix port and mountport display in /proc/self/mountinfo Currently, the port and mount port will both display as 65535 if you do not specify a port number. That would be wrong... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-10-06 15:40:37 -04:00
Trond Myklebust	c5811dbdd2	NFS: Fix a default mount regression... With the recent spate of changes, the nfs protocol version will now default to 2 instead of 3, while the mount protocol version defaults to 3. The following patch should ensure the defaults are consistent with the previous defaults of vers=3,proto=tcp,mountvers=3,mountproto=tcp. This fixes the bug http://bugzilla.kernel.org/show_bug.cgi?id=14259 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-10-06 15:40:15 -04:00
Christoph Lameter	fce22848a1	this_cpu: Use this_cpu operations for NFS statistics Simplify NFS statistics and allow the use of optimized arch instructions. Acked-by: Tejun Heo <tj@kernel.org> CC: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2009-10-03 19:48:22 +09:00
Alexey Dobriyan	f0f37e2f77	const: mark struct vm_struct_operations * mark struct vm_area_struct::vm_ops as const * mark vm_ops in AGP code But leave TTM code alone, something is fishy there with global vm_ops being used. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-27 11:39:25 -07:00
Al Viro	36dd2fdb37	nfs[23] tcp breakage in mount with binary options We forget to set nfs_server.protocol in tcp case when old-style binary options are passed to mount. The thing remains zero and never validated afterwards. As the result, we hit BUG in fs/nfs/client.c:588. Breakage has been introduced in NFS: Add nfs_alloc_parsed_mount_data merged yesterday... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-09-24 14:58:42 -04:00
Linus Torvalds	6c5daf012c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: truncate: use new helpers truncate: new helpers fs: fix overflow in sys_mount() for in-kernel calls fs: Make unload_nls() NULL pointer safe freeze_bdev: grab active reference to frozen superblocks freeze_bdev: kill bd_mount_sem exofs: remove BKL from super operations fs/romfs: correct error-handling code vfs: seq_file: add helpers for data filling vfs: remove redundant position check in do_sendfile vfs: change sb->s_maxbytes to a loff_t vfs: explicitly cast s_maxbytes in fiemap_check_ranges libfs: return error code on failed attr set seq_file: return a negative error code when seq_path_root() fails. vfs: optimize touch_time() too vfs: optimization for touch_atime() vfs: split generic_forget_inode() so that hugetlbfs does not have to copy it fs/inode.c: add dev-id and inode number for debugging in init_special_inode() libfs: make simple_read_from_buffer conventional	2009-09-24 08:32:11 -07:00
Linus Torvalds	db16826367	Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 * 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: (21 commits) HWPOISON: Enable error_remove_page on btrfs HWPOISON: Add simple debugfs interface to inject hwpoison on arbitary PFNs HWPOISON: Add madvise() based injector for hardware poisoned pages v4 HWPOISON: Enable error_remove_page for NFS HWPOISON: Enable .remove_error_page for migration aware file systems HWPOISON: The high level memory error handler in the VM v7 HWPOISON: Add PR_MCE_KILL prctl to control early kill behaviour per process HWPOISON: shmem: call set_page_dirty() with locked page HWPOISON: Define a new error_remove_page address space op for async truncation HWPOISON: Add invalidate_inode_page HWPOISON: Refactor truncate to allow direct truncating of page v2 HWPOISON: check and isolate corrupted free pages v2 HWPOISON: Handle hardware poisoned pages in try_to_unmap HWPOISON: Use bitmask/action code for try_to_unmap behaviour HWPOISON: x86: Add VM_FAULT_HWPOISON handling to x86 page fault handler v2 HWPOISON: Add poison check to page fault handling HWPOISON: Add basic support for poisoned pages in fault handler v3 HWPOISON: Add new SIGBUS error codes for hardware poison signals HWPOISON: Add support for poison swap entries v2 HWPOISON: Export some rmap vma locking to outside world ...	2009-09-24 07:53:22 -07:00
npiggin@suse.de	c08d3b0e33	truncate: use new helpers Update some fs code to make use of new helper functions introduced in the previous patch. Should be no significant change in behaviour (except CIFS now calls send_sig under i_lock, via inode_newsize_ok). Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Miklos Szeredi <miklos@szeredi.hu> Cc: linux-nfs@vger.kernel.org Cc: Trond.Myklebust@netapp.com Cc: linux-cifs-client@lists.samba.org Cc: sfrench@samba.org Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-09-24 08:41:47 -04:00
Alexey Dobriyan	2bcd57ab61	headers: utsname.h redux * remove asm/atomic.h inclusion from linux/utsname.h -- not needed after kref conversion * remove linux/utsname.h inclusion from files which do not need it NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however due to some personality stuff it _is_ needed -- cowardly leave ELF-related headers and files alone. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 18:13:10 -07:00
David Howells	2df5480638	NFS: Propagate 'fsc' mount option through automounts Propagate the NFS 'fsc' mount option through NFS automounts of various types. This is now required as commit: commit `c02d7adf8c` Author: Trond Myklebust <Trond.Myklebust@netapp.com> Date: Mon Jun 22 15:09:14 2009 -0400 NFSv4: Replace nfs4_path_walk() with VFS path lookup in a private namespace uses VFS-driven automounting to reach all submounts barring the root, thus preventing fscaching from being enabled on any submount other than the root. This patch gets around that by propagating the NFS_OPTION_FSCACHE flag across automounts. If a uniquifier is supplied to a mount then this is propagated to all automounts of that mount too. Signed-off-by: David Howells <dhowells@redhat.com> [Trond: Fixed up the definition of nfs_fscache_get_super_cookie for the case of #undef CONFIG_NFS_FSCACHE] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-09-23 14:36:39 -04:00
Chuck Lever	9423a08ad5	NFS: Add nfs_alloc_parsed_mount_data Allocating nfs_parsed_mount_data and setting up the defaults is nearly the same for both nfs and nfs4 mounts. Both paths seem to use nfs_validate_transport_protocol(), so setting a default value for nfs_server.protocol ought to be unnecessary. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-09-23 14:36:38 -04:00
Trond Myklebust	8a6e5deb8a	NFS: Get rid of the NFS_MOUNT_VER3 and NFS_MOUNT_TCP flags Keep it in the case of the legacy binary mount interface, but purge it from the nfs_server structure. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-09-23 14:36:37 -04:00
James Morris	88e9d34c72	seq_file: constify seq_operations Make all seq_operations structs const, to help mitigate against revectoring user-triggerable function pointers. This is derived from the grsecurity patch, although generated from scratch because it's simpler than extracting the changes from there. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 07:39:29 -07:00
Linus Torvalds	342ff1a1b5	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits) trivial: fix typo in aic7xxx comment trivial: fix comment typo in drivers/ata/pata_hpt37x.c trivial: typo in kernel-parameters.txt trivial: fix typo in tracing documentation trivial: add __init/__exit macros in drivers/gpio/bt8xxgpio.c trivial: add __init macro/ fix of __exit macro location in ipmi_poweroff.c trivial: remove unnecessary semicolons trivial: Fix duplicated word "options" in comment trivial: kbuild: remove extraneous blank line after declaration of usage() trivial: improve help text for mm debug config options trivial: doc: hpfall: accept disk device to unload as argument trivial: doc: hpfall: reduce risk that hpfall can do harm trivial: SubmittingPatches: Fix reference to renumbered step trivial: fix typos "man[ae]g?ment" -> "management" trivial: media/video/cx88: add __init/__exit macros to cx88 drivers trivial: fix typo in CONFIG_DEBUG_FS in gcov doc trivial: fix missing printk space in amd_k7_smp_check trivial: fix typo s/ketymap/keymap/ in comment trivial: fix typo "to to" in multiple files trivial: fix typos in comments s/DGBU/DBGU/ ...	2009-09-22 07:51:45 -07:00
Alexey Dobriyan	6aed62853c	const: make file_lock_operations const Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-22 07:17:25 -07:00
Jens Axboe	48d0764998	nfs: initialize the backing_dev_info when creating the server NFS may free the server structure without ever having used the bdi, so we either need to flag the bdi as being uninitialized or initialize it up front. This does the latter. This fixes a crash with mounting more than one NFS file system, should people ever need that kind of obscure NFS functionality. Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-21 15:40:33 +02:00
Jens Axboe	92f25053c0	nfs: nfs_kill_super() should call bdi_unregister() after killing super Otherwise we could be attempting to flush data for a writeback thread and bdi that have already disappeared. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-21 15:40:32 +02:00
Joe Perches	a419aef8b8	trivial: remove unnecessary semicolons Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-09-21 15:14:58 +02:00
Jens Axboe	32a88aa1b6	fs: Assign bdi in super_block We do this automatically in get_sb_bdev() from the set_bdev_super() callback. Filesystems that have their own private backing_dev_info must assign that in ->fill_super(). Note that ->s_bdi assignment is required for proper writeback! Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-16 15:18:51 +02:00
Jens Axboe	1fe06ad892	writeback: get rid of wbc->for_writepages It's only set, it's never checked. Kill it. Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-16 15:16:18 +02:00
Andi Kleen	f590f333fb	HWPOISON: Enable error_remove_page for NFS Enable hardware memory error handling for NFS Truncation of data pages at runtime should be safe in NFS, even when it doesn't support migration so far. Trond tells me migration is also queued up for 2.6.32. Acked-by: Trond.Myklebust@netapp.com Signed-off-by: Andi Kleen <ak@linux.intel.com>	2009-09-16 11:50:17 +02:00
Trond Myklebust	ab3bbaa8b2	Merge branch 'nfs-for-2.6.32'	2009-09-11 14:59:37 -04:00
Jens Axboe	d993831fa7	writeback: add name to backing_dev_info This enables us to track who does what and print info. Its main use is catching dirty inodes on the default_backing_dev_info, so we can fix that up. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 09:20:26 +02:00
Trond Myklebust	2ecda72b49	NFSv4: Disallow 'mount -t nfs4 -overs=2' and 'mount -t nfs4 -overs=3' Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-09-08 19:50:07 -04:00
Chuck Lever	764302ccb8	NFS: Allow the "nfs" file system type to support NFSv4 When mounting an "nfs" type file system, recognize "v4," "vers=4," or "nfsvers=4" mount options, and convert the file system to "nfs4" under the covers. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [trondmy: fixed up binary mount code so it sets the 'version' field too] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-09-08 19:50:03 -04:00
Chuck Lever	a6fe23be90	NFS: Move details of nfs4_get_sb() to a helper Clean up: Refactor nfs4_get_sb() to allow its guts to be invoked by nfs_get_sb(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-09-08 19:50:00 -04:00
Chuck Lever	7630c852e1	NFS: Refactor NFSv4 text-based mount option validation Clean up: Refactor the part of nfs4_validate_mount_options() that handles text-based options, so we can call it from the NFSv2/v3 option validation function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-09-08 19:49:57 -04:00
Chuck Lever	4cfd74fc99	NFS: Mount option parser should detect missing "port=" The meaning of not specifying the "port=" mount option is different for "-t nfs" and "-t nfs4" mounts. The default port value for NFSv2/v3 mounts is 0, but the default for NFSv4 mounts is 2049. To support "-t nfs -o vers=4", the mount option parser must detect when "port=" is missing so that the correct default port value can be set depending on which NFS version is requested. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-09-08 19:49:47 -04:00
Harshula Jayasuriya	dbab8360ed	NFS: out of date comment regarding O_EXCL above nfs3_proc_create() Hi Trond, Recently we were observing the behaviour difference between a 2.4.x and 2.6.x kernel with respect to O_EXCL. A comment from 2.4.x era, "For now, we don't implement O_EXCL." seems inaccurate in TOT. If so, here's a patch to remove the comment. This patch is against: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Signed-off-by: Harshula Jayasuriya <harshula@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-09-08 19:49:33 -04:00
Trond Myklebust	7111dc7392	NFSv4: Fix an infinite looping problem with the nfs4_state_manager Commit `76db6d9500` (nfs41: add session setup to the state manager) introduces an infinite loop possibility in the NFSv4 state manager. By first checking nfs4_has_session() before clearing the NFS4CLNT_SESSION_SETUP flag, it allows for a situation where someone sets that flag, but it never gets cleared, and so the state manager loops. In fact commit `c3fad1b1aa` (nfs41: add session reset to state manager) causes this to happen every time we get a network partition error. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-24 16:28:42 -07:00
Chuck Lever	5eecfde615	NFS: Handle a zero-length auth flavor list Some releases of Linux rpc.mountd (nfs-utils 1.1.4 and later) return an empty auth flavor list if no sec= was specified for the export. This is notably broken server behavior. The new auth flavor list checking added in a recent commit rejects this case. The OpenSolaris client does too. The broken mountd implementation is already widely deployed. To avoid a behavioral regression, the kernel's mount client skips flavor checking (ie reverts to the pre-2.6.32 behavior) if mountd returns an empty flavor list. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-23 23:43:57 -04:00
Jan Kara	e1af88a1ad	nfs: Remove reference to generic_osync_inode from a comment generic_file_direct_write() no longer calls generic_osync_inode() so remove the comment. CC: linux-nfs@vger.kernel.org CC: Neil Brown <neilb@suse.de> CC: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-19 19:48:08 -04:00
Trond Myklebust	7d7ea88289	NFS: Use the DNS resolver in the mount code. In the referral code, use it to look up the new server's ip address if the fs_locations attribute contains a hostname. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-19 18:22:15 -04:00
Trond Myklebust	e571cbf1a4	NFS: Add a dns resolver for use with NFSv4 referrals and migration The NFSv4 and NFSv4.1 protocols both allow for the redirection of a client from one server to another in order to support filesystem migration and replication. For full protocol support, we need to add the ability to convert a DNS host name into an IP address that we can feed to the RPC client. We'll reuse the sunrpc cache, now that it has been converted to work with rpc_pipefs. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-19 18:22:15 -04:00
Trond Myklebust	6a396f67d2	Merge branch 'nfsv4_xdr_cleanups-for-2.6.32' into nfs-for-2.6.32 Conflicts: fs/nfs/nfs4xdr.c	2009-08-19 18:21:52 -04:00
Benny Halevy	cccddf4f55	nfs: nfs4xdr: optimize low level decoding do not increment decoding ptr if not needed. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 14:02:26 -04:00
Benny Halevy	c0eae66ece	nfs: nfs4xdr: get rid of READ_BUF Use xdr_inline_decode instead. Open code debug printout and error return. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 14:02:23 -04:00
Benny Halevy	2460ba57c4	nfs: nfs4xdr: simplify decode_exchange_id by reusing decode_opaque_inline Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 14:02:20 -04:00
Benny Halevy	99398d0655	nfs: nfs4xdr: get rid of COPYMEM Just directly call memcpy. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 14:02:17 -04:00
Benny Halevy	e78291e4e0	nfs: nfs4xdr: introduce decode_sessionid helper Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 14:02:14 -04:00
Benny Halevy	db942bbd09	nfs: nfs4xdr: introduce decode_verifier helper Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Trond: Fixed up an 'uninitialised variable' issue in decode_readdir] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:57:58 -04:00
Benny Halevy	07d30434cf	nfs: nfs4xdr: introduce decode_opaque_fixed and decode_stateid helpers Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:26:27 -04:00
Benny Halevy	686841b3cc	nfs: nfs4xdr: introduce print_overflow_msg Part fo the nfs4xdr cleanup. READ_BUF will go away. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:24:38 -04:00
Benny Halevy	c816fd3406	nfs: nfs4xdr: get rid of READTIME It has no users. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:24:32 -04:00
Benny Halevy	3ceb4dbb99	nfs: nfs4xdr: get rid of READ64 s/READ64\(\(.)\)/p = xdr_decode_hyper(p, \1)/ s/READ64\((.*)\)/p = xdr_decode_hyper(p, &\1)/ Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:24:13 -04:00
Benny Halevy	6f723f7710	nfs: nfs4xdr: get rid of READ32 s/READ32\((.*)\)/\1 = be32_to_cpup(p++)/ Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:23:58 -04:00
Benny Halevy	811652bd6e	nfs: nfs4xdr: merge xdr_encode_int+xdr_encode_opaque_fixed into xdr_encode_opaque use encode_string where appropriate. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:19:24 -04:00
Benny Halevy	345585132a	nfs: nfs4xdr: optimize low level encoding do not increment encoding ptr if not needed. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:18:03 -04:00
Benny Halevy	13c65ce900	nfs: nfs4xdr: change RESERVE_SPACE macro into a static helper In order to open code and expose the result pointer assignment. Alternatively, we can open code the call to xdr_reserve_space and do the BUG_ON an the error case at the call site. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:17:17 -04:00
Benny Halevy	2220f13a8b	nfs: nfs4xdr: encode_compound_hdr does not have to round up reserved bytes This is already done by xdr_reserve_space and since encode_compound_hdr is adding a byte count to "12" which is already word aligned, the xdr level rounding will work just as well. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:16:00 -04:00
Benny Halevy	42edd69812	nfs: nfs4xdr: optimize RESERVE_SPACE in encode_create_session and encode_sequence Coalesce multilpe constant RESERVE_SPACEs into one Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:15:20 -04:00
Benny Halevy	93f0cf2594	nfs: nfs4xdr: get rid of WRITEMEM s/WRITEMEM(/p = xdr_encode_opaque_fixed(p, / Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:13:58 -04:00
Benny Halevy	b95be5a976	nfs: nfs4xdr: get rid of WRITE64 s/WRITE64/p = xdr_encode_hyper(p, / Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:13:15 -04:00
Benny Halevy	e75bc1c89e	nfs: nfs4xdr: get rid of WRITE32 s/WRITE32/*p++ = cpu_to_be32/ Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:12:55 -04:00
Trond Myklebust	1ae88b2e44	NFS: Fix an O_DIRECT Oops... We can't call nfs_readdata_release()/nfs_writedata_release() without first initialising and referencing args.context. Doing so inside nfs_direct_read_schedule_segment()/nfs_direct_write_schedule_segment() causes an Oops. We should rather be calling nfs_readdata_free()/nfs_writedata_free() in those cases. Looking at the O_DIRECT code, the "struct nfs_direct_req" is already referencing the nfs_open_context for us. Since the readdata and writedata structures carry a reference to that, we can simplify things by getting rid of the extra nfs_open_context references, so that we can replace all instances of nfs_readdata_release()/nfs_writedata_release(). Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-12 08:21:39 -07:00
Trond Myklebust	f884dcaead	Merge branch 'sunrpc_cache-for-2.6.32' into nfs-for-2.6.32	2009-08-10 17:45:58 -04:00
Trond Myklebust	976a6f921c	Merge branch 'patches_cel-for-2.6.32' into nfs-for-2.6.32	2009-08-10 17:45:50 -04:00
Bartlomiej Zolnierkiewicz	e576e05a73	nfs: remove superfluous BUG_ON()s Subject: [PATCH] nfs: remove superfluous BUG_ON()s Remove duplicated BUG_ON()s from nfs[4]_create_server() (we make the same checks earlier in both functions). This takes care of the following entries from Dan's list: fs/nfs/client.c +1078 nfs_create_server(47) warning: variable derefenced before check 'server->nfs_client' fs/nfs/client.c +1079 nfs_create_server(48) warning: variable derefenced before check 'server->nfs_client->rpc_ops' fs/nfs/client.c +1363 nfs4_create_server(43) warning: variable derefenced before check 'server->nfs_client' fs/nfs/client.c +1364 nfs4_create_server(44) warning: variable derefenced before check 'server->nfs_ Reported-by: Dan Carpenter <error27@gmail.com> Cc: corbet@lwn.net Cc: eteo@redhat.com Cc: Julia Lawall <julia@diku.dk> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-10 08:54:16 -04:00
Peter Staubach	38c73044f5	NFS: read-modify-write page updating Hi. I have a proposal for possibly resolving this issue. I believe that this situation occurs due to the way that the Linux NFS client handles writes which modify partial pages. The Linux NFS client handles partial page modifications by allocating a page from the page cache, copying the data from the user level into the page, and then keeping track of the offset and length of the modified portions of the page. The page is not marked as up to date because there are portions of the page which do not contain valid file contents. When a read call comes in for a portion of the page, the contents of the page must be read in the from the server. However, since the page may already contain some modified data, that modified data must be written to the server before the file contents can be read back in the from server. And, since the writing and reading can not be done atomically, the data must be written and committed to stable storage on the server for safety purposes. This means either a FILE_SYNC WRITE or a UNSTABLE WRITE followed by a COMMIT. This has been discussed at length previously. This algorithm could be described as modify-write-read. It is most efficient when the application only updates pages and does not read them. My proposed solution is to add a heuristic to decide whether to do this modify-write-read algorithm or switch to a read- modify-write algorithm when initially allocating the page in the write system call path. The heuristic uses the modes that the file was opened with, the offset in the page to read from, and the size of the region to read. If the file was opened for reading in addition to writing and the page would not be filled completely with data from the user level, then read in the old contents of the page and mark it as Uptodate before copying in the new data. If the page would be completely filled with data from the user level, then there would be no reason to read in the old contents because they would just be copied over. This would optimize for applications which randomly access and update portions of files. The linkage editor for the C compiler is an example of such a thing. I tested the attached patch by using rpmbuild to build the current Fedora rawhide kernel. The kernel without the patch generated about 269,500 WRITE requests. The modified kernel containing the patch generated about 261,000 WRITE requests. Thus, about 8,500 fewer WRITE requests were generated. I suspect that many of these additional WRITE requests were probably FILE_SYNC requests to WRITE a single page, but I didn't test this theory. The difference between this patch and the previous one was to remove the unneeded PageDirty() test. I then retested to ensure that the resulting system continued to behave as desired. Thanx... ps Signed-off-by: Peter Staubach <staubach@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-10 08:54:16 -04:00
Trond Myklebust	074cc1deec	NFS: Add a ->migratepage() aop for NFS Make NFS a bit more friendly to NUMA and memory hot removal... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-10 08:54:13 -04:00
Trond Myklebust	7d217caca5	SUNRPC: Replace rpc_client->cl_dentry and cl_mnt, with a cl_path Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:24 -04:00
Trond Myklebust	b693ba4a33	SUNRPC: Constify rpc_pipe_ops... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:15 -04:00
Chuck Lever	ec6ee61250	NFS: Replace nfs_set_port() with rpc_set_port() Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:37 -04:00
Chuck Lever	53a0b9c4c9	NFS: Replace nfs_parse_ip_address() with rpc_pton() Clean up: Use the common routine now provided in sunrpc.ko for parsing mount addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:36 -04:00
Chuck Lever	a02d692611	SUNRPC: Provide functions for managing universal addresses Introduce a set of functions in the kernel's RPC implementation for converting between a socket address and either a standard presentation address string or an RPC universal address. The universal address functions will be used to encode and decode RPCB_FOO and NFSv4 SETCLIENTID arguments. The other functions are part of a previous promise to deliver shared functions that can be used by upper-layer protocols to display and manipulate IP addresses. The kernel's current address printf formatters were designed specifically for kernel to user-space APIs that require a particular string format for socket addresses, thus are somewhat limited for the purposes of sunrpc.ko. The formatter for IPv6 addresses, %pI6, does not support short-handing or scope IDs. Also, these printf formatters are unique per address family, so a separate formatter string is required for printing AF_INET and AF_INET6 addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:34 -04:00
Chuck Lever	ec88f28d1e	NFS: Use the authentication flavor list returned by mountd Commit `a14017db` added support in the kernel's NFS mount client to decode the authentication flavor list returned by mountd. The NFS client can now use this list to determine whether the authentication flavor requested by the user is actually supported by the server. Note we don't actually negotiate the security flavor if none was specified by the user. Instead, we try to use AUTH_SYS, and fail if the server does not support it. This prevents us from negotiating an inappropriate security flavor (some servers list AUTH_NULL first). If the server does not support AUTH_SYS, the user must provide an appropriate security flavor by specifying the "sec=" mount option. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:32 -04:00
Chuck Lever	059f90b323	NFS: Fix auth flavor len accounting Previous logic in the NFS mount parsing code path assumed auth_flavor_len was set to zero for simple authentication flavors (like AUTH_UNIX), and 1 for compound flavors (like AUTH_GSS). At some earlier point (maybe even before the option parsers were merged?) specific checks for auth_flavor_len being zero were removed from the functions that validate the mount option that sets the mount point's authentication flavor. Since we are populating an array for authentication flavors, the auth_flavor_len should always be set to the number of flavors. Let's eliminate some cleverness here, and prepare for new logic that needs to know the number of flavors in the auth_flavors[] array. (auth_flavors[] is an array because at some point we want to allow a list of acceptable authentication flavors to be specified via the sec= mount option. For now it remains a single element array). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:31 -04:00
Chuck Lever	0b524123c9	NFS: Add ability to send MOUNTPROC_UMNT to the kernel's mountd client After certain failure modes of an NFS mount, an NFS client should send a MOUNTPROC_UMNT request to remove the just-added mount entry from the server's mount table. While no-one should rely on the accuracy of the server's mount table, sending a UMNT is simply being a good internet neighbor. Since NFS mount processing is handled in the kernel now, we will need a function in the kernel's mountd client that can post a MOUNTRPC_UMNT request, in order to handle these failure modes. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:30 -04:00
Chuck Lever	f3f4f4ed26	NFS: Fix up new minorversion= option The new minorversion= mount option (commit `3fd5be9e`) was merged at the same time as the recent sloppy parser fixes (commit `a5a16bae`), so minorversion= still uses the old value parsing logic. If the minorversion= option specifies a bogus value, it should fail with "bad value" not "bad option." Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:29 -04:00
Trond Myklebust	c140aa9135	NFSv4: Clean up the nfs.callback_tcpport option Tighten up the validity checking in param_set_port: check for NULL pointers. Ensure that the option shows up on 'modinfo' output. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:06:19 -04:00
Trond Myklebust	80e52aced1	NFSv4: Don't do idmapper upcalls for asynchronous RPC calls We don't want to cause rpciod to hang... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:06:19 -04:00
Trond Myklebust	62ab460cf5	NFSv4: Add 'server capability' flags for NFSv4 recommended attributes If the NFSv4 server doesn't support a POSIX attribute, the generic NFS code needs to know that, so that it don't keep trying to poll for it. However, by the same count, if the NFSv4 server does support that attribute, then we should ensure that the inode metadata is appropriately labelled as being untrusted. For instance, if we don't know the correct value of the file's uid, we should certainly not be caching ACLs or ACCESS results. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:06:19 -04:00
Trond Myklebust	a78cb57a10	NFSv4: Don't loop forever on state recovery failure... If the server is broken, then retrying forever won't fix it. We should just give up after a while, and return an error to the user. We set the number of retries to 10 for now... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:06:19 -04:00
Roel Kluin	dd8ac1da41	nfs: Keep index within mnt_errtbl[] Ensure that index i remains within array mnt_errtbl[] and mnt3_errtbl[]. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:06:19 -04:00
Trond Myklebust	d953126a28	NFSv4: Fix a problem whereby a buggy server can oops the kernel We just had a case in which a buggy server occasionally returns the wrong attributes during an OPEN call. While the client does catch this sort of condition in nfs4_open_done(), and causes the nfs4_atomic_open() to return -EISDIR, the logic in nfs_atomic_lookup() is broken, since it causes a fallback to an ordinary lookup instead of just returning the error. When the buggy server then returns a regular file for the fallback lookup, the VFS allows the open, and bad things start to happen, since the open file doesn't have any associated NFSv4 state. The fix is firstly to return the EISDIR/ENOTDIR errors immediately, and secondly to ensure that we are always careful when dereferencing the nfs_open_context state pointer. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-07-21 19:22:38 -04:00
Trond Myklebust	fccba80455	NFSv4: Fix an NFSv4 mount regression Commit `008f55d0e0` (nfs41: recover lease in _nfs4_lookup_root) forces the state manager to always run on mount. This is a bug in the case of NFSv4.0, which doesn't require us to send a setclientid until we want to grab file state. In any case, this is completely the wrong place to be doing state management. Moving that code into nfs4_init_session... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-07-21 16:48:07 -04:00
Trond Myklebust	b64aec8d1e	NFSv4: Fix an Oops in nfs4_free_lock_state The oops http://www.kerneloops.org/raw.php?rawid=537858&msgid= appears to be due to the nfs4_lock_state->ls_state field being uninitialised. This happens if the call to nfs4_free_lock_state() is triggered at the end of nfs4_get_lock_state(). The fix is to move the initialisation of ls_state into the allocator. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-07-21 16:47:46 -04:00
Alexey Dobriyan	405f55712d	headers: smp_lock.h redux * Remove smp_lock.h from files which don't need it (including some headers!) * Add smp_lock.h to files which do need it * Make smp_lock.h include conditional in hardirq.h It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT This will make hardirq.h inclusion cheaper for every PREEMPT=n config (which includes allmodconfig/allyesconfig, BTW) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-12 12:22:34 -07:00
Linus Torvalds	81e4e1ba7e	Revert "fuse: Fix build error" as unnecessary This reverts commit `097041e576`. Trond had a better fix, which is the parent of this one ("Fix compile error due to congestion_wait() changes") Requested-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-11 11:22:34 -07:00
Larry Finger	097041e576	fuse: Fix build error When building v2.6.31-rc2-344-g69ca06c, the following build errors are found due to missing includes: CC [M] fs/fuse/dev.o fs/fuse/dev.c: In function ‘request_end’: fs/fuse/dev.c:289: error: ‘BLK_RW_SYNC’ undeclared (first use in this function) ... fs/nfs/write.c: In function ‘nfs_set_page_writeback’: fs/nfs/write.c:207: error: ‘BLK_RW_ASYNC’ undeclared (first use in this function) Signed-off-by: Larry Finger@lwfinger.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-10 19:09:46 -07:00
Jens Axboe	8aa7e847d8	Fix congestion_wait() sync/async vs read/write confusion Commit `1faa16d228` accidentally broke the bdi congestion wait queue logic, causing us to wait on congestion for WRITE (== 1) when we really wanted BLK_RW_ASYNC (== 0) instead. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-10 20:31:53 +02:00
Alexey Dobriyan	b43f3cbd21	headers: mnt_namespace.h redux Fix various silly problems wrt mnt_namespace.h: - exit_mnt_ns() isn't used, remove it - done that, sched.h and nsproxy.h inclusions aren't needed - mount.h inclusion was need for vfsmount_lock, but no longer - remove mnt_namespace.h inclusion from files which don't use anything from mnt_namespace.h Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-08 09:31:56 -07:00
Trond Myklebust	b88f8a546f	NFS: Correct the NFS mount path when following a referral Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-22 21:28:25 -07:00
Trond Myklebust	0b75b35c7c	NFS: Fix nfs_path() to always return a '/' at the beginning of the path Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-22 21:28:25 -07:00
Trond Myklebust	c02d7adf8c	NFSv4: Replace nfs4_path_walk() with VFS path lookup in a private namespace As noted in the previous patch, the NFSv4 client mount code currently has several limitations. If the mount path contains symlinks, or referrals, or even if it just contains a '..', then the client code in nfs4_path_walk() will fail with an error. This patch replaces the nfs4_path_walk()-based lookup with a helper function that sets up a private namespace to represent the namespace on the server, then uses the ordinary VFS and NFS path lookup code to walk down the mount path in that namespace. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-22 21:28:25 -07:00
Linus Torvalds	7e0338c0de	Merge branch 'for-2.6.31' of git://fieldses.org/git/linux-nfsd * 'for-2.6.31' of git://fieldses.org/git/linux-nfsd: (60 commits) SUNRPC: Fix the TCP server's send buffer accounting nfsd41: Backchannel: minorversion support for the back channel nfsd41: Backchannel: cleanup nfs4.0 callback encode routines nfsd41: Remove ip address collision detection case nfsd: optimise the starting of zero threads when none are running. nfsd: don't take nfsd_mutex twice when setting number of threads. nfsd41: sanity check client drc maxreqs nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct NFS: kill off complicated macro 'PROC' sunrpc: potential memory leak in function rdma_read_xdr nfsd: minor nfsd_vfs_write cleanup nfsd: Pull write-gathering code out of nfsd_vfs_write nfsd: track last inode only in use_wgather case sunrpc: align cache_clean work's timer nfsd: Use write gathering only with NFSv2 NFSv4: kill off complicated macro 'PROC' NFSv4: do exact check about attribute specified knfsd: remove unreported filehandle stats counters knfsd: fix reply cache memory corruption knfsd: reply cache cleanups ...	2009-06-22 12:55:50 -07:00
Benny Halevy	578e458568	nfs41: Move initialization of nfs4_opendata seq_res to nfs4_init_opendata_res nfs4_open_recover_helper clears opendata->o_res before calling nfs4_init_opendata_res, thus causing NFSv4.0 OPEN operations to be sent rather than nfsv4.1. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-20 14:55:12 -04:00
Trond Myklebust	1f84603c09	Merge branch 'devel-for-2.6.31' into for-2.6.31 Conflicts: fs/nfs/client.c fs/nfs/super.c	2009-06-18 18:13:44 -07:00
James Morris	4bf259e3ae	nfs: remove unnecessary NFS_INO_INVALID_ACL checks Unless I'm mistaken, NFS_INO_INVALID_ACL is being checked twice during getacl calls (i.e. first via nfs_revalidate_inode() and then by each all site). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 18:02:14 -07:00
Chuck Lever	a5a16bae70	NFS: More "sloppy" parsing problems Specifying "port=-5" with the kernel's current mount option parser generates "unrecognized mount option". If "sloppy" is set, this causes the mount to succeed and use the default values; the desired behavior is that, since this is a valid option with an invalid value, the mount should fail, even with "sloppy." To properly handle "sloppy" parsing, we need to distinguish between correct options with invalid values, and incorrect options. We will need to parse integer values by hand, therefore, and not rely on match_token(). For instance, these must all fail with "invalid value": port=12345678 port=-5 port=samuel and not with "unrecognized option," as they do currently. Thus, for the sake of match_token() we need to treat the values for these options as strings, and do the conversion to integers using strict_strtol(). This is basically the same solution we used for the earlier "retry=" fix (commit `ecbb3845`), except in this case the kernel actually has to parse the value, rather than ignore it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 18:02:14 -07:00
Chuck Lever	d23c45fd84	NFS: Invalid mount option values should always fail, even with "sloppy" Ian Kent reports: "I've noticed a couple of other regressions with the options vers and proto option of mount.nfs(8). The commands: mount -t nfs -o vers=<invalid version> <server>:/<path> /<mountpoint> mount -t nfs -o proto=<invalid proto> <server>:/<path> /<mountpoint> both immediately fail. But if the "-s" option is also used they both succeed with the mount falling back to defaults (by the look of it). In the past these failed even when the sloppy option was given, as I think they should. I believe the sloppy option is meant to allow the mount command to still function for mount options (for example in shared autofs maps) that exist on other Unix implementations but aren't present in the Linux mount.nfs(8). So, an invalid value specified for a known mount option is different to an unknown mount option and should fail appropriately." See RH bugzilla 486266. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 18:02:13 -07:00
Chuck Lever	065015e5ef	NFS: Remove unused XDR decoder functions Clean up: Remove xdr_decode_fhstatus() and xdr_decode_fhstatus3(), now that they are unused. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 18:02:13 -07:00
Chuck Lever	8e02f6b9aa	NFS: Update MNT and MNT3 reply decoding functions Solder xdr_stream-based XDR decoding functions into the in-kernel mountd client that are more careful about checking data types and watching for buffer overflows. The new MNT3 decoder includes support for auth-flavor list decoding. The "_sz" macro for MNT3 replies was missing the size of the file handle. I've added this back, and included the size of the auth flavor array. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 18:02:13 -07:00
Chuck Lever	a14017db28	NFS: add XDR decoder for mountd version 3 auth-flavor lists Introduce an xdr_stream-based XDR decoder that can unpack the auth- flavor list returned in a MNT3 reply. The nfs_mount() function's caller allocates an array, and passes the size and a pointer to it. The decoder decodes all the flavors it can into the array, and returns the number of decoded flavors. If the caller is not interested in the auth flavors, it can pass a value of zero as the size of the pre-allocated array. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 18:02:12 -07:00
Chuck Lever	4fdcd9966d	NFS: add new file handle decoders to in-kernel mountd client Introduce xdr_stream-based XDR file handle decoders to the in-kernel mountd client. These are more careful than the existing decoder functions about buffer overflows and data type and range checking. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 18:02:12 -07:00
Chuck Lever	fb12529577	NFS: Add separate mountd status code decoders for each mountd version Introduce data structures and xdr_stream-based decoding functions for unmarshalling mountd status codes properly. Mountd version 3 uses specific standard error return codes that are not errno values and not NFS3ERR_ values. These have a well-defined standard mapping to local errno values. Introduce data structures and a decoder function that map these status codes to local errno values properly. This is new functionality (but not used yet). Version 1 mountd status values are defined by RFC 1094 as UNIX error values (errno values). Errno values on heterogeneous systems do not necessarily match each other. To avoid exposing possibly incorrect errno values to upper layers, the current XDR decoder converts all non-zero MNT version 1 status codes to -EACCES. The OpenGroup XNFS standard provides a mapping similar to but smaller than the version 3 error codes. Implement a decoder that uses the XNFS error codes, replacing the current decoder. For both mountd protocol versions, map unrecognized errors to -EACCES. Finally we introduce a replacement data structure for mnt_fhstatus at this time, which is used by the new XDR decoders. In addition to documenting that the status value returned by the XDR decoders is always an errno, this new structure will be expanded in subsequent patches. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 18:02:12 -07:00

... 3 4 5 6 7 ...

1684 Commits