linux

Commit Graph

Author	SHA1	Message	Date
Ben Dooks	d49dd11753	NFSv4: add declaration of current_stateid The current_stateid is exported from nfs4state.c but not declared in any of the headers. Add to nfs4_fs.h to remove the following warning: fs/nfs/nfs4state.c:80:20: warning: symbol 'current_stateid' was not declared. Should it be static? Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-11-18 10:36:45 +01:00
Trond Myklebust	30cb3ee299	pNFS: Handle NFS4ERR_OLD_STATEID on layoutreturn by bumping the state seqid If a LAYOUTRETURN receives a reply of NFS4ERR_OLD_STATEID then assume we've missed an update, and just bump the stateid. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2019-09-20 15:48:35 -04:00
Trond Myklebust	6109bcf713	NFSv4: Handle RPC level errors in LAYOUTRETURN Handle RPC level errors by assuming that the RPC call was successful. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2019-09-20 15:39:56 -04:00
Trond Myklebust	078a432d1c	NFSv4: Handle NFS4ERR_DELAY correctly in return-on-close If the server sends a NFS4ERR_DELAY, then allow the caller to retry. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2019-09-20 15:36:58 -04:00
Trond Myklebust	287a9c558b	NFSv4: Clean up pNFS return-on-close error handling Both close and delegreturn have identical code to handle pNFS return-on-close. This patch refactors that code and places it in pnfs.c Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2019-09-20 15:27:51 -04:00
Trond Myklebust	9c47b18cf7	pNFS: Ensure we do clear the return-on-close layout stateid on fatal errors IF the server rejected our layout return with a state error such as NFS4ERR_BAD_STATEID, or even a stale inode error, then we do want to clear out all the remaining layout segments and mark that stateid as invalid. Fixes: `1c5bd76d17` ("pNFS: Enable layoutreturn operation for...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2019-09-20 15:17:42 -04:00
Trond Myklebust	731c74dd98	NFSv4: Report the error from nfs4_select_rw_stateid() In pnfs_update_layout() ensure that we do report any fatal errors from nfs4_select_rw_stateid(). Fixes: `d9aba2b40d` ("NFSv4: Don't use the zero stateid with layoutget") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-08-04 22:35:40 -04:00
Trond Myklebust	d5b9216fd5	pnfs/flexfiles: Add tracepoints for detecting pnfs fallback to MDS Add tracepoints to allow debugging of the event chain leading to a pnfs fallback to doing I/O through the MDS. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-07-18 15:50:28 -04:00
Trond Myklebust	58bbeab425	pnfs: Fix a problem where we gratuitously start doing I/O through the MDS If the client has to stop in pnfs_update_layout() to wait for another layoutget to complete, it currently exits and defaults to I/O through the MDS if the layoutget was successful. Fixes: `d03360aaf5` ("pNFS: Ensure we return the error if someone kills...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.20+	2019-07-18 15:33:42 -04:00
Trond Myklebust	d9aba2b40d	NFSv4: Don't use the zero stateid with layoutget The NFSv4.1 protocol explicitly forbids us from using the zero stateid together with layoutget, so when we see that nfs4_select_rw_stateid() is unable to return a valid delegation, lock or open stateid, then we should initiate recovery and retry. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-07-18 14:43:52 -04:00
Trond Myklebust	2b17d725f9	NFS: Clean up writeback code Now that the VM promises never to recurse back into the filesystem layer on writeback, remove all the GFP_NOFS references etc from the generic writeback code. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-07-06 14:54:52 -04:00
Trond Myklebust	9fcd5960e8	NFS: Add a helper to return a pointer to the open context of a struct nfs_page Add a helper for when we remove the explicit pointer to the open context. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2019-04-25 14:18:15 -04:00
Trond Myklebust	400417b05f	pNFS: Fix a typo in pnfs_update_layout We're supposed to wait for the outstanding layout count to go to zero, but that got lost somehow. Fixes: `d03360aaf5` ("pNFS: Ensure we return the error if someone...") Reported-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-12 16:04:51 -04:00
Trond Myklebust	5085607d20	NFS/pnfs: Bulk destroy of layouts needs to be safe w.r.t. umount If a bulk layout recall or a metadata server reboot coincides with a umount, then holding a reference to an inode is unsafe unless we also hold a reference to the super block. Fixes: `fd9a8d7160` ("NFSv4.1: Fix bulk recall and destroy of layouts") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-23 13:59:29 -05:00
NeilBrown	a52458b48a	NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'. SUNRPC has two sorts of credentials, both of which appear as "struct rpc_cred". There are "generic credentials" which are supplied by clients such as NFS and passed in 'struct rpc_message' to indicate which user should be used to authorize the request, and there are low-level credentials such as AUTH_NULL, AUTH_UNIX, AUTH_GSS which describe the credential to be sent over the wires. This patch replaces all the generic credentials by 'struct cred' pointers - the credential structure used throughout Linux. For machine credentials, there is a special 'struct cred *' pointer which is statically allocated and recognized where needed as having a special meaning. A look-up of a low-level cred will map this to a machine credential. Signed-off-by: NeilBrown <neilb@suse.com> Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-12-19 13:52:46 -05:00
Trond Myklebust	0de43976fb	NFS: Convert lookups of the open context to RCU Reduce contention on the inode->i_lock by ensuring that we use RCU when looking up the NFS open context. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:17 -04:00
Trond Myklebust	28ced9a84c	pNFS: Don't allocate more pages than we need to fit a layoutget response For the 'files' and 'flexfiles' layout types, we do not expect the reply to be any larger than 4k. The block and scsi layout types are a little more greedy, so we keep allocating the maximum response size for now. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	a2791d3a2c	pNFS: Don't zero out the array in nfs4_alloc_pages() We don't need a zeroed out array, since it is immediately being filled. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	d03360aaf5	pNFS: Ensure we return the error if someone kills a waiting layoutget If someone interrupts a wait on one or more outstanding layoutgets in pnfs_update_layout() then return the ERESTARTSYS/EINTR error. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-09-14 16:24:08 -04:00
Trond Myklebust	0af4c8be97	pNFS: Remove unwanted optimisation of layoutget If we knew that the file was empty, we wouldn't be asking for a layout. Any optimisation here is already done before calling pnfs_update_layout(). As it stands, we sometimes end up doing an unnecessary inband read to the MDS even when holding a layout. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-08-21 13:39:08 -04:00
Trond Myklebust	ea51f94b45	pNFS: Treat RECALLCONFLICT like DELAY... Yes, it is possible to get trapped in a loop, but the server should be administratively revoking the recalled layout if it never gets returned. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-08-16 13:47:09 -04:00
Trond Myklebust	ecf8402603	pNFS: When updating the stateid in layoutreturn, also update the recall range When we update the layout stateid in nfs4_layoutreturn_refresh_stateid, we should also update the range in order to let the server know we're actually returning everything. Fixes: 16c278dbfa63 ("pnfs: Fix handling of NFS4ERR_OLD_STATEID replies...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-08-16 13:29:36 -04:00
Gustavo A. R. Silva	10db5b7a2f	pnfs: Use true and false for boolean values Return statements in functions returning bool should use true or false instead of an integer value. This issue was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-08-08 16:50:03 -04:00
Trond Myklebust	2230ca0d28	pnfs: pnfs_find_lseg() should not check NFS_LSEG_LAYOUTRETURN Layout segment validity is determined only by the NFS_LSEG_VALID flag. If it is set, the layout segment is finable. As it is, when the flexfiles driver sets NFS_LSEG_LAYOUTRETURN to indicate that we cannot discard the layout segment, but that it must be returned, then this can result in an unnecessary layoutget storm. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-08-08 16:50:03 -04:00
Trond Myklebust	c16467dc03	pnfs: Fix handling of NFS4ERR_OLD_STATEID replies to layoutreturn If the server tells us that out layoutreturn raced with another layout update, then we must ensure that the new layout segments are not in use before we resend with an updated layout stateid. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-08-08 16:50:01 -04:00
Trond Myklebust	af9b6d7570	pNFS: Parse the results of layoutget on open even if permissions checks fail Even if the results of the permissions checks failed, we should parse the results of the layout on open call so that we can return the layout if required. Note that we also want to ignore the sequence counter for whether or not a layout recall occurred. If the recall pertained to our OPEN, then the callback will know, and will attempt to wait for us to finih processing anyway. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-07-26 16:25:25 -04:00
Trond Myklebust	411ae722d1	pNFS: Wait for stale layoutget calls to complete in pnfs_update_layout() If the old layout was recalled, and we returned NFS4ERR_NOMATCHINGLAYOUT then we need to wait for all outstanding layoutget calls to complete before we can send a new one. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-07-26 16:25:25 -04:00
Trond Myklebust	f0b429819b	pNFS: Ignore non-recalled layouts in pnfs_layout_need_return() If a layout has been recalled, then we should fire off a layoutreturn as soon as all the layout segments that match the recall have been retired. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-07-26 16:25:25 -04:00
Trond Myklebust	e0b7d420f7	pNFS: Don't discard layout segments that are marked for return If there are layout segments that are marked for return, then we need to ensure that pnfs_mark_matching_lsegs_return() does not just silently discard them, but it should tell the caller that there is a layoutreturn scheduled. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-07-26 16:25:24 -04:00
Olga Kornievskaia	93b7f7ad20	skip LAYOUTRETURN if layout is invalid Currently, when IO to DS fails, client returns the layout and retries against the MDS. However, then on umounting (inode eviction) it returns the layout again. This is because pnfs_return_layout() was changed in commit `d78471d32b` ("pnfs/blocklayout: set PNFS_LAYOUTRETURN_ON_ERROR") to always set NFS_LAYOUT_RETURN_REQUESTED so even if we returned the layout, it will be returned again. Instead, let's also check if we have already marked the layout invalid. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-06-12 08:48:04 -04:00
Trond Myklebust	32f1c28f3d	pnfs: Don't call commit on failed layoutget-on-open If the layoutget on open call failed, we can't really commit the inode, so don't bother calling it. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:12 -04:00
Trond Myklebust	64294b08f9	pNFS: Don't send LAYOUTGET on OPEN for read, if we already have cached data If we're only opening the file for reading, and the file is empty and/or we already have cached data, then heuristically optimise away the LAYOUTGET. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:12 -04:00
Trond Myklebust	8dc96566c0	NFSv4/pnfs: Don't switch off layoutget-on-open for transient errors Ensure that we only switch off the LAYOUTGET operation in the OPEN compound when the server is truly broken, and/or it is complaining that the compound is too large. Currently, we end up turning off the functionality permanently, even for transient errors such as EACCES or ENOSPC. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:11 -04:00
Trond Myklebust	d49e0d5b99	NFSv4/pnfs: Ensure pnfs_parse_lgopen() won't try to parse uninitialised data We need to ensure that pnfs_parse_lgopen() doesn't try to parse a struct nfs4_layoutget_res that was not filled by a successful call to decode_layoutget(). This can happen if we performed a cached open, or if either the OP_ACCESS or OP_GETATTR operations preceding the OP_LAYOUTGET in the compound returned an error. By initialising the 'status' field to NFS4ERR_DELAY, we ensure that pnfs_parse_lgopen() won't try to interpret the structure. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:11 -04:00
Fred Isaman	30ae2412e9	pnfs: Fix manipulation of NFS_LAYOUT_FIRST_LAYOUTGET The flag was not always being cleared after LAYOUTGET on OPEN. Signed-off-by: Fred Isaman <fred.isaman@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:11 -04:00
Fred Isaman	c49b5209f9	pnfs: Add barrier to prevent lgopen using LAYOUTGET during recall Since the LAYOUTGET on OPEN can be sent without prior inode information, existing methods to prevent LAYOUTGET from being sent while processing CB_LAYOUTRECALL don't work. Track if a recall occurred while LAYOUTGET was being sent, and if so ignore the results. Signed-off-by: Fred Isaman <fred.isaman@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:11 -04:00
Fred Isaman	6e01260cee	pnfs: Stop attempting LAYOUTGET on OPEN on failure Signed-off-by: Fred Isaman <fred.isaman@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:11 -04:00
Fred Isaman	78746a384c	pnfs: Add LAYOUTGET to OPEN of an existing file Signed-off-by: Fred Isaman <fred.isaman@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:11 -04:00
Trond Myklebust	29a8bfe52d	pNFS: Refactor nfs4_layoutget_release() Move the actual freeing of the struct nfs4_layoutget into fs/nfs/pnfs.c where it can be reused by the layoutget on open code. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:11 -04:00
Fred Isaman	2409a976a2	pnfs: Add LAYOUTGET to OPEN of a new file This triggers when have no pre-existing inode to attach to. The preexisting case is saved for later. Signed-off-by: Fred Isaman <fred.isaman@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:11 -04:00
Fred Isaman	5e36e2a941	pnfs: Change pnfs_alloc_init_layoutget_args call signature Don't send in a layout, instead use the (possibly NULL) inode. This is needed for LAYOUTGET attached to an OPEN where the inode is not yet set. Signed-off-by: Fred Isaman <fred.isaman@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:11 -04:00
Fred Isaman	1b146fcff7	pnfs: Move nfs4_opendata into nfs4_fs.h It will be needed now by the pnfs code. Signed-off-by: Fred Isaman <fred.isaman@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:11 -04:00
Fred Isaman	dacb452db8	pnfs: move allocations out of nfs4_proc_layoutget They work better in the new alloc_init function. Signed-off-by: Fred Isaman <fred.isaman@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:11 -04:00
Fred Isaman	587f03deb6	pnfs: refactor send_layoutget Pull out the alloc/init part for eventual reuse by OPEN. Signed-off-by: Fred Isaman <fred.isaman@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:11 -04:00
Trond Myklebust	9c6376ebdd	pNFS: Prevent the layout header refcount going to zero in pnfs_roc() Ensure that we hold a reference to the layout header when processing the pNFS return-on-close so that the refcount value does not inadvertently go to zero. Reported-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.10+ Tested-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>	2018-03-08 12:56:31 -05:00
Scott Mayhew	ba4a76f703	nfs/pnfs: fix nfs_direct_req ref leak when i/o falls back to the mds Currently when falling back to doing I/O through the MDS (via pnfs_{read\|write}_through_mds), the client frees the nfs_pgio_header without releasing the reference taken on the dreq via pnfs_generic_pg_{read\|write}pages -> nfs_pgheader_init -> nfs_direct_pgio_init. It then takes another reference on the dreq via nfs_generic_pg_pgios -> nfs_pgheader_init -> nfs_direct_pgio_init and as a result the requester will become stuck in inode_dio_wait. Once that happens, other processes accessing the inode will become stuck as well. Ensure that pnfs_read_through_mds() and pnfs_write_through_mds() clean up correctly by calling hdr->completion_ops->completion() instead of calling hdr->release() directly. This can be reproduced (sometimes) by performing "storage failover takeover" commands on NetApp filer while doing direct I/O from a client. This can also be reproduced using SystemTap to simulate a failure while doing direct I/O from a client (from Dave Wysochanski <dwysocha@redhat.com>): stap -v -g -e 'probe module("nfs_layout_nfsv41_files").function("nfs4_fl_prepare_ds").return { $return=NULL; exit(); }' Suggested-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Scott Mayhew <smayhew@redhat.com> Fixes: `1ca018d28d` ("pNFS: Fix a memory leak when attempted pnfs fails") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:29 -05:00
Benjamin Coddington	b3dce6a2f0	pnfs/blocklayout: handle transient devices PNFS block/SCSI layouts should gracefully handle cases where block devices are not available when a layout is retrieved, or the block devices are removed while the client holds a layout. While setting up a layout segment, keep a record of an unavailable or un-parsable block device in cache with a flag so that subsequent layouts do not spam the server with GETDEVINFO. We can reuse the current NFS_DEVICEID_UNAVAILABLE handling with one variation: instead of reusing the device, we will discard it and send a fresh GETDEVINFO after the timeout, since the lookup and validation of the device occurs within the GETDEVINFO response handling. A lookup of a layout segment that references an unavailable device will return a segment with the NFS_LSEG_UNAVAILABLE flag set. This will allow the pgio layer to mark the layout with the appropriate fail bit, which forces subsequent IO to the MDS, and prevents spamming the server with LAYOUTGET, LAYOUTRETURN. Finally, when IO to a block device fails, look up the block device(s) referenced by the pgio header, and mark them as unavailable. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:29 -05:00
Trond Myklebust	7380020e77	pNFS: Retry NFS4ERR_OLD_STATEID errors in layoutreturn-on-close If our layoutreturn on close operation returns an NFS4ERR_OLD_STATEID, then try to update the stateid and retry. We know that there should be no further LAYOUTGET requests being launched. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2017-11-17 16:43:47 -05:00
Thomas Meyer	6089dd0d73	NFS: Fix bool initialization/comparison Bool initializations should use true and false. Bool tests don't need comparisons. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2017-11-17 16:43:43 -05:00
Elena Reshetova	2b28a7bee4	fs, nfs: convert pnfs_layout_hdr.plh_refcount from atomic_t to refcount_t atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable pnfs_layout_hdr.plh_refcount is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2017-11-17 13:47:59 -05:00

1 2 3 4 5 ...

412 Commits