This fixes a wrong check for same cr_principal in same_creds
Introduced by 8fbba96e5b "nfsd4: stricter
cred comparison for setclientid/exchange_id".
Cc: stable@vger.kernel.org
Signed-off-by: Vivek Trivedi <vtrivedi018@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We normally allow the owner of a file to override permissions checks on
IO operations, since:
- the client will take responsibility for doing an access check
on open;
- the permission checks offer no protection against malicious
clients--if they can authenticate as the file's owner then
they can always just change its permissions;
- checking permission on each IO operation breaks the usual
posix rule that permission is checked only on open.
However, we've never allowed the owner to override permissions on
readdir operations, even though the above logic would also apply to
directories. I've never heard of this causing a problem, probably
because a) simultaneously opening and creating a directory (with
restricted mode) isn't possible, and b) opening a directory, then
chmod'ing it, is rare.
Our disallowal of owner-override on directories appears to be an
accident, though--the readdir itself succeeds, and then we fail just
because lookup_one_len() calls in our filldir methods fail.
I'm not sure what the easiest fix for that would be. For now, just make
this behavior obvious by denying the override right at the start.
This also fixes some odd v4 behavior: with the rdattr_error attribute
requested, it would perform the readdir but return an ACCES error with
each entry.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We don't need to keep openowners around in the >=4.1 case, because they
aren't needed to handle CLOSE replays any more (that's a problem for
sessions). And doing so causes unexpected failures on a subsequent
destroy_clientid to fail.
We probably also need something comparable for lock owners on last
unlock.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Actually, xfs and jfs can optionally be case insensitive; we'll handle
that case in later patches.
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Share a little common logic. And note the comments here are a little
out of date (e.g. we don't always create new state in the "new" case any
more.)
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
For the most part readers of cl_cb_state only need a value that is
"eventually" right. And the value is set only either 1) in response to
some change of state, in which case it's set to UNKNOWN and then a
callback rpc is sent to probe the real state, or b) in the handling of a
response to such a callback. UNKNOWN is therefore always a "temporary"
state, and for the other states we're happy to accept last writer wins.
So I think we're OK here.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
According to RFC 5661, the TEST_STATEID operation is not allowed to
return NFS4ERR_STALE_STATEID. In addition, RFC 5661 says:
15.1.16.5. NFS4ERR_STALE_STATEID (Error Code 10023)
A stateid generated by an earlier server instance was used. This
error is moot in NFSv4.1 because all operations that take a stateid
MUST be preceded by the SEQUENCE operation, and the earlier server
instance is detected by the session infrastructure that supports
SEQUENCE.
I triggered NFS4ERR_STALE_STATEID while testing the Linux client's
NOGRACE recovery. Bruce suggested an additional test that could be
useful to client developers.
Lastly, RFC 5661, section 18.48.3 has this:
o Special stateids are always considered invalid (they result in the
error code NFS4ERR_BAD_STATEID).
An explicit check is made for those state IDs to avoid printk noise.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Initiate a CB probe when a new connection with the correct direction is added
to a session (IFF backchannel is marked as down). Without this a
BIND_CONN_TO_SESSION has no effect on the internal backchannel state, which
causes the server to reply to every SEQUENCE op with the
SEQ4_STATUS_CB_PATH_DOWN flag set until DESTROY_SESSION.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Most frequent symptom was a BUG triggering in expire_client, with the
server locking up shortly thereafter.
Introduced by 508dc6e110 "nfsd41:
free_session/free_client must be called under the client_lock".
Cc: stable@kernel.org
Cc: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull the rest of the nfsd commits from Bruce Fields:
"... and then I cherry-picked the remainder of the patches from the
head of my previous branch"
This is the rest of the original nfsd branch, rebased without the
delegation stuff that I thought really needed to be redone.
I don't like rebasing things like this in general, but in this situation
this was the lesser of two evils.
* 'for-3.5' of git://linux-nfs.org/~bfields/linux: (50 commits)
nfsd4: fix, consolidate client_has_state
nfsd4: don't remove rebooted client record until confirmation
nfsd4: remove some dprintk's and a comment
nfsd4: return "real" sequence id in confirmed case
nfsd4: fix exchange_id to return confirm flag
nfsd4: clarify that renewing expired client is a bug
nfsd4: simpler ordering of setclientid_confirm checks
nfsd4: setclientid: remove pointless assignment
nfsd4: fix error return in non-matching-creds case
nfsd4: fix setclientid_confirm same_cred check
nfsd4: merge 3 setclientid cases to 2
nfsd4: pull out common code from setclientid cases
nfsd4: merge last two setclientid cases
nfsd4: setclientid/confirm comment cleanup
nfsd4: setclientid remove unnecessary terms from a logical expression
nfsd4: move rq_flavor into svc_cred
nfsd4: stricter cred comparison for setclientid/exchange_id
nfsd4: move principal name into svc_cred
nfsd4: allow removing clients not holding state
nfsd4: rearrange exchange_id logic to simplify
...
Pull nfsd update from Bruce Fields.
* 'for-3.5-take-2' of git://linux-nfs.org/~bfields/linux: (23 commits)
nfsd: trivial: use SEEK_SET instead of 0 in vfs_llseek
SUNRPC: split upcall function to extract reusable parts
nfsd: allocate id-to-name and name-to-id caches in per-net operations.
nfsd: make name-to-id cache allocated per network namespace context
nfsd: make id-to-name cache allocated per network namespace context
nfsd: pass network context to idmap init/exit functions
nfsd: allocate export and expkey caches in per-net operations.
nfsd: make expkey cache allocated per network namespace context
nfsd: make export cache allocated per network namespace context
nfsd: pass pointer to export cache down to stack wherever possible.
nfsd: pass network context to export caches init/shutdown routines
Lockd: pass network namespace to creation and destruction routines
NFSd: remove hard-coded dereferences to name-to-id and id-to-name caches
nfsd: pass pointer to expkey cache down to stack wherever possible.
nfsd: use hash table from cache detail in nfsd export seq ops
nfsd: pass svc_export_cache pointer as private data to "exports" seq file ops
nfsd: use exp_put() for svc_export_cache put
nfsd: use cache detail pointer from svc_export structure on cache put
nfsd: add link to owner cache detail to svc_export structure
nfsd: use passed cache_detail pointer expkey_parse()
...
Whoops: first, I reimplemented the already-existing has_resources
without noticing; second, I got the test backwards. I did pick a better
name, though. Combine the two....
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
In the NFSv4.1 client-reboot case we're currently removing the client's
previous state in exchange_id. That's wrong--we should be waiting till
the confirming create_session.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The comment is redundant, and if we really want dprintk's here they'd
probably be better in the common (check-slot_seqid) code.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The client should ignore the returned sequence_id in the case where the
CONFIRMED flag is set on an exchange_id reply--and in the unconfirmed
case "1" is always the right response. So it shouldn't actually matter
what we return here.
We could continue returning 1 just to catch clients ignoring the spec
here, but I'd rather be generous. Other things equal, returning the
existing sequence_id seems more informative.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Otherwise nfsd4_set_ex_flags writes over the return flags.
Reported-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This can't happen:
- cl_time is zeroed only by unhash_client_locked, which is only
ever called under both the state lock and the client lock.
- every caller of renew_client() should have looked up a
(non-expired) client and then called renew_client() all
without dropping the state lock.
- the only other caller of renew_client_locked() is
release_session_client(), which first checks under the
client_lock that the cl_time is nonzero.
So make it clear that this is a bug, not something we handle. I can't
quite bring myself to make this a BUG(), though, as there are a lot of
renew_client() callers, and returning here is probably safer than a
BUG().
We'll consider making it a BUG() after some more cleanup.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The cases here divide into two main categories:
- if there's an uncomfirmed record with a matching verifier,
then this is a "normal", succesful case: we're either creating
a new client, or updating an existing one.
- otherwise, this is a weird case: a replay, or a server reboot.
Reordering to reflect that makes the code a bit more concise and the
logic a lot easier to understand.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Note CLID_INUSE is for the case where two clients are trying to use the
same client-provided long-form client identifiers. But what we're
looking at here is the server-returned shorthand client id--if those
clash there's a bug somewhere.
Fix the error return, pull the check out into common code, and do the
check unconditionally in all cases.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
New clients are created only by nfsd4_setclientid(), which always gives
any new client a unique clientid. The only exception is in the
"callback update" case, in which case it may create an unconfirmed
client with the same clientid as a confirmed client. In that case it
also checks that the confirmed client has the same credential.
Therefore, it is pointless for setclientid_confirm to check whether a
confirmed and unconfirmed client with the same clientid have matching
credentials--they're guaranteed to.
Instead, it should be checking whether the credential on the
setclientid_confirm matches either of those. Otherwise, it could be
anyone sending the setclientid_confirm. Granted, I can't see why anyone
would, but still it's probalby safer to check.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Move the rq_flavor into struct svc_cred, and use it in setclientid and
exchange_id comparisons as well.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The typical setclientid or exchange_id will probably be performed with a
credential that maps to either root or nobody, so comparing just uid's
is unlikely to be useful. So, use everything else we can get our hands
on.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Instead of keeping the principal name associated with a request in a
structure that's private to auth_gss and using an accessor function,
move it to svc_cred.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
RFC 5661 actually says we should allow an exchange_id to remove a
matching client, even if the exchange_id comes from a different
principal, *if* the victim client lacks any state.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Following rfc 5661 section 2.4.1, we can permit a 4.1 client to remove
an established 4.0 client's state.
(But we don't allow updates.)
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We mustn't allow a client to destroy another client with established
state unless it has the right credential.
And some minor cleanup.
(Note: our comparison of credentials is actually pretty bogus currently;
that will need to be fixed in another patch.)
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
debugfs read operations were returning the contents of an uninitialized u64.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Handle the st_deny_bmap in a similar fashion to the st_access_bmap. Add
accessor functions and use those instead of bare bitops.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Currently, we do this for the most part with "bare" bitops, but
eventually we'll need to expand the share mode code to handle access
and deny modes on other nodes.
In order to facilitate that code in the future, move to some generic
accessor functions. For now, these are mostly static inlines, but
eventually we'll want to move these to "real" functions that are
able to handle multi-node configurations or have a way to "swap in"
new operations to be done in lieu of or in conjunction with these
atomic bitops.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
All of the callers treat the return that way already.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
These functions are identical. Also, rename them to bmap_to_share_mode
to better reflect what they do, and have them just return the result
instead of passing in a pointer to the storage location.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
According to RFC 3530bis, the only items SETCLIENTID_CONFIRM processing
should be concerned with is the clientid, clientid verifier, and
principal. The client's IP address is not supposed to be interesting.
And, NFS4ERR_CLID_INUSE is meant only for principal mismatches.
I triggered this logic with a prototype UCS client -- one that
uses the same nfs_client_id4 string for all servers. The client
mounted our server via its IPv4, then via its IPv6 address.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The idea is to separate service destruction and per-net operations,
because these are two different things and the mix looks ugly.
Notes:
1) For NFS server this patch looks ugly (sorry for that). But these
place will be rewritten soon during NFSd containerization.
2) LockD per-net counter increase int lockd_up() was moved prior to
make_socks() to make lockd_down_net() call safe in case of error.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This new routine is responsible for service registration in a specified
network context.
The idea is to separate service creation from per-net operations.
Note also: since registering service with svc_bind() can fail, the
service will be destroyed and during destruction it will try to
unregister itself from rpcbind. In this case unregistration has to be
skipped.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The fs_location->hosts list is split on colons, but this doesn't work when
IPv6 addresses are used (they contain colons).
This patch adds the function nfsd4_encode_components_esc() to
allow the caller to specify escape characters when splitting on 'sep'.
In order to fix referrals, this patch must be used with the mountd patch
that similarly fixes IPv6 [] escaping.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Though actually this doesn't matter much, as NFSv4.0 clients are
required to treat the change attribute as opaque.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull user namespace enhancements from Eric Biederman:
"This is a course correction for the user namespace, so that we can
reach an inexpensive, maintainable, and reasonably complete
implementation.
Highlights:
- Config guards make it impossible to enable the user namespace and
code that has not been converted to be user namespace safe.
- Use of the new kuid_t type ensures the if you somehow get past the
config guards the kernel will encounter type errors if you enable
user namespaces and attempt to compile in code whose permission
checks have not been updated to be user namespace safe.
- All uids from child user namespaces are mapped into the initial
user namespace before they are processed. Removing the need to add
an additional check to see if the user namespace of the compared
uids remains the same.
- With the user namespaces compiled out the performance is as good or
better than it is today.
- For most operations absolutely nothing changes performance or
operationally with the user namespace enabled.
- The worst case performance I could come up with was timing 1
billion cache cold stat operations with the user namespace code
enabled. This went from 156s to 164s on my laptop (or 156ns to
164ns per stat operation).
- (uid_t)-1 and (gid_t)-1 are reserved as an internal error value.
Most uid/gid setting system calls treat these value specially
anyway so attempting to use -1 as a uid would likely cause
entertaining failures in userspace.
- If setuid is called with a uid that can not be mapped setuid fails.
I have looked at sendmail, login, ssh and every other program I
could think of that would call setuid and they all check for and
handle the case where setuid fails.
- If stat or a similar system call is called from a context in which
we can not map a uid we lie and return overflowuid. The LFS
experience suggests not lying and returning an error code might be
better, but the historical precedent with uids is different and I
can not think of anything that would break by lying about a uid we
can't map.
- Capabilities are localized to the current user namespace making it
safe to give the initial user in a user namespace all capabilities.
My git tree covers all of the modifications needed to convert the core
kernel and enough changes to make a system bootable to runlevel 1."
Fix up trivial conflicts due to nearby independent changes in fs/stat.c
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits)
userns: Silence silly gcc warning.
cred: use correct cred accessor with regards to rcu read lock
userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq
userns: Convert cgroup permission checks to use uid_eq
userns: Convert tmpfs to use kuid and kgid where appropriate
userns: Convert sysfs to use kgid/kuid where appropriate
userns: Convert sysctl permission checks to use kuid and kgids.
userns: Convert proc to use kuid/kgid where appropriate
userns: Convert ext4 to user kuid/kgid where appropriate
userns: Convert ext3 to use kuid/kgid where appropriate
userns: Convert ext2 to use kuid/kgid where appropriate.
userns: Convert devpts to use kuid/kgid where appropriate
userns: Convert binary formats to use kuid/kgid where appropriate
userns: Add negative depends on entries to avoid building code that is userns unsafe
userns: signal remove unnecessary map_cred_ns
userns: Teach inode_capable to understand inodes whose uids map to other namespaces.
userns: Fail exec for suid and sgid binaries with ids outside our user namespace.
userns: Convert stat to return values mapped from kuids and kgids
userns: Convert user specfied uids and gids in chown into kuids and kgid
userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs
...
As a first step to converting struct cred to be all kuid_t and kgid_t
values convert the group values stored in group_info to always be
kgid_t values. Unless user namespaces are used this change should
have no effect.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Fix printk format warnings -- both items are size_t,
so use %zu to print them.
fs/nfsd/nfs4recover.c:580:3: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'size_t'
fs/nfsd/nfs4recover.c:580:3: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'unsigned int'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
They're equivalent, but SEEK_SET is more informative...
Cc: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull nfsd bugfixes from J. Bruce Fields:
"One bugfix, and one minor header fix from Jeff Layton while we're
here"
* 'for-3.4' of git://linux-nfs.org/~bfields/linux:
nfsd: include cld.h in the headers_install target
nfsd: don't fail unchecked creates of non-special files
Restore the original logics ("fail on mountpoints, negatives and in
case of fh_compose() failures"). Since commit 8177e (nfsd: clean up
readdirplus encoding) that got broken -
rv = fh_compose(fhp, exp, dchild, &cd->fh);
if (rv)
goto out;
if (!dchild->d_inode)
goto out;
rv = 0;
out:
is equivalent to
rv = fh_compose(fhp, exp, dchild, &cd->fh);
out:
and the second check has no effect whatsoever...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
->ts_id_status gets nfs errno, i.e. it's already big-endian; no need
to apply htonl() to it. Broken by commit 174568 (NFSD: Added TEST_STATEID
operation) last year...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
nfsd_open() already returns an NFS error value; only vfs_test_lock()
result needs to be fed through nfserrno(). Broken by commit 55ef12
(nfsd: Ensure nfsv4 calls the underlying filesystem on LOCKT)
three years ago...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
These functions will be called from per-net operations.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This patch also changes svcauth_unix_purge() function: added network namespace
as a parameter and thus loop over all networks was replaced by only one call
for ip map cache purge.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This patch also changes prototypes of nfsd_export_flush() and exp_rootfh():
network namespace parameter added.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This cache will be per-net soon. And it's easier to get the pointer to desired
per-net instance only once and then pass it down instead of discovering it in
every place were required.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
These functions will be called from per-net operations.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
v2: dereference of most probably already released nlm_host removed in
nlmclnt_done() and reclaimer().
These routines are called from locks reclaimer() kernel thread. This thread
works in "init_net" network context and currently relays on persence on lockd
thread and it's per-net resources. Thus lockd_up() and lockd_down() can't relay
on current network context. So let's pass corrent one into them.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
These dereferences to global static caches are redundant. They also prevents
converting these caches into per-net ones. So this patch is cleanup + precursor
of patch set,a which will make them per-net.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This cache will be per-net soon. And it's easier to get the pointer to desired
per-net instance only once and then pass it down instead of discovering it in
every place were required.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Hard-code is redundant and will prevent from making caches per net ns.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Global svc_export_cache cache is going to be replaced with per-net instance. So
prepare the ground for it.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This patch replaces cache_put() call for svc_export_cache by exp_put() call.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Without info about owner cache datail it won't be able to find out, which
per-net cache detail have to be.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Using of hard-coded svc_expkey_cache pointer in expkey_parse() looks redundant.
Moreover, global cache will be replaced with per-net instance soon.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
It's possible that lockd or another lock manager might still be on the
list after we call nfsd4_end_grace. If the laundromat thread runs
again at that point, then we could end up calling nfsd4_end_grace more
than once.
That's not only inefficient, but calling nfsd4_recdir_purge_old more
than once could be problematic. Fix this by adding a new global
"grace_ended" flag and use that to determine whether we've already
called nfsd4_grace_end.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Allow a v3 unchecked open of a non-regular file succeed as if it were a
lookup; typically a client in such a case will want to fall back on a
local open, so succeeding and giving it the filehandle is more useful
than failing with nfserr_exist, which makes it appear that nothing at
all exists by that name.
Similarly for v4, on an open-create, return the same errors we would on
an attempt to open a non-regular file, instead of returning
nfserr_exist.
This fixes a problem found doing a v4 open of a symlink with
O_RDONLY|O_CREAT, which resulted in the current client returning EEXIST.
Thanks also to Trond for analysis.
Cc: stable@kernel.org
Reported-by: Orion Poplawski <orion@cora.nwra.com>
Tested-by: Orion Poplawski <orion@cora.nwra.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull nfsd changes from Bruce Fields:
Highlights:
- Benny Halevy and Tigran Mkrtchyan implemented some more 4.1 features,
moving us closer to a complete 4.1 implementation.
- Bernd Schubert fixed a long-standing problem with readdir cookies on
ext2/3/4.
- Jeff Layton performed a long-overdue overhaul of the server reboot
recovery code which will allow us to deprecate the current code (a
rather unusual user of the vfs), and give us some needed flexibility
for further improvements.
- Like the client, we now support numeric uid's and gid's in the
auth_sys case, allowing easier upgrades from NFSv2/v3 to v4.x.
Plus miscellaneous bugfixes and cleanup.
Thanks to everyone!
There are also some delegation fixes waiting on vfs review that I
suppose will have to wait for 3.5. With that done I think we'll finally
turn off the "EXPERIMENTAL" dependency for v4 (though that's mostly
symbolic as it's been on by default in distro's for a while).
And the list of 4.1 todo's should be achievable for 3.5 as well:
http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues
though we may still want a bit more experience with it before turning it
on by default.
* 'for-3.4' of git://linux-nfs.org/~bfields/linux: (55 commits)
nfsd: only register cld pipe notifier when CONFIG_NFSD_V4 is enabled
nfsd4: use auth_unix unconditionally on backchannel
nfsd: fix NULL pointer dereference in cld_pipe_downcall
nfsd4: memory corruption in numeric_name_to_id()
sunrpc: skip portmap calls on sessions backchannel
nfsd4: allow numeric idmapping
nfsd: don't allow legacy client tracker init for anything but init_net
nfsd: add notifier to handle mount/unmount of rpc_pipefs sb
nfsd: add the infrastructure to handle the cld upcall
nfsd: add a header describing upcall to nfsdcld
nfsd: add a per-net-namespace struct for nfsd
sunrpc: create nfsd dir in rpc_pipefs
nfsd: add nfsd4_client_tracking_ops struct and a way to set it
nfsd: convert nfs4_client->cl_cb_flags to a generic flags field
NFSD: Fix nfs4_verifier memory alignment
NFSD: Fix warnings when NFSD_DEBUG is not defined
nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)
nfsd: rename 'int access' to 'int may_flags' in nfsd_open()
ext4: return 32/64-bit dir name hash according to usage type
fs: add new FMODE flags: FMODE_32bithash and FMODE_64bithash
...
Otherwise, we get a warning or error similar to this when building with
CONFIG_NFSD_V4 disabled:
ERROR: "nfsd4_cld_block" [fs/nfsd/nfsd.ko] undefined!
Fix this by wrapping the calls to rpc_pipefs_notifier_register and
..._unregister in another function and providing no-op replacements
when CONFIG_NFSD_V4 is disabled.
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This isn't actually correct, but it works with the Linux client, and
agrees with the behavior we used to have before commit 80fc015bdf.
Later patches will implement the spec-mandated behavior (which is to use
the security parameters explicitly given by the client in create_session
or backchannel_ctl).
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
If we find that "cup" is NULL in this case, then we obviously don't
want to dereference it. What we really want to print in this case
is the xid that we copied off earlier.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
"id" is type is a uid_t (32 bits) but on 64 bit systems strict_strtoul()
modifies 64 bits of data. We should use kstrtouint() instead.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Mimic the client side by providing a module parameter that turns off
idmapping in the auth_sys case, for backwards compatibility with NFSv2
and NFSv3.
Unlike in the client case, we don't have any way to negotiate, since the
client can return an error to us if it doesn't like the id that we
return to it in (for example) a getattr call.
However, it has always been possible for servers to return numeric id's,
and as far as we're aware clients have always been able to handle them.
Also, in the auth_sys case clients already need to have numeric id's the
same between client and server.
Therefore we believe it's safe to default this to on; but the module
parameter is available to return to previous behavior if this proves to
be a problem in some unexpected setup.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This code isn't set up for containers, so don't allow it to be
used for anything but init_net.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
In the event that rpc_pipefs isn't mounted when nfsd starts, we
must register a notifier to handle creating the dentry once it
is mounted, and to remove the dentry on unmount.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
...and add a mechanism for switching between the "legacy" tracker and
the new one. The decision is made by looking to see whether the
v4recoverydir exists. If it does, then the legacy client tracker is
used.
If it's not, then the kernel will create a "cld" pipe in rpc_pipefs.
That pipe is used to talk to a daemon for handling the upcall.
Most of the data structures for the new client tracker are handled on a
per-namespace basis, so this upcall should be essentially ready for
containerization. For now however, nfsd just starts it by calling the
initialization and exit functions for init_net.
I'm making the assumption that at some point in the future we'll be able
to determine the net namespace from the nfs4_client. Until then, this
patch hardcodes init_net in those places. I've sprinkled some "FIXME"
comments around that code to attempt to make it clear where we'll need
to fix that up later.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Eventually, we'll need this when nfsd gets containerized fully. For
now, create a struct on a per-net-namespace basis that will just hold
a pointer to the cld_net structure. That struct will hold all of the
per-net data that we need for the cld tracker.
Eventually we can add other pernet objects to struct nfsd_net.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Abstract out the mechanism that we use to track clients into a set of
client name tracking functions.
This gives us a mechanism to plug in a new set of client tracking
functions without disturbing the callers. It also gives us a way to
decide on what tracking scheme to use at runtime.
For now, this just looks like pointless abstraction, but later we'll
add a new alternate scheme for tracking clients on stable storage.
Note too that this patch anticipates the eventual containerization
of this code by passing in struct net pointers in places. No attempt
is made to containerize the legacy client tracker however.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We'll need a way to flag the nfs4_client as already being recorded on
stable storage so that we don't continually upcall. Currently, that's
recorded in the cl_firststate field of the client struct. Using an
entire u32 to store a flag is rather wasteful though.
The cl_cb_flags field is only using 2 bits right now, so repurpose that
to a generic flags field. Rename NFSD4_CLIENT_KILL to
NFSD4_CLIENT_CB_KILL to make it evident that it's part of the callback
flags. Add a mask that we can use for existing checks that look to see
whether any flags are set, so that the new flags don't interfere.
Convert all references to cl_firstate to the NFSD4_CLIENT_STABLE flag,
and add a new NFSD4_CLIENT_RECLAIM_COMPLETE flag.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
New features include:
- Add NFS client support for containers.
This should enable most of the necessary functionality, including
lockd support, and support for rpc.statd, NFSv4 idmapper and
RPCSEC_GSS upcalls into the correct network namespace from
which the mount system call was issued.
- NFSv4 idmapper scalability improvements
Base the idmapper cache on the keyring interface to allow concurrent
access to idmapper entries. Start the process of migrating users from
the single-threaded daemon-based approach to the multi-threaded
request-key based approach.
- NFSv4.1 implementation id.
Allows the NFSv4.1 client and server to mutually identify each other
for logging and debugging purposes.
- Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of
having to use the more counterintuitive 'vers=4,minorversion=1'.
- SUNRPC tracepoints.
Start the process of adding tracepoints in order to improve debugging
of the RPC layer.
- pNFS object layout support for autologin.
Important bugfixes include:
- Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to fail
to wake up all tasks when applied to priority waitqueues.
- Ensure that we handle read delegations correctly, when we try to
truncate a file.
- A number of fixes for NFSv4 state manager loops (mostly to do with
delegation recovery).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJPalZbAAoJEGcL54qWCgDyCi4P+QHcmzQhJO7HWx3Pzjs67bFT
xMSYaKHGWS4AJKUBVl5OKBxUExfrMHBNbElV3IKUIwBlDx8RVtnwfptKSe146iki
dn4TrRO5es8nmI4hRDcGMlzJDZq4y0Qg//qiUFmojiNW/Avw0ljfMoVUejJJ09FV
oeDk4EGtcxkEyH+g48ZjYbyspRnG8qtD3atf70Z3lYE0ELdG/B5Dyzw1RDrA5p73
xJX3lqy8p/4ROzw/dmNoxdAXOrr3Q4/T58Bvp/lUglPy/EHyPmWzFoH0MU0C/PFu
5VnAl6QDbNCTcIw9FvJlX/mIyErpNG9eKzUskUc9L9SA+B+J/i4rIap4KATRN3nH
7QhE5qUacPuJnvxml7MPmlQTuft3fkAQ7NhKIWrbRi1QS9FmJC5NxctIb8loqlFn
yIXdKeLfMshB+NyuFS9uzStX7SmV3eMgVd+5ZxRjYxm+PKJLw2KXeudArL6M5mHK
3QeKZpqwaYQ3RfaTNpvAp0doiXHCO5UbWfI0Pe8xQs/QcMCNReffqV2G4IJKFAu6
WpoN2UDQC9LCBifLw2nS7kku8+ZVXLQU8OC1NVl3TG15xD9cNLXuk3/y5llPGq4O
odo52uLFpJohbDaHMj5RTKOfchTQCm2iyuVmxZEeAySypMSiAXmW7COSKHs/HxI1
VBm+EI00Pvmm5+fUjIlp
=LuHE
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates for Linux 3.4 from Trond Myklebust:
"New features include:
- Add NFS client support for containers.
This should enable most of the necessary functionality, including
lockd support, and support for rpc.statd, NFSv4 idmapper and
RPCSEC_GSS upcalls into the correct network namespace from which
the mount system call was issued.
- NFSv4 idmapper scalability improvements
Base the idmapper cache on the keyring interface to allow
concurrent access to idmapper entries. Start the process of
migrating users from the single-threaded daemon-based approach to
the multi-threaded request-key based approach.
- NFSv4.1 implementation id.
Allows the NFSv4.1 client and server to mutually identify each
other for logging and debugging purposes.
- Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of
having to use the more counterintuitive 'vers=4,minorversion=1'.
- SUNRPC tracepoints.
Start the process of adding tracepoints in order to improve
debugging of the RPC layer.
- pNFS object layout support for autologin.
Important bugfixes include:
- Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to
fail to wake up all tasks when applied to priority waitqueues.
- Ensure that we handle read delegations correctly, when we try to
truncate a file.
- A number of fixes for NFSv4 state manager loops (mostly to do with
delegation recovery)."
* tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (224 commits)
NFS: fix sb->s_id in nfs debug prints
xprtrdma: Remove assumption that each segment is <= PAGE_SIZE
xprtrdma: The transport should not bug-check when a dup reply is received
pnfs-obj: autologin: Add support for protocol autologin
NFS: Remove nfs4_setup_sequence from generic rename code
NFS: Remove nfs4_setup_sequence from generic unlink code
NFS: Remove nfs4_setup_sequence from generic read code
NFS: Remove nfs4_setup_sequence from generic write code
NFS: Fix more NFS debug related build warnings
SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefined
nfs: non void functions must return a value
SUNRPC: Kill compiler warning when RPC_DEBUG is unset
SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUG
NFS: Use cond_resched_lock() to reduce latencies in the commit scans
NFSv4: It is not safe to dereference lsp->ls_state in release_lockowner
NFS: ncommit count is being double decremented
SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up()
Try using machine credentials for RENEW calls
NFSv4.1: Fix a few issues in filelayout_commit_pagelist
NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code
...