Commit Graph

665 Commits

Author SHA1 Message Date
Linus Torvalds 0ff08ba5d0 Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux
Pull nfsd changes from Bruce Fields:
 "Changes this time include:

   - 4.1 enabled on the server by default: the last 4.1-specific issues
     I know of are fixed, so we're not going to find the rest of the
     bugs without more exposure.
   - Experimental support for NFSv4.2 MAC Labeling (to allow running
     selinux over NFS), from Dave Quigley.
   - Fixes for some delicate cache/upcall races that could cause rare
     server hangs; thanks to Neil Brown and Bodo Stroesser for extreme
     debugging persistence.
   - Fixes for some bugs found at the recent NFS bakeathon, mostly v4
     and v4.1-specific, but also a generic bug handling fragmented rpc
     calls"

* 'for-3.11' of git://linux-nfs.org/~bfields/linux: (31 commits)
  nfsd4: support minorversion 1 by default
  nfsd4: allow destroy_session over destroyed session
  svcrpc: fix failures to handle -1 uid's
  sunrpc: Don't schedule an upcall on a replaced cache entry.
  net/sunrpc: xpt_auth_cache should be ignored when expired.
  sunrpc/cache: ensure items removed from cache do not have pending upcalls.
  sunrpc/cache: use cache_fresh_unlocked consistently and correctly.
  sunrpc/cache: remove races with queuing an upcall.
  nfsd4: return delegation immediately if lease fails
  nfsd4: do not throw away 4.1 lock state on last unlock
  nfsd4: delegation-based open reclaims should bypass permissions
  svcrpc: don't error out on small tcp fragment
  svcrpc: fix handling of too-short rpc's
  nfsd4: minor read_buf cleanup
  nfsd4: fix decoding of compounds across page boundaries
  nfsd4: clean up nfs4_open_delegation
  NFSD: Don't give out read delegations on creates
  nfsd4: allow client to send no cb_sec flavors
  nfsd4: fail attempts to request gss on the backchannel
  nfsd4: implement minimal SP4_MACH_CRED
  ...
2013-07-11 10:17:13 -07:00
J. Bruce Fields f0f51f5cdd nfsd4: allow destroy_session over destroyed session
RFC 5661 allows a client to destroy a session using a compound
associated with the destroyed session, as long as the DESTROY_SESSION op
is the last op of the compound.

We attempt to allow this, but testing against a Solaris client (which
does destroy sessions in this way) showed that we were failing the
DESTROY_SESSION with NFS4ERR_DELAY, because we assumed the reference
count on the session (held by us) represented another rpc in progress
over this session.

Fix this by noting that in this case the expected reference count is 1,
not 0.

Also, note as long as the session holds a reference to the compound
we're destroying, we can't free it here--instead, delay the free till
the final put in nfs4svc_encode_compoundres.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-08 19:46:38 -04:00
J. Bruce Fields d08d32e6e5 nfsd4: return delegation immediately if lease fails
This case shouldn't happen--the administrator shouldn't really allow
other applications access to the export until clients have had the
chance to reclaim their state--but if it does then we should set the
"return this lease immediately" bit on the reply.  That still leaves
some small races, but it's the best the protocol allows us to do in the
case a lease is ripped out from under us....

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:32:07 -04:00
J. Bruce Fields 0a262ffb75 nfsd4: do not throw away 4.1 lock state on last unlock
This reverts commit eb2099f31b "nfsd4:
release lockowners on last unlock in 4.1 case".  Trond identified
language in rfc 5661 section 8.2.4 which forbids this behavior:

	Stateids associated with byte-range locks are an exception.
	They remain valid even if a LOCKU frees all remaining locks, so
	long as the open file with which they are associated remains
	open, unless the client frees the stateids via the FREE_STATEID
	operation.

And bakeathon 2013 testing found a 4.1 freebsd client was getting an
incorrect BAD_STATEID return from a FREE_STATEID in the above situation
and then failing.

The spec language honestly was probably a mistake but at this point with
implementations already following it we're probably stuck with that.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:32:06 -04:00
J. Bruce Fields 99c415156c nfsd4: clean up nfs4_open_delegation
The nfs4_open_delegation logic is unecessarily baroque.

Also stop pretending we support write delegations in several places.

Some day we will support write delegations, but when that happens adding
back in these flag parameters will be the easy part.  For now they're
just confusing.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:07 -04:00
Steve Dickson 9a0590aec3 NFSD: Don't give out read delegations on creates
When an exclusive create is done with the mode bits
set (aka open(testfile, O_CREAT | O_EXCL, 0777)) this
causes a OPEN op followed by a SETATTR op. When a
read delegation is given in the OPEN, it causes
the SETATTR to delay with EAGAIN until the
delegation is recalled.

This patch caused exclusive creates to give out
a write delegation (which turn into no delegation)
which allows the SETATTR seamlessly succeed.

Signed-off-by: Steve Dickson <steved@redhat.com>
[bfields: do this for any CREATE, not just exclusive; comment]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:07 -04:00
J. Bruce Fields b78724b705 nfsd4: fail attempts to request gss on the backchannel
We don't support gss on the backchannel.  We should state that fact up
front rather than just letting things continue and later making the
client try to figure out why the backchannel isn't working.

Trond suggested instead returning NFS4ERR_NOENT.  I think it would be
tricky for the client to distinguish between the case "I don't support
gss on the backchannel" and "I can't find that in my cache, please
create another context and try that instead", and I'd prefer something
that currently doesn't have any other meaning for this operation, hence
the (somewhat arbitrary) NFS4ERR_ENCR_ALG_UNSUPP.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:06 -04:00
J. Bruce Fields 57266a6e91 nfsd4: implement minimal SP4_MACH_CRED
Do a minimal SP4_MACH_CRED implementation suggested by Trond, ignoring
the client-provided spo_must_* arrays and just enforcing credential
checks for the minimum required operations.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:06 -04:00
J. Bruce Fields 0dc1531aca svcrpc: store gss mech in svc_cred
Store a pointer to the gss mechanism used in the rq_cred and cl_cred.
This will make it easier to enforce SP4_MACH_CRED, which needs to
compare the mechanism used on the exchange_id with that used on
protected operations.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:06 -04:00
Jeff Layton 1c8c601a8c locks: protect most of the file_lock handling with i_lock
Having a global lock that protects all of this code is a clear
scalability problem. Instead of doing that, move most of the code to be
protected by the i_lock instead. The exceptions are the global lists
that the ->fl_link sits on, and the ->fl_block list.

->fl_link is what connects these structures to the
global lists, so we must ensure that we hold those locks when iterating
over or updating these lists.

Furthermore, sound deadlock detection requires that we hold the
blocked_list state steady while checking for loops. We also must ensure
that the search and update to the list are atomic.

For the checking and insertion side of the blocked_list, push the
acquisition of the global lock into __posix_lock_file and ensure that
checking and update of the  blocked_list is done without dropping the
lock in between.

On the removal side, when waking up blocked lock waiters, take the
global lock before walking the blocked list and dequeue the waiters from
the global list prior to removal from the fl_block list.

With this, deadlock detection should be race free while we minimize
excessive file_lock_lock thrashing.

Finally, in order to avoid a lock inversion problem when handling
/proc/locks output we must ensure that manipulations of the fl_block
list are also protected by the file_lock_lock.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:57:42 +04:00
Jim Rees 1a9357f443 nfsd: avoid undefined signed overflow
In C, signed integer overflow results in undefined behavior, but unsigned
overflow wraps around. So do the subtraction first, then cast to signed.

Reported-by: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-21 11:02:03 -04:00
J. Bruce Fields 4f540e29dc nfsd4: store correct client minorversion for >=4.2
This code assumes that any client using exchange_id is using NFSv4.1,
but with the introduction of 4.2 that will no longer true.

This main effect of this is that client callbacks will use the same
minorversion as that used on the exchange_id.

Note that clients are forbidden from mixing 4.1 and 4.2 compounds.  (See
rfc 5661, section 2.7, #13: "A client MUST NOT attempt to use a stateid,
filehandle, or similar returned object from the COMPOUND procedure with
minor version X for another COMPOUND procedure with minor version Y,
where X != Y.")  However, we do not currently attempt to enforce this
except in the case of mixing zero minor version with non-zero minor
versions.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-13 10:11:45 -04:00
Linus Torvalds 1db772216f Merge branch 'for-3.10' of git://linux-nfs.org/~bfields/linux
Pull nfsd changes from J Bruce Fields:
 "Highlights include:

   - Some more DRC cleanup and performance work from Jeff Layton

   - A gss-proxy upcall from Simo Sorce: currently krb5 mounts to the
     server using credentials from Active Directory often fail due to
     limitations of the svcgssd upcall interface.  This replacement
     lifts those limitations.  The existing upcall is still supported
     for backwards compatibility.

   - More NFSv4.1 support: at this point, if a user with a current
     client who upgrades from 4.0 to 4.1 should see no regressions.  In
     theory we do everything a 4.1 server is required to do.  Patches
     for a couple minor exceptions are ready for 3.11, and with those
     and some more testing I'd like to turn 4.1 on by default in 3.11."

Fix up semantic conflict as per Stephen Rothwell and linux-next:

Commit 030d794bf4 ("SUNRPC: Use gssproxy upcall for server RPCGSS
authentication") adds two new users of "PDE(inode)->data", but we're
supposed to use "PDE_DATA(inode)" instead since commit d9dda78bad
("procfs: new helper - PDE_DATA(inode)").

The old PDE() macro is no longer available since commit c30480b92c
("proc: Make the PROC_I() and PDE() macros internal to procfs")

* 'for-3.10' of git://linux-nfs.org/~bfields/linux: (60 commits)
  NFSD: SECINFO doesn't handle unsupported pseudoflavors correctly
  NFSD: Simplify GSS flavor encoding in nfsd4_do_encode_secinfo()
  nfsd: make symbol nfsd_reply_cache_shrinker static
  svcauth_gss: fix error return code in rsc_parse()
  nfsd4: don't remap EISDIR errors in rename
  svcrpc: fix gss-proxy to respect user namespaces
  SUNRPC: gssp_procedures[] can be static
  SUNRPC: define {create,destroy}_use_gss_proxy_proc_entry in !PROC case
  nfsd4: better error return to indicate SSV non-support
  nfsd: fix EXDEV checking in rename
  SUNRPC: Use gssproxy upcall for server RPCGSS authentication.
  SUNRPC: Add RPC based upcall mechanism for RPCGSS auth
  SUNRPC: conditionally return endtime from import_sec_context
  SUNRPC: allow disabling idle timeout
  SUNRPC: attempt AF_LOCAL connect on setup
  nfsd: Decode and send 64bit time values
  nfsd4: put_client_renew_locked can be static
  nfsd4: remove unused macro
  nfsd4: remove some useless code
  nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED
  ...
2013-05-03 10:59:39 -07:00
Jeff Layton 398c33aaa4 nfsd: convert nfs4_alloc_stid() to use idr_alloc_cyclic()
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 18:28:41 -07:00
J. Bruce Fields dd30333cf5 nfsd4: better error return to indicate SSV non-support
As 4.1 becomes less experimental and SSV still isn't implemented, we
have to admit it's not going to be, and return some sensible error
rather than just saying "our server's broken".  Discussion in the ietf
group hasn't turned up any objections to using NFS4ERR_ENC_ALG_UNSUPP
for that purpose.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 16:18:15 -04:00
Fengguang Wu ba138435d1 nfsd4: put_client_renew_locked can be static
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-16 22:15:00 -04:00
fanchaoting 53584f6652 nfsd4: remove some useless code
The "list_empty(&oo->oo_owner.so_stateids)" is aways true, so remove it.

Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-16 10:59:31 -04:00
J. Bruce Fields 3bd64a5ba1 nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED
A 4.1 server must notify a client that has had any state revoked using
the SEQ4_STATUS_RECALLABLE_STATE_REVOKED flag.  The client can figure
out exactly which state is the problem using CHECK_STATEID and then free
it using FREE_STATEID.  The status flag will be unset once all such
revoked stateids are freed.

Our server's only recallable state is delegations.  So we keep with each
4.1 client a list of delegations that have timed out and been recalled,
but haven't yet been freed by FREE_STATEID.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-16 10:59:30 -04:00
J. Bruce Fields 23340032e6 nfsd4: clean up validate_stateid
The logic here is better expressed with a switch statement.

While we're here, CLOSED stateids (or stateids of an unkown type--which
would indicate a server bug) should probably return nfserr_bad_stateid,
though this behavior shouldn't affect any non-buggy client.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 17:42:28 -04:00
J. Bruce Fields 06b332a522 nfsd4: check backchannel attributes on create_session
Make sure the client gives us an adequate backchannel.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 16:53:56 -04:00
J. Bruce Fields 55c760cfc4 nfsd4: fix forechannel attribute negotiation
Negotiation of the 4.1 session forechannel attributes is a mess.  Fix:

	- Move it all into check_forechannel_attrs instead of spreading
	  it between that, alloc_session, and init_forechannel_attrs.
	- set a minimum "slotsize" so that our drc memory limits apply
	  even for small maxresponsesize_cached.  This also fixes some
	  bugs when slotsize becomes <= 0.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 16:43:44 -04:00
J. Bruce Fields 373cd4098a nfsd4: cleanup check_forechannel_attrs
Pass this struct by reference, not by value, and return an error instead
of a boolean to allow for future additions.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 15:49:50 -04:00
J. Bruce Fields 0c7c3e67ab nfsd4: don't close read-write opens too soon
Don't actually close any opens until we don't need them at all.

This means being left with write access when it's not really necessary,
but that's better than putting a file that might still have posix locks
held on it, as we have been.

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:57 -04:00
J. Bruce Fields eb2099f31b nfsd4: release lockowners on last unlock in 4.1 case
In the 4.1 case we're supposed to release lockowners as soon as they're
no longer used.

It would probably be more efficient to reference count them, but that's
slightly fiddly due to the need to have callbacks from locks.c to take
into account lock merging and splitting.

For most cases just scanning the inode's lock list on unlock for
matching locks will be sufficient.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:56 -04:00
J. Bruce Fields 3d74e6a5b6 nfsd4: no need for replay_owner in sessions case
The replay_owner will never be used in the sessions case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:55 -04:00
J. Bruce Fields c383747ef6 nfsd4: remove some redundant comments
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:54 -04:00
Wei Yongjun 2c44a23471 nfsd: use kmem_cache_free() instead of kfree()
memory allocated by kmem_cache_alloc() should be freed using
kmem_cache_free(), not kfree().

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:47 -04:00
J. Bruce Fields 9411b1d4c7 nfsd4: cleanup handling of nfsv4.0 closed stateid's
Closed stateid's are kept around a little while to handle close replays
in the 4.0 case.  So we stash them in the last-used stateid in the
oo_last_closed_stateid field of the open owner.  We can free that in
encode_seqid_op_tail once the seqid on the open owner is next
incremented.  But we don't want to do that on the close itself; so we
set NFS4_OO_PURGE_CLOSE flag set on the open owner, skip freeing it the
first time through encode_seqid_op_tail, then when we see that flag set
next time we free it.

This is unnecessarily baroque.

Instead, just move the logic that increments the seqid out of the xdr
code and into the operation code itself.

The justification given for the current placement is that we need to
wait till the last minute to be sure we know whether the status is a
sequence-id-mutating error or not, but examination of the code shows
that can't actually happen.

Reported-by: Yanchuan Nian <ycnian@gmail.com>
Tested-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-08 09:55:32 -04:00
J. Bruce Fields 41d22663cb nfsd4: remove unused nfs4_check_deleg argument
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-04 13:25:17 -04:00
J. Bruce Fields e8c69d17d1 nfsd4: make del_recall_lru per-network-namespace
If nothing else this simplifies the nfs4_state_shutdown_net logic a tad.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-04 13:25:16 -04:00
J. Bruce Fields 68a3396178 nfsd4: shut down more of delegation earlier
Once we've unhashed the delegation, it's only hanging around for the
benefit of an oustanding recall, which only needs the encoded
filehandle, stateid, and dl_retries counter.  No point keeping the file
around any longer, or keeping it hashed.

This also fixes a race: calls to idr_remove should really be serialized
by the caller, but the nfs4_put_delegation call from the callback code
isn't taking the state lock.

(Better might be to cancel the callback before destroying the
delegation, and remove any need for reference counting--but I don't see
an easy way to cancel an rpc call.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-04 13:25:15 -04:00
Jeff Layton 89876f8c0d nfsd: convert the file_hashtbl to a hlist
We only ever traverse the hash chains in the forward direction, so a
double pointer list head isn't really necessary.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 15:11:04 -04:00
J. Bruce Fields 66b2b9b2b0 nfsd4: don't destroy in-use session
This changes session destruction to be similar to client destruction in
that attempts to destroy a session while in use (which should be rare
corner cases) result in DELAY.  This simplifies things somewhat and
helps meet a coming 4.2 requirement.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:40 -04:00
J. Bruce Fields 221a687669 nfsd4: don't destroy in-use clients
When a setclientid_confirm or create_session confirms a client after a
client reboot, it also destroys any previous state held by that client.

The shutdown of that previous state must be careful not to free the
client out from under threads processing other requests that refer to
the client.

This is a particular problem in the NFSv4.1 case when we hold a
reference to a session (hence a client) throughout compound processing.

The server attempts to handle this by unhashing the client at the time
it's destroyed, then delaying the final free to the end.  But this still
leaves some races in the current code.

I believe it's simpler just to fail the attempt to destroy the client by
returning NFS4ERR_DELAY.  This is a case that should never happen
anyway.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:39 -04:00
J. Bruce Fields 4f6e6c1773 nfsd4: simplify bind_conn_to_session locking
The locking here is very fiddly, and there's no reason for us to be
setting cstate->session, since this is the only op in the compound.
Let's just take the state lock and drop the reference counting.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:39 -04:00
J. Bruce Fields abcdff09a0 nfsd4: fix destroy_session race
destroy_session uses the session and client without continuously holding
any reference or locks.

Put the whole thing under the state lock for now.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:38 -04:00
J. Bruce Fields bfa85e83a8 nfsd4: clientid lookup cleanup
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:37 -04:00
J. Bruce Fields c0293b0131 nfsd4: destroy_clientid simplification
I'm not sure what the check for clientid expiry was meant to do here.

The check for a matching session is redundant given the previous check
for state: a client without state is, in particular, a client without
sessions.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:36 -04:00
J. Bruce Fields 1ca507920d nfsd4: remove some dprintk's
E.g. printk's that just report the return value from an op are
uninteresting as we already do that in the main proc_compound loop.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:36 -04:00
J. Bruce Fields 0eb6f20aa5 nfsd4: STALE_STATEID cleanup
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:35 -04:00
J. Bruce Fields 78389046f7 nfsd4: warn on odd create_session state
This should never happen.

(Note: the comparable case in setclientid_confirm *can* happen, since
updating a client record can result in both confirmed and unconfirmed
records with the same clientid.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:34 -04:00
ycnian@gmail.com 491402a787 nfsd: fix bug on nfs4 stateid deallocation
NFS4_OO_PURGE_CLOSE is not handled properly. To avoid memory leak, nfs4
stateid which is pointed by oo_last_closed_stid is freed in nfsd4_close(),
but NFS4_OO_PURGE_CLOSE isn't cleared meanwhile. So the stateid released in
THIS close procedure may be freed immediately in the coming encoding function.
Sorry that Signed-off-by was forgotten in last version.

Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:34 -04:00
J. Bruce Fields 2e4b7239a6 nfsd4: fix use-after-free of 4.1 client on connection loss
Once we drop the lock here there's nothing keeping the client around:
the only lock still held is the xpt_lock on this socket, but this socket
no longer has any connection with the client so there's no way for other
code to know we're still using the client.

The solution is simple: all nfsd4_probe_callback does is set a few
variables and queue some work, so there's no reason we can't just keep
it under the lock.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:32 -04:00
J. Bruce Fields b0a9d3ab57 nfsd4: fix race on client shutdown
Dropping the session's reference count after the client's means we leave
a window where the session's se_client pointer is NULL.  An xpt_user
callback that encounters such a session may then crash:

[  303.956011] BUG: unable to handle kernel NULL pointer dereference at 0000000000000318
[  303.959061] IP: [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
[  303.959061] PGD 37811067 PUD 3d498067 PMD 0
[  303.959061] Oops: 0002 [#8] PREEMPT SMP
[  303.959061] Modules linked in: md5 nfsd auth_rpcgss nfs_acl snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc microcode psmouse snd_timer serio_raw pcspkr evdev snd soundcore i2c_piix4 i2c_core intel_agp intel_gtt processor button nfs lockd sunrpc fscache ata_generic pata_acpi ata_piix uhci_hcd libata btrfs usbcore usb_common crc32c scsi_mod libcrc32c zlib_deflate floppy virtio_balloon virtio_net virtio_pci virtio_blk virtio_ring virtio
[  303.959061] CPU 0
[  303.959061] Pid: 264, comm: nfsd Tainted: G      D      3.8.0-ARCH+ #156 Bochs Bochs
[  303.959061] RIP: 0010:[<ffffffff81481a8e>]  [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
[  303.959061] RSP: 0018:ffff880037877dd8  EFLAGS: 00010202
[  303.959061] RAX: 0000000000000100 RBX: ffff880037a2b698 RCX: ffff88003d879278
[  303.959061] RDX: ffff88003d879278 RSI: dead000000100100 RDI: 0000000000000318
[  303.959061] RBP: ffff880037877dd8 R08: ffff88003c5a0f00 R09: 0000000000000002
[  303.959061] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  303.959061] R13: 0000000000000318 R14: ffff880037a2b680 R15: ffff88003c1cbe00
[  303.959061] FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[  303.959061] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  303.959061] CR2: 0000000000000318 CR3: 000000003d49c000 CR4: 00000000000006f0
[  303.959061] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  303.959061] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  303.959061] Process nfsd (pid: 264, threadinfo ffff880037876000, task ffff88003c1fd0a0)
[  303.959061] Stack:
[  303.959061]  ffff880037877e08 ffffffffa03772ec ffff88003d879000 ffff88003d879278
[  303.959061]  ffff88003d879080 0000000000000000 ffff880037877e38 ffffffffa0222a1f
[  303.959061]  0000000000107ac0 ffff88003c22e000 ffff88003d879000 ffff88003c1cbe00
[  303.959061] Call Trace:
[  303.959061]  [<ffffffffa03772ec>] nfsd4_conn_lost+0x3c/0xa0 [nfsd]
[  303.959061]  [<ffffffffa0222a1f>] svc_delete_xprt+0x10f/0x180 [sunrpc]
[  303.959061]  [<ffffffffa0223d96>] svc_recv+0xe6/0x580 [sunrpc]
[  303.959061]  [<ffffffffa03587c5>] nfsd+0xb5/0x140 [nfsd]
[  303.959061]  [<ffffffffa0358710>] ? nfsd_destroy+0x90/0x90 [nfsd]
[  303.959061]  [<ffffffff8107ae00>] kthread+0xc0/0xd0
[  303.959061]  [<ffffffff81010000>] ? perf_trace_xen_mmu_set_pte_at+0x50/0x100
[  303.959061]  [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70
[  303.959061]  [<ffffffff814898ec>] ret_from_fork+0x7c/0xb0
[  303.959061]  [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70
[  303.959061] Code: ff ff 5d c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 65 48 8b 04 25 f0 c6 00 00 48 89 e5 83 80 44 e0 ff ff 01 b8 00 01 00 00 <3e> 66 0f c1 07 0f b6 d4 38 c2 74 0f 66 0f 1f 44 00 00 f3 90 0f
[  303.959061] RIP  [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
[  303.959061]  RSP <ffff880037877dd8>
[  303.959061] CR2: 0000000000000318
[  304.001218] ---[ end trace 2d809cd4a7931f5a ]---
[  304.001903] note: nfsd[264] exited with preempt_count 2

Reported-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:31 -04:00
Tejun Heo ebd6c70714 nfsd: convert to idr_alloc()
idr_get_new*() and friends are about to be deprecated.  Convert to the
new idr_alloc() interface.

Only compile-tested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Tested-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-13 15:21:45 -07:00
Tejun Heo 801cb2d62d nfsd: remove unused get_new_stid()
get_new_stid() is no longer used since commit 3abdb60712 ("nfsd4:
simplify idr allocation").  Remove it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-13 15:21:45 -07:00
Linus Torvalds b6669737d3 Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux
Pull nfsd changes from J Bruce Fields:
 "Miscellaneous bugfixes, plus:

   - An overhaul of the DRC cache by Jeff Layton.  The main effect is
     just to make it larger.  This decreases the chances of intermittent
     errors especially in the UDP case.  But we'll need to watch for any
     reports of performance regressions.

   - Containerized nfsd: with some limitations, we now support
     per-container nfs-service, thanks to extensive work from Stanislav
     Kinsbursky over the last year."

Some notes about conflicts, since there were *two* non-data semantic
conflicts here:

 - idr_remove_all() had been added by a memory leak fix, but has since
   become deprecated since idr_destroy() does it for us now.

 - xs_local_connect() had been added by this branch to make AF_LOCAL
   connections be synchronous, but in the meantime Trond had changed the
   calling convention in order to avoid a RCU dereference.

There were a couple of more obvious actual source-level conflicts due to
the hlist traversal changes and one just due to code changes next to
each other, but those were trivial.

* 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits)
  SUNRPC: make AF_LOCAL connect synchronous
  nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum
  svcrpc: fix rpc server shutdown races
  svcrpc: make svc_age_temp_xprts enqueue under sv_lock
  lockd: nlmclnt_reclaim(): avoid stack overflow
  nfsd: enable NFSv4 state in containers
  nfsd: disable usermode helper client tracker in container
  nfsd: use proper net while reading "exports" file
  nfsd: containerize NFSd filesystem
  nfsd: fix comments on nfsd_cache_lookup
  SUNRPC: move cache_detail->cache_request callback call to cache_read()
  SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
  SUNRPC: rework cache upcall logic
  SUNRPC: introduce cache_detail->cache_request callback
  NFS: simplify and clean cache library
  NFS: use SUNRPC cache creation and destruction helper for DNS cache
  nfsd4: free_stid can be static
  nfsd: keep a checksum of the first 256 bytes of request
  sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer
  sunrpc: fix comment in struct xdr_buf definition
  ...
2013-02-28 18:02:55 -08:00
Linus Torvalds 94f2f14234 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace and namespace infrastructure changes from Eric W Biederman:
 "This set of changes starts with a few small enhnacements to the user
  namespace.  reboot support, allowing more arbitrary mappings, and
  support for mounting devpts, ramfs, tmpfs, and mqueuefs as just the
  user namespace root.

  I do my best to document that if you care about limiting your
  unprivileged users that when you have the user namespace support
  enabled you will need to enable memory control groups.

  There is a minor bug fix to prevent overflowing the stack if someone
  creates way too many user namespaces.

  The bulk of the changes are a continuation of the kuid/kgid push down
  work through the filesystems.  These changes make using uids and gids
  typesafe which ensures that these filesystems are safe to use when
  multiple user namespaces are in use.  The filesystems converted for
  3.9 are ceph, 9p, afs, ocfs2, gfs2, ncpfs, nfs, nfsd, and cifs.  The
  changes for these filesystems were a little more involved so I split
  the changes into smaller hopefully obviously correct changes.

  XFS is the only filesystem that remains.  I was hoping I could get
  that in this release so that user namespace support would be enabled
  with an allyesconfig or an allmodconfig but it looks like the xfs
  changes need another couple of days before it they are ready."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (93 commits)
  cifs: Enable building with user namespaces enabled.
  cifs: Convert struct cifs_ses to use a kuid_t and a kgid_t
  cifs: Convert struct cifs_sb_info to use kuids and kgids
  cifs: Modify struct smb_vol to use kuids and kgids
  cifs: Convert struct cifsFileInfo to use a kuid
  cifs: Convert struct cifs_fattr to use kuid and kgids
  cifs: Convert struct tcon_link to use a kuid.
  cifs: Modify struct cifs_unix_set_info_args to hold a kuid_t and a kgid_t
  cifs: Convert from a kuid before printing current_fsuid
  cifs: Use kuids and kgids SID to uid/gid mapping
  cifs: Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID to keyring_alloc
  cifs: Use BUILD_BUG_ON to validate uids and gids are the same size
  cifs: Override unmappable incoming uids and gids
  nfsd: Enable building with user namespaces enabled.
  nfsd: Properly compare and initialize kuids and kgids
  nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids
  nfsd: Modify nfsd4_cb_sec to use kuids and kgids
  nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion
  nfsd: Convert nfsxdr to use kuids and kgids
  nfsd: Convert nfs3xdr to use kuids and kgids
  ...
2013-02-25 16:00:49 -08:00
Zhang Yanfei 697ce9be7d fs/nfsd: change type of max_delegations, nfsd_drc_max_mem and nfsd_drc_mem_used
The three variables are calculated from nr_free_buffer_pages so change
their types to unsigned long in case of overflow.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:22 -08:00
Stanislav Kinsbursky deb4534f4f nfsd: enable NFSv4 state in containers
Currently, NFSd is ready to operate in network namespace based containers.
So let's drop check for "init_net" and make it able to fly.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-15 11:21:02 -05:00