In rare cases a directory can be renamed out from under a bind mount.
In those cases without special handling it becomes possible to walk up
the directory tree to the root dentry of the filesystem and down
from the root dentry to every other file or directory on the filesystem.
Like division by zero .. from an unconnected path can not be given
a useful semantic as there is no predicting at which path component
the code will realize it is unconnected. We certainly can not match
the current behavior as the current behavior is a security hole.
Therefore when encounting .. when following an unconnected path
return -ENOENT.
- Add a function path_connected to verify path->dentry is reachable
from path->mnt.mnt_root. AKA to validate that rename did not do
something nasty to the bind mount.
To avoid races path_connected must be called after following a path
component to it's next path component.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now that we can get there in RCU mode, we shouldn't play with
nd->path.dentry->d_inode - it's not guaranteed to be stable.
Use nd->inode instead.
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
In RCU mode we might end up with dentry evicted just we check
that it's a directory. In such case we should return ECHILD
rather than ENOTDIR, so that pathwalk would be retries in non-RCU
mode.
Breakage had been introduced in commit b18825a - prior to that
we were looking at nd->inode, which had been fetched before
verifying that ->d_seq was still valid. That form of check
would only be satisfied if at some point the pathname prefix
would indeed have resolved to a non-directory. The fix consists
of checking ->d_seq after we'd run into a non-directory dentry,
and failing with ECHILD in case of mismatch.
Note that all branches since 3.12 have that problem...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The only caller that cares about its return value can just
as easily pick it from nd->root_seq itself. We used to just
calculate it and return to caller, but these days we are
storing it in nd->root_seq in all cases.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
these guys are always declared next to each other; might as well put
the former (pointer to previous instance) into the latter and simplify
the calling conventions for {set,restore}_nameidata()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
a) make it reject ERR_PTR() for name
b) make it putname(name) on all other failure exits
c) make it return name on success
again, simplifies the callers
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
a) make it reject ERR_PTR() for name
b) make it putname(name) upon return in all other cases.
seriously simplifies the callers...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Otherwise we are risking a hard error where nonlazy restart would be the right
thing to do; it's a very narrow race with mount --move and most of the time it
ends up being completely harmless, but it's possible to construct a case when
we'll get a bogus hard error instead of falling back to non-lazy walk...
For one thing, when crossing _into_ overmount of parent we need to check for
mount_lock bumps when we get NULL from __lookup_mnt() as well.
For another, and less exotically, we need to make sure that the data fetched
in follow_up_rcu() had been consistent. ->mnt_mountpoint is pinned for as
long as it is a mountpoint, but we need to check mount_lock after fetching
to verify that.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
touch_atime is not RCU-safe, and so cannot be called on an RCU walk.
However, in situations where RCU-walk makes a difference, the symlink
will likely to accessed much more often than it is useful to update
the atime.
So split out the test of "Does the atime actually need to be updated"
into atime_needs_update(), and have get_link() unlazy if it finds that
it will need to do that update.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We are almost done - primitives for leaving RCU mode are aware of nd->stack
now, a new primitive for going to non-RCU mode when we have a symlink on hands
added.
The thing we are heavily relying upon is that *any* unlazy failure will be
shortly followed by terminate_walk(), with no access to nameidata in between.
So it's enough to leave the things in a state terminate_walk() would cope with.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We *can't* call that audit garbage in RCU mode - it's doing a weird
mix of allocations (GFP_NOFS, immediately followed by GFP_KERNEL)
and I'm not touching that... thing again.
So if this security sclero^Whardening feature gets triggered when
we are in RCU mode, tough - we'll fail with -ECHILD and have
everything restarted in non-RCU mode. Only to hit the same test
and fail, this time with EACCES and with (oh, rapture) an audit spew
produced.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
very simple - just make path_put() conditional on !RCU.
Note that right now it doesn't get called in RCU mode -
we leave it before getting anything into stack.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
inode_follow_link now takes an inode and rcu flag as well as the
dentry.
inode is used in preference to d_backing_inode(dentry), particularly
in RCU-walk mode.
selinux_inode_follow_link() gets dentry_has_perm() and
inode_has_perm() open-coded into it so that it can call
avc_has_perm_flags() in way that is safe if LOOKUP_RCU is set.
Calling avc_has_perm_flags() with rcu_read_lock() held means
that when avc_has_perm_noaudit calls avc_compute_av(), the attempt
to rcu_read_unlock() before calling security_compute_av() will not
actually drop the RCU read-lock.
However as security_compute_av() is completely in a read_lock()ed
region, it should be safe with the RCU read-lock held.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Make use of d_backing_inode() in pathwalk to gain access to an
inode or dentry that's on a lower layer.
Signed-off-by: David Howells <dhowells@redhat.com>
Lift it from link_path_walk(), trailing_symlink(), lookup_last(),
mountpoint_last(), complete_walk() and do_last(). A _lot_ of
those suckers merge.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Make trailing_symlink() return the pathname to traverse or ERR_PTR(-E...).
A subtle point is that for "magic" symlinks it returns "" now - that
leads to link_path_walk("", nd), which is immediately returning 0 and
we are back to the treatment of the last component, at whereever the
damn thing has left us.
Reduces the stack footprint - link_path_walk() called on more shallow
stack now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* lift link_path_walk() into callers; moving it down into path_init()
had been a mistake. Stack footprint, among other things...
* do _not_ call path_cleanup() after path_init() failure; on all failure
exits out of it we have nothing for path_cleanup() to do
* have path_init() return pathname or ERR_PTR(-E...)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
we can do fdput() under rcu_read_lock() just fine; all we need to take
care of is fetching nd->inode value first.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Makes the situation much more regular - we avoid a strange state
when the element just after the top of stack is used to store
struct path of symlink, but isn't counted in nd->depth. This
is much more regular, so the normal failure exits, etc., work
fine.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Just store it in nd->stack[nd->depth].link right in pick_link().
Now that we make sure of stack expansion in pick_link(), we can
do so...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
... and don't open-code unlazy_walk() in there - the only reason
for that is to avoid verfication of cached nd->root, which is
trivially avoided by discarding said cached nd->root first.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
rather than letting the callers handle the jump-to-root part of
semantics, do it right in get_link() and return the rest of the
body for the caller to deal with - at that point it's treated
the same way as relative symlinks would be. And return NULL
when there's no "rest of the body" - those are treated the same
as pure jump symlink would be.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Instead of saving name and branching to OK:, where we'll immediately restore
it, and call walk_component() with WALK_PUT|WALK_GET and nd->last_type being
LAST_BIND, which is equivalent to put_link(nd), err = 0, we can just treat
that the same way we'd treat procfs-style "jump" symlinks - do put_link(nd)
and move on.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>