Add fs_context support to the AFS filesystem, converting the parameter
parsing to store options there.
This will form the basis for namespace propagation over mountpoints within
the AFS model, thereby allowing AFS to be used in containers more easily.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add some logging to the core users of the fs_context log so that
information can be extracted from them as to the reason for failure.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Implement the ability for filesystems to log error, warning and
informational messages through the fs_context. In the future, these will
be extractable by userspace by reading from an fd created by the fsopen()
syscall.
Error messages are prefixed with "e ", warnings with "w " and informational
messages with "i ".
In the future, inside the kernel, formatted messages will be malloc'd but
unformatted messages will not be copied if they're either in the core .rodata
section or in the .rodata section of the filesystem module pinned by
fs_context::fs_type. The messages will only be good till the fs_type is
released.
Note that the logging object will be shared between duplicated fs_context
structures. This is so that filesystems such as NFS, which do a mount
within a mount, can get at least some of the errors from the inner mount.
Five logging functions are provided for this:
(1) void logfc(struct fs_context *fc, const char *fmt, ...);
This logs a message into the context. If the buffer is full, the
earliest message is discarded.
(2) void errorf(fc, fmt, ...);
This wraps logfc() to log an error.
(3) void invalf(fc, fmt, ...);
This wraps errorf() and returns -EINVAL for convenience.
(4) void warnf(fc, fmt, ...);
This wraps logfc() to log a warning.
(5) void infof(fc, fmt, ...);
This wraps logfc() to log an informational message.
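As a rough illustration of the behaviour described above, the following is a
userspace sketch only, not the kernel implementation. The struct
fc_log_model type, the model_* names, the slot count and the fixed-size
slots are all assumptions made for the example (the text above says
formatted messages are to be malloc'd). It shows the "e ", "w " and "i "
prefixes and the discarding of the earliest message when the buffer is
full. The ##__VA_ARGS__ form is a GNU C extension.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define LOG_SLOTS	4	/* hypothetical capacity */
#define LOG_MSG_MAX	64	/* hypothetical per-message limit */

/* Hypothetical stand-in for the log object hung off an fs_context. */
struct fc_log_model {
	char msgs[LOG_SLOTS][LOG_MSG_MAX];
	unsigned int head;	/* total number of messages ever logged */
	unsigned int tail;	/* sequence number of the oldest one kept */
};

/* Store one formatted message; drop the earliest one if the ring is full. */
static void model_logfc(struct fc_log_model *log, const char *fmt, ...)
{
	va_list ap;

	assert(log != NULL);
	if (log->head - log->tail == LOG_SLOTS)
		log->tail++;
	va_start(ap, fmt);
	vsnprintf(log->msgs[log->head % LOG_SLOTS], LOG_MSG_MAX, fmt, ap);
	va_end(ap);
	log->head++;
}

/* The wrappers only differ in the prefix they prepend. */
#define model_errorf(log, fmt, ...) model_logfc(log, "e " fmt, ##__VA_ARGS__)
#define model_warnf(log, fmt, ...)  model_logfc(log, "w " fmt, ##__VA_ARGS__)
#define model_infof(log, fmt, ...)  model_logfc(log, "i " fmt, ##__VA_ARGS__)

/* Return the n-th oldest message still held, or NULL if there is none. */
static const char *model_log_peek(const struct fc_log_model *log,
				  unsigned int n)
{
	if (n >= log->head - log->tail)
		return NULL;
	return log->msgs[(log->tail + n) % LOG_SLOTS];
}
```

A reader would walk the ring with model_log_peek(log, 0), (log, 1) and so
on until it returns NULL.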
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The kern_mount_data() isn't used any more so remove it.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Convert the hugetlbfs to use the fs_context during mount.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Make the cpuset filesystem use the filesystem context. This is potentially
tricky as the cpuset fs is almost an alias for the cgroup filesystem, but
with some special parameters.
This can, however, be handled by setting up an appropriate cgroup
filesystem and returning the root directory of that as the root dir of this
one.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Make kernfs support superblock creation/mount/remount with fs_context.
This requires that sysfs, cgroup and intel_rdt, which are built on kernfs,
be made to support fs_context also.
Notes:
(1) A kernfs_fs_context struct is created to wrap fs_context and the
kernfs mount parameters are moved in here (or are in fs_context).
(2) kernfs_mount{,_ns}() are made into kernfs_get_tree(). The extra
namespace tag parameter is passed in the context if desired.
(3) kernfs_free_fs_context() is provided as a destructor for the
kernfs_fs_context struct, but for the moment it does nothing except
get called in the right places.
(4) sysfs doesn't wrap kernfs_fs_context since it has no parameters to
pass, but possibly this should be done anyway in case someone wants to
add a parameter in future.
(5) A cgroup_fs_context struct is created to wrap kernfs_fs_context and
the cgroup v1 and v2 mount parameters are all moved there.
(6) cgroup1 parameter parsing error messages are now handled by invalf(),
which allows userspace to collect them directly.
(7) cgroup1 parameter cleanup is now done in the context destructor rather
than in the mount/get_tree and remount functions.
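One way to picture the layering in notes (1), (5) and (7) is the userspace
sketch below. It is an assumption-laden model, not the kernel code: all of
the *_model types and model_* helpers are hypothetical, and it assumes the
generic context reaches the kernfs part through a private pointer with the
cgroup part recovered by container_of-style arithmetic. The destructor
frees the duplicated string, so the mount and remount paths need no
cleanup of their own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical generic context: only the private pointer is modelled. */
struct fs_context_model {
	void *fs_private;
};

/* Hypothetical kernfs layer holding the kernfs mount parameters. */
struct kernfs_fs_context_model {
	const void *ns_tag;
	unsigned long magic;
};

/* Hypothetical cgroup layer wrapping the kernfs layer. */
struct cgroup_fs_context_model {
	unsigned int flags;
	char *release_agent;	/* heap copy owned by the context */
	struct kernfs_fs_context_model kfc;
};

#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct cgroup_fs_context_model *
model_cgroup_fc2context(struct fs_context_model *fc)
{
	struct kernfs_fs_context_model *kfc = fc->fs_private;

	assert(kfc != NULL);
	return model_container_of(kfc, struct cgroup_fs_context_model, kfc);
}

static int model_cgroup_init_fs_context(struct fs_context_model *fc)
{
	struct cgroup_fs_context_model *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		return -1;
	fc->fs_private = &ctx->kfc;
	return 0;
}

/* Parsing stores a private copy of the string in the context. */
static int model_cgroup_set_release_agent(struct fs_context_model *fc,
					  const char *path)
{
	struct cgroup_fs_context_model *ctx = model_cgroup_fc2context(fc);
	size_t len = strlen(path) + 1;
	char *copy = malloc(len);

	if (!copy)
		return -1;
	memcpy(copy, path, len);
	free(ctx->release_agent);
	ctx->release_agent = copy;
	return 0;
}

/* Context destructor: the single place where parameters are cleaned up. */
static void model_cgroup_free_fs_context(struct fs_context_model *fc)
{
	struct cgroup_fs_context_model *ctx;

	if (!fc->fs_private)
		return;
	ctx = model_cgroup_fc2context(fc);
	free(ctx->release_agent);
	free(ctx);
	fc->fs_private = NULL;
}
```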
Weirdies:
(*) cgroup_do_get_tree() calls cset_cgroup_from_root() with locks held,
but then uses the resulting pointer after dropping the locks. I'm
told this is okay and needs commenting.
(*) The cgroup refcount web. This really needs documenting.
(*) cgroup2 only has one root?
Add a suggestion from Thomas Gleixner in which the RDT enablement code is
placed into its own function.
[folded a leak fix from Andrey Vagin]
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cc: Tejun Heo <tj@kernel.org>
cc: Li Zefan <lizefan@huawei.com>
cc: Johannes Weiner <hannes@cmpxchg.org>
cc: cgroups@vger.kernel.org
cc: fenghua.yu@intel.com
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
pass it fs_context instead of fs_type/flags/root triple, have
it return int instead of dentry and make it deal with setting
fc->root.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Note that this reference is *NOT* contributing to refcount of
cgroup_root in question and is valid only until cgroup_do_mount()
returns.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[again, carved out of patch by dhowells]
[NB: we probably want to handle "source" in parse_param here]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Store the results in cgroup_fs_context. There's a nasty twist caused
by the enabling/disabling subsystems - we can't do the checks sensitive
to that until cgroup_mutex gets grabbed. Frankly, these checks are
complete bullshit (e.g. all,none combination is accepted if all subsystems
are disabled; so's cpusets,none and all,cpusets when cpusets is disabled,
etc.), but touching that would be a userland-visible behaviour change ;-/
So we do parsing in ->parse_monolithic() and have the consistency checks
done in check_cgroupfs_options(), with the latter called (on already parsed
options) from cgroup1_get_tree() and cgroup1_reconfigure().
Freeing the strdup'ed strings is done from fs_context destructor, which
somewhat simplifies the life for cgroup1_{get_tree,reconfigure}().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Unfortunately, cgroup is tangled into kernfs infrastructure.
To avoid converting all kernfs-based filesystems at once,
we need to untangle the remount part of things, instead of
having it go through kernfs_sop_remount_fs(). Fortunately,
it's not hard to do.
This commit just gets cgroup/cgroup1 to use fs_context to
deliver options on mount and remount paths. Parsing those
is going to be done in the next commits; for now we do
pretty much what legacy case does.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Convert the mqueue filesystem to use the filesystem context stuff.
Notes:
(1) The relevant ipc namespace is selected when the context is
initialised (and it defaults to the current task's ipc namespace).
The caller can override this before calling vfs_get_tree().
(2) Rather than simply calling kern_mount_data(), mq_init_ns() and
mq_internal_mount() create a context, adjust it and then do the rest
of the mount procedure.
(3) The lazy mqueue mounting on creation of a new namespace is retained
from a previous patch, but the avoidance of sget() if no superblock
yet exists is reverted and the superblock is again keyed on the
namespace pointer.
Yes, there was a performance gain in not searching the superblock
hash, but it's only paid once per ipc namespace - and only if someone
uses mqueue within that namespace, so I'm not sure it's worth it,
especially as calling sget() allows avoidance of recursion.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add fs_context support to procfs.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Move proc_fill_super() to fs/proc/root.c as that's where the other
superblock stuff is.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com>
cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
new primitive: vfs_dup_fs_context(). Comes with fs_context
method (->dup()) for copying the filesystem-specific parts
of fs_context, along with LSM one (->fs_context_dup()) for
doing the same to LSM parts.
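The shape of such a primitive can be sketched in userspace as follows.
This is only a model under stated assumptions: struct fc_model, struct
fc_ops_model, the model_* helpers and the demo_* filesystem are
hypothetical, the LSM side is left out, and failure is reported with NULL
rather than an error pointer. The generic fields are copied, and the
filesystem's ->dup() method is then asked to copy its private part so that
the two contexts do not share it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct fc_model;

/* Hypothetical subset of the context operations. */
struct fc_ops_model {
	int (*dup)(struct fc_model *fc, const struct fc_model *src);
	void (*free)(struct fc_model *fc);
};

/* Hypothetical context: generic fields plus a filesystem-private pointer. */
struct fc_model {
	const struct fc_ops_model *ops;
	void *fs_private;
	unsigned int sb_flags;
};

/* Copy the generic fields, then let the filesystem copy its own part. */
static struct fc_model *model_dup_fs_context(const struct fc_model *src)
{
	struct fc_model *fc;

	assert(src != NULL);
	if (!src->ops || !src->ops->dup)
		return NULL;	/* duplication not supported by this fs */
	fc = malloc(sizeof(*fc));
	if (!fc)
		return NULL;
	*fc = *src;
	fc->fs_private = NULL;	/* never share the private part by accident */
	if (src->ops->dup(fc, src) < 0) {
		free(fc);
		return NULL;
	}
	return fc;
}

static void model_put_fs_context(struct fc_model *fc)
{
	if (fc->ops && fc->ops->free)
		fc->ops->free(fc);
	free(fc);
}

/* Demo filesystem whose private data is a heap-allocated option string. */
static int demo_dup(struct fc_model *fc, const struct fc_model *src)
{
	const char *s = src->fs_private;
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (!copy)
		return -1;
	memcpy(copy, s, len);
	fc->fs_private = copy;
	return 0;
}

static void demo_free(struct fc_model *fc)
{
	free(fc->fs_private);
	fc->fs_private = NULL;
}

static const struct fc_ops_model demo_ops = {
	.dup	= demo_dup,
	.free	= demo_free,
};
```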
[needs better commit message, and change of Author:, anyway]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
the former is an analogue of mount_{single,nodev} for use in
->get_tree() instances, the latter - analogue of sget() for the
same.
These are fairly similar to the originals, but the callback signature
for sget_fc() is different from sget() ones, so getting bits and
pieces shared would be too convoluted; we might get around to that
later, but for now let's just remember to keep them in sync. They
do live next to each other, and changes in either won't be hard
to spot.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[AV - unfuck kern_mount_data(); we want non-NULL ->mnt_ns on long-living
mounts]
[AV - reordering fs/namespace.c is badly overdue, but let's keep it
separate from that series]
[AV - drop simple_pin_fs() change]
[AV - clean vfs_kern_mount() failure exits up]
Implement a filesystem context concept to be used during superblock
creation for mount and superblock reconfiguration for remount.
The mounting procedure then becomes:
(1) Allocate new fs_context context.
(2) Configure the context.
(3) Create superblock.
(4) Query the superblock.
(5) Create a mount for the superblock.
(6) Destroy the context.
Rather than calling fs_type->mount(), an fs_context struct is created and
fs_type->init_fs_context() is called to set it up. Pointers exist for the
filesystem and LSM to hang their private data off.
A set of operations has to be set by ->init_fs_context() to provide
freeing, duplication, option parsing, binary data parsing, validation,
mounting and superblock filling.
Legacy filesystems are supported by the provision of a set of legacy
fs_context operations that build up a list of mount options and then invoke
fs_type->mount() from within the fs_context ->get_tree() operation. This
allows all filesystems to be accessed using fs_context.
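The legacy fallback can be pictured with the userspace sketch below. It is
a simplified model, not the code in this patch: every type and function
name in it is hypothetical, the option list is kept as a fixed-size
comma-separated string, and a mount is reduced to returning a positive
"root" number. A filesystem type without ->init_fs_context() gets the
legacy operations, which accumulate options and hand them to ->mount()
from within ->get_tree().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct fc_model;

/* Hypothetical subset of the context operations. */
struct fc_ops_model {
	int (*parse_param)(struct fc_model *fc, const char *key,
			   const char *val);
	int (*get_tree)(struct fc_model *fc);
};

/* Hypothetical filesystem type: new-style hook or legacy mount hook. */
struct fs_type_model {
	const char *name;
	int (*init_fs_context)(struct fc_model *fc);
	int (*mount)(const char *data);	/* returns a root id or < 0 */
};

struct fc_model {
	const struct fs_type_model *fs_type;
	const struct fc_ops_model *ops;
	char legacy_data[128];	/* accumulated "key=val,key" string */
	int root;		/* stands in for the root dentry */
};

/* Legacy parsing just appends the option to the monolithic string. */
static int legacy_parse_param(struct fc_model *fc, const char *key,
			      const char *val)
{
	size_t used = strlen(fc->legacy_data);
	size_t room = sizeof(fc->legacy_data) - used;
	int n = snprintf(fc->legacy_data + used, room, "%s%s%s%s",
			 used ? "," : "", key, val ? "=" : "", val ? val : "");

	if (n < 0 || (size_t)n >= room) {
		fc->legacy_data[used] = '\0';	/* undo the partial append */
		return -1;
	}
	return 0;
}

/* Legacy get_tree hands the accumulated options to the old mount hook. */
static int legacy_get_tree(struct fc_model *fc)
{
	int root = fc->fs_type->mount(fc->legacy_data);

	if (root < 0)
		return root;
	fc->root = root;
	return 0;
}

static const struct fc_ops_model legacy_ops = {
	.parse_param	= legacy_parse_param,
	.get_tree	= legacy_get_tree,
};

/* Set up a context, falling back to the legacy operations if needed. */
static int model_init_fs_context(struct fc_model *fc,
				 const struct fs_type_model *type)
{
	assert(type->init_fs_context || type->mount);
	memset(fc, 0, sizeof(*fc));
	fc->fs_type = type;
	if (type->init_fs_context)
		return type->init_fs_context(fc);
	fc->ops = &legacy_ops;
	return 0;
}

/* Demo legacy filesystem that records the option string it was given. */
static char demo_seen[128];

static int demo_legacy_mount(const char *data)
{
	snprintf(demo_seen, sizeof(demo_seen), "%s", data);
	return 1;
}

static const struct fs_type_model demo_legacy_fs = {
	.name	= "demo",
	.mount	= demo_legacy_mount,
};
```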
It should be noted that, whilst this patch adds a lot of lines of code,
there is quite a bit of duplication with existing code that can be
eliminated should all filesystems be converted over.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Put security flags, such as SECURITY_LSM_NATIVE_LABELS, into the filesystem
context so that the filesystem can communicate them to the LSM more easily.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Implement the new mount API LSM hooks for SELinux. At some point the old
hooks will need to be removed.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paul Moore <paul@paul-moore.com>
cc: Stephen Smalley <sds@tycho.nsa.gov>
cc: selinux@tycho.nsa.gov
cc: linux-security-module@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add LSM hooks for use by the new mount API and filesystem context code.
This includes:
(1) Hooks to handle allocation, duplication and freeing of the security
record attached to a filesystem context.
(2) A hook to snoop source specifications. There may be multiple of these
if the filesystem supports it. They will be local files/devices if
fs_context::source_is_dev is true and will be something else, possibly
remote server specifications, if false.
(3) A hook to snoop superblock configuration options in key[=val] form.
If the LSM decides it wants to handle it, it can suppress the option
being passed to the filesystem. Note that 'val' may include commas
and binary data with the fsopen patch.
(4) A hook to perform validation and allocation after the configuration
has been done but before the superblock is allocated and set up.
(5) A hook to transfer the security from the context to a newly created
superblock.
(6) A hook to rule on whether a path point can be used as a mountpoint.
These are intended to replace:
security_sb_copy_data
security_sb_kern_mount
security_sb_mount
security_sb_set_mnt_opts
security_sb_clone_mnt_opts
security_sb_parse_opts_str
[AV -- some of the methods being replaced are already gone, some of the
methods are not added for the lack of need]
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-security-module@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Because the new API passes in key,value parameters, match_token() cannot be
used with it. Instead, provide three new helpers to aid with parsing:
(1) fs_parse(). This takes a parameter and a simple static description of
all the parameters and maps the key name to an ID. It returns 1 on a
match, 0 on no match if unknowns should be ignored, and a negative
error code on a parse error.
The parameter description includes a list of key names to IDs, desired
parameter types and a list of enumeration name -> ID mappings.
[!] Note that for the moment I've required that the key->ID mapping
array be sorted and unterminated. The size of the
array is noted in the fsconfig_parser struct. This allows me to use
bsearch(), but I'm not sure any performance gain is worth the hassle
of requiring people to keep the array sorted.
The parameter type array is sized according to the number of parameter
IDs and is indexed directly. The optional enum mapping array is an
unterminated, unsorted list and the size goes into the fsconfig_parser
struct.
The function can do some additional things:
(a) If it's not ambiguous and no value is given, the prefix "no" on
a key name is permitted to indicate that the parameter should
be considered negatory.
(b) If the desired type is a single simple integer, it will perform
an appropriate conversion and store the result in a union in
the parse result.
(c) If the desired type is an enumeration, {key ID, name} will be
looked up in the enumeration list and the matching value will
be stored in the parse result union.
(d) Optionally generate an error if the key is unrecognised.
This is called something like:
enum rdt_param {
Opt_cdp,
Opt_cdpl2,
Opt_mba_mpbs,
nr__rdt_params
};
const struct fs_parameter_spec rdt_param_specs[nr__rdt_params] = {
[Opt_cdp] = { fs_param_is_bool },
[Opt_cdpl2] = { fs_param_is_bool },
[Opt_mba_mpbs] = { fs_param_is_bool },
};
const char *const rdt_param_keys[nr__rdt_params] = {
[Opt_cdp] = "cdp",
[Opt_cdpl2] = "cdpl2",
[Opt_mba_mpbs] = "mba_mbps",
};
const struct fs_parameter_description rdt_parser = {
.name = "rdt",
.nr_params = nr__rdt_params,
.keys = rdt_param_keys,
.specs = rdt_param_specs,
.no_source = true,
};
int rdt_parse_param(struct fs_context *fc,
struct fs_parameter *param)
{
struct fs_parse_result parse;
struct rdt_fs_context *ctx = rdt_fc2context(fc);
int ret;
ret = fs_parse(fc, &rdt_parser, param, &parse);
if (ret < 0)
return ret;
switch (parse.key) {
case Opt_cdp:
ctx->enable_cdpl3 = true;
return 0;
case Opt_cdpl2:
ctx->enable_cdpl2 = true;
return 0;
case Opt_mba_mpbs:
ctx->enable_mba_mbps = true;
return 0;
}
return -EINVAL;
}
(2) fs_lookup_param(). This takes a { dirfd, path, LOOKUP_EMPTY? } or
string value and performs an appropriate path lookup to convert it
into a path object, which it will then return.
If the desired type was a blockdev, the type of the looked up inode
will be checked to make sure it is one.
This can be used like:
enum foo_param {
Opt_source,
nr__foo_params
};
const struct fs_parameter_spec foo_param_specs[nr__foo_params] = {
[Opt_source] = { fs_param_is_blockdev },
};
const char *const foo_param_keys[nr__foo_params] = {
[Opt_source] = "source",
};
const struct constant_table foo_param_alt_keys[] = {
{ "device", Opt_source },
};
const struct fs_parameter_description foo_parser = {
.name = "foo",
.nr_params = nr__foo_params,
.nr_alt_keys = ARRAY_SIZE(foo_param_alt_keys),
.keys = foo_param_keys,
.alt_keys = foo_param_alt_keys,
.specs = foo_param_specs,
};
int foo_parse_param(struct fs_context *fc,
struct fs_parameter *param)
{
struct fs_parse_result parse;
struct foo_fs_context *ctx = foo_fc2context(fc);
int ret;
ret = fs_parse(fc, &foo_parser, param, &parse);
if (ret < 0)
return ret;
switch (parse.key) {
case Opt_source:
return fs_lookup_param(fc, &foo_parser, param,
&parse, &ctx->source);
default:
return -EINVAL;
}
}
(3) lookup_constant(). This takes a table of named constants and looks up
the given name within it. The table is expected to be sorted such
that bsearch() can be used upon it.
Possibly I should require the table be terminated and just use a
for-loop to scan it instead of using bsearch() to reduce hassle.
Tables look something like:
static const struct constant_table bool_names[] = {
{ "0", false },
{ "1", true },
{ "false", false },
{ "no", false },
{ "true", true },
{ "yes", true },
};
and a lookup is done with something like:
b = lookup_constant(bool_names, param->string, -1);
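To make the sorted-table requirement and the "no" prefix rule concrete,
here is a userspace sketch. It is not the helper code from this patch: the
*_model and demo_* names are hypothetical, the key table is expressed as a
constant table for brevity, and treating a bare boolean key as "true" is
an assumption. The return convention follows the description of fs_parse()
above (1 on a match, 0 for an ignored unknown key, negative on error).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct constant_table. */
struct constant_table_model {
	const char *name;
	int value;
};

static int cmp_name(const void *name, const void *entry)
{
	return strcmp(name, ((const struct constant_table_model *)entry)->name);
}

/* Binary-search a name-sorted table; return not_found on a miss. */
static int model_lookup_constant(const struct constant_table_model *tbl,
				 size_t nr, const char *name, int not_found)
{
	const struct constant_table_model *p;

	assert(tbl != NULL && name != NULL);
	p = bsearch(name, tbl, nr, sizeof(*tbl), cmp_name);
	return p ? p->value : not_found;
}

#define MODEL_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

enum demo_param { Opt_cdp, Opt_cdpl2, Opt_mba_mbps };

/* Both tables must stay sorted by name or bsearch() will miss entries. */
static const struct constant_table_model demo_keys[] = {
	{ "cdp",	Opt_cdp },
	{ "cdpl2",	Opt_cdpl2 },
	{ "mba_mbps",	Opt_mba_mbps },
};

static const struct constant_table_model demo_bool_names[] = {
	{ "0",		false },
	{ "1",		true },
	{ "false",	false },
	{ "no",		false },
	{ "true",	true },
	{ "yes",	true },
};

struct demo_parse_result {
	int key;
	bool boolean;
};

/* Map a key (and optional value) to a parameter ID and a boolean. */
static int demo_parse_bool_param(const char *key, const char *value,
				 bool ignore_unknown,
				 struct demo_parse_result *res)
{
	int id = model_lookup_constant(demo_keys, MODEL_ARRAY_SIZE(demo_keys),
				       key, -1);
	int b = 1;	/* assumption: a bare key means "true" */

	/* "nofoo" negates "foo" only if no value is given and "nofoo"
	 * is not itself a known key (the exact match above wins). */
	if (id < 0 && !value && strncmp(key, "no", 2) == 0) {
		id = model_lookup_constant(demo_keys,
					   MODEL_ARRAY_SIZE(demo_keys),
					   key + 2, -1);
		b = 0;
	}
	if (id < 0)
		return ignore_unknown ? 0 : -EINVAL;
	if (value) {
		b = model_lookup_constant(demo_bool_names,
					  MODEL_ARRAY_SIZE(demo_bool_names),
					  value, -1);
		if (b < 0)
			return -EINVAL;
	}
	res->key = id;
	res->boolean = b;
	return 1;
}
```

Doing the exact-match lookup first is what keeps the "no" prefix
unambiguous in this sketch: a key that genuinely starts with "no" is never
reinterpreted as a negation.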
Additionally, optional validation routines for the parameter description
are provided that can be enabled at compile time. A later patch will
invoke these when a filesystem is registered.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Introduce a set of logging functions through which informational messages,
warnings and error messages incurred by the mount procedure can be logged
and, in a future patch, passed to userspace instead by way of the
filesystem configuration context file descriptor.
There are four functions:
(1) infof(const char *fmt, ...);
Logs an informational message.
(2) warnf(const char *fmt, ...);
Logs a warning message.
(3) errorf(const char *fmt, ...);
Logs an error message.
(4) invalf(const char *fmt, ...);
As errorf(), but returns -EINVAL so it can be used on a return statement.
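The "usable on a return statement" property can be had with a comma
expression, as in the userspace sketch below. The model_* macros, the
last_msg buffer and the demo parser are hypothetical and only illustrate
the idea; where the real functions send their output is not modelled. The
##__VA_ARGS__ form is a GNU C extension.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sink: the most recent message is simply kept here. */
static char last_msg[128];

#define model_errorf(fmt, ...) \
	snprintf(last_msg, sizeof(last_msg), "e " fmt, ##__VA_ARGS__)

/* Log the error, then evaluate to -EINVAL so callers can return it. */
#define model_invalf(fmt, ...) \
	((void)model_errorf(fmt, ##__VA_ARGS__), -EINVAL)

/* Demo option parser: accepts a non-negative decimal number only. */
static int model_parse_size(const char *value, long *out)
{
	char *end;
	long v;

	assert(value != NULL && out != NULL);
	errno = 0;
	v = strtol(value, &end, 10);
	if (*value == '\0' || *end != '\0' || errno || v < 0)
		return model_invalf("demo: bad size '%s'", value);
	*out = v;
	return 0;
}
```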
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This is an eventual replacement for vfs_submount() uses. Unlike the
"mount" and "remount" cases, the users of that thing are not in VFS -
they are buried in various ->d_automount() instances and rather than
converting them all at once we introduce the (thankfully small and
simple) infrastructure here and deal with the prospective users in
afs, nfs, etc. parts of the series.
Here we just introduce a new constructor (fs_context_for_submount())
along with the corresponding enum constant to be put into fc->purpose
for those.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Replace do_remount_sb() with a function, reconfigure_super(), that's
fs_context aware. The fs_context is expected to be parameterised already
and have ->root pointing to the superblock to be reconfigured.
A legacy wrapper is provided that is intended to be called from the
fs_context ops when those appear, but for now is called directly from
reconfigure_super(). This wrapper invokes the ->remount_fs() superblock op
for the moment. It is intended that the remount_fs() op will be phased
out.
The fs_context->purpose is set to FS_CONTEXT_FOR_RECONFIGURE to indicate
that the context is being used for reconfiguration.
do_umount_root() is provided to consolidate remount-to-R/O for umount and
emergency remount by creating a context and invoking reconfiguration.
do_remount(), do_umount() and do_emergency_remount_callback() are switched
to use the new process.
[AV -- fold UMOUNT and EMERGENCY_REMOUNT in; fixes the
umount / bug, gets rid of pointless complexity]
[AV -- set ->net_ns in all cases; nfs remount will need that]
[AV -- shift security_sb_remount() call into reconfigure_super(); the callers
that didn't do security_sb_remount() have NULL fc->security anyway, so it's
a no-op for them]
Signed-off-by: David Howells <dhowells@redhat.com>
Co-developed-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Right now vfs_get_tree() calls security_sb_kern_mount() (i.e.
mount MAC) unless it gets MS_KERNMOUNT or MS_SUBMOUNT in flags.
Doing it that way is both clumsy and imprecise.
Consider the callers' tree of vfs_get_tree():
vfs_get_tree()
<- do_new_mount()
<- vfs_kern_mount()
<- simple_pin_fs()
<- vfs_submount()
<- kern_mount_data()
<- init_mount_tree()
<- btrfs_mount()
<- vfs_get_tree()
<- nfs_do_root_mount()
<- nfs4_try_mount()
<- nfs_fs_mount()
<- vfs_get_tree()
<- nfs4_referral_mount()
do_new_mount() always does need MAC (we are guaranteed that neither
MS_KERNMOUNT nor MS_SUBMOUNT will be passed there).
simple_pin_fs(), vfs_submount() and kern_mount_data() pass explicit
flags inhibiting that check. So does nfs4_referral_mount() (the
flags there are ultimately coming from vfs_submount()).
init_mount_tree() is called too early for anything LSM-related; it
doesn't matter whether we attempt those checks, they'll do nothing.
Finally, in case of btrfs_mount() and nfs_fs_mount(), doing MAC
is pointless - either the caller will do it, or the flags are
such that we wouldn't have done it either.
In other words, the one and only case when we want that check
done is when we are called from do_new_mount(), and there we
want it unconditionally.
So let's simply move it there. The superblock is still locked,
so nobody is going to get access to it (via ustat(2), etc.)
until we get a chance to apply the checks - we are free to
move them to any point up to where we drop ->s_umount (in
do_new_mount_fc()).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Create an fs_context-aware version of do_new_mount(). This takes an
fs_context with a superblock already attached to it.
Make do_new_mount() use do_new_mount_fc() rather than do_new_mount(); this
allows the consolidation of the mount creation, check and add steps.
To make this work, mount_too_revealing() is changed to take a superblock
rather than a mount (which the fs_context doesn't have available), allowing
this check to be done before the mount object is created.
Signed-off-by: David Howells <dhowells@redhat.com>
Co-developed-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Roll the handling of subtypes into do_new_mount() and vfs_get_tree(). The
former determines any subtype string and hangs it off the fs_context; the
latter applies it.
Make do_new_mount() create, parameterise and commit an fs_context and
create a mount for itself rather than calling vfs_kern_mount().
[AV -- missing kstrdup()]
[AV -- ... and no kstrdup() if we get to setting ->s_submount - we
simply transfer it from fc, leaving NULL behind]
[AV -- constify ->s_submount, while we are at it]
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Create a new helper, vfs_create_mount(), that creates a detached vfsmount
object from an fs_context that has a superblock attached to it.
Almost all uses will be paired with immediately preceding vfs_get_tree();
add a helper for such combination.
Switch vfs_kern_mount() to use this.
NOTE: mild behaviour change; passing NULL as 'device name' to
something like procfs will change /proc/*/mountstats - "device none"
instead of "no device". That is consistent with /proc/mounts et al.
[do'h - EXPORT_SYMBOL_GPL slipped in by mistake; removed]
[AV -- remove confused comment from vfs_create_mount()]
[AV -- removed the second argument]
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Introduce a filesystem context concept to be used during superblock
creation for mount and superblock reconfiguration for remount. This is
allocated at the beginning of the mount procedure and into it is placed:
(1) Filesystem type.
(2) Namespaces.
(3) Source/Device names (there may be multiple).
(4) Superblock flags (SB_*).
(5) Security details.
(6) Filesystem-specific data, as set by the mount options.
Accessor functions are then provided to set up a context, parameterise it
from monolithic mount data (the data page passed to mount(2)) and tear it
down again.
A legacy wrapper is provided that implements what will be the basic
operations, wrapping access to filesystems that aren't yet aware of the
fs_context.
Finally, vfs_kern_mount() is changed to make use of the fs_context and
mount_fs() is replaced by vfs_get_tree(), called from vfs_kern_mount().
[AV -- add missing kstrdup()]
[AV -- put_cred() can be unconditional - fc->cred can't be NULL]
[AV -- take legacy_validate() contents into legacy_parse_monolithic()]
[AV -- merge KERNEL_MOUNT and USER_MOUNT]
[AV -- don't unlock superblock on success return from vfs_get_tree()]
[AV -- kill 'reference' argument of init_fs_context()]
Signed-off-by: David Howells <dhowells@redhat.com>
Co-developed-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
mount_subtree() creates (and soon destroys) a temporary namespace,
so that automounts could function normally. These beasts should
never become anyone's current namespaces; they don't, but it would
be better to make prevention of that more straightforward. And
since they don't become anyone's current namespace, we don't need
to bother with reserving procfs inums for those.
Teach alloc_mnt_ns() to skip inum allocation if told so, adjust
put_mnt_ns() accordingly, make mount_subtree() use temporary
(anon) namespace. is_anon_ns() checks if a namespace is such.
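The bookkeeping involved can be modelled in userspace as below. This is a
sketch under loud assumptions: the *_model names are hypothetical, inum
allocation is reduced to a counter, and "anonymous" is represented by an
inum of zero, which need not be how is_anon_ns() decides in the kernel.
The point is only that allocation and release must agree on whether an
inum was ever reserved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical namespace model: inum 0 means "none was reserved". */
struct mnt_ns_model {
	unsigned int inum;
};

static unsigned int model_next_inum = 1;
static unsigned int model_inums_in_use;

/* Skip the inum reservation entirely for temporary namespaces. */
static struct mnt_ns_model *model_alloc_mnt_ns(bool anon)
{
	struct mnt_ns_model *ns = calloc(1, sizeof(*ns));

	if (!ns)
		return NULL;
	if (!anon) {
		ns->inum = model_next_inum++;
		model_inums_in_use++;
	}
	return ns;
}

static bool model_is_anon_ns(const struct mnt_ns_model *ns)
{
	return ns->inum == 0;
}

/* Release the inum only if one was reserved at allocation time. */
static void model_put_mnt_ns(struct mnt_ns_model *ns)
{
	assert(ns != NULL);
	if (!model_is_anon_ns(ns))
		model_inums_in_use--;
	free(ns);
}
```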
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* make the reference from superblock to cgroup_root counting -
do cgroup_put() in cgroup_kill_sb() whether we'd done
percpu_ref_kill() or not; matching grab is done when we allocate
a new root. That gives the same refcounting rules for all callers
of cgroup_do_mount() - a reference to cgroup_root has been grabbed
by caller and it either is transferred to new superblock or dropped.
* have cgroup_kill_sb() treat an already killed refcount as "just
don't bother killing it, then".
* after successful cgroup_do_mount() have cgroup1_mount() recheck
if we'd raced with mount/umount from somebody else and cgroup_root
got killed. In that case we drop the superblock and bugger off
with -ERESTARTSYS, same as if we'd found it in the list already
dying.
* don't bother with delayed initialization of refcount - it's
unreliable and not needed. No need to prevent attempts to bump
the refcount if we find cgroup_root of another mount in progress -
sget will reuse an existing superblock just fine and if the
other sb manages to die before we get there, we'll catch
that immediately after cgroup_do_mount().
* don't bother with kernfs_pin_sb() - no need for doing that
either.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
same story as with last May fixes in sysfs (7b745a4e40
"unfuck sysfs_mount()"); new_sb is left uninitialized
in case of early errors in kernfs_mount_ns() and papering
over it by treating any error from kernfs_mount_ns() as
equivalent to !new_ns ends up conflating the cases when
objects had never been transferred to a superblock with
ones when that has happened and resulting new superblock
had been dropped. Easily fixed (same way as in sysfs
case). Additionally, there's a superblock leak on
kernfs_node_dentry() failure *and* a dentry leak inside
kernfs_node_dentry() itself - the latter on probably
impossible errors, but the former not impossible to trigger
(as a matter of fact, injecting allocation failures
at that point *does* trigger it).
Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
UNAME26 is a mechanism to report Linux's version as 2.6.x, for
compatibility with old/broken software. Due to the way it is
implemented, it would have to be updated after 5.0, to keep the
resulting versions unique. Linus Torvalds argued:
"Do we actually need this?
I'd rather let it bitrot, and just let it return random versions. It
will just start again at 2.4.60, won't it?
Anybody who uses UNAME26 for a 5.x kernel might as well think it's
still 4.x. The user space is so old that it can't possibly care about
differences between 4.x and 5.x, can it?
The only thing that matters is that it shows "2.4.<largeenough>",
which it will do regardless"
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A bigger batch than I anticipated this week, for two reasons:
- Some fallout on Davinci from board file -> DTB conversion, that
also includes a few longer-standing fixes (i.e. not recent
regressions).
- drivers/reset material that has been in linux-next for a while, but
didn't get sent to us until now for a variety of reasons
(maintainer out sick, holidays, etc). There's a functional
dependency in there such that one platform (Altera's SoCFPGA) won't
boot without one of the patches; instead of reverting the patch
that got merged, I looked at this set and decided it was small
enough that I'll pick it up anyway. If you disagree I can revisit
with a smaller set.
That being said, there's also a handful of the usual stuff:
- Fix for a crash on Armada 7K/8K when the kernel touches
PSCI-reserved memory
- Fix for PCIe reset on Macchiatobin (Armada 8K development board,
what this email is sent from in fact :)
- Enable a few new-merged modules for Amlogic in arm64 defconfig
- Error path fixes on Integrator
- Build fix for Renesas and Qualcomm
- Initialization fix for Renesas RZ/G2E
.. plus a few more fixlets"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (28 commits)
ARM: integrator: impd1: use struct_size() in devm_kzalloc()
qcom-scm: Include <linux/err.h> header
gpio: pl061: handle failed allocations
ARM: dts: kirkwood: Fix polarity of GPIO fan lines
arm64: dts: marvell: mcbin: fix PCIe reset signal
arm64: dts: marvell: armada-ap806: reserve PSCI area
ARM: dts: da850-lcdk: Correct the sound card name
ARM: dts: da850-lcdk: Correct the audio codec regulators
ARM: dts: da850-evm: Correct the sound card name
ARM: dts: da850-evm: Correct the audio codec regulators
ARM: davinci: omapl138-hawk: fix label names in GPIO lookup entries
ARM: davinci: dm644x-evm: fix label names in GPIO lookup entries
ARM: davinci: dm355-evm: fix label names in GPIO lookup entries
ARM: davinci: da850-evm: fix label names in GPIO lookup entries
ARM: davinci: da830-evm: fix label names in GPIO lookup entries
arm64: defconfig: enable modules for amlogic s400 sound card
reset: uniphier-glue: Add AHCI reset control support in glue layer
dt-bindings: reset: uniphier: Add AHCI core reset description
reset: uniphier-usb3: Rename to reset-uniphier-glue
dt-bindings: reset: uniphier: Replace the expression of USB3 with generic peripherals
...
Merge tag 'for-5.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- two regression fixes in clone/dedupe ioctls, the generic check
callback needs to lock extents properly and wait for io to avoid
problems with writeback and relocation
- fix deadlock when using free space tree due to block group creation
- a recently added check refuses a valid filesystem with a seeding device,
make that work again with a quick fix; a proper solution needs more
intrusive changes
* tag 'for-5.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: Use real device structure to verify dev extent
Btrfs: fix deadlock when using free space tree due to block group creation
Btrfs: fix race between reflink/dedupe and relocation
Btrfs: fix race between cloning range ending at eof and writeback
Here is one small sysfs change, and a documentation update for 5.0-rc2
The sysfs change moves from using BUG_ON to WARN_ON, as discussed in an
email thread on lkml while trying to track down another driver bug.
sysfs should not be crashing and preventing people from seeing where
they went wrong. Now it properly recovers and warns the developer.
The documentation update removes the use of BUS_ATTR() as the kernel is
moving away from this to use the specific BUS_ATTR_RW() and friends
instead. There are pending patches in all of the different subsystems
to remove the last users of this macro, but for now, stop advertising it
so that no new users are introduced.
Both have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'driver-core-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here is one small sysfs change, and a documentation update for 5.0-rc2
The sysfs change moves from using BUG_ON to WARN_ON, as discussed in
an email thread on lkml while trying to track down another driver bug.
sysfs should not be crashing and preventing people from seeing where
they went wrong. Now it properly recovers and warns the developer.
The documentation update removes the use of BUS_ATTR() as the kernel
is moving away from this to use the specific BUS_ATTR_RW() and friends
instead. There are pending patches in all of the different subsystems
to remove the last users of this macro, but for now, stop advertising
it so that no new users are introduced.
Both have been in linux-next with no reported issues"
* tag 'driver-core-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
Documentation: driver core: remove use of BUS_ATTR
sysfs: convert BUG_ON to WARN_ON
Here are some small staging driver fixes for some reported issues.
One reverts a patch that was made to the rtl8723bs driver that turned
out to not be needed at all as it was a bug in clang. The others fix up
some reported issues in the rtl8188eu driver and update the MAINTAINERS
file to point to Larry for this driver so he can get the bug reports
more easily.
All have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'staging-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are some small staging driver fixes for some reported issues.
One reverts a patch that was made to the rtl8723bs driver that turned
out to not be needed at all as it was a bug in clang. The others fix
up some reported issues in the rtl8188eu driver and update the
MAINTAINERS file to point to Larry for this driver so he can get the
bug reports more easily.
All have been in linux-next with no reported issues"
* tag 'staging-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
Revert "staging: rtl8723bs: Mark ACPI table declaration as used"
staging: rtl8188eu: Fix module loading from tasklet for WEP encryption
staging: rtl8188eu: Fix module loading from tasklet for CCMP encryption
MAINTAINERS: Add entry for staging driver r8188eu
Here are 2 tty and serial fixes for 5.0-rc2 that resolve some reported
issues.
The first is a simple serial driver fix for a regression that showed up
in 5.0-rc1. The second one resolves a number of reported issues with
the recent tty locking fixes that went into 5.0-rc1. Lots of people
have tested the second one and say it resolves their issues.
Both have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'tty-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are 2 tty and serial fixes for 5.0-rc2 that resolve some reported
issues.
The first is a simple serial driver fix for a regression that showed
up in 5.0-rc1. The second one resolves a number of reported issues
with the recent tty locking fixes that went into 5.0-rc1. Lots of
people have tested the second one and say it resolves their issues.
Both have been in linux-next with no reported issues"
* tag 'tty-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: Don't hold ldisc lock in tty_reopen() if ldisc present
serial: lantiq: Do not swap register read/writes
Here are some small USB driver fixes and quirk updates for 5.0-rc2.
The majority here are some quirks for some storage devices to get them
to work properly. There's also a fix here to resolve the reported
issues with some audio devices that say they are UAC3 compliant, but
really are not.
And a fixup for the MAINTAINERS file to remove a dead URL.
All have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'usb-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB driver fixes and quirk updates for 5.0-rc2.
The majority here are some quirks for some storage devices to get them
to work properly. There's also a fix here to resolve the reported
issues with some audio devices that say they are UAC3 compliant, but
really are not.
And a fixup for the MAINTAINERS file to remove a dead URL.
All have been in linux-next with no reported issues"
* tag 'usb-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: storage: Remove outdated URL from MAINTAINERS
USB: Add USB_QUIRK_DELAY_CTRL_MSG quirk for Corsair K70 RGB
usbcore: Select only first configuration for non-UAC3 compliant devices
USB: storage: add quirk for SMI SM3350
USB: storage: don't insert sane sense for SPC3+ when bad sense specified
usb: cdc-acm: send ZLP for Telit 3G Intel based modems