I am about to remove the struct user_namespace reference from struct user_struct.
So keep an explicit track of the parent user namespace.
Take advantage of this new reference and replace instances of user_ns->creator->user_ns
with user_ns->parent.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
struct user_struct will shortly loose it's user_ns reference
so make the cred user_ns reference a proper reference complete
with reference counting.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Optimize performance and prepare for the removal of the user_ns reference
from user_struct. Remove the slow long walk through cred->user->user_ns and
instead go straight to cred->user_ns.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
It just bloats the audit data structure for no good reason, since the
only time those fields are filled are just before calling the
common_lsm_audit() function, which is also the only user of those
fields.
So just make them be the arguments to common_lsm_audit(), rather than
bloating that structure that is passed around everywhere, and is
initialized in hot paths.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of declaring the entire selinux_audit_data on the stack when we
start an operation on declare it on the stack if we are going to use it.
We know it's usefulness at the end of the security decision and can declare
it there.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After shrinking the common_audit_data stack usage for private LSM data I'm
not going to shrink the data union. To do this I'm going to move anything
larger than 2 void * ptrs to it's own structure and require it to be declared
separately on the calling stack. Thus hot paths which don't need more than
a couple pointer don't have to declare space to hold large unneeded
structures. I could get this down to one void * by dealing with the key
struct and the struct path. We'll see if that is helpful after taking care of
networking.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus found that the gigantic size of the common audit data caused a big
perf hit on something as simple as running stat() in a loop. This patch
requires LSMs to declare the LSM specific portion separately rather than
doing it in a union. Thus each LSM can be responsible for shrinking their
portion and don't have to pay a penalty just because other LSMs have a
bigger space requirement.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Convert debug, freezer, cpuset, cpu_cgroup, cpuacct, net_prio, blkio,
net_cls and device controllers to use the new cftype based interface.
Termination entry is added to cftype arrays and populate callbacks are
replaced with cgroup_subsys->base_cftypes initializations.
This is functionally identical transformation. There shouldn't be any
visible behavior change.
memcg is rather special and will be converted separately.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Vivek Goyal <vgoyal@redhat.com>
Pull second try at vfs part d#2 from Al Viro:
"Miklos' first series (with do_lookup() rewrite split into edible
chunks) + assorted bits and pieces.
The 'untangling of do_lookup()' series is is a splitup of what used to
be a monolithic patch from Miklos, so this series is basically "how do
I convince myself that his patch is correct (or find a hole in it)".
No holes found and I like the resulting cleanup, so in it went..."
Changes from try 1: Fix a boot problem with selinux, and commit messages
prettied up a bit.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (24 commits)
vfs: fix out-of-date dentry_unhash() comment
vfs: split __lookup_hash
untangling do_lookup() - take __lookup_hash()-calling case out of line.
untangling do_lookup() - switch to calling __lookup_hash()
untangling do_lookup() - merge d_alloc_and_lookup() callers
untangling do_lookup() - merge failure exits in !dentry case
untangling do_lookup() - massage !dentry case towards __lookup_hash()
untangling do_lookup() - get rid of need_reval in !dentry case
untangling do_lookup() - eliminate a loop.
untangling do_lookup() - expand the area under ->i_mutex
untangling do_lookup() - isolate !dentry stuff from the rest of it.
vfs: move MAY_EXEC check from __lookup_hash()
vfs: don't revalidate just looked up dentry
vfs: fix d_need_lookup/d_revalidate order in do_lookup
ext3: move headers to fs/ext3/
migrate ext2_fs.h guts to fs/ext2/ext2.h
new helper: ext2_image_size()
get rid of pointless includes of ext2_fs.h
ext2: No longer export ext2_fs.h to user space
mtdchar: kill persistently held vfsmount
...
Now that all the slow-path code is gone from these functions, we can
inline them into the main caller - avc_has_perm_flags().
Now the compiler can see that 'avc' is allocated on the stack for this
case, which helps register pressure a bit. It also actually shrinks the
total stack frame, because the stack frame that avc_has_perm_flags()
always needed (for that 'avc' allocation) is now sufficient for the
inlined functions too.
Inlining isn't bad - but mindless inlining of cold code (see the
previous commit) is.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The selinux AVC paths remain some of the hottest (and deepest) codepaths
at filename lookup time, and we make it worse by having the slow path
cases take up I$ and stack space even when they don't trigger. Gcc
tends to always want to inline functions that are just called once -
never mind that this might make for slower and worse code in the caller.
So this tries to improve on it a bit by making the slow-path cases
explicitly separate functions that are marked noinline, causing gcc to
at least no longer allocate stack space for them unless they are
actually called. It also seems to help register allocation a tiny bit,
since gcc now doesn't take the slow case code into account.
Uninlining the slow path may also allow us to inline the remaining hot
path into the one caller that actually matters: avc_has_perm_flags().
I'll have to look at that separately, but both avc_audit() and
avc_has_perm_noaudit() are now small and lean enough that inlining them
may make sense.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull x32 support for x86-64 from Ingo Molnar:
"This tree introduces the X32 binary format and execution mode for x86:
32-bit data space binaries using 64-bit instructions and 64-bit kernel
syscalls.
This allows applications whose working set fits into a 32 bits address
space to make use of 64-bit instructions while using a 32-bit address
space with shorter pointers, more compressed data structures, etc."
Fix up trivial context conflicts in arch/x86/{Kconfig,vdso/vma.c}
* 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
x32: Fix alignment fail in struct compat_siginfo
x32: Fix stupid ia32/x32 inversion in the siginfo format
x32: Add ptrace for x32
x32: Switch to a 64-bit clock_t
x32: Provide separate is_ia32_task() and is_x32_task() predicates
x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls
x86/x32: Fix the binutils auto-detect
x32: Warn and disable rather than error if binutils too old
x32: Only clear TIF_X32 flag once
x32: Make sure TS_COMPAT is cleared for x32 tasks
fs: Remove missed ->fds_bits from cessation use of fd_set structs internally
fs: Fix close_on_exec pointer in alloc_fdtable
x32: Drop non-__vdso weak symbols from the x32 VDSO
x32: Fix coding style violations in the x32 VDSO code
x32: Add x32 VDSO support
x32: Allow x32 to be configured
x32: If configured, add x32 system calls to system call tables
x32: Handle process creation
x32: Signal-related system calls
x86: Add #ifdef CONFIG_COMPAT to <asm/sys_ia32.h>
...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAT3NKzROxKuMESys7AQKElw/+JyDxJSlj+g+nymkx8IVVuU8CsEwNLgRk
8KEnRfLhGtkXFLSJYWO6jzGo16F8Uqli1PdMFte/wagSv0285/HZaKlkkBVHdJ/m
u40oSjgT013bBh6MQ0Oaf8pFezFUiQB5zPOA9QGaLVGDLXCmgqUgd7exaD5wRIwB
ZmyItjZeAVnDfk1R+ZiNYytHAi8A5wSB+eFDCIQYgyulA1Igd1UnRtx+dRKbvc/m
rWQ6KWbZHIdvP1ksd8wHHkrlUD2pEeJ8glJLsZUhMm/5oMf/8RmOCvmo8rvE/qwl
eDQ1h4cGYlfjobxXZMHqAN9m7Jg2bI946HZjdb7/7oCeO6VW3FwPZ/Ic75p+wp45
HXJTItufERYk6QxShiOKvA+QexnYwY0IT5oRP4DrhdVB/X9cl2MoaZHC+RbYLQy+
/5VNZKi38iK4F9AbFamS7kd0i5QszA/ZzEzKZ6VMuOp3W/fagpn4ZJT1LIA3m4A9
Q0cj24mqeyCfjysu0TMbPtaN+Yjeu1o1OFRvM8XffbZsp5bNzuTDEvviJ2NXw4vK
4qUHulhYSEWcu9YgAZXvEWDEM78FXCkg2v/CrZXH5tyc95kUkMPcgG+QZBB5wElR
FaOKpiC/BuNIGEf02IZQ4nfDxE90QwnDeoYeV+FvNj9UEOopJ5z5bMPoTHxm4cCD
NypQthI85pc=
=G9mT
-----END PGP SIGNATURE-----
Merge tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system
Pull "Disintegrate and delete asm/system.h" from David Howells:
"Here are a bunch of patches to disintegrate asm/system.h into a set of
separate bits to relieve the problem of circular inclusion
dependencies.
I've built all the working defconfigs from all the arches that I can
and made sure that they don't break.
The reason for these patches is that I recently encountered a circular
dependency problem that came about when I produced some patches to
optimise get_order() by rewriting it to use ilog2().
This uses bitops - and on the SH arch asm/bitops.h drags in
asm-generic/get_order.h by a circuituous route involving asm/system.h.
The main difficulty seems to be asm/system.h. It holds a number of
low level bits with no/few dependencies that are commonly used (eg.
memory barriers) and a number of bits with more dependencies that
aren't used in many places (eg. switch_to()).
These patches break asm/system.h up into the following core pieces:
(1) asm/barrier.h
Move memory barriers here. This already done for MIPS and Alpha.
(2) asm/switch_to.h
Move switch_to() and related stuff here.
(3) asm/exec.h
Move arch_align_stack() here. Other process execution related bits
could perhaps go here from asm/processor.h.
(4) asm/cmpxchg.h
Move xchg() and cmpxchg() here as they're full word atomic ops and
frequently used by atomic_xchg() and atomic_cmpxchg().
(5) asm/bug.h
Move die() and related bits.
(6) asm/auxvec.h
Move AT_VECTOR_SIZE_ARCH here.
Other arch headers are created as needed on a per-arch basis."
Fixed up some conflicts from other header file cleanups and moving code
around that has happened in the meantime, so David's testing is somewhat
weakened by that. We'll find out anything that got broken and fix it..
* tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits)
Delete all instances of asm/system.h
Remove all #inclusions of asm/system.h
Add #includes needed to permit the removal of asm/system.h
Move all declarations of free_initmem() to linux/mm.h
Disintegrate asm/system.h for OpenRISC
Split arch_align_stack() out from asm-generic/system.h
Split the switch_to() wrapper out of asm-generic/system.h
Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h
Create asm-generic/barrier.h
Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h
Disintegrate asm/system.h for Xtensa
Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt]
Disintegrate asm/system.h for Tile
Disintegrate asm/system.h for Sparc
Disintegrate asm/system.h for SH
Disintegrate asm/system.h for Score
Disintegrate asm/system.h for S390
Disintegrate asm/system.h for PowerPC
Disintegrate asm/system.h for PA-RISC
Disintegrate asm/system.h for MN10300
...
Remove all #inclusions of asm/system.h preparatory to splitting and killing
it. Performed with the following command:
perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`
Signed-off-by: David Howells <dhowells@redhat.com>
Fix failure in aa_change_onexec api when the request is made from a confined
task. This failure was caused by two problems
The AA_MAY_ONEXEC perm was not being mapped correctly for this case.
The executable name was being checked as second time instead of using the
requested onexec profile name, which may not be the same as the exec
profile name. This mistake can not be exploited to grant extra permission
because of the above flaw where the ONEXEC permission was not being mapped
so it will not be granted.
BugLink: http://bugs.launchpad.net/bugs/963756
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
selinux/xfrm.h needs to #include net/flow.h or else suffer:
In file included from security/selinux/ss/services.c:69:0:
security/selinux/include/xfrm.h: In function 'selinux_xfrm_notify_policyload':
security/selinux/include/xfrm.h:53:14: error: 'flow_cache_genid' undeclared (first use in this function)
security/selinux/include/xfrm.h:53:14: note: each undeclared identifier is reported only once for each function it appears in
Signed-off-by: David Howells <dhowells@redhat.com>
No functional changes. It is not sane to use UMH_KILLABLE with enum
umh_wait, but obviously we do not want another argument in
call_usermodehelper_* helpers. Kill this enum, use the plain int.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
New features include:
- Add NFS client support for containers.
This should enable most of the necessary functionality, including
lockd support, and support for rpc.statd, NFSv4 idmapper and
RPCSEC_GSS upcalls into the correct network namespace from
which the mount system call was issued.
- NFSv4 idmapper scalability improvements
Base the idmapper cache on the keyring interface to allow concurrent
access to idmapper entries. Start the process of migrating users from
the single-threaded daemon-based approach to the multi-threaded
request-key based approach.
- NFSv4.1 implementation id.
Allows the NFSv4.1 client and server to mutually identify each other
for logging and debugging purposes.
- Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of
having to use the more counterintuitive 'vers=4,minorversion=1'.
- SUNRPC tracepoints.
Start the process of adding tracepoints in order to improve debugging
of the RPC layer.
- pNFS object layout support for autologin.
Important bugfixes include:
- Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to fail
to wake up all tasks when applied to priority waitqueues.
- Ensure that we handle read delegations correctly, when we try to
truncate a file.
- A number of fixes for NFSv4 state manager loops (mostly to do with
delegation recovery).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJPalZbAAoJEGcL54qWCgDyCi4P+QHcmzQhJO7HWx3Pzjs67bFT
xMSYaKHGWS4AJKUBVl5OKBxUExfrMHBNbElV3IKUIwBlDx8RVtnwfptKSe146iki
dn4TrRO5es8nmI4hRDcGMlzJDZq4y0Qg//qiUFmojiNW/Avw0ljfMoVUejJJ09FV
oeDk4EGtcxkEyH+g48ZjYbyspRnG8qtD3atf70Z3lYE0ELdG/B5Dyzw1RDrA5p73
xJX3lqy8p/4ROzw/dmNoxdAXOrr3Q4/T58Bvp/lUglPy/EHyPmWzFoH0MU0C/PFu
5VnAl6QDbNCTcIw9FvJlX/mIyErpNG9eKzUskUc9L9SA+B+J/i4rIap4KATRN3nH
7QhE5qUacPuJnvxml7MPmlQTuft3fkAQ7NhKIWrbRi1QS9FmJC5NxctIb8loqlFn
yIXdKeLfMshB+NyuFS9uzStX7SmV3eMgVd+5ZxRjYxm+PKJLw2KXeudArL6M5mHK
3QeKZpqwaYQ3RfaTNpvAp0doiXHCO5UbWfI0Pe8xQs/QcMCNReffqV2G4IJKFAu6
WpoN2UDQC9LCBifLw2nS7kku8+ZVXLQU8OC1NVl3TG15xD9cNLXuk3/y5llPGq4O
odo52uLFpJohbDaHMj5RTKOfchTQCm2iyuVmxZEeAySypMSiAXmW7COSKHs/HxI1
VBm+EI00Pvmm5+fUjIlp
=LuHE
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates for Linux 3.4 from Trond Myklebust:
"New features include:
- Add NFS client support for containers.
This should enable most of the necessary functionality, including
lockd support, and support for rpc.statd, NFSv4 idmapper and
RPCSEC_GSS upcalls into the correct network namespace from which
the mount system call was issued.
- NFSv4 idmapper scalability improvements
Base the idmapper cache on the keyring interface to allow
concurrent access to idmapper entries. Start the process of
migrating users from the single-threaded daemon-based approach to
the multi-threaded request-key based approach.
- NFSv4.1 implementation id.
Allows the NFSv4.1 client and server to mutually identify each
other for logging and debugging purposes.
- Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of
having to use the more counterintuitive 'vers=4,minorversion=1'.
- SUNRPC tracepoints.
Start the process of adding tracepoints in order to improve
debugging of the RPC layer.
- pNFS object layout support for autologin.
Important bugfixes include:
- Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to
fail to wake up all tasks when applied to priority waitqueues.
- Ensure that we handle read delegations correctly, when we try to
truncate a file.
- A number of fixes for NFSv4 state manager loops (mostly to do with
delegation recovery)."
* tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (224 commits)
NFS: fix sb->s_id in nfs debug prints
xprtrdma: Remove assumption that each segment is <= PAGE_SIZE
xprtrdma: The transport should not bug-check when a dup reply is received
pnfs-obj: autologin: Add support for protocol autologin
NFS: Remove nfs4_setup_sequence from generic rename code
NFS: Remove nfs4_setup_sequence from generic unlink code
NFS: Remove nfs4_setup_sequence from generic read code
NFS: Remove nfs4_setup_sequence from generic write code
NFS: Fix more NFS debug related build warnings
SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefined
nfs: non void functions must return a value
SUNRPC: Kill compiler warning when RPC_DEBUG is unset
SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUG
NFS: Use cond_resched_lock() to reduce latencies in the commit scans
NFSv4: It is not safe to dereference lsp->ls_state in release_lockowner
NFS: ncommit count is being double decremented
SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up()
Try using machine credentials for RENEW calls
NFSv4.1: Fix a few issues in filelayout_commit_pagelist
NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code
...
avc_audit() did a lot of jumping around and had a big stack frame, all
for the uncommon case.
Split up the uncommon case (which we really can't make go fast anyway)
into its own slow function, and mark the conditional branches
appropriately for the common likely case.
This causes avc_audit() to no longer show up as one of the hottest
functions on the branch profiles (the new "perf -b" thing), and makes
the cycle profiles look really nice and dense too.
The whole audit path is still annoyingly very much one of the biggest
costs of name lookup, so these things are worth optimizing for. I wish
we could just tell people to turn it off, but realistically we do need
it: we just need to make sure that the overhead of the necessary evil is
as low as possible.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull vfs pile 1 from Al Viro:
"This is _not_ all; in particular, Miklos' and Jan's stuff is not there
yet."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits)
ext4: initialization of ext4_li_mtx needs to be done earlier
debugfs-related mode_t whack-a-mole
hfsplus: add an ioctl to bless files
hfsplus: change finder_info to u32
hfsplus: initialise userflags
qnx4: new helper - try_extent()
qnx4: get rid of qnx4_bread/qnx4_getblk
take removal of PF_FORKNOEXEC to flush_old_exec()
trim includes in inode.c
um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it
um: embed ->stub_pages[] into mmu_context
gadgetfs: list_for_each_safe() misuse
ocfs2: fix leaks on failure exits in module_init
ecryptfs: make register_filesystem() the last potential failure exit
ntfs: forgets to unregister sysctls on register_filesystem() failure
logfs: missing cleanup on register_filesystem() failure
jfs: mising cleanup on register_filesystem() failure
make configfs_pin_fs() return root dentry on success
configfs: configfs_create_dir() has parent dentry in dentry->d_parent
configfs: sanitize configfs_create()
...
Pull security subsystem updates for 3.4 from James Morris:
"The main addition here is the new Yama security module from Kees Cook,
which was discussed at the Linux Security Summit last year. Its
purpose is to collect miscellaneous DAC security enhancements in one
place. This also marks a departure in policy for LSM modules, which
were previously limited to being standalone access control systems.
Chromium OS is using Yama, and I believe there are plans for Ubuntu,
at least.
This patchset also includes maintenance updates for AppArmor, TOMOYO
and others."
Fix trivial conflict in <net/sock.h> due to the jumo_label->static_key
rename.
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (38 commits)
AppArmor: Fix location of const qualifier on generated string tables
TOMOYO: Return error if fails to delete a domain
AppArmor: add const qualifiers to string arrays
AppArmor: Add ability to load extended policy
TOMOYO: Return appropriate value to poll().
AppArmor: Move path failure information into aa_get_name and rename
AppArmor: Update dfa matching routines.
AppArmor: Minor cleanup of d_namespace_path to consolidate error handling
AppArmor: Retrieve the dentry_path for error reporting when path lookup fails
AppArmor: Add const qualifiers to generated string tables
AppArmor: Fix oops in policy unpack auditing
AppArmor: Fix error returned when a path lookup is disconnected
KEYS: testing wrong bit for KEY_FLAG_REVOKED
TOMOYO: Fix mount flags checking order.
security: fix ima kconfig warning
AppArmor: Fix the error case for chroot relative path name lookup
AppArmor: fix mapping of META_READ to audit and quiet flags
AppArmor: Fix underflow in xindex calculation
AppArmor: Fix dropping of allowed operations that are force audited
AppArmor: Add mising end of structure test to caps unpacking
...
Pull kmap_atomic cleanup from Cong Wang.
It's been in -next for a long time, and it gets rid of the (no longer
used) second argument to k[un]map_atomic().
Fix up a few trivial conflicts in various drivers, and do an "evil
merge" to catch some new uses that have come in since Cong's tree.
* 'kmap_atomic' of git://github.com/congwang/linux: (59 commits)
feature-removal-schedule.txt: schedule the deprecated form of kmap_atomic() for removal
highmem: kill all __kmap_atomic() [swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename]
drbd: remove the second argument of k[un]map_atomic()
zcache: remove the second argument of k[un]map_atomic()
gma500: remove the second argument of k[un]map_atomic()
dm: remove the second argument of k[un]map_atomic()
tomoyo: remove the second argument of k[un]map_atomic()
sunrpc: remove the second argument of k[un]map_atomic()
rds: remove the second argument of k[un]map_atomic()
net: remove the second argument of k[un]map_atomic()
mm: remove the second argument of k[un]map_atomic()
lib: remove the second argument of k[un]map_atomic()
power: remove the second argument of k[un]map_atomic()
kdb: remove the second argument of k[un]map_atomic()
udf: remove the second argument of k[un]map_atomic()
ubifs: remove the second argument of k[un]map_atomic()
squashfs: remove the second argument of k[un]map_atomic()
reiserfs: remove the second argument of k[un]map_atomic()
ocfs2: remove the second argument of k[un]map_atomic()
ntfs: remove the second argument of k[un]map_atomic()
...
Pull cgroup changes from Tejun Heo:
"Out of the 8 commits, one fixes a long-standing locking issue around
tasklist walking and others are cleanups."
* 'for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: Walk task list under tasklist_lock in cgroup_enable_task_cg_list
cgroup: Remove wrong comment on cgroup_enable_task_cg_list()
cgroup: remove cgroup_subsys argument from callbacks
cgroup: remove extra calls to find_existing_css_set
cgroup: replace tasklist_lock with rcu_read_lock
cgroup: simplify double-check locking in cgroup_attach_proc
cgroup: move struct cgroup_pidlist out from the header file
cgroup: remove cgroup_attach_task_current_cg()
Call sequence:
tomoyo_write_domain() --> tomoyo_delete_domain()
In 'tomoyo_delete_domain', return -EINTR if locking attempt is
interrupted by signal.
At present it returns success to its caller 'tomoyo_write_domain()'
even though domain is not deleted. 'tomoyo_write_domain()' assumes
domain is deleted and returns success to its caller. This is wrong behaviour.
'tomoyo_write_domain' should return error from tomoyo_delete_domain() to its
caller.
Signed-off-by: Santosh Nayak <santoshprasadnayak@gmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Add the base support for the new policy extensions. This does not bring
any additional functionality, or change current semantics.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
"struct file_operations"->poll() expects "unsigned int" return value.
All files in /sys/kernel/security/tomoyo/ directory other than
/sys/kernel/security/tomoyo/query and /sys/kernel/security/tomoyo/audit should
return POLLIN | POLLRDNORM | POLLOUT | POLLWRNORM rather than -ENOSYS.
Also, /sys/kernel/security/tomoyo/query and /sys/kernel/security/tomoyo/audit
should return POLLOUT | POLLWRNORM rather than 0 when there is no data to read.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Move the path name lookup failure messages into the main path name lookup
routine, as the information is useful in more than just aa_path_perm.
Also rename aa_get_name to aa_path_name as it is not getting a reference
counted object with a corresponding put fn.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
Update aa_dfa_match so that it doesn't result in an input string being
walked twice (once to get its length and another time to match)
Add a single step functions
aa_dfa_next
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
When __d_path and d_absolute_path fail due to the name being outside of
the current namespace no name is reported. Use dentry_path to provide
some hint as to which file was being accessed.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
Post unpacking of policy a verification pass is made on x transition
indexes. When this fails a call to audit_iface is made resulting in an
oops, because audit_iface is expecting a valid buffer position but
since the failure comes from post unpack verification there is none.
Make the position argument optional so that audit_iface can be called
from post unpack verification.
Signed-off-by: John Johansen <john.johansen@canonical.com>
The returning of -ESATLE when a path lookup fails as disconnected is wrong.
Since AppArmor is rejecting the access return -EACCES instead.
This also fixes a bug in complain (learning) mode where disconnected paths
are denied because -ESTALE errors are not ignored causing failures that
can change application behavior.
Signed-off-by: John Johansen <john.johansen@canonical.com>
The test for "if (cred->request_key_auth->flags & KEY_FLAG_REVOKED) {"
should actually testing that the (1 << KEY_FLAG_REVOKED) bit is set.
The current code actually checks for KEY_FLAG_DEAD.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
The keyctl_set_timeout function isn't exported to other parts of the
kernel, but I want to use it for the NFS idmapper. I already have the
key, but I wanted a generic way to set the timeout.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Userspace can pass in arbitrary combinations of MS_* flags to mount().
If both MS_BIND and one of MS_SHARED/MS_PRIVATE/MS_SLAVE/MS_UNBINDABLE are
passed, device name which should be checked for MS_BIND was not checked because
MS_SHARED/MS_PRIVATE/MS_SLAVE/MS_UNBINDABLE had higher priority than MS_BIND.
If both one of MS_BIND/MS_MOVE and MS_REMOUNT are passed, device name which
should not be checked for MS_REMOUNT was checked because MS_BIND/MS_MOVE had
higher priority than MS_REMOUNT.
Fix these bugs by changing priority to MS_REMOUNT -> MS_BIND ->
MS_SHARED/MS_PRIVATE/MS_SLAVE/MS_UNBINDABLE -> MS_MOVE as with do_mount() does.
Also, unconditionally return -EINVAL if more than one of
MS_SHARED/MS_PRIVATE/MS_SLAVE/MS_UNBINDABLE is passed so that TOMOYO will not
generate inaccurate audit logs, for commit 7a2e8a8f "VFS: Sanity check mount
flags passed to change_mnt_propagation()" clarified that these flags must be
exclusively passed.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Fix IMA kconfig warning on non-X86 architectures:
warning: (IMA) selects TCG_TIS which has unmet direct dependencies
(TCG_TPM && X86)
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Rajiv Andrade <srajiv@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
When a chroot relative pathname lookup fails it is falling through to
do a d_absolute_path lookup. This is incorrect as d_absolute_path should
only be used to lookup names for namespace absolute paths.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
The mapping of AA_MAY_META_READ for the allow mask was also being mapped
to the audit and quiet masks. This would result in some operations being
audited when the should not.
This flaw was hidden by the previous audit bug which would drop some
messages that where supposed to be audited.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
If the xindex value stored in the accept tables is 0, the extraction of
that value will result in an underflow (0 - 4).
In properly compiled policy this should not happen for file rules but
it may be possible for other rule types in the future.
To exploit this underflow a user would have to be able to load a corrupt
policy, which requires CAP_MAC_ADMIN, overwrite system policy in kernel
memory or know of a compiler error resulting in the flaw being present
for loaded policy (no such flaw is known at this time).
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
The audit permission flag, that specifies an audit message should be
provided when an operation is allowed, was being ignored in some cases.
This is because the auto audit mode (which determines the audit mode from
system flags) was incorrectly assigned the same value as audit mode. The
shared value would result in messages that should be audited going through
a second evaluation as to whether they should be audited based on the
auto audit, resulting in some messages being dropped.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
The unpacking of struct capsx is missing a check for the end of the
caps structure. This can lead to unpack failures depending on what else
is packed into the policy file being unpacked.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
Since the parser needs to know which rlimits are known to the kernel,
export the list via a mask file in the "rlimit" subdirectory in the
securityfs "features" directory.
Signed-off-by: Kees Cook <kees@ubuntu.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Create the "file" directory in the securityfs for tracking features
related to files.
Signed-off-by: Kees Cook <kees@ubuntu.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
This adds the "features" subdirectory to the AppArmor securityfs
to display boolean features flags and the known capability mask.
Signed-off-by: Kees Cook <kees@ubuntu.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Use a file tree structure to represent the AppArmor securityfs.
Signed-off-by: Kees Cook <kees@ubuntu.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Replace the fd_sets in struct fdtable with an array of unsigned longs and then
use the standard non-atomic bit operations rather than the FD_* macros.
This:
(1) Removes the abuses of struct fd_set:
(a) Since we don't want to allocate a full fd_set the vast majority of the
time, we actually, in effect, just allocate a just-big-enough array of
unsigned longs and cast it to an fd_set type - so why bother with the
fd_set at all?
(b) Some places outside of the core fdtable handling code (such as
SELinux) want to look inside the array of unsigned longs hidden inside
the fd_set struct for more efficient iteration over the entire set.
(2) Eliminates the use of FD_*() macros in the kernel completely.
(3) Permits the __FD_*() macros to be deleted entirely where not exposed to
userspace.
Signed-off-by: David Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/r/20120216174954.23314.48147.stgit@warthog.procyon.org.uk
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
The audit res field ususally indicates success with a 1 and 0 for a
failure. So make IMA do it the same way.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
For a process to entirely disable Yama ptrace restrictions, it can use
the special PR_SET_PTRACER_ANY pid to indicate that any otherwise allowed
process may ptrace it. This is stronger than calling PR_SET_PTRACER with
pid "1" because it includes processes in external pid namespaces. This is
currently needed by the Chrome renderer, since its crash handler (Breakpad)
runs external to the renderer's pid namespace.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <jmorris@namei.org>
Collapse security_vm_enough_memory() variants into a single function.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Morris <jmorris@namei.org>
This adds the Yama Linux Security Module to collect DAC security
improvements (specifically just ptrace restrictions for now) that have
existed in various forms over the years and have been carried outside the
mainline kernel by other Linux distributions like Openwall and grsecurity.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
The current LSM interface to cred_free is not sufficient for allowing
an LSM to track the life and death of a task. This patch adds the
task_free hook so that an LSM can clean up resources on task death.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <jmorris@namei.org>
The argument is not used at all, and it's not necessary, because
a specific callback handler of course knows which subsys it
belongs to.
Now only ->pupulate() takes this argument, because the handlers of
this callback always call cgroup_add_file()/cgroup_add_files().
So we reduce a few lines of code, though the shrinking of object size
is minimal.
16 files changed, 113 insertions(+), 162 deletions(-)
text data bss dec hex filename
5486240 656987 7039960 13183187 c928d3 vmlinux.o.orig
5486170 656987 7039960 13183117 c9288d vmlinux.o
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
* git://git.samba.org/sfrench/cifs-2.6:
CIFS: Rename *UCS* functions to *UTF16*
[CIFS] ACL and FSCACHE support no longer EXPERIMENTAL
[CIFS] Fix build break with multiuser patch when LANMAN disabled
cifs: warn about impending deprecation of legacy MultiuserMount code
cifs: fetch credentials out of keyring for non-krb5 auth multiuser mounts
cifs: sanitize username handling
keys: add a "logon" key type
cifs: lower default wsize when unix extensions are not used
cifs: better instrumentation for coalesce_t2
cifs: integer overflow in parse_dacl()
cifs: Fix sparse warning when calling cifs_strtoUCS
CIFS: Add descriptions to the brlock cache functions
Replace the rcu_assign_pointer() calls with rcu_assign_keypointer().
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Fix ima_policy.c sparse "warning: dereference of noderef expression"
message, by accessing cred->uid using current_cred().
Changelog v1:
- Change __cred to just cred (based on David Howell's comment)
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
The kernel contains some special internal keyrings, for instance the DNS
resolver keyring :
2a93faf1 I----- 1 perm 1f030000 0 0 keyring .dns_resolver: empty
It would occasionally be useful to allow the contents of such keyrings to be
flushed by root (cache invalidation).
Allow a flag to be set on a keyring to mark that someone possessing the
sysadmin capability can clear the keyring, even without normal write access to
the keyring.
Set this flag on the special keyrings created by the DNS resolver, the NFS
identity mapper and the CIFS identity mapper.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
For CIFS, we want to be able to store NTLM credentials (aka username
and password) in the keyring. We do not, however want to allow users
to fetch those keys back out of the keyring since that would be a
security risk.
Unfortunately, due to the nuances of key permission bits, it's not
possible to do this. We need to grant search permissions so the kernel
can find these keys, but that also implies permissions to read the
payload.
Resolve this by adding a new key_type. This key type is essentially
the same as key_type_user, but does not define a .read op. This
prevents the payload from ever being visible from userspace. This
key type also vets the description to ensure that it's "qualified"
by checking to ensure that it has a ':' in it that is preceded by
other characters.
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit: (29 commits)
audit: no leading space in audit_log_d_path prefix
audit: treat s_id as an untrusted string
audit: fix signedness bug in audit_log_execve_info()
audit: comparison on interprocess fields
audit: implement all object interfield comparisons
audit: allow interfield comparison between gid and ogid
audit: complex interfield comparison helper
audit: allow interfield comparison in audit rules
Kernel: Audit Support For The ARM Platform
audit: do not call audit_getname on error
audit: only allow tasks to set their loginuid if it is -1
audit: remove task argument to audit_set_loginuid
audit: allow audit matching on inode gid
audit: allow matching on obj_uid
audit: remove audit_finish_fork as it can't be called
audit: reject entry,always rules
audit: inline audit_free to simplify the look of generic code
audit: drop audit_set_macxattr as it doesn't do anything
audit: inline checks for not needing to collect aux records
audit: drop some potentially inadvisable likely notations
...
Use evil merge to fix up grammar mistakes in Kconfig file.
Bad speling and horrible grammar (and copious swearing) is to be
expected, but let's keep it to commit messages and comments, rather than
expose it to users in config help texts or printouts.
Similar to SIGNATURE, rename INTEGRITY_DIGSIG to INTEGRITY_SIGNATURE.
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: James Morris <jmorris@namei.org>
It was reported that DIGSIG is confusing name for digital signature
module. It was suggested to rename DIGSIG to SIGNATURE.
Requested-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: James Morris <jmorris@namei.org>
Enabling CONFIG_PROVE_RCU and CONFIG_SPARSE_RCU_POINTER resulted in
"suspicious rcu_dereference_check() usage!" and "incompatible types
in comparison expression (different address spaces)" messages.
Access the masterkey directly when holding the rwsem.
Changelog v1:
- Use either rcu_read_lock()/rcu_derefence_key()/rcu_read_unlock()
or remove the unnecessary rcu_derefence() - David Howells
Reported-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Define rcu_assign_keypointer(), which uses the key payload.rcudata instead
of payload.data, to resolve the CONFIG_SPARSE_RCU_POINTER message:
"incompatible types in comparison expression (different address spaces)"
Replace the rcu_assign_pointer() calls in encrypted/trusted keys with
rcu_assign_keypointer().
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Add missing smp_rmb() primitives to the keyring search code.
When keyring payloads are appended to without replacement (thus using up spare
slots in the key pointer array), an smp_wmb() is issued between the pointer
assignment and the increment of the key count (nkeys).
There should be corresponding read barriers between the read of nkeys and
dereferences of keys[n] when n is dependent on the value of nkeys.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
TOMOYO 2.5 in Linux 3.2 and later handles Unix domain socket's address.
Thus, tomoyo_correct_word2() needs to accept \000 as a valid character, or
TOMOYO 2.5 cannot handle Unix domain's abstract socket address.
Reported-by: Steven Allen <steven@stebalien.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
CC: stable@vger.kernel.org [3.2+]
Signed-off-by: James Morris <jmorris@namei.org>
audit_log_d_path() injects an additional space before the prefix,
which serves no purpose and doesn't mix well with other audit_log*()
functions that do not sneak extra characters into the log.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
The use of s_id should go through the untrusted string path, just to be
extra careful.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security:
capabilities: remove __cap_full_set definition
security: remove the security_netlink_recv hook as it is equivalent to capable()
ptrace: do not audit capability check when outputing /proc/pid/stat
capabilities: remove task_ns_* functions
capabitlies: ns_capable can use the cap helpers rather than lsm call
capabilities: style only - move capable below ns_capable
capabilites: introduce new has_ns_capabilities_noaudit
capabilities: call has_ns_capability from has_capability
capabilities: remove all _real_ interfaces
capabilities: introduce security_capable_noaudit
capabilities: reverse arguments to security_capable
capabilities: remove the task from capable LSM hook entirely
selinux: sparse fix: fix several warnings in the security server cod
selinux: sparse fix: fix warnings in netlink code
selinux: sparse fix: eliminate warnings for selinuxfs
selinux: sparse fix: declare selinux_disable() in security.h
selinux: sparse fix: move selinux_complete_init
selinux: sparse fix: make selinux_secmark_refcount static
SELinux: Fix RCU deref check warning in sel_netport_insert()
Manually fix up a semantic mis-merge wrt security_netlink_recv():
- the interface was removed in commit fd77846152 ("security: remove
the security_netlink_recv hook as it is equivalent to capable()")
- a new user of it appeared in commit a38f7907b9 ("crypto: Add
userspace configuration API")
causing no automatic merge conflict, but Eric Paris pointed out the
issue.
module_param(bool) used to counter-intuitively take an int. In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.
It's time to remove the int/unsigned int option. For this version
it'll simply give a warning, but it'll break next kernel version.
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security: (32 commits)
ima: fix invalid memory reference
ima: free duplicate measurement memory
security: update security_file_mmap() docs
selinux: Casting (void *) value returned by kmalloc is useless
apparmor: fix module parameter handling
Security: tomoyo: add .gitignore file
tomoyo: add missing rcu_dereference()
apparmor: add missing rcu_dereference()
evm: prevent racing during tfm allocation
evm: key must be set once during initialization
mpi/mpi-mpow: NULL dereference on allocation failure
digsig: build dependency fix
KEYS: Give key types their own lockdep class for key->sem
TPM: fix transmit_cmd error logic
TPM: NSC and TIS drivers X86 dependency fix
TPM: Export wait_for_stat for other vendor specific drivers
TPM: Use vendor specific function for status probe
tpm_tis: add delay after aborting command
tpm_tis: Check return code from getting timeouts/durations
tpm: Introduce function to poll for result of self test
...
Fix up trivial conflict in lib/Makefile due to addition of CONFIG_MPI
and SIGSIG next to CONFIG_DQL addition.
inode needs to be fully set up before we feed it to d_instantiate().
securityfs_create_file() does *not* do so; it sets ->i_fop and
->i_private only after we'd exposed the inode. Unfortunately,
that's done fairly deep in call chain, so the amount of churn
is considerable. Helper functions killed by substituting into
their solitary call sites, dead code removed. We finally can
bury default_file_ops, now that the final value of ->i_fop is
available (and assigned) at the point where inode is allocated.
Reviewed-by: James Morris <jmorris@namei.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits)
cgroup: fix to allow mounting a hierarchy by name
cgroup: move assignement out of condition in cgroup_attach_proc()
cgroup: Remove task_lock() from cgroup_post_fork()
cgroup: add sparse annotation to cgroup_iter_start() and cgroup_iter_end()
cgroup: mark cgroup_rmdir_waitq and cgroup_attach_proc() as static
cgroup: only need to check oldcgrp==newgrp once
cgroup: remove redundant get/put of task struct
cgroup: remove redundant get/put of old css_set from migrate
cgroup: Remove unnecessary task_lock before fetching css_set on migration
cgroup: Drop task_lock(parent) on cgroup_fork()
cgroups: remove redundant get/put of css_set from css_set_check_fetched()
resource cgroups: remove bogus cast
cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task()
cgroup, cpuset: don't use ss->pre_attach()
cgroup: don't use subsys->can_attach_task() or ->attach_task()
cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
cgroup: improve old cgroup handling in cgroup_attach_proc()
cgroup: always lock threadgroup during migration
threadgroup: extend threadgroup_lock() to cover exit and exec
threadgroup: rename signal->threadgroup_fork_lock to ->group_rwsem
...
Fix up conflict in kernel/cgroup.c due to commit e0197aae59e5: "cgroups:
fix a css_set not found bug in cgroup_attach_proc" that already
mentioned that the bug is fixed (differently) in Tejun's cgroup
patchset. This one, in other words.
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
reiserfs: Properly display mount options in /proc/mounts
vfs: prevent remount read-only if pending removes
vfs: count unlinked inodes
vfs: protect remounting superblock read-only
vfs: keep list of mounts for each superblock
vfs: switch ->show_options() to struct dentry *
vfs: switch ->show_path() to struct dentry *
vfs: switch ->show_devname() to struct dentry *
vfs: switch ->show_stats to struct dentry *
switch security_path_chmod() to struct path *
vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
vfs: trim includes a bit
switch mnt_namespace ->root to struct mount
vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
vfs: opencode mntget() mnt_set_mountpoint()
vfs: spread struct mount - remaining argument of next_mnt()
vfs: move fsnotify junk to struct mount
vfs: move mnt_devname
vfs: move mnt_list to struct mount
vfs: switch pnode.h macros to struct mount *
...
Once upon a time netlink was not sync and we had to get the effective
capabilities from the skb that was being received. Today we instead get
the capabilities from the current task. This has rendered the entire
purpose of the hook moot as it is now functionally equivalent to the
capable() call.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reading /proc/pid/stat of another process checks if one has ptrace permissions
on that process. If one does have permissions it outputs some data about the
process which might have security and attack implications. If the current
task does not have ptrace permissions the read still works, but those fields
are filled with inocuous (0) values. Since this check and a subsequent denial
is not a violation of the security policy we should not audit such denials.
This can be quite useful to removing ptrace broadly across a system without
flooding the logs when ps is run or something which harmlessly walks proc.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
The name security_real_capable and security_real_capable_noaudit just don't
make much sense to me. Convert them to use security_capable and
security_capable_noaudit.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Exactly like security_capable except don't audit any denials. This is for
places where the kernel may make decisions about what to do if a task has a
given capability, but which failing that capability is not a sign of a
security policy violation. An example is checking if a task has
CAP_SYS_ADMIN to lower it's likelyhood of being killed by the oom killer.
This check is not a security violation if it is denied.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
security_capable takes ns, cred, cap. But the LSM capable() hook takes
cred, ns, cap. The capability helper functions also take cred, ns, cap.
Rather than flip argument order just to flip it back, leave them alone.
Heck, this should be a little faster since argument will be in the right
place!
Signed-off-by: Eric Paris <eparis@redhat.com>
The capabilities framework is based around credentials, not necessarily the
current task. Yet we still passed the current task down into LSMs from the
security_capable() LSM hook as if it was a meaningful portion of the security
decision. This patch removes the 'generic' passing of current and instead
forces individual LSMs to use current explicitly if they think it is
appropriate. In our case those LSMs are SELinux and AppArmor.
I believe the AppArmor use of current is incorrect, but that is wholely
unrelated to this patch. This patch does not change what AppArmor does, it
just makes it clear in the AppArmor code that it is doing it.
The SELinux code still uses current in it's audit message, which may also be
wrong and needs further investigation. Again this is NOT a change, it may
have always been wrong, this patch just makes it clear what is happening.
Signed-off-by: Eric Paris <eparis@redhat.com>
Fix several sparse warnings in the SELinux security server code.
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
vfs_create() ignores everything outside of 16bit subset of its
mode argument; switching it to umode_t is obviously equivalent
and it's the only caller of the method
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
vfs_mkdir() gets int, but immediately drops everything that might not
fit into umode_t and that's the only caller of ->mkdir()...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
it's not needed anymore; we used to, back when we had to do
mount_subtree() by hand, complete with put_mnt_ns() in it.
No more... Apparmor didn't need it since the __d_path() fix.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
tomoyo/realpath.c needs exactly one include - that of common.h. It pulls
everything the thing needs, without doing ridiculous garbage such as trying
to include ../../fs/internal.h. If that alone doesn't scream "layering
violation", I don't know what does; and these days it's all for nothing,
since it fortunately does not use any symbols defined in there...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Commit 1e39f384bb ("evm: fix build problems") makes the stub version
of security_old_inode_init_security() return 0 when CONFIG_SECURITY is
not set.
But that makes callers such as reiserfs_security_init() assume that
security_old_inode_init_security() has set name, value, and len
arguments properly - but security_old_inode_init_security() left them
uninitialized which then results in interesting failures.
Revert security_old_inode_init_security() to the old behavior of
returning EOPNOTSUPP since both callers (reiserfs and ocfs2) handle this
just fine.
[ Also fixed the S_PRIVATE(inode) case of the actual non-stub
security_old_inode_init_security() function to return EOPNOTSUPP
for the same reason, as pointed out by Mimi Zohar.
It got incorrectly changed to match the new function in commit
fb88c2b6cbb1: "evm: fix security/security_old_init_security return
code". - Linus ]
Reported-by: Jorge Bastos <mysql.jorge@decimal.pt>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
net/bluetooth/l2cap_core.c
Just two overlapping changes, one added an initialization of
a local variable, and another change added a new local variable.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following bug in sel_netport_insert() where rcu_dereference() should
be rcu_dereference_protected() as sel_netport_lock is held.
===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
security/selinux/netport.c:127 invoked rcu_dereference_check() without protection!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 0
1 lock held by ossec-rootcheck/3323:
#0: (sel_netport_lock){+.....}, at: [<ffffffff8117d775>] sel_netport_sid+0xbb/0x226
stack backtrace:
Pid: 3323, comm: ossec-rootcheck Not tainted 3.1.0-rc8-fsdevel+ #1095
Call Trace:
[<ffffffff8105cfb7>] lockdep_rcu_dereference+0xa7/0xb0
[<ffffffff8117d871>] sel_netport_sid+0x1b7/0x226
[<ffffffff8117d6ba>] ? sel_netport_avc_callback+0xbc/0xbc
[<ffffffff8117556c>] selinux_socket_bind+0x115/0x230
[<ffffffff810a5388>] ? might_fault+0x4e/0x9e
[<ffffffff810a53d1>] ? might_fault+0x97/0x9e
[<ffffffff81171cf4>] security_socket_bind+0x11/0x13
[<ffffffff812ba967>] sys_bind+0x56/0x95
[<ffffffff81380dac>] ? sysret_check+0x27/0x62
[<ffffffff8105b767>] ? trace_hardirqs_on_caller+0x11e/0x155
[<ffffffff81076fcd>] ? audit_syscall_entry+0x17b/0x1ae
[<ffffffff811b5eae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff81380d7b>] system_call_fastpath+0x16/0x1b
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
There is a small chance of racing during tfm allocation.
This patch fixes it.
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
On multi-core systems, setting of the key before every caclculation,
causes invalid HMAC calculation for other tfm users, because internal
state (ipad, opad) can be invalid before set key call returns.
It needs to be set only once during initialization.
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Info about new measurements are cached in the iint for performance. When
the inode is flushed from cache, the associated iint is flushed as well.
Subsequent access to the inode will cause the inode to be re-measured and
will attempt to add a duplicate entry to the measurement list.
This patch frees the duplicate measurement memory, fixing a memory leak.
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Cc: stable@vger.kernel.org
The semantic patch that makes this change is available
in scripts/coccinelle/api/alloc/drop_kmalloc_cast.cocci.
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: James Morris <jmorris@namei.org>
The 'aabool' wrappers actually pass off to the 'bool' parse functions,
so you should use the same check function. Similarly for aauint and
uint.
(Note that 'bool' module parameters also allow 'int', which is why you
got away with this, but that's changing very soon.)
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
Currently, there's no way to pass multiple tasks to cgroup_subsys
methods necessitating the need for separate per-process and per-task
methods. This patch introduces cgroup_taskset which can be used to
pass multiple tasks and their associated cgroups to cgroup_subsys
methods.
Three methods - can_attach(), cancel_attach() and attach() - are
converted to use cgroup_taskset. This unifies passed parameters so
that all methods have access to all information. Conversions in this
patchset are identical and don't introduce any behavior change.
-v2: documentation updated as per Paul Menage's suggestion.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Menage <paul@paulmenage.org>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: James Morris <jmorris@namei.org>
This adds the .gitignore file for the autogenerated TOMOYO files to keep
git from complaining after building things.
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Adds a missed rcu_dereference() around real_parent.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
Current tomoyo_realpath_from_path() implementation returns strange pathname
when calculating pathname of a file which belongs to lazy unmounted tree.
Use local pathname rather than strange absolute pathname in that case.
Also, this patch fixes a regression by commit 02125a82 "fix apparmor
dereferencing potentially freed dentry, sanitize __d_path() API".
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is a small chance of racing during tfm allocation.
This patch fixes it.
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
On multi-core systems, setting of the key before every caclculation,
causes invalid HMAC calculation for other tfm users, because internal
state (ipad, opad) can be invalid before set key call returns.
It needs to be set only once during initialization.
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
__d_path() API is asking for trouble and in case of apparmor d_namespace_path()
getting just that. The root cause is that when __d_path() misses the root
it had been told to look for, it stores the location of the most remote ancestor
in *root. Without grabbing references. Sure, at the moment of call it had
been pinned down by what we have in *path. And if we raced with umount -l, we
could have very well stopped at vfsmount/dentry that got freed as soon as
prepend_path() dropped vfsmount_lock.
It is safe to compare these pointers with pre-existing (and known to be still
alive) vfsmount and dentry, as long as all we are asking is "is it the same
address?". Dereferencing is not safe and apparmor ended up stepping into
that. d_namespace_path() really wants to examine the place where we stopped,
even if it's not connected to our namespace. As the result, it looked
at ->d_sb->s_magic of a dentry that might've been already freed by that point.
All other callers had been careful enough to avoid that, but it's really
a bad interface - it invites that kind of trouble.
The fix is fairly straightforward, even though it's bigger than I'd like:
* prepend_path() root argument becomes const.
* __d_path() is never called with NULL/NULL root. It was a kludge
to start with. Instead, we have an explicit function - d_absolute_root().
Same as __d_path(), except that it doesn't get root passed and stops where
it stops. apparmor and tomoyo are using it.
* __d_path() returns NULL on path outside of root. The main
caller is show_mountinfo() and that's precisely what we pass root for - to
skip those outside chroot jail. Those who don't want that can (and do)
use d_path().
* __d_path() root argument becomes const. Everyone agrees, I hope.
* apparmor does *NOT* try to use __d_path() or any of its variants
when it sees that path->mnt is an internal vfsmount. In that case it's
definitely not mounted anywhere and dentry_path() is exactly what we want
there. Handling of sysctl()-triggered weirdness is moved to that place.
* if apparmor is asked to do pathname relative to chroot jail
and __d_path() tells it we it's not in that jail, the sucker just calls
d_absolute_path() instead. That's the other remaining caller of __d_path(),
BTW.
* seq_path_root() does _NOT_ return -ENAMETOOLONG (it's stupid anyway -
the normal seq_file logics will take care of growing the buffer and redoing
the call of ->show() just fine). However, if it gets path not reachable
from root, it returns SEQ_SKIP. The only caller adjusted (i.e. stopped
ignoring the return value as it used to do).
Reviewed-by: John Johansen <john.johansen@canonical.com>
ACKed-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
The ultimate goal is to get the sock_diag module, that works in
family+protocol terms. Currently this is suitable to do on the
inet_diag basis, so rename parts of the code. It will be moved
to sock_diag.c later.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While parsing through IPv6 extension headers, fragment headers are
skipped making them invisible to the caller. This reports the
fragment offset of the last header in order to make it possible to
determine whether the packet is fragmented and, if so whether it is
a first or last fragment.
Signed-off-by: Jesse Gross <jesse@nicira.com>
C assignment can handle struct in6_addr copying.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix build errors by adding Kconfig dependency on KEYS.
CRYPTO dependency removed.
CC security/integrity/digsig.o
security/integrity/digsig.c: In function ?integrity_digsig_verify?:
security/integrity/digsig.c:38:4: error: implicit declaration of function ?request_key?
security/integrity/digsig.c:38:17: error: ?key_type_keyring? undeclared (first use in this function)
security/integrity/digsig.c:38:17: note: each undeclared identifier is reported only once for each function it appears in
make[2]: *** [security/integrity/digsig.o] Error 1
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: James Morris <jmorris@namei.org>
Give keys their own lockdep class to differentiate them from each other in case
a key of one type has to refer to a key of another type.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Encrypted keys are encrypted/decrypted using either a trusted or
user-defined key type, which is referred to as the 'master' key.
The master key may be of type trusted iff the trusted key is
builtin or both the trusted key and encrypted keys are built as
modules. This patch resolves the build dependency problem.
- Use "masterkey-$(CONFIG_TRUSTED_KEYS)-$(CONFIG_ENCRYPTED_KEYS)" construct
to encapsulate the above logic. (Suggested by Dimtry Kasatkin.)
- Fixing the encrypted-keys Makefile, results in a module name change
from encrypted.ko to encrypted-keys.ko.
- Add module dependency for request_trusted_key() definition
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Eric Paris <eparis@parisplace.org>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <jmorris@namei.org>