Commit Graph

4790 Commits

Author SHA1 Message Date
Linus Torvalds b274279a0b One patch for making it possible to execute usermode driver's path.
tomoyo: Loosen pathname/domainname validation.
 
  security/tomoyo/util.c |   29 +++++++++++++++++++++++------
  1 file changed, 23 insertions(+), 6 deletions(-)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJfhDj0AAoJEEJfEo0MZPUqKBYP/in6mwhoEs8EczqSJ04cRYf9
 zD1BRwubpwsiQgD6TBxHcJ1Kplpd1vp+IRtLYCG38HDZlLjRnSMKDn6dOUEwi3bf
 Uc8cBZ4FgUSxlFIyBoUjiG5IbUSndFbigJrbrMno8xKEedUcnoSntDTYsfDjuHAL
 5G6yBgCGZd3UI57Utgt+se73eptdseTLtGlCd/fD6tfZJpwiWpBlRdqM3C6PVAS5
 M9hFblYTVcR7mh1zB2fGxclgX0PIB9l8eq24yrWqOMyGaQP1C7aFuoonTxJbh295
 g4Ea5jLBmZkvjE0L1Wxu9WFLBfdepNVnKoDLayKasLFIl4OLoWUad1R+ALb5RhgM
 6pVlaJO7FjSKOc1gTq3R8WGUI0xUhP3BEwRKOron0x5n2CDew0+qZ0GIntv9K27A
 mSzk4US4T2rwKUp6L5kwsCSvh2GEvYrEdKACz8Ey0hqLSJgoUgowISh5+dpCNSd/
 GosDlZ1m35MyINT827YTVqzg7KN+HGrH/dt/vfHXCAprpcr33RBS/UniUqERA5y4
 aNHY+bwy2/uw6UUmpwwNuAVM97jIGQAfr13CxCOztCo4oXos4ksFU/BEp2fh6rJV
 5zW1+QoEQOKuDNvRT+KeFPxXunX45FcRceh7SumFvGA0vk9X6kMS5I1B2gmcMqIT
 3BZYCqGPm73o0xzhE3Xf
 =onIp
 -----END PGP SIGNATURE-----

Merge tag 'tomoyo-pr-20201012' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1

Pull tomoyo fix from Tetsuo HandaL
 "One patch to make it possible to execute usermode-driver's path"

* tag 'tomoyo-pr-20201012' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
  tomoyo: Loosen pathname/domainname validation.
2020-10-13 16:10:37 -07:00
Thomas Cedeno 03ca0ec138 LSM: SafeSetID: Fix warnings reported by test bot
Fix multiple cast-to-union warnings related to casting kuid_t and kgid_t
types to kid_t union type. Also fix incompatible type warning that
arises from accidental omission of "__rcu" qualifier on the struct
setid_ruleset pointer in the argument list for safesetid_file_read().

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Cedeno <thomascedeno@google.com>
Signed-off-by: Micah Morton <mortonm@chromium.org>
2020-10-13 09:17:36 -07:00
Thomas Cedeno 5294bac97e LSM: SafeSetID: Add GID security policy handling
The SafeSetID LSM has functionality for restricting setuid() calls based
on its configured security policies. This patch adds the analogous
functionality for setgid() calls. This is mostly a copy-and-paste change
with some code deduplication, plus slight modifications/name changes to
the policy-rule-related structs (now contain GID rules in addition to
the UID ones) and some type generalization since SafeSetID now needs to
deal with kgid_t and kuid_t types.

Signed-off-by: Thomas Cedeno <thomascedeno@google.com>
Signed-off-by: Micah Morton <mortonm@chromium.org>
2020-10-13 09:17:35 -07:00
Linus Torvalds 39a5101f98 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Allow DRBG testing through user-space af_alg
   - Add tcrypt speed testing support for keyed hashes
   - Add type-safe init/exit hooks for ahash

  Algorithms:
   - Mark arc4 as obsolete and pending for future removal
   - Mark anubis, khazad, sead and tea as obsolete
   - Improve boot-time xor benchmark
   - Add OSCCA SM2 asymmetric cipher algorithm and use it for integrity

  Drivers:
   - Fixes and enhancement for XTS in caam
   - Add support for XIP8001B hwrng in xiphera-trng
   - Add RNG and hash support in sun8i-ce/sun8i-ss
   - Allow imx-rngc to be used by kernel entropy pool
   - Use crypto engine in omap-sham
   - Add support for Ingenic X1830 with ingenic"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (205 commits)
  X.509: Fix modular build of public_key_sm2
  crypto: xor - Remove unused variable count in do_xor_speed
  X.509: fix error return value on the failed path
  crypto: bcm - Verify GCM/CCM key length in setkey
  crypto: qat - drop input parameter from adf_enable_aer()
  crypto: qat - fix function parameters descriptions
  crypto: atmel-tdes - use semicolons rather than commas to separate statements
  crypto: drivers - use semicolons rather than commas to separate statements
  hwrng: mxc-rnga - use semicolons rather than commas to separate statements
  hwrng: iproc-rng200 - use semicolons rather than commas to separate statements
  hwrng: stm32 - use semicolons rather than commas to separate statements
  crypto: xor - use ktime for template benchmarking
  crypto: xor - defer load time benchmark to a later time
  crypto: hisilicon/zip - fix the uninitalized 'curr_qm_qp_num'
  crypto: hisilicon/zip - fix the return value when device is busy
  crypto: hisilicon/zip - fix zero length input in GZIP decompress
  crypto: hisilicon/zip - fix the uncleared debug registers
  lib/mpi: Fix unused variable warnings
  crypto: x86/poly1305 - Remove assignments with no effect
  hwrng: npcm - modify readl to readb
  ...
2020-10-13 08:50:16 -07:00
Linus Torvalds 85ed13e78d Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull compat iovec cleanups from Al Viro:
 "Christoph's series around import_iovec() and compat variant thereof"

* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  security/keys: remove compat_keyctl_instantiate_key_iov
  mm: remove compat_process_vm_{readv,writev}
  fs: remove compat_sys_vmsplice
  fs: remove the compat readv/writev syscalls
  fs: remove various compat readv/writev helpers
  iov_iter: transparently handle compat iovecs in import_iovec
  iov_iter: refactor rw_copy_check_uvector and import_iovec
  iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c
  compat.h: fix a spelling error in <linux/compat.h>
2020-10-12 16:35:51 -07:00
Linus Torvalds e6412f9833 EFI changes for v5.10:
- Preliminary RISC-V enablement - the bulk of it will arrive via the RISCV tree.
 
  - Relax decompressed image placement rules for 32-bit ARM
 
  - Add support for passing MOK certificate table contents via a config table
    rather than a EFI variable.
 
  - Add support for 18 bit DIMM row IDs in the CPER records.
 
  - Work around broken Dell firmware that passes the entire Boot#### variable
    contents as the command line
 
  - Add definition of the EFI_MEMORY_CPU_CRYPTO memory attribute so we can
    identify it in the memory map listings.
 
  - Don't abort the boot on arm64 if the EFI RNG protocol is available but
    returns with an error
 
  - Replace slashes with exclamation marks in efivarfs file names
 
  - Split efi-pstore from the deprecated efivars sysfs code, so we can
    disable the latter on !x86.
 
  - Misc fixes, cleanups and updates.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+Ec9QRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1inTQ//TYj3kJq/7sWfUAxmAsWnUEC005YCNf0T
 x3kJQv3zYX4Rl4eEwkff8S1PrqqvwUP5yUZYApp8HD9s9CYvzz5iG5xtf/jX+QaV
 06JnTMnkoycx2NaOlbr1cmcIn4/cAhQVYbVCeVrlf7QL8enNTBr5IIQmo4mgP8Lc
 mauSsO1XU8ZuMQM+JcZSxAkAPxlhz3dbR5GteP4o2K4ShQKpiTCOfOG1J3FvUYba
 s1HGnhHFlkQr6m3pC+iG8dnAG0YtwHMH1eJVP7mbeKUsMXz944U8OVXDWxtn81pH
 /Xt/aFZXnoqwlSXythAr6vFTuEEn40n+qoOK6jhtcGPUeiAFPJgiaeAXw3gO0YBe
 Y8nEgdGfdNOMih94McRd4M6gB/N3vdqAGt+vjiZSCtzE+nTWRyIXSGCXuDVpkvL4
 VpEXpPINnt1FZZ3T/7dPro4X7pXALhODE+pl36RCbfHVBZKRfLV1Mc1prAUGXPxW
 E0MfaM9TxDnVhs3VPWlHmRgavee2MT1Tl/ES4CrRHEoz8ZCcu4MfROQyao8+Gobr
 VR+jVk+xbyDrykEc6jdAK4sDFXpTambuV624LiKkh6Mc4yfHRhPGrmP5c5l7SnCd
 aLp+scQ4T7sqkLuYlXpausXE3h4sm5uur5hNIRpdlvnwZBXpDEpkzI8x0C9OYr0Q
 kvFrreQWPLQ=
 =ZNI8
 -----END PGP SIGNATURE-----

Merge tag 'efi-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull EFI changes from Ingo Molnar:

 - Preliminary RISC-V enablement - the bulk of it will arrive via the
   RISCV tree.

 - Relax decompressed image placement rules for 32-bit ARM

 - Add support for passing MOK certificate table contents via a config
   table rather than a EFI variable.

 - Add support for 18 bit DIMM row IDs in the CPER records.

 - Work around broken Dell firmware that passes the entire Boot####
   variable contents as the command line

 - Add definition of the EFI_MEMORY_CPU_CRYPTO memory attribute so we
   can identify it in the memory map listings.

 - Don't abort the boot on arm64 if the EFI RNG protocol is available
   but returns with an error

 - Replace slashes with exclamation marks in efivarfs file names

 - Split efi-pstore from the deprecated efivars sysfs code, so we can
   disable the latter on !x86.

 - Misc fixes, cleanups and updates.

* tag 'efi-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
  efi: mokvar: add missing include of asm/early_ioremap.h
  efi: efivars: limit availability to X86 builds
  efi: remove some false dependencies on CONFIG_EFI_VARS
  efi: gsmi: fix false dependency on CONFIG_EFI_VARS
  efi: efivars: un-export efivars_sysfs_init()
  efi: pstore: move workqueue handling out of efivars
  efi: pstore: disentangle from deprecated efivars module
  efi: mokvar-table: fix some issues in new code
  efi/arm64: libstub: Deal gracefully with EFI_RNG_PROTOCOL failure
  efivarfs: Replace invalid slashes with exclamation marks in dentries.
  efi: Delete deprecated parameter comments
  efi/libstub: Fix missing-prototypes in string.c
  efi: Add definition of EFI_MEMORY_CPU_CRYPTO and ability to report it
  cper,edac,efi: Memory Error Record: bank group/address and chip id
  edac,ghes,cper: Add Row Extension to Memory Error Record
  efi/x86: Add a quirk to support command line arguments on Dell EFI firmware
  efi/libstub: Add efi_warn and *_once logging helpers
  integrity: Load certs from the EFI MOK config table
  integrity: Move import of MokListRT certs to a separate routine
  efi: Support for MOK variable config table
  ...
2020-10-12 13:26:49 -07:00
Tetsuo Handa a207516776 tomoyo: Loosen pathname/domainname validation.
Since commit e2dc9bf3f5 ("umd: Transform fork_usermode_blob into
fork_usermode_driver") started calling execve() on a program written in
a local mount which is not connected to mount tree,
tomoyo_realpath_from_path() started returning a pathname in
"$fsname:/$pathname" format which violates TOMOYO's domainname rule that
it must start with "<$namespace>" followed by zero or more repetitions of
pathnames which start with '/'.

Since $fsname must not contain '.' since commit 79c0b2df79 ("add
filesystem subtype support"), tomoyo_correct_path() can recognize a token
which appears '/' before '.' appears (e.g. proc:/self/exe ) as a pathname
while rejecting a token which appears '.' before '/' appears (e.g.
exec.realpath="/bin/bash" ) as a condition parameter.

Therefore, accept domainnames which contain pathnames which do not start
with '/' but contain '/' before '.' (e.g. <kernel> tmpfs:/bpfilter_umh ).

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2020-10-12 19:53:34 +09:00
Casey Schaufler edd615371b Smack: Remove unnecessary variable initialization
The initialization of rc in smack_from_netlbl() is pointless.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-10-05 14:20:51 -07:00
Kees Cook 0fa8e08464 fs/kernel_file_read: Add "offset" arg for partial reads
To perform partial reads, callers of kernel_read_file*() must have a
non-NULL file_size argument and a preallocated buffer. The new "offset"
argument can then be used to seek to specific locations in the file to
fill the buffer to, at most, "buf_size" per call.

Where possible, the LSM hooks can report whether a full file has been
read or not so that the contents can be reasoned about.

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20201002173828.2099543-14-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:37:04 +02:00
Scott Branden 34736daeec IMA: Add support for file reads without contents
When the kernel_read_file LSM hook is called with contents=false, IMA
can appraise the file directly, without requiring a filled buffer. When
such a buffer is available, though, IMA can continue to use it instead
of forcing a double read here.

Signed-off-by: Scott Branden <scott.branden@broadcom.com>
Link: https://lore.kernel.org/lkml/20200706232309.12010-10-scott.branden@broadcom.com/
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-13-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:37:04 +02:00
Kees Cook 2039bda1fa LSM: Add "contents" flag to kernel_read_file hook
As with the kernel_load_data LSM hook, add a "contents" flag to the
kernel_read_file LSM hook that indicates whether the LSM can expect
a matching call to the kernel_post_read_file LSM hook with the full
contents of the file. With the coming addition of partial file read
support for kernel_read_file*() API, the LSM will no longer be able
to always see the entire contents of a file during the read calls.

For cases where the LSM must read examine the complete file contents,
it will need to do so on its own every time the kernel_read_file
hook is called with contents=false (or reject such cases). Adjust all
existing LSMs to retain existing behavior.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-12-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:37:03 +02:00
Kees Cook 4f2d99b06b firmware_loader: Use security_post_load_data()
Now that security_post_load_data() is wired up, use it instead
of the NULL file argument style of security_post_read_file(),
and update the security_kernel_load_data() call to indicate that a
security_kernel_post_load_data() call is expected.

Wire up the IMA check to match earlier logic. Perhaps a generalized
change to ima_post_load_data() might look something like this:

    return process_buffer_measurement(buf, size,
                                      kernel_load_data_id_str(load_id),
                                      read_idmap[load_id] ?: FILE_CHECK,
                                      0, NULL);

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-10-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:37:03 +02:00
Kees Cook b64fcae74b LSM: Introduce kernel_post_load_data() hook
There are a few places in the kernel where LSMs would like to have
visibility into the contents of a kernel buffer that has been loaded or
read. While security_kernel_post_read_file() (which includes the
buffer) exists as a pairing for security_kernel_read_file(), no such
hook exists to pair with security_kernel_load_data().

Earlier proposals for just using security_kernel_post_read_file() with a
NULL file argument were rejected (i.e. "file" should always be valid for
the security_..._file hooks, but it appears at least one case was
left in the kernel during earlier refactoring. (This will be fixed in
a subsequent patch.)

Since not all cases of security_kernel_load_data() can have a single
contiguous buffer made available to the LSM hook (e.g. kexec image
segments are separately loaded), there needs to be a way for the LSM to
reason about its expectations of the hook coverage. In order to handle
this, add a "contents" argument to the "kernel_load_data" hook that
indicates if the newly added "kernel_post_load_data" hook will be called
with the full contents once loaded. That way, LSMs requiring full contents
can choose to unilaterally reject "kernel_load_data" with contents=false
(which is effectively the existing hook coverage), but when contents=true
they can allow it and later evaluate the "kernel_post_load_data" hook
once the buffer is loaded.

With this change, LSMs can gain coverage over non-file-backed data loads
(e.g. init_module(2) and firmware userspace helper), which will happen
in subsequent patches.

Additionally prepare IMA to start processing these cases.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-9-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:37:03 +02:00
Kees Cook 885352881f fs/kernel_read_file: Add file_size output argument
In preparation for adding partial read support, add an optional output
argument to kernel_read_file*() that reports the file size so callers
can reason more easily about their reading progress.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-8-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:37:03 +02:00
Kees Cook 113eeb5177 fs/kernel_read_file: Switch buffer size arg to size_t
In preparation for further refactoring of kernel_read_file*(), rename
the "max_size" argument to the more accurate "buf_size", and correct
its type to size_t. Add kerndoc to explain the specifics of how the
arguments will be used. Note that with buf_size now size_t, it can no
longer be negative (and was never called with a negative value). Adjust
callers to use it as a "maximum size" when *buf is NULL.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-7-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:34:19 +02:00
Kees Cook f7a4f689bc fs/kernel_read_file: Remove redundant size argument
In preparation for refactoring kernel_read_file*(), remove the redundant
"size" argument which is not needed: it can be included in the return
code, with callers adjusted. (VFS reads already cannot be larger than
INT_MAX.)

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-6-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:34:18 +02:00
Scott Branden b89999d004 fs/kernel_read_file: Split into separate include file
Move kernel_read_file* out of linux/fs.h to its own linux/kernel_read_file.h
include file. That header gets pulled in just about everywhere
and doesn't really need functions not related to the general fs interface.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Link: https://lore.kernel.org/r/20200706232309.12010-2-scott.branden@broadcom.com
Link: https://lore.kernel.org/r/20201002173828.2099543-4-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:34:18 +02:00
Kees Cook c307459b9d fs/kernel_read_file: Remove FIRMWARE_PREALLOC_BUFFER enum
FIRMWARE_PREALLOC_BUFFER is a "how", not a "what", and confuses the LSMs
that are interested in filtering between types of things. The "how"
should be an internal detail made uninteresting to the LSMs.

Fixes: a098ecd2fa ("firmware: support loading into a pre-allocated buffer")
Fixes: fd90bc559b ("ima: based on policy verify firmware signatures (pre-allocated buffer)")
Fixes: 4f0496d8ff ("ima: based on policy warn about loading firmware (pre-allocated buffer)")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201002173828.2099543-2-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:34:18 +02:00
Christoph Hellwig 5d47b39479 security/keys: remove compat_keyctl_instantiate_key_iov
Now that import_iovec handles compat iovecs, the native version of
keyctl_instantiate_key_iov can be used for the compat case as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03 00:02:16 -04:00
Christoph Hellwig 89cd35c58b iov_iter: transparently handle compat iovecs in import_iovec
Use in compat_syscall to import either native or the compat iovecs, and
remove the now superflous compat_import_iovec.

This removes the need for special compat logic in most callers, and
the remaining ones can still be simplified by using __import_iovec
with a bool compat parameter.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03 00:02:13 -04:00
Tianjia Zhang 0b7e44d39c integrity: Asymmetric digsig supports SM2-with-SM3 algorithm
Asymmetric digsig supports SM2-with-SM3 algorithm combination,
so that IMA can also verify SM2's signature data.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Tested-by: Xufeng Zhang <yunbo.xufeng@linux.alibaba.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Vitaly Chikunov <vt@altlinux.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-09-25 17:48:55 +10:00
David S. Miller 3ab0a7a0c3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Two minor conflicts:

1) net/ipv4/route.c, adding a new local variable while
   moving another local variable and removing it's
   initial assignment.

2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes.
   One pretty prints the port mode differently, whilst another
   changes the driver to try and obtain the port mode from
   the port node rather than the switch node.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-22 16:45:34 -07:00
Casey Schaufler bf0afe673b Smack: Fix build when NETWORK_SECMARK is not set
Use proper conditional compilation for the secmark field in
the network skb.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-09-22 14:59:31 -07:00
KP Singh aa662fc04f ima: Fix NULL pointer dereference in ima_file_hash
ima_file_hash can be called when there is no iint->ima_hash available
even though the inode exists in the integrity cache. It is fairly
common for a file to not have a hash. (e.g. an mknodat, prior to the
file being closed).

Another example where this can happen (suggested by Jann Horn):

Process A does:

	while(1) {
		unlink("/tmp/imafoo");
		fd = open("/tmp/imafoo", O_RDWR|O_CREAT|O_TRUNC, 0700);
		if (fd == -1) {
			perror("open");
			continue;
		}
		write(fd, "A", 1);
		close(fd);
	}

and Process B does:

	while (1) {
		int fd = open("/tmp/imafoo", O_RDONLY);
		if (fd == -1)
			continue;
    		char *mapping = mmap(NULL, 0x1000, PROT_READ|PROT_EXEC,
			 	     MAP_PRIVATE, fd, 0);
		if (mapping != MAP_FAILED)
			munmap(mapping, 0x1000);
		close(fd);
  	}

Due to the race to get the iint->mutex between ima_file_hash and
process_measurement iint->ima_hash could still be NULL.

Fixes: 6beea7afcc ("ima: add the ability to query the cached hash of a given file")
Signed-off-by: KP Singh <kpsingh@google.com>
Reviewed-by: Florent Revest <revest@chromium.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-16 17:43:02 -04:00
Lenny Szubowicz 726bd8965a integrity: Load certs from the EFI MOK config table
Because of system-specific EFI firmware limitations, EFI volatile
variables may not be capable of holding the required contents of
the Machine Owner Key (MOK) certificate store when the certificate
list grows above some size. Therefore, an EFI boot loader may pass
the MOK certs via a EFI configuration table created specifically for
this purpose to avoid this firmware limitation.

An EFI configuration table is a much more primitive mechanism
compared to EFI variables and is well suited for one-way passage
of static information from a pre-OS environment to the kernel.

This patch adds the support to load certs from the MokListRT
entry in the MOK variable configuration table, if it's present.
The pre-existing support to load certs from the MokListRT EFI
variable remains and is used if the EFI MOK configuration table
isn't present or can't be successfully used.

Signed-off-by: Lenny Szubowicz <lszubowi@redhat.com>
Link: https://lore.kernel.org/r/20200905013107.10457-4-lszubowi@redhat.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-16 18:53:42 +03:00
Lenny Szubowicz 38a1f03aa2 integrity: Move import of MokListRT certs to a separate routine
Move the loading of certs from the UEFI MokListRT into a separate
routine to facilitate additional MokList functionality.

There is no visible functional change as a result of this patch.
Although the UEFI dbx certs are now loaded before the MokList certs,
they are loaded onto different key rings. So the order of the keys
on their respective key rings is the same.

Signed-off-by: Lenny Szubowicz <lszubowi@redhat.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Link: https://lore.kernel.org/r/20200905013107.10457-3-lszubowi@redhat.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-16 18:53:42 +03:00
Linus Torvalds 1e484d3887 device_cgroup RCU warning fix from Amol Grover <frextrite@gmail.com>
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEgycj0O+d1G2aycA8rZhLv9lQBTwFAl9hIXMACgkQrZhLv9lQ
 BTy34RAAkt6BAcEPA9hSMkvRCrA44Doq6jaK45vmMExuWh2/QMvfK1E4KjxXGn0U
 e74TcCPgh+AuSWQABAuZrMVx4Ai9fyDBWkhtzwz7XsHeUtUvEMUkb2fzKRGoBg6h
 6WvYtzdO4NHEcy3Lf59EYW2Hm08eEjZfmVMRlCF6MoLmsj/ifh+yQ3Xxy0RAd/Jo
 X4IvSwan6EitXNEHy7onmpDjL7BvncXs1dXpGXqHzhLF8W4EtFmIZGH3T5/W82n0
 IgtEqqsCw5MY5mSIixjUPcRxbi+NUkymEzYQyvceVU0W+voMITQ8Qb/NkGMklMLE
 KUHwP1r4q1XR1WVFqHxRCPB4c+njNwiTUtAO44ODNNgC1R+wT70CGhujP1bSW6Eo
 Gf5DWJniD9I8viBWD5tYBFWPBlH+DfURY8wqkrEEC8fntsIDkSWf2XK2dVBrDxMM
 PxXOYEKfZVIRQTRAz/HJmCAoW8rVkCa5ptpKFJzWvoLqS3FclFRg0i1FZ6fcMuz1
 4phZCL+pGDSp3yhSi6lamdhPhRPq9Pbk4ZVSPK2gAg4VzhI6w7TY/zicZapaRQ0g
 hScOmTk4YKqLhUbcWiBErH/AwV6op+H4DwG/A1z8ASUxQST5oiJk0d7dMGCF9cgG
 VuHXj/dQbtymUMjo7MSqLqpx0ieEarEqZ2BOgn1DVTYgn4c1yms=
 =DgJN
 -----END PGP SIGNATURE-----

Merge tag 'fixes-v5.9a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull security layer fix from James  Morris:
 "A device_cgroup RCU warning fix from Amol Grover"

* tag 'fixes-v5.9a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  device_cgroup: Fix RCU list debugging warning
2020-09-15 16:26:57 -07:00
Lakshmi Ramasubramanian 8861d0af64 selinux: Add helper functions to get and set checkreqprot
checkreqprot data member in selinux_state struct is accessed directly by
SELinux functions to get and set. This could cause unexpected read or
write access to this data member due to compiler optimizations and/or
compiler's reordering of access to this field.

Add helper functions to get and set checkreqprot data member in
selinux_state struct. These helper functions use READ_ONCE and
WRITE_ONCE macros to ensure atomic read or write of memory for
this data member.

Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Suggested-by: Paul Moore <paul@paul-moore.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-09-15 14:36:28 -04:00
Roberto Sassu 455b6c9112 evm: Check size of security.evm before using it
This patch checks the size for the EVM_IMA_XATTR_DIGSIG and
EVM_XATTR_PORTABLE_DIGSIG types to ensure that the algorithm is read from
the buffer returned by vfs_getxattr_alloc().

Cc: stable@vger.kernel.org # 4.19.x
Fixes: 5feeb61183 ("evm: Allow non-SHA1 digital signatures")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-15 13:47:42 -04:00
Roberto Sassu 4be92db3b5 ima: Remove semicolon at the end of ima_get_binary_runtime_size()
This patch removes the unnecessary semicolon at the end of
ima_get_binary_runtime_size().

Cc: stable@vger.kernel.org
Fixes: d158847ae8 ("ima: maintain memory size needed for serializing the measurement list")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-15 13:47:41 -04:00
Roberto Sassu 60386b8540 ima: Don't ignore errors from crypto_shash_update()
Errors returned by crypto_shash_update() are not checked in
ima_calc_boot_aggregate_tfm() and thus can be overwritten at the next
iteration of the loop. This patch adds a check after calling
crypto_shash_update() and returns immediately if the result is not zero.

Cc: stable@vger.kernel.org
Fixes: 3323eec921 ("integrity: IMA as an integrity service provider")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-15 13:47:37 -04:00
Alex Dewar f60c826d03 ima: Use kmemdup rather than kmalloc+memcpy
Issue identified with Coccinelle.

Signed-off-by: Alex Dewar <alex.dewar90@gmail.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-15 09:57:48 -04:00
Casey Schaufler 322dd63c7f Smack: Use the netlabel cache
Utilize the Netlabel cache mechanism for incoming packet matching.
Refactor the initialization of secattr structures, as it was being
done in two places.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-09-11 15:31:31 -07:00
Casey Schaufler a2af031885 Smack: Set socket labels only once
Refactor the IP send checks so that the netlabel value
is set only when necessary, not on every send. Some functions
get renamed as the changes made the old name misleading.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-09-11 15:31:30 -07:00
Casey Schaufler 36be81293d Smack: Consolidate uses of secmark into a function
Add a function smack_from_skb() that returns the Smack label
identified by a network secmark. Replace the explicit uses of
the secmark with this function.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-09-11 15:31:30 -07:00
Stephen Smalley e8ba53d002 selinux: access policycaps with READ_ONCE/WRITE_ONCE
Use READ_ONCE/WRITE_ONCE for all accesses to the
selinux_state.policycaps booleans to prevent compiler
mischief.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-09-11 10:08:51 -04:00
Bruno Meneguele 8c2f516c99 integrity: include keyring name for unknown key request
Depending on the IMA policy rule a key may be searched for in multiple
keyrings (e.g. .ima and .platform) and possibly not found.  This patch
improves feedback by including the keyring "description" (name) in the
error message.

Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>
[zohar@linux.ibm.com: updated commit message]
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-09 20:05:28 -04:00
Bruno Meneguele e4d7e2df3a ima: limit secure boot feedback scope for appraise
Only emit an unknown/invalid message when setting the IMA appraise mode
to anything other than "enforce", when secureboot is enabled.

Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>
[zohar@linux.ibm.com: updated commit message]
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-09 20:01:55 -04:00
Bruno Meneguele 7fe2bb7e7e integrity: invalid kernel parameters feedback
Don't silently ignore unknown or invalid ima_{policy,appraise,hash} and evm
kernel boot command line options.

Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-08 22:03:50 -04:00
Bruno Meneguele 4afb28ab03 ima: add check for enforced appraise option
The "enforce" string is allowed as an option for ima_appraise= kernel
paramenter per kernel-paramenters.txt and should be considered on the
parameter setup checking as a matter of completeness. Also it allows futher
checking on the options being passed by the user.

Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-08 22:02:57 -04:00
Jakub Kicinski 44a8c4f33c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
We got slightly different patches removing a double word
in a comment in net/ipv4/raw.c - picked the version from net.

Simple conflict in drivers/net/ethernet/ibm/ibmvnic.c. Use cached
values instead of VNIC login response buffer (following what
commit 507ebe6444 ("ibmvnic: Fix use-after-free of VNIC login
response buffer") did).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-04 21:28:59 -07:00
Denis Efremov e44f128768 integrity: Use current_uid() in integrity_audit_message()
Modify integrity_audit_message() to use current_uid().

Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-08-31 17:46:50 -04:00
Tyler Hicks 48ce1ddce1 ima: Fail rule parsing when asymmetric key measurement isn't supportable
Measuring keys is currently only supported for asymmetric keys. In the
future, this might change.

For now, the "func=KEY_CHECK" and "keyrings=" options are only
appropriate when CONFIG_IMA_MEASURE_ASYMMETRIC_KEYS is enabled. Make
this clear at policy load so that IMA policy authors don't assume that
these policy language constructs are supported.

Fixes: 2b60c0eced ("IMA: Read keyrings= option from the IMA policy")
Fixes: 5808611ccc ("IMA: Add KEY_CHECK func to measure keys")
Suggested-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-08-31 17:45:14 -04:00
Tyler Hicks 176377d97d ima: Pre-parse the list of keyrings in a KEY_CHECK rule
The ima_keyrings buffer was used as a work buffer for strsep()-based
parsing of the "keyrings=" option of an IMA policy rule. This parsing
was re-performed each time an asymmetric key was added to a kernel
keyring for each loaded policy rule that contained a "keyrings=" option.

An example rule specifying this option is:

 measure func=KEY_CHECK keyrings=a|b|c

The rule says to measure asymmetric keys added to any of the kernel
keyrings named "a", "b", or "c". The size of the buffer size was
equal to the size of the largest "keyrings=" value seen in a previously
loaded rule (5 + 1 for the NUL-terminator in the previous example) and
the buffer was pre-allocated at the time of policy load.

The pre-allocated buffer approach suffered from a couple bugs:

1) There was no locking around the use of the buffer so concurrent key
   add operations, to two different keyrings, would result in the
   strsep() loop of ima_match_keyring() to modify the buffer at the same
   time. This resulted in unexpected results from ima_match_keyring()
   and, therefore, could cause unintended keys to be measured or keys to
   not be measured when IMA policy intended for them to be measured.

2) If the kstrdup() that initialized entry->keyrings in ima_parse_rule()
   failed, the ima_keyrings buffer was freed and set to NULL even when a
   valid KEY_CHECK rule was previously loaded. The next KEY_CHECK event
   would trigger a call to strcpy() with a NULL destination pointer and
   crash the kernel.

Remove the need for a pre-allocated global buffer by parsing the list of
keyrings in a KEY_CHECK rule at the time of policy load. The
ima_rule_entry will contain an array of string pointers which point to
the name of each keyring specified in the rule. No string processing
needs to happen at the time of asymmetric key add so iterating through
the list and doing a string comparison is all that's required at the
time of policy check.

In the process of changing how the "keyrings=" policy option is handled,
a couple additional bugs were fixed:

1) The rule parser accepted rules containing invalid "keyrings=" values
   such as "a|b||c", "a|b|", or simply "|".

2) The /sys/kernel/security/ima/policy file did not display the entire
   "keyrings=" value if the list of keyrings was longer than what could
   fit in the fixed size tbuf buffer in ima_policy_show().

Fixes: 5c7bac9fb2 ("IMA: pre-allocate buffer to hold keyrings string")
Fixes: 2b60c0eced ("IMA: Read keyrings= option from the IMA policy")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-08-31 17:45:02 -04:00
Ondrej Mosnacek 66ccd2560a selinux: simplify away security_policydb_len()
Remove the security_policydb_len() calls from sel_open_policy() and
instead update the inode size from the size returned from
security_read_policy().

Since after this change security_policydb_len() is only called from
security_load_policy(), remove it entirely and just open-code it there.

Also, since security_load_policy() is always called with policy_mutex
held, make it dereference the policy pointer directly and drop the
unnecessary RCU locking.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-31 10:00:14 -04:00
Stephen Smalley 9ff9abc4c6 selinux: move policy mutex to selinux_state, use in lockdep checks
Move the mutex used to synchronize policy changes (reloads and setting
of booleans) from selinux_fs_info to selinux_state and use it in
lockdep checks for rcu_dereference_protected() calls in the security
server functions.  This makes the dependency on the mutex explicit
in the code rather than relying on comments.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-27 09:52:47 -04:00
Dan Carpenter 0256b0aa80 selinux: fix error handling bugs in security_load_policy()
There are a few bugs in the error handling for security_load_policy().

1) If the newpolicy->sidtab allocation fails then it leads to a NULL
   dereference.  Also the error code was not set to -ENOMEM on that
   path.
2) If policydb_read() failed then we call policydb_destroy() twice
   which meands we call kvfree(p->sym_val_to_name[i]) twice.
3) If policydb_load_isids() failed then we call sidtab_destroy() twice
   and that results in a double free in the sidtab_destroy_tree()
   function because entry.ptr_inner and entry.ptr_leaf are not set to
   NULL.

One thing that makes this code nice to deal with is that none of the
functions return partially allocated data.  In other words, the
policydb_read() either allocates everything successfully or it frees
all the data it allocates.  It never returns a mix of allocated and
not allocated data.

I re-wrote this to only free the successfully allocated data which
avoids the double frees.  I also re-ordered selinux_policy_free() so
it's in the reverse order of the allocation function.

Fixes: c7c556f1e8 ("selinux: refactor changing booleans")
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
[PM: partially merged by hand due to merge fuzz]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-26 10:19:08 -04:00
KP Singh 8ea636848a bpf: Implement bpf_local_storage for inodes
Similar to bpf_local_storage for sockets, add local storage for inodes.
The life-cycle of storage is managed with the life-cycle of the inode.
i.e. the storage is destroyed along with the owning inode.

The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the
security blob which are now stackable and can co-exist with other LSMs.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-6-kpsingh@chromium.org
2020-08-25 15:00:04 -07:00
Stephen Smalley 1b8b31a2e6 selinux: convert policy read-write lock to RCU
Convert the policy read-write lock to RCU.  This is significantly
simplified by the earlier work to encapsulate the policy data
structures and refactor the policy load and boolean setting logic.
Move the latest_granting sequence number into the selinux_policy
structure so that it can be updated atomically with the policy.
Since removing the policy rwlock and moving latest_granting reduces
the selinux_ss structure to nothing more than a wrapper around the
selinux_policy pointer, get rid of the extra layer of indirection.

At present this change merely passes a hardcoded 1 to
rcu_dereference_check() in the cases where we know we do not need to
take rcu_read_lock(), with the preceding comment explaining why.
Alternatively we could pass fsi->mutex down from selinuxfs and
apply a lockdep check on it instead.

Based in part on earlier attempts to convert the policy rwlock
to RCU by Kaigai Kohei [1] and by Peter Enderborg [2].

[1] https://lore.kernel.org/selinux/6e2f9128-e191-ebb3-0e87-74bfccb0767f@tycho.nsa.gov/
[2] https://lore.kernel.org/selinux/20180530141104.28569-1-peter.enderborg@sony.com/

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-25 08:34:47 -04:00
Randy Dunlap c76a2f9ecd selinux: delete repeated words in comments
Drop a repeated word in comments.
{open, is, then}

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Eric Paris <eparis@parisplace.org>
Cc: selinux@vger.kernel.org
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
[PM: fix subject line]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-24 09:03:14 -04:00
Gustavo A. R. Silva df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Peter Enderborg 30969bc8e0 selinux: add basic filtering for audit trace events
This patch adds further attributes to the event. These attributes are
helpful to understand the context of the message and can be used
to filter the events.

There are three common items. Source context, target context and tclass.
There are also items from the outcome of operation performed.

An event is similar to:
           <...>-1309  [002] ....  6346.691689: selinux_audited:
       requested=0x4000000 denied=0x4000000 audited=0x4000000
       result=-13
       scontext=system_u:system_r:cupsd_t:s0-s0:c0.c1023
       tcontext=system_u:object_r:bin_t:s0 tclass=file

With systems where many denials are occurring, it is useful to apply a
filter. The filtering is a set of logic that is inserted with
the filter file. Example:
 echo "tclass==\"file\" " > events/avc/selinux_audited/filter

This adds that we only get tclass=file.

The trace can also have extra properties. Adding the user stack
can be done with
   echo 1 > options/userstacktrace

Now the output will be
         runcon-1365  [003] ....  6960.955530: selinux_audited:
     requested=0x4000000 denied=0x4000000 audited=0x4000000
     result=-13
     scontext=system_u:system_r:cupsd_t:s0-s0:c0.c1023
     tcontext=system_u:object_r:bin_t:s0 tclass=file
          runcon-1365  [003] ....  6960.955560: <user stack trace>
 =>  <00007f325b4ce45b>
 =>  <00005607093efa57>

Signed-off-by: Peter Enderborg <peter.enderborg@sony.com>
Reviewed-by: Thiébaud Weksteen <tweek@google.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21 17:07:29 -04:00
Thiébaud Weksteen dd8166212d selinux: add tracepoint on audited events
The audit data currently captures which process and which target
is responsible for a denial. There is no data on where exactly in the
process that call occurred. Debugging can be made easier by being able to
reconstruct the unified kernel and userland stack traces [1]. Add a
tracepoint on the SELinux denials which can then be used by userland
(i.e. perf).

Although this patch could manually be added by each OS developer to
trouble shoot a denial, adding it to the kernel streamlines the
developers workflow.

It is possible to use perf for monitoring the event:
  # perf record -e avc:selinux_audited -g -a
  ^C
  # perf report -g
  [...]
      6.40%     6.40%  audited=800000 tclass=4
               |
                  __libc_start_main
                  |
                  |--4.60%--__GI___ioctl
                  |          entry_SYSCALL_64
                  |          do_syscall_64
                  |          __x64_sys_ioctl
                  |          ksys_ioctl
                  |          binder_ioctl
                  |          binder_set_nice
                  |          can_nice
                  |          capable
                  |          security_capable
                  |          cred_has_capability.isra.0
                  |          slow_avc_audit
                  |          common_lsm_audit
                  |          avc_audit_post_callback
                  |          avc_audit_post_callback
                  |

It is also possible to use the ftrace interface:
  # echo 1 > /sys/kernel/debug/tracing/events/avc/selinux_audited/enable
  # cat /sys/kernel/debug/tracing/trace
  tracer: nop
  entries-in-buffer/entries-written: 1/1   #P:8
  [...]
  dmesg-3624  [001] 13072.325358: selinux_denied: audited=800000 tclass=4

The tclass value can be mapped to a class by searching
security/selinux/flask.h. The audited value is a bit field of the
permissions described in security/selinux/av_permissions.h for the
corresponding class.

[1] https://source.android.com/devices/tech/debug/native_stack_dump

Signed-off-by: Thiébaud Weksteen <tweek@google.com>
Suggested-by: Joel Fernandes <joelaf@google.com>
Reviewed-by: Peter Enderborg <peter.enderborg@sony.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21 17:05:22 -04:00
Daniel Burgener 0eea609153 selinux: Create new booleans and class dirs out of tree
In order to avoid concurrency issues around selinuxfs resource availability
during policy load, we first create new directories out of tree for
reloaded resources, then swap them in, and finally delete the old versions.

This fix focuses on concurrency in each of the two subtrees swapped, and
not concurrency between the trees.  This means that it is still possible
that subsequent reads to eg the booleans directory and the class directory
during a policy load could see the old state for one and the new for the other.
The problem of ensuring that policy loads are fully atomic from the perspective
of userspace is larger than what is dealt with here.  This commit focuses on
ensuring that the directories contents always match either the new or the old
policy state from the perspective of userspace.

In the previous implementation, on policy load /sys/fs/selinux is updated
by deleting the previous contents of
/sys/fs/selinux/{class,booleans} and then recreating them.  This means
that there is a period of time when the contents of these directories do not
exist which can cause race conditions as userspace relies on them for
information about the policy.  In addition, it means that error recovery in
the event of failure is challenging.

In order to demonstrate the race condition that this series fixes, you
can use the following commands:

while true; do cat /sys/fs/selinux/class/service/perms/status
>/dev/null; done &
while true; do load_policy; done;

In the existing code, this will display errors fairly often as the class
lookup fails.  (In normal operation from systemd, this would result in a
permission check which would be allowed or denied based on policy settings
around unknown object classes.) After applying this patch series you
should expect to no longer see such error messages.

Signed-off-by: Daniel Burgener <dburgener@linux.microsoft.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21 09:41:31 -04:00
Daniel Burgener 613ba18798 selinux: Standardize string literal usage for selinuxfs directory names
Switch class and policy_capabilities directory names to be referred to with
global constants, consistent with booleans directory name.  This will allow
for easy consistency of naming in future development.

Signed-off-by: Daniel Burgener <dburgener@linux.microsoft.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21 09:39:10 -04:00
Daniel Burgener 66ec384ad3 selinux: Refactor selinuxfs directory populating functions
Make sel_make_bools and sel_make_classes take the specific elements of
selinux_fs_info that they need rather than the entire struct.

This will allow a future patch to pass temporary elements that are not in
the selinux_fs_info struct to these functions so that the original elements
can be preserved until we are ready to perform the switch over.

Signed-off-by: Daniel Burgener <dburgener@linux.microsoft.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21 09:37:12 -04:00
Daniel Burgener aeecf4a3fb selinux: Create function for selinuxfs directory cleanup
Separating the cleanup from the creation will simplify two things in
future patches in this series.  First, the creation can be made generic,
to create directories not tied to the selinux_fs_info structure.  Second,
we will ultimately want to reorder creation and deletion so that the
deletions aren't performed until the new directory structures have already
been moved into place.

Signed-off-by: Daniel Burgener <dburgener@linux.microsoft.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21 09:35:59 -04:00
Stephen Smalley 9530a3e004 selinux: permit removing security.selinux xattr before policy load
Currently SELinux denies attempts to remove the security.selinux xattr
always, even when permissive or no policy is loaded.  This was originally
motivated by the view that all files should be labeled, even if that label
is unlabeled_t, and we shouldn't permit files that were once labeled to
have their labels removed entirely.  This however prevents removing
SELinux xattrs in the case where one "disables" SELinux by not loading
a policy (e.g. a system where runtime disable is removed and selinux=0
was not specified).  Allow removing the xattr before SELinux is
initialized.  We could conceivably permit it even after initialization
if permissive, or introduce a separate permission check here.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-20 21:55:31 -04:00
Amol Grover bc62d68e2a device_cgroup: Fix RCU list debugging warning
exceptions may be traversed using list_for_each_entry_rcu()
outside of an RCU read side critical section BUT under the
protection of decgroup_mutex. Hence add the corresponding
lockdep expression to fix the following false-positive
warning:

[    2.304417] =============================
[    2.304418] WARNING: suspicious RCU usage
[    2.304420] 5.5.4-stable #17 Tainted: G            E
[    2.304422] -----------------------------
[    2.304424] security/device_cgroup.c:355 RCU-list traversed in non-reader section!!

Signed-off-by: Amol Grover <frextrite@gmail.com>
Signed-off-by: James Morris <jmorris@namei.org>
2020-08-20 11:25:03 -07:00
kernel test robot 879229311b selinux: fix memdup.cocci warnings
Use kmemdup rather than duplicating its implementation

Generated by: scripts/coccinelle/api/memdup.cocci

Fixes: c7c556f1e8 ("selinux: refactor changing booleans")
CC: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-20 08:39:05 -04:00
Stephen Smalley 37ea433c66 selinux: avoid dereferencing the policy prior to initialization
Certain SELinux security server functions (e.g. security_port_sid,
called during bind) were not explicitly testing to see if SELinux
has been initialized (i.e. initial policy loaded) and handling
the no-policy-loaded case.  In the past this happened to work
because the policydb was statically allocated and could always
be accessed, but with the recent encapsulation of policy state
and conversion to dynamic allocation, we can no longer access
the policy state prior to initialization.  Add a test of
!selinux_initialized(state) to all of the exported functions that
were missing them and handle appropriately.

Fixes: 461698026f ("selinux: encapsulate policy state, refactor policy load")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-19 21:14:41 -04:00
Colin Ian King 69ea651c40 selinux: fix allocation failure check on newpolicy->sidtab
The allocation check of newpolicy->sidtab is null checking if
newpolicy is null and not newpolicy->sidtab. Fix this.

Addresses-Coverity: ("Logically dead code")
Fixes: c7c556f1e8 ("selinux: refactor changing booleans")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-19 09:14:04 -04:00
Stephen Smalley c7c556f1e8 selinux: refactor changing booleans
Refactor the logic for changing SELinux policy booleans in a similar
manner to the refactoring of policy load, thereby reducing the
size of the critical section when the policy write-lock is held
and making it easier to convert the policy rwlock to RCU in the
future.  Instead of directly modifying the policydb in place, modify
a copy and then swap it into place through a single pointer update.
Only fully copy the portions of the policydb that are affected by
boolean changes to avoid the full cost of a deep policydb copy.
Introduce another level of indirection for the sidtab since changing
booleans does not require updating the sidtab, unlike policy load.
While we are here, create a common helper for notifying
other kernel components and userspace of a policy change and call it
from both security_set_bools() and selinux_policy_commit().

Based on an old (2004) patch by Kaigai Kohei [1] to convert the policy
rwlock to RCU that was deferred at the time since it did not
significantly improve performance and introduced complexity. Peter
Enderborg later submitted a patch series to convert to RCU [2] that
would have made changing booleans a much more expensive operation
by requiring a full policydb_write();policydb_read(); sequence to
deep copy the entire policydb and also had concerns regarding
atomic allocations.

This change is now simplified by the earlier work to encapsulate
policy state in the selinux_policy struct and to refactor
policy load.  After this change, the last major obstacle to
converting the policy rwlock to RCU is likely the sidtab live
convert support.

[1] https://lore.kernel.org/selinux/6e2f9128-e191-ebb3-0e87-74bfccb0767f@tycho.nsa.gov/
[2] https://lore.kernel.org/selinux/20180530141104.28569-1-peter.enderborg@sony.com/

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-17 21:00:33 -04:00
Stephen Smalley 02a52c5c8c selinux: move policy commit after updating selinuxfs
With the refactoring of the policy load logic in the security
server from the previous change, it is now possible to split out
the committing of the new policy from security_load_policy() and
perform it only after successful updating of selinuxfs.  Change
security_load_policy() to return the newly populated policy
data structures to the caller, export selinux_policy_commit()
for external callers, and introduce selinux_policy_cancel() to
provide a way to cancel the policy load in the event of an error
during updating of the selinuxfs directory tree.  Further, rework
the interfaces used by selinuxfs to get information from the policy
when creating the new directory tree to take and act upon the
new policy data structure rather than the current/active policy.
Update selinuxfs to use these updated and new interfaces.  While
we are here, stop re-creating the policy_capabilities directory
on each policy load since it does not depend on the policy, and
stop trying to create the booleans and classes directories during
the initial creation of selinuxfs since no information is available
until first policy load.

After this change, a failure while updating the booleans and class
directories will cause the entire policy load to be canceled, leaving
the original policy intact, and policy load notifications to userspace
will only happen after a successful completion of updating those
directories.  This does not (yet) provide full atomicity with respect
to the updating of the directory trees themselves.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-17 20:50:22 -04:00
Stephen Smalley 461698026f selinux: encapsulate policy state, refactor policy load
Encapsulate the policy state in its own structure (struct
selinux_policy) that is separately allocated but referenced from the
selinux_ss structure.  The policy state includes the SID table
(particularly the context structures), the policy database, and the
mapping between the kernel classes/permissions and the policy values.
Refactor the security server portion of the policy load logic to
cleanly separate loading of the new structures from committing the new
policy.  Unify the initial policy load and reload code paths as much
as possible, avoiding duplicated code.  Make sure we are taking the
policy read-lock prior to any dereferencing of the policy.  Move the
copying of the policy capability booleans into the state structure
outside of the policy write-lock because they are separate from the
policy and are read outside of any policy lock; possibly they should
be using at least READ_ONCE/WRITE_ONCE or smp_load_acquire/store_release.

These changes simplify the policy loading logic, reduce the size of
the critical section while holding the policy write-lock, and should
facilitate future changes to e.g. refactor the entire policy reload
logic including the selinuxfs code to make the updating of the policy
and the selinuxfs directory tree atomic and/or to convert the policy
read-write lock to RCU.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-17 20:48:57 -04:00
Stephen Smalley 339949be25 scripts/selinux,selinux: update mdp to enable policy capabilities
Presently mdp does not enable any SELinux policy capabilities
in the dummy policy it generates. Thus, policies derived from
it will by default lack various features commonly used in modern
policies such as open permission, extended socket classes, network
peer controls, etc.  Split the policy capability definitions out into
their own headers so that we can include them into mdp without pulling in
other kernel headers and extend mdp generate policycap statements for the
policy capabilities known to the kernel.  Policy authors may wish to
selectively remove some of these from the generated policy.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-17 20:42:00 -04:00
Linus Torvalds 9ad57f6dfc Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - most of the rest of MM (memcg, hugetlb, vmscan, proc, compaction,
   mempolicy, oom-kill, hugetlbfs, migration, thp, cma, util,
   memory-hotplug, cleanups, uaccess, migration, gup, pagemap),

 - various other subsystems (alpha, misc, sparse, bitmap, lib, bitops,
   checkpatch, autofs, minix, nilfs, ufs, fat, signals, kmod, coredump,
   exec, kdump, rapidio, panic, kcov, kgdb, ipc).

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits)
  mm/gup: remove task_struct pointer for all gup code
  mm: clean up the last pieces of page fault accountings
  mm/xtensa: use general page fault accounting
  mm/x86: use general page fault accounting
  mm/sparc64: use general page fault accounting
  mm/sparc32: use general page fault accounting
  mm/sh: use general page fault accounting
  mm/s390: use general page fault accounting
  mm/riscv: use general page fault accounting
  mm/powerpc: use general page fault accounting
  mm/parisc: use general page fault accounting
  mm/openrisc: use general page fault accounting
  mm/nios2: use general page fault accounting
  mm/nds32: use general page fault accounting
  mm/mips: use general page fault accounting
  mm/microblaze: use general page fault accounting
  mm/m68k: use general page fault accounting
  mm/ia64: use general page fault accounting
  mm/hexagon: use general page fault accounting
  mm/csky: use general page fault accounting
  ...
2020-08-12 11:24:12 -07:00
Peter Xu 64019a2e46 mm/gup: remove task_struct pointer for all gup code
After the cleanup of page fault accounting, gup does not need to pass
task_struct around any more.  Remove that parameter in the whole gup
stack.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-26-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:04 -07:00
Linus Torvalds ce13266d97 Minor fixes for v5.9.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEgycj0O+d1G2aycA8rZhLv9lQBTwFAl8xl0QACgkQrZhLv9lQ
 BTzEUA/+Muf7gha2mtxGJ49ZX/AsUOi/feHFDjt+NEA6lQTIaaqU5LxXNdtARu/5
 j+RlJkrw8+3QGJ4h544HIJodbLZHghWpp15AxBAy+1BaeAoswEnrW2/6mD1iUBEH
 pFI0P2OjnVYxEPJGubLhp4qQ0lnqVKwzciNbBDLMydr6SerwoPDz9h0h5SMDoOxF
 m4f1/dsoXrpyp86GSvHDVa9NRs/GMKz/qIeC6DXuMRoqGX15EZVV1iABC7vPd2we
 84IacCRIE/DO1M1rmbNBSpeErmvkxRo00Qjupl0XGf7D4aazxnQl+RpaLdHAtBI1
 ubzU/76DCkaCO1x+3KPHyQUHZvXa3dt0/n4yEkOv01RIzivKZZz6jahsCrbX6lzX
 Dq4n0zg8sA7vh/T7aNX77z0FU1TuFBpiJ8dn/0vUgJPxDwt2V9F2k9jyV1pUeK1V
 yvSkIleIQmwmuT0p2nB/1g7yE5xkvWTM5WOy8/zIQj2aCvuo3ToY06Qc0rNOKTa8
 6Qi/Byi/5S1bBwYQqrAyrd5GhPVdZ8oNZyaUu8Mpm+4P0+2CquvDfN3ZHUxwILNX
 /TMVTVMu1PQQIltWANA0L0BjGIjSGxutisEUQL7o24566GXA3wTQd8HoKRBc6h+p
 DGeVMehPG7GIwoCIvuzdahSRdzzI/iBG3P10TZ5u+3BwTL0OUNY=
 =s2Jw
 -----END PGP SIGNATURE-----

Merge tag 'for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull security subsystem updates from James Morris:
 "A couple of minor documentation updates only for this release"

* tag 'for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  LSM: drop duplicated words in header file comments
  Replace HTTP links with HTTPS ones: security
2020-08-11 14:30:36 -07:00
Waiman Long 453431a549 mm, treewide: rename kzfree() to kfree_sensitive()
As said by Linus:

  A symmetric naming is only helpful if it implies symmetries in use.
  Otherwise it's actively misleading.

  In "kzalloc()", the z is meaningful and an important part of what the
  caller wants.

  In "kzfree()", the z is actively detrimental, because maybe in the
  future we really _might_ want to use that "memfill(0xdeadbeef)" or
  something. The "zero" part of the interface isn't even _relevant_.

The main reason that kzfree() exists is to clear sensitive information
that should not be leaked to other future users of the same memory
objects.

Rename kzfree() to kfree_sensitive() to follow the example of the recently
added kvfree_sensitive() and make the intention of the API more explicit.
In addition, memzero_explicit() is used to clear the memory to make sure
that it won't get optimized away by the compiler.

The renaming is done by using the command sequence:

  git grep -w --name-only kzfree |\
  xargs sed -i 's/kzfree/kfree_sensitive/'

followed by some editing of the kfree_sensitive() kerneldoc and adding
a kzfree backward compatibility macro in slab.h.

[akpm@linux-foundation.org: fs/crypto/inline_crypt.c needs linux/slab.h]
[akpm@linux-foundation.org: fix fs/crypto/inline_crypt.c some more]

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Joe Perches <joe@perches.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: "Jason A . Donenfeld" <Jason@zx2c4.com>
Link: http://lkml.kernel.org/r/20200616154311.12314-3-longman@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:22 -07:00
Alexander A. Klimov c9fecf505a Replace HTTP links with HTTPS ones: security
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
          If both the HTTP and HTTPS versions
          return 200 OK and serve the same content:
            Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
2020-08-06 12:00:05 -07:00
Linus Torvalds 4cec929370 integrity-v5.9
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEjSMCCC7+cjo3nszSa3kkZrA+cVoFAl8puJgUHHpvaGFyQGxp
 bnV4LmlibS5jb20ACgkQa3kkZrA+cVq47w//VDg2pTD+/fPadleRJkKVSPaKJu4k
 N/gAVPxhYpJVJ+BTZKMFzTjX3kjfQG7udjORzC+saEdii7W1EfJJqHabLEnihfxd
 VDUS0RQndMwOkioAAZOsy5dFE84wUOX8O1kq31Aw2G+QLCYhn1dNMg10j6SBM034
 cJbS59k3w+lyqFy/Fje8e7aO1xmc/83x9MfLgzZTscCZqzf1vIJY8onwfTxRVBpQ
 QS0AZJM+b0+9MlJxpzBYxZARwYb5cXBLh07W/vBFmJRh15n0e20uWM4YFkBixicX
 gi3LtXd/75hFIHgm6QqbwDJrrA45zOJs5YsOudCctWVAe5k5mV0H7ysJ6phcRI9E
 uQvBb7Z+0viQXis6Cjx4gYSYAcAJPcDrfcjR4itQSOj5anUFBvCju+Jr373S0Vn8
 3eXGyimRAc33vEFkI7RJNfExkGh7pkYWzcruk90bHD6dAKuki/tisIs7ZvhTuFOp
 eyWt7hbctqbt/gESop3zXjUDRJsX9GyAA4OvJwFGRfRJ4ziQ5w8LGc+VendSWald
 1zjkJxXAZLjDPQlYv2074PYeIguTbcDkjeRVxUD9mWvdi0tyXK+r2qC+PeX7Rs71
 y1aGIT/NX9qYI2H0xIm3ettztdIE8F1tnAn2ziNkQiXEzCrEqKtAAxxSErTQuB78
 LMgCDPF8y06ZjD8=
 =M/tq
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity updates from Mimi Zohar:
 "The nicest change is the IMA policy rule checking. The other changes
  include allowing the kexec boot cmdline line measure policy rules to
  be defined in terms of the inode associated with the kexec kernel
  image, making the IMA_APPRAISE_BOOTPARAM, which governs the IMA
  appraise mode (log, fix, enforce), a runtime decision based on the
  secure boot mode of the system, and including errno in the audit log"

* tag 'integrity-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  integrity: remove redundant initialization of variable ret
  ima: move APPRAISE_BOOTPARAM dependency on ARCH_POLICY to runtime
  ima: AppArmor satisfies the audit rule requirements
  ima: Rename internal filter rule functions
  ima: Support additional conditionals in the KEXEC_CMDLINE hook function
  ima: Use the common function to detect LSM conditionals in a rule
  ima: Move comprehensive rule validation checks out of the token parser
  ima: Use correct type for the args_p member of ima_rule_entry.lsm elements
  ima: Shallow copy the args_p member of ima_rule_entry.lsm elements
  ima: Fail rule parsing when appraise_flag=blacklist is unsupportable
  ima: Fail rule parsing when the KEY_CHECK hook is combined with an invalid cond
  ima: Fail rule parsing when the KEXEC_CMDLINE hook is combined with an invalid cond
  ima: Fail rule parsing when buffer hook functions have an invalid action
  ima: Free the entire rule if it fails to parse
  ima: Free the entire rule when deleting a list of rules
  ima: Have the LSM free its audit rule
  IMA: Add audit log for failure conditions
  integrity: Add errno field in audit message
2020-08-06 11:35:57 -07:00
Linus Torvalds bfdd5aaa54 Smack fixes for 5.9
-----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAl8pmVMXHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBE4tA/7BvMGZjrGVu7F8oOJGam6xiQ+
 iVczarvuH9HUfIWWKytgAZXaNf8/lQvpTGYjJFdLGVaLVfB4CSrrrdOGlfvZGpZ/
 5CXeSMAAUOAmYnqjSXFuLqp0NJEEwBg0L7/YWGbp/fq7t8WcY73esDRpVw/R78fV
 rOSMq+qt+IKRERUZWHa3gY7/FDA1/KnEkXjcMViM9PN3Gi5u2Fw6IMVWugPYJYrh
 86jlrDxYV3ceqDpoiHGyulMcBk/f/GppjIWz4ihZ0YShn+ZgAXsnJGjEhGxsoSXb
 CdG0O0y/m6y/+ekyovhvL8+AymfVWbK3qzE8IxWp8Xbrd882naur0c0vTG6y2eF7
 Z2HnI3ivusNrKTkXQnzC86YJ0T96Pg9OCab5poSIsWK5bM542dKZ/7KoSDfzIfBH
 uPhTY2PyRSiUiCWpt7esHtp4mYwCGqUs2j80TZDb8ptAZbYKxzC0pFIDoFeEyOM4
 FDBXjH8Bo+eEG9V8OgGeVvluYaat+C00dba9fiOT7/Tm0F9KqOJEOO9uJPV9CuJN
 +jrL90mopy8T0bygqMjp+GB5V13IGL5+NDHZcnO3Ij5jI/7q0TQ6FlFZ4tg4o4yM
 Y7FyMF+ybUlRod0ayYgurtxn91c0E/PTOPSqtL/RKtRLbLdGrOM3Rfbxn9rREONY
 h1/RmmX5jSBQOKNq9/Y=
 =ZoLN
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-5.9' of git://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "Minor fixes to Smack for the v5.9 release.

  All were found by automated checkers and have straightforward
  resolution"

* tag 'Smack-for-5.9' of git://github.com/cschaufler/smack-next:
  Smack: prevent underflow in smk_set_cipso()
  Smack: fix another vsscanf out of bounds
  Smack: fix use-after-free in smk_write_relabel_self()
2020-08-06 11:02:23 -07:00
Linus Torvalds 74858abbb1 cap-checkpoint-restore-v5.9
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXygegQAKCRCRxhvAZXjc
 olWZAQCMPbhI/20LA3OYJ6s+BgBEnm89PymvlHcym6Z4AvTungD+KqZonIYuxWgi
 6Ttlv/fzgFFbXgJgbuass5mwFVoN5wM=
 =oK7d
 -----END PGP SIGNATURE-----

Merge tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull checkpoint-restore updates from Christian Brauner:
 "This enables unprivileged checkpoint/restore of processes.

  Given that this work has been going on for quite some time the first
  sentence in this summary is hopefully more exciting than the actual
  final code changes required. Unprivileged checkpoint/restore has seen
  a frequent increase in interest over the last two years and has thus
  been one of the main topics for the combined containers &
  checkpoint/restore microconference since at least 2018 (cf. [1]).

  Here are just the three most frequent use-cases that were brought forward:

   - The JVM developers are integrating checkpoint/restore into a Java
     VM to significantly decrease the startup time.

   - In high-performance computing environment a resource manager will
     typically be distributing jobs where users are always running as
     non-root. Long-running and "large" processes with significant
     startup times are supposed to be checkpointed and restored with
     CRIU.

   - Container migration as a non-root user.

  In all of these scenarios it is either desirable or required to run
  without CAP_SYS_ADMIN. The userspace implementation of
  checkpoint/restore CRIU already has the pull request for supporting
  unprivileged checkpoint/restore up (cf. [2]).

  To enable unprivileged checkpoint/restore a new dedicated capability
  CAP_CHECKPOINT_RESTORE is introduced. This solution has last been
  discussed in 2019 in a talk by Google at Linux Plumbers (cf. [1]
  "Update on Task Migration at Google Using CRIU") with Adrian and
  Nicolas providing the implementation now over the last months. In
  essence, this allows the CRIU binary to be installed with the
  CAP_CHECKPOINT_RESTORE vfs capability set thereby enabling
  unprivileged users to restore processes.

  To make this possible the following permissions are altered:

   - Selecting a specific PID via clone3() set_tid relaxed from userns
     CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE.

   - Selecting a specific PID via /proc/sys/kernel/ns_last_pid relaxed
     from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE.

   - Accessing /proc/pid/map_files relaxed from init userns
     CAP_SYS_ADMIN to init userns CAP_CHECKPOINT_RESTORE.

   - Changing /proc/self/exe from userns CAP_SYS_ADMIN to userns
     CAP_CHECKPOINT_RESTORE.

  Of these four changes the /proc/self/exe change deserves a few words
  because the reasoning behind even restricting /proc/self/exe changes
  in the first place is just full of historical quirks and tracking this
  down was a questionable version of fun that I'd like to spare others.

  In short, it is trivial to change /proc/self/exe as an unprivileged
  user, i.e. without userns CAP_SYS_ADMIN right now. Either via ptrace()
  or by simply intercepting the elf loader in userspace during exec.
  Nicolas was nice enough to even provide a POC for the latter (cf. [3])
  to illustrate this fact.

  The original patchset which introduced PR_SET_MM_MAP had no
  permissions around changing the exe link. They too argued that it is
  trivial to spoof the exe link already which is true. The argument
  brought up against this was that the Tomoyo LSM uses the exe link in
  tomoyo_manager() to detect whether the calling process is a policy
  manager. This caused changing the exe links to be guarded by userns
  CAP_SYS_ADMIN.

  All in all this rather seems like a "better guard it with something
  rather than nothing" argument which imho doesn't qualify as a great
  security policy. Again, because spoofing the exe link is possible for
  the calling process so even if this were security relevant it was
  broken back then and would be broken today. So technically, dropping
  all permissions around changing the exe link would probably be
  possible and would send a clearer message to any userspace that relies
  on /proc/self/exe for security reasons that they should stop doing
  this but for now we're only relaxing the exe link permissions from
  userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE.

  There's a final uapi change in here. Changing the exe link used to
  accidently return EINVAL when the caller lacked the necessary
  permissions instead of the more correct EPERM. This pr contains a
  commit fixing this. I assume that userspace won't notice or care and
  if they do I will revert this commit. But since we are changing the
  permissions anyway it seems like a good opportunity to try this fix.

  With these changes merged unprivileged checkpoint/restore will be
  possible and has already been tested by various users"

[1] LPC 2018
     1. "Task Migration at Google Using CRIU"
        https://www.youtube.com/watch?v=yI_1cuhoDgA&t=12095
     2. "Securely Migrating Untrusted Workloads with CRIU"
        https://www.youtube.com/watch?v=yI_1cuhoDgA&t=14400
     LPC 2019
     1. "CRIU and the PID dance"
         https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=2m48s
     2. "Update on Task Migration at Google Using CRIU"
        https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=1h2m8s

[2] https://github.com/checkpoint-restore/criu/pull/1155

[3] https://github.com/nviennot/run_as_exe

* tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  selftests: add clone3() CAP_CHECKPOINT_RESTORE test
  prctl: exe link permission error changed from -EINVAL to -EPERM
  prctl: Allow local CAP_CHECKPOINT_RESTORE to change /proc/self/exe
  proc: allow access in init userns for map_files with CAP_CHECKPOINT_RESTORE
  pid_namespace: use checkpoint_restore_ns_capable() for ns_last_pid
  pid: use checkpoint_restore_ns_capable() for set_tid
  capabilities: Introduce CAP_CHECKPOINT_RESTORE
2020-08-04 15:02:07 -07:00
Linus Torvalds 3950e97543 Merge branch 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull execve updates from Eric Biederman:
 "During the development of v5.7 I ran into bugs and quality of
  implementation issues related to exec that could not be easily fixed
  because of the way exec is implemented. So I have been diggin into
  exec and cleaning up what I can.

  This cycle I have been looking at different ideas and different
  implementations to see what is possible to improve exec, and cleaning
  the way exec interfaces with in kernel users. Only cleaning up the
  interfaces of exec with rest of the kernel has managed to stabalize
  and make it through review in time for v5.9-rc1 resulting in 2 sets of
  changes this cycle.

   - Implement kernel_execve

   - Make the user mode driver code a better citizen

  With kernel_execve the code size got a little larger as the copying of
  parameters from userspace and copying of parameters from userspace is
  now separate. The good news is kernel threads no longer need to play
  games with set_fs to use exec. Which when combined with the rest of
  Christophs set_fs changes should security bugs with set_fs much more
  difficult"

* 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (23 commits)
  exec: Implement kernel_execve
  exec: Factor bprm_stack_limits out of prepare_arg_pages
  exec: Factor bprm_execve out of do_execve_common
  exec: Move bprm_mm_init into alloc_bprm
  exec: Move initialization of bprm->filename into alloc_bprm
  exec: Factor out alloc_bprm
  exec: Remove unnecessary spaces from binfmts.h
  umd: Stop using split_argv
  umd: Remove exit_umh
  bpfilter: Take advantage of the facilities of struct pid
  exit: Factor thread_group_exited out of pidfd_poll
  umd: Track user space drivers with struct pid
  bpfilter: Move bpfilter_umh back into init data
  exec: Remove do_execve_file
  umh: Stop calling do_execve_file
  umd: Transform fork_usermode_blob into fork_usermode_driver
  umd: Rename umd_info.cmdline umd_info.driver_name
  umd: For clarity rename umh_info umd_info
  umh: Separate the user mode driver and the user mode helper support
  umh: Remove call_usermodehelper_setup_file.
  ...
2020-08-04 14:27:25 -07:00
Linus Torvalds fd76a74d94 audit/stable-5.9 PR 20200803
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl8okpIUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNqOQ/8D+m9Ykcby3csEKsp8YtsaukEu62U
 lRVaxzRNO9wwB24aFwDFuJnIkmsSi/s/O4nBsy2mw+Apn+uDCvHQ9tBU07vlNn2f
 lu27YaTya7YGlqoe315xijd8tyoX99k8cpQeixvAVr9/jdR09yka7SJ8O7X9mjV7
 +SUVDiKCplPKpiwCCRS9cqD7F64T6y35XKzbrzYqdP0UOF2XelZo/Evt5rDRvWUf
 5qDN2tP+iM/Fvu5lCfczFwAeivfAdxjQ11n783hx8Ms2qyiaKQCzbEwjqAslmkbs
 1k/+ED0NjzXX1ne0JZaz/bk0wsMnmOoa8o+NDcyd7Za/cj5prUZi7kBy+xry4YV8
 qKJ40Lk0flCWgUpm6bkYVOByIYHk0gmfBNvjilqf25NR/eOC/9e9ir8PywvYUW/7
 kvVK37+N/a3LnFj80sZpIeqqnNU8z9PV1i7//5/kDuKvz94Bq83TJDO6pPKvqDtC
 njQfCFoHwdEeF8OalK793lIiYaoODqvbkWKChKMqziODJ4ZP8AW06gXpEbEWn7G3
 TTnJx7hqzR9t90vBQJeO3Fromfn+9TDlZVdX+EGO8gIqUiLGr0r7LPPep4VkDbNw
 LxMYKeC2cgRp8Z+XXPDxfXSDL2psTwg6CXcDrXcYnUyBo/yerpBvbJkeaR0h+UR0
 j6cvMX+T39X2JXM=
 =Xs3M
 -----END PGP SIGNATURE-----

Merge tag 'audit-pr-20200803' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit

Pull audit updates from Paul Moore:
 "Aside from some smaller bug fixes, here are the highlights:

   - add a new backlog wait metric to the audit status message, this is
     intended to help admins determine how long processes have been
     waiting for the audit backlog queue to clear

   - generate audit records for nftables configuration changes

   - generate CWD audit records for for the relevant LSM audit records"

* tag 'audit-pr-20200803' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: report audit wait metric in audit status reply
  audit: purge audit_log_string from the intra-kernel audit API
  audit: issue CWD record to accompany LSM_AUDIT_DATA_* records
  audit: use the proper gfp flags in the audit_log_nfcfg() calls
  audit: remove unused !CONFIG_AUDITSYSCALL __audit_inode* stubs
  audit: add gfp parameter to audit_log_nfcfg
  audit: log nftables configuration change events
  audit: Use struct_size() helper in alloc_chunk
2020-08-04 14:20:26 -07:00
Linus Torvalds 49e917deeb selinux/stable-5.9 PR 20200803
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl8okmsUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMaRA//XO7JKJEyLcpqRzhQP/QY50JXdQtE
 c9vKeb7y4wlfbTozRgjBN3Xj+tFbqANzX/rVsR1aKV+hExyEuUfNZ0Fl8MbPEccQ
 1RUCW2808/YRTYsl0g0DDZsc+vxVosfouk91pZfld9ZRnZbrNTGXFP7vuVyFKdBy
 wBX1FCL9q31wLc8Jk7f6otSbBvSCG0YXjkkxEM7LQx3oQ59s8dfOed41kDGpLoNk
 TS5BN/W3uuYEDIsIwTRZjU4h42dpc/wbxVMJhBg85rU/2bF4u5sDs2qgwqaa1tXs
 aRDH5J+eBMZRCkF4shxlDrrOWeXvEEtal9yYzQUx664tWDjZazoTLctCAe3PWI1i
 q61cG8PXw/5/oB6RyvPkRMLc5pU8P6/Xdfg6R6kOsGSq8bj+g30J6jqGXnW9FIVr
 5rIaooiw19vqH+ASVuq9oLmhuWJQyn6ImFqOkREJFWVaqufglWw9RWDCGFsLq9Tr
 w6HbA9UYCoWpdQBfRXpa086sSQm1wuCP39fIcY64uHpR5gPJuzyd8Tswz3tbEAtg
 v7vgIRtBpghhdcBLzIJgSIXJKR7W/Y49eFwNf3x0OTSeAIia6Z9paaQjXYl71I9V
 6oUiQgVE3lX2SkgMbOK2V5UsjbVkpjjv7MWxdm0mPQCU0Fmb8W2FN/wVR7FBlCZc
 yhde+bs4zTmPNFw=
 =ocPe
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20200803' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:
 "Beyond the usual smattering of bug fixes, we've got three small
  improvements worth highlighting:

   - improved SELinux policy symbol table performance due to a reworking
     of the insert and search functions

   - allow reading of SELinux labels before the policy is loaded,
     allowing for some more "exotic" initramfs approaches

   - improved checking an error reporting about process
     class/permissions during SELinux policy load"

* tag 'selinux-pr-20200803' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: complete the inlining of hashtab functions
  selinux: prepare for inlining of hashtab functions
  selinux: specialize symtab insert and search functions
  selinux: Fix spelling mistakes in the comments
  selinux: fixed a checkpatch warning with the sizeof macro
  selinux: log error messages on required process class / permissions
  scripts/selinux/mdp: fix initial SID handling
  selinux: allow reading labels before policy is loaded
2020-08-04 14:18:01 -07:00
Linus Torvalds 5b5d3be5d6 Automatic variable initialization updates for v5.9-rc1
- Introduce CONFIG_INIT_STACK_ALL_ZERO (Alexander Potapenko)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl8oXX4WHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJt/FD/wJISl6Va3UvJrwGWcjLqb3iQh/
 38Nq7LV9ysUStpi5ibxhiB95uawFtAUsBLKyBKLtOERUz5RXiHrR9MI4UWNPBgNc
 7/H5ZAkkD21LpzC76FH+a4SWQp1kQTiyu/iONn03LE8p4vSwSVZzoGqA1r4fpzGY
 Np++2Ym/bzWV7R0Xdq/LI5oH9109dm75PhcCqCZPAtlIq+USXpyNAozimgREplVl
 /clYmj7oruoRYiF5uheOlbpCEXYlybwVHfDKE2Uh5IcXcpm3OYZU9HEK5ot5oudJ
 Z7bIcMeS2mMtSH/hhyjFbi0cZBVtJFc9exHRmuiDiYzNkWzaT2/5xAMUzw65q7Yk
 BTpr5AU+nkVQwuAmkN3AyBLrqQYyhWL0+xnWRmbbjt2yoqCx5x3AyxaBgHDV4vgF
 sTNhczFQdGqhlmvbxOw93PARV+lU9pozcc6b8TpXVdsE+bFFN5mBuRljIOTCRvke
 yxFsLF9olfNB3CXTHXAWLC/RuqdH/Vk7zC0vS34tlmvWgVC07P9QXyWciqcldAgL
 BsFXsRt6bRvOukyunhRfQkLVRxsOCLhQuYC33cRX9xY9vwCkM5v6TQH5WRcfxK7Q
 swujqqvozYZ/njblBTeagg8sGg0OiqxpCvJZD6qA6s1mO3lG58CDqqwxd4DemIDF
 /BxVarzUtmvBuiMBSQ==
 =c2Rf
 -----END PGP SIGNATURE-----

Merge tag 'var-init-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull automatic variable initialization updates from Kees Cook:
 "This adds the "zero" init option from Clang, which is being used
  widely in production builds of Android and Chrome OS (though it also
  keeps the "pattern" init, which is better for debug builds).

   - Introduce CONFIG_INIT_STACK_ALL_ZERO (Alexander Potapenko)"

* tag 'var-init-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  security: allow using Clang's zero initialization for stack variables
2020-08-04 13:38:35 -07:00
Linus Torvalds 382625d0d4 for-5.9/block-20200802
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl8m7YwQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpt+dEAC7a0HYuX2OrkyawBnsgd1QQR/soC7surec
 yDDa7SMM8cOq3935bfzcYHV9FWJszEGIknchiGb9R3/T+vmSohbvDsM5zgwya9u/
 FHUIuTq324I6JWXKl30k4rwjiX9wQeMt+WZ5gC8KJYCWA296i2IpJwd0A45aaKuS
 x4bTjxqknE+fD4gQiMUSt+bmuOUAp81fEku3EPapCRYDPAj8f5uoY7R2arT/POwB
 b+s+AtXqzBymIqx1z0sZ/XcdZKmDuhdurGCWu7BfJFIzw5kQ2Qe3W8rUmrQ3pGut
 8a21YfilhUFiBv+B4wptfrzJuzU6Ps0BXHCnBsQjzvXwq5uFcZH495mM/4E4OJvh
 SbjL2K4iFj+O1ngFkukG/F8tdEM1zKBYy2ZEkGoWKUpyQanbAaGI6QKKJA+DCdBi
 yPEb7yRAa5KfLqMiocm1qCEO1I56HRiNHaJVMqCPOZxLmpXj19Fs71yIRplP1Trv
 GGXdWZsccjuY6OljoXWdEfnxAr5zBsO3Yf2yFT95AD+egtGsU1oOzlqAaU1mtflw
 ABo452pvh6FFpxGXqz6oK4VqY4Et7WgXOiljA4yIGoPpG/08L1Yle4eVc2EE01Jb
 +BL49xNJVeUhGFrvUjPGl9kVMeLmubPFbmgrtipW+VRg9W8+Yirw7DPP6K+gbPAR
 RzAUdZFbWw==
 =abJG
 -----END PGP SIGNATURE-----

Merge tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block

Pull core block updates from Jens Axboe:
 "Good amount of cleanups and tech debt removals in here, and as a
  result, the diffstat shows a nice net reduction in code.

   - Softirq completion cleanups (Christoph)

   - Stop using ->queuedata (Christoph)

   - Cleanup bd claiming (Christoph)

   - Use check_events, moving away from the legacy media change
     (Christoph)

   - Use inode i_blkbits consistently (Christoph)

   - Remove old unused writeback congestion bits (Christoph)

   - Cleanup/unify submission path (Christoph)

   - Use bio_uninit consistently, instead of bio_disassociate_blkg
     (Christoph)

   - sbitmap cleared bits handling (John)

   - Request merging blktrace event addition (Jan)

   - sysfs add/remove race fixes (Luis)

   - blk-mq tag fixes/optimizations (Ming)

   - Duplicate words in comments (Randy)

   - Flush deferral cleanup (Yufen)

   - IO context locking/retry fixes (John)

   - struct_size() usage (Gustavo)

   - blk-iocost fixes (Chengming)

   - blk-cgroup IO stats fixes (Boris)

   - Various little fixes"

* tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block: (135 commits)
  block: blk-timeout: delete duplicated word
  block: blk-mq-sched: delete duplicated word
  block: blk-mq: delete duplicated word
  block: genhd: delete duplicated words
  block: elevator: delete duplicated word and fix typos
  block: bio: delete duplicated words
  block: bfq-iosched: fix duplicated word
  iocost_monitor: start from the oldest usage index
  iocost: Fix check condition of iocg abs_vdebt
  block: Remove callback typedefs for blk_mq_ops
  block: Use non _rcu version of list functions for tag_set_list
  blk-cgroup: show global disk stats in root cgroup io.stat
  blk-cgroup: make iostat functions visible to stat printing
  block: improve discard bio alignment in __blkdev_issue_discard()
  block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers
  block: defer flush request no matter whether we have elevator
  block: make blk_timeout_init() static
  block: remove retry loop in ioc_release_fn()
  block: remove unnecessary ioc nested locking
  block: integrate bd_start_claiming into __blkdev_get
  ...
2020-08-03 11:57:03 -07:00
Colin Ian King 3db0d0c276 integrity: remove redundant initialization of variable ret
The variable ret is being initialized with a value that is never read
and it is being updated later with a new value.  The initialization is
redundant and can be removed.

Fixes: eb5798f2e2 ("integrity: convert digsig to akcipher api")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Addresses-Coverity: ("Unused value")
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-27 16:52:09 -04:00
Dan Carpenter 42a2df3e82 Smack: prevent underflow in smk_set_cipso()
We have an upper bound on "maplevel" but forgot to check for negative
values.

Fixes: e114e47377 ("Smack: Simplified Mandatory Access Control Kernel")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-07-27 13:35:12 -07:00
Dan Carpenter a6bd4f6d9b Smack: fix another vsscanf out of bounds
This is similar to commit 84e99e58e8 ("Smack: slab-out-of-bounds in
vsscanf") where we added a bounds check on "rule".

Reported-by: syzbot+a22c6092d003d6fe1122@syzkaller.appspotmail.com
Fixes: f7112e6c9a ("Smack: allow for significantly longer Smack labels v4")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-07-27 13:35:03 -07:00
Richard Guy Briggs f1d9b23cab audit: purge audit_log_string from the intra-kernel audit API
audit_log_string() was inteded to be an internal audit function and
since there are only two internal uses, remove them.  Purge all external
uses of it by restructuring code to use an existing audit_log_format()
or using audit_log_format().

Please see the upstream issue
https://github.com/linux-audit/audit-kernel/issues/84

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-07-21 11:12:31 -04:00
Eric W. Biederman be619f7f06 exec: Implement kernel_execve
To allow the kernel not to play games with set_fs to call exec
implement kernel_execve.  The function kernel_execve takes pointers
into kernel memory and copies the values pointed to onto the new
userspace stack.

The calls with arguments from kernel space of do_execve are replaced
with calls to kernel_execve.

The calls do_execve and do_execveat are made static as there are now
no callers outside of exec.

The comments that mention do_execve are updated to refer to
kernel_execve or execve depending on the circumstances.  In addition
to correcting the comments, this makes it easy to grep for do_execve
and verify it is not used.

Inspired-by: https://lkml.kernel.org/r/20200627072704.2447163-1-hch@lst.de
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/87wo365ikj.fsf@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-07-21 08:24:52 -05:00
Bruno Meneguele 311aa6aafe ima: move APPRAISE_BOOTPARAM dependency on ARCH_POLICY to runtime
The IMA_APPRAISE_BOOTPARAM config allows enabling different "ima_appraise="
modes - log, fix, enforce - at run time, but not when IMA architecture
specific policies are enabled.  This prevents properly labeling the
filesystem on systems where secure boot is supported, but not enabled on the
platform.  Only when secure boot is actually enabled should these IMA
appraise modes be disabled.

This patch removes the compile time dependency and makes it a runtime
decision, based on the secure boot state of that platform.

Test results as follows:

-> x86-64 with secure boot enabled

[    0.015637] Kernel command line: <...> ima_policy=appraise_tcb ima_appraise=fix
[    0.015668] ima: Secure boot enabled: ignoring ima_appraise=fix boot parameter option

-> powerpc with secure boot disabled

[    0.000000] Kernel command line: <...> ima_policy=appraise_tcb ima_appraise=fix
[    0.000000] Secure boot mode disabled

-> Running the system without secure boot and with both options set:

CONFIG_IMA_APPRAISE_BOOTPARAM=y
CONFIG_IMA_ARCH_POLICY=y

Audit prompts "missing-hash" but still allow execution and, consequently,
filesystem labeling:

type=INTEGRITY_DATA msg=audit(07/09/2020 12:30:27.778:1691) : pid=4976
uid=root auid=root ses=2
subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 op=appraise_data
cause=missing-hash comm=bash name=/usr/bin/evmctl dev="dm-0" ino=493150
res=no

Cc: stable@vger.kernel.org
Fixes: d958083a8f ("x86/ima: define arch_get_ima_policy() for x86")
Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>
Cc: stable@vger.kernel.org # 5.0
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 18:18:49 -04:00
Tyler Hicks 1768215a65 ima: AppArmor satisfies the audit rule requirements
AppArmor meets all the requirements for IMA in terms of audit rules
since commit e79c26d040 ("apparmor: Add support for audit rule
filtering"). Update IMA's Kconfig section for CONFIG_IMA_LSM_RULES to
reflect this.

Fixes: e79c26d040 ("apparmor: Add support for audit rule filtering")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 18:18:37 -04:00
Tyler Hicks b8867eedcf ima: Rename internal filter rule functions
Rename IMA's internal filter rule functions from security_filter_rule_*()
to ima_filter_rule_*(). This avoids polluting the security_* namespace,
which is typically reserved for general security subsystem
infrastructure.

Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Suggested-by: Casey Schaufler <casey@schaufler-ca.com>
[zohar@linux.ibm.com: reword using the term "filter", not "audit"]
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 18:18:23 -04:00
Tyler Hicks 4834177e63 ima: Support additional conditionals in the KEXEC_CMDLINE hook function
Take the properties of the kexec kernel's inode and the current task
ownership into consideration when matching a KEXEC_CMDLINE operation to
the rules in the IMA policy. This allows for some uniformity when
writing IMA policy rules for KEXEC_KERNEL_CHECK, KEXEC_INITRAMFS_CHECK,
and KEXEC_CMDLINE operations.

Prior to this patch, it was not possible to write a set of rules like
this:

 dont_measure func=KEXEC_KERNEL_CHECK obj_type=foo_t
 dont_measure func=KEXEC_INITRAMFS_CHECK obj_type=foo_t
 dont_measure func=KEXEC_CMDLINE obj_type=foo_t
 measure func=KEXEC_KERNEL_CHECK
 measure func=KEXEC_INITRAMFS_CHECK
 measure func=KEXEC_CMDLINE

The inode information associated with the kernel being loaded by a
kexec_kernel_load(2) syscall can now be included in the decision to
measure or not

Additonally, the uid, euid, and subj_* conditionals can also now be
used in KEXEC_CMDLINE rules. There was no technical reason as to why
those conditionals weren't being considered previously other than
ima_match_rules() didn't have a valid inode to use so it immediately
bailed out for KEXEC_CMDLINE operations rather than going through the
full list of conditional comparisons.

Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: kexec@lists.infradead.org
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 13:28:16 -04:00
Tyler Hicks 592b24cbdc ima: Use the common function to detect LSM conditionals in a rule
Make broader use of ima_rule_contains_lsm_cond() to check if a given
rule contains an LSM conditional. This is a code cleanup and has no
user-facing change.

Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 13:28:16 -04:00
Tyler Hicks 30031b0ec8 ima: Move comprehensive rule validation checks out of the token parser
Use ima_validate_rule(), at the end of the token parsing stage, to
verify combinations of actions, hooks, and flags. This is useful to
increase readability by consolidating such checks into a single function
and also because rule conditionals can be specified in arbitrary order
making it difficult to do comprehensive rule validation until the entire
rule has been parsed.

This allows for the check that ties together the "keyrings" conditional
with the KEY_CHECK function hook to be moved into the final rule
validation.

The modsig check no longer needs to compiled conditionally because the
token parser will ensure that modsig support is enabled before accepting
"imasig|modsig" appraise type values. The final rule validation will
ensure that appraise_type and appraise_flag options are only present in
appraise rules.

Finally, this allows for the check that ties together the "pcr"
conditional with the measure action to be moved into the final rule
validation.

Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 13:28:15 -04:00
Tyler Hicks aa0c0227d3 ima: Use correct type for the args_p member of ima_rule_entry.lsm elements
Make args_p be of the char pointer type rather than have it be a void
pointer that gets casted to char pointer when it is used. It is a simple
NUL-terminated string as returned by match_strdup().

Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 13:28:14 -04:00
Tyler Hicks 39e5993d0d ima: Shallow copy the args_p member of ima_rule_entry.lsm elements
The args_p member is a simple string that is allocated by
ima_rule_init(). Shallow copy it like other non-LSM references in
ima_rule_entry structs.

There are no longer any necessary error path cleanups to do in
ima_lsm_copy_rule().

Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 13:28:13 -04:00
Tyler Hicks 5f3e92657b ima: Fail rule parsing when appraise_flag=blacklist is unsupportable
Verifying that a file hash is not blacklisted is currently only
supported for files with appended signatures (modsig).  In the future,
this might change.

For now, the "appraise_flag" option is only appropriate for appraise
actions and its "blacklist" value is only appropriate when
CONFIG_IMA_APPRAISE_MODSIG is enabled and "appraise_flag=blacklist" is
only appropriate when "appraise_type=imasig|modsig" is also present.
Make this clear at policy load so that IMA policy authors don't assume
that other uses of "appraise_flag=blacklist" are supported.

Fixes: 273df864cf ("ima: Check against blacklisted hashes for files with modsig")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reivewed-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 13:06:26 -04:00
Adrian Reber 124ea650d3 capabilities: Introduce CAP_CHECKPOINT_RESTORE
This patch introduces CAP_CHECKPOINT_RESTORE, a new capability facilitating
checkpoint/restore for non-root users.

Over the last years, The CRIU (Checkpoint/Restore In Userspace) team has
been asked numerous times if it is possible to checkpoint/restore a
process as non-root. The answer usually was: 'almost'.

The main blocker to restore a process as non-root was to control the PID
of the restored process. This feature available via the clone3 system
call, or via /proc/sys/kernel/ns_last_pid is unfortunately guarded by
CAP_SYS_ADMIN.

In the past two years, requests for non-root checkpoint/restore have
increased due to the following use cases:
* Checkpoint/Restore in an HPC environment in combination with a
  resource manager distributing jobs where users are always running as
  non-root. There is a desire to provide a way to checkpoint and
  restore long running jobs.
* Container migration as non-root
* We have been in contact with JVM developers who are integrating
  CRIU into a Java VM to decrease the startup time. These
  checkpoint/restore applications are not meant to be running with
  CAP_SYS_ADMIN.

We have seen the following workarounds:
* Use a setuid wrapper around CRIU:
  See https://github.com/FredHutch/slurm-examples/blob/master/checkpointer/lib/checkpointer/checkpointer-suid.c
* Use a setuid helper that writes to ns_last_pid.
  Unfortunately, this helper delegation technique is impossible to use
  with clone3, and is thus prone to races.
  See https://github.com/twosigma/set_ns_last_pid
* Cycle through PIDs with fork() until the desired PID is reached:
  This has been demonstrated to work with cycling rates of 100,000 PIDs/s
  See https://github.com/twosigma/set_ns_last_pid
* Patch out the CAP_SYS_ADMIN check from the kernel
* Run the desired application in a new user and PID namespace to provide
  a local CAP_SYS_ADMIN for controlling PIDs. This technique has limited
  use in typical container environments (e.g., Kubernetes) as /proc is
  typically protected with read-only layers (e.g., /proc/sys) for
  hardening purposes. Read-only layers prevent additional /proc mounts
  (due to proc's SB_I_USERNS_VISIBLE property), making the use of new
  PID namespaces limited as certain applications need access to /proc
  matching their PID namespace.

The introduced capability allows to:
* Control PIDs when the current user is CAP_CHECKPOINT_RESTORE capable
  for the corresponding PID namespace via ns_last_pid/clone3.
* Open files in /proc/pid/map_files when the current user is
  CAP_CHECKPOINT_RESTORE capable in the root namespace, useful for
  recovering files that are unreachable via the file system such as
  deleted files, or memfd files.

See corresponding selftest for an example with clone3().

Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20200719100418.2112740-2-areber@redhat.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-07-19 20:14:42 +02:00
Tyler Hicks eb624fe214 ima: Fail rule parsing when the KEY_CHECK hook is combined with an invalid cond
The KEY_CHECK function only supports the uid, pcr, and keyrings
conditionals. Make this clear at policy load so that IMA policy authors
don't assume that other conditionals are supported.

Fixes: 5808611ccc ("IMA: Add KEY_CHECK func to measure keys")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-16 21:53:55 -04:00
Tyler Hicks db2045f589 ima: Fail rule parsing when the KEXEC_CMDLINE hook is combined with an invalid cond
The KEXEC_CMDLINE hook function only supports the pcr conditional. Make
this clear at policy load so that IMA policy authors don't assume that
other conditionals are supported.

Since KEXEC_CMDLINE's inception, ima_match_rules() has always returned
true on any loaded KEXEC_CMDLINE rule without any consideration for
other conditionals present in the rule. Make it clear that pcr is the
only supported KEXEC_CMDLINE conditional by returning an error during
policy load.

An example of why this is a problem can be explained with the following
rule:

 dont_measure func=KEXEC_CMDLINE obj_type=foo_t

An IMA policy author would have assumed that rule is valid because the
parser accepted it but the result was that measurements for all
KEXEC_CMDLINE operations would be disabled.

Fixes: b0935123a1 ("IMA: Define a new hook to measure the kexec boot command line arguments")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-16 21:53:55 -04:00
Tyler Hicks 712183437e ima: Fail rule parsing when buffer hook functions have an invalid action
Buffer based hook functions, such as KEXEC_CMDLINE and KEY_CHECK, can
only measure. The process_buffer_measurement() function quietly ignores
all actions except measure so make this behavior clear at the time of
policy load.

The parsing of the keyrings conditional had a check to ensure that it
was only specified with measure actions but the check should be on the
hook function and not the keyrings conditional since
"appraise func=KEY_CHECK" is not a valid rule.

Fixes: b0935123a1 ("IMA: Define a new hook to measure the kexec boot command line arguments")
Fixes: 5808611ccc ("IMA: Add KEY_CHECK func to measure keys")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-16 21:53:55 -04:00
Tyler Hicks 2bdd737c56 ima: Free the entire rule if it fails to parse
Use ima_free_rule() to fix memory leaks of allocated ima_rule_entry
members, such as .fsname and .keyrings, when an error is encountered
during rule parsing.

Set the args_p pointer to NULL after freeing it in the error path of
ima_lsm_rule_init() so that it isn't freed twice.

This fixes a memory leak seen when loading an rule that contains an
additional piece of allocated memory, such as an fsname, followed by an
invalid conditional:

 # echo "measure fsname=tmpfs bad=cond" > /sys/kernel/security/ima/policy
 -bash: echo: write error: Invalid argument
 # echo scan > /sys/kernel/debug/kmemleak
 # cat /sys/kernel/debug/kmemleak
 unreferenced object 0xffff98e7e4ece6c0 (size 8):
   comm "bash", pid 672, jiffies 4294791843 (age 21.855s)
   hex dump (first 8 bytes):
     74 6d 70 66 73 00 6b a5                          tmpfs.k.
   backtrace:
     [<00000000abab7413>] kstrdup+0x2e/0x60
     [<00000000f11ede32>] ima_parse_add_rule+0x7d4/0x1020
     [<00000000f883dd7a>] ima_write_policy+0xab/0x1d0
     [<00000000b17cf753>] vfs_write+0xde/0x1d0
     [<00000000b8ddfdea>] ksys_write+0x68/0xe0
     [<00000000b8e21e87>] do_syscall_64+0x56/0xa0
     [<0000000089ea7b98>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: f1b08bbcbd ("ima: define a new policy condition based on the filesystem name")
Fixes: 2b60c0eced ("IMA: Read keyrings= option from the IMA policy")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-16 21:53:55 -04:00
Tyler Hicks 465aee77aa ima: Free the entire rule when deleting a list of rules
Create a function, ima_free_rule(), to free all memory associated with
an ima_rule_entry. Use the new function to fix memory leaks of allocated
ima_rule_entry members, such as .fsname and .keyrings, when deleting a
list of rules.

Make the existing ima_lsm_free_rule() function specific to the LSM
audit rule array of an ima_rule_entry and require that callers make an
additional call to kfree to free the ima_rule_entry itself.

This fixes a memory leak seen when loading by a valid rule that contains
an additional piece of allocated memory, such as an fsname, followed by
an invalid rule that triggers a policy load failure:

 # echo -e "dont_measure fsname=securityfs\nbad syntax" > \
    /sys/kernel/security/ima/policy
 -bash: echo: write error: Invalid argument
 # echo scan > /sys/kernel/debug/kmemleak
 # cat /sys/kernel/debug/kmemleak
 unreferenced object 0xffff9bab67ca12c0 (size 16):
   comm "bash", pid 684, jiffies 4295212803 (age 252.344s)
   hex dump (first 16 bytes):
     73 65 63 75 72 69 74 79 66 73 00 6b 6b 6b 6b a5  securityfs.kkkk.
   backtrace:
     [<00000000adc80b1b>] kstrdup+0x2e/0x60
     [<00000000d504cb0d>] ima_parse_add_rule+0x7d4/0x1020
     [<00000000444825ac>] ima_write_policy+0xab/0x1d0
     [<000000002b7f0d6c>] vfs_write+0xde/0x1d0
     [<0000000096feedcf>] ksys_write+0x68/0xe0
     [<0000000052b544a2>] do_syscall_64+0x56/0xa0
     [<000000007ead1ba7>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: f1b08bbcbd ("ima: define a new policy condition based on the filesystem name")
Fixes: 2b60c0eced ("IMA: Read keyrings= option from the IMA policy")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-16 21:53:55 -04:00
Tyler Hicks 9ff8a616df ima: Have the LSM free its audit rule
Ask the LSM to free its audit rule rather than directly calling kfree().
Both AppArmor and SELinux do additional work in their audit_rule_free()
hooks. Fix memory leaks by allowing the LSMs to perform necessary work.

Fixes: b169424551 ("ima: use the lsm policy update notifier")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Cc: Janne Karhunen <janne.karhunen@gmail.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-16 21:53:55 -04:00