2005-04-29 23:23:29 +08:00
|
|
|
/* audit.c -- Auditing support
|
2005-04-17 06:20:36 +08:00
|
|
|
* Gateway between the kernel (e.g., selinux) and the user-space audit daemon.
|
|
|
|
* System-call specific features have moved to auditsc.c
|
|
|
|
*
|
2007-01-20 03:39:55 +08:00
|
|
|
* Copyright 2003-2007 Red Hat Inc., Durham, North Carolina.
|
2005-04-17 06:20:36 +08:00
|
|
|
* All Rights Reserved.
|
|
|
|
*
|
|
|
|
* This program is free software; you can redistribute it and/or modify
|
|
|
|
* it under the terms of the GNU General Public License as published by
|
|
|
|
* the Free Software Foundation; either version 2 of the License, or
|
|
|
|
* (at your option) any later version.
|
|
|
|
*
|
|
|
|
* This program is distributed in the hope that it will be useful,
|
|
|
|
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
|
|
* GNU General Public License for more details.
|
|
|
|
*
|
|
|
|
* You should have received a copy of the GNU General Public License
|
|
|
|
* along with this program; if not, write to the Free Software
|
|
|
|
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
|
|
|
*
|
|
|
|
* Written by Rickard E. (Rik) Faith <faith@redhat.com>
|
|
|
|
*
|
2008-03-02 04:01:11 +08:00
|
|
|
* Goals: 1) Integrate fully with Security Modules.
|
2005-04-17 06:20:36 +08:00
|
|
|
* 2) Minimal run-time overhead:
|
|
|
|
* a) Minimal when syscall auditing is disabled (audit_enable=0).
|
|
|
|
* b) Small when syscall auditing is enabled and no audit record
|
|
|
|
* is generated (defer as much work as possible to record
|
|
|
|
* generation time):
|
|
|
|
* i) context is allocated,
|
|
|
|
* ii) names from getname are stored without a copy, and
|
|
|
|
* iii) inode information stored from path_lookup.
|
|
|
|
* 3) Ability to disable syscall auditing at boot time (audit=0).
|
|
|
|
* 4) Usable by other parts of the kernel (if audit_log* is called,
|
|
|
|
* then a syscall record will be generated automatically for the
|
|
|
|
* current syscall).
|
|
|
|
* 5) Netlink interface to user-space.
|
|
|
|
* 6) Support low-overhead kernel-based filtering to minimize the
|
|
|
|
* information that must be passed to user-space.
|
|
|
|
*
|
2018-02-03 13:33:11 +08:00
|
|
|
* Audit userspace, documentation, tests, and bug/issue trackers:
|
|
|
|
* https://github.com/linux-audit
|
2005-04-17 06:20:36 +08:00
|
|
|
*/
|
|
|
|
|
2014-01-15 02:33:12 +08:00
|
|
|
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
|
|
|
|
|
2015-02-23 10:20:09 +08:00
|
|
|
#include <linux/file.h>
|
2005-04-17 06:20:36 +08:00
|
|
|
#include <linux/init.h>
|
2014-06-07 05:37:37 +08:00
|
|
|
#include <linux/types.h>
|
2011-07-27 07:09:06 +08:00
|
|
|
#include <linux/atomic.h>
|
2005-04-17 06:20:36 +08:00
|
|
|
#include <linux/mm.h>
|
2011-05-24 02:51:41 +08:00
|
|
|
#include <linux/export.h>
|
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 16:04:11 +08:00
|
|
|
#include <linux/slab.h>
|
2005-05-19 17:56:58 +08:00
|
|
|
#include <linux/err.h>
|
|
|
|
#include <linux/kthread.h>
|
2013-05-04 02:03:50 +08:00
|
|
|
#include <linux/kernel.h>
|
2013-05-01 03:30:32 +08:00
|
|
|
#include <linux/syscalls.h>
|
audit: fix auditd/kernel connection state tracking
What started as a rather straightforward race condition reported by
Dmitry using the syzkaller fuzzer ended up revealing some major
problems with how the audit subsystem managed its netlink sockets and
its connection with the userspace audit daemon. Fixing this properly
had quite the cascading effect and what we are left with is this rather
large and complicated patch. My initial goal was to try and decompose
this patch into multiple smaller patches, but the way these changes
are intertwined makes it difficult to split these changes into
meaningful pieces that don't break or somehow make things worse for
the intermediate states.
The patch makes a number of changes, but the most significant are
highlighted below:
* The auditd tracking variables, e.g. audit_sock, are now gone and
replaced by a RCU/spin_lock protected variable auditd_conn which is
a structure containing all of the auditd tracking information.
* We no longer track the auditd sock directly, instead we track it
via the network namespace in which it resides and we use the audit
socket associated with that namespace. In spirit, this is what the
code was trying to do prior to this patch (at least I think that is
what the original authors intended), but it was done rather poorly
and added a layer of obfuscation that only masked the underlying
problems.
* Big backlog queue cleanup, again. In v4.10 we made some pretty big
changes to how the audit backlog queues work, here we haven't changed
the queue design so much as cleaned up the implementation. Brought
about by the locking changes, we've simplified kauditd_thread() quite
a bit by consolidating the queue handling into a new helper function,
kauditd_send_queue(), which allows us to eliminate a lot of very
similar code and makes the looping logic in kauditd_thread() clearer.
* All netlink messages sent to auditd are now sent via
auditd_send_unicast_skb(). Other than just making sense, this makes
the lock handling easier.
* Change the audit_log_start() sleep behavior so that we never sleep
on auditd events (unchanged) or if the caller is holding the
audit_cmd_mutex (changed). Previously we didn't sleep if the caller
was auditd or if the message type fell between a certain range; the
type check was a poor effort of doing what the cmd_mutex check now
does. Richard Guy Briggs originally proposed not sleeping the
cmd_mutex owner several years ago but his patch wasn't acceptable
at the time. At least the idea lives on here.
* A problem with the lost record counter has been resolved. Steve
Grubb and I both happened to notice this problem and according to
some quick testing by Steve, this problem goes back quite some time.
It's largely a harmless problem, although it may have left some
careful sysadmins quite puzzled.
Cc: <stable@vger.kernel.org> # 4.10.x-
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-21 23:26:35 +08:00
|
|
|
#include <linux/spinlock.h>
|
|
|
|
#include <linux/rcupdate.h>
|
|
|
|
#include <linux/mutex.h>
|
|
|
|
#include <linux/gfp.h>
|
2017-05-02 22:16:05 +08:00
|
|
|
#include <linux/pid.h>
|
2017-05-02 22:16:05 +08:00
|
|
|
#include <linux/slab.h>
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
#include <linux/audit.h>
|
|
|
|
|
|
|
|
#include <net/sock.h>
|
2006-02-08 01:05:27 +08:00
|
|
|
#include <net/netlink.h>
|
2005-04-17 06:20:36 +08:00
|
|
|
#include <linux/skbuff.h>
|
2011-06-30 19:31:57 +08:00
|
|
|
#ifdef CONFIG_SECURITY
|
|
|
|
#include <linux/security.h>
|
|
|
|
#endif
|
2006-12-07 12:34:23 +08:00
|
|
|
#include <linux/freezer.h>
|
2012-09-11 14:20:20 +08:00
|
|
|
#include <linux/pid_namespace.h>
|
2013-07-17 01:18:45 +08:00
|
|
|
#include <net/netns/generic.h>
|
2006-03-11 08:14:06 +08:00
|
|
|
|
|
|
|
#include "audit.h"
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-11-06 01:47:09 +08:00
|
|
|
/* No auditing will take place until audit_initialized == AUDIT_INITIALIZED.
|
2005-04-17 06:20:36 +08:00
|
|
|
* (Initialization happens after skb_init is called.) */
|
2008-11-06 01:47:09 +08:00
|
|
|
#define AUDIT_DISABLED -1
|
|
|
|
#define AUDIT_UNINITIALIZED 0
|
|
|
|
#define AUDIT_INITIALIZED 1
|
2005-04-17 06:20:36 +08:00
|
|
|
static int audit_initialized;
|
|
|
|
|
2008-01-08 06:09:31 +08:00
|
|
|
#define AUDIT_OFF 0
|
|
|
|
#define AUDIT_ON 1
|
|
|
|
#define AUDIT_LOCKED 2
|
2017-09-01 21:44:34 +08:00
|
|
|
u32 audit_enabled = AUDIT_OFF;
|
2017-09-01 21:44:57 +08:00
|
|
|
bool audit_ever_enabled = !!AUDIT_OFF;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2011-01-18 13:48:12 +08:00
|
|
|
EXPORT_SYMBOL_GPL(audit_enabled);
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
/* Default state when kernel boots without any parameters. */
|
2017-09-01 21:44:34 +08:00
|
|
|
static u32 audit_default = AUDIT_OFF;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
/* If auditing cannot proceed, audit_failure selects what happens. */
|
2014-01-15 02:33:13 +08:00
|
|
|
static u32 audit_failure = AUDIT_FAIL_PRINTK;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
/* private audit network namespace index */
|
|
|
|
static unsigned int audit_net_id;
|
|
|
|
|
|
|
|
/**
|
|
|
|
* struct audit_net - audit private network namespace data
|
|
|
|
* @sk: communication socket
|
|
|
|
*/
|
|
|
|
struct audit_net {
|
|
|
|
struct sock *sk;
|
|
|
|
};
|
|
|
|
|
|
|
|
/**
|
|
|
|
* struct auditd_connection - kernel/auditd connection state
|
|
|
|
* @pid: auditd PID
|
|
|
|
* @portid: netlink portid
|
|
|
|
* @net: the associated network namespace
|
2017-05-02 22:16:05 +08:00
|
|
|
* @rcu: RCU head
|
|
|
|
*
|
|
|
|
* Description:
|
|
|
|
* This struct is RCU protected; you must either hold the RCU lock for reading
|
2017-05-02 22:16:05 +08:00
|
|
|
* or the associated spinlock for writing.
|
audit: netlink socket can be auto-bound to pid other than current->pid (v2)
From: Pavel Emelyanov <xemul@openvz.org>
This patch is based on the one from Thomas.
The kauditd_thread() calls the netlink_unicast() and passes
the audit_pid to it. The audit_pid, in turn, is received from
the user space and the tool (I've checked the audit v1.6.9)
uses getpid() to pass one in the kernel. Besides, this tool
doesn't bind the netlink socket to this id, but simply creates
it allowing the kernel to auto-bind one.
That's the preamble.
The problem is that netlink_autobind() _does_not_ guarantee
that the socket will be auto-bound to the current pid. Instead
it uses the current pid as a hint to start looking for a free
id. So, in case of conflict, the audit messages can be sent
to a wrong socket. This can happen (it's unlikely, but can be)
in case some task opens more than one netlink sockets and then
the audit one starts - in this case the audit's pid can be busy
and its socket will be bound to another id.
The proposal is to introduce an audit_nlk_pid in audit subsys,
that will point to the netlink socket to send packets to. It
will most often be equal to audit_pid. The socket id can be
got from the skb's netlink CB right in the audit_receive_msg.
The audit_nlk_pid reset to 0 is not required, since all the
decisions are taken based on audit_pid value only.
Later, if the audit tools will bind the socket themselves, the
kernel will have to provide a way to setup the audit_nlk_pid
as well.
A good side effect of this patch is that audit_pid can later
be converted to struct pid, as it is no longer safe to use
pid_t-s in the presence of pid namespaces. But audit code still
uses the tgid from task_struct in the audit_signal_info and in
the audit_filter_syscall.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 06:39:41 +08:00
|
|
|
*/
|
|
|
|
static struct auditd_connection {
|
2017-05-02 22:16:05 +08:00
|
|
|
struct pid *pid;
|
|
|
|
u32 portid;
|
|
|
|
struct net *net;
|
2017-05-02 22:16:05 +08:00
|
|
|
struct rcu_head rcu;
|
|
|
|
} *auditd_conn = NULL;
|
|
|
|
static DEFINE_SPINLOCK(auditd_conn_lock);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2005-09-14 03:47:11 +08:00
|
|
|
/* If audit_rate_limit is non-zero, limit the rate of sending audit records
|
2005-04-17 06:20:36 +08:00
|
|
|
* to that number per second. This prevents DoS attacks, but results in
|
|
|
|
* audit records being dropped. */
|
2014-01-15 02:33:13 +08:00
|
|
|
static u32 audit_rate_limit;
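The rate-limit comment above can be illustrated with a small user-space sketch. This is not the kernel's implementation: `struct rate_window`, `rate_check()`, and the `ticks_per_sec` parameter (standing in for `HZ`) are hypothetical names chosen for illustration, and the fixed one-second window is an assumption about how "that number per second" might be enforced.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state for a one-second window (not the kernel's layout). */
struct rate_window {
	uint64_t window_start;	/* tick at which the current window began */
	uint32_t sent;		/* records admitted in the current window */
};

/*
 * Return 1 if a record may be sent at tick `now`, 0 if it must be dropped.
 * `limit` is the allowed number of records per second; 0 disables limiting,
 * mirroring the "if non-zero" wording of the comment above.
 */
static int rate_check(struct rate_window *w, uint32_t limit,
		      uint64_t now, uint64_t ticks_per_sec)
{
	assert(ticks_per_sec > 0);

	if (limit == 0)
		return 1;

	/* Start a fresh window once a full second has elapsed. */
	if (now - w->window_start >= ticks_per_sec) {
		w->window_start = now;
		w->sent = 0;
	}

	if (w->sent < limit) {
		w->sent++;
		return 1;
	}
	return 0;	/* over the limit: the record is dropped */
}
```

With a limit of 2 and 100 ticks per second, two calls inside the same window are admitted, a third is dropped, and a call at tick 100 is admitted again because a new window has started.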
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-10-23 01:28:49 +08:00
|
|
|
/* Number of outstanding audit_buffers allowed.
|
|
|
|
* When set to zero, this means unlimited. */
|
2014-01-15 02:33:13 +08:00
|
|
|
static u32 audit_backlog_limit = 64;
|
2013-09-13 11:03:51 +08:00
|
|
|
#define AUDIT_BACKLOG_WAIT_TIME (60 * HZ)
|
2014-01-15 02:33:13 +08:00
|
|
|
static u32 audit_backlog_wait_time = AUDIT_BACKLOG_WAIT_TIME;
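The "zero means unlimited" rule for the backlog limit can be sketched as a single predicate. `backlog_full()` is a hypothetical helper, and the strict `>` comparison is an assumption about where the boundary lies; the code that actually consults `audit_backlog_limit` is outside this excerpt.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if the number of queued buffers exceeds the limit.
 * A limit of 0 is treated as "unlimited", so the queue is never full.
 */
static int backlog_full(uint32_t backlog_limit, uint32_t queued)
{
	return backlog_limit != 0 && queued > backlog_limit;
}
```

Under these assumptions, the default limit of 64 reports full at 65 queued buffers, while a limit of 0 never reports full regardless of queue length.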
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2005-05-06 19:38:39 +08:00
|
|
|
/* The identity of the user shutting down the audit system. */
|
2012-02-08 08:53:48 +08:00
|
|
|
kuid_t audit_sig_uid = INVALID_UID;
|
2005-05-06 19:38:39 +08:00
|
|
|
pid_t audit_sig_pid = -1;
|
2006-05-25 22:19:47 +08:00
|
|
|
u32 audit_sig_sid = 0;
|
2005-05-06 19:38:39 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
/* Records can be lost in several ways:
|
|
|
|
0) [suppressed in audit_alloc]
|
|
|
|
1) out of memory in audit_log_start [kmalloc of struct audit_buffer]
|
|
|
|
2) out of memory in audit_log_move [alloc_skb]
|
|
|
|
3) suppressed due to audit_rate_limit
|
|
|
|
4) suppressed due to audit_backlog_limit
|
|
|
|
*/
|
2017-01-13 16:26:29 +08:00
|
|
|
static atomic_t audit_lost = ATOMIC_INIT(0);
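A lost-record counter like the one above has to be safe to bump from any context without taking a lock. A user-space analogue using C11 atomics might look like the following; `lost`, `record_lost()`, and `lost_count()` are illustrative names, not kernel symbols, and C11 `atomic_int` stands in for the kernel's `atomic_t`.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for audit_lost, starting at zero. */
static atomic_int lost = 0;

/* Count one lost record and return the new total. */
static int record_lost(void)
{
	/* atomic_fetch_add returns the value before the increment. */
	return atomic_fetch_add(&lost, 1) + 1;
}

/* Read the current total without modifying it. */
static int lost_count(void)
{
	return atomic_load(&lost);
}
```

Each of the loss paths listed in the comment (allocation failure, rate limit, backlog limit) would call something like `record_lost()` once per dropped record.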
|
2005-04-17 06:20:36 +08:00
|
|
|
|
[PATCH] audit: path-based rules
In this implementation, audit registers inotify watches on the parent
directories of paths specified in audit rules. When audit's inotify
event handler is called, it updates any affected rules based on the
filesystem event. If the parent directory is renamed, removed, or its
filesystem is unmounted, audit removes all rules referencing that
inotify watch.
To keep things simple, this implementation limits location-based
auditing to the directory entries in an existing directory. Given
a path-based rule for /foo/bar/passwd, the following table applies:
passwd modified -- audit event logged
passwd replaced -- audit event logged, rules list updated
bar renamed -- rule removed
foo renamed -- untracked, meaning that the rule now applies to
the new location
Audit users typically want to have many rules referencing filesystem
objects, which can significantly impact filtering performance. This
patch also adds an inode-number-based rule hash to mitigate this
situation.
The patch is relative to the audit git tree:
http://kernel.org/git/?p=linux/kernel/git/viro/audit-current.git;a=summary
and uses the inotify kernel API:
http://lkml.org/lkml/2006/6/1/145
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-04-08 04:55:56 +08:00
|
|
|
/* Hash for inode-based rules */
|
|
|
|
struct list_head audit_inode_hash[AUDIT_INODE_BUCKETS];
|
|
|
|
|
2017-05-02 22:16:05 +08:00
|
|
|
static struct kmem_cache *audit_buffer_cache;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2016-11-30 05:53:25 +08:00
|
|
|
/* queue msgs to send via kauditd_task */
|
2016-11-30 05:53:24 +08:00
|
|
|
static struct sk_buff_head audit_queue;
|
2016-11-30 05:53:25 +08:00
|
|
|
/* queue msgs due to temporary unicast send problems */
|
|
|
|
static struct sk_buff_head audit_retry_queue;
|
|
|
|
/* queue msgs waiting for new auditd connection */
|
2016-11-30 05:53:24 +08:00
|
|
|
static struct sk_buff_head audit_hold_queue;
|
2016-11-30 05:53:25 +08:00
|
|
|
|
|
|
|
/* queue servicing thread */
|
2005-05-19 17:56:58 +08:00
|
|
|
static struct task_struct *kauditd_task;
|
|
|
|
static DECLARE_WAIT_QUEUE_HEAD(kauditd_wait);
|
2016-11-30 05:53:25 +08:00
|
|
|
|
|
|
|
/* waitqueue for callers who are blocked on the audit backlog */
|
2005-06-22 22:04:33 +08:00
|
|
|
static DECLARE_WAIT_QUEUE_HEAD(audit_backlog_wait);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-05-23 00:54:49 +08:00
|
|
|
static struct audit_features af = {.vers = AUDIT_FEATURE_VERSION,
|
|
|
|
.mask = -1,
|
|
|
|
.features = 0,
|
|
|
|
.lock = 0,};
|
|
|
|
|
2013-05-24 02:26:00 +08:00
|
|
|
static char *audit_feature_names[2] = {
|
2013-05-24 21:18:04 +08:00
|
|
|
"only_unset_loginuid",
|
2013-05-24 02:26:00 +08:00
|
|
|
"loginuid_immutable",
|
2013-05-23 00:54:49 +08:00
|
|
|
};

/**
 * struct audit_ctl_mutex - serialize requests from userspace
 * @lock: the mutex used for locking
 * @owner: the task which owns the lock
 *
 * Description:
 * This is the lock struct used to ensure we only process userspace requests
 * in an orderly fashion.  We can't simply use a mutex/lock here because we
 * need to track lock ownership so we don't end up blocking the lock owner in
 * audit_log_start() or similar.
 */
static struct audit_ctl_mutex {
	struct mutex lock;
	void *owner;
} audit_cmd_mutex;

/* AUDIT_BUFSIZ is the size of the temporary buffer used for formatting
 * audit records.  Since printk uses a 1024 byte buffer, this buffer
 * should be at least that large. */
#define AUDIT_BUFSIZ 1024

/* The audit_buffer is used when formatting an audit record.  The caller
 * locks briefly to get the record off the freelist or to allocate the
 * buffer, and locks briefly to send the buffer to the netlink layer or
 * to place it on a transmit queue.  Multiple audit_buffers can be in
 * use simultaneously. */
struct audit_buffer {
	struct sk_buff *skb;		/* formatted skb ready to send */
	struct audit_context *ctx;	/* NULL or associated context */
	gfp_t gfp_mask;
};

struct audit_reply {
	__u32 portid;
	struct net *net;
	struct sk_buff *skb;
};

audit: fix auditd/kernel connection state tracking
What started as a rather straightforward race condition reported by
Dmitry using the syzkaller fuzzer ended up revealing some major
problems with how the audit subsystem managed its netlink sockets and
its connection with the userspace audit daemon. Fixing this properly
had quite the cascading effect and what we are left with is this rather
large and complicated patch. My initial goal was to try and decompose
this patch into multiple smaller patches, but the way these changes
are intertwined makes it difficult to split these changes into
meaningful pieces that don't break or somehow make things worse for
the intermediate states.
The patch makes a number of changes, but the most significant are
highlighted below:
* The auditd tracking variables, e.g. audit_sock, are now gone and
replaced by a RCU/spin_lock protected variable auditd_conn which is
a structure containing all of the auditd tracking information.
* We no longer track the auditd sock directly, instead we track it
via the network namespace in which it resides and we use the audit
socket associated with that namespace. In spirit, this is what the
code was trying to do prior to this patch (at least I think that is
what the original authors intended), but it was done rather poorly
and added a layer of obfuscation that only masked the underlying
problems.
* Big backlog queue cleanup, again. In v4.10 we made some pretty big
changes to how the audit backlog queues work, here we haven't changed
the queue design so much as cleaned up the implementation. Brought
about by the locking changes, we've simplified kauditd_thread() quite
a bit by consolidating the queue handling into a new helper function,
kauditd_send_queue(), which allows us to eliminate a lot of very
similar code and makes the looping logic in kauditd_thread() clearer.
* All netlink messages sent to auditd are now sent via
auditd_send_unicast_skb(). Other than just making sense, this makes
the lock handling easier.
* Change the audit_log_start() sleep behavior so that we never sleep
on auditd events (unchanged) or if the caller is holding the
audit_cmd_mutex (changed). Previously we didn't sleep if the caller
was auditd or if the message type fell between a certain range; the
type check was a poor effort of doing what the cmd_mutex check now
does. Richard Guy Briggs originally proposed not sleeping the
cmd_mutex owner several years ago but his patch wasn't acceptable
at the time. At least the idea lives on here.
* A problem with the lost record counter has been resolved. Steve
Grubb and I both happened to notice this problem and according to
some quick testing by Steve, this problem goes back quite some time.
It's largely a harmless problem, although it may have left some
careful sysadmins quite puzzled.
Cc: <stable@vger.kernel.org> # 4.10.x-
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-21 23:26:35 +08:00

/**
 * auditd_test_task - Check to see if a given task is an audit daemon
 * @task: the task to check
 *
 * Description:
 * Return 1 if the task is a registered audit daemon, 0 otherwise.
 */
int auditd_test_task(struct task_struct *task)
{
	int rc;
	struct auditd_connection *ac;

	rcu_read_lock();
	ac = rcu_dereference(auditd_conn);
	rc = (ac && ac->pid == task_tgid(task) ? 1 : 0);
	rcu_read_unlock();

	return rc;
}

/**
 * audit_ctl_lock - Take the audit control lock
 */
void audit_ctl_lock(void)
{
	mutex_lock(&audit_cmd_mutex.lock);
	audit_cmd_mutex.owner = current;
}

/**
 * audit_ctl_unlock - Drop the audit control lock
 */
void audit_ctl_unlock(void)
{
	audit_cmd_mutex.owner = NULL;
	mutex_unlock(&audit_cmd_mutex.lock);
}

/**
 * audit_ctl_owner_current - Test to see if the current task owns the lock
 *
 * Description:
 * Return true if the current task owns the audit control lock, false if it
 * doesn't own the lock.
 */
static bool audit_ctl_owner_current(void)
{
	return (current == audit_cmd_mutex.owner);
}
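
The owner-tracking lock above can be approximated in user space. The following is a minimal sketch under stated assumptions, not kernel code: POSIX threads stand in for `struct mutex` and `current`, and the names `ctl_mutex`, `ctl_lock`, `ctl_unlock` and `ctl_owner_is_self` are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space analogue of struct audit_ctl_mutex. */
struct ctl_mutex {
	pthread_mutex_t lock;	/* the mutex used for locking */
	pthread_t owner;	/* meaningful only while has_owner is true */
	bool has_owner;		/* plays the role of owner != NULL */
};

static struct ctl_mutex ctl = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Take the lock first, then record the caller as the owner. */
static void ctl_lock(void)
{
	pthread_mutex_lock(&ctl.lock);
	ctl.owner = pthread_self();
	ctl.has_owner = true;
}

/* Clear the owner before releasing, mirroring audit_ctl_unlock(). */
static void ctl_unlock(void)
{
	ctl.has_owner = false;
	pthread_mutex_unlock(&ctl.lock);
}

/*
 * Report whether the calling thread holds the lock.  Like the kernel
 * helper, this reads the owner without taking the lock; a portable
 * multi-threaded version would likely need atomics for that read.
 */
static bool ctl_owner_is_self(void)
{
	return ctl.has_owner && pthread_equal(ctl.owner, pthread_self());
}
```

A caller that might otherwise block while holding the lock can test `ctl_owner_is_self()` first and skip the wait, which is the purpose the struct's description gives for tracking ownership.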

/**
 * auditd_pid_vnr - Return the auditd PID relative to the namespace
 *
 * Description:
 * Returns the PID in relation to the namespace, 0 on failure.
 */
static pid_t auditd_pid_vnr(void)
{
	pid_t pid;
	const struct auditd_connection *ac;

	rcu_read_lock();
	ac = rcu_dereference(auditd_conn);
	if (!ac || !ac->pid)
		pid = 0;
	else
		pid = pid_vnr(ac->pid);
	rcu_read_unlock();

	return pid;
}

/**
 * audit_get_sk - Return the audit socket for the given network namespace
 * @net: the destination network namespace
 *
 * Description:
 * Returns the sock pointer if valid, NULL otherwise.  The caller must ensure
 * that a reference is held for the network namespace while the sock is in use.
 */
static struct sock *audit_get_sk(const struct net *net)
{
	struct audit_net *aunet;

	if (!net)
		return NULL;

	aunet = net_generic(net, audit_net_id);
	return aunet->sk;
}

void audit_panic(const char *message)
{
	switch (audit_failure) {
	case AUDIT_FAIL_SILENT:
		break;
	case AUDIT_FAIL_PRINTK:
		if (printk_ratelimit())
			pr_err("%s\n", message);
		break;
	case AUDIT_FAIL_PANIC:
		panic("audit: %s\n", message);
		break;
	}
}

static inline int audit_rate_check(void)
{
	static unsigned long last_check = 0;
	static int messages = 0;
	static DEFINE_SPINLOCK(lock);
	unsigned long flags;
	unsigned long now;
	unsigned long elapsed;
	int retval = 0;

	if (!audit_rate_limit)
		return 1;

	spin_lock_irqsave(&lock, flags);
	if (++messages < audit_rate_limit) {
		retval = 1;
	} else {
		now = jiffies;
		elapsed = now - last_check;
		if (elapsed > HZ) {
			last_check = now;
			messages = 0;
			retval = 1;
		}
	}
	spin_unlock_irqrestore(&lock, flags);

	return retval;
}
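
The windowed counter in audit_rate_check() can be exercised outside the kernel. The sketch below is illustrative only: the caller passes the time in explicitly instead of reading `jiffies`, `TICKS_PER_SEC` stands in for `HZ`, the spinlock is omitted, and `struct rate_limiter` and `rate_check` are hypothetical names.

```c
#include <stdbool.h>

#define TICKS_PER_SEC 100UL	/* assumed stand-in for HZ */

/* Hypothetical per-limiter state; the kernel keeps this in statics. */
struct rate_limiter {
	unsigned long last_check;	/* tick of the last window reset */
	unsigned int messages;		/* messages counted in this window */
	unsigned int limit;		/* 0 disables rate limiting */
};

/* Return true if a message may be emitted at tick 'now'. */
static bool rate_check(struct rate_limiter *rl, unsigned long now)
{
	if (!rl->limit)
		return true;

	/* Below the limit: allow without looking at the clock. */
	if (++rl->messages < rl->limit)
		return true;

	/* At or over the limit: allow only once the window has expired. */
	if (now - rl->last_check > TICKS_PER_SEC) {
		rl->last_check = now;
		rl->messages = 0;
		return true;
	}
	return false;
}
```

With `limit = 3`, two calls in the same tick succeed and the third is refused until more than `TICKS_PER_SEC` ticks have passed since the last reset. The unsigned subtraction keeps the elapsed-time test valid across counter wraparound.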

/**
 * audit_log_lost - conditionally log lost audit message event
 * @message: the message stating reason for lost audit message
 *
 * Emit at least 1 message per second, even if audit_rate_check is
 * throttling.
 * Always increment the lost messages counter.
 */
void audit_log_lost(const char *message)
{
	static unsigned long last_msg = 0;
	static DEFINE_SPINLOCK(lock);
	unsigned long flags;
	unsigned long now;
	int print;

	atomic_inc(&audit_lost);

	print = (audit_failure == AUDIT_FAIL_PANIC || !audit_rate_limit);

	if (!print) {
		spin_lock_irqsave(&lock, flags);
		now = jiffies;
		if (now - last_msg > HZ) {
			print = 1;
			last_msg = now;
		}
		spin_unlock_irqrestore(&lock, flags);
	}

	if (print) {
		if (printk_ratelimit())
			pr_warn("audit_lost=%u audit_rate_limit=%u audit_backlog_limit=%u\n",
				atomic_read(&audit_lost),
				audit_rate_limit,
				audit_backlog_limit);
		audit_panic(message);
	}
}
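
The reporting decision in audit_log_lost() reduces to "always count, but report only when forced or when more than a second has passed since the last report". A minimal user-space sketch of that decision follows, assuming an explicit tick argument in place of `jiffies` and no locking; `struct lost_reporter` and `lost_should_print` are hypothetical names.

```c
#include <stdbool.h>

#define TICKS_PER_SEC 100UL	/* assumed stand-in for HZ */

/* Hypothetical state; the kernel uses audit_lost and a static last_msg. */
struct lost_reporter {
	unsigned long lost;	/* total lost messages, always incremented */
	unsigned long last_msg;	/* tick of the last report */
};

/*
 * Count one lost message and decide whether to report it.  'always'
 * models the panic-mode / no-rate-limit case, which bypasses throttling.
 */
static bool lost_should_print(struct lost_reporter *lr, bool always,
			      unsigned long now)
{
	lr->lost++;

	if (always)
		return true;

	if (now - lr->last_msg > TICKS_PER_SEC) {
		lr->last_msg = now;
		return true;
	}
	return false;
}
```

The counter keeps advancing while reports are suppressed, so the next report that does get through carries the full total.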

static int audit_log_config_change(char *function_name, u32 new, u32 old,
				   int allow_changes)
{
	struct audit_buffer *ab;
	int rc = 0;

	ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE);
	if (unlikely(!ab))
		return rc;
	audit_log_format(ab, "%s=%u old=%u", function_name, new, old);
	audit_log_session_info(ab);
	rc = audit_log_task_context(ab);
	if (rc)
		allow_changes = 0; /* Something weird, deny request */
	audit_log_format(ab, " res=%d", allow_changes);
	audit_log_end(ab);
	return rc;
}

static int audit_do_config_change(char *function_name, u32 *to_change, u32 new)
{
	int allow_changes, rc = 0;
	u32 old = *to_change;

	/* check if we are locked */
	if (audit_enabled == AUDIT_LOCKED)
		allow_changes = 0;
	else
		allow_changes = 1;

	if (audit_enabled != AUDIT_OFF) {
		rc = audit_log_config_change(function_name, new, old, allow_changes);
		if (rc)
			allow_changes = 0;
	}

	/* If we are allowed, make the change */
	if (allow_changes == 1)
		*to_change = new;
	/* Not allowed, update reason */
	else if (rc == 0)
		rc = -EPERM;
	return rc;
}
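
The gate in audit_do_config_change() (refuse when locked, log when auditing is on, deny if logging fails) can be sketched in user space as follows. This is an illustration under assumptions, not the kernel implementation: `cfg_enabled`, `cfg_log` and `cfg_do_change` are hypothetical names, and the optional `cfg_log` hook stands in for audit_log_config_change().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror AUDIT_OFF / AUDIT_ON / AUDIT_LOCKED. */
enum { CFG_OFF = 0, CFG_ON = 1, CFG_LOCKED = 2 };

static uint32_t cfg_enabled = CFG_ON;

/* Hypothetical logging hook: 0 on success, negative errno on failure. */
static int (*cfg_log)(const char *name, uint32_t new_val, uint32_t old_val,
		      int allowed) = NULL;

/* Apply new_val to *to_change unless the configuration is locked. */
static int cfg_do_change(const char *name, uint32_t *to_change,
			 uint32_t new_val)
{
	uint32_t old_val = *to_change;
	int allow = (cfg_enabled != CFG_LOCKED);
	int rc = 0;

	/* Record the attempt; a logging failure turns into a denial. */
	if (cfg_enabled != CFG_OFF && cfg_log) {
		rc = cfg_log(name, new_val, old_val, allow);
		if (rc)
			allow = 0;
	}

	if (allow)
		*to_change = new_val;
	else if (rc == 0)
		rc = -EPERM;	/* denied by the lock, not by the logger */
	return rc;
}
```

Note that the attempt is logged before the value changes, so a refused request still leaves a record with `allowed` set to 0.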

static int audit_set_rate_limit(u32 limit)
{
	return audit_do_config_change("audit_rate_limit", &audit_rate_limit, limit);
}

static int audit_set_backlog_limit(u32 limit)
{
	return audit_do_config_change("audit_backlog_limit", &audit_backlog_limit, limit);
}

static int audit_set_backlog_wait_time(u32 timeout)
{
	return audit_do_config_change("audit_backlog_wait_time",
				      &audit_backlog_wait_time, timeout);
}

static int audit_set_enabled(u32 state)
{
	int rc;
	if (state > AUDIT_LOCKED)
		return -EINVAL;

	rc = audit_do_config_change("audit_enabled", &audit_enabled, state);
	if (!rc)
		audit_ever_enabled |= !!state;

	return rc;
}

static int audit_set_failure(u32 state)
{
	if (state != AUDIT_FAIL_SILENT
	    && state != AUDIT_FAIL_PRINTK
	    && state != AUDIT_FAIL_PANIC)
		return -EINVAL;

	return audit_do_config_change("audit_failure", &audit_failure, state);
}

/**
 * auditd_conn_free - RCU helper to release an auditd connection struct
 * @rcu: RCU head
 *
 * Description:
 * Drop any references inside the auditd connection tracking struct and free
 * the memory.
 */
static void auditd_conn_free(struct rcu_head *rcu)
{
	struct auditd_connection *ac;

	ac = container_of(rcu, struct auditd_connection, rcu);
	put_pid(ac->pid);
	put_net(ac->net);
	kfree(ac);
}
|
|
|
|
|
audit: fix auditd/kernel connection state tracking
What started as a rather straightforward race condition reported by
Dmitry using the syzkaller fuzzer ended up revealing some major
problems with how the audit subsystem managed its netlink sockets and
its connection with the userspace audit daemon. Fixing this properly
had quite the cascading effect and what we are left with is this rather
large and complicated patch. My initial goal was to try and decompose
this patch into multiple smaller patches, but the way these changes
are intertwined makes it difficult to split these changes into
meaningful pieces that don't break or somehow make things worse for
the intermediate states.
The patch makes a number of changes, but the most significant are
highlighted below:
* The auditd tracking variables, e.g. audit_sock, are now gone and
replaced by a RCU/spin_lock protected variable auditd_conn which is
a structure containing all of the auditd tracking information.
* We no longer track the auditd sock directly, instead we track it
via the network namespace in which it resides and we use the audit
socket associated with that namespace. In spirit, this is what the
code was trying to do prior to this patch (at least I think that is
what the original authors intended), but it was done rather poorly
and added a layer of obfuscation that only masked the underlying
problems.
* Big backlog queue cleanup, again. In v4.10 we made some pretty big
changes to how the audit backlog queues work, here we haven't changed
the queue design so much as cleaned up the implementation. Brought
about by the locking changes, we've simplified kauditd_thread() quite
a bit by consolidating the queue handling into a new helper function,
kauditd_send_queue(), which allows us to eliminate a lot of very
similar code and makes the looping logic in kauditd_thread() clearer.
* All netlink messages sent to auditd are now sent via
auditd_send_unicast_skb(). Other than just making sense, this makes
the lock handling easier.
* Change the audit_log_start() sleep behavior so that we never sleep
on auditd events (unchanged) or if the caller is holding the
audit_cmd_mutex (changed). Previously we didn't sleep if the caller
was auditd or if the message type fell between a certain range; the
type check was a poor effort of doing what the cmd_mutex check now
does. Richard Guy Briggs originally proposed not sleeping the
cmd_mutex owner several years ago but his patch wasn't acceptable
at the time. At least the idea lives on here.
* A problem with the lost record counter has been resolved. Steve
Grubb and I both happened to notice this problem and according to
some quick testing by Steve, this problem goes back quite some time.
It's largely a harmless problem, although it may have left some
careful sysadmins quite puzzled.
Cc: <stable@vger.kernel.org> # 4.10.x-
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-21 23:26:35 +08:00
/**
 * auditd_set - Set/Reset the auditd connection state
 * @pid: auditd PID
 * @portid: auditd netlink portid
 * @net: auditd network namespace pointer
 *
 * Description:
 * This function will obtain and drop network namespace references as
 * necessary.  Returns zero on success, negative values on failure.
 */
static int auditd_set(struct pid *pid, u32 portid, struct net *net)
{
	unsigned long flags;
	struct auditd_connection *ac_old, *ac_new;

	if (!pid || !net)
		return -EINVAL;

	ac_new = kzalloc(sizeof(*ac_new), GFP_KERNEL);
	if (!ac_new)
		return -ENOMEM;
	ac_new->pid = get_pid(pid);
	ac_new->portid = portid;
	ac_new->net = get_net(net);

	spin_lock_irqsave(&auditd_conn_lock, flags);
	ac_old = rcu_dereference_protected(auditd_conn,
					   lockdep_is_held(&auditd_conn_lock));
	rcu_assign_pointer(auditd_conn, ac_new);
	spin_unlock_irqrestore(&auditd_conn_lock, flags);

	if (ac_old)
		call_rcu(&ac_old->rcu, auditd_conn_free);

	return 0;
}
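The update pattern used by auditd_set() (build the replacement outside the lock, swap the pointer under the lock, retire the old object afterwards) can be sketched in userspace. This is only an illustration under stated assumptions, not the kernel code: `struct conn`, `conn_set()`, `conn_lock`, `current_conn` and `conns_freed` are hypothetical names, a pthread mutex stands in for the spinlock, and a plain free() stands in for call_rcu(), which is safe here only because the sketch has no lockless readers.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct auditd_connection. */
struct conn {
	int pid;
	uint32_t portid;
};

static pthread_mutex_t conn_lock = PTHREAD_MUTEX_INITIALIZER;
static struct conn *current_conn;	/* NULL when no daemon is registered */
static int conns_freed;			/* retired objects, for illustration only */

/* Publish a new connection and retire whatever was registered before. */
static int conn_set(int pid, uint32_t portid)
{
	struct conn *new_conn, *old_conn;

	if (pid <= 0)
		return -EINVAL;

	/* Build the replacement completely before taking the lock. */
	new_conn = calloc(1, sizeof(*new_conn));
	if (!new_conn)
		return -ENOMEM;
	new_conn->pid = pid;
	new_conn->portid = portid;

	/* Only the pointer swap happens while the lock is held. */
	pthread_mutex_lock(&conn_lock);
	old_conn = current_conn;
	current_conn = new_conn;
	pthread_mutex_unlock(&conn_lock);

	/* The kernel defers this step with call_rcu(); this sketch has no
	 * lockless readers, so freeing immediately is safe. */
	if (old_conn) {
		assert(old_conn != new_conn);
		free(old_conn);
		conns_freed++;
	}

	return 0;
}
```

Calling `conn_set()` twice in a row replaces the first connection with the second and frees the first exactly once; keeping the allocation outside the critical section is what lets the kernel version use GFP_KERNEL even though the swap itself runs under a spinlock.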

/**
 * kauditd_printk_skb - Print the audit record to the ring buffer
 * @skb: audit record
 *
 * Whatever the reason, this packet may not make it to the auditd connection
 * so write it via printk so the information isn't completely lost.
 */
static void kauditd_printk_skb(struct sk_buff *skb)
{
	struct nlmsghdr *nlh = nlmsg_hdr(skb);
	char *data = nlmsg_data(nlh);

	if (nlh->nlmsg_type != AUDIT_EOE && printk_ratelimit())
		pr_notice("type=%d %s\n", nlh->nlmsg_type, data);
}

/**
 * kauditd_rehold_skb - Handle an audit record send failure in the hold queue
 * @skb: audit record
 *
 * Description:
 * This should only be used by the kauditd_thread when it fails to flush the
 * hold queue.
 */
static void kauditd_rehold_skb(struct sk_buff *skb)
{
	/* put the record back in the queue at the same place */
	skb_queue_head(&audit_hold_queue, skb);
}

/**
 * kauditd_hold_skb - Queue an audit record, waiting for auditd
 * @skb: audit record
 *
 * Description:
 * Queue the audit record, waiting for an instance of auditd.  When this
 * function is called we haven't given up yet on sending the record, but things
 * are not looking good.  The first thing we want to do is try to write the
 * record via printk and then see if we want to try and hold on to the record
 * and queue it, if we have room.  If we want to hold on to the record, but we
 * don't have room, record a record lost message.
 */
static void kauditd_hold_skb(struct sk_buff *skb)
{
	/* at this point it is uncertain if we will ever send this to auditd so
	 * try to send the message via printk before we go any further */
	kauditd_printk_skb(skb);

	/* can we just silently drop the message? */
	if (!audit_default) {
		kfree_skb(skb);
		return;
	}

	/* if we have room, queue the message */
	if (!audit_backlog_limit ||
	    skb_queue_len(&audit_hold_queue) < audit_backlog_limit) {
		skb_queue_tail(&audit_hold_queue, skb);
		return;
	}

	/* we have no other options - drop the message */
	audit_log_lost("kauditd hold queue overflow");
	kfree_skb(skb);
}
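The three outcomes of kauditd_hold_skb() (silent drop, queue, or drop with a lost-record report) can be modelled as a small decision function. This is a userspace sketch under stated assumptions: `struct hold_state`, `enum hold_result` and `hold_record()` are hypothetical names, the counters stand in for the real sk_buff queue and for audit_log_lost(), and the initial printk step is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes, one per exit path of the kernel function. */
enum hold_result {
	HOLD_DROPPED_SILENTLY,	/* record freed, nothing reported */
	HOLD_QUEUED,		/* record appended to the hold queue */
	HOLD_LOST,		/* record freed and counted as lost */
};

/* Hypothetical model of the state the kernel function consults. */
struct hold_state {
	bool keep_records;	/* stands in for audit_default */
	size_t backlog_limit;	/* stands in for audit_backlog_limit, 0 = no limit */
	size_t queued;		/* stands in for skb_queue_len(&audit_hold_queue) */
	size_t lost;		/* records reported as lost */
};

static enum hold_result hold_record(struct hold_state *st)
{
	assert(st != NULL);

	/* can we just silently drop the record? */
	if (!st->keep_records)
		return HOLD_DROPPED_SILENTLY;

	/* if there is room (or no limit at all), queue the record */
	if (st->backlog_limit == 0 || st->queued < st->backlog_limit) {
		st->queued++;
		return HOLD_QUEUED;
	}

	/* no other options: drop the record and account for the loss */
	st->lost++;
	return HOLD_LOST;
}
```

With `backlog_limit` set to 2, the first two calls queue and the third reports a loss; a limit of 0 never overflows, which mirrors the `!audit_backlog_limit` short-circuit in the code above.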

/**
 * kauditd_retry_skb - Queue an audit record, attempt to send again to auditd
 * @skb: audit record
 *
 * Description:
 * Not as serious as kauditd_hold_skb() as we still have a connected auditd,
 * but for some reason we are having problems sending it audit records so
 * queue the given record and attempt to resend.
 */
static void kauditd_retry_skb(struct sk_buff *skb)
{
	/* NOTE: because records should only live in the retry queue for a
	 * short period of time, before either being sent or moved to the hold
	 * queue, we don't currently enforce a limit on this queue */
	skb_queue_tail(&audit_retry_queue, skb);
}

audit: try harder to send to auditd upon netlink failure
There are several reports of the kernel losing contact with auditd when
it is, in fact, still running. When this happens, kernel syslogs show:
"audit: *NO* daemon at audit_pid=<pid>"
although auditd is still running, and is apparently happy, listening on
the netlink socket. The pid in the "*NO* daemon" message matches the pid
of the running auditd process. Restarting auditd solves this.
The problem appears to happen randomly, and doesn't seem to be strongly
correlated to the rate of audit events being logged. The problem
happens fairly regularly (every few days), but not yet reproduced to
order.
On production kernels, BUG_ON() is a no-op, so any error will trigger
this.
Commit 34eab0a7cd45 ("audit: prevent an older auditd shutdown from
orphaning a newer auditd startup") eliminates one possible cause. This
isn't the case here, since the PID in the error message and the PID of
the running auditd match.
The primary expected cause of error here is -ECONNREFUSED when the audit
daemon goes away, when netlink_getsockbyportid() can't find the auditd
portid entry in the netlink audit table (or there is no receive
function). If -EPERM is returned, that situation isn't likely to be
resolved in a timely fashion without administrator intervention. In
both cases, reset the audit_pid. This does not rule out a race
condition. SELinux is expected to return zero since this isn't an INET
or INET6 socket. Other LSMs may have other return codes. Log the error
code for better diagnosis in the future.
In the case of -ENOMEM, the situation could be temporary, based on local
or general availability of buffers. -EAGAIN should never happen since
the netlink audit (kernel) socket is set to MAX_SCHEDULE_TIMEOUT.
-ERESTARTSYS and -EINTR are not expected since this kernel thread is not
expected to receive signals. In these cases (or any other unexpected
ones for now), report the error and re-schedule the thread, retrying up
to 5 times.
v2:
Removed BUG_ON().
Moved comma in pr_*() statements.
Removed audit_strerror() text.
Reported-by: Vipin Rathor <v.rathor@gmail.com>
Reported-by: <ctcard@hotmail.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: applied rgb's fixup patch to correct audit_log_lost() format issues]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-11-04 21:23:50 +08:00

/**
 * auditd_reset - Disconnect the auditd connection
 * @ac: auditd connection state
 *
 * Description:
 * Break the auditd/kauditd connection and move all the queued records into the
 * hold queue in case auditd reconnects.  It is important to note that the @ac
 * pointer should never be dereferenced inside this function as it may be NULL
 * or invalid, you can only compare the memory address!  If @ac is NULL then
 * the connection will always be reset.
 */
static void auditd_reset(const struct auditd_connection *ac)
{
	unsigned long flags;
	struct sk_buff *skb;
	struct auditd_connection *ac_old;

	/* if it isn't already broken, break the connection */
	spin_lock_irqsave(&auditd_conn_lock, flags);
	ac_old = rcu_dereference_protected(auditd_conn,
					   lockdep_is_held(&auditd_conn_lock));
	if (ac && ac != ac_old) {
		/* someone already registered a new auditd connection */
		spin_unlock_irqrestore(&auditd_conn_lock, flags);
		return;
	}
	rcu_assign_pointer(auditd_conn, NULL);
	spin_unlock_irqrestore(&auditd_conn_lock, flags);

	if (ac_old)
		call_rcu(&ac_old->rcu, auditd_conn_free);

	/* flush the retry queue to the hold queue, but don't touch the main
	 * queue since we need to process that normally for multicast */
	while ((skb = skb_dequeue(&audit_retry_queue)))
		kauditd_hold_skb(skb);
}
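The "compare the address, never dereference it" rule in auditd_reset() can be shown in isolation. The sketch below is a single-threaded userspace model under stated assumptions: `struct daemon_conn`, `struct conn_table` and `conn_reset()` are hypothetical names, and the locking and RCU steps of the real function are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct auditd_connection. */
struct daemon_conn {
	int pid;
};

/* Hypothetical holder for the currently registered connection. */
struct conn_table {
	const struct daemon_conn *registered;	/* NULL when disconnected */
};

/*
 * Clear the registered connection, but only if the caller's view is still
 * current.  'expected' may be stale, so it is compared and never
 * dereferenced.  Returns true if the connection was cleared.
 */
static bool conn_reset(struct conn_table *table,
		       const struct daemon_conn *expected)
{
	assert(table != NULL);

	/* someone already registered a newer connection: leave it alone */
	if (expected && expected != table->registered)
		return false;

	/* a NULL 'expected' always resets */
	table->registered = NULL;
	return true;
}
```

A caller that saw a send fail on connection A passes A's address; if connection B has been registered in the meantime the call is a no-op, so a late failure report cannot disconnect a newer daemon.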

/**
 * auditd_send_unicast_skb - Send a record via unicast to auditd
 * @skb: audit record
 *
 * Description:
 * Send a skb to the audit daemon, returns positive/zero values on success and
 * negative values on failure; in all cases the skb will be consumed by this
 * function.  If the send results in -ECONNREFUSED the connection with auditd
 * will be reset.  This function may sleep so callers should not hold any locks
 * where this would cause a problem.
 */
|
|
|
static int auditd_send_unicast_skb(struct sk_buff *skb)
|
2016-11-30 05:53:25 +08:00
|
|
|
{
|
2017-03-21 23:26:35 +08:00
|
|
|
int rc;
|
|
|
|
u32 portid;
|
|
|
|
struct net *net;
|
|
|
|
struct sock *sk;
|
2017-05-02 22:16:05 +08:00
|
|
|
struct auditd_connection *ac;
|
2017-03-21 23:26:35 +08:00
|
|
|
|
|
|
|
/* NOTE: we can't call netlink_unicast while in the RCU section so
|
|
|
|
* take a reference to the network namespace and grab local
|
|
|
|
* copies of the namespace, the sock, and the portid; the
|
|
|
|
* namespace and sock aren't going to go away while we hold a
|
|
|
|
* reference and if the portid does become invalid after the RCU
|
|
|
|
* section netlink_unicast() should safely return an error */
|
|
|
|
|
|
|
|
rcu_read_lock();
|
2017-05-02 22:16:05 +08:00
|
|
|
ac = rcu_dereference(auditd_conn);
|
|
|
|
if (!ac) {
|
2017-03-21 23:26:35 +08:00
|
|
|
rcu_read_unlock();
|
2017-07-18 14:37:24 +08:00
|
|
|
kfree_skb(skb);
|
2017-03-21 23:26:35 +08:00
|
|
|
rc = -ECONNREFUSED;
|
|
|
|
goto err;
|
2016-12-13 23:03:01 +08:00
|
|
|
}
|
2017-05-02 22:16:05 +08:00
|
|
|
net = get_net(ac->net);
|
2017-03-21 23:26:35 +08:00
|
|
|
sk = audit_get_sk(net);
|
2017-05-02 22:16:05 +08:00
|
|
|
portid = ac->portid;
|
2017-03-21 23:26:35 +08:00
|
|
|
rcu_read_unlock();
|
2016-11-30 05:53:25 +08:00
|
|
|
|
2017-03-21 23:26:35 +08:00
|
|
|
rc = netlink_unicast(sk, skb, portid, 0);
|
|
|
|
put_net(net);
|
|
|
|
if (rc < 0)
|
|
|
|
goto err;
|
|
|
|
|
|
|
|
return rc;
|
|
|
|
|
|
|
|
err:
|
2017-06-12 21:35:24 +08:00
|
|
|
if (ac && rc == -ECONNREFUSED)
|
|
|
|
auditd_reset(ac);
|
2017-03-21 23:26:35 +08:00
|
|
|
return rc;
|
2016-11-30 05:53:25 +08:00
|
|
|
}
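The snapshot-then-send pattern described in the NOTE comment above can be sketched in user space. This is only an illustration under stated assumptions, not the kernel implementation: `struct conn`, `cur_conn`, `fake_unicast()`, `conn_register()`, `conn_is_registered()`, `conn_reset()` and `send_record()` are hypothetical names, a C11 atomic pointer stands in for the RCU-protected `auditd_conn`, and connection objects are assumed to outlive every sender (the kernel relies on RCU and the namespace reference for that guarantee).

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct auditd_connection. */
struct conn {
	uint32_t portid;
};

/* Current daemon connection; NULL means no daemon is registered. */
static _Atomic(struct conn *) cur_conn;

/* Hypothetical transport: pretend that odd port ids refuse the message. */
static int fake_unicast(uint32_t portid)
{
	return (portid & 1u) ? -ECONNREFUSED : 0;
}

static void conn_register(struct conn *c)
{
	atomic_store(&cur_conn, c);
}

static bool conn_is_registered(void)
{
	return atomic_load(&cur_conn) != NULL;
}

/* Drop the connection, but only if it is still the one the caller saw. */
static void conn_reset(struct conn *seen)
{
	struct conn *expected = seen;

	atomic_compare_exchange_strong(&cur_conn, &expected,
				       (struct conn *)NULL);
}

static int send_record(void)
{
	struct conn *seen;
	uint32_t portid;
	int rc;

	/* Step 1: take local copies of everything the send needs. */
	seen = atomic_load(&cur_conn);
	if (!seen)
		return -ECONNREFUSED;
	portid = seen->portid;

	/* Step 2: do the (possibly slow) send using only the local copies. */
	rc = fake_unicast(portid);

	/* Step 3: a refused send tears down the connection we snapshotted. */
	if (rc == -ECONNREFUSED)
		conn_reset(seen);
	return rc;
}
```

With no connection registered, `send_record()` returns `-ECONNREFUSED` without touching the transport. After registering a connection whose port id is refused, the call returns the same error and clears `cur_conn`, so later senders fail fast until a new daemon registers. The compare-and-swap in `conn_reset()` is a design choice of this sketch: it avoids tearing down a newer connection that replaced the one the sender saw.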
|
|
|
|
|
|
|
|
/**
|
2017-03-21 23:26:35 +08:00
|
|
|
* kauditd_send_queue - Helper for kauditd_thread to flush skb queues
|
|
|
|
* @sk: the sending sock
|
|
|
|
* @portid: the netlink destination
|
|
|
|
* @queue: the skb queue to process
|
|
|
|
* @retry_limit: limit on number of netlink unicast failures
|
|
|
|
* @skb_hook: per-skb hook for additional processing
|
|
|
|
* @err_hook: hook called if the skb fails the netlink unicast send
|
|
|
|
*
|
|
|
|
* Description:
|
|
|
|
* Run through the given queue and attempt to send the audit records to auditd,
|
|
|
|
* returns zero on success, negative values on failure. It is up to the caller
|
|
|
|
* to ensure that the @sk is valid for the duration of this function.
|
|
|
|
*
|
2016-11-30 05:53:25 +08:00
|
|
|
*/
|
2017-03-21 23:26:35 +08:00
static int kauditd_send_queue(struct sock *sk, u32 portid,
			      struct sk_buff_head *queue,
			      unsigned int retry_limit,
			      void (*skb_hook)(struct sk_buff *skb),
			      void (*err_hook)(struct sk_buff *skb))
2016-11-30 05:53:25 +08:00
{
2017-03-21 23:26:35 +08:00
	int rc = 0;
	struct sk_buff *skb;
	static unsigned int failed = 0;
audit: try harder to send to auditd upon netlink failure
There are several reports of the kernel losing contact with auditd when
it is, in fact, still running. When this happens, kernel syslogs show:
"audit: *NO* daemon at audit_pid=<pid>"
although auditd is still running, and is apparently happy, listening on
the netlink socket. The pid in the "*NO* daemon" message matches the pid
of the running auditd process. Restarting auditd solves this.
The problem appears to happen randomly, and doesn't seem to be strongly
correlated to the rate of audit events being logged. The problem
happens fairly regularly (every few days), but not yet reproduced to
order.
On production kernels, BUG_ON() is a no-op, so any error will trigger
this.
Commit 34eab0a7cd45 ("audit: prevent an older auditd shutdown from
orphaning a newer auditd startup") eliminates one possible cause. This
isn't the case here, since the PID in the error message and the PID of
the running auditd match.
The primary expected cause of error here is -ECONNREFUSED when the audit
daemon goes away, when netlink_getsockbyportid() can't find the auditd
portid entry in the netlink audit table (or there is no receive
function). If -EPERM is returned, that situation isn't likely to be
resolved in a timely fashion without administrator intervention. In
both cases, reset the audit_pid. This does not rule out a race
condition. SELinux is expected to return zero since this isn't an INET
or INET6 socket. Other LSMs may have other return codes. Log the error
code for better diagnosis in the future.
In the case of -ENOMEM, the situation could be temporary, based on local
or general availability of buffers. -EAGAIN should never happen since
the netlink audit (kernel) socket is set to MAX_SCHEDULE_TIMEOUT.
-ERESTARTSYS and -EINTR are not expected since this kernel thread is not
expected to receive signals. In these cases (or any other unexpected
ones for now), report the error and re-schedule the thread, retrying up
to 5 times.
v2:
Removed BUG_ON().
Moved comma in pr_*() statements.
Removed audit_strerror() text.
Reported-by: Vipin Rathor <v.rathor@gmail.com>
Reported-by: <ctcard@hotmail.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: applied rgb's fixup patch to correct audit_log_lost() format issues]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-11-04 21:23:50 +08:00
2017-03-21 23:26:35 +08:00
	/* NOTE: kauditd_thread takes care of all our locking, we just use
	 *       the netlink info passed to us (e.g. sk and portid) */

	while ((skb = skb_dequeue(queue))) {
		/* call the skb_hook for each skb we touch */
		if (skb_hook)
			(*skb_hook)(skb);

		/* can we send to anyone via unicast? */
		if (!sk) {
			if (err_hook)
				(*err_hook)(skb);
			continue;
		}
2016-11-30 05:53:26 +08:00
2017-03-21 23:26:35 +08:00
		/* grab an extra skb reference in case of error */
		skb_get(skb);
		rc = netlink_unicast(sk, skb, portid, 0);
		if (rc < 0) {
			/* fatal failure for our queue flush attempt? */
			if (++failed >= retry_limit ||
			    rc == -ECONNREFUSED || rc == -EPERM) {
				/* yes - error processing for the queue */
				sk = NULL;
				if (err_hook)
					(*err_hook)(skb);
				if (!skb_hook)
					goto out;
				/* keep processing with the skb_hook */
				continue;
			} else
				/* no - requeue to preserve ordering */
				skb_queue_head(queue, skb);
		} else {
			/* it worked - drop the extra reference and continue */
			consume_skb(skb);
			failed = 0;
		}
2016-11-30 05:53:25 +08:00
	}
2017-03-21 23:26:35 +08:00
out:
	return (rc >= 0 ? 0 : rc);
2008-04-18 22:02:28 +08:00
}
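The retry policy of kauditd_send_queue() can be tried outside the kernel with a rough userspace sketch. Everything named here (rec_queue, fake_send, flush_queue) is hypothetical and not part of audit.c. Records are plain ints, the failure counter is passed in by pointer instead of living in a function-static variable, and the skb and error hooks are left out, so a record that hits a fatal error is simply dropped where the kernel helper would hand it to err_hook.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define QCAP 16

/* Hypothetical stand-in for an skb queue: a small array-backed deque. */
struct rec_queue {
	int items[QCAP];
	size_t head;	/* index of the oldest queued record */
	size_t len;	/* number of queued records */
};

static void queue_init(struct rec_queue *q, const int *src, size_t n)
{
	size_t i;

	assert(n <= QCAP);
	for (i = 0; i < n; i++)
		q->items[i] = src[i];
	q->head = 0;
	q->len = n;
}

/* Remove the oldest record; returns 0 when the queue is empty. */
static int queue_pop(struct rec_queue *q, int *rec)
{
	if (q->len == 0)
		return 0;
	*rec = q->items[q->head++];
	q->len--;
	return 1;
}

/* Put a just-popped record back at the front so ordering is preserved. */
static void queue_push_head(struct rec_queue *q, int rec)
{
	assert(q->head > 0);
	q->items[--q->head] = rec;
	q->len++;
}

/* Scripted sender used in place of netlink_unicast() (an assumption). */
struct fake_sender {
	const int *results;	/* one return code per send attempt */
	size_t nresults;
	size_t attempts;
	size_t delivered;
};

static int fake_send(int rec, void *ctx)
{
	struct fake_sender *s = ctx;
	int rc = s->attempts < s->nresults ? s->results[s->attempts] : 0;

	(void)rec;
	s->attempts++;
	if (rc >= 0)
		s->delivered++;
	return rc;
}

/* Flush the queue: retry transient errors, stop on fatal ones. */
static int flush_queue(struct rec_queue *q, unsigned int retry_limit,
		       unsigned int *failed,
		       int (*send)(int rec, void *ctx), void *ctx)
{
	int rc = 0;
	int rec;

	while (queue_pop(q, &rec)) {
		rc = send(rec, ctx);
		if (rc < 0) {
			/* fatal: too many failures, or an error retrying cannot fix */
			if (++(*failed) >= retry_limit ||
			    rc == -ECONNREFUSED || rc == -EPERM)
				break;
			/* transient: requeue at the head and try again */
			queue_push_head(q, rec);
		} else {
			*failed = 0;
		}
	}
	/* positive send results are normalized to zero */
	return rc >= 0 ? 0 : rc;
}
```

Calling flush_queue() with a scripted fake_sender shows the three outcomes: a transient error such as -ENOMEM is retried in order, -ECONNREFUSED stops the flush at once, and repeated failures stop it once the counter reaches retry_limit.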
2014-04-23 09:31:57 +08:00
/*
2016-11-30 05:53:25 +08:00
 * kauditd_send_multicast_skb - Send a record to any multicast listeners
 * @skb: audit record
2014-04-23 09:31:57 +08:00
 *
2016-11-30 05:53:25 +08:00
 * Description:
2017-03-21 23:26:35 +08:00
 * Write a multicast message to anyone listening in the initial network
 * namespace.  This function doesn't consume an skb as might be expected since
 * it has to copy it anyway.
2014-04-23 09:31:57 +08:00
 */
2016-11-30 05:53:25 +08:00
static void kauditd_send_multicast_skb(struct sk_buff *skb)
2014-04-23 09:31:57 +08:00
{
2016-11-30 05:53:25 +08:00
	struct sk_buff *copy;
2017-03-21 23:26:35 +08:00
	struct sock *sock = audit_get_sk(&init_net);
2016-11-30 05:53:25 +08:00
	struct nlmsghdr *nlh;
2014-04-23 09:31:57 +08:00
2017-03-21 23:26:35 +08:00
	/* NOTE: we are not taking an additional reference for init_net since
	 *       we don't have to worry about it going away */
2014-04-23 09:31:58 +08:00
	if (!netlink_has_listeners(sock, AUDIT_NLGRP_READLOG))
		return;
2014-04-23 09:31:57 +08:00
	/*
	 * The seemingly wasteful skb_copy() rather than bumping the refcount
	 * using skb_get() is necessary because non-standard mods are made to
	 * the skb by the original kaudit unicast socket send routine.  The
	 * existing auditd daemon assumes this breakage.  Fixing this would
	 * require co-ordinating a change in the established protocol between
	 * the kaudit kernel subsystem and the auditd userspace code.  There is
	 * no reason for new multicast clients to continue with this
	 * non-compliance.
	 */
2016-11-30 05:53:25 +08:00
	copy = skb_copy(skb, GFP_KERNEL);
2014-04-23 09:31:57 +08:00
	if (!copy)
		return;
2016-11-30 05:53:25 +08:00
	nlh = nlmsg_hdr(copy);
	nlh->nlmsg_len = skb->len;
2014-04-23 09:31:57 +08:00
2016-11-30 05:53:25 +08:00
	nlmsg_multicast(sock, copy, 0, AUDIT_NLGRP_READLOG, GFP_KERNEL);
2014-04-23 09:31:57 +08:00
}
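The reason kauditd_send_multicast_skb() copies the record instead of sharing it can be illustrated with a small userspace sketch. The struct and function names below are hypothetical and only mimic the idea: the header length is corrected on a private copy, so the original, which the unicast path still uses with its non-standard length, is left untouched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical message: a length field in the header plus the real size. */
struct msg {
	unsigned int hdr_len;	/* length recorded in the header */
	unsigned int len;	/* actual length of the whole message */
	unsigned char data[64];
};

/* Deep copy; returns NULL if the allocation fails. */
static struct msg *msg_copy(const struct msg *m)
{
	struct msg *c = malloc(sizeof(*c));

	if (c)
		memcpy(c, m, sizeof(*c));
	return c;
}

/* Build the multicast variant without touching the original. */
static struct msg *msg_for_multicast(const struct msg *orig)
{
	struct msg *copy = msg_copy(orig);

	if (!copy)
		return NULL;
	/* fix up the header on the copy only */
	copy->hdr_len = copy->len;
	return copy;
}
```

Had the sketch handed out a second reference to the same buffer, the header fix-up would also have changed what the original consumer sees, which is the situation the comment in the kernel function describes.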
2016-11-30 05:53:25 +08:00
/**
2017-03-21 23:26:35 +08:00
|
|
|
* kauditd_thread - Worker thread to send audit records to userspace
|
|
|
|
* @dummy: unused
|
2013-01-25 02:15:10 +08:00
|
|
|
*/
|
2006-01-08 17:02:17 +08:00
|
|
|
static int kauditd_thread(void *dummy)
|
2005-05-19 17:56:58 +08:00
|
|
|
{
|
2016-11-30 05:53:25 +08:00
|
|
|
int rc;
|
audit: fix auditd/kernel connection state tracking
What started as a rather straightforward race condition reported by
Dmitry using the syzkaller fuzzer ended up revealing some major
problems with how the audit subsystem managed its netlink sockets and
its connection with the userspace audit daemon. Fixing this properly
had quite the cascading effect and what we are left with is this rather
large and complicated patch. My initial goal was to try and decompose
this patch into multiple smaller patches, but the way these changes
are intertwined makes it difficult to split these changes into
meaningful pieces that don't break or somehow make things worse for
the intermediate states.
The patch makes a number of changes, but the most significant are
highlighted below:
* The auditd tracking variables, e.g. audit_sock, are now gone and
replaced by a RCU/spin_lock protected variable auditd_conn which is
a structure containing all of the auditd tracking information.
* We no longer track the auditd sock directly, instead we track it
via the network namespace in which it resides and we use the audit
socket associated with that namespace. In spirit, this is what the
code was trying to do prior to this patch (at least I think that is
what the original authors intended), but it was done rather poorly
and added a layer of obfuscation that only masked the underlying
problems.
* Big backlog queue cleanup, again. In v4.10 we made some pretty big
changes to how the audit backlog queues work, here we haven't changed
the queue design so much as cleaned up the implementation. Brought
about by the locking changes, we've simplified kauditd_thread() quite
a bit by consolidating the queue handling into a new helper function,
kauditd_send_queue(), which allows us to eliminate a lot of very
similar code and makes the looping logic in kauditd_thread() clearer.
* All netlink messages sent to auditd are now sent via
auditd_send_unicast_skb(). Other than just making sense, this makes
the lock handling easier.
* Change the audit_log_start() sleep behavior so that we never sleep
on auditd events (unchanged) or if the caller is holding the
audit_cmd_mutex (changed). Previously we didn't sleep if the caller
was auditd or if the message type fell between a certain range; the
type check was a poor effort of doing what the cmd_mutex check now
does. Richard Guy Briggs originally proposed not sleeping the
cmd_mutex owner several years ago but his patch wasn't acceptable
at the time. At least the idea lives on here.
* A problem with the lost record counter has been resolved. Steve
Grubb and I both happened to notice this problem and according to
some quick testing by Steve, this problem goes back quite some time.
It's largely a harmless problem, although it may have left some
careful sysadmins quite puzzled.
Cc: <stable@vger.kernel.org> # 4.10.x-
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
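The queue consolidation described in the commit message above can be illustrated with a small user-space sketch. This is not the kernel's kauditd_send_queue(); it only mirrors the idea of one helper that drains a queue with a retry limit and hands an unsendable record to an error hook. Every name here (rec_queue, drain_queue, send_ok, send_fail, hold_rec) is hypothetical, and plain integers stand in for skbs.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 16

/* Hypothetical stand-in for an skb queue: a fixed-size FIFO of record ids. */
struct rec_queue {
	int items[QUEUE_CAP];
	size_t head;	/* index of the oldest record */
	size_t len;	/* number of queued records */
};

typedef int (*send_fn)(int rec);	/* 0 on success, negative on failure */
typedef void (*err_fn)(int rec);	/* receives a record that was given up on */

static int queue_push(struct rec_queue *q, int rec)
{
	if (q->len == QUEUE_CAP)
		return -1;
	q->items[(q->head + q->len) % QUEUE_CAP] = rec;
	q->len++;
	return 0;
}

static int queue_pop(struct rec_queue *q, int *rec)
{
	if (q->len == 0)
		return -1;
	*rec = q->items[q->head];
	q->head = (q->head + 1) % QUEUE_CAP;
	q->len--;
	return 0;
}

/*
 * Drain q through send().  After retry_limit consecutive failures the
 * current record is passed to err_hook (if any) and draining stops.
 * Returns 0 if the queue was emptied, otherwise the last send() error.
 */
static int drain_queue(struct rec_queue *q, unsigned int retry_limit,
		       send_fn send, err_fn err_hook)
{
	int rec, rc;
	unsigned int failed = 0;

	assert(q && send);
	while (queue_pop(q, &rec) == 0) {
		rc = send(rec);
		if (rc == 0) {
			failed = 0;
			continue;
		}
		if (++failed >= retry_limit) {
			if (err_hook)
				err_hook(rec);
			return rc;
		}
		/* put the record back at the front and try again */
		q->head = (q->head + QUEUE_CAP - 1) % QUEUE_CAP;
		q->items[q->head] = rec;
		q->len++;
	}
	return 0;
}

/* Illustrative senders and error hook for exercising drain_queue(). */
static int sent_count;
static int fail_attempts;
static int held_rec = -1;

static int send_ok(int rec) { (void)rec; sent_count++; return 0; }
static int send_fail(int rec) { (void)rec; fail_attempts++; return -1; }
static void hold_rec(int rec) { held_rec = rec; }
```

With one helper of this shape, a worker loop can treat several queues uniformly and vary only the error hook per queue, which is the kind of simplification the commit message attributes to the new helper.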
2017-03-21 23:26:35 +08:00
|
|
|
u32 portid = 0;
|
|
|
|
struct net *net = NULL;
|
|
|
|
struct sock *sk = NULL;
|
2017-05-02 22:16:05 +08:00
|
|
|
struct auditd_connection *ac;
|
2016-11-30 05:53:24 +08:00
|
|
|
|
2016-11-30 05:53:25 +08:00
|
|
|
#define UNICAST_RETRIES 5
|
|
|
|
|
2007-07-17 19:03:35 +08:00
|
|
|
set_freezable();
|
2006-10-06 15:43:48 +08:00
|
|
|
while (!kthread_should_stop()) {
|
2017-03-21 23:26:35 +08:00
|
|
|
/* NOTE: see the lock comments in auditd_send_unicast_skb() */
|
|
|
|
rcu_read_lock();
|
2017-05-02 22:16:05 +08:00
|
|
|
ac = rcu_dereference(auditd_conn);
|
|
|
|
if (!ac) {
|
2017-03-21 23:26:35 +08:00
|
|
|
rcu_read_unlock();
|
|
|
|
goto main_queue;
|
|
|
|
}
|
2017-05-02 22:16:05 +08:00
|
|
|
net = get_net(ac->net);
|
2017-03-21 23:26:35 +08:00
|
|
|
sk = audit_get_sk(net);
|
2017-05-02 22:16:05 +08:00
|
|
|
portid = ac->portid;
|
2017-03-21 23:26:35 +08:00
|
|
|
rcu_read_unlock();
|
2016-11-30 05:53:25 +08:00
|
|
|
|
|
|
|
/* attempt to flush the hold queue */
|
2017-03-21 23:26:35 +08:00
|
|
|
rc = kauditd_send_queue(sk, portid,
|
|
|
|
&audit_hold_queue, UNICAST_RETRIES,
|
|
|
|
NULL, kauditd_rehold_skb);
|
2017-06-12 21:35:24 +08:00
|
|
|
if (ac && rc < 0) {
|
2017-03-21 23:26:35 +08:00
|
|
|
sk = NULL;
|
2017-06-12 21:35:24 +08:00
|
|
|
auditd_reset(ac);
|
2017-03-21 23:26:35 +08:00
|
|
|
goto main_queue;
|
2016-11-30 05:53:25 +08:00
|
|
|
}
|
2008-04-18 22:02:28 +08:00
|
|
|
|
2016-11-30 05:53:25 +08:00
|
|
|
/* attempt to flush the retry queue */
|
2017-03-21 23:26:35 +08:00
|
|
|
rc = kauditd_send_queue(sk, portid,
|
|
|
|
&audit_retry_queue, UNICAST_RETRIES,
|
|
|
|
NULL, kauditd_hold_skb);
|
2017-06-12 21:35:24 +08:00
|
|
|
if (ac && rc < 0) {
|
2017-03-21 23:26:35 +08:00
|
|
|
sk = NULL;
|
2017-06-12 21:35:24 +08:00
|
|
|
auditd_reset(ac);
|
audit: fix auditd/kernel connection state tracking
What started as a rather straightforward race condition reported by
Dmitry using the syzkaller fuzzer ended up revealing some major
problems with how the audit subsystem managed its netlink sockets and
its connection with the userspace audit daemon. Fixing this properly
had quite the cascading effect and what we are left with is this rather
large and complicated patch. My initial goal was to try and decompose
this patch into multiple smaller patches, but the way these changes
are intertwined makes it difficult to split these changes into
meaningful pieces that don't break or somehow make things worse for
the intermediate states.
The patch makes a number of changes, but the most significant are
highlighted below:
* The auditd tracking variables, e.g. audit_sock, are now gone and
replaced by a RCU/spin_lock protected variable auditd_conn which is
a structure containing all of the auditd tracking information.
* We no longer track the auditd sock directly, instead we track it
via the network namespace in which it resides and we use the audit
socket associated with that namespace. In spirit, this is what the
code was trying to do prior to this patch (at least I think that is
what the original authors intended), but it was done rather poorly
and added a layer of obfuscation that only masked the underlying
problems.
* Big backlog queue cleanup, again. In v4.10 we made some pretty big
changes to how the audit backlog queues work, here we haven't changed
the queue design so much as cleaned up the implementation. Brought
about by the locking changes, we've simplified kauditd_thread() quite
a bit by consolidating the queue handling into a new helper function,
kauditd_send_queue(), which allows us to eliminate a lot of very
similar code and makes the looping logic in kauditd_thread() clearer.
* All netlink messages sent to auditd are now sent via
auditd_send_unicast_skb(). Other than just making sense, this makes
the lock handling easier.
* Change the audit_log_start() sleep behavior so that we never sleep
on auditd events (unchanged) or if the caller is holding the
audit_cmd_mutex (changed). Previously we didn't sleep if the caller
was auditd or if the message type fell between a certain range; the
type check was a poor effort of doing what the cmd_mutex check now
does. Richard Guy Briggs originally proposed not sleeping the
cmd_mutex owner several years ago but his patch wasn't acceptable
at the time. At least the idea lives on here.
* A problem with the lost record counter has been resolved. Steve
Grubb and I both happened to notice this problem and according to
some quick testing by Steve, this problem goes back quite some time.
It's largely a harmless problem, although it may have left some
careful sysadmins quite puzzled.
Cc: <stable@vger.kernel.org> # 4.10.x-
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-21 23:26:35 +08:00
|
|
|
goto main_queue;
|
2016-11-30 05:53:25 +08:00
|
|
|
}
|
2013-09-16 23:11:12 +08:00
|
|
|
|
2017-03-21 23:26:35 +08:00
|
|
|
main_queue:
|
|
|
|
/* process the main queue - do the multicast send and attempt
|
|
|
|
* unicast, dump failed record sends to the retry queue; if
|
|
|
|
* sk == NULL due to previous failures we will just do the
|
2017-06-12 21:35:24 +08:00
|
|
|
* multicast send and move the record to the hold queue */
|
2017-04-10 23:16:59 +08:00
|
|
|
rc = kauditd_send_queue(sk, portid, &audit_queue, 1,
|
|
|
|
kauditd_send_multicast_skb,
|
2017-06-12 21:35:24 +08:00
|
|
|
(sk ?
|
|
|
|
kauditd_retry_skb : kauditd_hold_skb));
|
|
|
|
if (ac && rc < 0)
|
|
|
|
auditd_reset(ac);
|
2017-04-10 23:16:59 +08:00
|
|
|
sk = NULL;
|
2017-03-21 23:26:35 +08:00
|
|
|
|
|
|
|
/* drop our netns reference, no auditd sends past this line */
|
|
|
|
if (net) {
|
|
|
|
put_net(net);
|
|
|
|
net = NULL;
|
2013-01-25 02:15:11 +08:00
|
|
|
}
|
2017-03-21 23:26:35 +08:00
|
|
|
|
|
|
|
/* we have processed all the queues so wake everyone */
|
|
|
|
wake_up(&audit_backlog_wait);
|
|
|
|
|
|
|
|
/* NOTE: we want to wake up if there is anything on the queue,
|
|
|
|
* regardless of if an auditd is connected, as we need to
|
|
|
|
* do the multicast send and rotate records from the
|
|
|
|
* main queue to the retry/hold queues */
|
|
|
|
wait_event_freezable(kauditd_wait,
|
|
|
|
(skb_queue_len(&audit_queue) ? 1 : 0));
|
2005-05-19 17:56:58 +08:00
|
|
|
}
|
2016-11-30 05:53:25 +08:00
|
|
|
|
2006-10-06 15:43:48 +08:00
|
|
|
return 0;
|
2005-05-19 17:56:58 +08:00
|
|
|
}
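The main-queue comment above describes a drain loop: every record gets the multicast send, a unicast send is attempted when a destination is available, and records that cannot be delivered are handed to a fallback (retry or hold). The following is a minimal user-space sketch of that shape only. It is not the kernel's kauditd_send_queue(), whose body is not shown here. Every `sketch_` name, the fixed-size integer queue, and the "even records succeed" sender are assumptions made for illustration.

```c
#include <stddef.h>

#define SKETCH_QLEN 8

/* Hypothetical fixed-size FIFO of record ids, standing in for an skb queue. */
struct sketch_queue {
	int items[SKETCH_QLEN];
	size_t head, count;
};

static int sketch_push(struct sketch_queue *q, int rec)
{
	if (q->count == SKETCH_QLEN)
		return -1;
	q->items[(q->head + q->count) % SKETCH_QLEN] = rec;
	q->count++;
	return 0;
}

static int sketch_pop(struct sketch_queue *q, int *rec)
{
	if (!q->count)
		return -1;
	*rec = q->items[q->head];
	q->head = (q->head + 1) % SKETCH_QLEN;
	q->count--;
	return 0;
}

/*
 * Drain src. Each record is first passed to the optional pre hook (the
 * "multicast" step). If a destination is available the send is attempted;
 * a record that is not delivered goes to the optional fallback hook.
 * Returns 0 unless at least one attempted send failed, in which case -1.
 */
static int sketch_drain(struct sketch_queue *src, int have_dest,
			int (*send)(int rec),
			void (*pre_hook)(int rec),
			void (*err_hook)(int rec))
{
	int rec, rc = 0;

	while (sketch_pop(src, &rec) == 0) {
		if (pre_hook)
			pre_hook(rec);
		if (have_dest) {
			if (send(rec) == 0)
				continue;	/* delivered */
			rc = -1;		/* attempted and failed */
		}
		if (err_hook)
			err_hook(rec);
	}
	return rc;
}

/* Illustrative hooks: count every record, park undelivered ones. */
static struct sketch_queue sketch_hold;	/* records parked for later */
static int sketch_seen;			/* records seen by the pre hook */

static void sketch_count(int rec) { (void)rec; sketch_seen++; }
static void sketch_park(int rec) { sketch_push(&sketch_hold, rec); }

/* Illustrative sender: even record ids succeed, odd ones fail. */
static int sketch_send_even(int rec) { return (rec % 2 == 0) ? 0 : -1; }
```

Passing the fallback as a function pointer is what lets one loop serve several queues: the caller picks the hook, much as the call above selects between kauditd_retry_skb and kauditd_hold_skb depending on whether sk is set.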
|
|
|
|
|
2006-05-22 13:09:24 +08:00
|
|
|
int audit_send_list(void *_dest)
|
|
|
|
{
|
|
|
|
struct audit_netlink_list *dest = _dest;
|
|
|
|
struct sk_buff *skb;
|
2017-03-21 23:26:35 +08:00
|
|
|
struct sock *sk = audit_get_sk(dest->net);
|
2006-05-22 13:09:24 +08:00
|
|
|
|
|
|
|
/* wait for parent to finish and send an ACK */
|
2018-02-20 22:52:38 +08:00
|
|
|
audit_ctl_lock();
|
|
|
|
audit_ctl_unlock();
|
2006-05-22 13:09:24 +08:00
|
|
|
|
|
|
|
while ((skb = __skb_dequeue(&dest->q)) != NULL)
|
2017-03-21 23:26:35 +08:00
|
|
|
netlink_unicast(sk, skb, dest->portid, 0);
|
2006-05-22 13:09:24 +08:00
|
|
|
|
2017-03-21 23:26:35 +08:00
|
|
|
put_net(dest->net);
|
2006-05-22 13:09:24 +08:00
|
|
|
kfree(dest);
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2017-05-02 22:16:05 +08:00
|
|
|
struct sk_buff *audit_make_reply(int seq, int type, int done,
|
2010-10-21 08:23:50 +08:00
|
|
|
int multi, const void *payload, int size)
|
2006-05-22 13:09:24 +08:00
|
|
|
{
|
|
|
|
struct sk_buff *skb;
|
|
|
|
struct nlmsghdr *nlh;
|
|
|
|
void *data;
|
|
|
|
int flags = multi ? NLM_F_MULTI : 0;
|
|
|
|
int t = done ? NLMSG_DONE : type;
|
|
|
|
|
2009-06-12 02:31:35 +08:00
|
|
|
skb = nlmsg_new(size, GFP_KERNEL);
|
2006-05-22 13:09:24 +08:00
|
|
|
if (!skb)
|
|
|
|
return NULL;
|
|
|
|
|
2017-05-02 22:16:05 +08:00
|
|
|
nlh = nlmsg_put(skb, 0, seq, t, size, flags);
|
2012-06-27 12:45:21 +08:00
|
|
|
if (!nlh)
|
|
|
|
goto out_kfree_skb;
|
|
|
|
data = nlmsg_data(nlh);
|
2006-05-22 13:09:24 +08:00
|
|
|
memcpy(data, payload, size);
|
|
|
|
return skb;
|
|
|
|
|
2012-06-27 12:45:21 +08:00
|
|
|
out_kfree_skb:
|
|
|
|
kfree_skb(skb);
|
2006-05-22 13:09:24 +08:00
|
|
|
return NULL;
|
|
|
|
}
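audit_make_reply() above follows a common build-a-message pattern: derive the flags and effective type from `multi` and `done`, allocate room for header plus payload, fill in the header, then copy the payload in. The following user-space sketch mirrors that sequence with plain malloc(). It is an illustration under stated assumptions, not the kernel API: `struct sketch_hdr`, `SKETCH_F_MULTI`, `SKETCH_DONE` and `sketch_make_reply()` are hypothetical stand-ins, and the real netlink header alignment and skb handling are deliberately left out.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the multi-part flag and the "done" type. */
#define SKETCH_F_MULTI 0x2
#define SKETCH_DONE    0x3

/* Hypothetical header, loosely modelled on a netlink message header. */
struct sketch_hdr {
	uint32_t len;	/* header plus payload, in bytes */
	uint16_t type;
	uint16_t flags;
	uint32_t seq;
};

/*
 * Build one reply in a freshly allocated buffer. The payload is stored
 * directly after the header. Returns NULL on allocation failure or if the
 * total length would not fit in the 32-bit length field. The caller frees
 * the result with free().
 */
static struct sketch_hdr *sketch_make_reply(uint32_t seq, uint16_t type,
					    int done, int multi,
					    const void *payload, size_t size)
{
	struct sketch_hdr *hdr;

	if (size > UINT32_MAX - sizeof(*hdr))
		return NULL;

	hdr = malloc(sizeof(*hdr) + size);
	if (!hdr)
		return NULL;

	hdr->len = (uint32_t)(sizeof(*hdr) + size);
	hdr->type = done ? SKETCH_DONE : type;	/* "done" overrides the type */
	hdr->flags = multi ? SKETCH_F_MULTI : 0;
	hdr->seq = seq;
	if (size)
		memcpy(hdr + 1, payload, size);
	return hdr;
}
```

As in the function above, the sketch has a single owner for the buffer at every point: either the caller receives a fully built message or nothing is left allocated.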
|
|
|
|
|
2008-04-18 22:11:04 +08:00
|
|
|
static int audit_send_reply_thread(void *arg)
|
|
|
|
{
|
|
|
|
struct audit_reply *reply = (struct audit_reply *)arg;
|
2017-03-21 23:26:35 +08:00
|
|
|
struct sock *sk = audit_get_sk(reply->net);
|
2008-04-18 22:11:04 +08:00
|
|
|
|
2018-02-20 22:52:38 +08:00
|
|
|
audit_ctl_lock();
|
|
|
|
audit_ctl_unlock();
|
2008-04-18 22:11:04 +08:00
|
|
|
|
|
|
|
/* Ignore failure. It'll only happen if the sender goes away,
|
|
|
|
because our timeout is set to infinite. */
|
2017-03-21 23:26:35 +08:00
|
|
|
netlink_unicast(sk, reply->skb, reply->portid, 0);
|
|
|
|
put_net(reply->net);
|
2008-04-18 22:11:04 +08:00
|
|
|
kfree(reply);
|
|
|
|
return 0;
|
|
|
|
}
|
2016-11-30 05:53:25 +08:00
|
|
|
|
2005-09-14 03:47:11 +08:00
|
|
|
/**
|
|
|
|
* audit_send_reply - send an audit reply message via netlink
|
2014-03-09 07:31:54 +08:00
|
|
|
* @request_skb: skb of request we are replying to (used to target the reply)
|
2005-09-14 03:47:11 +08:00
|
|
|
* @seq: sequence number
|
|
|
|
* @type: audit message type
|
|
|
|
* @done: done (last) flag
|
|
|
|
* @multi: multi-part message flag
|
|
|
|
* @payload: payload data
|
|
|
|
* @size: payload size
|
|
|
|
*
|
2013-08-14 23:32:45 +08:00
|
|
|
* Allocates an skb, builds the netlink message, and sends it to the port id.
|
2005-09-14 03:47:11 +08:00
|
|
|
* No failure notifications.
|
|
|
|
*/
|
2014-03-01 11:44:55 +08:00
|
|
|
static void audit_send_reply(struct sk_buff *request_skb, int seq, int type, int done,
|
2013-08-14 23:32:45 +08:00
|
|
|
int multi, const void *payload, int size)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2014-03-01 11:44:55 +08:00
|
|
|
struct net *net = sock_net(NETLINK_CB(request_skb).sk);
|
2008-04-18 22:11:04 +08:00
|
|
|
struct sk_buff *skb;
|
|
|
|
struct task_struct *tsk;
|
|
|
|
struct audit_reply *reply = kmalloc(sizeof(struct audit_reply),
|
|
|
|
GFP_KERNEL);
|
|
|
|
|
|
|
|
if (!reply)
|
|
|
|
return;
|
|
|
|
|
2017-05-02 22:16:05 +08:00
|
|
|
skb = audit_make_reply(seq, type, done, multi, payload, size);
|
2005-04-17 06:20:36 +08:00
|
|
|
if (!skb)
|
2008-05-15 07:11:48 +08:00
|
|
|
goto out;
|
2008-04-18 22:11:04 +08:00
|
|
|
|
2014-03-01 11:44:55 +08:00
|
|
|
reply->net = get_net(net);
|
2017-05-02 22:16:05 +08:00
|
|
|
reply->portid = NETLINK_CB(request_skb).portid;
|
2008-04-18 22:11:04 +08:00
|
|
|
reply->skb = skb;
|
|
|
|
|
|
|
|
tsk = kthread_run(audit_send_reply_thread, reply, "audit_send_reply");
|
2008-05-15 07:11:48 +08:00
|
|
|
if (!IS_ERR(tsk))
|
|
|
|
return;
|
|
|
|
kfree_skb(skb);
|
|
|
|
out:
|
|
|
|
kfree(reply);
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Check for appropriate CAP_AUDIT_ capabilities on incoming audit
|
|
|
|
* control messages.
|
|
|
|
*/
|
2006-06-28 04:26:11 +08:00
|
|
|
static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
|
|
|
int err = 0;
|
|
|
|
|
2013-08-16 12:04:46 +08:00
|
|
|
/* Only support initial user namespace for now. */
|
2014-03-31 07:07:54 +08:00
|
|
|
/*
|
|
|
|
* We return ECONNREFUSED because it tricks userspace into thinking
|
|
|
|
* that audit was not configured into the kernel. Lots of users
|
|
|
|
* configure their PAM stack (because that's what the distro does)
|
|
|
|
* to reject login if unable to send messages to audit. If we return
|
|
|
|
* ECONNREFUSED the PAM stack thinks the kernel does not have audit
|
|
|
|
* configured in and will let login proceed. If we return EPERM
|
|
|
|
* userspace will reject all logins. This should be removed when we
|
|
|
|
* support non init namespaces!!
|
|
|
|
*/
|
2014-04-13 03:38:53 +08:00
|
|
|
if (current_user_ns() != &init_user_ns)
|
2014-03-31 07:07:54 +08:00
|
|
|
return -ECONNREFUSED;
|
2012-09-11 14:20:20 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
switch (msg_type) {
|
|
|
|
case AUDIT_LIST:
|
|
|
|
case AUDIT_ADD:
|
|
|
|
case AUDIT_DEL:
|
2013-04-19 07:16:36 +08:00
|
|
|
return -EOPNOTSUPP;
|
|
|
|
case AUDIT_GET:
|
|
|
|
case AUDIT_SET:
|
2013-05-23 00:54:49 +08:00
|
|
|
case AUDIT_GET_FEATURE:
|
|
|
|
case AUDIT_SET_FEATURE:
|
2013-04-19 07:16:36 +08:00
|
|
|
case AUDIT_LIST_RULES:
|
|
|
|
case AUDIT_ADD_RULE:
|
2006-02-08 01:05:27 +08:00
|
|
|
case AUDIT_DEL_RULE:
|
2005-05-06 19:38:39 +08:00
|
|
|
case AUDIT_SIGNAL_INFO:
|
Audit: add TTY input auditing
Add TTY input auditing, used to audit system administrator's actions. This is
required by various security standards such as DCID 6/3 and PCI to provide
non-repudiation of administrator's actions and to allow a review of past
actions if the administrator seems to overstep their duties or if the system
becomes misconfigured for unknown reasons. These requirements do not make it
necessary to audit TTY output as well.
Compared to a user-space keylogger, this approach records TTY input using the
audit subsystem, correlated with other audit events, and it is completely
transparent to the user-space application (e.g. the console ioctls still
work).
TTY input auditing works on a higher level than auditing all system calls
within the session, which would produce an overwhelming amount of mostly
useless audit events.
Add an "audit_tty" attribute, inherited across fork(). Data read from TTYs
by a process with the attribute is sent to the audit subsystem by the kernel.
The audit netlink interface is extended to allow modifying the audit_tty
attribute, and to allow sending explanatory audit events from user-space (for
example, a shell might send an event containing the final command, after the
interactive command-line editing and history expansion is performed, which
might be difficult to decipher from the TTY input alone).
Because the "audit_tty" attribute is inherited across fork(), it would be set
e.g. for sshd restarted within an audited session. To prevent this, the
audit_tty attribute is cleared when a process with no open TTY file
descriptors (e.g. after daemon startup) opens a TTY.
See https://www.redhat.com/archives/linux-audit/2007-June/msg00000.html for a
more detailed rationale document for an older version of this patch.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Miloslav Trmac <mitr@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 14:40:56 +08:00
|
|
|
case AUDIT_TTY_GET:
|
|
|
|
case AUDIT_TTY_SET:
|
[PATCH] audit: watching subtrees
New kind of audit rule predicates: "object is visible in given subtree".
The part that can be sanely implemented, that is. Limitations:
* if you have hardlink from outside of tree, you'd better watch
it too (or just watch the object itself, obviously)
* if you mount something under a watched tree, tell audit
that new chunk should be added to watched subtrees
* if you umount something in a watched tree and it's still mounted
elsewhere, you will get matches on events happening there. New command
tells audit to recalculate the trees, trimming such sources of false
positives.
Note that it's _not_ about path - if something mounted in several places
(multiple mount, bindings, different namespaces, etc.), the match does
_not_ depend on which one we are using for access.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-07-22 20:04:18 +08:00
|
|
|
case AUDIT_TRIM:
|
|
|
|
case AUDIT_MAKE_EQUIV:
|
2013-08-16 12:04:46 +08:00
|
|
|
/* Only support auditd and auditctl in initial pid namespace
|
|
|
|
* for now. */
|
2015-02-24 04:38:00 +08:00
|
|
|
if (task_active_pid_ns(current) != &init_pid_ns)
|
2013-08-16 12:04:46 +08:00
|
|
|
return -EPERM;
|
|
|
|
|
2014-04-24 05:29:27 +08:00
|
|
|
if (!netlink_capable(skb, CAP_AUDIT_CONTROL))
|
2005-04-17 06:20:36 +08:00
|
|
|
err = -EPERM;
|
|
|
|
break;
|
2005-05-21 07:18:37 +08:00
|
|
|
case AUDIT_USER:
|
2007-05-08 15:29:20 +08:00
|
|
|
case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
|
|
|
|
case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
|
2014-04-24 05:29:27 +08:00
|
|
|
if (!netlink_capable(skb, CAP_AUDIT_WRITE))
|
2005-04-17 06:20:36 +08:00
|
|
|
err = -EPERM;
|
|
|
|
break;
|
|
|
|
default: /* bad msg */
|
|
|
|
err = -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
return err;
|
|
|
|
}
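As a rough illustration of the dispatch performed by audit_netlink_ok() above, the userspace sketch below maps a message type to the capability class it would need. The enum names, the required_cap() helper, and the numeric values are assumptions made for this example only (they stand in for the kernel's definitions), and the user/pid namespace checks are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's message type constants. */
enum sketch_msg {
	SK_GET = 1000,
	SK_SET = 1001,
	SK_LIST = 1002,			/* legacy rule interface */
	SK_USER = 1005,
	SK_FIRST_USER_MSG = 1100,
	SK_LAST_USER_MSG = 1199,
};

/* Hypothetical capability labels used only by this sketch. */
enum sketch_cap { SK_CAP_NONE, SK_CAP_CONTROL, SK_CAP_WRITE };

/*
 * Returns 0 and stores the needed capability in *cap, or a negative
 * errno value when the type is unsupported or unknown.
 */
static int required_cap(int msg_type, enum sketch_cap *cap)
{
	assert(cap != NULL);
	*cap = SK_CAP_NONE;

	switch (msg_type) {
	case SK_LIST:
		return -EOPNOTSUPP;	/* legacy request, no longer served */
	case SK_GET:
	case SK_SET:
		*cap = SK_CAP_CONTROL;	/* configuration needs control rights */
		return 0;
	case SK_USER:
		*cap = SK_CAP_WRITE;	/* user-originated record */
		return 0;
	default:
		if (msg_type >= SK_FIRST_USER_MSG &&
		    msg_type <= SK_LAST_USER_MSG) {
			*cap = SK_CAP_WRITE;
			return 0;
		}
		return -EINVAL;		/* bad message type */
	}
}
```

A caller would first ask required_cap() which class applies and then check that the sender actually holds it, which is the role netlink_capable() plays in the kernel code.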
|
|
|
|
|
2015-11-04 21:23:52 +08:00
|
|
|
static void audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type)
|
2008-01-08 07:14:19 +08:00
|
|
|
{
|
2013-04-20 01:23:09 +08:00
|
|
|
uid_t uid = from_kuid(&init_user_ns, current_uid());
|
2013-12-12 02:52:26 +08:00
|
|
|
pid_t pid = task_tgid_nr(current);
|
2008-01-08 07:14:19 +08:00
|
|
|
|
2013-07-26 09:02:55 +08:00
|
|
|
if (!audit_enabled && msg_type != AUDIT_USER_AVC) {
|
2008-01-08 07:14:19 +08:00
|
|
|
*ab = NULL;
|
2015-11-04 21:23:52 +08:00
|
|
|
return;
|
2008-01-08 07:14:19 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
*ab = audit_log_start(NULL, GFP_KERNEL, msg_type);
|
2013-01-12 06:32:07 +08:00
|
|
|
if (unlikely(!*ab))
|
2015-11-04 21:23:52 +08:00
|
|
|
return;
|
2013-12-12 02:52:26 +08:00
|
|
|
audit_log_format(*ab, "pid=%d uid=%u", pid, uid);
|
2013-04-30 21:53:34 +08:00
|
|
|
audit_log_session_info(*ab);
|
2013-04-20 03:00:33 +08:00
|
|
|
audit_log_task_context(*ab);
|
2008-01-08 07:14:19 +08:00
|
|
|
}
|
|
|
|
|
2013-05-23 00:54:49 +08:00
|
|
|
int is_audit_feature_set(int i)
|
|
|
|
{
|
|
|
|
return af.features & AUDIT_FEATURE_TO_MASK(i);
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
static int audit_get_feature(struct sk_buff *skb)
|
|
|
|
{
|
|
|
|
u32 seq;
|
|
|
|
|
|
|
|
seq = nlmsg_hdr(skb)->nlmsg_seq;
|
|
|
|
|
2014-08-25 08:37:52 +08:00
|
|
|
audit_send_reply(skb, seq, AUDIT_GET_FEATURE, 0, 0, &af, sizeof(af));
|
2013-05-23 00:54:49 +08:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void audit_log_feature_change(int which, u32 old_feature, u32 new_feature,
|
|
|
|
u32 old_lock, u32 new_lock, int res)
|
|
|
|
{
|
|
|
|
struct audit_buffer *ab;
|
|
|
|
|
2013-11-01 19:34:43 +08:00
|
|
|
if (audit_enabled == AUDIT_OFF)
|
|
|
|
return;
|
|
|
|
|
2013-05-23 00:54:49 +08:00
|
|
|
ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_FEATURE_CHANGE);
|
2018-02-21 17:30:07 +08:00
|
|
|
if (!ab)
|
|
|
|
return;
|
2014-01-08 02:08:41 +08:00
|
|
|
audit_log_task_info(ab, current);
|
2014-10-30 23:22:53 +08:00
|
|
|
audit_log_format(ab, " feature=%s old=%u new=%u old_lock=%u new_lock=%u res=%d",
|
2013-05-23 00:54:49 +08:00
|
|
|
audit_feature_names[which], !!old_feature, !!new_feature,
|
|
|
|
!!old_lock, !!new_lock, res);
|
|
|
|
audit_log_end(ab);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int audit_set_feature(struct sk_buff *skb)
|
|
|
|
{
|
|
|
|
struct audit_features *uaf;
|
|
|
|
int i;
|
|
|
|
|
2014-06-04 04:05:10 +08:00
|
|
|
BUILD_BUG_ON(AUDIT_LAST_FEATURE + 1 > ARRAY_SIZE(audit_feature_names));
|
2013-05-23 00:54:49 +08:00
|
|
|
uaf = nlmsg_data(nlmsg_hdr(skb));
|
|
|
|
|
|
|
|
/* if there is ever a version 2 we should handle that here */
|
|
|
|
|
|
|
|
for (i = 0; i <= AUDIT_LAST_FEATURE; i++) {
|
|
|
|
u32 feature = AUDIT_FEATURE_TO_MASK(i);
|
|
|
|
u32 old_feature, new_feature, old_lock, new_lock;
|
|
|
|
|
|
|
|
/* if we are not changing this feature, move along */
|
|
|
|
if (!(feature & uaf->mask))
|
|
|
|
continue;
|
|
|
|
|
|
|
|
old_feature = af.features & feature;
|
|
|
|
new_feature = uaf->features & feature;
|
|
|
|
new_lock = (uaf->lock | af.lock) & feature;
|
|
|
|
old_lock = af.lock & feature;
|
|
|
|
|
|
|
|
/* are we changing a locked feature? */
|
2013-11-01 19:34:44 +08:00
|
|
|
if (old_lock && (new_feature != old_feature)) {
|
2013-05-23 00:54:49 +08:00
|
|
|
audit_log_feature_change(i, old_feature, new_feature,
|
|
|
|
old_lock, new_lock, 0);
|
|
|
|
return -EPERM;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
/* nothing invalid, do the changes */
|
|
|
|
for (i = 0; i <= AUDIT_LAST_FEATURE; i++) {
|
|
|
|
u32 feature = AUDIT_FEATURE_TO_MASK(i);
|
|
|
|
u32 old_feature, new_feature, old_lock, new_lock;
|
|
|
|
|
|
|
|
/* if we are not changing this feature, move along */
|
|
|
|
if (!(feature & uaf->mask))
|
|
|
|
continue;
|
|
|
|
|
|
|
|
old_feature = af.features & feature;
|
|
|
|
new_feature = uaf->features & feature;
|
|
|
|
old_lock = af.lock & feature;
|
|
|
|
new_lock = (uaf->lock | af.lock) & feature;
|
|
|
|
|
|
|
|
if (new_feature != old_feature)
|
|
|
|
audit_log_feature_change(i, old_feature, new_feature,
|
|
|
|
old_lock, new_lock, 1);
|
|
|
|
|
|
|
|
if (new_feature)
|
|
|
|
af.features |= feature;
|
|
|
|
else
|
|
|
|
af.features &= ~feature;
|
|
|
|
af.lock |= new_lock;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
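The validate-then-apply structure of audit_set_feature() above can be sketched in userspace as follows. This is a minimal sketch, not the kernel implementation: the sk_ names, the two-feature limit, and the struct layout are assumptions for illustration, and the audit logging calls are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SK_LAST_FEATURE 1	/* hypothetical: two features, bits 0 and 1 */
#define SK_FEATURE_TO_MASK(i) (UINT32_C(1) << (i))

/* Mirrors the fields the code above reads: mask, features, lock. */
struct sk_features {
	uint32_t mask;		/* which features the request touches */
	uint32_t features;	/* requested on/off state */
	uint32_t lock;		/* features to lock from now on */
};

/* Apply req to cur; returns 0, or -EPERM if a locked bit would change. */
static int sk_set_features(struct sk_features *cur,
			   const struct sk_features *req)
{
	int i;

	assert(cur && req);

	/* Pass 1: validate everything before touching any state. */
	for (i = 0; i <= SK_LAST_FEATURE; i++) {
		uint32_t bit = SK_FEATURE_TO_MASK(i);

		if (!(bit & req->mask))
			continue;
		if ((cur->lock & bit) &&
		    ((req->features & bit) != (cur->features & bit)))
			return -EPERM;
	}

	/* Pass 2: nothing invalid, apply the changes. */
	for (i = 0; i <= SK_LAST_FEATURE; i++) {
		uint32_t bit = SK_FEATURE_TO_MASK(i);

		if (!(bit & req->mask))
			continue;
		if (req->features & bit)
			cur->features |= bit;
		else
			cur->features &= ~bit;
		cur->lock |= (req->lock | cur->lock) & bit;
	}
	return 0;
}
```

Running the check as a separate first pass means a request that touches one locked feature is rejected as a whole, so no partial update is left behind.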
|
|
|
|
|
2017-05-02 22:16:05 +08:00
|
|
|
static int audit_replace(struct pid *pid)
|
2016-01-26 07:04:15 +08:00
|
|
|
{
|
2017-05-02 22:16:05 +08:00
|
|
|
pid_t pvnr;
|
audit: fix auditd/kernel connection state tracking
What started as a rather straightforward race condition reported by
Dmitry using the syzkaller fuzzer ended up revealing some major
problems with how the audit subsystem managed its netlink sockets and
its connection with the userspace audit daemon. Fixing this properly
had quite the cascading effect and what we are left with is this rather
large and complicated patch. My initial goal was to try and decompose
this patch into multiple smaller patches, but the way these changes
are intertwined makes it difficult to split these changes into
meaningful pieces that don't break or somehow make things worse for
the intermediate states.
The patch makes a number of changes, but the most significant are
highlighted below:
* The auditd tracking variables, e.g. audit_sock, are now gone and
replaced by a RCU/spin_lock protected variable auditd_conn which is
a structure containing all of the auditd tracking information.
* We no longer track the auditd sock directly, instead we track it
via the network namespace in which it resides and we use the audit
socket associated with that namespace. In spirit, this is what the
code was trying to do prior to this patch (at least I think that is
what the original authors intended), but it was done rather poorly
and added a layer of obfuscation that only masked the underlying
problems.
* Big backlog queue cleanup, again. In v4.10 we made some pretty big
changes to how the audit backlog queues work, here we haven't changed
the queue design so much as cleaned up the implementation. Brought
about by the locking changes, we've simplified kauditd_thread() quite
a bit by consolidating the queue handling into a new helper function,
kauditd_send_queue(), which allows us to eliminate a lot of very
similar code and makes the looping logic in kauditd_thread() clearer.
* All netlink messages sent to auditd are now sent via
auditd_send_unicast_skb(). Other than just making sense, this makes
the lock handling easier.
* Change the audit_log_start() sleep behavior so that we never sleep
on auditd events (unchanged) or if the caller is holding the
audit_cmd_mutex (changed). Previously we didn't sleep if the caller
was auditd or if the message type fell between a certain range; the
type check was a poor effort of doing what the cmd_mutex check now
does. Richard Guy Briggs originally proposed not sleeping the
cmd_mutex owner several years ago but his patch wasn't acceptable
at the time. At least the idea lives on here.
* A problem with the lost record counter has been resolved. Steve
Grubb and I both happened to notice this problem and according to
some quick testing by Steve, this problem goes back quite some time.
It's largely a harmless problem, although it may have left some
careful sysadmins quite puzzled.
Cc: <stable@vger.kernel.org> # 4.10.x-
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-21 23:26:35 +08:00
|
|
|
struct sk_buff *skb;
|
2016-01-26 07:04:15 +08:00
|
|
|
|
2017-05-02 22:16:05 +08:00
|
|
|
pvnr = pid_vnr(pid);
|
|
|
|
skb = audit_make_reply(0, AUDIT_REPLACE, 0, 0, &pvnr, sizeof(pvnr));
|
2016-01-26 07:04:15 +08:00
|
|
|
if (!skb)
|
|
|
|
return -ENOMEM;
|
2017-03-21 23:26:35 +08:00
|
|
|
return auditd_send_unicast_skb(skb);
|
2016-01-26 07:04:15 +08:00
|
|
|
}
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
|
|
|
|
{
|
2013-04-20 01:23:09 +08:00
|
|
|
u32 seq;
|
2005-04-17 06:20:36 +08:00
|
|
|
void *data;
|
|
|
|
int err;
|
2005-05-14 01:17:42 +08:00
|
|
|
struct audit_buffer *ab;
|
2005-04-17 06:20:36 +08:00
|
|
|
u16 msg_type = nlh->nlmsg_type;
|
2006-05-25 22:19:47 +08:00
|
|
|
struct audit_sig_info *sig_data;
|
2008-01-08 07:14:19 +08:00
|
|
|
char *ctx = NULL;
|
2006-05-25 22:19:47 +08:00
|
|
|
u32 len;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2006-06-28 04:26:11 +08:00
|
|
|
err = audit_netlink_ok(skb, msg_type);
|
2005-04-17 06:20:36 +08:00
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
seq = nlh->nlmsg_seq;
|
2012-06-27 12:45:21 +08:00
|
|
|
data = nlmsg_data(nlh);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
switch (msg_type) {
|
2013-09-18 21:32:24 +08:00
|
|
|
case AUDIT_GET: {
|
|
|
|
struct audit_status s;
|
|
|
|
memset(&s, 0, sizeof(s));
|
|
|
|
s.enabled = audit_enabled;
|
|
|
|
s.failure = audit_failure;
|
2017-05-02 22:16:05 +08:00
|
|
|
/* NOTE: use pid_vnr() so the PID is relative to the current
|
|
|
|
* namespace */
|
2017-05-02 22:16:05 +08:00
|
|
|
s.pid = auditd_pid_vnr();
|
2013-09-18 21:32:24 +08:00
|
|
|
s.rate_limit = audit_rate_limit;
|
|
|
|
s.backlog_limit = audit_backlog_limit;
|
|
|
|
s.lost = atomic_read(&audit_lost);
|
2016-11-30 05:53:24 +08:00
|
|
|
s.backlog = skb_queue_len(&audit_queue);
|
2014-11-18 04:51:01 +08:00
|
|
|
s.feature_bitmap = AUDIT_FEATURE_BITMAP_ALL;
|
2016-11-30 05:53:25 +08:00
|
|
|
s.backlog_wait_time = audit_backlog_wait_time;
|
2014-03-01 11:44:55 +08:00
|
|
|
audit_send_reply(skb, seq, AUDIT_GET, 0, 0, &s, sizeof(s));
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
2013-09-18 21:32:24 +08:00
|
|
|
}
|
|
|
|
case AUDIT_SET: {
|
|
|
|
struct audit_status s;
|
|
|
|
memset(&s, 0, sizeof(s));
|
|
|
|
/* guard against past and future API changes */
|
|
|
|
memcpy(&s, data, min_t(size_t, sizeof(s), nlmsg_len(nlh)));
|
|
|
|
if (s.mask & AUDIT_STATUS_ENABLED) {
|
|
|
|
err = audit_set_enabled(s.enabled);
|
2008-07-31 10:11:19 +08:00
|
|
|
if (err < 0)
|
|
|
|
return err;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
2013-09-18 21:32:24 +08:00
|
|
|
if (s.mask & AUDIT_STATUS_FAILURE) {
|
|
|
|
err = audit_set_failure(s.failure);
|
2008-07-31 10:11:19 +08:00
|
|
|
if (err < 0)
|
|
|
|
return err;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
2013-09-18 21:32:24 +08:00
|
|
|
if (s.mask & AUDIT_STATUS_PID) {
|
2017-05-02 22:16:05 +08:00
|
|
|
/* NOTE: we are using the vnr PID functions below
|
|
|
|
* because the s.pid value is relative to the
|
|
|
|
* namespace of the caller; at present this
|
|
|
|
* doesn't matter much since you can really only
|
|
|
|
* run auditd from the initial pid namespace, but
|
|
|
|
* something to keep in mind if this changes */
|
|
|
|
pid_t new_pid = s.pid;
|
2017-03-21 23:26:35 +08:00
|
|
|
pid_t auditd_pid;
|
2017-05-02 22:16:05 +08:00
|
|
|
struct pid *req_pid = task_tgid(current);
|
|
|
|
|
2017-10-18 06:29:22 +08:00
|
|
|
/* Sanity check - PID values must match. Setting
|
|
|
|
* pid to 0 is how auditd ends auditing. */
|
|
|
|
if (new_pid && (new_pid != pid_vnr(req_pid)))
|
2017-05-02 22:16:05 +08:00
|
|
|
return -EINVAL;
|
2008-01-08 06:09:31 +08:00
|
|
|
|
2017-03-21 23:26:35 +08:00
|
|
|
/* test the auditd connection */
|
2017-05-02 22:16:05 +08:00
|
|
|
audit_replace(req_pid);
|
2017-03-21 23:26:35 +08:00
|
|
|
|
2017-05-02 22:16:05 +08:00
|
|
|
auditd_pid = auditd_pid_vnr();
|
2017-10-18 06:29:22 +08:00
|
|
|
if (auditd_pid) {
|
|
|
|
/* replacing a healthy auditd is not allowed */
|
|
|
|
if (new_pid) {
|
|
|
|
audit_log_config_change("audit_pid",
|
|
|
|
new_pid, auditd_pid, 0);
|
|
|
|
return -EEXIST;
|
|
|
|
}
|
|
|
|
/* only current auditd can unregister itself */
|
|
|
|
if (pid_vnr(req_pid) != auditd_pid) {
|
|
|
|
audit_log_config_change("audit_pid",
|
|
|
|
new_pid, auditd_pid, 0);
|
|
|
|
return -EACCES;
|
|
|
|
}
|
2016-01-26 07:04:15 +08:00
|
|
|
}
|
2017-03-21 23:26:35 +08:00
|
|
|
|
2016-12-13 23:03:01 +08:00
|
|
|
if (new_pid) {
|
audit: fix auditd/kernel connection state tracking
What started as a rather straightforward race condition reported by
Dmitry using the syzkaller fuzzer ended up revealing some major
problems with how the audit subsystem managed its netlink sockets and
its connection with the userspace audit daemon. Fixing this properly
had quite the cascading effect and what we are left with is this rather
large and complicated patch. My initial goal was to try and decompose
this patch into multiple smaller patches, but the way these changes
are intertwined makes it difficult to split these changes into
meaningful pieces that don't break or somehow make things worse for
the intermediate states.
The patch makes a number of changes, but the most significant are
highlighted below:
* The auditd tracking variables, e.g. audit_sock, are now gone and
replaced by a RCU/spin_lock protected variable auditd_conn which is
a structure containing all of the auditd tracking information.
* We no longer track the auditd sock directly, instead we track it
via the network namespace in which it resides and we use the audit
socket associated with that namespace. In spirit, this is what the
code was trying to do prior to this patch (at least I think that is
what the original authors intended), but it was done rather poorly
and added a layer of obfuscation that only masked the underlying
problems.
* Big backlog queue cleanup, again. In v4.10 we made some pretty big
changes to how the audit backlog queues work, here we haven't changed
the queue design so much as cleaned up the implementation. Brought
about by the locking changes, we've simplified kauditd_thread() quite
a bit by consolidating the queue handling into a new helper function,
kauditd_send_queue(), which allows us to eliminate a lot of very
similar code and makes the looping logic in kauditd_thread() clearer.
* All netlink messages sent to auditd are now sent via
auditd_send_unicast_skb(). Other than just making sense, this makes
the lock handling easier.
* Change the audit_log_start() sleep behavior so that we never sleep
on auditd events (unchanged) or if the caller is holding the
audit_cmd_mutex (changed). Previously we didn't sleep if the caller
was auditd or if the message type fell between a certain range; the
type check was a poor effort of doing what the cmd_mutex check now
does. Richard Guy Briggs originally proposed not sleeping the
cmd_mutex owner several years ago but his patch wasn't acceptable
at the time. At least the idea lives on here.
* A problem with the lost record counter has been resolved. Steve
Grubb and I both happened to notice this problem and according to
some quick testing by Steve, this problem goes back quite some time.
It's largely a harmless problem, although it may have left some
careful sysadmins quite puzzled.
Cc: <stable@vger.kernel.org> # 4.10.x-
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
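The audit_log_start() bullet in the commit message above boils down to a two-condition check. A minimal userspace sketch of that decision follows; `struct caller_state` and `may_wait_for_backlog()` are hypothetical names introduced only for illustration and are not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the caller properties the commit message names. */
struct caller_state {
	bool is_auditd;        /* the record is generated by auditd itself */
	bool holds_cmd_mutex;  /* the caller currently owns audit_cmd_mutex */
};

/* Returns true when the caller may sleep until the backlog drains. */
static bool may_wait_for_backlog(const struct caller_state *c)
{
	assert(c != NULL);
	if (c->is_auditd)
		return false;  /* never sleep on auditd events (unchanged rule) */
	if (c->holds_cmd_mutex)
		return false;  /* never sleep the cmd_mutex owner (new rule) */
	return true;
}
```

Under this reading, the older message-type range check is simply replaced by the `holds_cmd_mutex` test.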
2017-03-21 23:26:35 +08:00
|
|
|
/* register a new auditd connection */
|
2017-05-02 22:16:05 +08:00
|
|
|
err = auditd_set(req_pid,
|
|
|
|
NETLINK_CB(skb).portid,
|
|
|
|
sock_net(NETLINK_CB(skb).sk));
|
|
|
|
if (audit_enabled != AUDIT_OFF)
|
|
|
|
audit_log_config_change("audit_pid",
|
|
|
|
new_pid,
|
|
|
|
auditd_pid,
|
|
|
|
err ? 0 : 1);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
2017-03-21 23:26:35 +08:00
|
|
|
/* try to process any backlog */
|
|
|
|
wake_up_interruptible(&kauditd_wait);
|
2017-05-02 22:16:05 +08:00
|
|
|
} else {
|
|
|
|
if (audit_enabled != AUDIT_OFF)
|
|
|
|
audit_log_config_change("audit_pid",
|
|
|
|
new_pid,
|
|
|
|
auditd_pid, 1);
|
|
|
|
|
2017-03-21 23:26:35 +08:00
|
|
|
/* unregister the auditd connection */
|
2017-06-12 21:35:24 +08:00
|
|
|
auditd_reset(NULL);
|
2017-05-02 22:16:05 +08:00
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
2013-09-18 21:32:24 +08:00
|
|
|
if (s.mask & AUDIT_STATUS_RATE_LIMIT) {
|
|
|
|
err = audit_set_rate_limit(s.rate_limit);
|
2008-07-31 10:11:19 +08:00
|
|
|
if (err < 0)
|
|
|
|
return err;
|
|
|
|
}
|
2013-09-18 23:55:12 +08:00
|
|
|
if (s.mask & AUDIT_STATUS_BACKLOG_LIMIT) {
|
2013-09-18 21:32:24 +08:00
|
|
|
err = audit_set_backlog_limit(s.backlog_limit);
|
2013-09-18 23:55:12 +08:00
|
|
|
if (err < 0)
|
|
|
|
return err;
|
|
|
|
}
|
2014-01-14 05:49:28 +08:00
|
|
|
if (s.mask & AUDIT_STATUS_BACKLOG_WAIT_TIME) {
|
|
|
|
if (sizeof(s) > (size_t)nlh->nlmsg_len)
|
|
|
|
return -EINVAL;
|
2015-03-12 02:08:19 +08:00
|
|
|
if (s.backlog_wait_time > 10*AUDIT_BACKLOG_WAIT_TIME)
|
2014-01-14 05:49:28 +08:00
|
|
|
return -EINVAL;
|
|
|
|
err = audit_set_backlog_wait_time(s.backlog_wait_time);
|
|
|
|
if (err < 0)
|
|
|
|
return err;
|
2013-09-18 23:55:12 +08:00
|
|
|
}
|
2017-01-13 16:26:29 +08:00
|
|
|
if (s.mask == AUDIT_STATUS_LOST) {
|
|
|
|
u32 lost = atomic_xchg(&audit_lost, 0);
|
|
|
|
|
|
|
|
audit_log_config_change("lost", 0, lost, 1);
|
|
|
|
return lost;
|
|
|
|
}
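The AUDIT_STATUS_LOST branch above reads the lost-record counter and zeroes it in one atomic exchange, so no loss recorded between the read and the reset can go missing. A minimal userspace sketch of the same pattern with C11 atomics follows; `lost_records`, `note_lost()` and `read_and_reset_lost()` are hypothetical names, and `atomic_exchange()` stands in for the kernel's `atomic_xchg()`.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's audit_lost counter. */
static atomic_uint lost_records;

/* Record that n audit records were dropped. */
static void note_lost(unsigned int n)
{
	assert(n > 0);  /* callers only report actual losses */
	atomic_fetch_add(&lost_records, n);
}

/* Return the current count and reset it to zero in a single atomic step. */
static unsigned int read_and_reset_lost(void)
{
	return atomic_exchange(&lost_records, 0);
}
```

A separate load followed by a store of zero would leave a window in which a concurrent `note_lost()` could be overwritten; the exchange closes that window.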
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
2013-09-18 21:32:24 +08:00
|
|
|
}
|
2013-05-23 00:54:49 +08:00
|
|
|
case AUDIT_GET_FEATURE:
|
|
|
|
err = audit_get_feature(skb);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
break;
|
|
|
|
case AUDIT_SET_FEATURE:
|
|
|
|
err = audit_set_feature(skb);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
break;
|
2005-05-21 07:18:37 +08:00
|
|
|
case AUDIT_USER:
|
2007-05-08 15:29:20 +08:00
|
|
|
case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
|
|
|
|
case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
|
2005-06-22 21:56:47 +08:00
|
|
|
if (!audit_enabled && msg_type != AUDIT_USER_AVC)
|
|
|
|
return 0;
|
|
|
|
|
2016-06-25 04:35:46 +08:00
|
|
|
err = audit_filter(msg_type, AUDIT_FILTER_USER);
|
2013-11-26 10:57:51 +08:00
|
|
|
if (err == 1) { /* match or error */
|
2005-06-22 21:56:47 +08:00
|
|
|
err = 0;
|
Audit: add TTY input auditing
Add TTY input auditing, used to audit system administrator's actions. This is
required by various security standards such as DCID 6/3 and PCI to provide
non-repudiation of administrator's actions and to allow a review of past
actions if the administrator seems to overstep their duties or if the system
becomes misconfigured for unknown reasons. These requirements do not make it
necessary to audit TTY output as well.
Compared to a user-space keylogger, this approach records TTY input using the
audit subsystem, correlated with other audit events, and it is completely
transparent to the user-space application (e.g. the console ioctls still
work).
TTY input auditing works on a higher level than auditing all system calls
within the session, which would produce an overwhelming amount of mostly
useless audit events.
Add an "audit_tty" attribute, inherited across fork (). Data read from TTYs
by process with the attribute is sent to the audit subsystem by the kernel.
The audit netlink interface is extended to allow modifying the audit_tty
attribute, and to allow sending explanatory audit events from user-space (for
example, a shell might send an event containing the final command, after the
interactive command-line editing and history expansion is performed, which
might be difficult to decipher from the TTY input alone).
Because the "audit_tty" attribute is inherited across fork (), it would be set
e.g. for sshd restarted within an audited session. To prevent this, the
audit_tty attribute is cleared when a process with no open TTY file
descriptors (e.g. after daemon startup) opens a TTY.
See https://www.redhat.com/archives/linux-audit/2007-June/msg00000.html for a
more detailed rationale document for an older version of this patch.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Miloslav Trmac <mitr@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 14:40:56 +08:00
|
|
|
if (msg_type == AUDIT_USER_TTY) {
|
2016-01-10 14:55:31 +08:00
|
|
|
err = tty_audit_push();
|
2007-07-16 14:40:56 +08:00
|
|
|
if (err)
|
|
|
|
break;
|
|
|
|
}
|
2013-04-20 01:23:09 +08:00
|
|
|
audit_log_common_recv_msg(&ab, msg_type);
|
2008-01-08 07:14:19 +08:00
|
|
|
if (msg_type != AUDIT_USER_TTY)
|
2013-09-17 06:20:42 +08:00
|
|
|
audit_log_format(ab, " msg='%.*s'",
|
|
|
|
AUDIT_MESSAGE_TEXT_MAX,
|
2008-01-08 07:14:19 +08:00
|
|
|
(char *)data);
|
|
|
|
else {
|
|
|
|
int size;
|
|
|
|
|
2013-04-11 23:25:00 +08:00
|
|
|
audit_log_format(ab, " data=");
|
2008-01-08 07:14:19 +08:00
|
|
|
size = nlmsg_len(nlh);
|
2009-03-19 21:52:47 +08:00
|
|
|
if (size > 0 &&
|
|
|
|
((unsigned char *)data)[size - 1] == '\0')
|
|
|
|
size--;
|
2008-04-18 22:12:59 +08:00
|
|
|
audit_log_n_untrustedstring(ab, data, size);
|
2005-06-22 21:56:47 +08:00
|
|
|
}
|
2008-01-08 07:14:19 +08:00
|
|
|
audit_log_end(ab);
|
2005-06-20 02:35:50 +08:00
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
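For AUDIT_USER_TTY messages, the case above drops a single trailing NUL byte from the payload before logging it as an untrusted string. A minimal sketch of that size adjustment follows; `loggable_size()` is a hypothetical helper name, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return how many payload bytes should be logged: the full size, minus one
 * if the final byte is a NUL terminator. Only one trailing NUL is dropped.
 */
static size_t loggable_size(const unsigned char *data, size_t size)
{
	assert(data != NULL || size == 0);
	if (size > 0 && data[size - 1] == '\0')
		size--;
	return size;
}
```

The `size > 0` test comes first so that an empty payload never causes a read before the start of the buffer.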
|
2006-02-08 01:05:27 +08:00
|
|
|
case AUDIT_ADD_RULE:
|
|
|
|
case AUDIT_DEL_RULE:
|
|
|
|
if (nlmsg_len(nlh) < sizeof(struct audit_rule_data))
|
|
|
|
return -EINVAL;
|
2008-01-08 06:09:31 +08:00
|
|
|
if (audit_enabled == AUDIT_LOCKED) {
|
2013-04-20 01:23:09 +08:00
|
|
|
audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE);
|
|
|
|
audit_log_format(ab, " audit_enabled=%d res=0", audit_enabled);
|
2008-01-08 07:14:19 +08:00
|
|
|
audit_log_end(ab);
|
2007-01-20 03:39:55 +08:00
|
|
|
return -EPERM;
|
|
|
|
}
|
2017-05-02 22:16:05 +08:00
|
|
|
err = audit_rule_change(msg_type, seq, data, nlmsg_len(nlh));
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
2013-11-21 03:01:53 +08:00
|
|
|
case AUDIT_LIST_RULES:
|
2014-03-01 11:44:55 +08:00
|
|
|
err = audit_list_rules_send(skb, seq);
|
2013-11-21 03:01:53 +08:00
|
|
|
break;
|
[PATCH] audit: watching subtrees
New kind of audit rule predicates: "object is visible in given subtree".
The part that can be sanely implemented, that is. Limitations:
* if you have hardlink from outside of tree, you'd better watch
it too (or just watch the object itself, obviously)
* if you mount something under a watched tree, tell audit
that new chunk should be added to watched subtrees
* if you umount something in a watched tree and it's still mounted
elsewhere, you will get matches on events happening there. New command
tells audit to recalculate the trees, trimming such sources of false
positives.
Note that it's _not_ about path - if something mounted in several places
(multiple mount, bindings, different namespaces, etc.), the match does
_not_ depend on which one we are using for access.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-07-22 20:04:18 +08:00
|
|
|
case AUDIT_TRIM:
|
|
|
|
audit_trim_trees();
|
2013-04-20 01:23:09 +08:00
|
|
|
audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE);
|
2007-07-22 20:04:18 +08:00
|
|
|
audit_log_format(ab, " op=trim res=1");
|
|
|
|
audit_log_end(ab);
|
|
|
|
break;
|
|
|
|
case AUDIT_MAKE_EQUIV: {
|
|
|
|
void *bufp = data;
|
|
|
|
u32 sizes[2];
|
2008-04-27 17:39:56 +08:00
|
|
|
size_t msglen = nlmsg_len(nlh);
|
2007-07-22 20:04:18 +08:00
|
|
|
char *old, *new;
|
|
|
|
|
|
|
|
err = -EINVAL;
|
2008-04-27 17:39:56 +08:00
|
|
|
if (msglen < 2 * sizeof(u32))
|
2007-07-22 20:04:18 +08:00
|
|
|
break;
|
|
|
|
memcpy(sizes, bufp, 2 * sizeof(u32));
|
|
|
|
bufp += 2 * sizeof(u32);
|
2008-04-27 17:39:56 +08:00
|
|
|
msglen -= 2 * sizeof(u32);
|
|
|
|
old = audit_unpack_string(&bufp, &msglen, sizes[0]);
|
2007-07-22 20:04:18 +08:00
|
|
|
if (IS_ERR(old)) {
|
|
|
|
err = PTR_ERR(old);
|
|
|
|
break;
|
|
|
|
}
|
2008-04-27 17:39:56 +08:00
|
|
|
new = audit_unpack_string(&bufp, &msglen, sizes[1]);
|
2007-07-22 20:04:18 +08:00
|
|
|
if (IS_ERR(new)) {
|
|
|
|
err = PTR_ERR(new);
|
|
|
|
kfree(old);
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
/* OK, here comes... */
|
|
|
|
err = audit_tag_tree(old, new);
|
|
|
|
|
2013-04-20 01:23:09 +08:00
|
|
|
audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE);
|
2008-01-08 07:14:19 +08:00
|
|
|
|
2007-07-22 20:04:18 +08:00
|
|
|
audit_log_format(ab, " op=make_equiv old=");
|
|
|
|
audit_log_untrustedstring(ab, old);
|
|
|
|
audit_log_format(ab, " new=");
|
|
|
|
audit_log_untrustedstring(ab, new);
|
|
|
|
audit_log_format(ab, " res=%d", !err);
|
|
|
|
audit_log_end(ab);
|
|
|
|
kfree(old);
|
|
|
|
kfree(new);
|
|
|
|
break;
|
|
|
|
}
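The AUDIT_MAKE_EQUIV case above expects a payload made of two u32 lengths followed by the two path strings, and it checks the remaining length before each copy. A userspace sketch of that layout follows. `unpack_string()` and `parse_equiv()` are hypothetical analogues written for illustration: they return NULL or -1 where the kernel code uses ERR_PTR values, and they use malloc() in place of kernel allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy len bytes from the cursor into a new NUL-terminated string and
 * advance the cursor. Returns NULL if len is zero, exceeds the remaining
 * payload, or allocation fails.
 */
static char *unpack_string(const unsigned char **cursor, size_t *remain,
			   size_t len)
{
	char *str;

	assert(cursor && *cursor && remain);
	if (len == 0 || len > *remain)
		return NULL;
	str = malloc(len + 1);
	if (!str)
		return NULL;
	memcpy(str, *cursor, len);
	str[len] = '\0';
	*cursor += len;
	*remain -= len;
	return str;
}

/* Parse "u32 old_len, u32 new_len, old bytes, new bytes". 0 on success. */
static int parse_equiv(const unsigned char *buf, size_t msglen,
		       char **old_path, char **new_path)
{
	uint32_t sizes[2];

	if (msglen < sizeof(sizes))
		return -1;  /* too short to hold both lengths */
	memcpy(sizes, buf, sizeof(sizes));
	buf += sizeof(sizes);
	msglen -= sizeof(sizes);

	*old_path = unpack_string(&buf, &msglen, sizes[0]);
	if (!*old_path)
		return -1;
	*new_path = unpack_string(&buf, &msglen, sizes[1]);
	if (!*new_path) {
		free(*old_path);  /* mirror the kfree(old) on the error path */
		*old_path = NULL;
		return -1;
	}
	return 0;
}
```

Tracking the remaining length alongside the cursor is what keeps a sender-supplied size from reading past the end of the message.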
|
2005-05-06 19:38:39 +08:00
|
|
|
case AUDIT_SIGNAL_INFO:
|
2009-09-24 01:46:00 +08:00
|
|
|
len = 0;
|
|
|
|
if (audit_sig_sid) {
|
|
|
|
err = security_secid_to_secctx(audit_sig_sid, &ctx, &len);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
2006-05-25 22:19:47 +08:00
|
|
|
sig_data = kmalloc(sizeof(*sig_data) + len, GFP_KERNEL);
|
|
|
|
if (!sig_data) {
|
2009-09-24 01:46:00 +08:00
|
|
|
if (audit_sig_sid)
|
|
|
|
security_release_secctx(ctx, len);
|
2006-05-25 22:19:47 +08:00
|
|
|
return -ENOMEM;
|
|
|
|
}
|
2012-02-08 08:53:48 +08:00
|
|
|
sig_data->uid = from_kuid(&init_user_ns, audit_sig_uid);
|
2006-05-25 22:19:47 +08:00
|
|
|
sig_data->pid = audit_sig_pid;
|
2009-09-24 01:46:00 +08:00
|
|
|
if (audit_sig_sid) {
|
|
|
|
memcpy(sig_data->ctx, ctx, len);
|
|
|
|
security_release_secctx(ctx, len);
|
|
|
|
}
|
2014-03-01 11:44:55 +08:00
|
|
|
audit_send_reply(skb, seq, AUDIT_SIGNAL_INFO, 0, 0,
|
|
|
|
sig_data, sizeof(*sig_data) + len);
|
2006-05-25 22:19:47 +08:00
|
|
|
kfree(sig_data);
|
2005-05-06 19:38:39 +08:00
|
|
|
break;
|
2007-07-16 14:40:56 +08:00
|
|
|
case AUDIT_TTY_GET: {
|
|
|
|
struct audit_tty_status s;
|
2016-01-10 14:55:33 +08:00
|
|
|
unsigned int t;
|
2012-09-11 14:43:14 +08:00
|
|
|
|
2016-01-10 14:55:33 +08:00
|
|
|
t = READ_ONCE(current->signal->audit_tty);
|
|
|
|
s.enabled = t & AUDIT_TTY_ENABLE;
|
|
|
|
s.log_passwd = !!(t & AUDIT_TTY_LOG_PASSWD);
|
2012-09-11 14:43:14 +08:00
|
|
|
|
2014-03-01 11:44:55 +08:00
|
|
|
audit_send_reply(skb, seq, AUDIT_TTY_GET, 0, 0, &s, sizeof(s));
|
2007-07-16 14:40:56 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
case AUDIT_TTY_SET: {
|
2013-11-16 00:29:02 +08:00
|
|
|
struct audit_tty_status s, old;
|
|
|
|
struct audit_buffer *ab;
|
2016-01-10 14:55:33 +08:00
|
|
|
unsigned int t;
|
2014-01-14 10:12:34 +08:00
|
|
|
|
|
|
|
memset(&s, 0, sizeof(s));
|
|
|
|
/* guard against past and future API changes */
|
|
|
|
memcpy(&s, data, min_t(size_t, sizeof(s), nlmsg_len(nlh)));
|
|
|
|
/* check if new data is valid */
|
|
|
|
if ((s.enabled != 0 && s.enabled != 1) ||
|
|
|
|
(s.log_passwd != 0 && s.log_passwd != 1))
|
|
|
|
err = -EINVAL;
|
2013-11-16 00:29:02 +08:00
|
|
|
|
2016-01-10 14:55:33 +08:00
|
|
|
if (err)
|
|
|
|
t = READ_ONCE(current->signal->audit_tty);
|
|
|
|
else {
|
|
|
|
t = s.enabled | (-s.log_passwd & AUDIT_TTY_LOG_PASSWD);
|
|
|
|
t = xchg(&current->signal->audit_tty, t);
|
2014-01-14 10:12:34 +08:00
|
|
|
}
|
2016-01-10 14:55:33 +08:00
|
|
|
old.enabled = t & AUDIT_TTY_ENABLE;
|
|
|
|
old.log_passwd = !!(t & AUDIT_TTY_LOG_PASSWD);
|
2007-07-16 14:40:56 +08:00
|
|
|
|
2013-11-16 00:29:02 +08:00
|
|
|
audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE);
|
2014-01-14 10:16:59 +08:00
|
|
|
audit_log_format(ab, " op=tty_set old-enabled=%d new-enabled=%d"
|
|
|
|
" old-log_passwd=%d new-log_passwd=%d res=%d",
|
|
|
|
old.enabled, s.enabled, old.log_passwd,
|
|
|
|
s.log_passwd, !err);
|
2013-11-16 00:29:02 +08:00
|
|
|
audit_log_end(ab);
|
2007-07-16 14:40:56 +08:00
|
|
|
break;
|
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
default:
|
|
|
|
err = -EINVAL;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
return err < 0 ? err : 0;
|
|
|
|
}
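The flag packing used in the AUDIT_TTY_SET case can be tried outside the kernel. What follows is a minimal user-space sketch, not kernel code. The helper names tty_pack, tty_enabled and tty_log_passwd are hypothetical, and the two flag values are assumed to be bit 0 and bit 1. It shows why negating a 0/1 value and masking it yields either zero or the flag bit.

```c
#include <assert.h>

/* Assumed values for this sketch: bit 0 and bit 1. */
#define AUDIT_TTY_ENABLE     (1u << 0)
#define AUDIT_TTY_LOG_PASSWD (1u << 1)

/* Pack two 0/1 fields into one word, mirroring the expression
 * "s.enabled | (-s.log_passwd & AUDIT_TTY_LOG_PASSWD)". */
unsigned int tty_pack(unsigned int enabled, unsigned int log_passwd)
{
	/* The handler rejects anything other than 0 or 1 with -EINVAL. */
	assert(enabled <= 1 && log_passwd <= 1);
	/* -log_passwd is 0 or all-ones (unsigned wraparound), so the AND
	 * gives 0 or exactly AUDIT_TTY_LOG_PASSWD. */
	return enabled | (-log_passwd & AUDIT_TTY_LOG_PASSWD);
}

/* Recover the "enabled" field from the packed word. */
unsigned int tty_enabled(unsigned int t)
{
	return t & AUDIT_TTY_ENABLE;
}

/* Recover the "log_passwd" field, normalized to 0 or 1. */
unsigned int tty_log_passwd(unsigned int t)
{
	return !!(t & AUDIT_TTY_LOG_PASSWD);
}
```

Keeping both settings in one word is what lets the handler swap old and new state with a single xchg() instead of taking a lock.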
|
|
|
|
|
2017-05-02 22:16:05 +08:00
|
|
|
/**
|
|
|
|
* audit_receive - receive messages from a netlink control socket
|
|
|
|
* @skb: the message buffer
|
|
|
|
*
|
|
|
|
* Parse the provided skb and deal with any messages that may be present,
|
|
|
|
* malformed skbs are discarded.
|
2005-09-14 03:47:11 +08:00
|
|
|
*/
|
2017-05-02 22:16:05 +08:00
|
|
|
static void audit_receive(struct sk_buff *skb)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2009-06-12 02:31:35 +08:00
|
|
|
struct nlmsghdr *nlh;
|
|
|
|
/*
|
2013-03-27 14:49:06 +08:00
|
|
|
* len MUST be signed for nlmsg_next to be able to decrement it below 0
|
2009-06-12 02:31:35 +08:00
|
|
|
* if the nlmsg_len was not aligned
|
|
|
|
*/
|
|
|
|
int len;
|
|
|
|
int err;
|
|
|
|
|
|
|
|
nlh = nlmsg_hdr(skb);
|
|
|
|
len = skb->len;
|
|
|
|
|
2018-02-20 22:52:38 +08:00
|
|
|
audit_ctl_lock();
|
2013-03-27 14:49:06 +08:00
|
|
|
while (nlmsg_ok(nlh, len)) {
|
2009-06-12 02:31:35 +08:00
|
|
|
err = audit_receive_msg(skb, nlh);
|
|
|
|
/* if err or if this message says it wants a response */
|
|
|
|
if (err || (nlh->nlmsg_flags & NLM_F_ACK))
|
2017-04-12 20:34:04 +08:00
|
|
|
netlink_ack(skb, nlh, err, NULL);
|
2009-06-12 02:31:35 +08:00
|
|
|
|
2013-03-29 05:31:29 +08:00
|
|
|
nlh = nlmsg_next(nlh, &len);
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
2018-02-20 22:52:38 +08:00
|
|
|
audit_ctl_unlock();
|
2005-04-17 06:20:36 +08:00
|
|
|
}
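The comment in audit_receive() about len being signed is easier to see with a small user-space model of the loop. This is a sketch under stated assumptions, not the kernel's netlink code: struct msg_hdr, MSG_ALIGN and count_messages are hypothetical stand-ins for struct nlmsghdr, NLMSG_ALIGN and the nlmsg_ok()/nlmsg_next() walk, and messages are assumed to be padded to 4 bytes.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a netlink message header. */
struct msg_hdr {
	uint32_t len;	/* total message length, header included */
	uint16_t type;
	uint16_t flags;
	uint32_t seq;
	uint32_t pid;
};

/* Round a length up to the next multiple of 4. */
#define MSG_ALIGN(n) (((n) + 3u) & ~3u)

/* Count the well-formed messages in buf. The length left over after the
 * walk is stored in *remaining_out when that pointer is not NULL. */
int count_messages(const unsigned char *buf, int buf_len, int *remaining_out)
{
	int remaining = buf_len;	/* signed on purpose, see below */
	size_t off = 0;
	int count = 0;

	assert(buf != NULL && buf_len >= 0 && buf_len <= INT_MAX - 3);
	while (remaining >= (int)sizeof(struct msg_hdr)) {
		struct msg_hdr h;

		memcpy(&h, buf + off, sizeof(h));	/* no unaligned reads */
		if (h.len < sizeof(h) || h.len > (uint32_t)remaining)
			break;				/* malformed or truncated */
		count++;
		/* Advance by the aligned length. If the last message was not
		 * padded, remaining drops below zero and the loop ends. An
		 * unsigned counter would wrap to a huge value instead. */
		off += MSG_ALIGN(h.len);
		remaining -= (int)MSG_ALIGN(h.len);
	}
	if (remaining_out != NULL)
		*remaining_out = remaining;
	return count;
}
```

Feeding it a buffer whose last message has an unpadded length leaves a negative remainder, which is exactly the case the comment warns about.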
|
|
|
|
|
2014-04-23 09:31:56 +08:00
|
|
|
/* Run custom bind function on netlink socket group connect or bind requests. */
|
2014-12-24 04:00:06 +08:00
|
|
|
static int audit_bind(struct net *net, int group)
|
2014-04-23 09:31:56 +08:00
|
|
|
{
|
|
|
|
if (!capable(CAP_AUDIT_READ))
|
|
|
|
return -EPERM;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2013-07-17 01:18:45 +08:00
|
|
|
static int __net_init audit_net_init(struct net *net)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2012-06-29 14:15:21 +08:00
|
|
|
struct netlink_kernel_cfg cfg = {
|
|
|
|
.input = audit_receive,
|
2014-04-23 09:31:56 +08:00
|
|
|
.bind = audit_bind,
|
2014-04-23 09:31:57 +08:00
|
|
|
.flags = NL_CFG_F_NONROOT_RECV,
|
|
|
|
.groups = AUDIT_NLGRP_MAX,
|
2012-06-29 14:15:21 +08:00
|
|
|
};
|
[PATCH] audit: path-based rules
In this implementation, audit registers inotify watches on the parent
directories of paths specified in audit rules. When audit's inotify
event handler is called, it updates any affected rules based on the
filesystem event. If the parent directory is renamed, removed, or its
filesystem is unmounted, audit removes all rules referencing that
inotify watch.
To keep things simple, this implementation limits location-based
auditing to the directory entries in an existing directory. Given
a path-based rule for /foo/bar/passwd, the following table applies:
passwd modified -- audit event logged
passwd replaced -- audit event logged, rules list updated
bar renamed -- rule removed
foo renamed -- untracked, meaning that the rule now applies to
the new location
Audit users typically want to have many rules referencing filesystem
objects, which can significantly impact filtering performance. This
patch also adds an inode-number-based rule hash to mitigate this
situation.
The patch is relative to the audit git tree:
http://kernel.org/git/?p=linux/kernel/git/viro/audit-current.git;a=summary
and uses the inotify kernel API:
http://lkml.org/lkml/2006/6/1/145
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-04-08 04:55:56 +08:00
|
|
|
|
2013-07-17 01:18:45 +08:00
|
|
|
struct audit_net *aunet = net_generic(net, audit_net_id);
|
|
|
|
|
audit: fix auditd/kernel connection state tracking
What started as a rather straightforward race condition reported by
Dmitry using the syzkaller fuzzer ended up revealing some major
problems with how the audit subsystem managed its netlink sockets and
its connection with the userspace audit daemon. Fixing this properly
had quite the cascading effect and what we are left with is this rather
large and complicated patch. My initial goal was to try and decompose
this patch into multiple smaller patches, but the way these changes
are intertwined makes it difficult to split these changes into
meaningful pieces that don't break or somehow make things worse for
the intermediate states.
The patch makes a number of changes, but the most significant are
highlighted below:
* The auditd tracking variables, e.g. audit_sock, are now gone and
replaced by a RCU/spin_lock protected variable auditd_conn which is
a structure containing all of the auditd tracking information.
* We no longer track the auditd sock directly, instead we track it
via the network namespace in which it resides and we use the audit
socket associated with that namespace. In spirit, this is what the
code was trying to do prior to this patch (at least I think that is
what the original authors intended), but it was done rather poorly
and added a layer of obfuscation that only masked the underlying
problems.
* Big backlog queue cleanup, again. In v4.10 we made some pretty big
changes to how the audit backlog queues work, here we haven't changed
the queue design so much as cleaned up the implementation. Brought
about by the locking changes, we've simplified kauditd_thread() quite
a bit by consolidating the queue handling into a new helper function,
kauditd_send_queue(), which allows us to eliminate a lot of very
similar code and makes the looping logic in kauditd_thread() clearer.
* All netlink messages sent to auditd are now sent via
auditd_send_unicast_skb(). Other than just making sense, this makes
the lock handling easier.
* Change the audit_log_start() sleep behavior so that we never sleep
on auditd events (unchanged) or if the caller is holding the
audit_cmd_mutex (changed). Previously we didn't sleep if the caller
was auditd or if the message type fell between a certain range; the
type check was a poor effort of doing what the cmd_mutex check now
does. Richard Guy Briggs originally proposed not sleeping the
cmd_mutex owner several years ago but his patch wasn't acceptable
at the time. At least the idea lives on here.
* A problem with the lost record counter has been resolved. Steve
Grubb and I both happened to notice this problem and according to
some quick testing by Steve, this problem goes back quite some time.
It's largely a harmless problem, although it may have left some
careful sysadmins quite puzzled.
Cc: <stable@vger.kernel.org> # 4.10.x-
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-21 23:26:35 +08:00
|
|
|
aunet->sk = netlink_kernel_create(net, NETLINK_AUDIT, &cfg);
|
|
|
|
if (aunet->sk == NULL) {
|
2013-07-17 01:18:45 +08:00
|
|
|
audit_panic("cannot initialize netlink socket in namespace");
|
2013-12-17 11:10:41 +08:00
|
|
|
return -ENOMEM;
|
|
|
|
}
|
2017-03-21 23:26:35 +08:00
|
|
|
aunet->sk->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
|
|
|
|
|
2013-07-17 01:18:45 +08:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void __net_exit audit_net_exit(struct net *net)
|
|
|
|
{
|
|
|
|
struct audit_net *aunet = net_generic(net, audit_net_id);
|
2017-03-21 23:26:35 +08:00
|
|
|
|
2017-05-02 22:16:05 +08:00
|
|
|
/* NOTE: you would think that we would want to check the auditd
|
|
|
|
* connection and potentially reset it here if it lives in this
|
|
|
|
* namespace, but since the auditd connection tracking struct holds a
|
|
|
|
* reference to this namespace (see auditd_set()) we are only ever
|
|
|
|
* going to get here after that connection has been released */
|
2013-07-17 01:18:45 +08:00
|
|
|
|
2017-03-21 23:26:35 +08:00
|
|
|
netlink_kernel_release(aunet->sk);
|
2013-07-17 01:18:45 +08:00
|
|
|
}
|
|
|
|
|
2013-07-17 01:18:45 +08:00
|
|
|
static struct pernet_operations audit_net_ops __net_initdata = {
|
2013-07-17 01:18:45 +08:00
|
|
|
.init = audit_net_init,
|
|
|
|
.exit = audit_net_exit,
|
|
|
|
.id = &audit_net_id,
|
|
|
|
.size = sizeof(struct audit_net),
|
|
|
|
};
|
|
|
|
|
|
|
|
/* Initialize audit support at boot time. */
|
|
|
|
static int __init audit_init(void)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
2008-11-06 01:47:09 +08:00
|
|
|
if (audit_initialized == AUDIT_DISABLED)
|
|
|
|
return 0;
|
|
|
|
|
2017-05-02 22:16:05 +08:00
|
|
|
audit_buffer_cache = kmem_cache_create("audit_buffer",
|
|
|
|
sizeof(struct audit_buffer),
|
|
|
|
0, SLAB_PANIC, NULL);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2016-11-30 05:53:24 +08:00
|
|
|
skb_queue_head_init(&audit_queue);
|
2016-11-30 05:53:25 +08:00
|
|
|
skb_queue_head_init(&audit_retry_queue);
|
2016-11-30 05:53:24 +08:00
|
|
|
skb_queue_head_init(&audit_hold_queue);
|
2006-03-11 08:14:06 +08:00
|
|
|
|
2006-04-08 04:55:56 +08:00
|
|
|
for (i = 0; i < AUDIT_INODE_BUCKETS; i++)
|
|
|
|
INIT_LIST_HEAD(&audit_inode_hash[i]);
|
|
|
|
|
2018-02-20 22:52:38 +08:00
|
|
|
mutex_init(&audit_cmd_mutex.lock);
|
|
|
|
audit_cmd_mutex.owner = NULL;
|
|
|
|
|
2017-03-21 23:26:35 +08:00
|
|
|
pr_info("initializing netlink subsys (%s)\n",
|
|
|
|
audit_default ? "enabled" : "disabled");
|
|
|
|
register_pernet_subsys(&audit_net_ops);
|
|
|
|
|
|
|
|
audit_initialized = AUDIT_INITIALIZED;
|
|
|
|
|
2016-11-30 05:53:23 +08:00
|
|
|
kauditd_task = kthread_run(kauditd_thread, NULL, "kauditd");
|
|
|
|
if (IS_ERR(kauditd_task)) {
|
|
|
|
int err = PTR_ERR(kauditd_task);
|
|
|
|
panic("audit: failed to start the kauditd thread (%d)\n", err);
|
|
|
|
}
|
|
|
|
|
2016-12-15 04:59:46 +08:00
|
|
|
audit_log(NULL, GFP_KERNEL, AUDIT_KERNEL,
|
|
|
|
"state=initialized audit_enabled=%u res=1",
|
|
|
|
audit_enabled);
|
2016-11-30 05:53:23 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
return 0;
|
|
|
|
}
|
2017-09-01 21:44:44 +08:00
|
|
|
postcore_initcall(audit_init);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
/* Process kernel command-line parameter at boot time. audit=0 or audit=1. */
|
|
|
|
static int __init audit_enable(char *str)
|
|
|
|
{
|
2017-09-01 21:44:51 +08:00
|
|
|
long val;
|
|
|
|
|
|
|
|
if (kstrtol(str, 0, &val))
|
|
|
|
panic("audit: invalid 'audit' parameter value (%s)\n", str);
|
|
|
|
audit_default = (val ? AUDIT_ON : AUDIT_OFF);
|
|
|
|
|
|
|
|
if (audit_default == AUDIT_OFF)
|
2008-11-06 01:47:09 +08:00
|
|
|
audit_initialized = AUDIT_DISABLED;
|
2017-09-01 21:45:05 +08:00
|
|
|
if (audit_set_enabled(audit_default))
|
|
|
|
panic("audit: error setting audit state (%d)\n", audit_default);
|
2008-11-06 01:47:09 +08:00
|
|
|
|
2014-01-15 02:33:12 +08:00
|
|
|
pr_info("%s\n", audit_default ?
|
2013-10-31 14:31:01 +08:00
|
|
|
"enabled (after initialization)" : "disabled (until reboot)");
|
2008-11-06 01:47:09 +08:00
|
|
|
|
2006-03-31 18:30:33 +08:00
|
|
|
return 1;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
__setup("audit=", audit_enable);
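The parsing done for the audit= parameter can be approximated in user space as shown below. This is only a sketch: parse_audit_param is a hypothetical name, the enum values are assumed, and strtol() stands in for kstrtol(), whose handling of whitespace and a trailing newline may differ. Unlike the kernel handler, which panics on an unparsable value, the sketch returns an error.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Assumed values for this sketch. */
enum audit_state { AUDIT_OFF = 0, AUDIT_ON = 1 };

/* Parse the text that follows "audit=". Returns 0 and stores the state
 * in *out on success, or -1 if the text is not a complete number. */
int parse_audit_param(const char *str, enum audit_state *out)
{
	char *end;
	long val;

	assert(str != NULL && out != NULL);
	errno = 0;
	val = strtol(str, &end, 0);	/* base 0: decimal, 0x hex, 0 octal */
	if (errno != 0 || end == str || *end != '\0')
		return -1;
	/* Any non-zero value enables auditing, as in audit_enable(). */
	*out = val ? AUDIT_ON : AUDIT_OFF;
	return 0;
}
```

Because any non-zero number maps to the enabled state, audit=2 or audit=0x10 on the command line behaves like audit=1 in this model.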
|
|
|
|
|
2013-09-18 00:34:52 +08:00
|
|
|
/* Process kernel command-line parameter at boot time.
|
|
|
|
* audit_backlog_limit=<n> */
|
|
|
|
static int __init audit_backlog_limit_set(char *str)
|
|
|
|
{
|
2014-01-15 02:33:13 +08:00
|
|
|
u32 audit_backlog_limit_arg;
|
2014-01-15 02:33:12 +08:00
|
|
|
|
2013-09-18 00:34:52 +08:00
|
|
|
pr_info("audit_backlog_limit: ");
|
2014-01-15 02:33:13 +08:00
|
|
|
if (kstrtouint(str, 0, &audit_backlog_limit_arg)) {
|
|
|
|
pr_cont("using default of %u, unable to parse %s\n",
|
2014-01-15 02:33:12 +08:00
|
|
|
audit_backlog_limit, str);
|
2013-09-18 00:34:52 +08:00
|
|
|
return 1;
|
|
|
|
}
|
2014-01-15 02:33:13 +08:00
|
|
|
|
|
|
|
audit_backlog_limit = audit_backlog_limit_arg;
|
2014-01-15 02:33:12 +08:00
|
|
|
pr_cont("%u\n", audit_backlog_limit);
|
2013-09-18 00:34:52 +08:00
|
|
|
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
__setup("audit_backlog_limit=", audit_backlog_limit_set);
|
|
|
|
|
2005-05-06 22:53:34 +08:00
|
|
|
static void audit_buffer_free(struct audit_buffer *ab)
|
|
|
|
{
|
2005-05-06 22:54:17 +08:00
|
|
|
if (!ab)
|
|
|
|
return;
|
|
|
|
|
2016-01-13 22:18:55 +08:00
|
|
|
kfree_skb(ab->skb);
|
2017-05-02 22:16:05 +08:00
|
|
|
kmem_cache_free(audit_buffer_cache, ab);
|
2005-05-06 22:53:34 +08:00
|
|
|
}
|
|
|
|
|
2017-05-02 22:16:05 +08:00
|
|
|
static struct audit_buffer *audit_buffer_alloc(struct audit_context *ctx,
|
|
|
|
gfp_t gfp_mask, int type)
|
2005-05-06 22:53:34 +08:00
|
|
|
{
|
2017-05-02 22:16:05 +08:00
|
|
|
struct audit_buffer *ab;
|
2005-05-06 22:54:17 +08:00
|
|
|
|
2017-05-02 22:16:05 +08:00
|
|
|
ab = kmem_cache_alloc(audit_buffer_cache, gfp_mask);
|
|
|
|
if (!ab)
|
|
|
|
return NULL;
|
2009-06-12 02:31:35 +08:00
|
|
|
|
|
|
|
ab->skb = nlmsg_new(AUDIT_BUFSIZ, gfp_mask);
|
|
|
|
if (!ab->skb)
|
2012-06-27 12:45:21 +08:00
|
|
|
goto err;
|
2017-05-02 22:16:05 +08:00
|
|
|
if (!nlmsg_put(ab->skb, 0, 0, type, 0, 0))
|
|
|
|
goto err;
|
2009-06-12 02:31:35 +08:00
|
|
|
|
2017-05-02 22:16:05 +08:00
|
|
|
ab->ctx = ctx;
|
|
|
|
ab->gfp_mask = gfp_mask;
|
2009-06-12 02:31:35 +08:00
|
|
|
|
2005-05-06 22:53:34 +08:00
|
|
|
return ab;
|
2009-06-12 02:31:35 +08:00
|
|
|
|
2005-05-06 22:54:17 +08:00
|
|
|
err:
|
|
|
|
audit_buffer_free(ab);
|
|
|
|
return NULL;
|
2005-05-06 22:53:34 +08:00
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2005-09-14 03:47:11 +08:00
|
|
|
/**
|
|
|
|
* audit_serial - compute a serial number for the audit record
|
|
|
|
*
|
|
|
|
* Compute a serial number for the audit record. Audit records are
|
2005-05-22 04:08:09 +08:00
|
|
|
* written to user-space as soon as they are generated, so a complete
|
|
|
|
* audit record may be written in several pieces. The timestamp of the
|
|
|
|
* record and this serial number are used by the user-space tools to
|
|
|
|
* determine which pieces belong to the same audit record. The
|
|
|
|
* (timestamp,serial) tuple is unique for each syscall and is live from
|
|
|
|
* syscall entry to syscall exit.
|
|
|
|
*
|
|
|
|
* NOTE: Another possibility is to store the formatted records off the
|
|
|
|
* audit context (for those records that have a context), and emit them
|
|
|
|
* all at syscall exit. However, this could delay the reporting of
|
|
|
|
* significant errors until syscall exit (or never, if the system
|
2005-09-14 03:47:11 +08:00
|
|
|
* halts).
|
|
|
|
*/
|
2005-05-22 04:08:09 +08:00
|
|
|
unsigned int audit_serial(void)
|
|
|
|
{
|
2014-06-14 06:22:00 +08:00
|
|
|
static atomic_t serial = ATOMIC_INIT(0);
|
2005-07-15 19:56:03 +08:00
|
|
|
|
2014-06-14 06:22:00 +08:00
|
|
|
return atomic_add_return(1, &serial);
|
2005-05-22 04:08:09 +08:00
|
|
|
}
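/*
 * Illustration only (not part of audit.c): a user-space sketch of the same
 * counter using C11 atomics. demo_serial() is a hypothetical name.
 * atomic_fetch_add() returns the value before the addition, so 1 is added
 * to mirror atomic_add_return(), which returns the value after it.
 */

```c
#include <stdatomic.h>

/* Hypothetical stand-in for audit_serial(); the first call returns 1. */
static unsigned int demo_serial(void)
{
	static atomic_uint serial;	/* static storage, so it starts at 0 */

	return atomic_fetch_add(&serial, 1u) + 1u;
}
```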
|
|
|
|
|
2007-10-18 18:06:10 +08:00
|
|
|
static inline void audit_get_stamp(struct audit_context *ctx,
|
2017-05-02 22:16:05 +08:00
|
|
|
struct timespec64 *t, unsigned int *serial)
|
2005-05-22 04:08:09 +08:00
|
|
|
{
|
2008-12-06 14:05:50 +08:00
|
|
|
if (!ctx || !auditsc_get_stamp(ctx, t, serial)) {
|
audit: Reduce overhead using a coarse clock
Commit 2115bb250f26 ("audit: Use timespec64 to represent audit timestamps")
noted that audit timestamps were not y2038 safe and used a 64-bit
timestamp. In itself, this makes sense but the conversion was from
CURRENT_TIME to ktime_get_real_ts64() which is a heavier call to record
an accurate timestamp which is required in some, but not all, cases. The
impact is that when auditd is running without any rules that all syscalls
have higher overhead. This is visible in the sysbench-thread benchmark as
an 11.5% performance hit. That benchmark is dumb as rocks but it's also
visible in redis as an 8-10% hit on all operations which is of greater
concern. It is somewhat stupid of audit to track syscalls without any
rules related to syscalls but that is how it behaves.
The overhead can be directly measured with perf comparing 4.9 with 4.12
4.9
7.76% sysbench [kernel.vmlinux] [k] __schedule
7.62% sysbench [kernel.vmlinux] [k] _raw_spin_lock
7.37% sysbench libpthread-2.22.so [.] __lll_lock_elision
7.29% sysbench [kernel.vmlinux] [.] syscall_return_via_sysret
6.59% sysbench [kernel.vmlinux] [k] native_sched_clock
5.21% sysbench libc-2.22.so [.] __sched_yield
4.38% sysbench [kernel.vmlinux] [k] entry_SYSCALL_64
4.28% sysbench [kernel.vmlinux] [k] do_syscall_64
3.49% sysbench libpthread-2.22.so [.] __lll_unlock_elision
3.13% sysbench [kernel.vmlinux] [k] __audit_syscall_exit
2.87% sysbench [kernel.vmlinux] [k] update_curr
2.73% sysbench [kernel.vmlinux] [k] pick_next_task_fair
2.31% sysbench [kernel.vmlinux] [k] syscall_trace_enter
2.20% sysbench [kernel.vmlinux] [k] __audit_syscall_entry
.....
0.00% swapper [kernel.vmlinux] [k] read_tsc
4.12
7.84% sysbench [kernel.vmlinux] [k] __schedule
7.05% sysbench [kernel.vmlinux] [k] _raw_spin_lock
6.57% sysbench libpthread-2.22.so [.] __lll_lock_elision
6.50% sysbench [kernel.vmlinux] [.] syscall_return_via_sysret
5.95% sysbench [kernel.vmlinux] [k] read_tsc
5.71% sysbench [kernel.vmlinux] [k] native_sched_clock
4.78% sysbench libc-2.22.so [.] __sched_yield
4.30% sysbench [kernel.vmlinux] [k] entry_SYSCALL_64
3.94% sysbench [kernel.vmlinux] [k] do_syscall_64
3.37% sysbench libpthread-2.22.so [.] __lll_unlock_elision
3.32% sysbench [kernel.vmlinux] [k] __audit_syscall_exit
2.91% sysbench [kernel.vmlinux] [k] __getnstimeofday64
Note the additional overhead from read_tsc which goes from 0% to 5.95%.
This is on a single-socket E3-1230 but similar overheads have been measured
on an older machine which the patch also eliminates.
The patch in question has no explanation as to why a fully-accurate timestamp
is required and is likely an oversight. Using a coarser, but monotonically
increasing, timestamp the overhead can be eliminated. While it can be
worked around by configuring or disabling audit, it's tricky enough to
detect that a kernel fix is justified. With this patch, we see the following;
sysbenchthread
4.9.0 4.12.0 4.12.0
vanilla vanilla coarse-v1r1
Amean 1 1.49 ( 0.00%) 1.66 ( -11.42%) 1.51 ( -1.34%)
Amean 3 1.48 ( 0.00%) 1.65 ( -11.45%) 1.50 ( -0.96%)
Amean 5 1.49 ( 0.00%) 1.67 ( -12.31%) 1.51 ( -1.83%)
Amean 7 1.49 ( 0.00%) 1.66 ( -11.72%) 1.50 ( -0.67%)
Amean 12 1.48 ( 0.00%) 1.65 ( -11.57%) 1.52 ( -2.89%)
Amean 16 1.49 ( 0.00%) 1.65 ( -11.13%) 1.51 ( -1.73%)
The benchmark is reporting the time required for different thread counts to
lock/unlock a private mutex which, while dense, demonstrates the syscall
overhead. This is showing that 4.12 took an 11-12% hit but the overhead is
almost eliminated by the patch. While the variance is not reported here,
it's well within the noise with the patch applied.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-07-04 20:11:43 +08:00
|
|
|
*t = current_kernel_time64();
|
2005-05-22 04:08:09 +08:00
|
|
|
*serial = audit_serial();
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2005-09-14 03:47:11 +08:00
|
|
|
/**
|
|
|
|
* audit_log_start - obtain an audit buffer
|
|
|
|
* @ctx: audit_context (may be NULL)
|
|
|
|
* @gfp_mask: type of allocation
|
|
|
|
* @type: audit message type
|
|
|
|
*
|
|
|
|
* Returns audit_buffer pointer on success or NULL on error.
|
|
|
|
*
|
|
|
|
* Obtain an audit buffer. This routine does locking to obtain the
|
|
|
|
* audit buffer, but then no locking is required for calls to
|
|
|
|
* audit_log_*format. If the task (ctx) is a task that is currently in a
|
|
|
|
* syscall, then the syscall is marked as auditable and an audit record
|
|
|
|
* will be written at syscall exit. If there is no associated task, then
|
|
|
|
* task context (ctx) should be NULL.
|
|
|
|
*/
|
2005-10-21 15:22:03 +08:00
|
|
|
struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
|
2005-06-22 22:04:33 +08:00
|
|
|
int type)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2016-11-30 05:53:25 +08:00
|
|
|
struct audit_buffer *ab;
|
2017-05-02 22:16:05 +08:00
|
|
|
struct timespec64 t;
|
2016-11-30 05:53:25 +08:00
|
|
|
unsigned int uninitialized_var(serial);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-11-06 01:47:09 +08:00
|
|
|
if (audit_initialized != AUDIT_INITIALIZED)
|
2005-04-17 06:20:36 +08:00
|
|
|
return NULL;
|
|
|
|
|
2016-06-25 04:35:46 +08:00
|
|
|
if (unlikely(!audit_filter(type, AUDIT_FILTER_TYPE)))
|
2005-11-04 00:12:36 +08:00
|
|
|
return NULL;
|
|
|
|
|
audit: fix auditd/kernel connection state tracking
What started as a rather straightforward race condition reported by
Dmitry using the syzkaller fuzzer ended up revealing some major
problems with how the audit subsystem managed its netlink sockets and
its connection with the userspace audit daemon. Fixing this properly
had quite the cascading effect and what we are left with is this rather
large and complicated patch. My initial goal was to try and decompose
this patch into multiple smaller patches, but the way these changes
are intertwined makes it difficult to split these changes into
meaningful pieces that don't break or somehow make things worse for
the intermediate states.
The patch makes a number of changes, but the most significant are
highlighted below:
* The auditd tracking variables, e.g. audit_sock, are now gone and
replaced by a RCU/spin_lock protected variable auditd_conn which is
a structure containing all of the auditd tracking information.
* We no longer track the auditd sock directly, instead we track it
via the network namespace in which it resides and we use the audit
socket associated with that namespace. In spirit, this is what the
code was trying to do prior to this patch (at least I think that is
what the original authors intended), but it was done rather poorly
and added a layer of obfuscation that only masked the underlying
problems.
* Big backlog queue cleanup, again. In v4.10 we made some pretty big
changes to how the audit backlog queues work, here we haven't changed
the queue design so much as cleaned up the implementation. Brought
about by the locking changes, we've simplified kauditd_thread() quite
a bit by consolidating the queue handling into a new helper function,
kauditd_send_queue(), which allows us to eliminate a lot of very
similar code and makes the looping logic in kauditd_thread() clearer.
* All netlink messages sent to auditd are now sent via
auditd_send_unicast_skb(). Other than just making sense, this makes
the lock handling easier.
* Change the audit_log_start() sleep behavior so that we never sleep
on auditd events (unchanged) or if the caller is holding the
audit_cmd_mutex (changed). Previously we didn't sleep if the caller
was auditd or if the message type fell between a certain range; the
type check was a poor effort of doing what the cmd_mutex check now
does. Richard Guy Briggs originally proposed not sleeping the
cmd_mutex owner several years ago but his patch wasn't acceptable
at the time. At least the idea lives on here.
* A problem with the lost record counter has been resolved. Steve
Grubb and I both happened to notice this problem and according to
some quick testing by Steve, this problem goes back quite some time.
It's largely a harmless problem, although it may have left some
careful sysadmins quite puzzled.
Cc: <stable@vger.kernel.org> # 4.10.x-
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-21 23:26:35 +08:00
|
|
|
/* NOTE: don't ever fail/sleep on these two conditions:
|
2016-11-30 05:53:26 +08:00
|
|
|
* 1. auditd generated record - since we need auditd to drain the
|
|
|
|
* queue; also, when we are checking for auditd, compare PIDs using
|
|
|
|
* task_tgid_vnr() since auditd_pid is set in audit_receive_msg()
|
|
|
|
* using a PID anchored in the caller's namespace
|
2017-03-21 23:26:35 +08:00
|
|
|
* 2. generator holding the audit_cmd_mutex - we don't want to block
|
|
|
|
* while holding the mutex */
|
2018-02-20 22:52:38 +08:00
|
|
|
if (!(auditd_test_task(current) || audit_ctl_owner_current())) {
|
2017-03-21 23:26:35 +08:00
|
|
|
long stime = audit_backlog_wait_time;
|
2016-11-30 05:53:25 +08:00
|
|
|
|
|
|
|
while (audit_backlog_limit &&
|
|
|
|
(skb_queue_len(&audit_queue) > audit_backlog_limit)) {
|
|
|
|
/* wake kauditd to try and flush the queue */
|
|
|
|
wake_up_interruptible(&kauditd_wait);
|
2005-06-22 22:04:33 +08:00
|
|
|
|
2016-11-30 05:53:25 +08:00
|
|
|
/* sleep if we are allowed and we haven't exhausted our
|
|
|
|
* backlog wait limit */
|
2017-03-21 23:26:35 +08:00
|
|
|
if (gfpflags_allow_blocking(gfp_mask) && (stime > 0)) {
|
2016-11-30 05:53:25 +08:00
|
|
|
DECLARE_WAITQUEUE(wait, current);
|
|
|
|
|
|
|
|
add_wait_queue_exclusive(&audit_backlog_wait,
|
|
|
|
&wait);
|
|
|
|
set_current_state(TASK_UNINTERRUPTIBLE);
|
2017-03-21 23:26:35 +08:00
|
|
|
stime = schedule_timeout(stime);
|
2016-11-30 05:53:25 +08:00
|
|
|
remove_wait_queue(&audit_backlog_wait, &wait);
|
|
|
|
} else {
|
|
|
|
if (audit_rate_check() && printk_ratelimit())
|
|
|
|
pr_warn("audit_backlog=%u > audit_backlog_limit=%u\n",
|
|
|
|
skb_queue_len(&audit_queue),
|
|
|
|
audit_backlog_limit);
|
|
|
|
audit_log_lost("backlog limit exceeded");
|
|
|
|
return NULL;
|
2013-09-25 06:27:42 +08:00
|
|
|
}
|
2005-06-22 22:04:33 +08:00
|
|
|
}
|
2005-05-19 21:55:56 +08:00
|
|
|
}
|
|
|
|
|
2005-06-22 22:04:33 +08:00
|
|
|
ab = audit_buffer_alloc(ctx, gfp_mask, type);
|
2005-04-17 06:20:36 +08:00
|
|
|
if (!ab) {
|
|
|
|
audit_log_lost("out of memory in audit_log_start");
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2005-05-22 04:08:09 +08:00
|
|
|
audit_get_stamp(ab->ctx, &t, &serial);
|
2017-05-02 22:16:05 +08:00
|
|
|
audit_log_format(ab, "audit(%llu.%03lu:%u): ",
|
|
|
|
(unsigned long long)t.tv_sec, t.tv_nsec/1000000, serial);
|
2016-11-30 05:53:25 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
return ab;
|
|
|
|
}
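/*
 * Illustration only (not part of audit.c): a user-space sketch of the
 * record prefix that audit_log_start() writes. demo_format_stamp() is a
 * hypothetical name and struct timespec stands in for struct timespec64.
 * Nanoseconds are divided down to milliseconds and zero-padded to three
 * digits, giving "audit(<seconds>.<milliseconds>:<serial>): ".
 */

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: format the "audit(sec.msec:serial): " prefix. */
static int demo_format_stamp(char *buf, size_t size,
			     const struct timespec *t, unsigned int serial)
{
	return snprintf(buf, size, "audit(%llu.%03lu:%u): ",
			(unsigned long long)t->tv_sec,
			(unsigned long)(t->tv_nsec / 1000000), serial);
}
```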
|
|
|
|
|
2005-05-06 22:54:17 +08:00
|
|
|
/**
|
2005-05-06 22:54:53 +08:00
|
|
|
* audit_expand - expand skb in the audit buffer
|
2005-05-06 22:54:17 +08:00
|
|
|
* @ab: audit_buffer
|
2005-09-14 03:47:11 +08:00
|
|
|
* @extra: space to add at tail of the skb
|
2005-05-06 22:54:17 +08:00
|
|
|
*
|
|
|
|
* Returns 0 (no space) on failed expansion, or available space if
|
|
|
|
* successful.
|
|
|
|
*/
|
2005-05-11 01:56:08 +08:00
|
|
|
static inline int audit_expand(struct audit_buffer *ab, int extra)
|
2005-05-06 22:54:17 +08:00
|
|
|
{
|
2005-05-06 22:54:53 +08:00
|
|
|
struct sk_buff *skb = ab->skb;
|
2008-01-29 12:47:09 +08:00
|
|
|
int oldtail = skb_tailroom(skb);
|
|
|
|
int ret = pskb_expand_head(skb, 0, extra, ab->gfp_mask);
|
|
|
|
int newtail = skb_tailroom(skb);
|
|
|
|
|
2005-05-06 22:54:53 +08:00
|
|
|
if (ret < 0) {
|
|
|
|
audit_log_lost("out of memory in audit_expand");
|
2005-05-06 22:54:17 +08:00
|
|
|
return 0;
|
2005-05-06 22:54:53 +08:00
|
|
|
}
|
2008-01-29 12:47:09 +08:00
|
|
|
|
|
|
|
skb->truesize += newtail - oldtail;
|
|
|
|
return newtail;
|
2005-05-06 22:54:17 +08:00
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2005-09-14 03:47:11 +08:00
|
|
|
/*
|
|
|
|
* Format an audit message into the audit buffer. If there isn't enough
|
2005-04-17 06:20:36 +08:00
|
|
|
* room in the audit buffer, more room will be allocated and vsnprintf
|
|
|
|
* will be called a second time. Currently, we assume that a printk
|
2005-09-14 03:47:11 +08:00
|
|
|
* can't format a message larger than 1024 bytes, so we don't either.
|
|
|
|
*/
|
2005-04-17 06:20:36 +08:00
|
|
|
static void audit_log_vformat(struct audit_buffer *ab, const char *fmt,
|
|
|
|
va_list args)
|
|
|
|
{
|
|
|
|
int len, avail;
|
2005-05-06 22:54:53 +08:00
|
|
|
struct sk_buff *skb;
|
2005-05-11 01:58:51 +08:00
|
|
|
va_list args2;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
if (!ab)
|
|
|
|
return;
|
|
|
|
|
2005-05-06 22:54:53 +08:00
|
|
|
BUG_ON(!ab->skb);
|
|
|
|
skb = ab->skb;
|
|
|
|
avail = skb_tailroom(skb);
|
|
|
|
if (avail == 0) {
|
2005-05-11 01:56:08 +08:00
|
|
|
avail = audit_expand(ab, AUDIT_BUFSIZ);
|
2005-05-06 22:54:17 +08:00
|
|
|
if (!avail)
|
|
|
|
goto out;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
2005-05-11 01:58:51 +08:00
|
|
|
va_copy(args2, args);
|
2007-04-20 11:29:13 +08:00
|
|
|
len = vsnprintf(skb_tail_pointer(skb), avail, fmt, args);
|
2005-04-17 06:20:36 +08:00
|
|
|
if (len >= avail) {
|
|
|
|
/* The printk buffer is 1024 bytes long, so if we get
|
|
|
|
* here and AUDIT_BUFSIZ is at least 1024, then we can
|
|
|
|
* log everything that printk could have logged. */
|
2005-09-14 03:47:11 +08:00
|
|
|
avail = audit_expand(ab,
|
|
|
|
max_t(unsigned, AUDIT_BUFSIZ, 1+len-avail));
|
2005-05-06 22:54:17 +08:00
|
|
|
if (!avail)
|
2012-01-09 05:44:29 +08:00
|
|
|
goto out_va_end;
|
2007-04-20 11:29:13 +08:00
|
|
|
len = vsnprintf(skb_tail_pointer(skb), avail, fmt, args2);
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
2005-05-19 17:24:22 +08:00
|
|
|
if (len > 0)
|
|
|
|
skb_put(skb, len);
|
2012-01-09 05:44:29 +08:00
|
|
|
out_va_end:
|
|
|
|
va_end(args2);
|
2005-05-06 22:54:17 +08:00
|
|
|
out:
|
|
|
|
return;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
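/*
 * Illustration only (not part of audit.c): a user-space sketch of the
 * "format, grow on truncation, format again" pattern used by
 * audit_log_vformat(). struct demo_buf, demo_vformat() and demo_format()
 * are hypothetical names, and realloc() stands in for audit_expand().
 * The va_list is copied before the first vsnprintf() because a va_list
 * that has been consumed once cannot be reused for the second pass.
 */

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical growable buffer: len bytes used out of cap allocated. */
struct demo_buf {
	char *data;
	size_t len;
	size_t cap;
};

static int demo_vformat(struct demo_buf *b, const char *fmt, va_list args)
{
	va_list args2;
	size_t avail = b->cap - b->len;
	int len, ret = -1;

	va_copy(args2, args);	/* keep a pristine copy for the retry */
	len = vsnprintf(avail ? b->data + b->len : NULL, avail, fmt, args);
	if (len < 0)
		goto out;
	if ((size_t)len >= avail) {
		/* Truncated: grow so the text plus its NUL fits, then retry. */
		size_t cap = b->len + (size_t)len + 1;
		char *p = realloc(b->data, cap);

		if (!p)
			goto out;
		b->data = p;
		b->cap = cap;
		len = vsnprintf(b->data + b->len, cap - b->len, fmt, args2);
		if (len < 0)
			goto out;
	}
	b->len += (size_t)len;	/* like skb_put(): the NUL is not counted */
	ret = 0;
out:
	va_end(args2);
	return ret;
}

static int demo_format(struct demo_buf *b, const char *fmt, ...)
{
	va_list args;
	int ret;

	va_start(args, fmt);
	ret = demo_vformat(b, fmt, args);
	va_end(args);
	return ret;
}
```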
|
|
|
|
|
2005-09-14 03:47:11 +08:00
|
|
|
/**
|
|
|
|
* audit_log_format - format a message into the audit buffer.
|
|
|
|
* @ab: audit_buffer
|
|
|
|
* @fmt: format string
|
|
|
|
* @...: optional parameters matching @fmt string
|
|
|
|
*
|
|
|
|
* All the work is done in audit_log_vformat.
|
|
|
|
*/
|
2005-04-17 06:20:36 +08:00
|
|
|
void audit_log_format(struct audit_buffer *ab, const char *fmt, ...)
|
|
|
|
{
|
|
|
|
va_list args;
|
|
|
|
|
|
|
|
if (!ab)
|
|
|
|
return;
|
|
|
|
va_start(args, fmt);
|
|
|
|
audit_log_vformat(ab, fmt, args);
|
|
|
|
va_end(args);
|
|
|
|
}
|
|
|
|
|
2005-09-14 03:47:11 +08:00
|
|
|
/**
|
2017-08-07 21:44:24 +08:00
|
|
|
* audit_log_n_hex - convert a buffer to hex and append it to the audit skb
|
2005-09-14 03:47:11 +08:00
|
|
|
* @ab: the audit_buffer
|
|
|
|
* @buf: buffer to convert to hex
|
|
|
|
* @len: length of @buf to be converted
|
|
|
|
*
|
|
|
|
* No return value; failure to expand is silently ignored.
|
|
|
|
*
|
|
|
|
* This function will take the passed buf and convert it into a string of
|
|
|
|
* ascii hex digits. The new string is placed onto the skb.
|
|
|
|
*/
|
2008-04-18 22:12:59 +08:00
|
|
|
void audit_log_n_hex(struct audit_buffer *ab, const unsigned char *buf,
|
2005-05-19 17:24:22 +08:00
|
|
|
size_t len)
|
2005-04-29 22:54:44 +08:00
|
|
|
{
|
2005-05-19 17:24:22 +08:00
|
|
|
int i, avail, new_len;
|
|
|
|
unsigned char *ptr;
|
|
|
|
struct sk_buff *skb;
|
|
|
|
|
2006-09-08 05:03:02 +08:00
|
|
|
if (!ab)
|
|
|
|
return;
|
|
|
|
|
2005-05-19 17:24:22 +08:00
|
|
|
BUG_ON(!ab->skb);
|
|
|
|
skb = ab->skb;
|
|
|
|
avail = skb_tailroom(skb);
|
|
|
|
new_len = len<<1;
|
|
|
|
if (new_len >= avail) {
|
|
|
|
/* Round the buffer request up to the next multiple */
|
|
|
|
new_len = AUDIT_BUFSIZ*(((new_len-avail)/AUDIT_BUFSIZ) + 1);
|
|
|
|
avail = audit_expand(ab, new_len);
|
|
|
|
if (!avail)
|
|
|
|
return;
|
|
|
|
}
|
2005-04-29 22:54:44 +08:00
|
|
|
|
2007-04-20 11:29:13 +08:00
|
|
|
ptr = skb_tail_pointer(skb);
|
2014-01-14 15:31:27 +08:00
|
|
|
for (i = 0; i < len; i++)
|
|
|
|
ptr = hex_byte_pack_upper(ptr, buf[i]);
|
2005-05-19 17:24:22 +08:00
|
|
|
*ptr = 0;
|
|
|
|
skb_put(skb, len << 1); /* new string is twice the old string */
|
2005-04-29 22:54:44 +08:00
|
|
|
}
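/*
 * Illustration only (not part of audit.c): a user-space sketch of the
 * encoding loop in audit_log_n_hex(). demo_hex_upper() is a hypothetical
 * name; the two stores per byte do what hex_byte_pack_upper() does in the
 * kernel. The caller must provide room for 2 * len + 1 bytes, which is why
 * the function above sizes its request from len << 1.
 */

```c
#include <stddef.h>

/* Hypothetical helper: write src as uppercase hex digits plus a NUL. */
static void demo_hex_upper(char *dst, const unsigned char *src, size_t len)
{
	static const char digits[] = "0123456789ABCDEF";
	size_t i;

	for (i = 0; i < len; i++) {
		*dst++ = digits[src[i] >> 4];	/* high nibble first */
		*dst++ = digits[src[i] & 0x0f];	/* then the low nibble */
	}
	*dst = '\0';
}
```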
|
|
|
|
|
2006-06-09 11:19:31 +08:00
|
|
|
/*
|
|
|
|
* Format a string of no more than slen characters into the audit buffer,
|
|
|
|
* enclosed in quote marks.
|
|
|
|
*/
|
2008-04-18 22:12:59 +08:00
|
|
|
void audit_log_n_string(struct audit_buffer *ab, const char *string,
|
|
|
|
size_t slen)
|
2006-06-09 11:19:31 +08:00
|
|
|
{
|
|
|
|
int avail, new_len;
|
|
|
|
unsigned char *ptr;
|
|
|
|
struct sk_buff *skb;
|
|
|
|
|
2006-09-08 05:03:02 +08:00
|
|
|
if (!ab)
|
|
|
|
return;
|
|
|
|
|
2006-06-09 11:19:31 +08:00
|
|
|
BUG_ON(!ab->skb);
|
|
|
|
skb = ab->skb;
|
|
|
|
avail = skb_tailroom(skb);
|
|
|
|
new_len = slen + 3; /* enclosing quotes + null terminator */
|
|
|
|
if (new_len > avail) {
|
|
|
|
avail = audit_expand(ab, new_len);
|
|
|
|
if (!avail)
|
|
|
|
return;
|
|
|
|
}
|
2007-04-20 11:29:13 +08:00
|
|
|
ptr = skb_tail_pointer(skb);
|
2006-06-09 11:19:31 +08:00
|
|
|
*ptr++ = '"';
|
|
|
|
memcpy(ptr, string, slen);
|
|
|
|
ptr += slen;
|
|
|
|
*ptr++ = '"';
|
|
|
|
*ptr = 0;
|
|
|
|
skb_put(skb, slen + 2); /* don't include null terminator */
|
|
|
|
}
|
|
|
|
|
2008-01-08 03:31:58 +08:00
|
|
|
/**
|
|
|
|
* audit_string_contains_control - does a string need to be logged in hex
|
2008-03-29 05:15:56 +08:00
|
|
|
* @string: string to be checked
|
|
|
|
* @len: max length of the string to check
|
2008-01-08 03:31:58 +08:00
|
|
|
*/
|
2015-11-04 21:23:51 +08:00
|
|
|
bool audit_string_contains_control(const char *string, size_t len)
|
2008-01-08 03:31:58 +08:00
|
|
|
{
|
|
|
|
const unsigned char *p;
|
2009-03-19 21:48:27 +08:00
|
|
|
for (p = string; p < (const unsigned char *)string + len; p++) {
|
2008-07-23 05:06:13 +08:00
|
|
|
if (*p == '"' || *p < 0x21 || *p > 0x7e)
|
2015-11-04 21:23:51 +08:00
|
|
|
return true;
|
2008-01-08 03:31:58 +08:00
|
|
|
}
|
2015-11-04 21:23:51 +08:00
|
|
|
return false;
|
2008-01-08 03:31:58 +08:00
|
|
|
}
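/*
 * Illustration only (not part of audit.c): a user-space copy of the check
 * above, to show which inputs are flagged. demo_needs_hex() is a
 * hypothetical name. A double quote, a space, any control character, or
 * any byte above 0x7e returns true; a plain path made only of printable,
 * non-space ASCII returns false.
 */

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if the string cannot be logged in quotes. */
static bool demo_needs_hex(const char *string, size_t len)
{
	const unsigned char *p = (const unsigned char *)string;
	const unsigned char *end = p + len;

	for (; p < end; p++) {
		if (*p == '"' || *p < 0x21 || *p > 0x7e)
			return true;
	}
	return false;
}
```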
|
|
|
|
|
2005-09-14 03:47:11 +08:00
|
|
|
/**
|
Audit: add TTY input auditing
Add TTY input auditing, used to audit system administrator's actions. This is
required by various security standards such as DCID 6/3 and PCI to provide
non-repudiation of administrator's actions and to allow a review of past
actions if the administrator seems to overstep their duties or if the system
becomes misconfigured for unknown reasons. These requirements do not make it
necessary to audit TTY output as well.
Compared to an user-space keylogger, this approach records TTY input using the
audit subsystem, correlated with other audit events, and it is completely
transparent to the user-space application (e.g. the console ioctls still
work).
TTY input auditing works on a higher level than auditing all system calls
within the session, which would produce an overwhelming amount of mostly
useless audit events.
Add an "audit_tty" attribute, inherited across fork (). Data read from TTYs
by process with the attribute is sent to the audit subsystem by the kernel.
The audit netlink interface is extended to allow modifying the audit_tty
attribute, and to allow sending explanatory audit events from user-space (for
example, a shell might send an event containing the final command, after the
interactive command-line editing and history expansion is performed, which
might be difficult to decipher from the TTY input alone).
Because the "audit_tty" attribute is inherited across fork (), it would be set
e.g. for sshd restarted within an audited session. To prevent this, the
audit_tty attribute is cleared when a process with no open TTY file
descriptors (e.g. after daemon startup) opens a TTY.
See https://www.redhat.com/archives/linux-audit/2007-June/msg00000.html for a
more detailed rationale document for an older version of this patch.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Miloslav Trmac <mitr@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 14:40:56 +08:00
|
|
|
* audit_log_n_untrustedstring - log a string that may contain random characters
|
2005-09-14 03:47:11 +08:00
|
|
|
* @ab: audit_buffer
|
2008-03-29 05:15:56 +08:00
|
|
|
* @len: length of string (not including trailing null)
|
2005-09-14 03:47:11 +08:00
|
|
|
* @string: string to be logged
|
|
|
|
*
|
|
|
|
* This code will escape a string that is passed to it if the string
|
|
|
|
* contains a control character, unprintable character, double quote mark,
|
2005-05-19 17:24:22 +08:00
|
|
|
* or a space. Unescaped strings will start and end with a double quote mark.
|
2005-09-14 03:47:11 +08:00
|
|
|
* Strings that are escaped are printed in hex (2 digits per char).
|
2006-06-09 11:19:31 +08:00
|
|
|
*
|
|
|
|
* The caller specifies the number of characters in the string to log, which may
|
|
|
|
* or may not be the entire string.
|
2005-09-14 03:47:11 +08:00
|
|
|
*/
|
2008-04-18 22:12:59 +08:00
|
|
|
void audit_log_n_untrustedstring(struct audit_buffer *ab, const char *string,
|
|
|
|
size_t len)
|
2005-04-29 22:54:44 +08:00
|
|
|
{
|
2008-01-08 03:31:58 +08:00
|
|
|
if (audit_string_contains_control(string, len))
|
2008-04-18 22:12:59 +08:00
|
|
|
audit_log_n_hex(ab, string, len);
|
2008-01-08 03:31:58 +08:00
|
|
|
else
|
2008-04-18 22:12:59 +08:00
|
|
|
audit_log_n_string(ab, string, len);
|
2005-04-29 22:54:44 +08:00
|
|
|
}
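The choice between the two encodings described in the comment above can be sketched as a user-space function that writes into a plain char array. `has_unsafe_byte()` and `encode_untrusted()` are hypothetical names, and the use of uppercase hex digits is an assumption of this sketch rather than something shown in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns nonzero if any of the first len bytes would force hex output. */
static int has_unsafe_byte(const char *s, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)s[i];

		if (c == '"' || c < 0x21 || c > 0x7e)
			return 1;
	}
	return 0;
}

/*
 * Encode len bytes of s into out: wrapped in double quotes if every byte
 * is safe, otherwise as two hex digits per byte with no quotes.
 * Returns the number of characters written (excluding the NUL),
 * or 0 if out is too small.
 */
static size_t encode_untrusted(char *out, size_t outsz,
			       const char *s, size_t len)
{
	static const char hex[] = "0123456789ABCDEF";	/* assumed case */
	size_t i, n = 0;

	assert(out && s);
	if (has_unsafe_byte(s, len)) {
		if (outsz == 0 || len > (outsz - 1) / 2)
			return 0;
		for (i = 0; i < len; i++) {
			unsigned char c = (unsigned char)s[i];

			out[n++] = hex[c >> 4];
			out[n++] = hex[c & 0x0f];
		}
	} else {
		if (outsz < 3 || len > outsz - 3)
			return 0;
		out[n++] = '"';
		memcpy(out + n, s, len);
		n += len;
		out[n++] = '"';
	}
	out[n] = '\0';
	return n;
}
```

A reader of the log can tell the two forms apart because only the unescaped form starts with a double quote.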
|
|
|
|
|
2006-06-09 11:19:31 +08:00
|
|
|
/**
|
2007-07-16 14:40:56 +08:00
|
|
|
* audit_log_untrustedstring - log a string that may contain random characters
|
2006-06-09 11:19:31 +08:00
|
|
|
* @ab: audit_buffer
|
|
|
|
* @string: string to be logged
|
|
|
|
*
|
2007-07-16 14:40:56 +08:00
|
|
|
* Same as audit_log_n_untrustedstring(), except that strlen is used to
|
2006-06-09 11:19:31 +08:00
|
|
|
* determine string length.
|
|
|
|
*/
|
2008-01-08 03:31:58 +08:00
|
|
|
void audit_log_untrustedstring(struct audit_buffer *ab, const char *string)
|
2006-06-09 11:19:31 +08:00
|
|
|
{
|
2008-04-18 22:12:59 +08:00
|
|
|
audit_log_n_untrustedstring(ab, string, strlen(string));
|
2006-06-09 11:19:31 +08:00
|
|
|
}
|
|
|
|
|
2005-05-19 17:24:22 +08:00
|
|
|
/* Helper function to print the escaped d_path */
|
2005-04-17 06:20:36 +08:00
|
|
|
void audit_log_d_path(struct audit_buffer *ab, const char *prefix,
|
2012-03-15 09:48:20 +08:00
|
|
|
const struct path *path)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2008-02-15 11:38:33 +08:00
|
|
|
char *p, *pathname;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2005-05-06 22:54:17 +08:00
|
|
|
if (prefix)
|
2012-01-07 06:07:10 +08:00
|
|
|
audit_log_format(ab, "%s", prefix);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2005-05-19 17:24:22 +08:00
|
|
|
/* We will allow 11 spaces for ' (deleted)' to be appended */
|
2008-02-15 11:38:33 +08:00
|
|
|
pathname = kmalloc(PATH_MAX+11, ab->gfp_mask);
|
|
|
|
if (!pathname) {
|
2009-03-11 06:00:14 +08:00
|
|
|
audit_log_string(ab, "<no_memory>");
|
2005-05-19 17:24:22 +08:00
|
|
|
return;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
2008-02-15 11:38:44 +08:00
|
|
|
p = d_path(path, pathname, PATH_MAX+11);
|
2005-05-19 17:24:22 +08:00
|
|
|
if (IS_ERR(p)) { /* Should never happen since we send PATH_MAX */
|
|
|
|
/* FIXME: can we save some information here? */
|
2009-03-11 06:00:14 +08:00
|
|
|
audit_log_string(ab, "<too_long>");
|
2007-10-18 18:06:10 +08:00
|
|
|
} else
|
2005-05-19 17:24:22 +08:00
|
|
|
audit_log_untrustedstring(ab, p);
|
2008-02-15 11:38:33 +08:00
|
|
|
kfree(pathname);
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
|
2013-04-30 21:53:34 +08:00
|
|
|
void audit_log_session_info(struct audit_buffer *ab)
|
|
|
|
{
|
2013-11-28 06:35:17 +08:00
|
|
|
unsigned int sessionid = audit_get_sessionid(current);
|
2013-04-30 21:53:34 +08:00
|
|
|
uid_t auid = from_kuid(&init_user_ns, audit_get_loginuid(current));
|
|
|
|
|
2013-09-18 23:17:43 +08:00
|
|
|
audit_log_format(ab, " auid=%u ses=%u", auid, sessionid);
|
2013-04-30 21:53:34 +08:00
|
|
|
}
|
|
|
|
|
2009-06-12 02:31:37 +08:00
|
|
|
void audit_log_key(struct audit_buffer *ab, char *key)
|
|
|
|
{
|
|
|
|
audit_log_format(ab, " key=");
|
|
|
|
if (key)
|
|
|
|
audit_log_untrustedstring(ab, key);
|
|
|
|
else
|
|
|
|
audit_log_format(ab, "(null)");
|
|
|
|
}
|
|
|
|
|
2013-05-01 03:30:32 +08:00
|
|
|
void audit_log_cap(struct audit_buffer *ab, char *prefix, kernel_cap_t *cap)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
audit_log_format(ab, " %s=", prefix);
|
|
|
|
CAP_FOR_EACH_U32(i) {
|
|
|
|
audit_log_format(ab, "%08x",
|
2014-07-24 03:36:26 +08:00
|
|
|
cap->cap[CAP_LAST_U32 - i]);
|
2013-05-01 03:30:32 +08:00
|
|
|
}
|
|
|
|
}
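audit_log_cap() walks the capability words from the highest index down, so the most significant 32 bits are printed first. A user-space sketch of that ordering follows. `CAP_WORDS` and `format_caps()` are hypothetical names, and the two-word set size is an assumption of this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CAP_WORDS 2	/* assumed number of 32-bit words in a capability set */

/*
 * Write " <prefix>=" followed by every word as 8 hex digits, highest
 * word first. Returns the number of characters written, or -1 if out
 * is too small.
 */
static int format_caps(char *out, size_t outsz, const char *prefix,
		       const uint32_t cap[CAP_WORDS])
{
	int n, m, i;

	assert(out && prefix && cap);
	assert(strlen(prefix) > 0);

	n = snprintf(out, outsz, " %s=", prefix);
	if (n < 0 || (size_t)n >= outsz)
		return -1;

	for (i = 0; i < CAP_WORDS; i++) {
		/* index CAP_WORDS - 1 - i: last word comes out first */
		m = snprintf(out + n, outsz - (size_t)n, "%08x",
			     (unsigned int)cap[CAP_WORDS - 1 - i]);
		if (m < 0 || (size_t)m >= outsz - (size_t)n)
			return -1;
		n += m;
	}
	return n;
}
```

With cap[0] = 0x1 and cap[1] = 0x20 the field reads cap_fp=0000002000000001, which is the set viewed as a single 64-bit hex number.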
|
|
|
|
|
2014-05-26 23:02:48 +08:00
|
|
|
static void audit_log_fcaps(struct audit_buffer *ab, struct audit_names *name)
|
2013-05-01 03:30:32 +08:00
|
|
|
{
|
2017-04-21 01:07:30 +08:00
|
|
|
audit_log_cap(ab, "cap_fp", &name->fcap.permitted);
|
|
|
|
audit_log_cap(ab, "cap_fi", &name->fcap.inheritable);
|
|
|
|
audit_log_format(ab, " cap_fe=%d cap_fver=%x",
|
|
|
|
name->fcap.fE, name->fcap_ver);
|
2013-05-01 03:30:32 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
static inline int audit_copy_fcaps(struct audit_names *name,
|
|
|
|
const struct dentry *dentry)
|
|
|
|
{
|
|
|
|
struct cpu_vfs_cap_data caps;
|
|
|
|
int rc;
|
|
|
|
|
|
|
|
if (!dentry)
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
rc = get_vfs_caps_from_disk(dentry, &caps);
|
|
|
|
if (rc)
|
|
|
|
return rc;
|
|
|
|
|
|
|
|
name->fcap.permitted = caps.permitted;
|
|
|
|
name->fcap.inheritable = caps.inheritable;
|
|
|
|
name->fcap.fE = !!(caps.magic_etc & VFS_CAP_FLAGS_EFFECTIVE);
|
|
|
|
name->fcap_ver = (caps.magic_etc & VFS_CAP_REVISION_MASK) >>
|
|
|
|
VFS_CAP_REVISION_SHIFT;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
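The two bit operations at the end of audit_copy_fcaps() split one 32-bit word into a revision number and an "effective" flag. The sketch below reproduces just that step. The `SK_` constants are assumed to mirror the values in the kernel's capability header and are redefined here so the example builds anywhere; `struct fcap_summary` and `decode_magic_etc()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the VFS capability layout; local copies for the sketch. */
#define SK_VFS_CAP_REVISION_MASK	0xFF000000u
#define SK_VFS_CAP_REVISION_SHIFT	24
#define SK_VFS_CAP_FLAGS_EFFECTIVE	0x000001u

struct fcap_summary {
	unsigned int fE;	/* 1 if the effective flag is set */
	unsigned int version;	/* revision number from the top byte */
};

static struct fcap_summary decode_magic_etc(uint32_t magic_etc)
{
	struct fcap_summary s;

	/* the mask and the shift must describe the same byte */
	assert((SK_VFS_CAP_REVISION_MASK >> SK_VFS_CAP_REVISION_SHIFT) == 0xFFu);

	s.fE = !!(magic_etc & SK_VFS_CAP_FLAGS_EFFECTIVE);
	s.version = (magic_etc & SK_VFS_CAP_REVISION_MASK) >>
		    SK_VFS_CAP_REVISION_SHIFT;
	return s;
}
```

The double negation collapses any nonzero masked value to exactly 1, which is what lets the field be printed with %d as cap_fe=0 or cap_fe=1.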
|
|
|
|
|
|
|
|
/* Copy inode data into an audit_names. */
|
|
|
|
void audit_copy_inode(struct audit_names *name, const struct dentry *dentry,
|
2015-12-25 00:09:39 +08:00
|
|
|
struct inode *inode)
|
2013-05-01 03:30:32 +08:00
|
|
|
{
|
|
|
|
name->ino = inode->i_ino;
|
|
|
|
name->dev = inode->i_sb->s_dev;
|
|
|
|
name->mode = inode->i_mode;
|
|
|
|
name->uid = inode->i_uid;
|
|
|
|
name->gid = inode->i_gid;
|
|
|
|
name->rdev = inode->i_rdev;
|
|
|
|
security_inode_getsecid(inode, &name->osid);
|
|
|
|
audit_copy_fcaps(name, dentry);
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* audit_log_name - produce AUDIT_PATH record from struct audit_names
|
|
|
|
* @context: audit_context for the task
|
|
|
|
* @n: audit_names structure with reportable details
|
|
|
|
* @path: optional path to report instead of audit_names->name
|
|
|
|
* @record_num: record number to report when handling a list of names
|
|
|
|
* @call_panic: optional pointer to int that will be updated if secid fails
|
|
|
|
*/
|
|
|
|
void audit_log_name(struct audit_context *context, struct audit_names *n,
|
2016-11-21 09:36:51 +08:00
|
|
|
const struct path *path, int record_num, int *call_panic)
|
2013-05-01 03:30:32 +08:00
|
|
|
{
|
|
|
|
struct audit_buffer *ab;
|
|
|
|
ab = audit_log_start(context, GFP_KERNEL, AUDIT_PATH);
|
|
|
|
if (!ab)
|
|
|
|
return;
|
|
|
|
|
|
|
|
audit_log_format(ab, "item=%d", record_num);
|
|
|
|
|
|
|
|
if (path)
|
|
|
|
audit_log_d_path(ab, " name=", path);
|
|
|
|
else if (n->name) {
|
|
|
|
switch (n->name_len) {
|
|
|
|
case AUDIT_NAME_FULL:
|
|
|
|
/* log the full path */
|
|
|
|
audit_log_format(ab, " name=");
|
|
|
|
audit_log_untrustedstring(ab, n->name->name);
|
|
|
|
break;
|
|
|
|
case 0:
|
|
|
|
/* name was specified as a relative path and the
|
|
|
|
* directory component is the cwd */
|
|
|
|
audit_log_d_path(ab, " name=", &context->pwd);
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
/* log the name's directory component */
|
|
|
|
audit_log_format(ab, " name=");
|
|
|
|
audit_log_n_untrustedstring(ab, n->name->name,
|
|
|
|
n->name_len);
|
|
|
|
}
|
|
|
|
} else
|
|
|
|
audit_log_format(ab, " name=(null)");
|
|
|
|
|
2015-09-09 04:34:59 +08:00
|
|
|
if (n->ino != AUDIT_INO_UNSET)
|
2013-05-01 03:30:32 +08:00
|
|
|
audit_log_format(ab, " inode=%lu"
|
|
|
|
" dev=%02x:%02x mode=%#ho"
|
|
|
|
" ouid=%u ogid=%u rdev=%02x:%02x",
|
|
|
|
n->ino,
|
|
|
|
MAJOR(n->dev),
|
|
|
|
MINOR(n->dev),
|
|
|
|
n->mode,
|
|
|
|
from_kuid(&init_user_ns, n->uid),
|
|
|
|
from_kgid(&init_user_ns, n->gid),
|
|
|
|
MAJOR(n->rdev),
|
|
|
|
MINOR(n->rdev));
|
|
|
|
if (n->osid != 0) {
|
|
|
|
char *ctx = NULL;
|
|
|
|
u32 len;
|
|
|
|
if (security_secid_to_secctx(
|
|
|
|
n->osid, &ctx, &len)) {
|
|
|
|
audit_log_format(ab, " osid=%u", n->osid);
|
|
|
|
if (call_panic)
|
|
|
|
*call_panic = 2;
|
|
|
|
} else {
|
|
|
|
audit_log_format(ab, " obj=%s", ctx);
|
|
|
|
security_release_secctx(ctx, len);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-05-08 22:32:23 +08:00
|
|
|
/* log the audit_names record type */
|
|
|
|
audit_log_format(ab, " nametype=");
|
|
|
|
switch (n->type) {
|
|
|
|
case AUDIT_TYPE_NORMAL:
|
|
|
|
audit_log_format(ab, "NORMAL");
|
|
|
|
break;
|
|
|
|
case AUDIT_TYPE_PARENT:
|
|
|
|
audit_log_format(ab, "PARENT");
|
|
|
|
break;
|
|
|
|
case AUDIT_TYPE_CHILD_DELETE:
|
|
|
|
audit_log_format(ab, "DELETE");
|
|
|
|
break;
|
|
|
|
case AUDIT_TYPE_CHILD_CREATE:
|
|
|
|
audit_log_format(ab, "CREATE");
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
audit_log_format(ab, "UNKNOWN");
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
2013-05-01 03:30:32 +08:00
|
|
|
audit_log_fcaps(ab, n);
|
|
|
|
audit_log_end(ab);
|
|
|
|
}
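The name_len switch in audit_log_name() selects one of three sources for the name= field. A user-space sketch of that selection is shown below. `SK_NAME_FULL`, `struct name_slice` and `pick_name()` are hypothetical, and the value -1 for the "full name" marker is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_NAME_FULL (-1)	/* assumed marker meaning "log the whole name" */

struct name_slice {
	const char *ptr;
	size_t len;
};

/*
 * Decide which bytes would be logged:
 *   SK_NAME_FULL -> the entire name
 *   0            -> the current working directory
 *   n > 0        -> only the first n bytes (the directory component)
 */
static struct name_slice pick_name(const char *name, int name_len,
				   const char *cwd)
{
	struct name_slice s;

	assert(name && cwd);
	switch (name_len) {
	case SK_NAME_FULL:
		s.ptr = name;
		s.len = strlen(name);
		break;
	case 0:
		s.ptr = cwd;
		s.len = strlen(cwd);
		break;
	default:
		assert(name_len > 0);
		s.ptr = name;
		s.len = (size_t)name_len;
		break;
	}
	return s;
}
```

In the third case the slice is not NUL-terminated at its logical end, which is why the kernel code passes an explicit length to audit_log_n_untrustedstring() there.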
|
|
|
|
|
|
|
|
int audit_log_task_context(struct audit_buffer *ab)
|
|
|
|
{
|
|
|
|
char *ctx = NULL;
|
|
|
|
unsigned len;
|
|
|
|
int error;
|
|
|
|
u32 sid;
|
|
|
|
|
|
|
|
security_task_getsecid(current, &sid);
|
|
|
|
if (!sid)
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
error = security_secid_to_secctx(sid, &ctx, &len);
|
|
|
|
if (error) {
|
|
|
|
if (error != -EINVAL)
|
|
|
|
goto error_path;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
audit_log_format(ab, " subj=%s", ctx);
|
|
|
|
security_release_secctx(ctx, len);
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
error_path:
|
|
|
|
audit_panic("error in audit_log_task_context");
|
|
|
|
return error;
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(audit_log_task_context);
|
|
|
|
|
2015-02-23 10:20:00 +08:00
|
|
|
void audit_log_d_path_exe(struct audit_buffer *ab,
|
|
|
|
struct mm_struct *mm)
|
|
|
|
{
|
2015-02-23 10:20:09 +08:00
|
|
|
struct file *exe_file;
|
|
|
|
|
|
|
|
if (!mm)
|
|
|
|
goto out_null;
|
2015-02-23 10:20:00 +08:00
|
|
|
|
2015-02-23 10:20:09 +08:00
|
|
|
exe_file = get_mm_exe_file(mm);
|
|
|
|
if (!exe_file)
|
|
|
|
goto out_null;
|
|
|
|
|
|
|
|
audit_log_d_path(ab, " exe=", &exe_file->f_path);
|
|
|
|
fput(exe_file);
|
|
|
|
return;
|
|
|
|
out_null:
|
|
|
|
audit_log_format(ab, " exe=(null)");
|
2015-02-23 10:20:00 +08:00
|
|
|
}
|
|
|
|
|
2016-06-29 00:07:50 +08:00
|
|
|
struct tty_struct *audit_get_tty(struct task_struct *tsk)
|
|
|
|
{
|
|
|
|
struct tty_struct *tty = NULL;
|
|
|
|
unsigned long flags;
|
|
|
|
|
|
|
|
spin_lock_irqsave(&tsk->sighand->siglock, flags);
|
|
|
|
if (tsk->signal)
|
|
|
|
tty = tty_kref_get(tsk->signal->tty);
|
|
|
|
spin_unlock_irqrestore(&tsk->sighand->siglock, flags);
|
|
|
|
return tty;
|
|
|
|
}
|
|
|
|
|
|
|
|
void audit_put_tty(struct tty_struct *tty)
|
|
|
|
{
|
|
|
|
tty_kref_put(tty);
|
|
|
|
}
|
|
|
|
|
2013-05-01 03:30:32 +08:00
|
|
|
void audit_log_task_info(struct audit_buffer *ab, struct task_struct *tsk)
|
|
|
|
{
|
|
|
|
const struct cred *cred;
|
2014-03-16 06:42:34 +08:00
|
|
|
char comm[sizeof(tsk->comm)];
|
2016-04-22 02:14:01 +08:00
|
|
|
struct tty_struct *tty;
|
2013-05-01 03:30:32 +08:00
|
|
|
|
|
|
|
if (!ab)
|
|
|
|
return;
|
|
|
|
|
|
|
|
/* tsk == current */
|
|
|
|
cred = current_cred();
|
2016-04-22 02:14:01 +08:00
|
|
|
tty = audit_get_tty(tsk);
|
2013-05-01 03:30:32 +08:00
|
|
|
audit_log_format(ab,
|
2013-12-11 11:10:41 +08:00
|
|
|
" ppid=%d pid=%d auid=%u uid=%u gid=%u"
|
2013-05-01 03:30:32 +08:00
|
|
|
" euid=%u suid=%u fsuid=%u"
|
2013-07-15 22:23:11 +08:00
|
|
|
" egid=%u sgid=%u fsgid=%u tty=%s ses=%u",
|
2013-12-11 11:10:41 +08:00
|
|
|
task_ppid_nr(tsk),
|
2016-08-31 05:19:13 +08:00
|
|
|
task_tgid_nr(tsk),
|
2013-05-01 03:30:32 +08:00
|
|
|
from_kuid(&init_user_ns, audit_get_loginuid(tsk)),
|
|
|
|
from_kuid(&init_user_ns, cred->uid),
|
|
|
|
from_kgid(&init_user_ns, cred->gid),
|
|
|
|
from_kuid(&init_user_ns, cred->euid),
|
|
|
|
from_kuid(&init_user_ns, cred->suid),
|
|
|
|
from_kuid(&init_user_ns, cred->fsuid),
|
|
|
|
from_kgid(&init_user_ns, cred->egid),
|
|
|
|
from_kgid(&init_user_ns, cred->sgid),
|
|
|
|
from_kgid(&init_user_ns, cred->fsgid),
|
2016-04-22 02:14:01 +08:00
|
|
|
tty ? tty_name(tty) : "(none)",
|
|
|
|
audit_get_sessionid(tsk));
|
|
|
|
audit_put_tty(tty);
|
2013-05-01 03:30:32 +08:00
|
|
|
audit_log_format(ab, " comm=");
|
2014-03-16 06:42:34 +08:00
|
|
|
audit_log_untrustedstring(ab, get_task_comm(comm, tsk));
|
2015-02-23 10:20:00 +08:00
|
|
|
audit_log_d_path_exe(ab, tsk->mm);
|
2013-05-01 03:30:32 +08:00
|
|
|
audit_log_task_context(ab);
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(audit_log_task_info);
|
|
|
|
|
2012-07-26 08:29:08 +08:00
|
|
|
/**
|
|
|
|
* audit_log_link_denied - report a link restriction denial
|
2015-05-23 13:10:27 +08:00
|
|
|
* @operation: specific link operation
|
2012-07-26 08:29:08 +08:00
|
|
|
* @link: the path that triggered the restriction
|
|
|
|
*/
|
2016-11-21 09:36:51 +08:00
|
|
|
void audit_log_link_denied(const char *operation, const struct path *link)
|
2012-07-26 08:29:08 +08:00
|
|
|
{
|
|
|
|
struct audit_buffer *ab;
|
2013-05-01 03:30:32 +08:00
|
|
|
struct audit_names *name;
|
|
|
|
|
|
|
|
name = kzalloc(sizeof(*name), GFP_NOFS);
|
|
|
|
if (!name)
|
|
|
|
return;
|
2012-07-26 08:29:08 +08:00
|
|
|
|
2013-05-01 03:30:32 +08:00
|
|
|
/* Generate AUDIT_ANOM_LINK with subject, operation, outcome. */
|
2012-07-26 08:29:08 +08:00
|
|
|
ab = audit_log_start(current->audit_context, GFP_KERNEL,
|
|
|
|
AUDIT_ANOM_LINK);
|
2012-10-05 07:57:31 +08:00
|
|
|
if (!ab)
|
2013-05-01 03:30:32 +08:00
|
|
|
goto out;
|
|
|
|
audit_log_format(ab, "op=%s", operation);
|
|
|
|
audit_log_task_info(ab, current);
|
|
|
|
audit_log_format(ab, " res=0");
|
2012-07-26 08:29:08 +08:00
|
|
|
audit_log_end(ab);
|
2013-05-01 03:30:32 +08:00
|
|
|
|
|
|
|
/* Generate AUDIT_PATH record with object. */
|
|
|
|
name->type = AUDIT_TYPE_NORMAL;
|
2015-03-18 06:26:21 +08:00
|
|
|
audit_copy_inode(name, link->dentry, d_backing_inode(link->dentry));
|
2013-05-01 03:30:32 +08:00
|
|
|
audit_log_name(current->audit_context, name, link, 0, NULL);
|
|
|
|
out:
|
|
|
|
kfree(name);
|
2012-07-26 08:29:08 +08:00
|
|
|
}
|
|
|
|
|
2005-09-14 03:47:11 +08:00
|
|
|
/**
|
|
|
|
* audit_log_end - end one audit record
|
|
|
|
* @ab: the audit_buffer
|
|
|
|
*
|
2016-11-30 05:53:24 +08:00
|
|
|
* We cannot do a netlink send inside an irq context because it blocks (last
|
|
|
|
* arg, flags, is not set to MSG_DONTWAIT), so the audit buffer is placed on a
|
|
|
|
* queue and the kauditd thread is woken to remove it from the queue outside
|
|
|
|
* the irq context. May be called in any context.
|
2005-09-14 03:47:11 +08:00
|
|
|
*/
|
2005-05-19 17:56:58 +08:00
|
|
|
void audit_log_end(struct audit_buffer *ab)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
audit: fix auditd/kernel connection state tracking
What started as a rather straightforward race condition reported by
Dmitry using the syzkaller fuzzer ended up revealing some major
problems with how the audit subsystem managed its netlink sockets and
its connection with the userspace audit daemon. Fixing this properly
had quite the cascading effect and what we are left with is this rather
large and complicated patch. My initial goal was to try and decompose
this patch into multiple smaller patches, but the way these changes
are intertwined makes it difficult to split these changes into
meaningful pieces that don't break or somehow make things worse for
the intermediate states.
The patch makes a number of changes, but the most significant are
highlighted below:
* The auditd tracking variables, e.g. audit_sock, are now gone and
replaced by a RCU/spin_lock protected variable auditd_conn which is
a structure containing all of the auditd tracking information.
* We no longer track the auditd sock directly, instead we track it
via the network namespace in which it resides and we use the audit
socket associated with that namespace. In spirit, this is what the
code was trying to do prior to this patch (at least I think that is
what the original authors intended), but it was done rather poorly
and added a layer of obfuscation that only masked the underlying
problems.
* Big backlog queue cleanup, again. In v4.10 we made some pretty big
changes to how the audit backlog queues work, here we haven't changed
the queue design so much as cleaned up the implementation. Brought
about by the locking changes, we've simplified kauditd_thread() quite
a bit by consolidating the queue handling into a new helper function,
kauditd_send_queue(), which allows us to eliminate a lot of very
similar code and makes the looping logic in kauditd_thread() clearer.
* All netlink messages sent to auditd are now sent via
auditd_send_unicast_skb(). Other than just making sense, this makes
the lock handling easier.
* Change the audit_log_start() sleep behavior so that we never sleep
on auditd events (unchanged) or if the caller is holding the
audit_cmd_mutex (changed). Previously we didn't sleep if the caller
was auditd or if the message type fell between a certain range; the
type check was a poor effort of doing what the cmd_mutex check now
does. Richard Guy Briggs originally proposed not sleeping the
cmd_mutex owner several years ago but his patch wasn't acceptable
at the time. At least the idea lives on here.
* A problem with the lost record counter has been resolved. Steve
Grubb and I both happened to notice this problem and according to
some quick testing by Steve, this problem goes back quite some time.
It's largely a harmless problem, although it may have left some
careful sysadmins quite puzzled.
Cc: <stable@vger.kernel.org> # 4.10.x-
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-21 23:26:35 +08:00
|
|
|
struct sk_buff *skb;
|
|
|
|
struct nlmsghdr *nlh;
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
if (!ab)
|
|
|
|
return;
|
2017-03-21 23:26:35 +08:00
|
|
|
|
|
|
|
if (audit_rate_check()) {
|
|
|
|
skb = ab->skb;
|
2008-04-18 22:02:28 +08:00
|
|
|
ab->skb = NULL;
|
2017-03-21 23:26:35 +08:00
|
|
|
|
|
|
|
/* set up the netlink header, see the comments in
|
|
|
|
* kauditd_send_multicast_skb() for length quirks */
|
|
|
|
nlh = nlmsg_hdr(skb);
|
|
|
|
nlh->nlmsg_len = skb->len - NLMSG_HDRLEN;
|
|
|
|
|
|
|
|
/* queue the netlink packet and poke the kauditd thread */
|
|
|
|
skb_queue_tail(&audit_queue, skb);
|
|
|
|
wake_up_interruptible(&kauditd_wait);
|
|
|
|
} else
|
|
|
|
audit_log_lost("rate limit exceeded");
|
|
|
|
|
2005-05-06 22:53:34 +08:00
|
|
|
audit_buffer_free(ab);
|
2005-04-17 06:20:36 +08:00
|
|
|
}
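audit_log_end() detaches the packet from the buffer before queueing it, so that freeing the buffer afterwards cannot free a packet the queue now owns. The sketch below shows that ownership hand-off in user space. Every name in it (`struct pkt`, `struct pkt_queue`, `struct rec_buf`, `rec_start()`, `rec_end()`, `lost_count`) is hypothetical, the rate check is reduced to a flag passed by the caller, and there is no locking or consumer thread.

```c
#include <assert.h>
#include <stdlib.h>

struct pkt {
	struct pkt *next;
	int id;
};

struct pkt_queue {
	struct pkt *head, *tail;
	unsigned int len;
};

struct rec_buf {
	struct pkt *pkt;	/* NULL once ownership has moved to a queue */
};

static unsigned int lost_count;

static struct rec_buf *rec_start(int id)
{
	struct rec_buf *rb = malloc(sizeof(*rb));

	if (!rb)
		return NULL;
	rb->pkt = malloc(sizeof(*rb->pkt));
	if (!rb->pkt) {
		free(rb);
		return NULL;
	}
	rb->pkt->next = NULL;
	rb->pkt->id = id;
	return rb;
}

static void pkt_queue_tail(struct pkt_queue *q, struct pkt *p)
{
	p->next = NULL;
	if (q->tail)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	q->len++;
}

static void rec_buf_free(struct rec_buf *rb)
{
	if (!rb)
		return;
	free(rb->pkt);	/* no-op if the packet was handed to a queue */
	free(rb);
}

/* Finish a record: queue its packet if allowed, otherwise count it lost. */
static void rec_end(struct rec_buf *rb, struct pkt_queue *q, int rate_ok)
{
	assert(q);
	if (!rb)
		return;

	if (rate_ok) {
		struct pkt *p = rb->pkt;

		rb->pkt = NULL;		/* detach before the buffer is freed */
		pkt_queue_tail(q, p);
	} else {
		lost_count++;
	}
	rec_buf_free(rb);
}
```

Clearing rb->pkt is what keeps the single rec_buf_free() call safe on both paths: it frees the packet when the record was dropped and leaves it alone when the queue took it.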
|
|
|
|
|
2005-09-14 03:47:11 +08:00
|
|
|
/**
|
|
|
|
* audit_log - Log an audit record
|
|
|
|
* @ctx: audit context
|
|
|
|
* @gfp_mask: type of allocation
|
|
|
|
* @type: audit message type
|
|
|
|
* @fmt: format string to use
|
|
|
|
* @...: variable parameters matching the format string
|
|
|
|
*
|
|
|
|
* This is a convenience function that calls audit_log_start,
|
|
|
|
* audit_log_vformat, and audit_log_end. It may be called
|
|
|
|
* in any context.
|
|
|
|
*/
|
2007-10-18 18:06:10 +08:00
|
|
|
void audit_log(struct audit_context *ctx, gfp_t gfp_mask, int type,
|
2005-06-22 22:04:33 +08:00
|
|
|
const char *fmt, ...)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
|
|
|
struct audit_buffer *ab;
|
|
|
|
va_list args;
|
|
|
|
|
2005-06-22 22:04:33 +08:00
|
|
|
ab = audit_log_start(ctx, gfp_mask, type);
|
2005-04-17 06:20:36 +08:00
|
|
|
if (ab) {
|
|
|
|
va_start(args, fmt);
|
|
|
|
audit_log_vformat(ab, fmt, args);
|
|
|
|
va_end(args);
|
|
|
|
audit_log_end(ab);
|
|
|
|
}
|
|
|
|
}
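The start / vformat / end sequence that audit_log() wraps is a common shape for variadic convenience functions. A minimal user-space sketch follows. `struct log_rec`, `log_start()`, `log_vformat()`, `log_end()` and `log_simple()` are hypothetical stand-ins, the record is a fixed 128-byte array, and no context or allocation mask is modelled.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

struct log_rec {
	char text[128];
	size_t len;
	int open;
};

/* Begin a record; returns NULL if there is nothing to write into. */
static struct log_rec *log_start(struct log_rec *r, int type)
{
	int n;

	if (!r)
		return NULL;
	memset(r, 0, sizeof(*r));
	n = snprintf(r->text, sizeof(r->text), "type=%d ", type);
	if (n < 0 || (size_t)n >= sizeof(r->text))
		return NULL;
	r->len = (size_t)n;
	r->open = 1;
	return r;
}

/* Append formatted text; output that does not fit is truncated. */
static void log_vformat(struct log_rec *r, const char *fmt, va_list args)
{
	size_t room = sizeof(r->text) - r->len;
	int n = vsnprintf(r->text + r->len, room, fmt, args);

	if (n < 0)
		return;
	if ((size_t)n >= room)
		n = (int)(room - 1);
	r->len += (size_t)n;
}

static void log_end(struct log_rec *r)
{
	r->open = 0;
}

/* Convenience wrapper: start, format, end in one call. */
static void log_simple(struct log_rec *r, int type, const char *fmt, ...)
{
	struct log_rec *rec;
	va_list args;

	assert(fmt);
	rec = log_start(r, type);
	if (rec) {
		va_start(args, fmt);
		log_vformat(rec, fmt, args);
		va_end(args);
		log_end(rec);
	}
}
```

As in audit_log(), a failed start makes the whole call a silent no-op, so callers do not have to check for a NULL record themselves.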
|
2006-03-09 07:33:47 +08:00
|
|
|
|
|
|
|
EXPORT_SYMBOL(audit_log_start);
|
|
|
|
EXPORT_SYMBOL(audit_log_end);
|
|
|
|
EXPORT_SYMBOL(audit_log_format);
|
|
|
|
EXPORT_SYMBOL(audit_log);
|