linux

Daniel Borkmann 4f3446bb80 bpf: add generic constant blinding for use in jits

This work adds a generic facility for use from eBPF JIT compilers that allows for further hardening of JIT generated images through blinding constants.

In response to the original work on BPF JIT spraying published by Keegan McAllister [1], most BPF JITs were changed to make images read-only and start at a randomized offset in the page, where the rest was filled with trap instructions. We have this nowadays in x86, arm, arm64 and s390 JIT compilers. Additionally, later work also made eBPF interpreter images read only for kernels supporting DEBUG_SET_MODULE_RONX, that is, x86, arm, arm64 and s390 archs as well currently. This is done by default for mentioned JITs when JITing is enabled. Furthermore, we had a generic and configurable constant blinding facility on our todo for quite some time now to further make spraying harder, and first implementation since around netconf 2016.

We found that for systems where untrusted users can load cBPF/eBPF code where JIT is enabled, start offset randomization helps a bit to make jumps into crafted payload harder, but in case where larger programs that cross page boundary are injected, we again have some part of the program opcodes at a page start offset. With improved guessing and more reliable payload injection, chances can increase to jump into such payload. Elena Reshetova recently wrote a test case for it [2, 3]. Moreover, eBPF comes with 64 bit constants, which can leave some more room for payloads. Note that for all this, additional bugs in the kernel are still required to make the jump (and of course to guess right, to not jump into a trap) and naturally the JIT must be enabled, which is disabled by default.

For helping mitigation, the general idea is to provide an option bpf_jit_harden that admins can tweak along with bpf_jit_enable, so that for cases where JIT should be enabled for performance reasons, the generated image can be further hardened with blinding constants for unprivileged users (bpf_jit_harden == 1), with trading off performance for these, but not for privileged ones. We also added the option of blinding for all users (bpf_jit_harden == 2), which is quite helpful for testing f.e. with test_bpf.ko. There are no further e.g. hardening levels of bpf_jit_harden switch intended, rationale is to have it dead simple to use as on/off.

Since this functionality would need to be duplicated over and over for JIT compilers to use, which are already complex enough, we provide a generic eBPF byte-code level based blinding implementation, which is then just transparently JITed. JIT compilers need to make only a few changes to integrate this facility and can be migrated one by one.

This option is for eBPF JITs and will be used in x86, arm64, s390 without too much effort, and soon ppc64 JITs, thus that native eBPF can be blinded as well as cBPF to eBPF migrations, so that both can be covered with a single implementation. The rule for JITs is that bpf_jit_blind_constants() must be called from bpf_int_jit_compile(), and in case blinding is disabled, we follow normally with JITing the passed program. In case blinding is enabled and we fail during the process of blinding itself, we must return with the interpreter. Similarly, in case the JITing process after the blinding failed, we return normally to the interpreter with the non-blinded code. Meaning, interpreter doesn't change in any way and operates on eBPF code as usual.

For doing this pre-JIT blinding step, we need to make use of a helper/auxiliary register, here BPF_REG_AX. This is strictly internal to the JIT and not in any way part of the eBPF architecture.
Just like in the same way as JITs internally make use of some helper registers when emitting code, only that here the helper register is one abstraction level higher in eBPF bytecode, but nevertheless in JIT phase. That helper register is needed since f.e. manually written program can issue loads to all registers of eBPF architecture.

The core concept with the additional register is: blind out all 32 and 64 bit constants by converting BPF_K based instructions into a small sequence from K_VAL into ((RND ^ K_VAL) ^ RND). Therefore, this is transformed into: BPF_REG_AX := (RND ^ K_VAL), BPF_REG_AX ^= RND, and REG <OP> BPF_REG_AX, so actual operation on the target register is translated from BPF_K into BPF_X one that is operating on BPF_REG_AX's content. During rewriting phase when blinding, RND is newly generated via prandom_u32() for each processed instruction. 64 bit loads are split into two 32 bit loads to make translation and patching not too complex. Only basic thing required by JITs is to call the helper bpf_jit_blind_constants()/bpf_jit_prog_release_other() pair, and to map BPF_REG_AX into an unused register.

Small bpf_jit_disasm extract from [2] when applied to x86 JIT:

  echo 0 > /proc/sys/net/core/bpf_jit_harden

    ffffffffa034f5e9 + <x>:
    [...]
    39:   mov    $0xa8909090,%eax
    3e:   mov    $0xa8909090,%eax
    43:   mov    $0xa8ff3148,%eax
    48:   mov    $0xa89081b4,%eax
    4d:   mov    $0xa8900bb0,%eax
    52:   mov    $0xa810e0c1,%eax
    57:   mov    $0xa8908eb4,%eax
    5c:   mov    $0xa89020b0,%eax
    [...]

  echo 1 > /proc/sys/net/core/bpf_jit_harden

    ffffffffa034f1e5 + <x>:
    [...]
    39:   mov    $0xe1192563,%r10d
    3f:   xor    $0x4989b5f3,%r10d
    46:   mov    %r10d,%eax
    49:   mov    $0xb8296d93,%r10d
    4f:   xor    $0x10b9fd03,%r10d
    56:   mov    %r10d,%eax
    59:   mov    $0x8c381146,%r10d
    5f:   xor    $0x24c7200e,%r10d
    66:   mov    %r10d,%eax
    69:   mov    $0xeb2a830e,%r10d
    6f:   xor    $0x43ba02ba,%r10d
    76:   mov    %r10d,%eax
    79:   mov    $0xd9730af,%r10d
    7f:   xor    $0xa5073b1f,%r10d
    86:   mov    %r10d,%eax
    89:   mov    $0x9a45662b,%r10d
    8f:   xor    $0x325586ea,%r10d
    96:   mov    %r10d,%eax
    [...]

As can be seen, original constants that carry payload are hidden when enabled, actual operations are transformed from constant-based to register-based ones, making jumps into constants ineffective. Above extract/example uses single BPF load instruction over and over, but of course all instructions with constants are blinded.

Performance wise, JIT with blinding performs a bit slower than just JIT and faster than interpreter case. This is expected, since we still get all the performance benefits from JITing and in normal use-cases not every single instruction needs to be blinded. Summing up all 296 test cases averaged over multiple runs from test_bpf.ko suite, interpreter was 55% slower than JIT only and JIT with blinding was 8% slower than JIT only. Since there are also some extremes in the test suite, I expect for ordinary workloads that the performance for the JIT with blinding case is even closer to JIT only case, f.e. nmap test case from suite has averaged timings in ns 29 (JIT), 35 (+ blinding), and 151 (interpreter).

BPF test suite, seccomp test suite, eBPF sample code and various bigger networking eBPF programs have been tested with this and were running fine. For testing purposes, I also adapted interpreter and redirected blinded eBPF image to interpreter and also here all tests pass.

  [1] http://mainisusuallyafunction.blogspot.com/2012/11/attacking-hardened-linux-systems-with.html
  [2] https://github.com/01org/jit-spray-poc-for-ksp/
  [3] http://www.openwall.com/lists/kernel-hardening/2016/05/03/5

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Elena Reshetova <elena.reshetova@intel.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

2016-05-16 13:49:32 -04:00
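
The blinding scheme described in the commit message can be modelled in a few lines of userspace C. This is only an illustrative sketch, not the kernel implementation: the names blinding_wanted(), blind_imm32(), add_blinded() and struct blinded_imm are hypothetical, the privilege check is reduced to a boolean, and RND is passed in by the caller instead of being drawn from prandom_u32(). It shows the bpf_jit_harden policy (0 = off, 1 = unprivileged users only, 2 = all users) and the rewrite of a constant K_VAL into the pair (RND ^ K_VAL, RND) that is recombined in a helper register before the actual operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel API. */

/* Policy from the commit message: blinding needs the JIT enabled and
 * bpf_jit_harden set; level 1 exempts privileged users, level 2 does not. */
static bool blinding_wanted(int jit_enable, int jit_harden, bool privileged)
{
	if (!jit_enable || !jit_harden)
		return false;
	if (jit_harden == 1 && privileged)
		return false;
	return true;
}

/* The two immediates that appear in the rewritten instruction stream. */
struct blinded_imm {
	uint32_t masked; /* RND ^ K_VAL, loaded into the helper register */
	uint32_t rnd;    /* RND, XORed into the helper register afterwards */
};

/* Split K_VAL so that the original constant never appears verbatim. */
static struct blinded_imm blind_imm32(uint32_t k_val, uint32_t rnd)
{
	struct blinded_imm b = { .masked = rnd ^ k_val, .rnd = rnd };
	return b;
}

/* Replay the rewritten sequence for an ADD with a constant operand:
 * AX := (RND ^ K_VAL); AX ^= RND; dst += AX. */
static uint32_t add_blinded(uint32_t dst, struct blinded_imm b)
{
	uint32_t ax = b.masked; /* stands in for BPF_REG_AX */

	ax ^= b.rnd;            /* recovers K_VAL inside the register */
	return dst + ax;        /* register-based form of the operation */
}
```

As a cross-check against the disassembly above, blind_imm32(0xa8909090, 0x4989b5f3) yields a masked value of 0xe1192563, which is the mov/xor pair shown at offsets 39 and 3f of the hardened image.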
..
bpf	bpf: add generic constant blinding for use in jits	2016-05-16 13:49:32 -04:00
configs	kconfig: add xenconfig defconfig helper	2015-06-16 11:04:29 +01:00
debug	mm/init: Add 'rodata=off' boot cmdline parameter to disable read-only kernel mappings	2016-02-22 08:51:37 +01:00
events	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-05-15 13:32:48 -04:00
gcov	gcov: use within_module() helper.	2015-12-04 22:46:25 +01:00
irq	genirq: Dont allow affinity mask to be updated on IPIs	2016-04-21 12:05:15 +02:00
livepatch	livepatch/module: remove livepatch module notifier	2016-03-17 09:45:10 +01:00
locking	lockdep: Fix lock_chain::base size	2016-04-23 13:53:03 +02:00
power	Power management and ACPI material for v4.6-rc1, part 2	2016-03-24 22:59:58 -07:00
printk	printk: add clear_idx symbol to vmcoreinfo	2016-03-17 15:09:34 -07:00
rcu	kernel: add kcov code coverage	2016-03-22 15:36:02 -07:00
sched	Revert "sched/fair: Fix fairness issue on migration"	2016-05-11 08:25:53 +02:00
time	timers/nohz: Convert tick dependency mask to atomic_t	2016-03-29 11:52:11 +02:00
trace	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-05-09 15:59:24 -04:00
.gitignore	certs: add .gitignore to stop git nagging about x509_certificate_list	2015-10-21 15:18:35 +01:00
Kconfig.freezer	…
Kconfig.hz	…
Kconfig.locks	locking/qrwlock: Rename QUEUE_RWLOCK to QUEUED_RWLOCKS	2015-05-12 09:46:00 +02:00
Kconfig.preempt	…
Makefile	kernel: add kcov code coverage	2016-03-22 15:36:02 -07:00
acct.c	…
async.c	async: export current_is_async()	2015-11-19 17:51:48 +01:00
audit.c	Merge branch 'stable-4.6' of git://git.infradead.org/users/pcmoore/audit	2016-03-19 17:52:49 -07:00
audit.h	security: Make inode argument of inode_getsecid non-const	2015-12-24 11:09:39 -05:00
audit_fsnotify.c	wrappers for ->i_mutex access	2016-01-22 18:04:28 -05:00
audit_tree.c	audit: audit_tree_match can be boolean	2015-11-04 08:23:51 -05:00
audit_watch.c	Merge branch 'stable-4.6' of git://git.infradead.org/users/pcmoore/audit	2016-03-19 17:52:49 -07:00
auditfilter.c	audit: Fix typo in comment	2016-02-08 11:25:39 -05:00
auditsc.c	auditsc: for seccomp events, log syscall compat state using in_compat_syscall	2016-03-22 15:36:02 -07:00
backtracetest.c	…
bounds.c	…
capability.c	…
cgroup.c	cgroup: fix compile warning	2016-05-12 11:05:27 -04:00
cgroup_freezer.c	cgroup: kill cgrp_ss_priv[CGROUP_CANFORK_COUNT] and friends	2015-12-03 10:24:08 -05:00
cgroup_pids.c	cgroup_pids: fix a typo.	2015-12-14 14:54:37 -05:00
compat.c	compat: cleanup coding in compat_get_bitmap() and compat_put_bitmap()	2015-06-04 23:57:18 +02:00
configs.c	…
context_tracking.c	context_tracking: Switch to new static_branch API	2015-11-24 09:56:43 +01:00
cpu.c	cpu/hotplug: Fix rollback during error-out in __cpu_disable()	2016-04-22 09:49:49 +02:00
cpu_pm.c	kernel/cpu_pm: fix cpu_cluster_pm_exit comment	2015-09-03 02:42:20 +02:00
cpuset.c	cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback	2016-04-25 15:45:14 -04:00
crash_dump.c	…
cred.c	kmemcg: account certain kmem allocations to memcg	2016-01-14 16:00:49 -08:00
delayacct.c	kmemcg: account certain kmem allocations to memcg	2016-01-14 16:00:49 -08:00
dma.c	…
elfcore.c	…
exec_domain.c	…
exit.c	oom: clear TIF_MEMDIE after oom_reaper managed to unmap the address space	2016-03-25 16:37:42 -07:00
extable.c	kernel/extable.c: remove duplicated include	2015-09-10 13:29:01 -07:00
fork.c	kernel: add kcov code coverage	2016-03-22 15:36:02 -07:00
freezer.c	…
futex.c	futex: Acknowledge a new waiter in counter before plist	2016-04-21 11:06:09 +02:00
futex_compat.c	ptrace: use fsuid, fsgid, effective creds for fs access checks	2016-01-20 17:09:18 -08:00
groups.c	…
hung_task.c	kernel/hung_task.c: use timeout diff when timeout is updated	2016-03-22 15:36:02 -07:00
irq_work.c	treewide: Remove old email address	2015-11-23 09:44:58 +01:00
jump_label.c	treewide: Remove old email address	2015-11-23 09:44:58 +01:00
kallsyms.c	kallsyms: add support for relative offsets in kallsyms address table	2016-03-15 16:55:16 -07:00
kcmp.c	ptrace: use fsuid, fsgid, effective creds for fs access checks	2016-01-20 17:09:18 -08:00
kcov.c	kcov: don't profile branches in kcov	2016-04-28 19:34:04 -07:00
kexec.c	kexec: set KEXEC_TYPE_CRASH before sanity_check_segment_list()	2016-01-20 17:09:18 -08:00
kexec_core.c	kexec: export OFFSET(page.compound_head) to find out compound tail page	2016-04-28 19:34:04 -07:00
kexec_file.c	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security	2016-03-17 11:33:45 -07:00
kexec_internal.h	kexec: move some memembers and definitions within the scope of CONFIG_KEXEC_FILE	2016-01-20 17:09:18 -08:00
kmod.c	kmod: don't run async usermode helper as a child of kworker thread	2015-10-23 17:55:10 +09:00
kprobes.c	perf/x86/hw_breakpoints: Disallow kernel breakpoints unless kprobe-safe	2015-08-04 10:16:54 +02:00
ksysfs.c	rcu: Remove TINY_RCU bloat from pointless boot parameters	2015-12-07 16:59:37 -08:00
kthread.c	kernel/kthread.c:kthread_create_on_node(): clarify documentation	2015-09-04 16:54:41 -07:00
latencytop.c	sched/debug: Make schedstats a runtime tunable that is disabled by default	2016-02-09 11:54:23 +01:00
membarrier.c	sys_membarrier(): system-wide memory barrier (generic, x86)	2015-09-11 15:21:34 -07:00
memremap.c	memremap: add MEMREMAP_WC flag	2016-03-22 15:36:02 -07:00
module-internal.h	…
module.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching	2016-03-17 21:46:32 -07:00
module_signing.c	X.509: Make algo identifiers text instead of enum	2016-03-03 21:49:27 +00:00
notifier.c	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2015-09-01 08:40:25 -07:00
nsproxy.c	cgroup: introduce cgroup namespaces	2016-02-16 13:04:58 -05:00
padata.c	…
panic.c	panic: change nmi_panic from macro to function	2016-03-22 15:36:02 -07:00
params.c	Nothing exciting, minor tweaks and cleanups.	2015-11-09 15:53:39 -08:00
pid.c	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-01-31 15:44:04 -08:00
pid_namespace.c	…
profile.c	profile: hide unused functions when !CONFIG_PROC_FS	2016-03-22 15:36:02 -07:00
ptrace.c	ptrace: change __ptrace_unlink() to clear ->ptrace under ->siglock	2016-03-22 15:36:02 -07:00
range.c	…
reboot.c	kexec: split kexec_load syscall from kexec core code	2015-09-10 13:29:01 -07:00
relay.c	wrappers for ->i_mutex access	2016-01-22 18:04:28 -05:00
resource.c	/proc/iomem: only expose physical resource addresses to privileged users	2016-04-14 12:56:09 -07:00
seccomp.c	seccomp: check in_compat_syscall, not is_compat_task, in strict mode	2016-03-22 15:36:02 -07:00
signal.c	kernel/signal.c: add compile-time check for __ARCH_SI_PREAMBLE_SIZE	2016-03-22 15:36:02 -07:00
smp.c	Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-03-15 13:50:29 -07:00
smpboot.c	cpu/hotplug: Unpark smpboot threads from the state machine	2016-03-01 20:36:56 +01:00
smpboot.h	cpu/hotplug: Create hotplug threads	2016-03-01 20:36:56 +01:00
softirq.c	arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections	2016-03-25 16:37:42 -07:00
stacktrace.c	…
stop_machine.c	kernel/stop_machine.c: remove CONFIG_SMP dependencies	2016-01-16 11:17:24 -08:00
sys.c	timer: convert timer_slack_ns from unsigned long to u64	2016-03-17 15:09:34 -07:00
sys_ni.c	vfs: add copy_file_range syscall and vfs helper	2015-12-01 14:00:53 -05:00
sysctl.c	mm: scale kswapd watermarks in proportion to memory	2016-03-17 15:09:34 -07:00
sysctl_binary.c	fs/coredump: prevent fsuid=0 dumps into user-controlled directories	2016-03-22 15:36:02 -07:00
task_work.c	task_work: remove fifo ordering guarantee	2015-09-05 13:46:58 -07:00
taskstats.c	taskstats: use the libnl API to align nlattr on 64-bit	2016-04-23 20:13:25 -04:00
test_kprobes.c	…
torture.c	torture: Consolidate cond_resched_rcu_qs() into stutter_wait()	2015-10-06 11:25:01 -07:00
tracepoint.c	kernel/...: convert pr_warning to pr_warn	2016-03-22 15:36:02 -07:00
tsacct.c	time, acct: Drop irq save & restore from __acct_update_integrals()	2016-02-29 09:53:09 +01:00
uid16.c	…
up.c	…
user-return-notifier.c	…
user.c	…
user_namespace.c	kernel/*: switch to memdup_user_nul()	2016-01-04 10:27:55 -05:00
utsname.c	…
utsname_sysctl.c	…
watchdog.c	watchdog: don't run proc_watchdog_update if new value is same as old	2016-03-17 15:09:34 -07:00
workqueue.c	Merge branch 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq	2016-05-13 16:16:51 -07:00
workqueue_internal.h	sched/core: Get rid of 'cpu' argument in wq_worker_sleeping()	2016-03-02 10:28:47 -05:00