linux/Documentation
Daniel Borkmann 4f3446bb80 bpf: add generic constant blinding for use in jits
This work adds a generic facility for use from eBPF JIT compilers
that allows for further hardening of JIT generated images through
blinding constants. In response to the original work on BPF JIT
spraying published by Keegan McAllister [1], most BPF JITs were
changed to make images read-only and start at a randomized offset
in the page, where the rest was filled with trap instructions. We
have this nowadays in x86, arm, arm64 and s390 JIT compilers.
Additionally, later work also made eBPF interpreter images read
only for kernels supporting DEBUG_SET_MODULE_RONX, that is, x86,
arm, arm64 and s390 archs as well currently. This is done by
default for mentioned JITs when JITing is enabled. Furthermore,
we had a generic and configurable constant blinding facility on our
todo for quite some time now to further make spraying harder, and
first implementation since around netconf 2016.

We found that for systems where untrusted users can load cBPF/eBPF
code where JIT is enabled, start offset randomization helps a bit
to make jumps into crafted payload harder, but in case where larger
programs that cross page boundary are injected, we again have some
part of the program opcodes at a page start offset. With improved
guessing and more reliable payload injection, chances can increase
to jump into such payload. Elena Reshetova recently wrote a test
case for it [2, 3]. Moreover, eBPF comes with 64 bit constants, which
can leave some more room for payloads. Note that for all this,
additional bugs in the kernel are still required to make the jump
(and of course to guess right, to not jump into a trap) and naturally
the JIT must be enabled, which is disabled by default.

For helping mitigation, the general idea is to provide an option
bpf_jit_harden that admins can tweak along with bpf_jit_enable, so
that for cases where JIT should be enabled for performance reasons,
the generated image can be further hardened with blinding constants
for unpriviledged users (bpf_jit_harden == 1), with trading off
performance for these, but not for privileged ones. We also added
the option of blinding for all users (bpf_jit_harden == 2), which
is quite helpful for testing f.e. with test_bpf.ko. There are no
further e.g. hardening levels of bpf_jit_harden switch intended,
rationale is to have it dead simple to use as on/off. Since this
functionality would need to be duplicated over and over for JIT
compilers to use, which are already complex enough, we provide a
generic eBPF byte-code level based blinding implementation, which is
then just transparently JITed. JIT compilers need to make only a few
changes to integrate this facility and can be migrated one by one.

This option is for eBPF JITs and will be used in x86, arm64, s390
without too much effort, and soon ppc64 JITs, thus that native eBPF
can be blinded as well as cBPF to eBPF migrations, so that both can
be covered with a single implementation. The rule for JITs is that
bpf_jit_blind_constants() must be called from bpf_int_jit_compile(),
and in case blinding is disabled, we follow normally with JITing the
passed program. In case blinding is enabled and we fail during the
process of blinding itself, we must return with the interpreter.
Similarly, in case the JITing process after the blinding failed, we
return normally to the interpreter with the non-blinded code. Meaning,
interpreter doesn't change in any way and operates on eBPF code as
usual. For doing this pre-JIT blinding step, we need to make use of
a helper/auxiliary register, here BPF_REG_AX. This is strictly internal
to the JIT and not in any way part of the eBPF architecture. Just like
in the same way as JITs internally make use of some helper registers
when emitting code, only that here the helper register is one
abstraction level higher in eBPF bytecode, but nevertheless in JIT
phase. That helper register is needed since f.e. manually written
program can issue loads to all registers of eBPF architecture.

The core concept with the additional register is: blind out all 32
and 64 bit constants by converting BPF_K based instructions into a
small sequence from K_VAL into ((RND ^ K_VAL) ^ RND). Therefore, this
is transformed into: BPF_REG_AX := (RND ^ K_VAL), BPF_REG_AX ^= RND,
and REG <OP> BPF_REG_AX, so actual operation on the target register
is translated from BPF_K into BPF_X one that is operating on
BPF_REG_AX's content. During rewriting phase when blinding, RND is
newly generated via prandom_u32() for each processed instruction.
64 bit loads are split into two 32 bit loads to make translation and
patching not too complex. Only basic thing required by JITs is to
call the helper bpf_jit_blind_constants()/bpf_jit_prog_release_other()
pair, and to map BPF_REG_AX into an unused register.

Small bpf_jit_disasm extract from [2] when applied to x86 JIT:

echo 0 > /proc/sys/net/core/bpf_jit_harden

  ffffffffa034f5e9 + <x>:
  [...]
  39:   mov    $0xa8909090,%eax
  3e:   mov    $0xa8909090,%eax
  43:   mov    $0xa8ff3148,%eax
  48:   mov    $0xa89081b4,%eax
  4d:   mov    $0xa8900bb0,%eax
  52:   mov    $0xa810e0c1,%eax
  57:   mov    $0xa8908eb4,%eax
  5c:   mov    $0xa89020b0,%eax
  [...]

echo 1 > /proc/sys/net/core/bpf_jit_harden

  ffffffffa034f1e5 + <x>:
  [...]
  39:   mov    $0xe1192563,%r10d
  3f:   xor    $0x4989b5f3,%r10d
  46:   mov    %r10d,%eax
  49:   mov    $0xb8296d93,%r10d
  4f:   xor    $0x10b9fd03,%r10d
  56:   mov    %r10d,%eax
  59:   mov    $0x8c381146,%r10d
  5f:   xor    $0x24c7200e,%r10d
  66:   mov    %r10d,%eax
  69:   mov    $0xeb2a830e,%r10d
  6f:   xor    $0x43ba02ba,%r10d
  76:   mov    %r10d,%eax
  79:   mov    $0xd9730af,%r10d
  7f:   xor    $0xa5073b1f,%r10d
  86:   mov    %r10d,%eax
  89:   mov    $0x9a45662b,%r10d
  8f:   xor    $0x325586ea,%r10d
  96:   mov    %r10d,%eax
  [...]

As can be seen, original constants that carry payload are hidden
when enabled, actual operations are transformed from constant-based
to register-based ones, making jumps into constants ineffective.
Above extract/example uses single BPF load instruction over and
over, but of course all instructions with constants are blinded.

Performance wise, JIT with blinding performs a bit slower than just
JIT and faster than interpreter case. This is expected, since we
still get all the performance benefits from JITing and in normal
use-cases not every single instruction needs to be blinded. Summing
up all 296 test cases averaged over multiple runs from test_bpf.ko
suite, interpreter was 55% slower than JIT only and JIT with blinding
was 8% slower than JIT only. Since there are also some extremes in
the test suite, I expect for ordinary workloads that the performance
for the JIT with blinding case is even closer to JIT only case,
f.e. nmap test case from suite has averaged timings in ns 29 (JIT),
35 (+ blinding), and 151 (interpreter).

BPF test suite, seccomp test suite, eBPF sample code and various
bigger networking eBPF programs have been tested with this and were
running fine. For testing purposes, I also adapted interpreter and
redirected blinded eBPF image to interpreter and also here all tests
pass.

  [1] http://mainisusuallyafunction.blogspot.com/2012/11/attacking-hardened-linux-systems-with.html
  [2] https://github.com/01org/jit-spray-poc-for-ksp/
  [3] http://www.openwall.com/lists/kernel-hardening/2016/05/03/5

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Elena Reshetova <elena.reshetova@intel.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:49:32 -04:00
..
ABI i2c: mux: demux-pinctrl: Update docs to new sysfs-attributes 2016-04-07 21:13:02 +02:00
DocBook cfg80211: Add option to report the bss entry in connect result 2016-04-26 09:40:12 +02:00
EDID
PCI
RCU documentation: Update RCU requirements based on expedited changes 2015-12-05 12:34:32 -08:00
accounting taskstats: fix nl parsing in accounting/getdelays.c 2016-04-27 12:50:14 -04:00
acpi mfd: core: redo ACPI matching of the children devices 2015-10-26 15:25:53 +01:00
aoe
arm ARM: SoC 64-bit changes for v4.6 2016-03-20 15:08:45 -07:00
arm64 arm64: Add workaround for Cavium erratum 27456 2016-02-26 15:14:27 +00:00
auxdisplay
backlight
blackfin
block A relatively boring cycle in the docs tree. There's a few kernel-doc 2016-01-17 11:55:07 -08:00
blockdev cpqarray: remove it from the kernel 2016-03-14 09:06:01 -06:00
bus-devices
cdrom
cgroup-v1 Merge branch 'for-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2016-03-18 20:25:49 -07:00
cma
connector
console
cpu-freq Documentation: cpufreq: intel_pstate: fix typo 2016-02-18 20:31:53 +01:00
cpuidle
cris
crypto crypto: doc - Use ahash 2016-02-06 15:33:11 +08:00
development-process
device-mapper dm cache: make the 'mq' policy an alias for 'smq' 2016-03-10 17:12:08 -05:00
devicetree phy: add support for a reset-gpio specification 2016-05-16 13:22:53 -04:00
dmaengine Merge branch 'topic/async' into for-linus 2016-01-06 15:17:47 +05:30
driver-model Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm 2016-03-19 16:31:54 -07:00
dvb [media] media: change email address 2016-01-25 12:01:08 -02:00
early-userspace
extcon
fault-injection net: Add support for CHANGEUPPER notifier error injection 2015-12-03 11:49:23 -05:00
fb Documentation/fb: add documentation for sm712fb 2015-08-07 15:05:01 -07:00
features arm64 updates for 4.6: 2016-03-17 20:03:47 -07:00
filesystems mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage 2016-04-04 10:41:08 -07:00
firmware_class
fmc
fpga usage documentation for FPGA manager core 2015-10-07 18:07:20 +01:00
frv
gpio gpio: documenatation: fix GPIO_LOOKUP{,_IDX} documentation 2016-02-25 16:01:51 +01:00
hid
hwmon hwmon: Create an NSA320 hardware monitoring driver 2016-03-08 18:40:49 -08:00
i2c Doc: i2c: Fix typo in Documentation/i2c 2016-02-10 13:12:14 -07:00
ia64
ide
iio iio: Documentation: Add IIO configfs documentation 2015-12-03 18:19:28 +00:00
infiniband staging/rdma/hfi1: Method to toggle "fast ECN" detection 2016-03-10 20:37:50 -05:00
input Input: clarify we want BTN_TOOL_<name> on proximity 2016-04-06 10:23:09 -07:00
ioctl gpio: uapi: use 0xB4 as ioctl() major 2016-03-10 16:02:52 +07:00
isdn isdn: i4l: move active-isdn drivers to staging 2016-03-05 15:00:38 -08:00
ja_JP Doc: ja_JP: Fix a typo in HOWTO 2016-02-10 13:14:37 -07:00
kbuild kbuild: document recursive dependency limitation / resolution 2015-10-08 15:36:16 +02:00
kdump
ko_KR Documentation/ko_KR: update maintainer information 2016-02-17 14:10:39 -07:00
laptops Move freefall program from Documentation/ to tools/ 2015-06-08 16:42:07 -06:00
leds Documentation: leds: Add description of brightness setting API 2016-01-04 09:57:31 +01:00
locking Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-03 16:10:43 -08:00
m68k
memory-devices
metag
mic Char/Misc patches for 4.6-rc1 2016-03-17 13:47:50 -07:00
mips
misc-devices Merge char-misc-next into staging-next 2016-02-22 14:46:24 -08:00
mmc mmc: core: Remove MMC_CLKGATE 2015-10-26 16:00:09 +01:00
mn10300
mtd Documentation: mtd: improve nand_ecc.txt for readability and correctness 2015-11-17 17:05:14 -08:00
namespaces
netlabel
networking Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-09 15:59:24 -04:00
nfc NFC: Fix typo in nfc-hci.txt 2015-06-08 23:15:45 +02:00
nios2
nvdimm libnvdimm: documentation clarifications 2015-11-12 09:55:23 -08:00
nvmem Documentation: nvmem: add nvmem api level and how-to doc 2015-08-05 13:43:45 -07:00
parisc
pcmcia pcmcia: Fix typo in locking documentation 2015-08-07 14:34:58 +02:00
phy
platform
power PM / runtime: Document steps for device removal 2016-04-05 03:46:59 +02:00
powerpc cxl: Support to flash a new image on the adapter from a guest 2016-03-09 23:39:56 +11:00
pps Doc: pps: Fix file name in pps.txt 2015-07-14 12:35:42 -06:00
prctl Documentation: Fix int/unsigned int comparison 2016-02-17 14:09:43 -07:00
pti
ptp Another relatively boring cycle for the docs tree: typo fixes, translation 2016-03-17 12:09:35 -07:00
rapidio rapidio: add mport char device driver 2016-03-22 15:36:02 -07:00
s390 s390/zcore: remove /sys/kernel/debug/zcore/mem 2015-11-27 09:24:12 +01:00
scheduler
scsi st: Fix MTMKPART to work with newer drives 2016-02-23 21:27:02 -05:00
security keys, trusted: seal with a TPM2 authorization policy 2015-12-20 15:27:13 +02:00
serial tty: Remove chars_in_buffer() line discipline method 2016-01-27 15:01:44 -08:00
sh
sound Merge branch 'topic/hda-mst' into for-next 2016-02-10 09:25:15 +01:00
spi spi: tools: move spidev_test metadata 2015-11-30 12:14:12 +00:00
sysctl bpf: add generic constant blinding for use in jits 2016-05-16 13:49:32 -04:00
target target/user: Report capability of handling out-of-order completions to userspace 2016-03-10 21:49:09 -08:00
thermal thermal: doc: Add details of devm_thermal_zone_of_sensor_{register,unregister} 2016-03-09 10:51:41 -08:00
timers Another relatively boring cycle for the docs tree: typo fixes, translation 2016-03-17 12:09:35 -07:00
tpm
trace x86, tracing, perf: Add trace point for MSR accesses 2015-12-06 12:56:10 +01:00
usb doc: usb: Fix typo in gadget_multi documentation 2016-04-13 12:02:28 -07:00
vDSO Documentation/vDSO: don't build tests when cross compiling 2015-06-22 16:04:57 -06:00
video4linux [media] saa7134: Add support for Snazio TvPVR PRO 2016-03-03 09:03:48 -03:00
virtual One of the largest releases for KVM... Hardly any generic improvement, 2016-03-16 09:55:35 -07:00
vm mm: thp: set THP defrag by default to madvise and add a stall-free defrag option 2016-03-17 15:09:34 -07:00
w1 w1: masters: omap_hdq: add support for 1-wire mode 2015-10-05 04:47:09 +01:00
watchdog Merge git://www.linux-watchdog.org/linux-watchdog 2016-03-19 19:35:51 -07:00
wimax
x86 x86/doc: Correct limits in Documentation/x86/x86_64/mm.txt 2016-04-22 10:03:24 +02:00
xtensa
zh_CN Documentation: Chinese translation of arm64/silicon-errata.txt 2016-02-17 14:08:07 -07:00
00-INDEX
BUG-HUNTING
Changes There is a nice new document from Neil on how pathname lookups work and 2015-11-05 15:59:24 -08:00
CodeOfConflict
CodingStyle Documentation/CodingStyle: add space before parenthesis in example macro 2016-01-25 12:36:28 -07:00
DMA-API-HOWTO.txt dma-mapping: always provide the dma_map_ops based implementation 2016-01-20 17:09:18 -08:00
DMA-API.txt DMA-API: fix confusing sentence in Documentation/DMA-API.txt 2016-01-11 18:29:00 -07:00
DMA-ISA-LPC.txt
DMA-attributes.txt ARM: 8506/1: common: DMA-mapping: add DMA_ATTR_ALLOC_SINGLE_PAGES attribute 2016-02-11 15:33:38 +00:00
HOWTO Documentation: Howto: Fixed subtitles style 2016-03-09 15:30:03 -07:00
IPMI.txt ipmi watchdog : add panic_wdt_timeout parameter 2015-11-16 06:28:43 -06:00
IRQ-affinity.txt
IRQ-domain.txt irqdomain: Documentation updates 2015-10-13 19:01:25 +02:00
IRQ.txt
Intel-IOMMU.txt iommu/vt-d: Fix link to Intel IOMMU Specification 2016-01-29 12:32:12 +01:00
Makefile spi: Move spi code from Documentation to tools 2015-11-23 14:54:01 +00:00
ManagementStyle
SAK.txt
SM501.txt
SecurityBugs
SubmitChecklist
SubmittingDrivers
SubmittingPatches SubmittingPatches: fix spelling of "git send-email" 2016-01-25 12:30:18 -07:00
VGA-softcursor.txt
adding-syscalls.txt Documentation: describe how to add a system call 2015-08-13 17:54:06 -06:00
applying-patches.txt
assoc_array.txt
atomic_ops.txt locking/atomics, cmpxchg: Privatize the inclusion of asm/cmpxchg.h 2015-09-13 10:35:46 +02:00
bad_memory.txt
basic_profiling.txt
bcache.txt
binfmt_misc.txt
braille-console.txt
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt
cachetlb.txt
cgroup-v2.txt Merge branch 'for-4.6-ns' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2016-03-21 10:05:13 -07:00
circular-buffers.txt
clk.txt clk: change clk_ops' ->determine_rate() prototype 2015-07-27 18:12:01 -07:00
coccinelle.txt
cpu-hotplug.txt Documentation: cpu-hotplug: Fix sysfs mount instructions 2015-12-10 11:35:30 -07:00
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt Doc: Change wikipedia's URL from http to https 2015-06-22 10:14:05 -06:00
dell_rbu.txt
devices.txt
digsig.txt
dma-buf-sharing.txt dma-buf: Update docs for SYNC ioctl 2016-03-21 09:26:45 +01:00
dontdiff Documentation: dontdiff: remove media from dontdiff 2015-11-11 10:08:07 -07:00
dynamic-debug-howto.txt
edac.txt EDAC: Remove references to bluesmoke.sourceforge.net 2015-11-26 14:46:06 +01:00
efi-stub.txt doc: efi-stub.txt: Fix arm64 paths 2015-12-14 15:24:03 +00:00
eisa.txt
email-clients.txt A few more documentation patches that wandered in and have no reason to 2015-11-13 09:19:05 -08:00
flexible-arrays.txt
futex-requeue-pi.txt
gcov.txt
gdb-kernel-debugging.txt
highuid.txt
hsi.txt
hw_random.txt hwrng: doc - Fix device node name reference /dev/hw_random => /dev/hwrng 2015-09-21 22:00:41 +08:00
hwspinlock.txt
init.txt
initrd.txt
intel_txt.txt
io-mapping.txt
io_ordering.txt
iostats.txt
irqflags-tracing.txt
isapnp.txt
java.txt
kasan.txt mm, kasan: SLAB support 2016-03-25 16:37:42 -07:00
kcov.txt kernel: add kcov code coverage 2016-03-22 15:36:02 -07:00
kernel-doc-nano-HOWTO.txt Documenation: Update location of docproc.c 2015-07-14 12:36:39 -06:00
kernel-docs.txt Documentation: translations: update linux cross reference link 2016-01-11 18:26:58 -07:00
kernel-parameters.txt USB: uas: Add a new NO_REPORT_LUNS quirk 2016-04-13 12:02:28 -07:00
kernel-per-CPU-kthreads.txt irq_poll: make blk-iopoll available outside the block layer 2015-12-11 11:52:24 -08:00
kmemcheck.txt
kmemleak.txt Doc: Change wikipedia's URL from http to https 2015-06-22 10:14:05 -06:00
kobject.txt
kprobes.txt
kref.txt
kselftest.txt Documentation: kselftest: Remove duplicate word 2016-03-09 15:33:38 -07:00
ldm.txt
local_ops.txt
lockup-watchdogs.txt kernel/watchdog.c: add sysctl knob hardlockup_panic 2015-11-05 19:34:48 -08:00
logo.gif
logo.txt
lzo.txt
magic-number.txt
mailbox.txt Documentation: minor typo fix in mailbox.txt 2015-08-13 18:03:18 -06:00
md-cluster.txt md-cluster: update the documentation 2016-01-06 11:39:06 +11:00
md.txt doc:md: fix typo in md.txt. 2015-06-23 06:49:44 -06:00
memory-barriers.txt documentation: Clarify compiler store-fusion example 2016-03-14 15:52:19 -07:00
memory-hotplug.txt memory-hotplug: add automatic onlining policy for the newly added memory 2016-03-15 16:55:16 -07:00
men-chameleon-bus.txt Documentation: Minor changes to men-chameleon-bus.txt 2015-07-24 15:15:17 +02:00
module-signing.txt modsign: Fix documentation on module signing enforcement parameter. 2016-03-12 01:48:11 -07:00
mono.txt
nommu-mmap.txt
ntb.txt NTB: Rename Intel code names to platform names 2015-07-04 14:09:25 -04:00
numastat.txt
oops-tracing.txt
padata.txt
parport-lowlevel.txt
parport.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pinctrl.txt
pnp.txt
preempt-locking.txt
printk-formats.txt mm, printk: introduce new format string for flags 2016-03-15 16:55:16 -07:00
pwm.txt
ramoops.txt
rbtree.txt documentation: fix small typo in rbtree.txt 2015-09-13 14:38:50 -06:00
remoteproc.txt remoteproc: introduce rproc_get_by_phandle API 2015-06-16 21:12:52 +03:00
rfkill.txt rfkill: Add documentation about LED triggers 2016-02-24 09:13:12 +01:00
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
rtc.txt rtc: implement a sysfs interface for clock offset 2016-03-14 17:08:16 +01:00
serial-console.txt
sgi-ioc4.txt
smsc_ece1099.txt
sparse.txt
stable_api_nonsense.txt
stable_kernel_rules.txt stable_kernel_rules.txt: Remove extra space after Cc: 2015-11-20 16:54:57 -07:00
static-keys.txt locking/static_keys: Fix up the static keys documentation 2015-09-15 07:12:06 +02:00
svga.txt
sysfs-rules.txt
sysrq.txt mm, oom: do not panic for oom kills triggered from sysrq 2015-09-08 15:35:28 -07:00
this_cpu_ops.txt
ubsan.txt UBSAN: run-time undefined behavior sanity checker 2016-01-20 17:09:18 -08:00
unaligned-memory-access.txt
unicode.txt
unshare.txt
vfio.txt vfio: powerpc/spapr: Support Dynamic DMA windows 2015-06-11 15:16:55 +10:00
vgaarbiter.txt
video-output.txt
vme_api.txt Documentation: mention vme_master_mmap() in VME API 2015-06-12 17:26:56 -07:00
volatile-considered-harmful.txt
workqueue.txt
xillybus.txt
xz.txt
zorro.txt