Back in the days where eBPF (or back then "internal BPF" ;->) was not
exposed to user space, and only the classic BPF programs internally
translated into eBPF programs, we missed the fact that for classic BPF
A and X needed to be cleared. It was fixed back then via 83d5b7ef99
("net: filter: initialize A and X registers"), and thus classic BPF
specifics were added to the eBPF interpreter core to work around it.
This added some confusion for JIT developers later on that take the
eBPF interpreter code as an example for deriving their JIT. F.e. in
f75298f5c3 ("s390/bpf: clear correct BPF accumulator register"), at
least X could leak stack memory. Furthermore, since this is only needed
for classic BPF translations and not for eBPF (verifier takes care
that read access to regs cannot be done uninitialized), more complexity
is added to JITs as they need to determine whether they deal with
migrations or native eBPF where they can just omit clearing A/X in
their prologue and thus reduce image size a bit, see f.e. cde66c2d88
("s390/bpf: Only clear A and X for converted BPF programs"). In other
cases (x86, arm64), A and X is being cleared in the prologue also for
eBPF case, which is unnecessary.
Lets move this into the BPF migration in bpf_convert_filter() where it
actually belongs as long as the number of eBPF JITs are still few. It
can thus be done generically; allowing us to remove the quirk from
__bpf_prog_run() and to slightly reduce JIT image size in case of eBPF,
while reducing code duplication on this matter in current(/future) eBPF
JITs.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Zi Shen Lim <zlim.lnx@gmail.com>
Cc: Yang Shi <yang.shi@linaro.org>
Acked-by: Yang Shi <yang.shi@linaro.org>
Acked-by: Zi Shen Lim <zlim.lnx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
aarch64 doesn't have native store immediate instruction, such operation
has to be implemented by the below instruction sequence:
Load immediate to register
Store register
Signed-off-by: Yang Shi <yang.shi@linaro.org>
CC: Zi Shen Lim <zlim.lnx@gmail.com>
CC: Xi Wang <xi.wang@gmail.com>
Reviewed-by: Zi Shen Lim <zlim.lnx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During code review, I noticed we were passing a bad buffer pointer
to bpf_load_pointer helper function called by jitted code.
Point to the buffer allocated by JIT, so we don't silently corrupt
other parts of the stack.
Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
Acked-by: Yang Shi <yang.shi@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix list tests in netfilter ingress support, from Florian Westphal.
2) Fix reversal of input and output interfaces in ingress hook
invocation, from Pablo Neira Ayuso.
3) We have a use after free in r8169, caught by Dave Jones, fixed by
Francois Romieu.
4) Splice use-after-free fix in AF_UNIX frmo Hannes Frederic Sowa.
5) Three ipv6 route handling bug fixes from Martin KaFai Lau:
a) Don't create clone routes not managed by the fib6 tree
b) Don't forget to check expiration of DST_NOCACHE routes.
c) Handle rt->dst.from == NULL properly.
6) Several AF_PACKET fixes wrt transport header setting and SKB
protocol setting, from Daniel Borkmann.
7) Fix thunder driver crash on shutdown, from Pavel Fedin.
8) Several Mellanox driver fixes (max MTU calculations, use of correct
DMA unmap in TX path, etc.) from Saeed Mahameed, Tariq Toukan, Doron
Tsur, Achiad Shochat, Eran Ben Elisha, and Noa Osherovich.
9) Several mv88e6060 DSA driver fixes (wrong bit definitions for
certain registers, etc.) from Neil Armstrong.
10) Make sure to disable preemption while updating per-cpu stats of ip
tunnels, from Jason A. Donenfeld.
11) Various ARM64 bpf JIT fixes, from Yang Shi.
12) Flush icache properly in ARM JITs, from Daniel Borkmann.
13) Fix masking of RX and TX interrupts in ravb driver, from Masaru
Nagai.
14) Fix netdev feature propagation for devices not implementing
->ndo_set_features(). From Nikolay Aleksandrov.
15) Big endian fix in vmxnet3 driver, from Shrikrishna Khare.
16) RAW socket code increments incorrect SNMP counters, fix from Ben
Cartwright-Cox.
17) IPv6 multicast SNMP counters are bumped twice, fix from Neil Horman.
18) Fix handling of VLAN headers on stacked devices when REORDER is
disabled. From Vlad Yasevich.
19) Fix SKB leaks and use-after-free in ipvlan and macvlan drivers, from
Sabrina Dubroca.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (83 commits)
MAINTAINERS: Update Mellanox's Eth NIC driver entries
net/core: revert "net: fix __netdev_update_features return.." and add comment
af_unix: take receive queue lock while appending new skb
rtnetlink: fix frame size warning in rtnl_fill_ifinfo
net: use skb_clone to avoid alloc_pages failure.
packet: Use PAGE_ALIGNED macro
packet: Don't check frames_per_block against negative values
net: phy: Use interrupts when available in NOLINK state
phy: marvell: Add support for 88E1540 PHY
arm64: bpf: make BPF prologue and epilogue align with ARM64 AAPCS
macvlan: fix leak in macvlan_handle_frame
ipvlan: fix use after free of skb
ipvlan: fix leak in ipvlan_rcv_frame
vlan: Do not put vlan headers back on bridge and macvlan ports
vlan: Fix untag operations of stacked vlans with REORDER_HEADER off
via-velocity: unconditionally drop frames with bad l2 length
ipg: Remove ipg driver
dl2k: Add support for IP1000A-based cards
snmp: Remove duplicate OUTMCAST stat increment
net: thunder: Check for driver data in nicvf_remove()
...
Save and restore FP/LR in BPF prog prologue and epilogue, save SP to FP
in prologue in order to get the correct stack backtrace.
However, ARM64 JIT used FP (x29) as eBPF fp register, FP is subjected to
change during function call so it may cause the BPF prog stack base address
change too.
Use x25 to replace FP as BPF stack base register (fp). Since x25 is callee
saved register, so it will keep intact during function call.
It is initialized in BPF prog prologue when BPF prog is started to run
everytime. Save and restore x25/x26 in BPF prologue and epilogue to keep
them intact for the outside of BPF. Actually, x26 is unnecessary, but SP
requires 16 bytes alignment.
So, the BPF stack layout looks like:
high
original A64_SP => 0:+-----+ BPF prologue
|FP/LR|
current A64_FP => -16:+-----+
| ... | callee saved registers
+-----+
| | x25/x26
BPF fp register => -80:+-----+
| |
| ... | BPF prog stack
| |
| |
current A64_SP => +-----+
| |
| ... | Function call stack
| |
+-----+
low
CC: Zi Shen Lim <zlim.lnx@gmail.com>
CC: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Yang Shi <yang.shi@linaro.org>
Acked-by: Zi Shen Lim <zlim.lnx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While recently going over ARM64's BPF code, I noticed that the icache
range we're flushing should start at header already and not at ctx.image.
Reason is that after b569c1c622 ("net: bpf: arm64: address randomize
and write protect JIT code"), we also want to make sure to flush the
random-sized trap in front of the start of the actual program (analogous
to x86). No operational differences from user side.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Zi Shen Lim <zlim.lnx@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
BPF fp should point to the top of the BPF prog stack. The original
implementation made it point to the bottom incorrectly.
Move A64_SP to fp before reserve BPF prog stack space.
CC: Zi Shen Lim <zlim.lnx@gmail.com>
CC: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Yang Shi <yang.shi@linaro.org>
Reviewed-by: Zi Shen Lim <zlim.lnx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- __cmpxchg_double*() return type fix to avoid truncation of a long to
int and subsequent logical "not" in cmpxchg_double() misinterpreting
the operation success/failure
- BPF fixes for mod and div by zero
- Fix compilation with STRICT_MM_TYPECHECKS enabled
- VDSO build fix without libgcov
- Some static and __maybe_unused annotations
- Kconfig clean-up (FRAME_POINTER)
- defconfig update for CRYPTO_CRC32_ARM64
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWRNNbAAoJEGvWsS0AyF7xp+wQAIc0A+uSReEJ0Be3kSWZIy0O
9wGCtfp2e3X78ibgVoP/+KvA1JUrMJNwNH54CgGgG6H4rwjRthCvIV/HbKfYufM8
vfuTL2MV1ywkNO0uTzspsICqgKPcpG27SwAlgOcxNXpO0Kui2OlKSxS4kTA8+6Z5
Lm64qDmFG7Z6wcBHhr8JSngC+xvXOvlcUW8odnjXjyCimwnpCFXXnRWDU3RnXJZa
3Khgp8OiRtnCSLfj7YBQA9wfNNgPgKdJ5wevz2g7hiIbYx0IOHmDpzbb3sUNMMKV
XLKeeJgqZL4EXZBCzapHRHCE/q0kiiBhzYSHw6aOBwjD9v683aytT/ax2/AgjzvW
nB3ZPdrbRMjcmNRBT2bheoU8diilhtfxSxf+4T+pVUnVMXDNl/xY9hekGA0hFO1z
nH5P5vkFKsX3U02Ox/G50Od2rM6p7uGRGFYuomSIoJYBItuxGOAuYWlY2+ujcxY5
YvAQ+3FYCkjLipVutlqLxKoZSY8Ex+0LOjPYYsI/+rsE70IVjGuLj0bTm8B/aTcy
dOctNqvOGwo8O5n2jsKM3XkjfUCPRdzu1C7rQz2BqfE9cPAZxg2fQpPv4SGtPuFe
lEvokuYRJ3qYnMt5MG/9Mkqmczfbch88A41wgS9/ySQ57eo3wISLkOiKqzKdJjOa
0qldWaEvST2iVUQmiMl7
=ApkD
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes and clean-ups from Catalin Marinas:
"Here's a second pull request for this merging window with some
fixes/clean-ups:
- __cmpxchg_double*() return type fix to avoid truncation of a long
to int and subsequent logical "not" in cmpxchg_double()
misinterpreting the operation success/failure
- BPF fixes for mod and div by zero
- Fix compilation with STRICT_MM_TYPECHECKS enabled
- VDSO build fix without libgcov
- Some static and __maybe_unused annotations
- Kconfig clean-up (FRAME_POINTER)
- defconfig update for CRYPTO_CRC32_ARM64"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: suspend: make hw_breakpoint_restore static
arm64: mmu: make split_pud and fixup_executable static
arm64: smp: make of_parse_and_init_cpus static
arm64: use linux/types.h in kvm.h
arm64: build vdso without libgcov
arm64: mark cpus_have_hwcap as __maybe_unused
arm64: remove redundant FRAME_POINTER kconfig option and force to select it
arm64: fix R/O permissions of FDT mapping
arm64: fix STRICT_MM_TYPECHECKS issue in PTE_CONT manipulation
arm64: bpf: fix mod-by-zero case
arm64: bpf: fix div-by-zero case
arm64: Enable CRYPTO_CRC32_ARM64 in defconfig
arm64: cmpxchg_dbl: fix return value type
Turns out in the case of modulo by zero in a BPF program:
A = A % X; (X == 0)
the expected behavior is to terminate with return value 0.
The bug in JIT is exposed by a new test case [1].
[1] https://lkml.org/lkml/2015/11/4/499
Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
Reported-by: Yang Shi <yang.shi@linaro.org>
Reported-by: Xi Wang <xi.wang@gmail.com>
CC: Alexei Starovoitov <ast@plumgrid.com>
Fixes: e54bcde3d6 ("arm64: eBPF JIT compiler")
Cc: <stable@vger.kernel.org> # 3.18+
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In the case of division by zero in a BPF program:
A = A / X; (X == 0)
the expected behavior is to terminate with return value 0.
This is confirmed by the test case introduced in commit 86bf1721b2
("test_bpf: add tests checking that JIT/interpreter sets A and X to 0.").
Reported-by: Yang Shi <yang.shi@linaro.org>
Tested-by: Yang Shi <yang.shi@linaro.org>
CC: Xi Wang <xi.wang@gmail.com>
CC: Alexei Starovoitov <ast@plumgrid.com>
CC: linux-arm-kernel@lists.infradead.org
CC: linux-kernel@vger.kernel.org
Fixes: e54bcde3d6 ("arm64: eBPF JIT compiler")
Cc: <stable@vger.kernel.org> # 3.18+
Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
As we need to add further flags to the bpf_prog structure, lets migrate
both bools to a bitfield representation. The size of the base structure
(excluding insns) remains unchanged at 40 bytes.
Add also tags for the kmemchecker, so that it doesn't throw false
positives. Even in case gcc would generate suboptimal code, it's not
being accessed in performance critical paths.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upper bits should be zeroed in endianness conversion:
- even when there's no need to change endianness (i.e., BPF_FROM_BE
on big endian or BPF_FROM_LE on little endian);
- after rev16.
This patch fixes such bugs by emitting extra instructions to clear
upper bits.
Cc: Zi Shen Lim <zlim.lnx@gmail.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Fixes: e54bcde3d6 ("arm64: eBPF JIT compiler")
Cc: <stable@vger.kernel.org> # 3.18+
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Problems occur when bpf_to or bpf_from has value prog->len - 1 (e.g.,
"Very long jump backwards" in test_bpf where the last instruction is a
jump): since ctx->offset has length prog->len, ctx->offset[bpf_to + 1]
or ctx->offset[bpf_from + 1] will cause an out-of-bounds read, leading
to a bogus jump offset and kernel panic.
This patch moves updating ctx->offset to after calling build_insn(),
and changes indexing to use bpf_to and bpf_from without + 1.
Fixes: e54bcde3d6 ("arm64: eBPF JIT compiler")
Cc: <stable@vger.kernel.org> # 3.18+
Cc: Zi Shen Lim <zlim.lnx@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Consider "(u64)insn1.imm << 32 | imm" in the arm64 JIT. Since imm is
signed 32-bit, it is sign-extended to 64-bit, losing the high 32 bits.
The fix is to convert imm to u32 first, which will be zero-extended to
u64 implicitly.
Cc: Zi Shen Lim <zlim.lnx@gmail.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>
Fixes: 30d3d94cc3 ("arm64: bpf: add 'load 64-bit immediate' instruction")
Signed-off-by: Xi Wang <xi.wang@gmail.com>
[will: removed non-arm64 bits and redundant casting]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Earlier implementation assumed last instruction is BPF_EXIT.
Since this is no longer a restriction in eBPF, we remove this
limitation.
Per Alexei Starovoitov [1]:
> classic BPF has a restriction that last insn is always BPF_RET.
> eBPF doesn't have BPF_RET instruction and this restriction.
> It has BPF_EXIT insn which can appear anywhere in the program
> one or more times and it doesn't have to be last insn.
[1] https://lkml.org/lkml/2014/11/27/2
Fixes: e54bcde3d6 ("arm64: eBPF JIT compiler")
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit 286aad3c40 ("net: bpf: be friendly to kmemcheck") changed the
type of jited from a bitfield into a bool. As this commmit wasn't available
at the time when arm64 eBPF JIT was merged, fix it up now as net is merged
into mainline.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Zi Shen Lim <zlim.lnx@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Commit 02ab695bb3 (net: filter: add "load 64-bit immediate" eBPF
instruction) introduced a new eBPF instruction. Let's add support
for this for arm64 as well.
Our arm64 eBPF JIT compiler now passes the new "load 64-bit
immediate" test case introduced in the same commit 02ab695bb3.
Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Commit 72b603ee8c ("bpf: x86: add missing 'shift by register'
instructions to x64 eBPF JIT") noted support for 'shift by register'
in eBPF and added support for it for x64. Let's enable this for arm64
as well.
The arm64 eBPF JIT compiler now passes the new 'shift by register'
test case introduced in the same commit 72b603ee8c.
Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This is the ARM64 variant for 314beb9bca ("x86: bpf_jit_comp: secure bpf
jit against spraying attacks").
Thanks to commit 11d91a770f ("arm64: Add CONFIG_DEBUG_SET_MODULE_RONX
support") which added necessary infrastructure, we can now implement
RO marking of eBPF generated JIT image pages and randomize start offset
for the JIT code, so that it does not reside directly on a page boundary
anymore. Likewise, the holes are filled with illegal instructions: here
we use BRK #0x100 (opcode 0xd4202000) to trigger a fault in the kernel
(unallocated BRKs would trigger a fault through do_debug_exception). This
seems more reliable as we don't have a guaranteed undefined instruction
space on ARM64.
This is basically the ARM64 variant of what we already have in ARM via
commit 55309dd3d4 ("net: bpf: arm: address randomize and write protect
JIT code"). Moreover, this commit also presents a merge resolution due to
conflicts with commit 60a3b2253c ("net: bpf: make eBPF interpreter images
read-only") as we don't use kfree() in bpf_jit_free() anymore to release
the locked bpf_prog structure, but instead bpf_prog_unlock_free() through
a different allocator.
JIT tested on aarch64 with BPF test suite.
Reference: http://mainisusuallyafunction.blogspot.com/2012/11/attacking-hardened-linux-systems-with.html
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Reviewed-by: Zi Shen Lim <zlim.lnx@gmail.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
On ARM64, when the BPF JIT compiler fills the JIT image body with
opcodes during translation of eBPF into ARM64 opcodes, we may fail
for several reasons during that phase: one being that we jump to
the notyet label for not yet supported eBPF instructions such as
BPF_ST. In that case we only free offsets, but not the actual
allocated target image where opcodes are being stored. Fix it by
calling module_free() on dismantle time in case of errors.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Zi Shen Lim <zlim.lnx@gmail.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The JIT compiler emits A64 instructions. It supports eBPF only.
Legacy BPF is supported thanks to conversion by BPF core.
JIT is enabled in the same way as for other architectures:
echo 1 > /proc/sys/net/core/bpf_jit_enable
Or for additional compiler output:
echo 2 > /proc/sys/net/core/bpf_jit_enable
See Documentation/networking/filter.txt for more information.
The implementation passes all 57 tests in lib/test_bpf.c
on ARMv8 Foundation Model :) Also tested by Will on Juno platform.
Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>