Commit Graph

152 Commits

Author SHA1 Message Date
Andrey Ignatov a40b712e4c bpf: Sync bpf.h to tools/
Sync bpf_get_socket_cookie() related bpf UAPI changes to tools/.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-31 09:33:48 +02:00
Andrey Ignatov 060a7fccd3 bpf: Sync bpf.h to tools/
Sync BPF_SOCK_OPS_TCP_LISTEN_CB related UAPI changes to tools/.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-15 00:08:41 +02:00
Quentin Monnet 9b8ca37951 tools: bpf: synchronise BPF UAPI header with tools
Update with latest changes from include/uapi/linux/bpf.h header.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-12 18:55:53 +02:00
Arnaldo Carvalho de Melo f568b47281 tools include uapi: Synchronize bpf.h with the kernel
To pick the rename in:

  bd3a08aaa9 ("bpf: flowlabel in bpf_fib_lookup should be flowinfo")

Silencing this build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h'

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-zd1sgtbybtjrrt7bqdybu0s0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-06-25 11:59:36 -03:00
David S. Miller fd129f8941 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-06-05

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add a new BPF hook for sendmsg similar to existing hooks for bind and
   connect: "This allows to override source IP (including the case when it's
   set via cmsg(3)) and destination IP:port for unconnected UDP (slow path).
   TCP and connected UDP (fast path) are not affected. This makes UDP support
   complete, that is, connected UDP is handled by connect hooks, unconnected
   by sendmsg ones.", from Andrey.

2) Rework of the AF_XDP API to allow extending it in future for type writer
   model if necessary. In this mode a memory window is passed to hardware
   and multiple frames might be filled into that window instead of just one
   that is the case in the current fixed frame-size model. With the new
   changes made this can be supported without having to add a new descriptor
   format. Also, core bits for the zero-copy support for AF_XDP have been
   merged as agreed upon, where i40e bits will be routed via Jeff later on.
   Various improvements to documentation and sample programs included as
   well, all from Björn and Magnus.

3) Given BPF's flexibility, a new program type has been added to implement
   infrared decoders. Quote: "The kernel IR decoders support the most
   widely used IR protocols, but there are many protocols which are not
   supported. [...] There is a 'long tail' of unsupported IR protocols,
   for which lircd is need to decode the IR. IR encoding is done in such
   a way that some simple circuit can decode it; therefore, BPF is ideal.
   [...] user-space can define a decoder in BPF, attach it to the rc
   device through the lirc chardev.", from Sean.

4) Several improvements and fixes to BPF core, among others, dumping map
   and prog IDs into fdinfo which is a straight forward way to correlate
   BPF objects used by applications, removing an indirect call and therefore
   retpoline in all map lookup/update/delete calls by invoking the callback
   directly for 64 bit archs, adding a new bpf_skb_cgroup_id() BPF helper
   for tc BPF programs to have an efficient way of looking up cgroup v2 id
   for policy or other use cases. Fixes to make sure we zero tunnel/xfrm
   state that hasn't been filled, to allow context access wrt pt_regs in
   32 bit archs for tracing, and last but not least various test cases
   for fixes that landed in bpf earlier, from Daniel.

5) Get rid of the ndo_xdp_flush API and extend the ndo_xdp_xmit with
   a XDP_XMIT_FLUSH flag instead which allows to avoid one indirect
   call as flushing is now merged directly into ndo_xdp_xmit(), from Jesper.

6) Add a new bpf_get_current_cgroup_id() helper that can be used in
   tracing to retrieve the cgroup id from the current process in order
   to allow for e.g. aggregation of container-level events, from Yonghong.

7) Two follow-up fixes for BTF to reject invalid input values and
   related to that also two test cases for BPF kselftests, from Martin.

8) Various API improvements to the bpf_fib_lookup() helper, that is,
   dropping MPLS bits which are not fully hashed out yet, rejecting
   invalid helper flags, returning error for unsupported address
   families as well as renaming flowlabel to flowinfo, from David.

9) Various fixes and improvements to sockmap BPF kselftests in particular
   in proper error detection and data verification, from Prashant.

10) Two arm32 BPF JIT improvements. One is to fix imm range check with
    regards to whether immediate fits into 24 bits, and a naming cleanup
    to get functions related to rsh handling consistent to those handling
    lsh, from Wang.

11) Two compile warning fixes in BPF, one for BTF and a false positive
    to silent gcc in stack_map_get_build_id_offset(), from Arnd.

12) Add missing seg6.h header into tools include infrastructure in order
    to fix compilation of BPF kselftests, from Mathieu.

13) Several formatting cleanups in the BPF UAPI helper description that
    also fix an error during rst2man compilation, from Quentin.

14) Hide an unused variable in sk_msg_convert_ctx_access() when IPv6 is
    not built into the kernel, from Yue.

15) Remove a useless double assignment in dev_map_enqueue(), from Colin.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-05 12:42:19 -04:00
Yonghong Song c7ddbbaf1e tools/bpf: sync uapi bpf.h for bpf_get_current_cgroup_id() helper
Sync kernel uapi/linux/bpf.h with tools uapi/linux/bpf.h.
Also add the necessary helper define in bpf_helpers.h.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03 18:22:41 -07:00
Daniel Borkmann 6b6a192595 bpf: sync bpf uapi header with tools
Pull in recent changes from include/uapi/linux/bpf.h.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03 07:46:56 -07:00
David S. Miller 9c54aeb03a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Filling in the padding slot in the bpf structure as a bug fix in 'ne'
overlapped with actually using that padding area for something in
'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-03 09:31:58 -04:00
Daniel Borkmann 36f9814a49 bpf: fix uapi hole for 32 bit compat applications
In 64 bit, we have a 4 byte hole between ifindex and netns_dev in the
case of struct bpf_map_info but also struct bpf_prog_info. In net-next
commit b85fab0e67 ("bpf: Add gpl_compatible flag to struct bpf_prog_info")
added a bitfield into it to expose some flags related to programs. Thus,
add an unnamed __u32 bitfield for both so that alignment keeps the same
in both 32 and 64 bit cases, and can be naturally extended from there
as in b85fab0e67.

Before:

  # file test.o
  test.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped
  # pahole test.o
  struct bpf_map_info {
	__u32                      type;                 /*     0     4 */
	__u32                      id;                   /*     4     4 */
	__u32                      key_size;             /*     8     4 */
	__u32                      value_size;           /*    12     4 */
	__u32                      max_entries;          /*    16     4 */
	__u32                      map_flags;            /*    20     4 */
	char                       name[16];             /*    24    16 */
	__u32                      ifindex;              /*    40     4 */
	__u64                      netns_dev;            /*    44     8 */
	__u64                      netns_ino;            /*    52     8 */

	/* size: 64, cachelines: 1, members: 10 */
	/* padding: 4 */
  };

After (same as on 64 bit):

  # file test.o
  test.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped
  # pahole test.o
  struct bpf_map_info {
	__u32                      type;                 /*     0     4 */
	__u32                      id;                   /*     4     4 */
	__u32                      key_size;             /*     8     4 */
	__u32                      value_size;           /*    12     4 */
	__u32                      max_entries;          /*    16     4 */
	__u32                      map_flags;            /*    20     4 */
	char                       name[16];             /*    24    16 */
	__u32                      ifindex;              /*    40     4 */

	/* XXX 4 bytes hole, try to pack */

	__u64                      netns_dev;            /*    48     8 */
	__u64                      netns_ino;            /*    56     8 */
	/* --- cacheline 1 boundary (64 bytes) --- */

	/* size: 64, cachelines: 1, members: 10 */
	/* sum members: 60, holes: 1, sum holes: 4 */
  };

Reported-by: Dmitry V. Levin <ldv@altlinux.org>
Reported-by: Eugene Syromiatnikov <esyr@redhat.com>
Fixes: 52775b33bb ("bpf: offload: report device information about offloaded maps")
Fixes: 675fc275a3 ("bpf: offload: report device information for offloaded programs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-01 20:41:35 -07:00
Sean Young 6bdd533cee bpf: add selftest for lirc_mode2 type program
This is simple test over rc-loopback.

Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-30 12:40:14 +02:00
Quentin Monnet 7a279e9333 bpf: clean up eBPF helpers documentation
These are minor edits for the eBPF helpers documentation in
include/uapi/linux/bpf.h.

The main fix consists in removing "BPF_FIB_LOOKUP_", because it ends
with a non-escaped underscore that gets interpreted by rst2man and
produces the following message in the resulting manual page:

    DOCUTILS SYSTEM MESSAGES
           System Message: ERROR/3 (/tmp/bpf-helpers.rst:, line 1514)
                  Unknown target name: "bpf_fib_lookup".

Other edits consist in:

- Improving formatting for flag values for "bpf_fib_lookup()" helper.
- Emphasising a parameter name in description of the return value for
  "bpf_get_stack()" helper.
- Removing unnecessary blank lines between "Description" and "Return"
  sections for the few helpers that would use it, for consistency.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-29 21:42:13 +02:00
Andrey Ignatov 3024cf8248 bpf: Sync bpf.h to tools/
Sync new `BPF_CGROUP_UDP4_SENDMSG` and `BPF_CGROUP_UDP6_SENDMSG`
attach types to tools/.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-28 17:41:02 +02:00
Yonghong Song 30687ad94e tools/bpf: sync kernel header bpf.h and add bpf_task_fd_query in libbpf
Sync kernel header bpf.h to tools/include/uapi/linux/bpf.h and
implement bpf_task_fd_query() in libbpf. The test programs
in samples/bpf and tools/testing/selftests/bpf, and later bpftool
will use this libbpf function to query kernel.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-24 18:18:19 -07:00
Mathieu Xhonneux c99a84eac0 selftests/bpf: test for seg6local End.BPF action
Add a new test for the seg6local End.BPF action. The following helpers
are also tested:

- bpf_lwt_push_encap within the LWT BPF IN hook
- bpf_lwt_seg6_action
- bpf_lwt_seg6_adjust_srh
- bpf_lwt_seg6_store_bytes

A chain of End.BPF actions is built. The SRH is injected through a LWT
BPF IN hook before entering this chain. Each End.BPF action validates
the previous one, otherwise the packet is dropped. The test succeeds
if the last node in the chain receives the packet and the UDP datagram
contained can be retrieved from userspace.

Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-24 11:57:36 +02:00
Sandipan Das bd980d43b9 tools: bpf: sync bpf uapi header
Syncing the bpf.h uapi header with tools so that struct
bpf_prog_info has the two new fields for passing on the
JITed image lengths of each function in a multi-function
program.

Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-24 09:20:49 +02:00
Sandipan Das dd0c5f072e tools: bpf: sync bpf uapi header
Syncing the bpf.h uapi header with tools so that struct
bpf_prog_info has the two new fields for passing on the
addresses of the kernel symbols corresponding to each
function in a program.

Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-24 09:20:49 +02:00
Martin KaFai Lau f03b15d34b bpf: btf: Sync bpf.h and btf.h to tools
This patch sync the uapi bpf.h and btf.h to tools.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-23 12:03:32 +02:00
John Fastabend 4da0dcabe4 bpf: add sk_msg prog sk access tests to test_verifier
Add tests for BPF_PROG_TYPE_SK_MSG to test_verifier for read access
to new sk fields.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-18 22:44:11 +02:00
John Fastabend b8b394faa9 bpf: selftest additions for SOCKHASH
This runs existing SOCKMAP tests with SOCKHASH map type. To do this
we push programs into include file and build two BPF programs. One
for SOCKHASH and one for SOCKMAP.

We then run the entire test suite with each type.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-16 22:00:12 +02:00
Prashant Bhole cb9c28ef57 bpf: sync tools bpf.h uapi header
Sync the header from include/uapi/linux/bpf.h which was updated to add
fib lookup helper function. This fixes selftests/bpf build failure.

Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-11 01:35:25 +02:00
Martin KaFai Lau 7a01f6a3bd bpf: btf: Update tools/include/uapi/linux/btf.h with BTF ID
This patch sync the tools/include/uapi/linux/btf.h with
the newly introduced BTF ID support.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-09 17:25:14 +02:00
Daniel Borkmann 32b3652c30 bpf: sync tools bpf.h uapi header
Only sync the header from include/uapi/linux/bpf.h.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-03 16:49:20 -07:00
Quentin Monnet a56497d3d5 bpf: update bpf.h uapi header for tools
Bring fixes for eBPF helper documentation formatting to bpf.h under
tools/ as well.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-30 13:53:12 +02:00
Andrey Ignatov 96871b9f6f bpf: Sync bpf.h to tools/
The patch syncs bpf.h to tools/.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-04-29 08:56:31 -07:00
Yonghong Song de2ff05f48 tools/bpf: add bpf_get_stack helper to tools headers
The tools header file bpf.h is synced with kernel uapi bpf.h.
The new helper is also added to bpf_helpers.h.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-04-29 08:45:53 -07:00
Quentin Monnet 9cde0c8892 bpf: update bpf.h uapi header for tools
Update tools/include/uapi/linux/bpf.h file in order to reflect the
changes for BPF helper functions documentation introduced in previous
commits.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:21:59 +02:00
Jiri Olsa fb6ef42b5c tools, bpf: Sync bpf.h uapi header
Syncing the bpf.h uapi header with tools.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-26 22:38:38 +02:00
John Fastabend 16962b2404 bpf: sockmap, add selftests
This adds a new test program test_sockmap which is the old sample
sockmap program. By moving the sample program here we can now run it
as part of the self tests suite. To support this a populate_progs()
routine is added to load programs and maps which was previously done
with load_bpf_file(). This is needed because self test libs do not
provide a similar routine. Also we now use the cgroup_helpers
routines to manage cgroup use instead of manually creating one and
supplying it to the CLI.

Notice we keep the CLI around though because it is useful for dbg
and specialized testing.

To run use ./test_sockmap and the result should be,

Summary 660 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25 00:06:20 +02:00
Eyal Birger 29a36f9eef samples/bpf: extend test_tunnel_bpf.sh with xfrm state test
Add a test for fetching xfrm state parameters from a tc program running
on ingress.

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24 22:26:58 +02:00
Martin KaFai Lau 3bd86a8409 bpf: btf: Sync bpf.h and btf.h to tools/
This patch sync up the bpf.h and btf.h to tools/

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-19 21:46:25 +02:00
Nikita V. Shirokov 0367d0a294 bpf: adding tests for bpf_xdp_adjust_tail
adding selftests for bpf_xdp_adjust_tail helper. in this synthetic test
we are testing that 1) if data_end < data helper will return EINVAL
2) for normal use case packet's length would be reduced.

Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-18 23:34:17 +02:00
Ingo Molnar e2f73a1828 tools/headers: Synchronize kernel ABI headers, v4.17-rc1
Sync the following tooling headers with the latest kernel version:

  tools/arch/arm/include/uapi/asm/kvm.h
    - New ABI: KVM_REG_ARM_*

  tools/arch/x86/include/asm/required-features.h
    - Removal of NEED_LA57 dependency

  tools/arch/x86/include/uapi/asm/kvm.h
    - New KVM ABI: KVM_SYNC_X86_*

  tools/include/uapi/asm-generic/mman-common.h
    - New ABI: MAP_FIXED_NOREPLACE flag

  tools/include/uapi/linux/bpf.h
    - New ABI: BPF_F_SEQ_NUMBER functions

  tools/include/uapi/linux/if_link.h
    - New ABI: IFLA tun and rmnet support

  tools/include/uapi/linux/kvm.h
    - New ABI: hyperv eventfd and CONN_ID_MASK support plus header cleanups

  tools/include/uapi/sound/asound.h
    - New ABI: SNDRV_PCM_FORMAT_FIRST PCM format specifier

  tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
    - The x86 system call table description changed due to the ptregs changes and the renames, in:

	d5a00528b58c: syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()
	5ac9efa3c50d: syscalls/core, syscalls/x86: Clean up compat syscall stub naming convention
	ebeb8c82ffaf: syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32

Also fix the x86 syscall table warning:

  -Warning: Kernel ABI header at 'tools/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
  +Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'

None of these changes impact existing tooling code, so we only have to copy the kernel version.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Brian Robbins <brianrob@microsoft.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com> <dvyukov@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Li Zhijian <lizhijian@cn.fujitsu.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Takuya Yamamoto <tkydevel@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lkml.kernel.org/r/20180416064024.ofjtrz5yuu3ykhvl@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-17 09:47:39 -03:00
Andrey Ignatov 1d436885b2 selftests/bpf: Selftest for sys_bind post-hooks.
Add selftest for attach types `BPF_CGROUP_INET4_POST_BIND` and
`BPF_CGROUP_INET6_POST_BIND`.

The main things tested are:
* prog load behaves as expected (valid/invalid accesses in prog);
* prog attach behaves as expected (load- vs attach-time attach types);
* `BPF_CGROUP_INET_SOCK_CREATE` can be attached in a backward compatible
  way;
* post-hooks return expected result and errno.

Example:
  # ./test_sock
  Test case: bind4 load with invalid access: src_ip6 .. [PASS]
  Test case: bind4 load with invalid access: mark .. [PASS]
  Test case: bind6 load with invalid access: src_ip4 .. [PASS]
  Test case: sock_create load with invalid access: src_port .. [PASS]
  Test case: sock_create load w/o expected_attach_type (compat mode) ..
  [PASS]
  Test case: sock_create load w/ expected_attach_type .. [PASS]
  Test case: attach type mismatch bind4 vs bind6 .. [PASS]
  Test case: attach type mismatch bind6 vs bind4 .. [PASS]
  Test case: attach type mismatch default vs bind4 .. [PASS]
  Test case: attach type mismatch bind6 vs sock_create .. [PASS]
  Test case: bind4 reject all .. [PASS]
  Test case: bind6 reject all .. [PASS]
  Test case: bind6 deny specific IP & port .. [PASS]
  Test case: bind4 allow specific IP & port .. [PASS]
  Test case: bind4 allow all .. [PASS]
  Test case: bind6 allow all .. [PASS]
  Summary: 16 PASSED, 0 FAILED

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-31 02:16:40 +02:00
Andrey Ignatov 622adafb2a selftests/bpf: Selftest for sys_connect hooks
Add selftest for BPF_CGROUP_INET4_CONNECT and BPF_CGROUP_INET6_CONNECT
attach types.

Try to connect(2) to specified IP:port and test that:
* remote IP:port pair is overridden;
* local end of connection is bound to specified IP.

All combinations of IPv4/IPv6 and TCP/UDP are tested.

Example:
  # tcpdump -pn -i lo -w connect.pcap 2>/dev/null &
  [1] 478
  # strace -qqf -e connect -o connect.trace ./test_sock_addr.sh
  Wait for testing IPv4/IPv6 to become available ... OK
  Load bind4 with invalid type (can pollute stderr) ... REJECTED
  Load bind4 with valid type ... OK
  Attach bind4 with invalid type ... REJECTED
  Attach bind4 with valid type ... OK
  Load connect4 with invalid type (can pollute stderr) libbpf: load bpf \
    program failed: Permission denied
  libbpf: -- BEGIN DUMP LOG ---
  libbpf:
  0: (b7) r2 = 23569
  1: (63) *(u32 *)(r1 +24) = r2
  2: (b7) r2 = 16777343
  3: (63) *(u32 *)(r1 +4) = r2
  invalid bpf_context access off=4 size=4
  [ 1518.404609] random: crng init done

  libbpf: -- END LOG --
  libbpf: failed to load program 'cgroup/connect4'
  libbpf: failed to load object './connect4_prog.o'
  ... REJECTED
  Load connect4 with valid type ... OK
  Attach connect4 with invalid type ... REJECTED
  Attach connect4 with valid type ... OK
  Test case #1 (IPv4/TCP):
          Requested: bind(192.168.1.254, 4040) ..
             Actual: bind(127.0.0.1, 4444)
          Requested: connect(192.168.1.254, 4040) from (*, *) ..
             Actual: connect(127.0.0.1, 4444) from (127.0.0.4, 56068)
  Test case #2 (IPv4/UDP):
          Requested: bind(192.168.1.254, 4040) ..
             Actual: bind(127.0.0.1, 4444)
          Requested: connect(192.168.1.254, 4040) from (*, *) ..
             Actual: connect(127.0.0.1, 4444) from (127.0.0.4, 56447)
  Load bind6 with invalid type (can pollute stderr) ... REJECTED
  Load bind6 with valid type ... OK
  Attach bind6 with invalid type ... REJECTED
  Attach bind6 with valid type ... OK
  Load connect6 with invalid type (can pollute stderr) libbpf: load bpf \
    program failed: Permission denied
  libbpf: -- BEGIN DUMP LOG ---
  libbpf:
  0: (b7) r6 = 0
  1: (63) *(u32 *)(r1 +12) = r6
  invalid bpf_context access off=12 size=4

  libbpf: -- END LOG --
  libbpf: failed to load program 'cgroup/connect6'
  libbpf: failed to load object './connect6_prog.o'
  ... REJECTED
  Load connect6 with valid type ... OK
  Attach connect6 with invalid type ... REJECTED
  Attach connect6 with valid type ... OK
  Test case #3 (IPv6/TCP):
          Requested: bind(face:b00c:1234:5678::abcd, 6060) ..
             Actual: bind(::1, 6666)
          Requested: connect(face:b00c:1234:5678::abcd, 6060) from (*, *)
             Actual: connect(::1, 6666) from (::6, 37458)
  Test case #4 (IPv6/UDP):
          Requested: bind(face:b00c:1234:5678::abcd, 6060) ..
             Actual: bind(::1, 6666)
          Requested: connect(face:b00c:1234:5678::abcd, 6060) from (*, *)
             Actual: connect(::1, 6666) from (::6, 39315)
  ### SUCCESS
  # egrep 'connect\(.*AF_INET' connect.trace | \
  > egrep -vw 'htons\(1025\)' | fold -b -s -w 72
  502   connect(7, {sa_family=AF_INET, sin_port=htons(4040),
  sin_addr=inet_addr("192.168.1.254")}, 128) = 0
  502   connect(8, {sa_family=AF_INET, sin_port=htons(4040),
  sin_addr=inet_addr("192.168.1.254")}, 128) = 0
  502   connect(9, {sa_family=AF_INET6, sin6_port=htons(6060),
  inet_pton(AF_INET6, "face:b00c:1234:5678::abcd", &sin6_addr),
  sin6_flowinfo=0, sin6_scope_id=0}, 128) = 0
  502   connect(10, {sa_family=AF_INET6, sin6_port=htons(6060),
  inet_pton(AF_INET6, "face:b00c:1234:5678::abcd", &sin6_addr),
  sin6_flowinfo=0, sin6_scope_id=0}, 128) = 0
  # fg
  tcpdump -pn -i lo -w connect.pcap 2> /dev/null
  # tcpdump -r connect.pcap -n tcp | cut -c 1-72
  reading from file connect.pcap, link-type EN10MB (Ethernet)
  17:57:40.383533 IP 127.0.0.4.56068 > 127.0.0.1.4444: Flags [S], seq 1333
  17:57:40.383566 IP 127.0.0.1.4444 > 127.0.0.4.56068: Flags [S.], seq 112
  17:57:40.383589 IP 127.0.0.4.56068 > 127.0.0.1.4444: Flags [.], ack 1, w
  17:57:40.384578 IP 127.0.0.1.4444 > 127.0.0.4.56068: Flags [R.], seq 1,
  17:57:40.403327 IP6 ::6.37458 > ::1.6666: Flags [S], seq 406513443, win
  17:57:40.403357 IP6 ::1.6666 > ::6.37458: Flags [S.], seq 2448389240, ac
  17:57:40.403376 IP6 ::6.37458 > ::1.6666: Flags [.], ack 1, win 342, opt
  17:57:40.404263 IP6 ::1.6666 > ::6.37458: Flags [R.], seq 1, ack 1, win

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-31 02:16:14 +02:00
Andrey Ignatov e50b0a6f08 selftests/bpf: Selftest for sys_bind hooks
Add selftest to work with bpf_sock_addr context from
`BPF_PROG_TYPE_CGROUP_SOCK_ADDR` programs.

Try to bind(2) on IP:port and apply:
* loads to make sure context can be read correctly, including narrow
  loads (byte, half) for IP and full-size loads (word) for all fields;
* stores to those fields allowed by verifier.

All combination from IPv4/IPv6 and TCP/UDP are tested.

Both scenarios are tested:
* valid programs can be loaded and attached;
* invalid programs can be neither loaded nor attached.

Test passes when expected data can be read from context in the
BPF-program, and after the call to bind(2) socket is bound to IP:port
pair that was written by BPF-program to the context.

Example:
  # ./test_sock_addr
  Attached bind4 program.
  Test case #1 (IPv4/TCP):
          Requested: bind(192.168.1.254, 4040) ..
             Actual: bind(127.0.0.1, 4444)
  Test case #2 (IPv4/UDP):
          Requested: bind(192.168.1.254, 4040) ..
             Actual: bind(127.0.0.1, 4444)
  Attached bind6 program.
  Test case #3 (IPv6/TCP):
          Requested: bind(face:b00c:1234:5678::abcd, 6060) ..
             Actual: bind(::1, 6666)
  Test case #4 (IPv6/UDP):
          Requested: bind(face:b00c:1234:5678::abcd, 6060) ..
             Actual: bind(::1, 6666)
  ### SUCCESS

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-31 02:15:30 +02:00
Andrey Ignatov d7be143b67 libbpf: Support expected_attach_type at prog load
Support setting `expected_attach_type` at prog load time in both
`bpf/bpf.h` and `bpf/libbpf.h`.

Since both headers already have API to load programs, new functions are
added not to break backward compatibility for existing ones:
* `bpf_load_program_xattr()` is added to `bpf/bpf.h`;
* `bpf_prog_load_xattr()` is added to `bpf/libbpf.h`.

Both new functions accept structures, `struct bpf_load_program_attr` and
`struct bpf_prog_load_attr` correspondingly, where new fields can be
added in the future w/o changing the API.

Standard `_xattr` suffix is used to name the new API functions.

Since `bpf_load_program_name()` is not used as heavily as
`bpf_load_program()`, it was removed in favor of more generic
`bpf_load_program_xattr()`.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-31 02:15:05 +02:00
Alexei Starovoitov a0fe3e574b libbpf: add bpf_raw_tracepoint_open helper
add bpf_raw_tracepoint_open(const char *name, int prog_fd) api to libbpf

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-28 22:55:19 +02:00
John Fastabend 0dcbbf6785 bpf: sockmap sample test for bpf_msg_pull_data
This adds an option to test the msg_pull_data helper. This
uses two options txmsg_start and txmsg_end to let the user
specify start and end bytes to pull.

The options can be used with txmsg_apply, txmsg_cork options
as well as with any of the basic tests, txmsg, txmsg_redir and
txmsg_drop (plus noisy variants) to run pull_data inline with
those tests. By giving user direct control over the variables
we can easily do negative testing as well as positive tests.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:41 +01:00
John Fastabend 468b3fdea8 bpf: sockmap sample support for bpf_msg_cork_bytes()
Add sample application support for the bpf_msg_cork_bytes helper. This
lets the user specify how many bytes each verdict should apply to.

Similar to apply_bytes() tests these can be run as a stand-alone test
when used without other options or inline with other tests by using
the txmsg_cork option along with any of the basic tests txmsg,
txmsg_redir, txmsg_drop.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:40 +01:00
John Fastabend 4c4c3c276c bpf: sockmap sample, add option to attach SK_MSG program
Add sockmap option to use SK_MSG program types.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:40 +01:00
John Fastabend 82a8616889 bpf: add map tests for BPF_PROG_TYPE_SK_MSG
Add map tests to attach BPF_PROG_TYPE_SK_MSG types to a sockmap.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:39 +01:00
Song Liu 81f77fd0de bpf: add selftest for stackmap with BPF_F_STACK_BUILD_ID
test_stacktrace_build_id() is added. It accesses tracepoint urandom_read
with "dd" and "urandom_read" and gathers stack traces. Then it reads the
stack traces from the stackmap.

urandom_read is a statically link binary that reads from /dev/urandom.
test_stacktrace_build_id() calls readelf to read build ID of urandom_read
and compares it with build ID from the stackmap.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-15 01:10:02 +01:00
Lawrence Brakmo d6d4f60c3a bpf: add selftest for tcpbpf
Added a selftest for tcpbpf (sock_ops) that checks that the appropriate
callbacks occured and that it can access tcp_sock fields and that their
values are correct.

Run with command: ./test_tcpbpf_user
Adding the flag "-d" will show why it did not pass.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:15 -08:00
Jakub Kicinski 52775b33bb bpf: offload: report device information about offloaded maps
Tell user space about device on which the map was created.
Unfortunate reality of user ABI makes sharing this code
with program offload difficult but the information is the
same.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18 22:54:25 +01:00
Jesper Dangaard Brouer e7b2823a58 bpf: Sync kernel ABI header with tooling header
Update tools/include/uapi/linux/bpf.h to bring it in sync with
include/uapi/linux/bpf.h.  The listed commits forgot to update it.

Fixes: 02dd3291b2 ("bpf: finally expose xdp_rxq_info to XDP bpf-programs")
Fixes: f19397a5c6 ("bpf: Add access to snd_cwnd and others in sock_ops")
Fixes: 06ef0ccb5a ("bpf/cgroup: fix a verification error for a CGROUP_DEVICE type prog")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18 22:17:08 +01:00
Jakub Kicinski a38845729e bpf: offload: add map offload infrastructure
BPF map offload follow similar path to program offload.  At creation
time users may specify ifindex of the device on which they want to
create the map.  Map will be validated by the kernel's
.map_alloc_check callback and device driver will be called for the
actual allocation.  Map will have an empty set of operations
associated with it (save for alloc and free callbacks).  The real
device callbacks are kept in map->offload->dev_ops because they
have slightly different signatures.  Map operations are called in
process context so the driver may communicate with HW freely,
msleep(), wait() etc.

Map alloc and free callbacks are muxed via existing .ndo_bpf, and
are always called with rtnl lock held.  Maps and programs are
guaranteed to be destroyed before .ndo_uninit (i.e. before
unregister_netdev() returns).  Map callbacks are invoked with
bpf_devs_lock *read* locked, drivers must take care of exclusive
locking if necessary.

All offload-specific branches are marked with unlikely() (through
bpf_map_is_dev_bound()), given that branch penalty will be
negligible compared to IO anyway, and we don't want to penalize
SW path unnecessarily.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-14 23:36:30 +01:00
Jakub Kicinski 675fc275a3 bpf: offload: report device information for offloaded programs
Report to the user ifindex and namespace information of offloaded
programs.  If device has disappeared return -ENODEV.  Specify the
namespace using dev/inode combination.

CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 16:12:23 +01:00
Alexei Starovoitov 48cca7e44f libbpf: add support for bpf_call
- recognize relocation emitted by llvm
- since all regular function will be kept in .text section and llvm
  takes care of pc-relative offsets in bpf_call instruction
  simply copy all of .text to relevant program section while adjusting
  bpf_call instructions in program section to point to newly copied
  body of instructions from .text
- do so for all programs in the elf file
- set all programs types to the one passed to bpf_prog_load()

Note for elf files with multiple programs that use different
functions in .text section we need to do 'linker' style logic.
This work is still TBD

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:35 +01:00
Josef Bacik 965de87e54 samples/bpf: add a test for bpf_override_return
This adds a basic test for bpf_override_return to verify it works.  We
override the main function for mounting a btrfs fs so it'll return
-ENOMEM and then make sure that trying to mount a btrfs fs will fail.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12 09:02:40 -08:00
Jakub Kicinski 51aa423959 bpftool: revert printing program device bound info
This reverts commit 928631e054 ("bpftool: print program device bound
info").  We will remove this API and redo it right in -next.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21 00:37:35 +01:00
Jakub Kicinski 1f6f4cb7ba bpf: offload: rename the ifindex field
bpf_target_prog seems long and clunky, rename it to prog_ifindex.
We don't want to call this field just ifindex, because maps
may need a similar field in the future and bpf_attr members for
programs and maps are unnamed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21 00:37:35 +01:00
David S. Miller f3edacbd69 bpf: Revert bpf_overrid_function() helper changes.
NACK'd by x86 maintainer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-11 18:24:55 +09:00
Josef Bacik eafb3401fa samples/bpf: add a test for bpf_override_return
This adds a basic test for bpf_override_return to verify it works.  We
override the main function for mounting a btrfs fs so it'll return
-ENOMEM and then make sure that trying to mount a btrfs fs will fail.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-11 12:18:06 +09:00
David S. Miller 4dc6758d78 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Simple cases of overlapping changes in the packet scheduler.

Must easier to resolve this time.

Which probably means that I screwed it up somehow.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-10 10:00:18 +09:00
Roman Gushchin ebc614f687 bpf, cgroup: implement eBPF-based device controller for cgroup v2
Cgroup v2 lacks the device controller, provided by cgroup v1.
This patch adds a new eBPF program type, which in combination
of previously added ability to attach multiple eBPF programs
to a cgroup, will provide a similar functionality, but with some
additional flexibility.

This patch introduces a BPF_PROG_TYPE_CGROUP_DEVICE program type.
A program takes major and minor device numbers, device type
(block/character) and access type (mknod/read/write) as parameters
and returns an integer which defines if the operation should be
allowed or terminated with -EPERM.

Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05 23:26:51 +09:00
Jakub Kicinski 928631e054 bpftool: print program device bound info
If program is bound to a device, print the name of the relevant
interface or unknown if the netdev has since been removed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05 22:26:19 +09:00
Ingo Molnar fb7df12d64 tools/headers: Synchronize kernel ABI headers
After the SPDX license tags were added a number of tooling headers got out of
sync with their kernel variants, generating lots of build warnings.

Sync them:

 - tools/arch/x86/include/asm/disabled-features.h,
   tools/arch/x86/include/asm/required-features.h,
   tools/include/linux/hash.h:

     Remove the SPDX tag where the kernel version does not have it.

 - tools/include/asm-generic/bitops/__fls.h,
   tools/include/asm-generic/bitops/arch_hweight.h,
   tools/include/asm-generic/bitops/const_hweight.h,
   tools/include/asm-generic/bitops/fls.h,
   tools/include/asm-generic/bitops/fls64.h,
   tools/include/uapi/asm-generic/ioctls.h,
   tools/include/uapi/asm-generic/mman-common.h,
   tools/include/uapi/sound/asound.h,
   tools/include/uapi/linux/kvm.h,
   tools/include/uapi/linux/perf_event.h,
   tools/include/uapi/linux/sched.h,
   tools/include/uapi/linux/vhost.h,
   tools/include/uapi/sound/asound.h:

     Add the SPDX tag of the respective kernel header.

 - tools/include/uapi/linux/bpf_common.h,
   tools/include/uapi/linux/fcntl.h,
   tools/include/uapi/linux/hw_breakpoint.h,
   tools/include/uapi/linux/mman.h,
   tools/include/uapi/linux/stat.h,

     Change the tag to the kernel header version:

       -/* SPDX-License-Identifier: GPL-2.0 */
       +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */

Also sync other header details:

 - include/uapi/sound/asound.h:

     Fix pointless end of line whitespace noise the header grew in this cycle.

 - tools/arch/x86/lib/memcpy_64.S:

     Sync the code and add tools/include/asm/export.h with dummy wrappers
     to support building the kernel side code in a tooling header environment.

 - tools/include/uapi/asm-generic/mman.h,
   tools/include/uapi/linux/bpf.h:

     Sync other details that don't impact tooling's use of the ABIs.

Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-04 09:27:46 +01:00
David S. Miller ed29668d1a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Smooth Cong Wang's bug fix into 'net-next'.  Basically put
the bulk of the tcf_block_put() logic from 'net' into
tcf_block_put_ext(), but after the offload unbind.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-02 15:23:39 +09:00
John Fastabend 04686ef299 bpf: remove SK_REDIRECT from UAPI
Now that SK_REDIRECT is no longer a valid return code. Remove it
from the UAPI completely. Then do a namespace remapping internal
to sockmap so SK_REDIRECT is no longer externally visible.

Patchs primary change is to do a namechange from SK_REDIRECT to
__SK_REDIRECT

Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01 11:43:50 +09:00
David S. Miller e1ea2f9856 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several conflicts here.

NFP driver bug fix adding nfp_netdev_is_nfp_repr() check to
nfp_fl_output() needed some adjustments because the code block is in
an else block now.

Parallel additions to net/pkt_cls.h and net/sch_generic.h

A bug fix in __tcp_retransmit_skb() conflicted with some of
the rbtree changes in net-next.

The tc action RCU callback fixes in 'net' had some overlap with some
of the recent tcf_block reworking.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-30 21:09:24 +09:00
John Fastabend bfa640757e bpf: rename sk_actions to align with bpf infrastructure
Recent additions to support multiple programs in cgroups impose
a strict requirement, "all yes is yes, any no is no". To enforce
this the infrastructure requires the 'no' return code, SK_DROP in
this case, to be 0.

To apply these rules to SK_SKB program types the sk_actions return
codes need to be adjusted.

This fix adds SK_PASS and makes 'SK_DROP = 0'. Finally, remove
SK_ABORTED to remove any chance that the API may allow aborted
program flows to be passed up the stack. This would be incorrect
behavior and allow programs to break existing policies.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 11:18:48 +09:00
Alexei Starovoitov e27afb84b4 selftests/bpf: fix broken build of test_maps
fix multiple build errors and warnings

1.
test_maps.c: In function ‘test_map_rdonly’:
test_maps.c:1051:30: error: ‘BPF_F_RDONLY’ undeclared (first use in this function)
        MAP_SIZE, map_flags | BPF_F_RDONLY);

2.
test_maps.c:1048:6: warning: unused variable ‘i’ [-Wunused-variable]
  int i, fd, key = 0, value = 0;

3.
test_maps.c:1087:2: error: called object is not a function or function pointer
  assert(bpf_map_lookup_elem(fd, &key, &value) == -1 && errno == EPERM);

4.
./bpf_helpers.h:72:11: error: use of undeclared identifier 'BPF_FUNC_getsockopt'
        (void *) BPF_FUNC_getsockopt;

Fixes: e043325b30 ("bpf: Add tests for eBPF file mode")
Fixes: 6e71b04a82 ("bpf: Add file mode configuration into bpf maps")
Fixes: cd86d1fd21 ("bpf: Adding helper function bpf_getsockops")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-23 01:06:31 +01:00
David S. Miller f8ddadc4db Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
There were quite a few overlapping sets of changes here.

Daniel's bug fix for off-by-ones in the new BPF branch instructions,
along with the added allowances for "data_end > ptr + x" forms
collided with the metadata additions.

Along with those three changes came veritifer test cases, which in
their final form I tried to group together properly.  If I had just
trimmed GIT's conflict tags as-is, this would have split up the
meta tests unnecessarily.

In the socketmap code, a set of preemption disabling changes
overlapped with the rename of bpf_compute_data_end() to
bpf_compute_data_pointers().

Changes were made to the mv88e6060.c driver set addr method
which got removed in net-next.

The hyperv transport socket layer had a locking change in 'net'
which overlapped with a change of socket state macro usage
in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 13:39:14 +01:00
Linus Torvalds b5ac3beb5a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "A little more than usual this time around. Been travelling, so that is
  part of it.

  Anyways, here are the highlights:

   1) Deal with memcontrol races wrt. listener dismantle, from Eric
      Dumazet.

   2) Handle page allocation failures properly in nfp driver, from Jaku
      Kicinski.

   3) Fix memory leaks in macsec, from Sabrina Dubroca.

   4) Fix crashes in pppol2tp_session_ioctl(), from Guillaume Nault.

   5) Several fixes in bnxt_en driver, including preventing potential
      NVRAM parameter corruption from Michael Chan.

   6) Fix for KRACK attacks in wireless, from Johannes Berg.

   7) rtnetlink event generation fixes from Xin Long.

   8) Deadlock in mlxsw driver, from Ido Schimmel.

   9) Disallow arithmetic operations on context pointers in bpf, from
      Jakub Kicinski.

  10) Missing sock_owned_by_user() check in sctp_icmp_redirect(), from
      Xin Long.

  11) Only TCP is supported for sockmap, make that explicit with a
      check, from John Fastabend.

  12) Fix IP options state races in DCCP and TCP, from Eric Dumazet.

  13) Fix panic in packet_getsockopt(), also from Eric Dumazet.

  14) Add missing locked in hv_sock layer, from Dexuan Cui.

  15) Various aquantia bug fixes, including several statistics handling
      cures. From Igor Russkikh et al.

  16) Fix arithmetic overflow in devmap code, from John Fastabend.

  17) Fix busted socket memory accounting when we get a fault in the tcp
      zero copy paths. From Willem de Bruijn.

  18) Don't leave opt->tot_len uninitialized in ipv6, from Eric Dumazet"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
  stmmac: Don't access tx_q->dirty_tx before netif_tx_lock
  ipv6: flowlabel: do not leave opt->tot_len with garbage
  of_mdio: Fix broken PHY IRQ in case of probe deferral
  textsearch: fix typos in library helpers
  rxrpc: Don't release call mutex on error pointer
  net: stmmac: Prevent infinite loop in get_rx_timestamp_status()
  net: stmmac: Fix stmmac_get_rx_hwtstamp()
  net: stmmac: Add missing call to dev_kfree_skb()
  mlxsw: spectrum_router: Configure TIGCR on init
  mlxsw: reg: Add Tunneling IPinIP General Configuration Register
  net: ethtool: remove error check for legacy setting transceiver type
  soreuseport: fix initialization race
  net: bridge: fix returning of vlan range op errors
  sock: correct sk_wmem_queued accounting on efault in tcp zerocopy
  bpf: add test cases to bpf selftests to cover all access tests
  bpf: fix pattern matches for direct packet access
  bpf: fix off by one for range markings with L{T, E} patterns
  bpf: devmap fix arithmetic overflow in bitmap_size calculation
  net: aquantia: Bad udp rate on default interrupt coalescing
  net: aquantia: Enable coalescing management via ethtool interface
  ...
2017-10-21 22:44:48 -04:00
John Fastabend 34f79502bb bpf: avoid preempt enable/disable in sockmap using tcp_skb_cb region
SK_SKB BPF programs are run from the socket/tcp context but early in
the stack before much of the TCP metadata is needed in tcp_skb_cb. So
we can use some unused fields to place BPF metadata needed for SK_SKB
programs when implementing the redirect function.

This allows us to drop the preempt disable logic. It does however
require an API change so sk_redirect_map() has been updated to
additionally provide ctx_ptr to skb. Note, we do however continue to
disable/enable preemption around actual BPF program running to account
for map updates.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:01:29 +01:00
Jesper Dangaard Brouer 6710e11269 bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP
The 'cpumap' is primarily used as a backend map for XDP BPF helper
call bpf_redirect_map() and XDP_REDIRECT action, like 'devmap'.

This patch implement the main part of the map.  It is not connected to
the XDP redirect system yet, and no SKB allocation are done yet.

The main concern in this patch is to ensure the datapath can run
without any locking.  This adds complexity to the setup and tear-down
procedure, which assumptions are extra carefully documented in the
code comments.

V2:
 - make sure array isn't larger than NR_CPUS
 - make sure CPUs added is a valid possible CPU

V3: fix nitpicks from Jakub Kicinski <kubakici@wp.pl>

V5:
 - Restrict map allocation to root / CAP_SYS_ADMIN
 - WARN_ON_ONCE if queue is not empty on tear-down
 - Return -EPERM on memlock limit instead of -ENOMEM
 - Error code in __cpu_map_entry_alloc() also handle ptr_ring_cleanup()
 - Moved cpu_map_enqueue() to next patch

V6: all notice by Daniel Borkmann
 - Fix err return code in cpu_map_alloc() introduced in V5
 - Move cpu_possible() check after max_entries boundary check
 - Forbid usage initially in check_map_func_compatibility()

V7:
 - Fix alloc error path spotted by Daniel Borkmann
 - Did stress test adding+removing CPUs from the map concurrently
 - Fixed refcnt issue on cpu_map_entry, kthread started too soon
 - Make sure packets are flushed during tear-down, involved use of
   rcu_barrier() and kthread_run only exit after queue is empty
 - Fix alloc error path in __cpu_map_entry_alloc() for ptr_ring

V8:
 - Nitpicking comments and gramma by Edward Cree
 - Fix missing semi-colon introduced in V7 due to rebasing
 - Move struct bpf_cpu_map_entry members cpu+map_id to tracepoint patch

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:12:18 +01:00
Arnaldo Carvalho de Melo aa7b4e02b3 tools include uapi bpf.h: Sync kernel ABI header with tooling header
Silences the checker:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h'

The 90caccdd8c ("bpf: fix bpf_tail_call() x64 JIT") cset only updated
a comment in uapi/bpf.h.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-rwx2cqbf0x1lwa1krsr6e6hd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-10-09 15:55:45 -03:00
Martin KaFai Lau 067cae4777 bpf: Use char in prog and map name
Instead of u8, use char for prog and map name.  It can avoid the
userspace tool getting compiler's signess warning.  The
bpf_prog_aux, bpf_map, bpf_attr, bpf_prog_info and
bpf_map_info are changed.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-07 23:29:39 +01:00
Yonghong Song 81b9cf8028 bpf: add a test case for helper bpf_perf_prog_read_value
The bpf sample program trace_event is enhanced to use the new
helper to print out enabled/running time.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-07 23:05:57 +01:00
Yonghong Song 020a32d958 bpf: add a test case for helper bpf_perf_event_read_value
The bpf sample program tracex6 is enhanced to use the new
helper to read enabled/running time as well.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-07 23:05:57 +01:00
Alexei Starovoitov defd9c476f libbpf: sync bpf.h
tools/include/uapi/linux/bpf.h got out of sync with actual kernel header.
Update it.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 16:05:05 -07:00
Martin KaFai Lau 88cda1c9da bpf: libbpf: Provide basic API support to specify BPF obj name
This patch extends the libbpf to provide API support to
allow specifying BPF object name.

In tools/lib/bpf/libbpf, the C symbol of the function
and the map is used.  Regarding section name, all maps are
under the same section named "maps".  Hence, section name
is not a good choice for map's name.  To be consistent with
map, bpf_prog also follows and uses its function symbol as
the prog's name.

This patch adds logic to collect function's symbols in libbpf.
There is existing codes to collect the map's symbols and no change
is needed.

The bpf_load_program_name() and bpf_map_create_name() are
added to take the name argument.  For the other bpf_map_create_xxx()
variants, a name argument is directly added to them.

In samples/bpf, bpf_load.c in particular, the symbol is also
used as the map's name and the map symbols has already been
collected in the existing code.  For bpf_prog, bpf_load.c does
not collect the function symbol name.  We can consider to collect
them later if there is a need to continue supporting the bpf_load.c.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-29 06:17:05 +01:00
Daniel Borkmann ac29991ba1 bpf: update bpf.h uapi header for tools
Looks like a couple of updates missed to get carried into tools/include/uapi/,
so copy the bpf.h header as usual to pull in latest updates.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 13:36:44 -07:00
Ingo Molnar 549a397652 tools include: Sync kernel ABI headers with tooling headers
Time for a sync with ABI/uapi headers with the upcoming v4.14 kernel.

None of the ABI changes require any source code level changes to our
existing in-kernel tooling code:

  - tools/arch/s390/include/uapi/asm/kvm.h:

      New KVM_S390_VM_TOD_EXT ABI, not used by in-kernel tooling.

  - tools/arch/x86/include/asm/cpufeatures.h:
    tools/arch/x86/include/asm/disabled-features.h:

      New PCID, SME and VGIF x86 CPU feature bits defined.

  - tools/include/asm-generic/hugetlb_encode.h:
    tools/include/uapi/asm-generic/mman-common.h:
    tools/include/uapi/linux/mman.h:

      Two new madvise() flags, plus a hugetlb system call mmap flags
      restructuring/extension changes.

  - tools/include/uapi/drm/drm.h:
    tools/include/uapi/drm/i915_drm.h:

      New drm_syncobj_create flags definitions, new drm_syncobj_wait
      and drm_syncobj_array ABIs. DRM_I915_PERF_* calls and a new
      I915_PARAM_HAS_EXEC_FENCE_ARRAY ABI for the Intel driver.

  - tools/include/uapi/linux/bpf.h:

      New bpf_sock fields (::mark and ::priority), new XDP_REDIRECT
      action, new kvm_ppc_smmu_info fields (::data_keys, instr_keys)

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/20170913073823.lxmi4c7ejqlfabjx@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-09-25 10:39:44 -03:00
Linus Torvalds aae3dbb477 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Support ipv6 checksum offload in sunvnet driver, from Shannon
    Nelson.

 2) Move to RB-tree instead of custom AVL code in inetpeer, from Eric
    Dumazet.

 3) Allow generic XDP to work on virtual devices, from John Fastabend.

 4) Add bpf device maps and XDP_REDIRECT, which can be used to build
    arbitrary switching frameworks using XDP. From John Fastabend.

 5) Remove UFO offloads from the tree, gave us little other than bugs.

 6) Remove the IPSEC flow cache, from Florian Westphal.

 7) Support ipv6 route offload in mlxsw driver.

 8) Support VF representors in bnxt_en, from Sathya Perla.

 9) Add support for forward error correction modes to ethtool, from
    Vidya Sagar Ravipati.

10) Add time filter for packet scheduler action dumping, from Jamal Hadi
    Salim.

11) Extend the zerocopy sendmsg() used by virtio and tap to regular
    sockets via MSG_ZEROCOPY. From Willem de Bruijn.

12) Significantly rework value tracking in the BPF verifier, from Edward
    Cree.

13) Add new jump instructions to eBPF, from Daniel Borkmann.

14) Rework rtnetlink plumbing so that operations can be run without
    taking the RTNL semaphore. From Florian Westphal.

15) Support XDP in tap driver, from Jason Wang.

16) Add 32-bit eBPF JIT for ARM, from Shubham Bansal.

17) Add Huawei hinic ethernet driver.

18) Allow to report MD5 keys in TCP inet_diag dumps, from Ivan
    Delalande.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1780 commits)
  i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq
  i40e: avoid NVM acquire deadlock during NVM update
  drivers: net: xgene: Remove return statement from void function
  drivers: net: xgene: Configure tx/rx delay for ACPI
  drivers: net: xgene: Read tx/rx delay for ACPI
  rocker: fix kcalloc parameter order
  rds: Fix non-atomic operation on shared flag variable
  net: sched: don't use GFP_KERNEL under spin lock
  vhost_net: correctly check tx avail during rx busy polling
  net: mdio-mux: add mdio_mux parameter to mdio_mux_init()
  rxrpc: Make service connection lookup always check for retry
  net: stmmac: Delete dead code for MDIO registration
  gianfar: Fix Tx flow control deactivation
  cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6
  cxgb4: Fix pause frame count in t4_get_port_stats
  cxgb4: fix memory leak
  tun: rename generic_xdp to skb_xdp
  tun: reserve extra headroom only when XDP is set
  net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping
  net: dsa: bcm_sf2: Advertise number of egress queues
  ...
2017-09-06 14:45:08 -07:00
John Fastabend 464bc0fd62 bpf: convert sockmap field attach_bpf_fd2 to type
In the initial sockmap API we provided strparser and verdict programs
using a single attach command by extending the attach API with a the
attach_bpf_fd2 field.

However, if we add other programs in the future we will be adding a
field for every new possible type, attach_bpf_fd(3,4,..). This
seems a bit clumsy for an API. So lets push the programs using two
new type fields.

   BPF_SK_SKB_STREAM_PARSER
   BPF_SK_SKB_STREAM_VERDICT

This has the advantage of having a readable name and can easily be
extended in the future.

Updates to samples and sockmap included here also generalize tests
slightly to support upcoming patch for multiple map support.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Fixes: 174a79ff95 ("bpf: sockmap with sk redirect support")
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-28 11:13:21 -07:00
Martin KaFai Lau ad17d0e6c7 bpf: Allow numa selection in INNER_LRU_HASH_PREALLOC test of map_perf_test
This patch makes the needed changes to allow each process of
the INNER_LRU_HASH_PREALLOC test to provide its numa node id
when creating the lru map.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-19 21:35:43 -07:00
John Fastabend 69e8cc134b bpf: sockmap sample program
This program binds a program to a cgroup and then matches hard
coded IP addresses and adds these to a sockmap.

This will receive messages from the backend and send them to
the client.

     client:X <---> frontend:10000 client:X <---> backend:10001

To keep things simple this is only designed for 1:1 connections
using hard coded values. A more complete example would allow many
backends and clients.

To run,

 # sockmap <cgroup2_dir>

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-16 11:27:53 -07:00
Daniel Borkmann 92b31a9af7 bpf: add BPF_J{LT,LE,SLT,SLE} instructions
Currently, eBPF only understands BPF_JGT (>), BPF_JGE (>=),
BPF_JSGT (s>), BPF_JSGE (s>=) instructions, this means that
particularly *JLT/*JLE counterparts involving immediates need
to be rewritten from e.g. X < [IMM] by swapping arguments into
[IMM] > X, meaning the immediate first is required to be loaded
into a register Y := [IMM], such that then we can compare with
Y > X. Note that the destination operand is always required to
be a register.

This has the downside of having unnecessarily increased register
pressure, meaning complex program would need to spill other
registers temporarily to stack in order to obtain an unused
register for the [IMM]. Loading to registers will thus also
affect state pruning since we need to account for that register
use and potentially those registers that had to be spilled/filled
again. As a consequence slightly more stack space might have
been used due to spilling, and BPF programs are a bit longer
due to extra code involving the register load and potentially
required spill/fills.

Thus, add BPF_JLT (<), BPF_JLE (<=), BPF_JSLT (s<), BPF_JSLE (s<=)
counterparts to the eBPF instruction set. Modifying LLVM to
remove the NegateCC() workaround in a PoC patch at [1] and
allowing it to also emit the new instructions resulted in
cilium's BPF programs that are injected into the fast-path to
have a reduced program length in the range of 2-3% (e.g.
accumulated main and tail call sections from one of the object
file reduced from 4864 to 4729 insns), reduced complexity in
the range of 10-30% (e.g. accumulated sections reduced in one
of the cases from 116432 to 88428 insns), and reduced stack
usage in the range of 1-5% (e.g. accumulated sections from one
of the object files reduced from 824 to 784b).

The modification for LLVM will be incorporated in a backwards
compatible way. Plan is for LLVM to have i) a target specific
option to offer a possibility to explicitly enable the extension
by the user (as we have with -m target specific extensions today
for various CPU insns), and ii) have the kernel checked for
presence of the extensions and enable them transparently when
the user is selecting more aggressive options such as -march=native
in a bpf target context. (Other frontends generating BPF byte
code, e.g. ply can probe the kernel directly for its code
generation.)

  [1] https://github.com/borkmann/llvm/tree/bpf-insns

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 16:53:56 -07:00
William Tu 996139e801 selftests: bpf: add a test for XDP redirect
Add test for xdp_redirect by creating two namespaces with two
veth peers, then forward packets in-between.

Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08 18:12:50 -07:00
Arnaldo Carvalho de Melo f1d6cb2d8c tools headers: Fixup tools/include/uapi/linux/bpf.h copy of kernel ABI header
In 04df41e343 ("bpf: update tools/include/uapi/linux/bpf.h") the files
added in 40304b2a15 ("bpf: BPF support for sock_ops") were added to
tools/include, but not in a verbatim way, missing the comments, which
ends up triggering this warning when build tools/perf/:

  make: Entering directory '/home/acme/git/linux/tools/perf'
    BUILD:   Doing 'make -j4' parallel build
  Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h'

Make sure the the lines are equal, to fix the simple header copy
drift detector in tools/perf/.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Lawrence Brakmo <brakmo@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 04df41e343 ("bpf: update tools/include/uapi/linux/bpf.h")
Link: http://lkml.kernel.org/n/tip-z9qyyqht9qq3yyxu76sfy0dh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-31 23:05:00 -03:00
Arnaldo Carvalho de Melo d62c1d7213 tools headers: Fixup tools/include/uapi/linux/bpf.h copy of kernel ABI header
In 2be7e212d5 ("bpf: add bpf_skb_adjust_room helper") BPF_ADJ_ROOM_NET was
added to include/uapi/linux/bpf.h but BPF_ADJ_ROOM_NET_OPS was added to
tools/include/uapi/linux/bpf.h, making these files differ, fix it by using the
same name in both, BPF_ADJ_ROOM_NET, the one in the kernel headers copy.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 2be7e212d5 ("bpf: add bpf_skb_adjust_room helper")
Link: http://lkml.kernel.org/n/tip-2bmwovi9lymplyz6wsszppyf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-31 10:45:07 -03:00
John Fastabend 81f6bf8127 bpf: testing: fix devmap tests
Apparently through one of my revisions of the initial patches
series I lost the devmap test. We can add more testing later but
for now lets fix the simple one we have.

Fixes: 546ac1ffb7 "bpf: add devmap, a map for storing net device references"
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-29 14:13:20 -07:00
Daniel Borkmann 2be7e212d5 bpf: add bpf_skb_adjust_room helper
This work adds a helper that can be used to adjust net room of an
skb. The helper is generic and can be further extended in future.
Main use case is for having a programmatic way to add/remove room to
v4/v6 header options along with cls_bpf on egress and ingress hook
of the data path. It reuses most of the infrastructure that we added
for the bpf_skb_change_type() helper which can be used in nat64
translations. Similarly, the helper only takes care of adjusting the
room so that related data is populated and csum adapted out of the
BPF program using it.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-03 02:22:52 -07:00
Lawrence Brakmo 04df41e343 bpf: update tools/include/uapi/linux/bpf.h
Update tools/include/uapi/linux/bpf.h to include changes related to new
bpf sock_ops program type.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 16:15:14 -07:00
Daniel Borkmann ded092cd73 bpf: add bpf_set_hash helper for tc progs
Allow for tc BPF programs to set a skb->hash, apart from clearing
and triggering a recalc that we have right now. It allows for BPF
to implement a custom hashing routine for skb_get_hash().

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 19:05:47 -04:00
Martin KaFai Lau 95b9afd398 bpf: Test for bpf ID
Add test to exercise the bpf_prog/map id generation,
bpf_(prog|map)_get_next_id(), bpf_(prog|map)_get_fd_by_id() and
bpf_get_obj_info_by_fd().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 15:41:24 -04:00
Teng Qin b7d3ed5be9 bpf: update perf event helper functions documentation
This commit updates documentation of the bpf_perf_event_output and
bpf_perf_event_read helpers to match their implementation.

Signed-off-by: Teng Qin <qinteng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 21:58:16 -04:00
David S. Miller e07b98d9bf bpf: Add strict alignment flag for BPF_PROG_LOAD.
Add a new field, "prog_flags", and an initial flag value
BPF_F_STRICT_ALIGNMENT.

When set, the verifier will enforce strict pointer alignment
regardless of the setting of CONFIG_EFFICIENT_UNALIGNED_ACCESS.

The verifier, in this mode, will also use a fixed value of "2" in
place of NET_IP_ALIGN.

This facilitates test cases that will exercise and validate this part
of the verifier even when run on architectures where alignment doesn't
matter.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
2017-05-11 14:19:00 -04:00
Daniel Borkmann b1d9fc41aa bpf: add napi_id read access to __sk_buff
Add napi_id access to __sk_buff for socket filter program types, tc
program types and other bpf_convert_ctx_access() users. Having access
to skb->napi_id is useful for per RX queue listener siloing, f.e.
in combination with SO_ATTACH_REUSEPORT_EBPF and when busy polling is
used, meaning SO_REUSEPORT enabled listeners can then select the
corresponding socket at SYN time already [1]. The skb is marked via
skb_mark_napi_id() early in the receive path (e.g., napi_gro_receive()).

Currently, sockets can only use SO_INCOMING_NAPI_ID from 6d4339028b
("net: Introduce SO_INCOMING_NAPI_ID") as a socket option to look up
the NAPI ID associated with the queue for steering, which requires a
prior sk_mark_napi_id() after the socket was looked up.

Semantics for the __sk_buff napi_id access are similar, meaning if
skb->napi_id is < MIN_NAPI_ID (e.g. outgoing packets using sender_cpu),
then an invalid napi_id of 0 is returned to the program, otherwise a
valid non-zero napi_id.

  [1] http://netdevconf.org/2.1/slides/apr6/dumazet-BUSY-POLLING-Netdev-2.1.pdf

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21 13:53:27 -04:00
Alexander Alemayhu 3c60a531b9 bpf: fix comment typo
o s/bpf_bpf_get_socket_cookie/bpf_get_socket_cookie

Signed-off-by: Alexander Alemayhu <alexander@alemayhu.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-09 18:26:08 -07:00
Alexei Starovoitov 3084887378 tools/lib/bpf: add support for BPF_PROG_TEST_RUN command
add support for BPF_PROG_TEST_RUN command to libbpf.a

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 12:45:57 -07:00
Chenbo Feng 6acc5c2910 Add a eBPF helper function to retrieve socket uid
Returns the owner uid of the socket inside a sk_buff. This is useful to
perform per-UID accounting of network traffic or per-UID packet
filtering. The socket need to be a fullsock otherwise overflowuid is
returned.

Signed-off-by: Chenbo Feng <fengc@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 17:01:02 -07:00
Chenbo Feng 91b8270f2a Add a helper function to get socket cookie in eBPF
Retrieve the socket cookie generated by sock_gen_cookie() from a sk_buff
with a known socket. Generates a new cookie if one was not yet set.If
the socket pointer inside sk_buff is NULL, 0 is returned. The helper
function coud be useful in monitoring per socket networking traffic
statistics and provide a unique socket identifier per namespace.

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Chenbo Feng <fengc@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 17:01:02 -07:00
Martin KaFai Lau fb30d4b712 bpf: Add tests for map-in-map
Test cases for array of maps and hash of maps.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 15:45:45 -07:00
David S. Miller 3f64116a83 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-02-16 19:34:01 -05:00
Stephen Rothwell 5463b3d043 bpf: kernel header files need to be copied into the tools directory
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14 22:27:31 -05:00
Mickaël Salaün 9a7382662f tools: Sync {,tools/}include/uapi/linux/bpf.h
The tools version of this header is out of date; update it to the latest
version from kernel header.

Synchronize with the following commits:
* b95a5c4db0 ("bpf: add a longest prefix match trie map implementation")
* a5e8c07059 ("bpf: add bpf_probe_read_str helper")
* d1b662adcd ("bpf: allow option for setting bpf_l4_csum_replace from scratch")

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Daniel Mack <daniel@zonque.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Gianluca Borello <g.borello@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-10 15:56:05 -05:00
Joe Stringer 0cb34dc2a3 tools lib bpf: Sync {tools,}/include/uapi/linux/bpf.h
The tools version of this header is out of date; update it to the latest
version from the kernel headers.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Link: http://lkml.kernel.org/r/20161209024620.31660-2-joe@ovn.org
[ Sync it harder, after merging with what was in net-next via perf/urgent via torvalds/master to get BPG_PROG_(AT|DE)TACH, etc ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:46 -03:00
Arnaldo Carvalho de Melo dcfc3a6740 tools: Synchronize tools/include/uapi/linux/bpf.h
Commit 747ea55e4f ("bpf: fix bpf_skb_in_cgroup helper naming") renames
BPF_FUNC_skb_in_cgroup to bpf_skb_under_cgroup, triggering this warning
while building perf:

  Warning: tools/include/uapi/linux/bpf.h differs from kernel

Update the copy to ack that, no changes needed, as
BPF_FUNC_skb_in_cgroup isn't used so far.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-x67d2gq8ct6ko12ex14q8bbx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-10-05 19:31:48 -03:00