linux

Commit Graph

Author	SHA1	Message	Date
Alexei Starovoitov	360301a6c2	selftests/bpf: Add unit tests for global functions test_global_func[12] - check 512 stack limit. test_global_func[34] - check 8 frame call chain limit. test_global_func5 - check that non-ctx pointer cannot be passed into a function that expects context. test_global_func6 - check that ctx pointer is unmodified. test_global_func7 - check that global function returns scalar. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200110064124.1760511-7-ast@kernel.org	2020-01-10 17:20:07 +01:00
Alexei Starovoitov	e528d1c012	selftests/bpf: Modify a test to check global functions Make two static functions in test_xdp_noinline.c global: before: processed 2790 insns after: processed 2598 insns Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200110064124.1760511-6-ast@kernel.org	2020-01-10 17:20:07 +01:00
Alexei Starovoitov	6db2d81a46	selftests/bpf: Add a test for a large global function test results: pyperf50 with always_inlined the same function five times: processed 46378 insns pyperf50 with global function: processed 6102 insns Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200110064124.1760511-5-ast@kernel.org	2020-01-10 17:20:07 +01:00
Alexei Starovoitov	7608e4db6d	selftests/bpf: Add fexit-to-skb test for global funcs Add simple fexit prog type to skb prog type test when subprogram is a global function. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200110064124.1760511-4-ast@kernel.org	2020-01-10 17:20:07 +01:00
Alexei Starovoitov	51c39bb1d5	bpf: Introduce function-by-function verification New llvm and old llvm with libbpf help produce BTF that distinguish global and static functions. Unlike arguments of static function the arguments of global functions cannot be removed or optimized away by llvm. The compiler has to use exactly the arguments specified in a function prototype. The argument type information allows the verifier validate each global function independently. For now only supported argument types are pointer to context and scalars. In the future pointers to structures, sizes, pointer to packet data can be supported as well. Consider the following example: static int f1(int ...) { ... } int f3(int b); int f2(int a) { f1(a) + f3(a); } int f3(int b) { ... } int main(...) { f1(...) + f2(...) + f3(...); } The verifier will start its safety checks from the first global function f2(). It will recursively descend into f1() because it's static. Then it will check that arguments match for the f3() invocation inside f2(). It will not descend into f3(). It will finish f2() that has to be successfully verified for all possible values of 'a'. Then it will proceed with f3(). That function also has to be safe for all possible values of 'b'. Then it will start subprog 0 (which is main() function). It will recursively descend into f1() and will skip full check of f2() and f3(), since they are global. The order of processing global functions doesn't affect safety, since all global functions must be proven safe based on their arguments only. Such function by function verification can drastically improve speed of the verification and reduce complexity. Note that the stack limit of 512 still applies to the call chain regardless whether functions were static or global. The nested level of 8 also still applies. The same recursion prevention checks are in place as well. The type information and static/global kind is preserved after the verification hence in the above example global function f2() and f3() can be replaced later by equivalent functions with the same types that are loaded and verified later without affecting safety of this main() program. Such replacement (re-linking) of global functions is a subject of future patches. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200110064124.1760511-3-ast@kernel.org	2020-01-10 17:20:07 +01:00
Alexei Starovoitov	2d3eb67f64	libbpf: Sanitize global functions In case the kernel doesn't support BTF_FUNC_GLOBAL sanitize BTF produced by the compiler for global functions. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200110064124.1760511-2-ast@kernel.org	2020-01-10 17:20:07 +01:00
Alexei Starovoitov	f41aa387a7	Merge branch 'selftest-makefile-cleanup' Andrii Nakryiko says: ==================== Fix issues with bpf_helper_defs.h usage in selftests/bpf. As part of that, fix the way clean up is performed for libbpf and selftests/bpf. Some for Makefile output clean ups as well. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-01-09 21:56:16 -08:00
Andrii Nakryiko	965b9fee28	selftests/bpf: Further clean up Makefile output Further clean up Makefile output: - hide "entering directory" messages; - silvence sub-Make command echoing; - succinct MKDIR messages. Also remove few test binaries that are not produced anymore from .gitignore. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200110051716.1591485-4-andriin@fb.com	2020-01-09 21:55:08 -08:00
Andrii Nakryiko	6910d7d386	selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir Reorder includes search path to ensure $(OUTPUT) and $(CURDIR) go before libbpf's directory. Also fix bpf_helpers.h to include bpf_helper_defs.h in such a way as to leverage includes search path. This allows selftests to not use libbpf's local and potentially stale bpf_helper_defs.h. It's important because selftests/bpf's Makefile only re-generates bpf_helper_defs.h in seltests' output directory, not the one in libbpf's directory. Also force regeneration of bpf_helper_defs.h when libbpf.a is updated to reduce staleness. Fixes: `fa633a0f89` ("libbpf: Fix build on read-only filesystems") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200110051716.1591485-3-andriin@fb.com	2020-01-09 21:55:08 -08:00
Andrii Nakryiko	2031af28a4	libbpf,selftests/bpf: Fix clean targets Libbpf's clean target should clean out generated files in $(OUTPUT) directory and not make assumption that $(OUTPUT) directory is current working directory. Selftest's Makefile should delegate cleaning of libbpf-generated files to libbpf's Makefile. This ensures more robust clean up. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200110051716.1591485-2-andriin@fb.com	2020-01-09 21:55:08 -08:00
Andrii Nakryiko	492ab0205f	libbpf: Make bpf_map order and indices stable Currently, libbpf re-sorts bpf_map structs after all the maps are added and initialized, which might change their relative order and invalidate any bpf_map pointer or index taken before that. This is inconvenient and error-prone. For instance, it can cause .kconfig map index to point to a wrong map. Furthermore, libbpf itself doesn't rely on any specific ordering of bpf_maps, so it's just an unnecessary complication right now. This patch drops sorting of maps and makes their relative positions fixed. If efficient index is ever needed, it's better to have a separate array of pointers as a search index, instead of reordering bpf_map struct in-place. This will be less error-prone and will allow multiple independent orderings, if necessary (e.g., either by section index or by name). Fixes: `166750bc1d` ("libbpf: Support libbpf-provided extern variables") Reported-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200110034247.1220142-1-andriin@fb.com	2020-01-09 21:17:01 -08:00
Andrey Ignatov	f5bfcd953d	bpf: Document BPF_F_QUERY_EFFECTIVE flag Document BPF_F_QUERY_EFFECTIVE flag, mostly to clarify how it affects attach_flags what may not be obvious and what may lead to confision. Specifically attach_flags is returned only for target_fd but if programs are inherited from an ancestor cgroup then returned attach_flags for current cgroup may be confusing. For example, two effective programs of same attach_type can be returned but w/o BPF_F_ALLOW_MULTI in attach_flags. Simple repro: # bpftool c s /sys/fs/cgroup/path/to/task ID AttachType AttachFlags Name # bpftool c s /sys/fs/cgroup/path/to/task effective ID AttachType AttachFlags Name 95043 ingress tw_ipt_ingress 95048 ingress tw_ingress Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200108014006.938363-1-rdna@fb.com	2020-01-09 09:40:06 -08:00
Alexei Starovoitov	417759f7d4	Merge branch 'tcp-bpf-cc' Martin Lau says: ==================== This series introduces BPF STRUCT_OPS. It is an infra to allow implementing some specific kernel's function pointers in BPF. The first use case included in this series is to implement TCP congestion control algorithm in BPF (i.e. implement struct tcp_congestion_ops in BPF). There has been attempt to move the TCP CC to the user space (e.g. CCP in TCP). The common arguments are faster turn around, get away from long-tail kernel versions in production...etc, which are legit points. BPF has been the continuous effort to join both kernel and userspace upsides together (e.g. XDP to gain the performance advantage without bypassing the kernel). The recent BPF advancements (in particular BTF-aware verifier, BPF trampoline, BPF CO-RE...) made implementing kernel struct ops (e.g. tcp cc) possible in BPF. The idea is to allow implementing tcp_congestion_ops in bpf. It allows a faster turnaround for testing algorithm in the production while leveraging the existing (and continue growing) BPF feature/framework instead of building one specifically for userspace TCP CC. Please see individual patch for details. The bpftool support will be posted in follow-up patches. v4: - Expose tcp_ca_find() to tcp.h in patch 7. It is used to check the same bpf-tcp-cc does not exist to guarantee the register() will succeed. - set_memory_ro() and then set_memory_x() only after all trampolines are written to the image in patch 6. (Daniel) spinlock is replaced by mutex because set_memory_* requires sleepable context. v3: - Fix kbuild error by considering CONFIG_BPF_SYSCALL (kbuild) - Support anonymous bitfield in patch 4 (Andrii, Yonghong) - Push boundary safety check to a specific arch's trampoline function (in patch 6) (Yonghong). Reuse the WANR_ON_ONCE check in arch_prepare_bpf_trampoline() in x86. - Check module field is 0 in udata in patch 6 (Yonghong) - Check zero holes in patch 6 (Andrii) - s/_btf_vmlinux/btf/ in patch 5 and 7 (Andrii) - s/check_xxx/is_xxx/ in patch 7 (Andrii) - Use "struct_ops/" convention in patch 11 (Andrii) - Use the skel instead of bpf_object in patch 11 (Andrii) - libbpf: Decide BPF_PROG_TYPE_STRUCT_OPS at open phase by using find_sec_def() - libbpf: Avoid a debug message at open phase (Andrii) - libbpf: Add bpf_program__(is\|set)_struct_ops() for consistency (Andrii) - libbpf: Add "struct_ops" to section_defs (Andrii) - libbpf: Some code shuffling in init_kern_struct_ops() (Andrii) - libbpf: A few safety checks (Andrii) v2: - Dropped cubic for now. They will be reposted once there are more clarity in "jiffies" on both bpf side (about the helper) and tcp_cubic side (some of jiffies usages are being replaced by tp->tcp_mstamp) - Remove unnecssary check on bitfield support from btf_struct_access() (Yonghong) - BTF_TYPE_EMIT macro (Yonghong, Andrii) - value_name's length check to avoid an unlikely type match during truncation case (Yonghong) - BUILD_BUG_ON to ensure no trampoline-image overrun in the future (Yonghong) - Simplify get_next_key() (Yonghong) - Added comment to explain how to check mandatory func ptr in net/ipv4/bpf_tcp_ca.c (Yonghong) - Rename "__bpf_" to "bpf_struct_ops_" for value prefix (Andrii) - Add comment to highlight the bpf_dctcp.c is not necessarily the same as tcp_dctcp.c. (Alexei, Eric) - libbpf: Renmae "struct_ops" to ".struct_ops" for elf sec (Andrii) - libbpf: Expose struct_ops as a bpf_map (Andrii) - libbpf: Support multiple struct_ops in SEC(".struct_ops") (Andrii) - libbpf: Add bpf_map__attach_struct_ops() (Andrii) ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-01-09 09:15:25 -08:00
Martin KaFai Lau	09903869f6	bpf: Add bpf_dctcp example This patch adds a bpf_dctcp example. It currently does not do no-ECN fallback but the same could be done through the cgrp2-bpf. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200109003517.3856825-1-kafai@fb.com	2020-01-09 08:46:18 -08:00
Martin KaFai Lau	590a008882	bpf: libbpf: Add STRUCT_OPS support This patch adds BPF STRUCT_OPS support to libbpf. The only sec_name convention is SEC(".struct_ops") to identify the struct_ops implemented in BPF, e.g. To implement a tcp_congestion_ops: SEC(".struct_ops") struct tcp_congestion_ops dctcp = { .init = (void )dctcp_init, / <-- a bpf_prog / / ... some more func prts ... */ .name = "bpf_dctcp", }; Each struct_ops is defined as a global variable under SEC(".struct_ops") as above. libbpf creates a map for each variable and the variable name is the map's name. Multiple struct_ops is supported under SEC(".struct_ops"). In the bpf_object__open phase, libbpf will look for the SEC(".struct_ops") section and find out what is the btf-type the struct_ops is implementing. Note that the btf-type here is referring to a type in the bpf_prog.o's btf. A "struct bpf_map" is added by bpf_object__add_map() as other maps do. It will then collect (through SHT_REL) where are the bpf progs that the func ptrs are referring to. No btf_vmlinux is needed in the open phase. In the bpf_object__load phase, the map-fields, which depend on the btf_vmlinux, are initialized (in bpf_map__init_kern_struct_ops()). It will also set the prog->type, prog->attach_btf_id, and prog->expected_attach_type. Thus, the prog's properties do not rely on its section name. [ Currently, the bpf_prog's btf-type ==> btf_vmlinux's btf-type matching process is as simple as: member-name match + btf-kind match + size match. If these matching conditions fail, libbpf will reject. The current targeting support is "struct tcp_congestion_ops" which most of its members are function pointers. The member ordering of the bpf_prog's btf-type can be different from the btf_vmlinux's btf-type. ] Then, all obj->maps are created as usual (in bpf_object__create_maps()). Once the maps are created and prog's properties are all set, the libbpf will proceed to load all the progs. bpf_map__attach_struct_ops() is added to register a struct_ops map to a kernel subsystem. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200109003514.3856730-1-kafai@fb.com	2020-01-09 08:46:18 -08:00
Martin KaFai Lau	17328d618c	bpf: Synch uapi bpf.h to tools/ This patch sync uapi bpf.h to tools/ Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200109003512.3856559-1-kafai@fb.com	2020-01-09 08:46:18 -08:00
Martin KaFai Lau	206057fe02	bpf: Add BPF_FUNC_tcp_send_ack helper Add a helper to send out a tcp-ack. It will be used in the later bpf_dctcp implementation that requires to send out an ack when the CE state changed. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200109004551.3900448-1-kafai@fb.com	2020-01-09 08:46:18 -08:00
Martin KaFai Lau	0baf26b0fc	bpf: tcp: Support tcp_congestion_ops in bpf This patch makes "struct tcp_congestion_ops" to be the first user of BPF STRUCT_OPS. It allows implementing a tcp_congestion_ops in bpf. The BPF implemented tcp_congestion_ops can be used like regular kernel tcp-cc through sysctl and setsockopt. e.g. [root@arch-fb-vm1 bpf]# sysctl -a \| egrep congestion net.ipv4.tcp_allowed_congestion_control = reno cubic bpf_cubic net.ipv4.tcp_available_congestion_control = reno bic cubic bpf_cubic net.ipv4.tcp_congestion_control = bpf_cubic There has been attempt to move the TCP CC to the user space (e.g. CCP in TCP). The common arguments are faster turn around, get away from long-tail kernel versions in production...etc, which are legit points. BPF has been the continuous effort to join both kernel and userspace upsides together (e.g. XDP to gain the performance advantage without bypassing the kernel). The recent BPF advancements (in particular BTF-aware verifier, BPF trampoline, BPF CO-RE...) made implementing kernel struct ops (e.g. tcp cc) possible in BPF. It allows a faster turnaround for testing algorithm in the production while leveraging the existing (and continue growing) BPF feature/framework instead of building one specifically for userspace TCP CC. This patch allows write access to a few fields in tcp-sock (in bpf_tcp_ca_btf_struct_access()). The optional "get_info" is unsupported now. It can be added later. One possible way is to output the info with a btf-id to describe the content. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200109003508.3856115-1-kafai@fb.com	2020-01-09 08:46:18 -08:00
Martin KaFai Lau	85d33df357	bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS The patch introduces BPF_MAP_TYPE_STRUCT_OPS. The map value is a kernel struct with its func ptr implemented in bpf prog. This new map is the interface to register/unregister/introspect a bpf implemented kernel struct. The kernel struct is actually embedded inside another new struct (or called the "value" struct in the code). For example, "struct tcp_congestion_ops" is embbeded in: struct bpf_struct_ops_tcp_congestion_ops { refcount_t refcnt; enum bpf_struct_ops_state state; struct tcp_congestion_ops data; /* <-- kernel subsystem struct here / } The map value is "struct bpf_struct_ops_tcp_congestion_ops". The "bpftool map dump" will then be able to show the state ("inuse"/"tobefree") and the number of subsystem's refcnt (e.g. number of tcp_sock in the tcp_congestion_ops case). This "value" struct is created automatically by a macro. Having a separate "value" struct will also make extending "struct bpf_struct_ops_XYZ" easier (e.g. adding "void (init)(void)" to "struct bpf_struct_ops_XYZ" to do some initialization works before registering the struct_ops to the kernel subsystem). The libbpf will take care of finding and populating the "struct bpf_struct_ops_XYZ" from "struct XYZ". Register a struct_ops to a kernel subsystem: 1. Load all needed BPF_PROG_TYPE_STRUCT_OPS prog(s) 2. Create a BPF_MAP_TYPE_STRUCT_OPS with attr->btf_vmlinux_value_type_id set to the btf id "struct bpf_struct_ops_tcp_congestion_ops" of the running kernel. Instead of reusing the attr->btf_value_type_id, btf_vmlinux_value_type_id s added such that attr->btf_fd can still be used as the "user" btf which could store other useful sysadmin/debug info that may be introduced in the furture, e.g. creation-date/compiler-details/map-creator...etc. 3. Create a "struct bpf_struct_ops_tcp_congestion_ops" object as described in the running kernel btf. Populate the value of this object. The function ptr should be populated with the prog fds. 4. Call BPF_MAP_UPDATE with the object created in (3) as the map value. The key is always "0". During BPF_MAP_UPDATE, the code that saves the kernel-func-ptr's args as an array of u64 is generated. BPF_MAP_UPDATE also allows the specific struct_ops to do some final checks in "st_ops->init_member()" (e.g. ensure all mandatory func ptrs are implemented). If everything looks good, it will register this kernel struct to the kernel subsystem. The map will not allow further update from this point. Unregister a struct_ops from the kernel subsystem: BPF_MAP_DELETE with key "0". Introspect a struct_ops: BPF_MAP_LOOKUP_ELEM with key "0". The map value returned will have the prog _id_ populated as the func ptr. The map value state (enum bpf_struct_ops_state) will transit from: INIT (map created) => INUSE (map updated, i.e. reg) => TOBEFREE (map value deleted, i.e. unreg) The kernel subsystem needs to call bpf_struct_ops_get() and bpf_struct_ops_put() to manage the "refcnt" in the "struct bpf_struct_ops_XYZ". This patch uses a separate refcnt for the purose of tracking the subsystem usage. Another approach is to reuse the map->refcnt and then "show" (i.e. during map_lookup) the subsystem's usage by doing map->refcnt - map->usercnt to filter out the map-fd/pinned-map usage. However, that will also tie down the future semantics of map->refcnt and map->usercnt. The very first subsystem's refcnt (during reg()) holds one count to map->refcnt. When the very last subsystem's refcnt is gone, it will also release the map->refcnt. All bpf_prog will be freed when the map->refcnt reaches 0 (i.e. during map_free()). Here is how the bpftool map command will look like: [root@arch-fb-vm1 bpf]# bpftool map show 6: struct_ops name dctcp flags 0x0 key 4B value 256B max_entries 1 memlock 4096B btf_id 6 [root@arch-fb-vm1 bpf]# bpftool map dump id 6 [{ "value": { "refcnt": { "refs": { "counter": 1 } }, "state": 1, "data": { "list": { "next": 0, "prev": 0 }, "key": 0, "flags": 2, "init": 24, "release": 0, "ssthresh": 25, "cong_avoid": 30, "set_state": 27, "cwnd_event": 28, "in_ack_event": 26, "undo_cwnd": 29, "pkts_acked": 0, "min_tso_segs": 0, "sndbuf_expand": 0, "cong_control": 0, "get_info": 0, "name": [98,112,102,95,100,99,116,99,112,0,0,0,0,0,0,0 ], "owner": 0 } } } ] Misc Notes: * bpf_struct_ops_map_sys_lookup_elem() is added for syscall lookup. It does an inplace update on "value" instead returning a pointer to syscall.c. Otherwise, it needs a separate copy of "zero" value for the BPF_STRUCT_OPS_STATE_INIT to avoid races. The bpf_struct_ops_map_delete_elem() is also called without preempt_disable() from map_delete_elem(). It is because the "->unreg()" may requires sleepable context, e.g. the "tcp_unregister_congestion_control()". * "const" is added to some of the existing "struct btf_func_model *" function arg to avoid a compiler warning caused by this patch. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200109003505.3855919-1-kafai@fb.com	2020-01-09 08:46:18 -08:00
Martin KaFai Lau	27ae7997a6	bpf: Introduce BPF_PROG_TYPE_STRUCT_OPS This patch allows the kernel's struct ops (i.e. func ptr) to be implemented in BPF. The first use case in this series is the "struct tcp_congestion_ops" which will be introduced in a latter patch. This patch introduces a new prog type BPF_PROG_TYPE_STRUCT_OPS. The BPF_PROG_TYPE_STRUCT_OPS prog is verified against a particular func ptr of a kernel struct. The attr->attach_btf_id is the btf id of a kernel struct. The attr->expected_attach_type is the member "index" of that kernel struct. The first member of a struct starts with member index 0. That will avoid ambiguity when a kernel struct has multiple func ptrs with the same func signature. For example, a BPF_PROG_TYPE_STRUCT_OPS prog is written to implement the "init" func ptr of the "struct tcp_congestion_ops". The attr->attach_btf_id is the btf id of the "struct tcp_congestion_ops" of the _running_ kernel. The attr->expected_attach_type is 3. The ctx of BPF_PROG_TYPE_STRUCT_OPS is an array of u64 args saved by arch_prepare_bpf_trampoline that will be done in the next patch when introducing BPF_MAP_TYPE_STRUCT_OPS. "struct bpf_struct_ops" is introduced as a common interface for the kernel struct that supports BPF_PROG_TYPE_STRUCT_OPS prog. The supporting kernel struct will need to implement an instance of the "struct bpf_struct_ops". The supporting kernel struct also needs to implement a bpf_verifier_ops. During BPF_PROG_LOAD, bpf_struct_ops_find() will find the right bpf_verifier_ops by searching the attr->attach_btf_id. A new "btf_struct_access" is also added to the bpf_verifier_ops such that the supporting kernel struct can optionally provide its own specific check on accessing the func arg (e.g. provide limited write access). After btf_vmlinux is parsed, the new bpf_struct_ops_init() is called to initialize some values (e.g. the btf id of the supporting kernel struct) and it can only be done once the btf_vmlinux is available. The R0 checks at BPF_EXIT is excluded for the BPF_PROG_TYPE_STRUCT_OPS prog if the return type of the prog->aux->attach_func_proto is "void". Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200109003503.3855825-1-kafai@fb.com	2020-01-09 08:46:18 -08:00
Martin KaFai Lau	976aba002f	bpf: Support bitfield read access in btf_struct_access This patch allows bitfield access as a scalar. It checks "off + size > t->size" to avoid accessing bitfield end up accessing beyond the struct. This check is done outside of the loop since it is applicable to all access. It also takes this chance to break early on the "off < moff" case. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200109003501.3855427-1-kafai@fb.com	2020-01-09 08:46:18 -08:00
Martin KaFai Lau	218b3f65f9	bpf: Add enum support to btf_ctx_access() It allows bpf prog (e.g. tracing) to attach to a kernel function that takes enum argument. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200109003459.3855366-1-kafai@fb.com	2020-01-09 08:46:18 -08:00
Martin KaFai Lau	275517ff45	bpf: Avoid storing modifier to info->btf_id info->btf_id expects the btf_id of a struct, so it should store the final result after skipping modifiers (if any). It also takes this chanace to add a missing newline in one of the bpf_log() messages. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200109003456.3855176-1-kafai@fb.com	2020-01-09 08:46:18 -08:00
Martin KaFai Lau	65726b5b7e	bpf: Save PTR_TO_BTF_ID register state when spilling to stack This patch makes the verifier save the PTR_TO_BTF_ID register state when spilling to the stack. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200109003454.3854870-1-kafai@fb.com	2020-01-09 08:45:32 -08:00
Stanislav Fomichev	e43002242a	selftests/bpf: Restore original comm in test_overhead test_overhead changes task comm in order to estimate BPF trampoline overhead but never sets the comm back to the original one. We have the tests (like core_reloc.c) that have 'test_progs' as hard-coded expected comm, so let's try to preserve the original comm. Currently, everything works because the order of execution is: first core_recloc, then test_overhead; but let's make it a bit future-proof. Other related changes: use 'test_overhead' as new comm instead of 'test' to make it easy to debug and drop '\n' at the end. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Petar Penkov <ppenkov@google.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200108192132.189221-1-sdf@google.com	2020-01-09 08:42:07 -08:00
Michal Rostecki	2faef64aa6	bpftool: Add misc section and probe for large INSN limit Introduce a new probe section (misc) for probes not related to concrete map types, program types, functions or kernel configuration. Introduce a probe for large INSN limit as the first one in that section. Example outputs: # bpftool feature probe [...] Scanning miscellaneous eBPF features... Large program size limit is available # bpftool feature probe macros [...] /* eBPF misc features */ #define HAVE_HAVE_LARGE_INSN_LIMIT # bpftool feature probe -j \| jq '.["misc"]' { "have_large_insn_limit": true } Signed-off-by: Michal Rostecki <mrostecki@opensuse.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Link: https://lore.kernel.org/bpf/20200108162428.25014-3-mrostecki@opensuse.org	2020-01-08 19:34:45 +01:00
Michal Rostecki	5ff0512003	libbpf: Add probe for large INSN limit Introduce a new probe which checks whether kernel has large maximum program size which was increased in the following commit: `c04c0d2b96` ("bpf: increase complexity limit and maximum program size") Based on the similar check in Cilium[0], authored by Daniel Borkmann. [0] `657d0f585a` Co-authored-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Link: https://lore.kernel.org/bpf/20200108162428.25014-2-mrostecki@opensuse.org	2020-01-08 19:31:35 +01:00
Vincent Cheng	1ece2fbe9b	ptp: clockmatrix: Rework clockmatrix version information. Simplify and fix the version information displayed by the driver. The new info better relects what is needed to support the hardware. Prev: Version: 4.8.0, Pipeline 22169 0x4001, Rev 0, Bond 5, CSR 311, IRQ 2 New: Version: 4.8.0, Id: 0x4001 Hw Rev: 5 OTP Config Select: 15 - Remove pipeline, CSR and IRQ because version x.y.z already incorporates this information. - Remove bond number because it is not used. - Remove rev number because register was not implemented, always 0 - Add HW Rev ID register to replace rev number - Add OTP config select to show the user configuration chosen by the configurable GPIO pins on start-up Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:51:23 -08:00
YueHaibing	4addbcb387	enetc: Fix inconsistent IS_ERR and PTR_ERR The proper pointer to be passed as argument is hw Detected using Coccinelle. Fixes: `6517798dd3` ("enetc: Make MDIO accessors more generic and export to include/linux/fsl") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:46:42 -08:00
Dan Carpenter	0d6e5bfc9c	enetc: Fix an off by one in enetc_setup_tc_txtime() The priv->tx_ring[] has 16 elements but only priv->num_tx_rings are set up, the rest are NULL. This ">" comparison should be ">=" to avoid a potential crash. Fixes: `0d08c9ec7d` ("enetc: add support time specific departure base on the qos etf") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:46:20 -08:00
David S. Miller	cbefe2c957	Merge branch 'Documentation-stmmac-documentation-improvements' Jose Abreu says: ==================== Documentation: stmmac documentation improvements Converts stmmac documentation to RST format. 1) Adds missing entry of stmmac documentation to MAINTAINERS. 2) Converts stmmac documentation to RST format and adds some new info. 3) Adds the new RST file to the list of files. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:38:56 -08:00
Jose Abreu	b053b28e93	Documentation: networking: Add stmmac to device drivers list Add the stmmac RST file to the index. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:38:56 -08:00
Jose Abreu	2ffebffbe7	Documentation: networking: Convert stmmac documentation to RST format Convert the documentation of the driver to RST format and delete the old txt and old information that no longer applies. Also, add some new information. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:38:56 -08:00
Jose Abreu	1501125460	MAINTAINERS: Add stmmac Ethernet driver documentation entry Add the missing entry for the file that documents the stmicro Ethernet driver stmmac. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:38:56 -08:00
Chen Zhou	c68d724826	drivers: net: cisco_hdlc: use __func__ in debug message Use __func__ to print the function name instead of hard coded string. BTW, replace printk(KERN_DEBUG, ...) with netdev_dbg. Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:33:35 -08:00
David S. Miller	10332dc220	Merge branch 'net-ch9200-code-cleanup' Chen Zhou says: ==================== net: ch9200: code cleanup patch 1 introduce __func__ in debug message. patch 2 remove unnecessary return. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:30:36 -08:00
Chen Zhou	195234b885	net: ch9200: remove unnecessary return The return is not needed, remove it. Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:30:36 -08:00
Chen Zhou	e64dec834e	net: ch9200: use __func__ in debug message Use __func__ to print the function name instead of hard coded string. Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:30:36 -08:00
David S. Miller	58cf542a1a	Merge branch 'ionic-driver-updates' Shannon Nelson says: ==================== ionic: driver updates These are a few little updates for the ionic network driver. v2: dropped IBM msi patch added fix for a compiler warning ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:05:06 -08:00
Shannon Nelson	6be1a5ce1b	ionic: clear compiler warning on hb use before set Build checks have pointed out that 'hb' can theoretically be used before set, so let's initialize it and get rid of the compiler complaint. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:05:06 -08:00
Shannon Nelson	c37d6e3f25	ionic: restrict received packets to mtu size Make sure the NIC drops packets that are larger than the specified MTU. The front end of the NIC will accept packets larger than MTU and will copy all the data it can to fill up the driver's posted buffers - if the buffers are not long enough the packet will then get dropped. With the Rx SG buffers allocagted as full pages, we are currently setting up more space than MTU size available and end up receiving some packets that are larger than MTU, up to the size of buffers posted. To be sure the NIC doesn't waste our time with oversized packets we need to lie a little in the SG descriptor about how long is the last SG element. At dealloc time, we know the allocation was a page, so the deallocation doesn't care about what length we put in the descriptor. Signed-off-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:05:06 -08:00
Shannon Nelson	24cfa8c762	ionic: add Rx dropped packet counter Add a counter for packets dropped by the driver, typically for bad size or a receive error seen by the device. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:05:06 -08:00
Shannon Nelson	3daca28f15	ionic: drop use of subdevice tags The subdevice concept is not being used in the driver, so drop the references to it. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:05:06 -08:00
David S. Miller	5528e0d7f1	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 1GbE Intel Wired LAN Driver Updates 2020-01-06 This series contains updates to igc to add basic support for timestamping. Vinicius adds basic support for timestamping and enables ptp4l/phc2sys to work with i225 devices. Initially, adds the ability to read and adjust the PHC clock. Patches 2 & 3 enable and retrieve hardware timestamps. Patch 4 implements the ethtool ioctl that ptp4l uses to check what timestamping methods are supported. Lastly, added support to do timestamping using the "Start of Packet" signal from the PHY, which is now supported in i225 devices. While i225 does support multiple PTP domains, with multiple timestamping registers, we currently only support one PTP domain and use only one of the timestamping registers for implementation purposes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 18:44:48 -08:00
David S. Miller	1b935183ae	Merge branch 'Unique-mv88e6xxx-IRQ-names' Andrew Lunn says: ==================== Unique mv88e6xxx IRQ names There are a few boards which have multiple mv88e6xxx switches. With such boards, it can be hard to determine which interrupts belong to which switches. Make the interrupt names unique by including the device name in the interrupt name. For the SERDES interrupt, also include the port number. As a result of these patches ZII devel C looks like: 50: 0 gpio-vf610 27 Level mv88e6xxx-0.1:00 54: 0 mv88e6xxx-g1 3 Edge mv88e6xxx-0.1:00-g1-atu-prob 56: 0 mv88e6xxx-g1 5 Edge mv88e6xxx-0.1:00-g1-vtu-prob 58: 0 mv88e6xxx-g1 7 Edge mv88e6xxx-0.1:00-g2 61: 0 mv88e6xxx-g2 1 Edge !mdio-mux!mdio@1!switch@0!mdio:01 62: 0 mv88e6xxx-g2 2 Edge !mdio-mux!mdio@1!switch@0!mdio:02 63: 0 mv88e6xxx-g2 3 Edge !mdio-mux!mdio@1!switch@0!mdio:03 64: 0 mv88e6xxx-g2 4 Edge !mdio-mux!mdio@1!switch@0!mdio:04 70: 0 mv88e6xxx-g2 10 Edge mv88e6xxx-0.1:00-serdes-10 75: 0 mv88e6xxx-g2 15 Edge mv88e6xxx-0.1:00-watchdog 76: 5 gpio-vf610 26 Level mv88e6xxx-0.2:00 80: 0 mv88e6xxx-g1 3 Edge mv88e6xxx-0.2:00-g1-atu-prob 82: 0 mv88e6xxx-g1 5 Edge mv88e6xxx-0.2:00-g1-vtu-prob 84: 4 mv88e6xxx-g1 7 Edge mv88e6xxx-0.2:00-g2 87: 2 mv88e6xxx-g2 1 Edge !mdio-mux!mdio@2!switch@0!mdio:01 88: 0 mv88e6xxx-g2 2 Edge !mdio-mux!mdio@2!switch@0!mdio:02 89: 0 mv88e6xxx-g2 3 Edge !mdio-mux!mdio@2!switch@0!mdio:03 90: 0 mv88e6xxx-g2 4 Edge !mdio-mux!mdio@2!switch@0!mdio:04 95: 3 mv88e6xxx-g2 9 Edge mv88e6xxx-0.2:00-serdes-9 96: 0 mv88e6xxx-g2 10 Edge mv88e6xxx-0.2:00-serdes-10 101: 0 mv88e6xxx-g2 15 Edge mv88e6xxx-0.2:00-watchdog Interrupt names like !mdio-mux!mdio@2!switch@0!mdio:01 are created by phylib for the integrated PHYs. The mv88e6xxx driver does not determine these names. ==================== Tested-by: Chris Healy <cphealy@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 18:30:15 -08:00
Andrew Lunn	8ddf0b5693	net: dsa: mv88e6xxx: Unique ATU and VTU IRQ names Dynamically generate a unique interrupt name for the VTU and ATU, based on the device name. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 18:30:15 -08:00
Andrew Lunn	06acd1148b	net: dsa: mv88e6xxx: Unique g2 IRQ name Dynamically generate a unique g2 interrupt name, based on the device name. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 18:30:14 -08:00
Andrew Lunn	8b4db28914	net: dsa: mv88e6xxx: Unique watchdog IRQ name Dynamically generate a unique watchdog interrupt name, based on the device name. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 18:30:14 -08:00
Andrew Lunn	e6f2f6b824	net: dsa: mv88e6xxx: Unique SERDES interrupt names Dynamically generate a unique SERDES interrupt name, based on the device name and the port the SERDES is for. For example: 95: 3 mv88e6xxx-g2 9 Edge mv88e6xxx-0.2:00-serdes-9 96: 0 mv88e6xxx-g2 10 Edge mv88e6xxx-0.2:00-serdes-10 The 0.2:00 indicates the switch and -9 indicates port 9. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 18:30:14 -08:00
Andrew Lunn	3095383a8a	net: dsa: mv88e6xxx: Unique IRQ name Dynamically generate a unique switch interrupt name, based on the device name. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 18:30:14 -08:00

1 2 3 4 5 ...

888792 Commits All Branches Search

888792 Commits

All Branches