Commit Graph

11 Commits

Author SHA1 Message Date
Jesper Dangaard Brouer 96c5508e30 xdp: implement xdp_redirect_map for generic XDP
Using bpf_redirect_map is allowed for generic XDP programs, but the
appropriate map lookup was never performed in xdp_do_generic_redirect().

Instead the map-index is directly used as the ifindex.  For the
xdp_redirect_map sample in SKB-mode '-S', this resulted in trying
sending on ifindex 0 which isn't valid, resulting in getting SKB
packets dropped.  Thus, the reported performance numbers are wrong in
commit 24251c2647 ("samples/bpf: add option for native and skb mode
for redirect apps") for the 'xdp_redirect_map -S' case.

Before commit 109980b894 ("bpf: don't select potentially stale
ri->map from buggy xdp progs") it could crash the kernel.  Like this
commit also check that the map_owner owner is correct before
dereferencing the map pointer.  But make sure that this API misusage
can be caught by a tracepoint. Thus, allowing userspace via
tracepoints to detect misbehaving bpf_progs.

Fixes: 6103aa96ec ("net: implement XDP_REDIRECT for xdp generic")
Fixes: 24251c2647 ("samples/bpf: add option for native and skb mode for redirect apps")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-11 14:33:00 -07:00
Jesper Dangaard Brouer 59a3089675 xdp: separate xdp_redirect tracepoint in map case
Creating as specific xdp_redirect_map variant of the xdp tracepoints
allow users to write simpler/faster BPF progs that get attached to
these tracepoints.

Goal is to still keep the tracepoints in xdp_redirect and xdp_redirect_map
similar enough, that a tool can read the top part of the TP_STRUCT and
produce similar monitor statistics.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29 10:51:29 -07:00
Jesper Dangaard Brouer f5836ca5e9 xdp: separate xdp_redirect tracepoint in error case
There is a need to separate the xdp_redirect tracepoint into two
tracepoints, for separating the error case from the normal forward
case.

Due to the extreme speeds XDP is operating at, loading a tracepoint
have a measurable impact.  Single core XDP REDIRECT (ethtool tuned
rx-usecs 25) can do 13.7 Mpps forwarding, but loading a simple
bpf_prog at the tracepoint (with a return 0) reduce perf to 10.2 Mpps
(CPU E5-1650 v4 @ 3.60GHz, driver: ixgbe)

The overhead of loading a bpf-based tracepoint can be calculated to
cost 25 nanosec ((1/13782002-1/10267937)*10^9 = -24.83 ns).

Using perf record on the tracepoint event, with a non-matching --filter
expression, the overhead is much larger. Performance drops to 8.3 Mpps,
cost 48 nanosec ((1/13782002-1/8312497)*10^9 = -47.74))

Having a separate tracepoint for err cases, which should be less
frequent, allow running a continuous monitor for errors while not
affecting the redirect forward performance (this have also been
verified by measurements).

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29 10:51:29 -07:00
Jesper Dangaard Brouer b06337dfdb xdp: make xdp tracepoints report bpf prog id instead of prog_tag
Given previous patch expose the map_id, it seems natural to also
report the bpf prog id.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29 10:51:29 -07:00
Jesper Dangaard Brouer 8d3b778ff5 xdp: tracepoint xdp_redirect also need a map argument
To make sense of the map index, the tracepoint user also need to know
that map we are talking about.  Supply the map pointer but only expose
the map->id.

The 'to_index' is renamed 'to_ifindex'.  In the xdp_redirect_map case,
this is the result of the devmap lookup. The map lookup key is exposed
as map_index, which is needed to troubleshoot in case the lookup failed.
The 'to_ifindex' is placed after 'err' to keep TP_STRUCT as common as
possible.

This also keeps the TP_STRUCT similar enough, that userspace can write
a monitor program, that doesn't need to care about whether
bpf_redirect or bpf_redirect_map were used.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29 10:51:28 -07:00
Jesper Dangaard Brouer c31e5a4876 xdp: remove redundant argument to trace_xdp_redirect
Supplying the action argument XDP_REDIRECT to the tracepoint xdp_redirect
is redundant as it is only called in-case this action was specified.

Remove the argument, but keep "act" member of the tracepoint struct and
populate it with XDP_REDIRECT.  This makes it easier to write a common bpf_prog
processing events.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29 10:51:28 -07:00
Jesper Dangaard Brouer 315ec3990e xdp: get tracepoints xdp_exception and xdp_redirect in sync
Remove the net_device string name from the xdp_exception tracepoint,
like the xdp_redirect tracepoint.

Align the TP_STRUCT to have common entries between these two
tracepoint.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-24 11:59:37 -07:00
Jesper Dangaard Brouer a873585587 xdp: remove net_device names from xdp_redirect tracepoint
There is too much overhead in the current trace_xdp_redirect
tracepoint as it does strcpy and strlen on the net_device names.

Besides, exposing the ifindex/index is actually the information that
is needed in the tracepoint to diagnose issues.  When a lookup fails
(either ifindex or devmap index) then there is a need for saying which
to_index that have issues.

V2: Adjust args to be aligned with trace_xdp_exception.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-24 11:59:37 -07:00
Jesper Dangaard Brouer 4c03bdd7b5 xdp: adjust xdp redirect tracepoint to include return error code
The return error code need to be included in the tracepoint
xdp:xdp_redirect, else its not possible to distinguish successful or
failed XDP_REDIRECT transmits.

XDP have no queuing mechanism. Thus, it is fairly easily to overrun a
NIC transmit queue.  The eBPF program invoking helpers (bpf_redirect
or bpf_redirect_map) to redirect a packet doesn't get any feedback
whether the packet was actually transmitted.

Info on failed transmits in the tracepoint xdp:xdp_redirect, is
interesting as this opens for providing a feedback-loop to the
receiving XDP program.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:18:40 -07:00
John Fastabend 5acaee0a89 xdp: add trace event for xdp redirect
This adds a trace event for xdp redirect which may help when debugging
XDP programs that use redirect bpf commands.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17 09:48:06 -07:00
Daniel Borkmann a67edbf4fb bpf: add initial bpf tracepoints
This work adds a number of tracepoints to paths that are either
considered slow-path or exception-like states, where monitoring or
inspecting them would be desirable.

For bpf(2) syscall, tracepoints have been placed for main commands
when they succeed. In XDP case, tracepoint is for exceptions, that
is, f.e. on abnormal BPF program exit such as unknown or XDP_ABORTED
return code, or when error occurs during XDP_TX action and the packet
could not be forwarded.

Both have been split into separate event headers, and can be further
extended. Worst case, if they unexpectedly should get into our way in
future, they can also removed [1]. Of course, these tracepoints (like
any other) can be analyzed by eBPF itself, etc. Example output:

  # ./perf record -a -e bpf:* sleep 10
  # ./perf script
  sock_example  6197 [005]   283.980322:      bpf:bpf_map_create: map type=ARRAY ufd=4 key=4 val=8 max=256 flags=0
  sock_example  6197 [005]   283.980721:       bpf:bpf_prog_load: prog=a5ea8fa30ea6849c type=SOCKET_FILTER ufd=5
  sock_example  6197 [005]   283.988423:   bpf:bpf_prog_get_type: prog=a5ea8fa30ea6849c type=SOCKET_FILTER
  sock_example  6197 [005]   283.988443: bpf:bpf_map_lookup_elem: map type=ARRAY ufd=4 key=[06 00 00 00] val=[00 00 00 00 00 00 00 00]
  [...]
  sock_example  6197 [005]   288.990868: bpf:bpf_map_lookup_elem: map type=ARRAY ufd=4 key=[01 00 00 00] val=[14 00 00 00 00 00 00 00]
       swapper     0 [005]   289.338243:    bpf:bpf_prog_put_rcu: prog=a5ea8fa30ea6849c type=SOCKET_FILTER

  [1] https://lwn.net/Articles/705270/

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-25 13:17:47 -05:00