perf trace:
- Augment the payload of syscall entry/exit tracepoints with the contents
of pointer arguments, such as the "filename" argument to the "open"
syscall or the 'struct sockaddr *' argument to the 'connect' syscall.
This is done using a BPF program that gets compiled and attached to
various syscalls:sys_enter_NAME tracepoints. It copies the
raw_syscalls:sys_enter tracepoint payload plus the contents of pointer
arguments to a "bpf-output" perf event via a BPF map, using the
"probe_read", "probe_read_str" and "perf_event_output" BPF functions.
The 'perf trace' codebase now just processes these augmented tracepoints
using the existing beautifiers, which now check whether
perf_sample->raw_data holds more than is expected for a normal syscall
enter tracepoint (the common preamble, syscall id, up to six parameters)
and, if so, pass the extra data to hand-crafted struct beautifiers.
This is just to show how to augment the existing tracepoints; work will
be done to use DWARF or BTF info to do the pretty-printing and to create
the collectors.
For now this is done using an example restricted C BPF program, but the
end goal is to have this all autogenerated and done transparently.
It's still useful to have this example, as one can use it as a skeleton
to write more involved filters; see the etcsnoop.c BPF example, for
instance.
E.g.:
# cd tools/perf/examples/bpf/
# perf trace -e augmented_syscalls.c ping -c 1 ::1
0.000 ( 0.008 ms): openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
0.020 ( 0.004 ms): openat(dfd: CWD, filename: /lib64/libcap.so.2, flags: CLOEXEC) = 3
0.051 ( 0.004 ms): openat(dfd: CWD, filename: /lib64/libidn.so.11, flags: CLOEXEC) = 3
0.076 ( 0.003 ms): openat(dfd: CWD, filename: /lib64/libcrypto.so.1.1, flags: CLOEXEC) = 3
0.106 ( 0.003 ms): openat(dfd: CWD, filename: /lib64/libresolv.so.2, flags: CLOEXEC) = 3
0.136 ( 0.004 ms): openat(dfd: CWD, filename: /lib64/libm.so.6, flags: CLOEXEC) = 3
0.194 ( 0.004 ms): openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
0.224 ( 0.003 ms): openat(dfd: CWD, filename: /lib64/libz.so.1, flags: CLOEXEC) = 3
0.252 ( 0.004 ms): openat(dfd: CWD, filename: /lib64/libdl.so.2, flags: CLOEXEC) = 3
0.275 ( 0.003 ms): openat(dfd: CWD, filename: /lib64/libpthread.so.0, flags: CLOEXEC) = 3
0.730 ( 0.007 ms): open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
PING ::1(::1) 56 data bytes
0.834 ( 0.008 ms): connect(fd: 5, uservaddr: { .family: INET6, port: 1025, addr: ::1 }, addrlen: 28) = 0
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.032 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.032/0.032/0.032/0.000 ms
0.914 ( 0.036 ms): sendto(fd: 4<socket:[843044]>, buff: 0x55b5e52e9720, len: 64, addr: { .family: INET6, port: 58, addr: ::1 }, addr_len: 28) = 64
#
Use 'perf trace -e augmented_syscalls.c,close ping -c 1 ::1' to see the
'close' calls as well, since 'close' is not one of the syscalls augmented
in that .c file.
(Arnaldo Carvalho de Melo)
- Alias 'umount' to 'umount2' (Benjamin Peterson)
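
The size check that the augmented syscall beautifiers rely on can be
sketched in plain user-space C. This is only an illustration under stated
assumptions: 'struct enter_payload', 'struct augmented_enter',
augmented_data() and format_filename() are hypothetical names and layouts
chosen for the example, not the ones used in tools/perf, and the filename
is assumed to sit in args[1] as it does for 'openat'.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical layout of a plain syscall-enter payload: the common
 * preamble, the syscall id and up to six parameters. */
struct enter_payload {
	unsigned long long common;
	long               syscall_nr;
	unsigned long      args[6];
};

/* Hypothetical augmented record: the plain payload followed by a copy of
 * the string that a pointer argument referred to. */
struct augmented_enter {
	struct enter_payload enter;
	char                 filename[64];
};

/* Return the bytes appended after the plain payload and store their
 * length in *extra, or return NULL when the record is no larger than a
 * normal syscall-enter payload. */
static const char *augmented_data(const void *raw, size_t raw_size,
				  size_t *extra)
{
	if (raw_size <= sizeof(struct enter_payload))
		return NULL;
	*extra = raw_size - sizeof(struct enter_payload);
	return (const char *)raw + sizeof(struct enter_payload);
}

/* Format the filename argument: use the copied string when the record is
 * augmented, otherwise fall back to the raw pointer value. Returns -1
 * when the record is too short to hold even the plain payload. */
static int format_filename(char *buf, size_t len, const void *raw,
			   size_t raw_size)
{
	const struct enter_payload *e = raw;
	size_t extra = 0;
	const char *aug;

	if (raw_size < sizeof(*e))
		return -1;
	aug = augmented_data(raw, raw_size, &extra);
	if (aug != NULL)
		return snprintf(buf, len, "filename: %.*s", (int)extra, aug);
	return snprintf(buf, len, "filename: %#lx", e->args[1]);
}
```

Passing sizeof(struct augmented_enter) as raw_size makes format_filename()
print the copied string, while passing sizeof(struct enter_payload) makes
it fall back to the pointer value, which is what a non-augmented syscall
such as 'close' would get.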
perf stat: (Jiri Olsa)
- Make many builtin-stat.c functions generic, moving display functions
to a separate file, prep work for adding the ability to store/display
stat data in perf record/top.
perf annotate: (Kim Phillips)
- Handle arm64 move instructions
perf report: (Thomas Richter)
- Create auxiliary trace data files for s390
libtraceevent: (Tzvetomir Stoyanov (VMware))
- Split trace-seq related APIs into a separate header file.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'perf-core-for-mingo-4.20-20180905' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo, as listed in the tag message above.