Future proof CBR packet decoding by passing through also the undefined
'reserved' byte in the packet payload.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-15-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add decoder support for PTWRITE, MWAIT, PWRE, PWRX and EXSTOP packets. This
patch only affects the decoder, so the tools still do not select or consume
the new information. That is added in subsequent patches.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-14-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The kernel now supports the disabling of branch tracing, however the
decoder assumes branch tracing is always enabled. Pass through a parameter
to indicate whether branch tracing is enabled and use it to avoid cases
when the decoder is expecting branch packets. There are 2 such cases.
First, FUP packets which can bind to an IP even when there is no branch
tracing. Secondly, the decoder will try to use branch packets to find an IP
to start decoding or to recover from errors.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-11-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Sometimes a FUP packet is associated with a TSX transaction and a flag is
set to indicate that. Ensure that flag is cleared on any error condition
because at that point the decoder can no longer assume it is correct.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-9-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The decoder will try to use branch packets to find an IP to start decoding
or to recover from errors. Currently the FUP packet is used only in the
case of an overflow, however there is no reason for that to be a special
case. So just use FUP always when scanning for an IP.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT uses IP compression based on the last IP. For decoding purposes,
'last IP' is not updated when a branch target has been suppressed, which is
indicated by IPBytes == 0. IPBytes is stored in the packet 'count', so
ensure never to set 'last_ip' when packet 'count' is zero.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT uses IP compression based on the last IP. For decoding
purposes, 'last IP' is considered to be reset to zero whenever there is
a synchronization packet (PSB). The decoder wasn't doing that, and was
treating the zero value to mean that there was no last IP, whereas
compression can be done against the zero value. Fix by setting last_ip
to zero when a PSB is received and keep track of have_last_ip.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The decoder uses its current timestamp in samples. Usually that is a
timestamp that has already passed, but in some cases it is a timestamp
for a branch that the decoder is walking towards, and consequently
hasn't reached. Improve that situation by using the pkt_state to
determine when to use the current or previous timestamp.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implementing a new --smi-cost mode in perf stat to measure SMI cost.
During the measurement, the /sys/device/cpu/freeze_on_smi will be set.
The measurement can be done with one counter (unhalted core cycles), and
two free running MSR counters (IA32_APERF and SMI_COUNT).
In practice, the percentages of SMI core cycles should be more useful
than absolute value. So the output will be the percentage of SMI core
cycles and SMI#. metric_only will be set by default.
SMI cycles% = (aperf - unhalted core cycles) / aperf
Here is an example output.
Performance counter stats for 'sudo echo ':
SMI cycles% SMI#
0.1% 1
0.010858678 seconds time elapsed
Users who wants to get the actual value can apply additional
--no-metric-only.
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1495825538-5230-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Curious as to what this was for I looked at /usr/include/ and only some
python headers define this, and it ends up being to enable "extensions"
on some old OSes:
/* Enable extensions on AIX 3, Interix */
I guess we can remove this one safely.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-omnundlxo2brs552bdl6m0j1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
While trying to reduce util.[ch] I noticed that fetch_kernel_version()
and fetch_ubuntu_kernel_version() do lots of operations only to check if
they are needed, i.e. it checks if the pointer where to return the
kernel version is NULL only after obtaining the kernel version from
/proc/version_signature or by parsing the results from uname().
Do it earlier not to confuse people reading this code in the future :-)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i94qwyekk4tzbu0b9ce1r1mz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And make it static, nobody else uses it, if we ever need it in more
places we can carve a new source file for process related methods,
for now lets reduce util.{c,h} a tad more.
Link: http://lkml.kernel.org/n/tip-zgb28rllvypjibw52aaz9p15@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In annotate browser, we will add support to check fused instructions.
While this is x86-specific feature so we need the annotate browser to
know what the arch it runs on.
symbol__disassemble() has figured out the arch. This patch just lets the
arch return from symbol__disassemble and save the arch in annotate
browser.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1497840958-4759-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These defines were probably dragged in from sampling support in earlier
patches. They can be put back when needed.
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170616112339.3fb6986e4ff33e353008244b@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To have a more compact way to ask the compiler to use a specific
alignment, making tools/ look more like kernel source code.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-8jiem6ubg9rlpbs7c2p900no@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To have a more compact way to ask the compiler to not insert alignment
paddings in a struct, making tools/ look more like kernel source code.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-byp46nr7hsxvvyc9oupfb40q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead of defining __unused or redefining __maybe_unused.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4eleto5pih31jw1q4dypm9pf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To have a more compact way to ask the compiler to perform scanf like
argument validation.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-yzqrhfjrn26lqqtwf55egg0h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To have a more compact way to ask the compiler to perform printf like
vargargs validation.
v2: Fixed up build on arm, squashing a patch by Kim Phillips, thanks!
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-dopkqmmuqs04cxzql0024nnu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To have a more compact way to specify that a function doesn't return,
instead of the open coded:
__attribute__((noreturn))
And use it instead of the tools/perf/ specific variation, NORETURN.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-l0y144qzixcy5t4c6i7pdiqj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With commit: 0a943cb10c (tools build: Add HOSTARCH Makefile variable)
when building for ARCH=x86_64, ARCH=x86_64 is passed to perf instead of
ARCH=x86, so the perf build process searchs header files from
tools/arch/x86_64/include, which doesn't exist.
The following build failure is seen:
In file included from util/event.c:2:0:
tools/include/uapi/linux/mman.h:4:27: fatal error: uapi/asm/mman.h: No such file or directory
compilation terminated.
Fix this issue by using SRCARCH instead of ARCH in perf, just like the
main kernel Makefile and tools/objtool's.
Signed-off-by: Jiada Wang <jiada_wang@mentor.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Eugeniu Rosca <erosca@de.adit-jv.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Rui Teng <rui.teng@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 0a943cb10c ("tools build: Add HOSTARCH Makefile variable")
Link: http://lkml.kernel.org/r/1491793357-14977-2-git-send-email-jiada_wang@mentor.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since commit 18e7a45af9 ("perf/x86: Reject non sampling events with
precise_ip") returns -EINVAL for sys_perf_event_open() with an attribute
with (attr.precise_ip > 0 && attr.sample_period == 0), just like is done
in the routine used to probe the max precise level when no events were
passed to 'perf record' or 'perf top', i.e.:
perf_evsel__new_cycles()
perf_event_attr__set_max_precise_ip()
The x86 code, in x86_pmu_hw_config(), which is called all the way from
sys_perf_event_open() did, starting with the aforementioned commit:
/* There's no sense in having PEBS for non sampling events: */
if (!is_sampling_event(event))
return -EINVAL;
Which makes it fail for cycles:ppp, cycles:pp and cycles:p, always using
just the non precise cycles variant.
To make sure that this is the case, I tested it, before this patch,
with:
# perf probe -L x86_pmu_hw_config
<x86_pmu_hw_config@/home/acme/git/linux/arch/x86/events/core.c:0>
0 int x86_pmu_hw_config(struct perf_event *event)
1 {
2 if (event->attr.precise_ip) {
<SNIP>
17 if (event->attr.precise_ip > precise)
18 return -EOPNOTSUPP;
/* There's no sense in having PEBS for non sampling events: */
21 if (!is_sampling_event(event))
22 return -EINVAL;
}
<SNIP>
# perf probe x86_pmu_hw_config:22
Added new events:
probe:x86_pmu_hw_config (on x86_pmu_hw_config:22)
probe:x86_pmu_hw_config_1 (on x86_pmu_hw_config:22)
You can now use it in all perf tools, such as:
perf record -e probe:x86_pmu_hw_config_1 -aR sleep 1
# perf trace -e perf_event_open,probe:x86_pmu_hwconfig*/max-stack=16/ perf record usleep 1
0.000 ( 0.015 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
0.015 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
x86_pmu_hw_config ([kernel.kallsyms])
hsw_hw_config ([kernel.kallsyms])
x86_pmu_event_init ([kernel.kallsyms])
perf_try_init_event ([kernel.kallsyms])
perf_event_alloc ([kernel.kallsyms])
SYSC_perf_event_open ([kernel.kallsyms])
sys_perf_event_open ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
perf_evsel__new_cycles (/home/acme/bin/perf)
perf_evlist__add_default (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
0.000 ( 0.021 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
0.023 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
0.025 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
x86_pmu_hw_config ([kernel.kallsyms])
hsw_hw_config ([kernel.kallsyms])
x86_pmu_event_init ([kernel.kallsyms])
perf_try_init_event ([kernel.kallsyms])
perf_event_alloc ([kernel.kallsyms])
SYSC_perf_event_open ([kernel.kallsyms])
sys_perf_event_open ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
perf_evsel__new_cycles (/home/acme/bin/perf)
perf_evlist__add_default (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
0.023 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
0.028 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
0.030 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
x86_pmu_hw_config ([kernel.kallsyms])
hsw_hw_config ([kernel.kallsyms])
x86_pmu_event_init ([kernel.kallsyms])
perf_try_init_event ([kernel.kallsyms])
perf_event_alloc ([kernel.kallsyms])
SYSC_perf_event_open ([kernel.kallsyms])
sys_perf_event_open ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
perf_evsel__new_cycles (/home/acme/bin/perf)
perf_evlist__add_default (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
0.028 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
41.018 ( 0.012 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8b5dd0, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
41.065 ( 0.011 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
41.080 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
41.103 ( 0.010 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
41.115 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
41.122 ( 0.004 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
41.128 ( 0.008 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ]
#
I.e. that return -EINVAL in x86_pmu_hw_config() is hit three times.
So fix it by just setting attr.sample_period
Now, after this patch:
# perf trace --max-stack=2 -e perf_event_open,probe:x86_pmu_hw_config* perf record usleep 1
[ perf record: Woken up 1 times to write data ]
0.000 ( 0.017 ms): perf/8469 perf_event_open(attr_uptr: 0x7ffe36c27d10, pid: -1, cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_event_open_cloexec_flag (/home/acme/bin/perf)
0.050 ( 0.031 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_evlist__config (/home/acme/bin/perf)
0.092 ( 0.040 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_evlist__config (/home/acme/bin/perf)
0.143 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, cpu: -1, group_fd: -1 ) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
0.161 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
0.171 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
0.180 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
0.190 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
[ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
#
The probe one called from perf_event_attr__set_max_precise_ip() works
the first time, with attr.precise_ip = 3, wit hthe next ones being the
per cpu ones for the cycles:ppp event.
And here is the text from a report and alternative proposed patch by
Thomas-Mich Richter:
---
On s390 the counter and sampling facility do not support a precise IP
skid level and sometimes returns EOPNOTSUPP when structure member
precise_ip in struct perf_event_attr is not set to zero.
On s390 commnd 'perf record -- true' fails with error EOPNOTSUPP. This
happens only when no events are specified on command line.
The functions called are
...
--> perf_evlist__add_default
--> perf_evsel__new_cycles
--> perf_event_attr__set_max_precise_ip
The last function determines the value of structure member precise_ip by
invoking the perf_event_open() system call and checking the return code.
The first successful open is the value for precise_ip.
However the value is determined without setting member sample_period and
indicates no sampling.
On s390 the counter facility and sampling facility are different. The
above procedure determines a precise_ip value of 3 using the counter
facility. Later it uses the sampling facility with a value of 3 and
fails with EOPNOTSUPP.
---
v2: Older compilers (e.g. gcc 4.4.7) don't support referencing members
of unnamed union members in the container struct initialization, so
move from:
struct perf_event_attr attr = {
...
.sample_period = 1,
};
to right after it as:
struct perf_event_attr attr = {
...
};
attr.sample_period = 1;
v3: We need to reset .sample_period to 0 to let the users of
perf_evsel__new_cycles() to properly setup attr.sample_period or
attr.sample_freq. Reported by Ingo Molnar.
Reported-and-Acked-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 18e7a45af9 ("perf/x86: Reject non sampling events with precise_ip")
Link: http://lkml.kernel.org/n/tip-yv6nnkl7tzqocrm0hl3x7vf1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The commit e7ee404757 ("perf symbols: Fix symbols searching for module
in buildid-cache") added the function to check kernel modules reside in
the build-id cache. This was because there's no way to identify a DSO
which is actually a kernel module. So it searched linkname of the file
and find ".ko" suffix.
But this does not work for compressed kernel modules and now such DSOs
hCcave correct symtab_type now. So no need to check it anymore. This
patch essentially reverts the commit.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-10-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The symsrc__init() overwrites dso->symtab_type as symsrc->type in
dso__load_sym(). But for compressed kernel modules in the build-id
cache, it should have original symtab type to be decompressed as needed.
This fixes perf annotate to show disassembly of the function properly.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On failure, it should free the 'name', so clean up the error path using
goto.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently perf decompresses kernel modules when loading the symbol table
but it missed to do it when reading raw data.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Convert open-coded decompress routine to use the function.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move decompress_kmodule() to util/dso.c and split it into two functions
returning fd and (decompressed) file path. The existing user only wants
the fd version but the path version will be used soon.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'name' variable should be freed on the error path.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The commit 6ebd2547dd ("perf annotate: Fix a bug following symbolic
link of a build-id file") changed to use dirname to follow the symlink.
But it only considers new-style build-id cache names so old names fail
on readlink() and force to use system path which might not available.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Fixes: 6ebd2547dd ("perf annotate: Fix a bug following symbolic link of a build-id file")
Link: http://lkml.kernel.org/r/20170608073109.30699-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Script generated by the '--gen-script' option contains an outdated
comment. It mentions a 'perf-trace-python' document while it has been
renamed to 'perf-script-python'. Fix it.
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 133dc4c39c ("perf: Rename 'perf trace' to 'perf script'")
Link: http://lkml.kernel.org/r/20170530111827.21732-2-sj38.park@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In some situations the libdw unwinder stopped working properly. I.e.
with libunwind we see:
~~~~~
heaptrack_gui 2228 135073.400112: 641314 cycles:
e8ed _dl_fixup (/usr/lib/ld-2.25.so)
15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
608f3 _GLOBAL__sub_I_kdynamicjobtracker.cpp (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
f199 call_init.part.0 (/usr/lib/ld-2.25.so)
f2a5 _dl_init (/usr/lib/ld-2.25.so)
db9 _dl_start_user (/usr/lib/ld-2.25.so)
~~~~~
But with libdw and without this patch this sample is not properly
unwound:
~~~~~
heaptrack_gui 2228 135073.400112: 641314 cycles:
e8ed _dl_fixup (/usr/lib/ld-2.25.so)
15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
~~~~~
Debug output showed me that libdw found a module for the last frame
address, but it thinks it belongs to /usr/lib/ld-2.25.so. This patch
double-checks what libdw sees and what perf knows. If the mappings
mismatch, we now report the elf known to perf. This fixes the situation
above, and the libdw unwinder produces the same stack as libunwind.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20170602143753.16907-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The decompress_kmodule() decompresses kernel modules in order to load
symbols from it. In the DSO_BINARY_TYPE__BUILD_ID_CACHE case, it needs
the full file path to extract the file extension to determine the
decompression method. But overwriting 'name' will fail the
decompression since it might point to a non-existing old file.
Instead, use dso->long_name for having the correct extension and use the
real filename to decompress.
In the DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP case, both names should
be the same. This allows resolving symbols in the old modules.
Before:
$ perf report -i perf.data.old | grep scsi_mod
0.00% cc1 [scsi_mod] [k] 0x0000000000004aa6
0.00% as [scsi_mod] [k] 0x00000000000099e1
0.00% cc1 [scsi_mod] [k] 0x0000000000009830
0.00% cc1 [scsi_mod] [k] 0x0000000000001b8f
After:
0.00% cc1 [scsi_mod] [k] scsi_handle_queue_ramp_up
0.00% as [scsi_mod] [k] scsi_sg_alloc
0.00% cc1 [scsi_mod] [k] scsi_setup_cmnd
0.00% cc1 [scsi_mod] [k] scsi_get_command
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170531120105.21731-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like machine__findnew_module_dso(), it should set necessary info for
kernel modules to find symbol info from the file. Factor out
dso__set_module_info() to do it.
This is needed for dso__needs_decompress() to detect such DSOs.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170531120105.21731-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When perf processes build-id event, it creates DSOs with the build-id.
But it didn't set the module short name (like '[module-name]') so when
processing a kernel mmap event of the module, it cannot found the DSO as
it only checks the short names.
That leads for perf to create a same DSO without the build-id info and
it'll lookup the system path even if the DSO is already in the build-id
cache. After kernel was updated, perf cannot find the DSO and cannot
show symbols in it anymore.
You can see this if you have an old data file (w/ old kernel version):
$ perf report -i perf.data.old -v |& grep scsi_mod
build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz : cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1
Failed to open /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz, continuing without symbols
...
The second message didn't show the build-id. With this patch:
$ perf report -i perf.data.old -v |& grep scsi_mod
build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz: cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1
/lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz with build id cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1 not found, continuing without symbols
...
Now it shows the build-id but still cannot load the symbol table. This
is a different problem which will be fixed in the next patch.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170531120105.21731-1-namhyung@kernel.org
[ Fix the build on older compilers (debian <= 8, fedora <= 21, etc) wrt kmod_path var init ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When filename contains special chars, perf annotate fails
with an error:
$ perf annotate --vmlinux ./vmlinux\(test\) --stdio native_safe_halt
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `objdump --start-address=0xffffffff8184e840
--stop-address=0xffffffff8184e848 -l -d --no-show-raw -S -C
./vmlinux(test) 2>/dev/null|grep -v ./vmlinux(test):|expand'
Fix it by surrounding filename in double quotes.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Adam Stylinski <adam.stylinski@etegent.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/20170505101417.2117-1-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The very last inlined frame, i.e. the one furthest away from the
non-inlined frame, was silently dropped. This is apparent when
comparing the output of `perf script` and `addr2line`:
~~~~~~
$ perf script --inline
...
a.out 26722 80836.309329: 72425 cycles:
21561 __hypot_finite (/usr/lib/libm-2.25.so)
ace3 hypot (/usr/lib/libm-2.25.so)
a4a main (a.out)
std::abs<double>
std::_Norm_helper<true>::_S_do_it<double>
std::norm<double>
main
20510 __libc_start_main (/usr/lib/libc-2.25.so)
bd9 _start (a.out)
$ addr2line -a -f -i -e /tmp/a.out a4a | c++filt
0x0000000000000a4a
std::__complex_abs(doublecomplex )
/usr/include/c++/6.3.1/complex:589
double std::abs<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:597
double std::_Norm_helper<true>::_S_do_it<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:654
double std::norm<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:664
main
/tmp/inlining.cpp:14
~~~~~
Note how `std::__complex_abs` is missing from the `perf script`
output. This is similarly showing up in `perf report`. The patch
here fixes this issue, and the output becomes:
~~~~~
a.out 26722 80836.309329: 72425 cycles:
21561 __hypot_finite (/usr/lib/libm-2.25.so)
ace3 hypot (/usr/lib/libm-2.25.so)
a4a main (a.out)
std::__complex_abs
std::abs<double>
std::_Norm_helper<true>::_S_do_it<double>
std::norm<double>
main
20510 __libc_start_main (/usr/lib/libc-2.25.so)
bd9 _start (a.out)
~~~~~
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-7-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
So far, the inlined nodes where only reversed when we built perf
against libbfd. If that was not available, the addr2line fallback
code path was missing the inline_list__reverse call.
Now we always add the nodes in the correct order within
inline_list__append. This removes the need to reverse the list
and also ensures that all callers construct the list in the right
order.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-6-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
As the documentation for dwfl_frame_pc says, frames that
are no activation frames need to have their program counter
decremented by one to properly find the function of the caller.
This fixes many cases where perf report currently attributes
the cost to the next line. I.e. I have code like this:
~~~~~~~~~~~~~~~
#include <thread>
#include <chrono>
using namespace std;
int main()
{
this_thread::sleep_for(chrono::milliseconds(1000));
this_thread::sleep_for(chrono::milliseconds(100));
this_thread::sleep_for(chrono::milliseconds(10));
return 0;
}
~~~~~~~~~~~~~~~
Now compile and record it:
~~~~~~~~~~~~~~~
g++ -std=c++11 -g -O2 test.cpp
echo 1 | sudo tee /proc/sys/kernel/sched_schedstats
perf record \
--event sched:sched_stat_sleep \
--event sched:sched_process_exit \
--event sched:sched_switch --call-graph=dwarf \
--output perf.data.raw \
./a.out
echo 0 | sudo tee /proc/sys/kernel/sched_schedstats
perf inject --sched-stat --input perf.data.raw --output perf.data
~~~~~~~~~~~~~~~
Before this patch, the report clearly shows the off-by-one issue.
Most notably, the last sleep invocation is incorrectly attributed
to the "return 0;" line:
~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........
100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:9
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:10
| __libc_start_main
| _start
|
--0.91%--main test.cpp:13
__libc_start_main
_start
~~~~~~~~~~~~~~~
With this patch here applied, the issue is fixed. The report becomes
much more usable:
~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........
100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:8
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:9
| __libc_start_main
| _start
|
--0.91%--main test.cpp:10
__libc_start_main
_start
~~~~~~~~~~~~~~~
Similarly it works for signal frames:
~~~~~~~~~~~~~~~
__noinline void bar(void)
{
volatile long cnt = 0;
for (cnt = 0; cnt < 100000000; cnt++);
}
__noinline void foo(void)
{
bar();
}
void sig_handler(int sig)
{
foo();
}
int main(void)
{
signal(SIGUSR1, sig_handler);
raise(SIGUSR1);
foo();
return 0;
}
~~~~~~~~~~~~~~~~
Before, the report wrongly points to `signal.c:29` after raise():
~~~~~~~~~~~~~~~~
$ perf report --stdio --no-children -g srcline -s srcline
...
100.00% signal.c:11
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:29
__libc_start_main
_start
~~~~~~~~~~~~~~~~
With this patch in, the issue is fixed and we instead get:
~~~~~~~~~~~~~~~~
100.00% signal signal [.] bar
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:27
__libc_start_main
_start
~~~~~~~~~~~~~~~~
Note how this patch fixes this issue for both unwinding methods, i.e.
both dwfl and libunwind. The former case is straight-forward thanks
to dwfl_frame_pc(). For libunwind, we replace the functionality via
unw_is_signal_frame() for any but the very first frame.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-4-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When a filename was found in addr2line it was duplicated via strdup()
but never freed. Now we pass NULL and handle this gracefully in
addr2line.
Detected by Valgrind:
==16331== 1,680 bytes in 21 blocks are definitely lost in loss record 148 of 220
==16331== at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16331== by 0x672FA69: strdup (in /usr/lib/libc-2.25.so)
==16331== by 0x52769F: addr2line (srcline.c:256)
==16331== by 0x52769F: addr2inlines (srcline.c:294)
==16331== by 0x52769F: dso__parse_addr_inlines (srcline.c:502)
==16331== by 0x574D7A: inline__fprintf (hist.c:41)
==16331== by 0x574D7A: ipchain__fprintf_graph (hist.c:147)
==16331== by 0x57518A: __callchain__fprintf_graph (hist.c:212)
==16331== by 0x5753CF: callchain__fprintf_graph.constprop.6 (hist.c:337)
==16331== by 0x57738E: hist_entry__fprintf (hist.c:628)
==16331== by 0x57738E: hists__fprintf (hist.c:882)
==16331== by 0x44A20F: perf_evlist__tty_browse_hists (builtin-report.c:399)
==16331== by 0x44A20F: report__browse_hists (builtin-report.c:491)
==16331== by 0x44A20F: __cmd_report (builtin-report.c:624)
==16331== by 0x44A20F: cmd_report (builtin-report.c:1054)
==16331== by 0x4A49CE: run_builtin (perf.c:296)
==16331== by 0x4A4CC0: handle_internal_command (perf.c:348)
==16331== by 0x434371: run_argv (perf.c:392)
==16331== by 0x434371: main (perf.c:530)
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-3-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I just hit a segfault when doing `perf report -g srcline`.
Valgrind pointed me at this code as the culprit:
==8359== Invalid read of size 8
==8359== at 0x3096D9: map__rip_2objdump (map.c:430)
==8359== by 0x2FC1A3: match_chain_srcline (callchain.c:645)
==8359== by 0x2FC1A3: match_chain (callchain.c:700)
==8359== by 0x2FC1A3: append_chain (callchain.c:895)
==8359== by 0x2FC1A3: append_chain_children (callchain.c:846)
==8359== by 0x2FF719: callchain_append (callchain.c:944)
==8359== by 0x2FF719: hist_entry__append_callchain (callchain.c:1058)
==8359== by 0x32FA06: iter_add_single_cumulative_entry (hist.c:908)
==8359== by 0x33195C: hist_entry_iter__add (hist.c:1050)
==8359== by 0x258F65: process_sample_event (builtin-report.c:204)
==8359== by 0x30D60C: perf_session__deliver_event (session.c:1310)
==8359== by 0x30D60C: ordered_events__deliver_event (session.c:119)
==8359== by 0x310D12: __ordered_events__flush (ordered-events.c:210)
==8359== by 0x310D12: ordered_events__flush.part.3 (ordered-events.c:277)
==8359== by 0x30DD3C: perf_session__process_user_event (session.c:1349)
==8359== by 0x30DD3C: perf_session__process_event (session.c:1475)
==8359== by 0x30FC3C: __perf_session__process_events (session.c:1867)
==8359== by 0x30FC3C: perf_session__process_events (session.c:1921)
==8359== by 0x25A985: __cmd_report (builtin-report.c:575)
==8359== by 0x25A985: cmd_report (builtin-report.c:1054)
==8359== by 0x2B9A80: run_builtin (perf.c:296)
==8359== Address 0x70 is not stack'd, malloc'd or (recently) free'd
This patch fixes the issue.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
[ Remove dependency from another change ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-2-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mostly in the documentation.
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170503131350.cebeecd8bd0f2968417626ab@arm.com
[ Fix spelling of "parameter" in one of the spell-checked lines ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Symbol versioning, as in glibc, results in symbols being defined as:
<real symbol>@[@]<version>
(Note that "@@" identifies a default symbol, if the symbol name is
repeated.)
perf is currently unable to deal with this, and is unable to create user
probes at such symbols:
--
$ nm /lib/powerpc64le-linux-gnu/libpthread.so.0 | grep pthread_create
0000000000008d30 t __pthread_create_2_1
0000000000008d30 T pthread_create@@GLIBC_2.17
$ /usr/bin/sudo perf probe -v -x /lib/powerpc64le-linux-gnu/libpthread.so.0 pthread_create
probe-definition(0): pthread_create
symbol:pthread_create file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Open Debuginfo file: /usr/lib/debug/lib/powerpc64le-linux-gnu/libpthread-2.19.so
Try to find probe point from debuginfo.
Probe point 'pthread_create' not found.
Error: Failed to add events. Reason: No such file or directory (Code: -2)
--
One is not able to specify the fully versioned symbol, either, due to
syntactic conflicts with other uses of "@" by perf:
--
$ /usr/bin/sudo perf probe -v -x /lib/powerpc64le-linux-gnu/libpthread.so.0 pthread_create@@GLIBC_2.17
probe-definition(0): pthread_create@@GLIBC_2.17
Semantic error :SRC@SRC is not allowed.
0 arguments
Error: Command Parse Error. Reason: Invalid argument (Code: -22)
--
This patch ignores versioning for default symbols, thus allowing probes to be
created for these symbols:
--
$ /usr/bin/sudo ./perf probe -x /lib/powerpc64le-linux-gnu/libpthread.so.0 pthread_create
Added new event:
probe_libpthread:pthread_create (on pthread_create in /lib/powerpc64le-linux-gnu/libpthread-2.19.so)
You can now use it in all perf tools, such as:
perf record -e probe_libpthread:pthread_create -aR sleep 1
$ /usr/bin/sudo ./perf record -e probe_libpthread:pthread_create -aR ./test 2
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.052 MB perf.data (2 samples) ]
$ /usr/bin/sudo ./perf script
test 2915 [000] 19124.260729: probe_libpthread:pthread_create: (3fff99248d38)
test 2916 [000] 19124.260962: probe_libpthread:pthread_create: (3fff99248d38)
$ /usr/bin/sudo ./perf probe --del=probe_libpthread:pthread_create
Removed event: probe_libpthread:pthread_create
--
Committer note:
Change the variable storing the result of strlen() to 'int', to fix the build
on debian:experimental-x-mipsel, fedora:24-x-ARC-uClibc, ubuntu:16.04-x-arm,
etc:
util/symbol.c: In function 'symbol__match_symbol_name':
util/symbol.c:422:11: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
if (len < versioning - name)
^
Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/c2b18d9c-17f8-9285-4868-f58b6359ccac@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That is the case of _text on s390, and we have some functions that return an
address, using address zero to report problems, oops.
This would lead the symbol loading routines to not use "_text" as the reference
relocation symbol, or the first symbol for the kernel, but use instead
"_stext", that is at the same address on x86_64 and others, but not on s390:
[acme@localhost perf-4.11.0-rc6]$ head -15 /proc/kallsyms
0000000000000000 T _text
0000000000000418 t iplstart
0000000000000800 T start
000000000000080a t .base
000000000000082e t .sk8x8
0000000000000834 t .gotr
0000000000000842 t .cmd
0000000000000846 t .parm
000000000000084a t .lowcase
0000000000010000 T startup
0000000000010010 T startup_kdump
0000000000010214 t startup_kdump_relocated
0000000000011000 T startup_continue
00000000000112a0 T _ehead
0000000000100000 T _stext
[acme@localhost perf-4.11.0-rc6]$
Which in turn would make 'perf test vmlinux' to fail because it wouldn't find
the symbols before "_stext" in kallsyms.
Fix it by using the return value only for errors and storing the
address, when the symbol is successfully found, in a provided pointer
arg.
Before this patch:
After:
[acme@localhost perf-4.11.0-rc6]$ tools/perf/perf test -v 1
1: vmlinux symtab matches kallsyms :
--- start ---
test child forked, pid 40693
Looking at the vmlinux_path (8 entries long)
Using /usr/lib/debug/lib/modules/3.10.0-654.el7.s390x/vmlinux for symbols
ERR : 0: _text not on kallsyms
ERR : 0x418: iplstart not on kallsyms
ERR : 0x800: start not on kallsyms
ERR : 0x80a: .base not on kallsyms
ERR : 0x82e: .sk8x8 not on kallsyms
ERR : 0x834: .gotr not on kallsyms
ERR : 0x842: .cmd not on kallsyms
ERR : 0x846: .parm not on kallsyms
ERR : 0x84a: .lowcase not on kallsyms
ERR : 0x10000: startup not on kallsyms
ERR : 0x10010: startup_kdump not on kallsyms
ERR : 0x10214: startup_kdump_relocated not on kallsyms
ERR : 0x11000: startup_continue not on kallsyms
ERR : 0x112a0: _ehead not on kallsyms
<SNIP warnings>
test child finished with -1
---- end ----
vmlinux symtab matches kallsyms: FAILED!
[acme@localhost perf-4.11.0-rc6]$
After:
[acme@localhost perf-4.11.0-rc6]$ tools/perf/perf test -v 1
1: vmlinux symtab matches kallsyms :
--- start ---
test child forked, pid 47160
<SNIP warnings>
test child finished with 0
---- end ----
vmlinux symtab matches kallsyms: Ok
[acme@localhost perf-4.11.0-rc6]$
Reported-by: Michael Petlan <mpetlan@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-9x9bwgd3btwdk1u51xie93fz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Both had copies originating from git.git, move those to
tools/lib/string.c, getting both tools/lib/subcmd/ and tools/perf/ to
use it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-uidwtticro1qhttzd2rkrkg1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Its basically to do units handling, so move to a more appropriately
named object.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-90ob9vfepui24l8l2makhd9u@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is a perl specific hack, so move it from util.h to where perl
headers are used.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4igctbinuom2sr6g4b03jqht@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just one more step into splitting util.[ch] to reduce the includes hell.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-navarr9mijkgwgbzu464dwam@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
More needs to be done to have the actual functions and variables in a
smaller .c file that can then be included in the python binding,
avoiding dragging more stuff into it.
Link: http://lkml.kernel.org/n/tip-uecxz7cqkssouj7tlxrkqpl4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Recent commit broke command name strip in perf_event__get_comm_ids
function. It replaced left to right search for '\n' with rtrim, which
actually does right to left search. It occasionally caught earlier '\n'
and kept trash in the command name.
Keeping the ltrim, but moving back the left to right '\n' search
instead of the rtrim.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Taeung Song <treeze.taeung@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Fixes: bdd97ca63f ("perf tools: Refactor the code to strip command name with {l,r}trim()")
Link: http://lkml.kernel.org/r/20170420092430.29657-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The util/event.h header needs PERF_ALIGN(), but wasn't including
linux/kernel.h, where it is defined, instead it was getting it by
luck by including map.h, which it doesn't need at all.
Fix it by including the right header.
Link: http://lkml.kernel.org/n/tip-nf3t9blzm5ncoxsczi8oy9mx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As it is going away from util.h, where it is not needed.
This is mostly for things like MAXPATHLEN, MAX() and MIN(), these later
two probably should go away in favor of its kernel sources replacements.
Link: http://lkml.kernel.org/n/tip-z1666f3fl3fqobxvjr5o2r39@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf mem report' doesn't display the data source snoop indication correctly.
In the kernel API the definition is:
#define PERF_MEM_SNOOP_NONE 0x02 /* no snoop */
#define PERF_MEM_SNOOP_HIT 0x04 /* snoop hit */
#define PERF_MEM_SNOOP_MISS 0x08 /* snoop miss */
but the table used by the perf tools exchanged "Hit" and "Miss":
"None",
"Miss",
"Hit",
Fix the table in perf.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20170419174940.13641-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Out of util.h, to disentangle it a bit more.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vpksyj3w5fk9t8s6mxmkajyr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remnants from the git codebase.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kwaez3uxo1w9f8v5r7etl0w6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The users of regex and fnmatch functions should include those headers
instead.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ixzm5kuamsq1ixbkuv6kmwzj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The files using the dirent.h routines should instead include it,
reducing the includes hell that lead to longer build times.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-42g2f4z6nfg7mdb2ae97n7tj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead of getting it out of luck from util.h, where it isn't needed at
all.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-0bqugg5lc5ksla1v4m0dnmc1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When we switched to the kernel's roundup_pow_of_two we forgot to remove
this include from util.h, do it now.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 91529834d1 ("perf evlist: Use roundup_pow_of_two")
Link: http://lkml.kernel.org/n/tip-kfye5rxivib6155cltx0bw4h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Disentangling util.h header mess a bit more.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-aj6je8ly377i4upedmjzdsq6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Continuing the disentanglement, mostly the TUI needs CTRL(c), that is
in sys/ttydefaults.h and term.c needs the termios headers.
And term.h needs to be added to a few places too.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-il19zna7qj9ytavdbwlipc7t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There are places where we just need a forward declaration, and others
were we need to include strlist.h and/or strfilter.h, reducing the
impact of changes in headers on the build time, do it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-zab42gbiki88y9k0csorxekb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Removing it from util.h, part of an effort to disentangle the includes
hell, that makes changes to util.h or something included by it to cause
a complete rebuild of the tools.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ztrjy52q1rqcchuy3rubfgt2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving them from util.h, where they don't belong. Since libc already
have string.h, name it slightly differently, as string2.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-eh3vz5sqxsrdd8lodoro4jrw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Out of util.h into a new file, srcline.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ludnlm4djqcdjziekzr4s3u9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Continuing the split of util.[ch] into more manageable bits.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5eu367rwcwnvvn7fz09l7xpb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
More stuff that came from git, out of the hodge-podge that is util.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-e3lana4gctz3ub4hn4y29hkw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Should make sense for windows, where git is supported.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-lzxlhmqrizk72d0zcsreggy8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Both do the same thing, the later is the one we get from
linux/stringify.h, i.e. we now use the same function name/practice as
the kernel sources.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-w2sxa5o4bfx7fjrd5mu4zmke@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We get them from inttypes.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qla4e4mwbf1oewafp1ee2etd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Needed to use the PRI[xu](32,64) formatting macros.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wkbho8kaw24q67dd11q0j39f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
TYPEOF(), for instance, was only used by MSB() that wasn't used at all,
besides typeof() is used in many places, should be the preferred way.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-golox8oa2w1oq28snki14z6s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pave the way for further cleanups where linux/kernel.h may stop being
included in some header.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qqxan6tfsl6qx3l0v3nwgjvk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To match the kernel, then look for places redefining it to make it use
this version, which checks that its parameter is an array at build time.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-txlcf1im83bcbj6kh0wxmyy8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We rely on symbol->name[0] since the beginning of tools/perf/, never
having received any complaint about it, also all the containers build
perf just fine, so remove this git codebase remnant.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jsjpgojut8e22o2gtz83augk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In https://lkml.org/lkml/2017/2/2/16 I reported a build error that I
believed was caused by wrong uapi includes. The synthom was fixed by
Arnaldo in:
commit 2f7db55579 ("perf tools: Fix include of linux/mman.h")
but I was wrong attributing the problem to the uapi include.
The root cause was that I was using ARCH=x86_64, hence using the wrong
uapi include path. This explains why no one else ran into this build
problem.
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170412064919.92449-8-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Besides memory allocation failure, tips.txt may fail to load because the
file is not found (a more likely cause).
Communicate that to the user in tips failure warning.
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170412064919.92449-5-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
(This is a patch has been sitting in the Intel CQM/CMT driver series for
a while, despite not depend on it. Sending it now independently since
the series is being discarded.)
When an event is in error state, read() returns 0 instead of sizeof()
buffer. In certain modes, such as interval printing, ignoring the 0
return value may cause bogus count deltas to be computed and thus
invalid results printed.
This patch fixes this problem by modifying read_counters() to mark the
event as not scaled (scaled = -1) to force the printout routine to show
<NOT COUNTED>.
Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: David Carrillo-Cisneros <davidcc@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170412182301.44406-1-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When parsing disassemble lines for source line number, use a stripped
line instead of raw line.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1491612748-1605-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When parsing disassemble lines, use ltrim() and rtrim() to strip them,
not using just while loop and isspace().
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1491612748-1605-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pipe-mode has no perf.data header, hence no upfront knowledge of presend
and missing features, hence, do not print missing features in pipe-mode.
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170410201432.24807-8-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Session sets a number parameters that rely on evlist. These parameters
are not used in pipe-mode and should not be set, since evlist is
unavailable. Fix that.
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170410201432.24807-6-davidcc@google.com
[ Check if file != NULL in perf_session__new(), like when used by builtin-top.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
__perf_session__process_pipe_events reuses the same memory buffer to
process all events in the pipe.
When reordering is needed (e.g. -b option), events are not immediately
flushed, but kept around until reordering is possible, causing
memory corruption.
The problem is usually observed by a "Unknown sample error" output. It
can easily be reproduced by:
perf record -o - noploop | perf inject -b > output
Committer testing:
Before:
$ perf record -o - stress -t 2 -c 2 | perf inject -b > /dev/null
stress: info: [8297] dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd
stress: info: [8297] successful run completed in 2s
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
Warning:
Found 1 unknown events!
Is this an older tool processing a perf.data file generated by a more recent tool?
If that is not the case, consider reporting to linux-kernel@vger.kernel.org.
$
After:
$ perf record -o - stress -t 2 -c 2 | perf inject -b > /dev/null
stress: info: [9027] dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd
stress: info: [9027] successful run completed in 2s
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
no symbols found in /usr/bin/stress, maybe install a debug package?
no symbols found in /usr/bin/stress, maybe install a debug package?
$
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170410201432.24807-3-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implement simple detection for all kind of jumps and branches.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Andreas Krebbel <krebbel@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-s390 <linux-s390@vger.kernel.org>
Cc: stable@kernel.org # v4.10+
Link: http://lkml.kernel.org/r/1491465112-45819-3-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
since 4.10 perf annotate exits on s390 with an "unknown error -95".
Turns out that commit 786c1b5184 ("perf annotate: Start supporting
cross arch annotation") added a hard requirement for architecture
support when objdump is used but only provided x86 and arm support.
Meanwhile power was added so lets add s390 as well.
While at it make sure to implement the branch and jump types.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Andreas Krebbel <krebbel@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-s390 <linux-s390@vger.kernel.org>
Cc: stable@kernel.org # v4.10+
Fixes: 786c1b5184 "perf annotate: Start supporting cross arch annotation"
Link: http://lkml.kernel.org/r/1491465112-45819-2-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We don't need to use strlen(), a var, or check for the end explicitely,
isspace('\0') is false:
[acme@jouet c]$ cat ltrim.c
#include <ctype.h>
#include <stdio.h>
static char *ltrim(char *s)
{
while (isspace(*s))
++s;
return s;
}
int main(void)
{
printf("ltrim(\"\")='%s'\n", ltrim(""));
return 0;
}
[acme@jouet c]$ ./ltrim
ltrim("")=''
[acme@jouet c]$
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/n/tip-w3nk0x3pai2vojk2ab6kdvaw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After reading command name from /proc/<pid>/status, use ltrim() and
rtrim() to strip command name, not using just while loop, isspace() and
etc.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1491575061-704-6-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The kernel has a special check for a specific irq_vectors trace event.
TRACE_EVENT_PERF_PERM(irq_work_exit,
is_sampling_event(p_event) ? -EPERM : 0);
The perf-record fails for this irq_vectors event when it is present,
like when using a wildcard:
root@skl:/tmp# perf record -a -e irq_vectors:* sleep 2
Error:
You may not have permission to collect system-wide stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).
The current value is 2:
-1: Allow use of (almost) all events by all users
>= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK
>= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
>= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN
To make this setting permanent, edit /etc/sysctl.conf too, e.g.:
kernel.perf_event_paranoid = -1
This patch prints out the exact sub event that failed with EPERM for
wildcards to help in understanding what went wrong when this event is
present:
After the patch:
root@skl:/tmp# perf record -a -e irq_vectors:* sleep 2
Error:
No permission to enable irq_vectors:irq_work_exit event.
You may not have permission to collect system-wide stats.
......
Committer notes:
So we have a lot of irq_vectors events:
[root@jouet ~]# perf list irq_vectors:*
List of pre-defined events (to be used in -e):
irq_vectors:call_function_entry [Tracepoint event]
irq_vectors:call_function_exit [Tracepoint event]
irq_vectors:call_function_single_entry [Tracepoint event]
irq_vectors:call_function_single_exit [Tracepoint event]
irq_vectors:deferred_error_apic_entry [Tracepoint event]
irq_vectors:deferred_error_apic_exit [Tracepoint event]
irq_vectors:error_apic_entry [Tracepoint event]
irq_vectors:error_apic_exit [Tracepoint event]
irq_vectors:irq_work_entry [Tracepoint event]
irq_vectors:irq_work_exit [Tracepoint event]
irq_vectors:local_timer_entry [Tracepoint event]
irq_vectors:local_timer_exit [Tracepoint event]
irq_vectors:reschedule_entry [Tracepoint event]
irq_vectors:reschedule_exit [Tracepoint event]
irq_vectors:spurious_apic_entry [Tracepoint event]
irq_vectors:spurious_apic_exit [Tracepoint event]
irq_vectors:thermal_apic_entry [Tracepoint event]
irq_vectors:thermal_apic_exit [Tracepoint event]
irq_vectors:threshold_apic_entry [Tracepoint event]
irq_vectors:threshold_apic_exit [Tracepoint event]
irq_vectors:x86_platform_ipi_entry [Tracepoint event]
irq_vectors:x86_platform_ipi_exit [Tracepoint event]
#
And some may be sampled:
[root@jouet ~]# perf record -e irq_vectors:local* sleep 20s
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.020 MB perf.data (2 samples) ]
[root@jouet ~]# perf report -D | egrep 'stats:|events:'
Aggregated stats:
TOTAL events: 155
MMAP events: 144
COMM events: 2
EXIT events: 1
SAMPLE events: 2
MMAP2 events: 4
FINISHED_ROUND events: 1
TIME_CONV events: 1
irq_vectors:local_timer_entry stats:
TOTAL events: 1
SAMPLE events: 1
irq_vectors:local_timer_exit stats:
TOTAL events: 1
SAMPLE events: 1
[root@jouet ~]#
But, as shown in the tracepoint definition at the start of this message,
some, like "irq_vectors:irq_work_exit", may not be sampled, just counted,
i.e. if we try to sample, as when using 'perf record', we get an error:
[root@jouet ~]# perf record -e irq_vectors:irq_work_exit
Error:
You may not have permission to collect system-wide stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid,
<SNIP>
The error message is misleading, this patch will help in pointing out
what is the event causing such an error, but the error message needs
improvement, i.e. we need to figure out a way to check if a tracepoint
is counting only, like this one, when all we can do is to count it with
'perf stat', at most printing the delta using interval printing, as in:
[root@jouet ~]# perf stat -I 5000 -e irq_vectors:irq_work_*
# time counts unit events
5.000168871 0 irq_vectors:irq_work_entry
5.000168871 0 irq_vectors:irq_work_exit
10.000676730 0 irq_vectors:irq_work_entry
10.000676730 0 irq_vectors:irq_work_exit
15.001122415 0 irq_vectors:irq_work_entry
15.001122415 0 irq_vectors:irq_work_exit
20.001298051 0 irq_vectors:irq_work_entry
20.001298051 0 irq_vectors:irq_work_exit
25.001485020 1 irq_vectors:irq_work_entry
25.001485020 1 irq_vectors:irq_work_exit
30.001658706 0 irq_vectors:irq_work_entry
30.001658706 0 irq_vectors:irq_work_exit
^C 32.045711878 0 irq_vectors:irq_work_entry
32.045711878 0 irq_vectors:irq_work_exit
[root@jouet ~]#
But at least, when we use a wildcard, this patch helps a bit.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1491566932-503-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The option 'show-total-period' works fine without a option '-l'. But if
running 'perf annotate --stdio -l --show-total-period', you can see a
problem showing only zero '0' for number of samples.
Before:
$ perf annotate --stdio -l --show-total-period
...
0 : 400816: push %rbp
0 : 400817: mov %rsp,%rbp
0 : 40081a: mov %edi,-0x24(%rbp)
0 : 40081d: mov %rsi,-0x30(%rbp)
0 : 400821: mov -0x24(%rbp),%eax
0 : 400824: mov -0x30(%rbp),%rdx
0 : 400828: mov (%rdx),%esi
0 : 40082a: mov $0x0,%edx
...
The reason is it was missed to set number of samples of
source_line_samples, so set it ordinarily.
After:
$ perf annotate --stdio -l --show-total-period
...
3 : 400816: push %rbp
4 : 400817: mov %rsp,%rbp
0 : 40081a: mov %edi,-0x24(%rbp)
0 : 40081d: mov %rsi,-0x30(%rbp)
1 : 400821: mov -0x24(%rbp),%eax
2 : 400824: mov -0x30(%rbp),%rdx
0 : 400828: mov (%rdx),%esi
1 : 40082a: mov $0x0,%edx
...
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin Liska <mliska@suse.cz>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 0c4a5bcea4 ("perf annotate: Display total number of samples with --show-total-period")
Link: http://lkml.kernel.org/r/1490703125-13643-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The callers of perf_read_values__enlarge_counters() already propagate
errors, so just print some debug diagnostics and handle allocation
failures gracefully, not trying to do silly things like 'a =
realloc(a)'.
Link: http://lkml.kernel.org/n/tip-nsmmh7uzpg35rzcl9nq7yztp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently we fail in the following case:
$ unset HOME
$ ./perf record ls
$ echo $?
255
It's because the config code init fails due to a missing HOME variable
value. Fix this by skipping the user config init if there's no HOME
variable value.
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170330144637.7468-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Trivial fix to spelling mistake in pr_debug message.
Signed-off-by: Colin King <colin.king@canonical.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20170330095440.19444-1-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For some platforms, for example Broadwell, it doesn't support cycles
for LBR. But the perf always prints cycles:0, it's not necessary.
The patch refactors the LBR info print code and drops the cycles:0.
For example: perf report --branch-history --no-children --stdio
On Broadwell:
--0.91%--__random_r random_r.c:394 (iterations:2)
__random_r random_r.c:360 (predicted:0.0%)
__random_r random_r.c:380 (predicted:0.0%)
__random_r random_r.c:357
On Skylake:
--1.07%--main div.c:39 (predicted:52.4% cycles:1 iterations:17)
main div.c:44 (predicted:52.4% cycles:1)
main div.c:42 (cycles:2)
compute_flag div.c:28 (cycles:2)
compute_flag div.c:27 (cycles:1)
rand rand.c:28 (cycles:1)
rand rand.c:28 (cycles:1)
__random random.c:298 (cycles:1)
__random random.c:297 (cycles:1)
__random random.c:295 (cycles:1)
__random random.c:295 (cycles:1)
__random random.c:295 (cycles:1)
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1489046786-10061-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
SDT marker argument is in N@OP format. N is the size of argument and OP
is the actual assembly operand. OP is arch dependent component and hence
it's parsing logic also should be placed under tools/perf/arch/.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170328094754.3156-3-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This came from 'git', but isn't documented anywhere in
tools/perf/Documentation/, looks like baggage we can do without, ditch
it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-e7uwkn60t4hmlnwj99ba4t2s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
New features:
- Handle inline functions in callchains (Jin Yao)
- Enable sorting by srcline as key (Milian Wolff)
Fixes:
- Fix no_size logic in addr_filter__resolve_kernel_syms() in the
auxtrace code (Adrian Hunter)
- Fix some thread refcount leaks in 'perf trace' (Arnaldo Carvalho de Melo)
- Fix divide by zero when calculating percent for an event in a group in
the annotate by source line code (Taeung Song)
- build-id files now aren't anymore symlinks, their parent directories
are, so readlink the later (Taeung Song)
- Assorted fixes for null termination problems, mostly related to
readlink, detected by valgrind (Tommi Rantala)
Infrastructure:
- Make vfs_getname probe point logic in 'perf trace' more robust
wrt length of pathname (Arnaldo Carvalho de Melo)
- Remove unused 'prefix' parameter from builtins main functions (Arnaldo Carvalho de Melo)
- Show 'perf list sdt' option in man page (Ravi Bangoria)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJY2WNUAAoJENZQFvNTUqpAZ3AQAIn/Q+Y665oP57RbikedeifL
He8vdMUkD/haRo0atbvuu5tRrwiRUabkUa6GKPHNCDl8GUD6UbkztUirL4Cq4v9s
7ONbCHXzaPnPZbDbl/W7Yx4vADow3YMR9EyNkL8/i2ApZqMCPQ9mUBhxJlSDp7RY
agYcOugUlYuvHsKVX59fTyvTAq8btfyFQTqhJ+NPddcxsyR5jam9XxxvgMURdFJr
h6OLO9wqCxlMctqlGXU+6tpqiAR+bp8UZgzDKwabGR4mZR+uLBYGf0FUQz52vf2A
83ufaZ5UrQUsSnVeYXBPW+i8+Ixu8pEOFDMDcSpk/wQXunLlN52LmuatSCkPBEV1
jFth8SX3IAX349hpaRBNuLk5UuqS6NKBztYzlaVsKMpuIw4hRPVE3VvqKefZD/hx
Vdlr1v6fPXMcRUcc3lFFiVCIvs0hRV4IDDIimGjJHf8dm+GFMHH+bk+tfiSQAlmZ
q3aSKMImUM3vlD01E4BmTVr4IEZHTd3mv0Ml+nbQGNj6Bu2364eBsFRnNHJWwGmt
c9tcnmeRv6JzrmprVXMuOUyyTcml+b5/vincEEmTxUdbxCbYFkQS3JzPxfpxqFI/
zM5rlJJ9KKWXmwD6OgUoXT5IUzq4BuIVyJ3DxwuL2rrQggsv0zORxQtVduY+IJSj
ZD/Qu7SOiFfnAFM6kLwP
=Lm/M
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-4.12-20170327' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
New features:
- Handle inline functions in callchains (Jin Yao)
- Enable sorting by srcline as key (Milian Wolff)
Fixes:
- Fix no_size logic in addr_filter__resolve_kernel_syms() in the
auxtrace code (Adrian Hunter)
- Fix some thread refcount leaks in 'perf trace' (Arnaldo Carvalho de Melo)
- Fix divide by zero when calculating percent for an event in a group in
the annotate by source line code (Taeung Song)
- build-id files now aren't anymore symlinks, their parent directories
are, so readlink the later (Taeung Song)
- Assorted fixes for null termination problems, mostly related to
readlink, detected by valgrind (Tommi Rantala)
Infrastructure changes:
- Make vfs_getname probe point logic in 'perf trace' more robust
wrt length of pathname (Arnaldo Carvalho de Melo)
- Remove unused 'prefix' parameter from builtins main functions (Arnaldo Carvalho de Melo)
- Show 'perf list sdt' option in man page (Ravi Bangoria)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Simplification: it is easier to open /proc/self/exe than /proc/$pid/exe.
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.21881-7-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ensure that the string that we read from the data file is null terminated.
Valgrind was complaining:
==31357== Invalid read of size 1
==31357== at 0x4EC8C1: __strtok_r_1c (string2.h:200)
==31357== by 0x4EC8C1: parse_ftrace_printk (trace-event-parse.c:161)
==31357== by 0x4F82A8: read_ftrace_printk (trace-event-read.c:204)
==31357== by 0x4F82A8: trace_report (trace-event-read.c:468)
==31357== by 0x4CD552: process_tracing_data (header.c:1576)
==31357== by 0x4D3397: perf_file_section__process (header.c:2705)
==31357== by 0x4D3397: perf_header__process_sections (header.c:2488)
==31357== by 0x4D3397: perf_session__read_header (header.c:2925)
==31357== by 0x4E71E2: perf_session__open (session.c:32)
==31357== by 0x4E71E2: perf_session__new (session.c:139)
==31357== by 0x429F5D: cmd_annotate (builtin-annotate.c:472)
==31357== by 0x497150: run_builtin (perf.c:359)
==31357== by 0x428CE0: handle_internal_command (perf.c:421)
==31357== by 0x428CE0: run_argv (perf.c:467)
==31357== by 0x428CE0: main (perf.c:614)
==31357== Address 0x8ac0efb is 0 bytes after a block of size 1,963 alloc'd
==31357== at 0x4C2DB9D: malloc (vg_replace_malloc.c:299)
==31357== by 0x4F827B: read_ftrace_printk (trace-event-read.c:195)
==31357== by 0x4F827B: trace_report (trace-event-read.c:468)
==31357== by 0x4CD552: process_tracing_data (header.c:1576)
==31357== by 0x4D3397: perf_file_section__process (header.c:2705)
==31357== by 0x4D3397: perf_header__process_sections (header.c:2488)
==31357== by 0x4D3397: perf_session__read_header (header.c:2925)
==31357== by 0x4E71E2: perf_session__open (session.c:32)
==31357== by 0x4E71E2: perf_session__new (session.c:139)
==31357== by 0x429F5D: cmd_annotate (builtin-annotate.c:472)
==31357== by 0x497150: run_builtin (perf.c:359)
==31357== by 0x428CE0: handle_internal_command (perf.c:421)
==31357== by 0x428CE0: run_argv (perf.c:467)
==31357== by 0x428CE0: main (perf.c:614)
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.21881-6-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ensure that we have space for the null byte in buf.
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.21881-5-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Valgrind was complaining:
$ valgrind ./perf list >/dev/null
==11643== Memcheck, a memory error detector
==11643== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==11643== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==11643== Command: ./perf list
==11643==
==11643== Conditional jump or move depends on uninitialised value(s)
==11643== at 0x4C30620: rindex (vg_replace_strmem.c:199)
==11643== by 0x49DAA9: build_id_cache__origname (build-id.c:198)
==11643== by 0x49E1C7: build_id_cache__valid_id (build-id.c:222)
==11643== by 0x49E1C7: build_id_cache__list_all (build-id.c:507)
==11643== by 0x4B9C8F: print_sdt_events (parse-events.c:2067)
==11643== by 0x4BB0B3: print_events (parse-events.c:2313)
==11643== by 0x439501: cmd_list (builtin-list.c:53)
==11643== by 0x497150: run_builtin (perf.c:359)
==11643== by 0x428CE0: handle_internal_command (perf.c:421)
==11643== by 0x428CE0: run_argv (perf.c:467)
==11643== by 0x428CE0: main (perf.c:614)
[...]
Additionally, a zero length result from readlink() is not very interesting.
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.21881-3-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Valgrind was complaining:
==2633== Syscall param open(filename) points to unaddressable byte(s)
==2633== at 0x5281CC0: __open_nocancel (syscall-template.S:84)
==2633== by 0x537D38: open (fcntl2.h:53)
==2633== by 0x537D38: get_sdt_note_list (symbol-elf.c:2017)
==2633== by 0x5396FD: probe_cache__scan_sdt (probe-file.c:700)
==2633== by 0x49EA2C: build_id_cache__add_sdt_cache (build-id.c:625)
==2633== by 0x49EA2C: build_id_cache__add_s (build-id.c:697)
==2633== by 0x49EE72: build_id_cache__add_b (build-id.c:717)
==2633== by 0x49EE72: dso__cache_build_id (build-id.c:782)
==2633== by 0x49F190: __dsos__cache_build_ids (build-id.c:793)
==2633== by 0x49F190: machine__cache_build_ids (build-id.c:801)
==2633== by 0x49F190: perf_session__cache_build_ids (build-id.c:815)
==2633== by 0x4CD4F2: write_build_id (header.c:165)
==2633== by 0x4D26F7: do_write_feat (header.c:2296)
==2633== by 0x4D26F7: perf_header__adds_write (header.c:2335)
==2633== by 0x4D26F7: perf_session__write_header (header.c:2414)
==2633== by 0x43B324: __cmd_record (builtin-record.c:1154)
==2633== by 0x43B324: cmd_record (builtin-record.c:1839)
==2633== by 0x455A07: __cmd_record (builtin-kmem.c:1868)
==2633== by 0x455A07: cmd_kmem (builtin-kmem.c:1944)
==2633== by 0x497150: run_builtin (perf.c:359)
==2633== by 0x428CE0: handle_internal_command (perf.c:421)
==2633== by 0x428CE0: run_argv (perf.c:467)
==2633== by 0x428CE0: main (perf.c:614)
==2633== Address 0x0 is not stack'd, malloc'd or (recently) free'd
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
Link: http://lkml.kernel.org/r/20170322130624.21881-2-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is wrong way to read link name from a build-id file. Because a
build-id file is not anymore a symbolic link but build-id directory of
it is symbolic link, so fix it.
For example, if build-id file name gotten from
dso__build_id_filename() is as below,
/root/.debug/.build-id/4f/75c7d197c951659d1c1b8b5fd49bcdf8f3f8b1/elf
To correctly read link name of build-id, use the build-id dir path that
is a symbolic link, instead of the above build-id file name like below.
/root/.debug/.build-id/4f/75c7d197c951659d1c1b8b5fd49bcdf8f3f8b1
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1490598638-13947-2-git-send-email-treeze.taeung@gmail.com
Fixes: 01412261d9 ("perf buildid-cache: Use path/to/bin/buildid/elf instead of path/to/bin/buildid")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Often it is interesting to know how costly a given source line is in
total. Previously, one had to build these sums manually based on all
addresses that pointed to the same source line. This patch introduces
srcline as a sort key, which will do the aggregation for us.
Paired with the recent addition of showing inline frames, this makes
perf report much more useful for many C++ work loads.
The following shows the new feature in action. First, let's show the
status quo output when we sort by address. The result contains many hist
entries that generate the same output:
~~~~~~~~~~~~~~~~
$ perf report --stdio --inline -g address
# Children Self Command Shared Object Symbol
# ........ ........ ............ ................... .........................................
#
99.89% 35.34% cpp-inlining cpp-inlining [.] main
|
|--64.55%--main complex:655
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/complex:664 (inline)
| |
| |--60.31%--hypot +20
| | |
| | |--8.52%--__hypot_finite +273
| | |
| | |--7.32%--__hypot_finite +411
...
--35.34%--_start +4194346
__libc_start_main +241
|
|--6.65%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
|
|--2.70%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
|
|--1.69%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
...
~~~~~~~~~~~~~~~~
With this patch and `-g srcline` we instead get the following output:
~~~~~~~~~~~~~~~~
$ perf report --stdio --inline -g srcline
# Children Self Command Shared Object Symbol
# ........ ........ ............ ................... .........................................
#
99.89% 35.34% cpp-inlining cpp-inlining [.] main
|
|--64.55%--main complex:655
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/complex:664 (inline)
| |
| |--64.02%--hypot
| | |
| | --59.81%--__hypot_finite
| |
| --0.53%--cabs
|
--35.34%--_start
__libc_start_main
|
|--12.48%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
...
~~~~~~~~~~~~~~~~
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/20170318214928.9047-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If the address belongs to an inlined function, the source information
back to the first non-inlined function will be printed.
For example:
1. Show inlined function name
perf report -g function --inline
- 0.69% 0.00% inline ld-2.23.so [.] dl_main
- dl_main
0.56% _dl_relocate_object
_dl_relocate_object (inline)
elf_dynamic_do_Rela (inline)
2. Show the file/line information
perf report -g address --inline
- 0.69% 0.00% inline ld-2.23.so [.] _dl_start
_dl_start rtld.c:307
/build/glibc-GKVZIf/glibc-2.23/elf/rtld.c:413 (inline)
+ _dl_sysdep_start dl-sysdep.c:250
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-6-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It takes some time to look for inline stack for callgraph addresses. So
it provides new option "--inline" to let user decide if enable this
feature.
--inline:
If a callgraph address belongs to an inlined function, the inline stack
will be printed. Each entry is the inline function name or file/line.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It would be useful for perf to support a mode to query the inline stack
for a given callgraph address. This would simplify finding the right
code in code that does a lot of inlining.
The srcline.c has contained the code which supports to translate the
address to filename:line_nr. This patch just extends the function to let
it support getting the inline stacks.
It introduces the inline_list which will store the inline function
result (filename:line_nr and funcname).
If BFD lib is not supported, the result is only filename:line_nr.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce dso__name() and filename_split() out of existing code because
these codes will be used in several places in next patch.
For filename_split(), it may also solve a potential memory leak in
existing code. In existing addr2line(),
sep = strchr(filename, ':');
if (sep) {
*sep++ = '\0';
*file = filename;
*line_nr = strtoul(sep, NULL, 0);
ret = 1;
}
out:
pclose(fp);
return ret;
If sep is NULL, filename is not freed or returned via file.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Address filtering with kernel symbols incorrectly resulted in the error
"Cannot determine size of symbol" because the no_size logic was the wrong
way around.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org # v4.9+
Link: http://lkml.kernel.org/r/1490357752-27942-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the printing of perf expressions and internal events to a new
clearer --details flag, instead of lumping it together with other debug
options in --debug. This makes it clearer to use.
Before
perf list --debug
...
unc_m_power_critical_throttle_cycles
[Cycles all ranks are in critical thermal throttle. Unit: uncore_imc]
uncore_imc_2/event=0x86/ MetricName: power_critical_throttle_cycles % MetricExpr: (unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100.
after
perf list --details
...
unc_m_power_critical_throttle_cycles
[Cycles all ranks are in critical thermal throttle. Unit: uncore_imc]
uncore_imc_2/event=0x86/ MetricName: power_critical_throttle_cycles % MetricExpr: (unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/20170320201711.14142-14-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for a new JSON event attribute to name MetricExpr for better
output in perf stat.
If the event has no MetricName it uses the normal event name instead to
describe the metric.
Before
% perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only
time unc_p_freq_max_os_cycles
1.000149775 15.7
2.000344807 19.3
3.000502544 16.7
4.000640656 6.6
5.000779955 9.9
After
% perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only
time freq_max_os_cycles %
1.000149775 15.7
2.000344807 19.3
3.000502544 16.7
4.000640656 6.6
5.000779955 9.9
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-13-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Output the metric expr in perf list when --debug is specified, so that
the user can check the formula.
Before:
% perf list
...
unc_m_power_channel_ppd
[Cycles where DRAM ranks are in power down (CKE) mode. Derived from unc_m_power_channel_ppd. Unit:
uncore_imc]
uncore_imc_2/event=0x85/
After:
% perf list --debug
...
unc_m_power_channel_ppd
[Cycles where DRAM ranks are in power down (CKE) mode. Derived from unc_m_power_channel_ppd. Unit:
uncore_imc]
Perf: uncore_imc_2/event=0x85/ MetricExpr: (unc_m_power_channel_ppd / unc_m_clockticks) * 100.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-12-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add generic infrastructure to perf stat to output ratios for
"MetricExpr" entries in the event lists. Many events are more useful as
ratios than in raw form, typically some count in relation to total
ticks.
Transfer the MetricExpr information from the alias to the evsel.
We mark the events that need to be collected for MetricExpr, and also
link the events using them with a pointer. The code is careful to always
prefer the right event in the same group to minimize multiplexing
errors. At the moment only a single relation is supported.
Then add a rblist to the stat shadow code that remembers stats based on
the cpu and context.
Then finally update and retrieve and print these values similarly to the
existing hardcoded perf metrics. We use the simple expression parser
added earlier to evaluate the expression.
Normally we just output the result without further commentary, but for
--metric-only this would lead to empty columns. So for this case use the
original event as description.
There is no attempt to automatically add the MetricExpr event, if it is
missing, however we suggest it to the user, because the user tool
doesn't have enough information to reliably construct a group that is
guaranteed to schedule. So we leave that to the user.
% perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}'
1.000147889 800,085,181 unc_p_clockticks
1.000147889 93,126,241 unc_p_freq_max_os_cycles # 11.6
2.000448381 800,218,217 unc_p_clockticks
2.000448381 142,516,095 unc_p_freq_max_os_cycles # 17.8
3.000639852 800,243,057 unc_p_clockticks
3.000639852 162,292,689 unc_p_freq_max_os_cycles # 20.3
% perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only
# time freq_max_os_cycles %
1.000127077 0.9
2.000301436 0.7
3.000456379 0.0
v2: Change from DivideBy to MetricExpr
v3: Use expr__ prefix. Support more than one other event.
v4: Update description
v5: Only print warning message once for multiple PMUs.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-11-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for parsing the MetricExpr header in the JSON event lists
and storing them in the alias structure.
Used in the next patch.
v2: Change DividedBy to MetricExpr
v3: Really catch all uses of DividedBy
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-10-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a simple expression parser good enough to parse JSON relation
expressions. The parser is implemented using bison.
This is just intended as an simple parser for internal usage in the
event lists, not the beginning of a "perf scripting language"
v2: Use expr__ prefix instead of expr_
Support multiple free variables for parser
Committer note:
The v2 patch had:
%define api.pure full
In expr.y, that is a feature introduced in bison 2.7, to have reentrant
parsers, not using global variables, which would make tools/perf stop
building with the bison version shipped in older distros, so Andi
realised that the other parsers (e.g. parse-events.y) were using:
%pure-parser
Which is present in older versions of bison and fits the bill.
I added:
CFLAGS_expr-bison.o += -DYYENABLE_NLS=0 -DYYLTYPE_IS_TRIVIAL=0 -w
To finally make it build, copying what was there for pmu-bison.o,
another parser.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-8-andi@firstfloor.org
[ stdlib.h is needed in tests/expr.c for free() fixing build in systems such as ubuntu:16.04-x-s390 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Special case uncore_ prefix in PMU match, to allow for shorter event
uncore specifications.
Before:
perf stat -a -e uncore_cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1
After
perf stat -a -e cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1
Committer tests:
# perf list uncore
List of pre-defined events (to be used in -e):
uncore_cbox_0/clockticks/ [Kernel PMU event]
uncore_cbox_1/clockticks/ [Kernel PMU event]
uncore_imc/data_reads/ [Kernel PMU event]
uncore_imc/data_writes/ [Kernel PMU event]
# perf stat -a -e cbox_0/clockticks/ sleep 1
Performance counter stats for 'system wide':
281,474,976,653,084 cbox_0/clockticks/
1.000870129 seconds time elapsed
#
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/20170320201711.14142-7-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the user specifies a pmu directly, expand it automatically with a
prefix match for all available PMUs, similar as we do for the normal
aliases now.
This allows to specify attributes for duplicated boxes quickly. For
example uncore_cbox_{0,6}/.../ can be now specified as uncore_cbox/.../
and it gets automatically expanded for all boxes.
This generally makes it more concise to write uncore specifications, and
also avoids the need to know the exact topology of the system.
Before:
% perf stat -a -e uncore_cbox_0/event=0x35,umask=0x1,filter_opc=0x19C/,\
uncore_cbox_1/event=0x35,umask=0x1,filter_opc=0x19C/,\
uncore_cbox_2/event=0x35,umask=0x1,filter_opc=0x19C/,\
uncore_cbox_3/event=0x35,umask=0x1,filter_opc=0x19C/,\
uncore_cbox_4/event=0x35,umask=0x1,filter_opc=0x19C/,\
uncore_cbox_5/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1
After:
% perf stat -a -e uncore_cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1
v2: Handle all bison rules. Move multi add code to separate function.
Handle uncore_ prefix correctly.
v3: Move parse_events_multi_pmu_add to separate patch. Move uncore
prefix check to separate patch.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-6-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factor out the PMU name matching in the event parser into a separate
function, to use the same code for other grammar rules later.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The uncore PMU has a lot of duplicated PMUs for different subsystems.
When expanding an uncore alias we usually end up with a large
number of identically named aliases, which makes perf stat
output difficult to read.
Automatically sum them up in perf stat, unless --no-merge is specified.
This can be default because only the uncores generally have duplicated
aliases. Other PMUs have unique names.
Before:
% perf stat --no-merge -a -e unc_c_llc_lookup.any sleep 1
Performance counter stats for 'system wide':
694,976 Bytes unc_c_llc_lookup.any
706,304 Bytes unc_c_llc_lookup.any
956,608 Bytes unc_c_llc_lookup.any
782,720 Bytes unc_c_llc_lookup.any
605,696 Bytes unc_c_llc_lookup.any
442,816 Bytes unc_c_llc_lookup.any
659,328 Bytes unc_c_llc_lookup.any
509,312 Bytes unc_c_llc_lookup.any
263,936 Bytes unc_c_llc_lookup.any
592,448 Bytes unc_c_llc_lookup.any
672,448 Bytes unc_c_llc_lookup.any
608,640 Bytes unc_c_llc_lookup.any
641,024 Bytes unc_c_llc_lookup.any
856,896 Bytes unc_c_llc_lookup.any
808,832 Bytes unc_c_llc_lookup.any
684,864 Bytes unc_c_llc_lookup.any
710,464 Bytes unc_c_llc_lookup.any
538,304 Bytes unc_c_llc_lookup.any
1.002577660 seconds time elapsed
After:
% perf stat -a -e unc_c_llc_lookup.any sleep 1
Performance counter stats for 'system wide':
2,685,120 Bytes unc_c_llc_lookup.any
1.002648032 seconds time elapsed
v2: Split collect_aliases. Rename alias flag.
v3: Make sure unsupported/not counted is always printed.
v4: Factor out callback change into separate patch.
v5: Move check for bad results here
Move merged check into collect_data
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The source code line number (lineno) needs to be kept in accross calls
to symbol__parse_objdump_line() when parsing the output of 'objdump -l
-dS', so that it can associate it with the instructions till the next
line.
See disasm_line__new() and struct disasm_line::line_nr.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-7hpx8f8ybdpiujceysaj229w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'grep -v "filename"' applied to the objdump command output cause a
side effect eliminating filename:linenr of output of 'objdump -l' if the
object file name and source file name are the same, fix it.
E.g. the output of the following objdump command in symbol__disassemble():
$ objdump -l -d -S -C /home/taeung/hello --start-address=...
/home/taeung/hello: file format elf64-x86-64
Disassembly of section .text:
0000000000400526 <main>:
main():
/home/taeung/hello.c:4
void main()
{
400526: 55 push %rbp
400527: 48 89 e5 mov %rsp,%rbp
/home/taeung/hello.c:5
...
But it uses grep -v "filename" e.g. "/home/taeung/hello" in the objdump
command to remove the first line containing file name and file format
("/home/taeung/hello: file format elf64-x86-64"):
Before:
$ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello
But this causes a side effect, removing filename:linenr too, because the
object file and source file have the same name e.g. "/home/taueng/hello",
"/home/taeung/hello.c"
So more do a better match by using grep -v as below to correctly remove
that first line:
"/home/taeung/hello: file format elf64-x86-64"
After:
$ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello:
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1489978617-31396-5-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
An sdt probe can be associated with arguments but they were not passed
to the user probe tracing interface (uprobe_events); this patch adapts
the sdt argument descriptors according to the uprobe input format.
As the uprobe parser does not support scaled address mode, perf will
skip arguments which cannot be adapted to the uprobe format.
Here are the results:
$ perf buildid-cache -v --add test_sdt
$ perf probe -x test_sdt sdt_libfoo:table_frob
$ perf probe -x test_sdt sdt_libfoo:table_diddle
$ perf record -e sdt_libfoo:table_frob -e sdt_libfoo:table_diddle test_sdt
$ perf script
test_sdt ... 666.255678: sdt_libfoo:table_frob: (4004d7) arg0=0 arg1=0
test_sdt ... 666.255683: sdt_libfoo:table_diddle: (40051a) arg0=0 arg1=0
test_sdt ... 666.255686: sdt_libfoo:table_frob: (4004d7) arg0=1 arg1=2
test_sdt ... 666.255689: sdt_libfoo:table_diddle: (40051a) arg0=3 arg1=4
test_sdt ... 666.255692: sdt_libfoo:table_frob: (4004d7) arg0=2 arg1=4
test_sdt ... 666.255694: sdt_libfoo:table_diddle: (40051a) arg0=6 arg1=8
Signed-off-by: Alexis Berlemont <alexis.berlemont@gmail.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20161214000732.1710-3-alexis.berlemont@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
During a "perf buildid-cache --add" command, the section ".note.stapsdt"
of the "added" binary is scanned in order to list the available SDT
markers available in a binary. The parts containing the probes arguments
were left unscanned.
The whole section is now parsed; the probe arguments are extracted for
later use.
Signed-off-by: Alexis Berlemont <alexis.berlemont@gmail.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20161214000732.1710-2-alexis.berlemont@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There are many SDT markers in powerpc whose uprobe definition goes
beyond current MAX_CMDLEN, especially when target filename is long and
sdt marker has long list of arguments. For example, definition of sdt
marker
method__compile__end: 8@17 8@9 8@10 -4@8 8@7 -4@6 8@5 -4@4 1@37(28)
from file
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.91-2.b14.fc22.ppc64/jre/lib/ppc64/server/libjvm.so
is
p:sdt_hotspot/method__compile__end /usr/lib/jvm/java-1.8.0-openjdk-\
1.8.0.91-2.b14.fc22.ppc64/jre/lib/ppc64/server/libjvm.so:0x4c4e00\
arg1=%gpr17:u64 arg2=%gpr9:u64 arg3=%gpr10:u64 arg4=%gpr8:s32\
arg5=%gpr7:u64 arg6=%gpr6:s32 arg7=%gpr5:u64 arg8=%gpr4:s32\
arg9=+37(%gpr28):u8
'perf probe' fails with segfault for such markers. As the uprobe_events
file accepts definitions up to 4094 characters(4096 - 2 (\n\0)),
increase value of MAX_CMDLEN match that.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170207054547.3690-1-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'*ntevs' contains number of elements present in 'tevs' array. If there
are no elements in array, 'tevs2' can be directly assigned to 'tevs'
without allocating more space. So the condition should be '*ntevs == 0'
not 'ntevs == 0'.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 42bba263eb ("perf probe: Allow wildcard for cached events")
Link: http://lkml.kernel.org/r/20170308065908.4128-1-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch decodes the 'partial' flag in AUX records and prints
a warning to the user, so that they don't have to guess why their
PT traces contain gaps (or missing altogether):
Warning:
AUX data had gaps in it 8 times out of 8!
Are you running a KVM guest in the background?
Trying to be even more helpful, we will detect if the user's kvm driver sets up
exclusive VMX root mode for the entire lifespan of the kvm process:
Reloading kvm_intel module with vmm_exclusive=0
will reduce the gaps to only guest's timeslices.
Note however, that you'll still have gaps in cpu-wide traces even with
vmm_exclusive=0, but the number of gaps will be below 100% (as opposed to the
above example).
Currently this is the only reason for partial records.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vince@deater.net>
Link: http://lkml.kernel.org/r/8760j941ig.fsf@ashishki-desk.ger.corp.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The current symbols__fixup_end() heuristic for the last entry in the rb
tree is suboptimal as it leads to not being able to recognize the symbol
in the call graph in a couple of corner cases, for example:
i) If the symbol has a start address (f.e. exposed via kallsyms)
that is at a page boundary, then the roundup(curr->start, 4096)
for the last entry will result in curr->start == curr->end with
a symbol length of zero.
ii) If the symbol has a start address that is shortly before a page
boundary, then also here, curr->end - curr->start will just be
very few bytes, where it's unrealistic that we could perform a
match against.
Instead, change the heuristic to roundup(curr->start, 4096) + 4096, so
that we can catch such corner cases and have a better chance to find
that specific symbol. It's still just best effort as the real end of the
symbol is unknown to us (and could even be at a larger offset than the
current range), but better than the current situation.
Alexei reported that he recently run into case i) with a JITed eBPF
program (these are all page aligned) as the last symbol which wasn't
properly shown in the call graph (while other eBPF program symbols in
the rb tree were displayed correctly). Since this is a generic issue,
lets try to improve the heuristic a bit.
Reported-and-Tested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Fixes: 2e538c4a18 ("perf tools: Improve kernel/modules symbol lookup")
Link: http://lkml.kernel.org/r/bb5c80d27743be6f12afc68405f1956a330e1bc9.1489614365.git.daniel@iogearbox.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
New features:
- Add 'brstackinsn' field in 'perf script' to reuse the x86 instruction
decoder used in the Intel PT code to study hot paths to samples (Andi Kleen)
Kernel:
- Default UPROBES_EVENTS to Y (Alexei Starovoitov)
- Fix check for kretprobe offset within function entry (Naveen N. Rao)
Infrastructure:
- Introduce util func is_sdt_event() (Ravi Bangoria)
- Make perf_event__synthesize_mmap_events() scale on older kernels where
reading /proc/pid/maps is way slower than reading /proc/pid/task/pid/maps (Stephane Eranian)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJYyrdSAAoJENZQFvNTUqpAe+4P/3c4ilBSOxLCCxGO7jDYo9oq
/KqlvsCIg7+vo5eqrOUJAb4qXFnvpYxwjMMkL5rx7gdsBCRfRXIINGWUMrq5mNyk
MgxuqYnp+yRuxLYml2wn+tdwLzcHWSN2EO9mqQ14N4I+HvgdLmVPQ44ACQXs6KfL
dk/Ix8YtnFWl2sDZjvyr7ZBqwCPzzklZgHM6erxNUr/WJspzUiixAWqUmewodOUl
P3PitlHXkITOK3AxSqOjJ4g1k933215nGih7hr0XdjEm4pIYaYksShQ6k9DASCrv
dn2o1pF1LTu7KCtAo70aaSB7GXydwoA//o2gRbDkSwJJ25DIImZxJXQz9PAYDOo1
vXSIhmlQ72c4/Yv/XzVOrIoMMMpmWKS3lGZxMVGR/Ie9Gw4kbotkaoEqEpNQsaDZ
iIaU5v/EcvvToT7T7VHrGg0+vmHgYxm5gSlyASi2IrO2/wJAs0v2pYfuL6gYhXGp
mhv/pHUv4l9OW+Ubm+zJEEcg337c2RQU5wT/bk4PihxY6nQyEH2Pn5VzdNbZLuMR
eWnqTH/md+8/bkhmuZJp71wm60oPHoPvbDjvtfVmXAa52AzO+NWSc9Veke3C/QRm
XgNkrXlzeKopEso3j4gw2iAolqw9t8FHFLGgbTkS+6UCKjAM7vNLiIV02LQqhM50
qCnKEusMDCRgzeOXxYt+
=Bg5M
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-4.12-20170316' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
New features:
- Add 'brstackinsn' field in 'perf script' to reuse the x86 instruction
decoder used in the Intel PT code to study hot paths to samples (Andi Kleen)
Kernel changes:
- Default UPROBES_EVENTS to Y (Alexei Starovoitov)
- Fix check for kretprobe offset within function entry (Naveen N. Rao)
Infrastructure changes:
- Introduce util func is_sdt_event() (Ravi Bangoria)
- Make perf_event__synthesize_mmap_events() scale on older kernels where
reading /proc/pid/maps is way slower than reading /proc/pid/task/pid/maps (Stephane Eranian)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Implement printing instruction sequences as hex dump for branch stacks.
This relies on the x86 instruction decoder used by the PT decoder to
find the lengths of instructions to dump them individually.
This is good enough for pattern matching.
This allows to study hot paths for individual samples, together with
branch misprediction and cycle count / IPC information if available (on
Skylake systems).
% perf record -b ...
% perf script -F brstackinsn
...
read_hpet+67:
ffffffff9905b843 insn: 74 ea # PRED
ffffffff9905b82f insn: 85 c9
ffffffff9905b831 insn: 74 12
ffffffff9905b833 insn: f3 90
ffffffff9905b835 insn: 48 8b 0f
ffffffff9905b838 insn: 48 89 ca
ffffffff9905b83b insn: 48 c1 ea 20
ffffffff9905b83f insn: 39 f2
ffffffff9905b841 insn: 89 d0
ffffffff9905b843 insn: 74 ea # PRED
Only works when no special branch filters are specified.
Occasionally the path does not reach up to the sample IP, as the LBRs
may be frozen before executing a final jump. In this case we print a
special message.
The instruction dumper piggy backs on the existing infrastructure from
the IP PT decoder.
An earlier iteration of this patch relied on a disassembler, but this
version only uses the existing instruction decoder.
Committer note:
Added hint about how to get suitable perf.data files for use with
'-F brstackinsm':
$ perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ]
$
$ perf script -F brstackinsn
Display of branch stack assembler requested, but non all-branch filter set
Hint: run 'perf record -b ...'
$
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: http://lkml.kernel.org/r/20170223234634.583-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch significantly improves the execution time of
perf_event__synthesize_mmap_events() when running perf record on systems
where processes have lots of threads.
It just happens that cat /proc/pid/maps support uses a O(N^2) algorithm to
generate each map line in the maps file. If you have 1000 threads, then you
have necessarily 1000 stacks. For each vma, you need to check if it
corresponds to a thread's stack. With a large number of threads, this can take
a very long time. I have seen latencies >> 10mn.
As of today, perf does not use the fact that a mapping is a stack, therefore we
can work around the issue by using /proc/pid/tasks/pid/maps. This entry does
not try to map a vma to stack and is thus much faster with no loss of
functonality.
The proc-map-timeout logic is kept in case users still want some upper limit.
In V2, we fix the file path from /proc/pid/tasks/pid/maps to actual
/proc/pid/task/pid/maps, tasks -> task. Thanks Arnaldo for catching this.
Committer note:
This problem seems to have been elliminated in the kernel since commit :
b18cb64ead ("fs/proc: Stop trying to report thread stacks").
Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170315135059.GC2177@redhat.com
Link: http://lkml.kernel.org/r/1489598233-25586-1-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We indicate support for accepting sym+offset with kretprobes through a
line in ftrace README. Parse the same to identify support and choose the
appropriate format for kprobe_events.
As an example, without this perf patch, but with the ftrace changes:
naveen@ubuntu:~/linux/tools/perf$ sudo cat /sys/kernel/debug/tracing/README | grep kretprobe
place (kretprobe): [<module>:]<symbol>[+<offset>]|<memaddr>
naveen@ubuntu:~/linux/tools/perf$
naveen@ubuntu:~/linux/tools/perf$ sudo ./perf probe -v do_open%return
probe-definition(0): do_open%return
symbol:do_open file:(null) line:0 offset:0 return:1 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /boot/vmlinux for symbols
Open Debuginfo file: /boot/vmlinux
Try to find probe point from debuginfo.
Matched function: do_open [2d0c7d8]
Probe point found: do_open+0
Matched function: do_open [35d76b5]
found inline addr: 0xc0000000004ba984
Failed to find "do_open%return",
because do_open is an inlined function and has no return point.
An error occurred in debuginfo analysis (-22).
Trying to use symbols.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: r:probe/do_open do_open+0
Writing event: r:probe/do_open_1 do_open+0
Added new events:
probe:do_open (on do_open%return)
probe:do_open_1 (on do_open%return)
You can now use it in all perf tools, such as:
perf record -e probe:do_open_1 -aR sleep 1
naveen@ubuntu:~/linux/tools/perf$ sudo cat /sys/kernel/debug/kprobes/list
c000000000041370 k kretprobe_trampoline+0x0 [OPTIMIZED]
c0000000004433d0 r do_open+0x0 [DISABLED]
c0000000004433d0 r do_open+0x0 [DISABLED]
And after this patch (and the subsequent powerpc patch):
naveen@ubuntu:~/linux/tools/perf$ sudo ./perf probe -v do_open%return
probe-definition(0): do_open%return
symbol:do_open file:(null) line:0 offset:0 return:1 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /boot/vmlinux for symbols
Open Debuginfo file: /boot/vmlinux
Try to find probe point from debuginfo.
Matched function: do_open [2d0c7d8]
Probe point found: do_open+0
Matched function: do_open [35d76b5]
found inline addr: 0xc0000000004ba984
Failed to find "do_open%return",
because do_open is an inlined function and has no return point.
An error occurred in debuginfo analysis (-22).
Trying to use symbols.
Opening /sys/kernel/debug/tracing//README write=0
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: r:probe/do_open _text+4469712
Writing event: r:probe/do_open_1 _text+4956248
Added new events:
probe:do_open (on do_open%return)
probe:do_open_1 (on do_open%return)
You can now use it in all perf tools, such as:
perf record -e probe:do_open_1 -aR sleep 1
naveen@ubuntu:~/linux/tools/perf$ sudo cat /sys/kernel/debug/kprobes/list
c000000000041370 k kretprobe_trampoline+0x0 [OPTIMIZED]
c0000000004433d0 r do_open+0x0 [DISABLED]
c0000000004ba058 r do_open+0x8 [DISABLED]
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/496ef9f33c1ab16286ece9dd62aa672807aef91c.1488961018.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Simplify and separate out the ftrace README scanning logic into a
separate helper. This is used subsequently to scan for all patterns of
interest and to cache the result.
Since we are only interested in availability of probe argument type x,
we will only scan for that.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/6dc30edc747ba82a236593be6cf3a046fa9453b5.1488961018.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch introduces a cgroup identifier entry field in perf report to
identify or distinguish data of different cgroups. It uses the device
number and inode number of cgroup namespace, included in perf data with
the new PERF_RECORD_NAMESPACES event, as cgroup identifier.
With the assumption that each container is created with it's own cgroup
namespace, this allows assessment/analysis of multiple containers at
once.
A simple test for this would be to clone a few processes passing
SIGCHILD & CLONE_NEWCROUP flags to each of them, execute shell and run
different workloads on each of those contexts, while running perf
record command with --namespaces option.
Shown below is the output of perf report, sorted with cgroup identifier,
on perf.data generated with the above test scenario, clearly indicating
one context's considerable use of kernel memory in comparison with
others:
$ perf report -s cgroup_id,sample --stdio
#
# Total Lost Samples: 0
#
# Samples: 5K of event 'kmem:kmalloc'
# Event count (approx.): 5965
#
# Overhead cgroup id (dev/inode) Samples
# ........ ..................... ............
#
81.27% 3/0xeffffffb 4848
16.24% 3/0xf00000d0 969
1.16% 3/0xf00000ce 69
0.82% 3/0xf00000cf 49
0.50% 0/0x0 30
While this is a start, there is further scope of improving this. For
example, instead of cgroup namespace's device and inode numbers, dev
and inode numbers of some or all namespaces may be used to distinguish
which processes are running in a given container context.
Also, scripts to map device and inode info to containers sounds
plausible for better tracing of containers.
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/148891933338.25309.756882900782042645.stgit@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
by the kernel when fork, clone, setns or unshare are invoked. And update
perf-record documentation with the new option to record namespace
events.
Committer notes:
Combined it with a later patch to allow printing it via 'perf report -D'
and be able to test the feature introduced in this patch. Had to move
here also perf_ns__name(), that was introduced in another later patch.
Also used PRIu64 and PRIx64 to fix the build in some enfironments wrt:
util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
^
Testing it:
# perf record --namespaces -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
#
# perf report -D
<SNIP>
3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
[0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
0x1151e0 [0x30]: event: 9
.
. ... raw event: size 48 bytes
. 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h....
. 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c....
. 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................
<SNIP>
NAMESPACES events: 1
<SNIP>
#
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Skip the sample which doesn't have branch_info to avoid segmentation
fault:
The fault can be reproduced by:
perf record -a
perf report -F cycles
Signed-off-by: Changbin Du <changbin.du@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 0e332f033a ("perf tools: Add support for cycles, weight branch_info field")
Link: http://lkml.kernel.org/r/20170313083148.23568-1-changbin.du@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead of trying to go on adding more ifdef conditions, do a feature
test and define HAVE_SCHED_GETCPU_SUPPORT instead, then use it to
provide the prototype. No need to change the stub, as it is already a
__weak symbol.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-yge89er9g90sc0v6k0a0r5tr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make system wide (-a) the default option if no target was specified and
one of following conditions is met:
- there's no workload specified (current behaviour)
- there is workload specified but all requested
events are system wide ones
Mixed events core/uncore with workload:
$ perf stat -e 'uncore_cbox_0/clockticks/,cycles' sleep 1
Performance counter stats for 'sleep 1':
<not supported> uncore_cbox_0/clockticks/
980,489 cycles
1.000897406 seconds time elapsed
Uncore event with workload:
$ perf stat -e 'uncore_cbox_0/clockticks/' sleep 1
Performance counter stats for 'system wide':
281,473,897,192,670 uncore_cbox_0/clockticks/
1.000833784 seconds time elapsed
Committer note:
When testing I realized the default case for !root, i.e. no events
passed via -e, was broke by v2 of this patch, reported and after a
patch provided by Jiri it is back working:
[acme@jouet linux]$ perf stat usleep 1
Performance counter stats for 'usleep 1':
0.401335 task-clock:u (msec) # 0.297 CPUs utilized
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
48 page-faults:u # 0.120 M/sec
458,146 cycles:u # 1.142 GHz
245,113 instructions:u # 0.54 insn per cycle
47,991 branches:u # 119.578 M/sec
4,022 branch-misses:u # 8.38% of all branches
0.001350029 seconds time elapsed
[acme@jouet linux]$
Suggested-and-Tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170227094818.GA12764@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
$ perf test decoder
57: x86 instruction decoder - new instructions : FAILED!
$
Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 80 78 56 34 12 bndstx %bnd0,0x12345678(%rax)
Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 85 78 56 34 12 bndstx %bnd0,0x12345678(%rbp)
Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 84 01 78 56 34 12 bndstx %bnd0,0x12345678(%rcx,%rax,1)
Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 84 05 78 56 34 12 bndstx %bnd0,0x12345678(%rbp,%rax,1)
Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 84 08 78 56 34 12 bndstx %bnd0,0x12345678(%rax,%rcx,1)
There is missing initialization. It only affects the test because it is
checking 'rel' even in cases where there is no value.
Fix it.
Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/08c6ad07-7994-3e56-b20e-d75727ca7765@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The cpu_map__snprint_mask() generates a string representation of a
cpumask bitmap. For cpu 0 to 11, it'll return "fff".
Committer notes:
Fix compiler warning on some toolchains:
19 fedora:24-x-ARC-uClibc: FAIL
CC /tmp/build/perf/util/cpumap.o
util/cpumap.c: In function 'hex_char':
util/cpumap.c:679:2: error: comparison is always true due to limited range of data type [-Werror=type-limits]
if (0 <= val && val <= 9)
^
cc1: all warnings being treated as errors
Applying patch from Namhyung that makes function receive an 'unsigned
char', that is what the callers are passing to this function.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170224011251.14946-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add new sort key 'symbol_size' to allow user to sort by symbol size, or
(more usefully) display the symbol size using --fields=...,symbol_size.
Committer note:
Testing it together with the recently added -q, to remove the headers,
and using the '+' sign with -s, to add the symbol_size sort order to
the default, which is '-s/--sort comm,dso,symbol':
# perf report -q -s +symbol_size | head -10
10.39% swapper [kernel.vmlinux] [k] intel_idle 270
3.45% swapper [kernel.vmlinux] [k] update_blocked_averages 1546
2.61% swapper [kernel.vmlinux] [k] update_load_avg 1292
2.36% swapper [kernel.vmlinux] [k] update_cfs_shares 240
1.83% swapper [kernel.vmlinux] [k] __hrtimer_run_queues 606
1.74% swapper [kernel.vmlinux] [k] update_cfs_rq_load_avg. 1187
1.66% swapper [kernel.vmlinux] [k] apic_timer_interrupt 152
1.60% CPU 0/KVM [kvm] [k] kvm_set_msr_common 3046
1.60% gnome-shell libglib-2.0.so.0 [.] g_slist_find 37
1.46% gnome-termina libglib-2.0.so.0 [.] g_hash_table_lookup 370
#
Signed-off-by: Charles Baylis <charles.baylis@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1487943176-13840-1-git-send-email-charles.baylis@linaro.org
[ Use symbol__size(), remove needless %lld + (long long) casting ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is an odd refcount use case, so add some more comments to help
understand that when it hits zero it really means that the mmap()ed area
(on a perf_event_open() returned fd) has been munmap()ed.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170223162344.GD3595@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.
This allows to avoid accidental refcounter overflows that might lead to
use-after-free situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Kook <keescook@chromium.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: alsa-devel@alsa-project.org
Link: http://lkml.kernel.org/r/1487691303-31858-10-git-send-email-elena.reshetova@intel.com
[ Did missing tests/thread-map.c conversion ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The refcount_t type and corresponding API should be used instead of atomic_t
when the variable is used as a reference counter.
This allows to avoid accidental refcounter overflows that might lead to
use-after-free situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Kook <keescook@chromium.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: alsa-devel@alsa-project.org
Link: http://lkml.kernel.org/r/1487691303-31858-9-git-send-email-elena.reshetova@intel.com
[ Did missing conversion in __machine__remove_thread() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.
This allows to avoid accidental refcounter overflows that might lead to
use-after-free situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Kook <keescook@chromium.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: alsa-devel@alsa-project.org
Link: http://lkml.kernel.org/r/1487691303-31858-8-git-send-email-elena.reshetova@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.
This allows to avoid accidental refcounter overflows that might lead to
use-after-free situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Kook <keescook@chromium.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: alsa-devel@alsa-project.org
Link: http://lkml.kernel.org/r/1487691303-31858-7-git-send-email-elena.reshetova@intel.com
[ Did the missing conversion of tests/thread-mg-share.c too ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.
This allows to avoid accidental refcounter overflows that might lead to
use-after-free situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Kook <keescook@chromium.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: alsa-devel@alsa-project.org
Link: http://lkml.kernel.org/r/1487691303-31858-6-git-send-email-elena.reshetova@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The refcount_t type and corresponding API should be used instead of atomic_t
when the variable is used as a reference counter.
This allows to avoid accidental refcounter overflows that might lead to
use-after-free situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Kook <keescook@chromium.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: alsa-devel@alsa-project.org
Link: http://lkml.kernel.org/r/1487691303-31858-5-git-send-email-elena.reshetova@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.
This allows to avoid accidental refcounter overflows that might lead to
use-after-free situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Kook <keescook@chromium.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: alsa-devel@alsa-project.org
Link: http://lkml.kernel.org/r/1487691303-31858-4-git-send-email-elena.reshetova@intel.com
[ Reinstated comm_str__get() function, needed when reusing entries in the rbtree ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The refcount_t type and corresponding API should be used instead of atomic_t
when the variable is used as a reference counter.
This allows to avoid accidental refcounter overflows that might lead to
use-after-free situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Kook <keescook@chromium.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: alsa-devel@alsa-project.org
Link: http://lkml.kernel.org/r/1487691303-31858-3-git-send-email-elena.reshetova@intel.com
[ fixed mixed conversion to refcount in tests/cpumap.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.
This allows to avoid accidental refcounter overflows that might lead to
use-after-free situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Kook <keescook@chromium.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: alsa-devel@alsa-project.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1487691303-31858-2-git-send-email-elena.reshetova@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to the new file paths, remove them from introductory comments.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170218113140.8051-1-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull cgroup updates from Tejun Heo:
"Several noteworthy changes.
- Parav's rdma controller is finally merged. It is very straight
forward and can limit the abosolute numbers of common rdma
constructs used by different cgroups.
- kernel/cgroup.c got too chubby and disorganized. Created
kernel/cgroup/ subdirectory and moved all cgroup related files
under kernel/ there and reorganized the core code. This hurts for
backporting patches but was long overdue.
- cgroup v2 process listing reimplemented so that it no longer
depends on allocating a buffer large enough to cache the entire
result to sort and uniq the output. v2 has always mangled the sort
order to ensure that users don't depend on the sorted output, so
this shouldn't surprise anybody. This makes the pid listing
functions use the same iterators that are used internally, which
have to have the same iterating capabilities anyway.
- perf cgroup filtering now works automatically on cgroup v2. This
patch was posted a long time ago but somehow fell through the
cracks.
- misc fixes asnd documentation updates"
* 'for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (27 commits)
kernfs: fix locking around kernfs_ops->release() callback
cgroup: drop the matching uid requirement on migration for cgroup v2
cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy
cgroup: misc cleanups
cgroup: call subsys->*attach() only for subsystems which are actually affected by migration
cgroup: track migration context in cgroup_mgctx
cgroup: cosmetic update to cgroup_taskset_add()
rdmacg: Fixed uninitialized current resource usage
cgroup: Add missing cgroup-v2 PID controller documentation.
rdmacg: Added documentation for rdmacg
IB/core: added support to use rdma cgroup controller
rdmacg: Added rdma cgroup controller
cgroup: fix a comment typo
cgroup: fix RCU related sparse warnings
cgroup: move namespace code to kernel/cgroup/namespace.c
cgroup: rename functions for consistency
cgroup: move v1 mount functions to kernel/cgroup/cgroup-v1.c
cgroup: separate out cgroup1_kf_syscall_ops
cgroup: refactor mount path and clearly distinguish v1 and v2 paths
cgroup: move cgroup v1 specific code to kernel/cgroup/cgroup-v1.c
...
Fix typos and add the following to the scripts/spelling.txt:
an one||a one
I dropped the "an" before "one or more" in
drivers/net/ethernet/sfc/mcdi_pcol.h.
Link: http://lkml.kernel.org/r/1481573103-11329-6-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix typos and add the following to the scripts/spelling.txt:
an union||a union
Link: http://lkml.kernel.org/r/1481573103-11329-5-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking updates from David Miller:
"Highlights:
1) Support TX_RING in AF_PACKET TPACKET_V3 mode, from Sowmini
Varadhan.
2) Simplify classifier state on sk_buff in order to shrink it a bit.
From Willem de Bruijn.
3) Introduce SIPHASH and it's usage for secure sequence numbers and
syncookies. From Jason A. Donenfeld.
4) Reduce CPU usage for ICMP replies we are going to limit or
suppress, from Jesper Dangaard Brouer.
5) Introduce Shared Memory Communications socket layer, from Ursula
Braun.
6) Add RACK loss detection and allow it to actually trigger fast
recovery instead of just assisting after other algorithms have
triggered it. From Yuchung Cheng.
7) Add xmit_more and BQL support to mvneta driver, from Simon Guinot.
8) skb_cow_data avoidance in esp4 and esp6, from Steffen Klassert.
9) Export MPLS packet stats via netlink, from Robert Shearman.
10) Significantly improve inet port bind conflict handling, especially
when an application is restarted and changes it's setting of
reuseport. From Josef Bacik.
11) Implement TX batching in vhost_net, from Jason Wang.
12) Extend the dummy device so that VF (virtual function) features,
such as configuration, can be more easily tested. From Phil
Sutter.
13) Avoid two atomic ops per page on x86 in bnx2x driver, from Eric
Dumazet.
14) Add new bpf MAP, implementing a longest prefix match trie. From
Daniel Mack.
15) Packet sample offloading support in mlxsw driver, from Yotam Gigi.
16) Add new aquantia driver, from David VomLehn.
17) Add bpf tracepoints, from Daniel Borkmann.
18) Add support for port mirroring to b53 and bcm_sf2 drivers, from
Florian Fainelli.
19) Remove custom busy polling in many drivers, it is done in the core
networking since 4.5 times. From Eric Dumazet.
20) Support XDP adjust_head in virtio_net, from John Fastabend.
21) Fix several major holes in neighbour entry confirmation, from
Julian Anastasov.
22) Add XDP support to bnxt_en driver, from Michael Chan.
23) VXLAN offloads for enic driver, from Govindarajulu Varadarajan.
24) Add IPVTAP driver (IP-VLAN based tap driver) from Sainath Grandhi.
25) Support GRO in IPSEC protocols, from Steffen Klassert"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1764 commits)
Revert "ath10k: Search SMBIOS for OEM board file extension"
net: socket: fix recvmmsg not returning error from sock_error
bnxt_en: use eth_hw_addr_random()
bpf: fix unlocking of jited image when module ronx not set
arch: add ARCH_HAS_SET_MEMORY config
net: napi_watchdog() can use napi_schedule_irqoff()
tcp: Revert "tcp: tcp_probe: use spin_lock_bh()"
net/hsr: use eth_hw_addr_random()
net: mvpp2: enable building on 64-bit platforms
net: mvpp2: switch to build_skb() in the RX path
net: mvpp2: simplify MVPP2_PRS_RI_* definitions
net: mvpp2: fix indentation of MVPP2_EXT_GLOBAL_CTRL_DEFAULT
net: mvpp2: remove unused register definitions
net: mvpp2: simplify mvpp2_bm_bufs_add()
net: mvpp2: drop useless fields in mvpp2_bm_pool and related code
net: mvpp2: remove unused 'tx_skb' field of 'struct mvpp2_tx_queue'
net: mvpp2: release reference to txq_cpu[] entry after unmapping
net: mvpp2: handle too large value in mvpp2_rx_time_coal_set()
net: mvpp2: handle too large value handling in mvpp2_rx_pkts_coal_set()
net: mvpp2: remove useless arguments in mvpp2_rx_{pkts, time}_coal_set
...
It now can have negative value to suppress the message entirely. So it
needs to check it being positive.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170217081742.17417-3-namhyung@kernel.org
[ Adjust fuzz on tools/perf/util/pmu.c, add > 0 checks in many other places ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_quiet_option() is to suppress all messages. It's intended to
be called just after parsing options.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170217081742.17417-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently we allow not to specify value for numeric terms and we set
them to value 1. This was originaly meant just for single bit terms to
allow user to type:
$ perf record -e 'cpu/cpu-cycles,any'
instead of:
$ perf record -e 'cpu/cpu-cycles,any=1'
However it works also for multi bits terms like:
$ perf record -e 'cpu/event/' ls
...
$ perf evlist -v
..., config: 0x1, ...
After discussion with Peter we decided making such term usage to fail,
like:
$ perf record -e 'cpu/event/' ls
event syntax error: 'cpu/event/'
\___ no value assigned for term
...
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1487340058-10496-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need to add yet another parameter to new_term function in following
patch, so it's better to move first all the current params into template
struct parse_events_term and use it as a single argument.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1487340058-10496-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There are 2 problems wrt. cpu_topology_map on systems with sparse CPUs:
1. offline/absent CPUs will have their socket_id and core_id set to -1
which triggers:
"socket_id number is too big.You may need to upgrade the perf tool."
2. size of cpu_topology_map (perf_env.cpu[]) is allocated based on
_SC_NPROCESSORS_CONF, but can be indexed with CPU ids going above.
Users of perf_env.cpu[] are using CPU id as index. This can lead
to read beyond what was allocated:
==19991== Invalid read of size 4
==19991== at 0x490CEB: check_cpu_topology (topology.c:69)
==19991== by 0x490CEB: test_session_topology (topology.c:106)
...
For example:
_SC_NPROCESSORS_CONF == 16
available: 2 nodes (0-1)
node 0 cpus: 0 6 8 10 16 22 24 26
node 0 size: 12004 MB
node 0 free: 9470 MB
node 1 cpus: 1 7 9 11 23 25 27
node 1 size: 12093 MB
node 1 free: 9406 MB
node distances:
node 0 1
0: 10 20
1: 20 10
This patch changes HEADER_NRCPUS.nr_cpus_available from _SC_NPROCESSORS_CONF
to max_present_cpu and updates any user of cpu_topology_map to iterate
with nr_cpus_avail.
As a consequence HEADER_CPU_TOPOLOGY core_id and socket_id lists get longer,
but maintain compatibility with pre-patch state - index to cpu_topology_map is
CPU id.
perf test 36 -v
36: Session topology :
--- start ---
test child forked, pid 22211
templ file: /tmp/perf-test-gmdX5i
CPU 0, core 0, socket 0
CPU 1, core 0, socket 1
CPU 6, core 10, socket 0
CPU 7, core 10, socket 1
CPU 8, core 1, socket 0
CPU 9, core 1, socket 1
CPU 10, core 9, socket 0
CPU 11, core 9, socket 1
CPU 16, core 0, socket 0
CPU 22, core 10, socket 0
CPU 23, core 10, socket 1
CPU 24, core 1, socket 0
CPU 25, core 1, socket 1
CPU 26, core 9, socket 0
CPU 27, core 9, socket 1
test child finished with 0
---- end ----
Session topology: Ok
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/d7c05c6445fca74a8442c2c73cfffd349c52c44f.1487146877.git.jstancek@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When build_cpu_topo() encounters offline/absent CPUs, it fails to find any
sysfs entries and returns failure.
This leads to build_cpu_topology() and write_cpu_topology() failing as
well.
Because HEADER_CPU_TOPOLOGY has not been written, read leaves cpu_topology_map
NULL and we get NULL ptr deref at:
...
cmd_test
__cmd_test
test_and_print
run_test
test_session_topology
check_cpu_topology
36: Session topology :
--- start ---
test child forked, pid 14902
templ file: /tmp/perf-test-4CKocW
failed to write feature HEADER_CPU_TOPOLOGY
perf: Segmentation fault
Obtained 9 stack frames.
./perf(sighandler_dump_stack+0x41) [0x5095f1]
/lib64/libc.so.6(+0x35250) [0x7f4b7c3c9250]
./perf(test_session_topology+0x1db) [0x490ceb]
./perf() [0x475b68]
./perf(cmd_test+0x5b9) [0x4763c9]
./perf() [0x4945a3]
./perf(main+0x69f) [0x427e8f]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f4b7c3b5b35]
./perf() [0x427fb9]
test child interrupted
---- end ----
Session topology: FAILED!
This patch makes build_cpu_topology() skip offline/absent CPUs, by checking
their presence against cpu_map built from online CPUs.
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/a271b770175524f4961d4903af33798358a4a518.1487146877.git.jstancek@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Similar to cpu__max_cpu() (which returns the max possible CPU), returns
the max present CPU.
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/8ea4601b5cacc49927235b4ebac424bd6eeccb06.1487146877.git.jstancek@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The struct branch_stack->branch_stack.cycles field is a u64 :16
bitfield, and this somehow confuses clang 4.0 when checking the
arguments of a printf format, so cast the :16 to unsigned short to help
it.
Silences this:
util/session.c:935:4: error: format specifies type 'unsigned short' but the argument has type 'u64' (aka 'unsigned long') [-Werror,-Wformat]
e->flags.cycles,
^~~~~~~~~~~~~~~
1 error generated.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-eo2t4uhlbne105z72tvyzkp1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -spec=/path/to/file can be used to change what gcc puts in the cc,
ld, etc command lines, but this is not present in clang, filter it out
at the setup.py file by changing python2's internal variable where it
keeps its initial CFLAGS value.
With this all of perf can be built in at least Fedora 25, fixing this
problem:
GEN /tmp/build/perf/python/perf.so
CC /tmp/build/perf/builtin-buildid-list.o
clang-4.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument]
clang-4.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument]
error: command 'clang' failed with exit status 1
Now I need to change all the containers where I have clang to build
perf with it, so that we can check that in other distros (opensuse, debian,
ubuntu, etc) this also works.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-g9lhgr162ao8ao29vvf0hgm1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Gcc has a -spec option to override what options to pass to cc, etc, and
in some distros this is used, like in fedora, where we end up getting
this passed to gcc that makes clang, that doesn't have this option to
stop the build:
CC /tmp/build/perf/util/scripting-engines/trace-event-python.o
clang-4.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument]
So filter this out when the compiler used is clang, this way we
can build the python scripting support in tools/perf/.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2gosxoiouf24pnlknp7w7q4z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As pointed out by clang, we were not providing a prototype for a
function before using it:
util/parse-events.y:699:6: error: conflicting types for 'parse_events_error'
void parse_events_error(YYLTYPE *loc, void *data,
^
/tmp/build/perf/util/parse-events-bison.c:2224:7: note: previous implicit declaration is here
yyerror (&yylloc, _data, scanner, YY_("syntax error"));
^
/tmp/build/perf/util/parse-events-bison.c:65:25: note: expanded from macro 'yyerror'
#define yyerror parse_events_error
1 error generated.
One line fix it.
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170215130605.GC4020@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The alias->unit field is an array, so to check that it is not set we
should see if it is an empty string, i.e. alias->unit[0], instead of
checking alias->unit != NULL, as this will _always_ evaluate to 'true'.
Pointed out by clang.
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170214182435.GD4458@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In a few cases we were using 'enum map_type' and that triggered this
warning when using clang:
util/session.c:1923:16: error: comparison of constant 2 with expression of type 'enum map_type' is always true
[-Werror,-Wtautological-constant-out-of-range-compare]
for (i = 0; i < MAP__NR_TYPES; ++i) {
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i6uyo6bsopa2dghnx8qo7rri@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So set it only for other compilers, allowing us to overcome yet another
build failure due to an inexistent clang -W option:
error: unknown warning option '-Wno-override-init'; did you mean '-Wno-override-module'? [-Werror,-Wunknown-warning-option]
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-oaa1ici3j8nygp4pzl2oobh3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As this is a GNU extension and while harmless in this case, we can do
the same thing in a more clearer way by using a existing thread_map and
cpu_map constructors:
With this we avoid this while compiling with clang:
util/evsel.c:1659:17: error: field 'map' with variable sized type 'struct cpu_map' not at the end of a struct or class is a GNU extension
[-Werror,-Wgnu-variable-sized-type-not-at-end]
struct cpu_map map;
^
util/evsel.c:1667:20: error: field 'map' with variable sized type 'struct thread_map' not at the end of a struct or class is a GNU extension
[-Werror,-Wgnu-variable-sized-type-not-at-end]
struct thread_map map;
^
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-207juvrqjiar7uvas2s83v5i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Genuine problem detected with clang, the warnings are spot on:
util/probe-event.c:2079:7: error: variable 'map' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
if (addr) {
^~~~
util/probe-event.c:2094:6: note: uninitialized use occurs here
if (map && !is_kprobe) {
^~~
util/probe-event.c:2079:3: note: remove the 'if' if its condition is always true
if (addr) {
^~~~~~~~~~
util/probe-event.c:2075:8: error: variable 'map' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
if (kernel_get_symbol_address_by_name(tp->symbol,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
util/probe-event.c:2094:6: note: uninitialized use occurs here
if (map && !is_kprobe) {
^~~
util/probe-event.c:2075:4: note: remove the 'if' if its condition is always false
if (kernel_get_symbol_address_by_name(tp->symbol,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
util/probe-event.c:2064:17: note: initialize the variable 'map' to silence this warning
struct map *map;
^
= NULL
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-m3501el55i10hctfbmi2qxzr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As this is a GNU extension and while harmless in this case, we can do
the same thing in a more clearer way by using an existing thread_map
constructor.
With this we avoid this while compiling with clang:
util/parse-events.c:2024:21: error: field 'map' with variable sized type 'struct thread_map' not at the end of a struct or class is a GNU extension
[-Werror,-Wgnu-variable-sized-type-not-at-end]
struct thread_map map;
^
1 error generated.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-tqocbplnyyhpst6drgm2u4m3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As it will always evaluate to 'true', as reported by clang:
util/map.c:390:36: error: address of array 'map->dso->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
if (map && map->dso && (map->dso->name || map->dso->long_name)) {
~~~~~~~~~~^~~~ ~~
util/map.c:393:22: error: address of array 'map->dso->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
else if (map->dso->name)
~~ ~~~~~~~~~~^~~~
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-x8cu007cly40kfp8xnpi9kya@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As it is an array, so will always evaluate to 'true', as reported by
clang:
builtin-sched.c:2070:19: error: address of array 'sym->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
if (sym && sym->name) {
~~ ~~~~~^~~~
1 warning generated.
So just ditch all those useless checks.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ydpm927col06paixb775jjx5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a tool can't open counters due to the kernel.perf_event_paranoit
sysctl setting, we inform how to tweak it to allow the operation to
succeed, in addition to that, suggest setting /etc/sysctl.conf to
make the setting permanent.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4gwe99k4a6p12d4u8bbyttj2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix below compile error:
CC util/scripting-engines/trace-event-perl.o
In file included from /usr/lib/perl5/5.22.2/i686-linux/CORE/perl.h:5673:0,
from util/scripting-engines/trace-event-perl.c:31:
/usr/lib/perl5/5.22.2/i686-linux/CORE/inline.h: In function 'S__is_utf8_char_slow':
/usr/lib/perl5/5.22.2/i686-linux/CORE/inline.h:270:5: error: nested extern declaration of 'Perl___notused' [-Werror=nested-externs]
dTHX; /* The function called below requires thread context */
^
cc1: all warnings being treated as errors
After digging perl5 repository, I find out that we will meet this
compile error with perl from v5.21.1 to v5.25.4
Signed-off-by: Wang YanQing <udknight@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170212024655.GA15997@udknight
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To address new warnings emmited by gcc 7, e.g.::
CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.o
CC /tmp/build/perf/tests/parse-events.o
util/intel-pt-decoder/intel-pt-pkt-decoder.c: In function 'intel_pt_pkt_desc':
util/intel-pt-decoder/intel-pt-pkt-decoder.c:499:6: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (!(packet->count))
^
util/intel-pt-decoder/intel-pt-pkt-decoder.c:501:2: note: here
case INTEL_PT_CYC:
^~~~
CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-decoder.o
cc1: all warnings being treated as errors
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-mf0hw789pu9x855us5l32c83@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In commit daeecbc0c4 ("perf tools: Add event_update event scale type"), the
handling of PERF_EVENT_UPDATE__SCALE cast struct event_update_event->data to a
pointer to event_update_event_scale, uses some field from this casted struct
and then ends up falling through to the handling of another event type,
PERF_EVENT_UPDATE__CPUS were it casts that ev->data to yet another type, oops,
fix it by inserting the missing break.
Noticed when building perf using gcc 7 on Fedora Rawhide:
util/header.c: In function 'perf_event__process_event_update':
util/header.c:3207:16: error: this statement may fall through [-Werror=implicit-fallthrough=]
evsel->scale = ev_scale->scale;
~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
util/header.c:3208:2: note: here
case PERF_EVENT_UPDATE__CPUS:
^~~~
This wasn't noticed because probably PERF_EVENT_UPDATE__CPUS comes after
PERF_EVENT_UPDATE__SCALE, so we would just create a bogus evsel->own_cpus when
processing a PERF_EVENT_UPDATE__SCALE to then leak it and create a new cpu map
with the correct data.
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: daeecbc0c4 ("perf tools: Add event_update event scale type")
Link: http://lkml.kernel.org/n/tip-lukcf9hdj092ax2914ss95at@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The size of dirent->dt_name is NAME_MAX + 1, but the size for the 'path'
buffer is hard coded at 256, which may truncate it because we also
prepend "/proc/", so that all that into account and thank gcc 7 for this
warning:
/git/linux/tools/perf/util/thread_map.c: In function 'thread_map__new_by_uid':
/git/linux/tools/perf/util/thread_map.c:119:39: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size 250 [-Werror=format-truncation=]
snprintf(path, sizeof(path), "/proc/%s", dirent->d_name);
^~
In file included from /usr/include/stdio.h:939:0,
from /git/linux/tools/perf/util/thread_map.c:5:
/usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk' output between 7 and 262 bytes into a destination of size 256
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-csy0r8zrvz5efccgd4k12c82@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The implicit fall through case label here is intended, so let us inform
that to gcc >= 7:
util/strfilter.c: In function 'strfilter_node__sprint':
util/strfilter.c:270:6: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (len < 0)
^
util/strfilter.c:272:2: note: here
case '!':
^~~~
cc1: all warnings being treated as errors
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-z2dpywg7u8fim000hjfbpyfm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The implicit fall through case label here is intended, so let us inform
that to gcc >= 7:
CC /tmp/build/perf/util/string.o
util/string.c: In function 'perf_atoll':
util/string.c:22:7: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (*p)
^
util/string.c:24:3: note: here
case '\0':
^~~~
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-0ophb30v9apkk6o95el0rqlq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It was using uapi/linux/mmap.h which caused for at least one reporter,
that hasn't specified in what environment the problem manifests itself:
----
The original error is:
In file included from util/event.c:2:0:
...tools/include/uapi/linux/mman.h:4:27: fatal error: uapi/asm/mman.h:
No such file or directory
#include <uapi/asm/mman.h>
^
compilation terminated.
----
Test built it on these containers:
# dm
1 alpine:3.4: Ok
2 android-ndk:r12b-arm: Ok
3 archlinux:latest: Ok
4 centos:5: Ok
5 centos:6: Ok
6 centos:7: Ok
7 debian:7: Ok
8 debian:8: Ok
9 debian:experimental: Ok
10 debian:experimental-x-arm64: Ok
11 debian:experimental-x-mips: Ok
12 debian:experimental-x-mips64: Ok
13 debian:experimental-x-mipsel: Ok
14 fedora:20: Ok
15 fedora:21: Ok
16 fedora:22: Ok
17 fedora:23: Ok
18 fedora:24: Ok
19 fedora:24-x-ARC-uClibc: Ok
20 fedora:25: Ok
21 fedora:rawhide: Ok
22 mageia:5: Ok
23 opensuse:13.2: Ok
24 opensuse:42.1: Ok
25 opensuse:tumbleweed: Ok
26 ubuntu:12.04.5: Ok
27 ubuntu:14.04.4-x-linaro-arm64: Ok
28 ubuntu:15.10: Ok
29 ubuntu:16.04: Ok
30 ubuntu:16.04-x-arm: Ok
31 ubuntu:16.04-x-arm64: Ok
32 ubuntu:16.04-x-powerpc: Ok
33 ubuntu:16.04-x-powerpc64: Ok
34 ubuntu:16.04-x-powerpc64el: Ok
35 ubuntu:16.04-x-s390: Ok
36 ubuntu:16.10: Ok
Reported-by: David Carrillo-Cisneros <davidcc@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: fbef103fad ("perf tools: Do hugetlb handling in more systems")
Link: http://lkml.kernel.org/n/tip-4wm5xmjz5wgbq7ucyz4dyd72@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The cases changed in this patch are for when we free but keep the
pointer to the freed area, which is not always a good idea.
Be more defensive and zero the pointer to avoid possible use after
free bugs to take more time to be detected.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-5-git-send-email-treeze.taeung@gmail.com
[ rewrote commit log ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We have zfree(&ptr) for this very common pattern:
free(ptr);
ptr = NULL;
So use it in a few more places.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-4-git-send-email-treeze.taeung@gmail.com
[ rewrote commit log ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After commit 5baecbcd9c ("perf symbols: we can now read separate
debug-info files based on a build ID") and when --symfs option is used
perf failed to pick up symbols for file with the same name between host
and sysroot specified by --symfs option. One can see message like this:
bin/bash with build id 26f0062cb6950d4d1ab0fd9c43eae8b10ca42062 not found, continuing without symbols
It happens because code added by 5baecbcd9c opens files directly by
dso->long_name without symbol_conf.symfs consideration, which as result
picks one from the host. It reads its build ID and later even code finds
another proper file in directory pointed by --symfs perf ignores it
because build id mismatches.
Fix is to use __symbol__join_symfs to adjust file name according to
--symfs setting. If no --symfs passed the operation would noop and picks
the same host file as before.
Also note in latter tree after 5baecbcd9c commit additional check for
'!dso->has_build_id' was added, so to observe error condition 'perf
record' should run with --no-buildid, so perf.data itself would not have
build id for target binary in buildid perf section and 'perf report'
will pass '!dso->has_build_id' condition. Or target binary should not
have build id, but the same binary on host has build id, again
'!dso->has_build_id' will pass in this case and incorrect build id could
be read if --symfs is used.
Signed-off-by: Victor Kamensky <kamensky@cisco.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Phlipot <cphlipot0@gmail.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: xe-linux-external@cisco.com
Fixes: 5baecbcd9c ("perf symbols: we can now read separate debug-info files based on a build ID")
Link: http://lkml.kernel.org/r/1486424908-17094-1-git-send-email-kamensky@cisco.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For debugging and testing it is useful to see the converted alias
string. Add support to perf stat/record and perf list to print the alias
conversion. The text string is saved in the alias structure. For perf
stat/record it is folded into the normal -v. For perf list -v was taken,
so we use --debug.
Before:
% perf list
...
cache:
l1d.replacement
[L1D data line replacements]
l1d_pend_miss.fb_full
[Cycles a demand request was blocked due to Fill Buffers inavailability]
After
% perf list --debug
...
cache:
l1d.replacement
[L1D data line replacements]
cpu/umask=0x1,period=2000003,event=0x51/
l1d_pend_miss.fb_full
[Cycles a demand request was blocked due to Fill Buffers inavailability]
cpu/umask=0x2,period=2000003,cmask=1,event=0x48/
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-6-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The code for handling pmu aliases without specifying the PMU hardcoded
only supported the cpu PMU.
This patch extends it to work for all PMUs. We always duplicate the
event for all PMUs that have an matching alias. This allows to
automatically expand an alias for all instances of a PMU (so for example
you can monitor all cache boxes with a single event)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for registering json aliases per PMU. Any alias with an unit
matching the prefix is registered to the PMU. Uncore has multiple
instances of most units, so all these aliases get registered for each
individual PMU (this is important later to run the event on every
instance of the PMU).
To avoid printing the events multiple times in perf list filter out
duplicated events during printing.
v2: Rely on uncore_ prefix already in unit
v3: Document why calls were reordered
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Handle the "Unit" field, which is needed to find the right PMU for an
event. We call it "pmu" and convert it to the perf pmu name with an
uncore prefix.
Handle the "ExtSel" field, which just extends the event mask with an
additional bit.
Handle the "Filter" field which adds parameters to the main event
to configure filtering.
Handle the "Unit" field which declares the unit the values should be
scaled too (similar to what the kernel exports)
Set up the "perpkg" field for uncore events so that perf knows they are
per package (similar to what the kernel exports)
Then output the fields into the pmu-events data structures which are
compiled into perf.
Filter out zero fields, except for the event itself.
v2: Fix compilation. Add uncore_ prefix at pre-processing time.
Move eventcode change to separate patch.
v3: Remove extra __maybe_unused
v4: dont duplicate aliases for cpu pmu events
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These two debug messages are missing the trailing newline.
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bintian Wang <bintian.wang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20170207073412.26983-2-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_event is a utility controller whose primary role is identifying
cgroup membership to filter perf events; however, because it also
tracks some per-css state, it can't be replaced by pure cgroup
membership test. Mark the controller as implicitly enabled on the
default hierarchy so that perf events can always be filtered based on
cgroup v2 path as long as the controller is not mounted on a legacy
hierarchy.
"perf record" is updated accordingly so that it searches for both v1
and v2 hierarchies. A v1 hierarchy is used if perf_event is mounted
on it; otherwise, it uses the v2 hierarchy.
v2: Doc updated to reflect more flexible rebinding behavior.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
If dso__load_kcore frees all of the existing maps, but one has already
been attached to a callchain cursor node, then we can get a SIGSEGV in
any function that happens to try to use this invalid cursor. Use the
existing map refcount mechanism to forestall cleanup of a map until the
cursor iterates past the node.
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@kernel.org
Fixes: 84c2cafa28 ("perf tools: Reference count struct map")
Link: http://lkml.kernel.org/r/20170106062331.GB2707@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 21e6d84286 ("perf diff: Use perf_hpp__register_sort_field
interface") changed list_add() to perf_hpp__register_sort_field().
This resulted in a behavior change since the field was added to the tail
instead of the head. So the -o option is mostly ignored due to its
order in the list.
This patch fixes it by adding perf_hpp__prepend_sort_field().
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 21e6d84286 ("perf diff: Use perf_hpp__register_sort_field interface")
Link: http://lkml.kernel.org/r/20170118051457.30946-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Similar to for_each_subsystem and for_each_event in util/parse-events.c,
add new macro 'for_each_event' for easy iteration over the tracepoints
in order to be more compact and readable.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485862711-20216-2-git-send-email-treeze.taeung@gmail.com
[ Slight change to keep existing style for checking strcmp() return ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
rm_rf() doesn't modify its path argument, and a future caller will pass
a string constant into it to delete.
Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-5-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If dso__load_kcore frees all of the existing maps, but one has already
been attached to a callchain cursor node, then we can get a SIGSEGV in
any function that happens to try to use this invalid cursor. Use the
existing map refcount mechanism to forestall cleanup of a map until the
cursor iterates past the node.
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@kernel.org
Fixes: 84c2cafa28 ("perf tools: Reference count struct map")
Link: http://lkml.kernel.org/r/20170106062331.GB2707@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Previously these were being ignored, sometimes silently.
Stop doing that, emitting debug messages and handling the errors.
Testing it:
$ cat ~/.perfconfig
cat: /home/acme/.perfconfig: No such file or directory
$ perf stat -e cycles usleep 1
Performance counter stats for 'usleep 1':
938,996 cycles:u
0.003813731 seconds time elapsed
$ perf top --stdio
Error:
You may not have permission to collect system-wide stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid,
<SNIP>
[ perf record: Captured and wrote 0.019 MB perf.data (7 samples) ]
[acme@jouet linux]$ perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
# Overhead Command Shared Object Symbol
# ........ ....... ................. .........................
71.77% usleep libc-2.24.so [.] _dl_addr
27.07% usleep ld-2.24.so [.] _dl_next_ld_env_entry
1.13% usleep [kernel.kallsyms] [k] page_fault
$
$ touch ~/.perfconfig
$ ls -la ~/.perfconfig
-rw-rw-r--. 1 acme acme 0 Jan 27 12:14 /home/acme/.perfconfig
$
$ perf stat -e instructions usleep 1
Performance counter stats for 'usleep 1':
244,610 instructions:u
0.000805383 seconds time elapsed
$
[root@jouet ~]# chown acme.acme ~/.perfconfig
[root@jouet ~]# perf stat -e cycles usleep 1
Warning: File /root/.perfconfig not owned by current user or root, ignoring it.
Performance counter stats for 'usleep 1':
937,615 cycles
0.000836931 seconds time elapsed
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-j2rq96so6xdqlr8p8rd6a3jx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
While propagating the errors from perf_config(), which were being
completely ignored, everything stopped working for people without a
~/.perfconfig file, because the perf_config_set__init() was considering
an error not to have a .perfconfig file, duh, fix it by checking the
errno after the failed stat() call.
It should also not return an error when it says it is ignoring the file,
and also a empty file should not return an error either.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 8beeb00f2c ("perf config: Use new perf_config_set__init() to initialize config set")
Link: http://lkml.kernel.org/n/tip-ygpbab3apbs6l8wr97xedwks@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- Introduce 'perf ftrace' a perf front end to the kernel's ftrace
function and function_graph tracer, defaulting to the "function_graph"
tracer, more work will be done in reviving this effort, forward porting
it from its initial patch submission (Namhyung Kim)
- Add 'e' and 'c' hotkeys to expand/collapse call chains for a single
hist entry in the 'perf report' and 'perf top' TUI (Jiri Olsa)
Fixes:
- Fix wrong register name for arm64, used in 'perf probe' (He Kuang)
- Fix map offsets in relocation in libbpf (Joe Stringer)
- Fix looking up dwarf unwind stack info (Matija Glavinic Pecotic)
Infrastructure:
- libbpf prog functions sync with what is exported via uapi (Joe Stringer)
Trivial:
- Remove unnecessary checks and assignments in 'perf probe's
try_to_find_absolute_address() (Markus Elfring)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJYig7UAAoJENZQFvNTUqpAhJQP/iI0T7A8TNekPGLv7j20c302
89N9+9TAFtVqjgr1hIzqQgGOqbOdAW1tU3VTPW92nNDBn9JV5qwuF9YWEiDaAVv2
0bmV5hLnrNlymddm3pdg/PbD1TVlwk2NFxtrkPxuf/vx0ZhEGqsSrRUCR/xGXbtQ
TcMg3rQquspV9JNv4HzFdQC9nsG1CGNotZKsE1avRw70pWAqCtF81B0m8teb6OWo
5qnN+AMJlYcC+OGffROemUksuehkMvi5L8v1e/6RO/lU1qt9Jrc/2sT9cqvjVFNR
k4c76cUgWOCYzDEotENMpU4bc6e/24DE2ydFeovihdXw8Qs4ajEA9LXKM4yW+ZoE
MZE3GS153a8n+CvTfkB9Ow1QJ8rgmR/L0BuhmGb6bYW/MtuTRTShhSduZwOrIyap
9KckHYti4p3oN3CKFYGO9PN3DRUdx+Xqg/miwrgjkPo09QFp+lzfFFOk0P2/Zqw2
yfvdWeHxkkrwoWQIyMHVKp/E9jQPuyYqwnKdp68LCN+DgNiFpPpSA8id5e47RQDE
otqrK8U/82ktakfrBijSPBI6EEqFg7ltip2KT/xlDMfnP9HtxgFhzrk52dyi6pM/
jkBhJaTQhVZTyaFvUXuaLmBSdPpcaaGM4KJ+2iAayA2r0KLiDj6IdzD5ROCRFOvJ
SFA472mIxNxUjpQEUTtc
=tYKN
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-4.11-20170126' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull the latest perf/core updates from Arnaldo Carvalho de Melo:
New features:
- Introduce 'perf ftrace' a perf front end to the kernel's ftrace
function and function_graph tracer, defaulting to the "function_graph"
tracer, more work will be done in reviving this effort, forward porting
it from its initial patch submission (Namhyung Kim)
- Add 'e' and 'c' hotkeys to expand/collapse call chains for a single
hist entry in the 'perf report' and 'perf top' TUI (Jiri Olsa)
Fixes:
- Fix wrong register name for arm64, used in 'perf probe' (He Kuang)
- Fix map offsets in relocation in libbpf (Joe Stringer)
- Fix looking up dwarf unwind stack info (Matija Glavinic Pecotic)
Infrastructure changes:
- libbpf prog functions sync with what is exported via uapi (Joe Stringer)
Trivial changes:
- Remove unnecessary checks and assignments in 'perf probe's
try_to_find_absolute_address() (Markus Elfring)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Current trace info data lacks the saved cmdline mapping which is needed
for pevent to find out the comm of a task. Add this and bump up the
version number so that perf can determine its presence when reading.
This is mostly corresponding to trace.dat file version 6, but still
lacks 4 byte of number of cpus, and 10 bytes of type string - and I
think we don't need those anyway.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremy Eder <jeder@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>,
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
[ Change version test from == to >= ]
Link: http://lkml.kernel.org/n/tip-vaooqpxsikxbb3359p0corcb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Do just like handling other cases i.e. print some debug message and
ignore the sample.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-t7kzlm3cxyvbd7d9n9554ai9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove an error code assignment which is redundant in an if branch for
the handling of a memory allocation failure because the same value was
set for the local variable "err" before.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/0ede09ec-79b6-c8bd-5b20-02c63ed98aab@users.sourceforge.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove a condition check which is unnecessary at the end
because this source code place should usually only be reached
with a non-zero pointer.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/a3f2473b-6383-a326-bce0-b826423608b8@users.sourceforge.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for the __print_hex_str() macro that was added for
tracing, so that user space tools such as perf can understand
it as well.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using perf with call graph method dwarf fails to provide backtrace
support for stripped binary even though .gnu_debuglink points to *.dbg
flavor with properly populated debug symbols.
Problem is reproduced on ARM (v7, v8), kernels 3.14.y, 4.4.y and
4.10.rc3. Perf is configured with libunwind, and unwind dwarf support
[1]. Test code (stress_bt.c) can be found on [2].
Running (explicitly disable other unwinding methods):
$ gcc -g -o stress_bt -fomit-frame-pointer -fno-unwind-tables \
-fno-asynchronous-unwind-tables stress_bt.c
$ perf record -N --call-graph dwarf ./stress_bt
$ perf report
results in properly generated call graph. Stripping the binary and running
it results with missing call graph. Expected result is to have call graph:
$ gcc -g -o stress_bt -fomit-frame-pointer -fno-unwind-tables \
-fno-asynchronous-unwind-tables stress_bt.c
$ objcopy --only-keep-debug stress_bt stress_bt.dbg
$ objcopy --strip-debug stress_bt
$ objcopy --add-gnu-debuglink=stress_bt.dbg stress_bt
$ perf record -N --call-graph dwarf ./stress_bt
$ perf report
Problem is that perf doesn't try to read symbols pointed by gnu
debuglink. Patch adds checking, and reading of the symbols from
debuglink and symsrc. Order of the check is to first check within dso,
then check whether symsrc is defined and try to read from it. Finally,
debuglink is checked. Default locations of debug files are discussed in
[3] and [4]. Comments on RFC are on [5].
[1] https://wiki.linaro.org/LEG/Engineering/TOOLS/perf-callstack-unwinding
[2] [1]#Backtrace_stress_application
[3] https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
[4] https://sourceware.org/binutils/docs/binutils/objcopy.html
[5] https://lkml.org/lkml/2016/8/22/473
Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/d309d40a-463f-482b-68e1-1465326efdc1@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The use_browser and perf_version_string variables are both declared in
perf.c but they are also referenced by other functions of libperf.a.
Therefore a user linking an own main() with libperf.a must declare those
two variables in their files even if the files never use the browser or
the version information.
This patch fixes this issue by moving use_browser and
perf_version_string out of perf.c to some other files.
Signed-off-by: Soramichi Akiyama <akiyama@m.soramichi.jp>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170117002237.c1aec0ce3b4d675dca018deb@m.soramichi.jp
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix to probe on gcc generated functions on modules. Since
probing on a module is based on its symbol name, it should
be adjusted on actual symbols.
E.g. without this fix, perf probe shows probe definition
on non-exist symbol as below.
$ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -F in_range*
in_range.isra.12
$ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range
p:probe/in_range nf_nat:in_range+0
With this fix, perf probe correctly shows a probe on
gcc-generated symbol.
$ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range
p:probe/in_range nf_nat:in_range.isra.12+0
This also fixes same problem on online module as below.
$ perf probe -m i915 -D assert_plane
p:probe/assert_plane i915:assert_plane.constprop.134+0
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148411450673.9978.14905987549651656075.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>