Just like we do for pr_debug, so that we can have a single point
where to redirect to the currently used output system, be it
stdio or newt.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268349164-5822-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
At present, the perf subcommands that do system-wide monitoring
(perf stat, perf record and perf top) don't work properly unless
the online cpus are numbered 0, 1, ..., N-1. These tools ask
for the number of online cpus with sysconf(_SC_NPROCESSORS_ONLN)
and then try to create events for cpus 0, 1, ..., N-1.
This creates problems for systems where the online cpus are
numbered sparsely. For example, a POWER6 system in
single-threaded mode (i.e. only running 1 hardware thread per
core) will have only even-numbered cpus online.
This fixes the problem by reading the /sys/devices/system/cpu/online
file to find out which cpus are online. The code that does that is in
tools/perf/util/cpumap.[ch], and consists of a read_cpu_map()
function that sets up a cpumap[] array and returns the number of
online cpus. If /sys/devices/system/cpu/online can't be read or
can't be parsed successfully, it falls back to using sysconf to
ask how many cpus are online and sets up an identity map in cpumap[].
The perf record, perf stat and perf top code then calls
read_cpu_map() in the system-wide monitoring case (instead of
sysconf) and uses cpumap[] to get the cpu numbers to pass to
perf_event_open.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <20100310093609.GA3959@brick.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If -vv is used just the map table will be printed, -vvv will
print the symbol table too, with it we can see that we have a
bug where some samples are not being resolved to a map when we
get them in the perf.data stream, but after we have it all
processed, we can find the right map, some reordering probably
is happening.
Upcoming patches will provide ways to ask for most PERF_SAMPLE_
conditional samples to be taken for !PERF_RECORD_SAMPLE events
too, then we'll be able to ask for PERF_SAMPLE_TIME and
PERF_SAMPLE_CPU to help diagnose this.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268161097-17761-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Now that report can store historgrams for multiple events we
need to be able to do the post processing work for each
histogram. This patch changes the post processing functions so
that they can be called individually for each event's histogram.
Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
[ Guarantee bisectabilty by fixing up builtin-report.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-5-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds the structures necessary to count each event
type independently in perf report.
Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In order to minimize the impact of storing multiple events in a
report this function will now take the root of the histogram
tree so that the logic for selecting the proper tree can be
inserted before the call.
Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
cc1: warnings being treated as errors
util/probe-finder.c: In function 'find_line_range':
util/probe-finder.c:172: warning: 'src' may be used
uninitialized in this function make: *** [util/probe-finder.o]
Error 1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267800842-22324-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Minimal userspace interface to the new 'precise' events flag.
Can be used like "perf top -e r00c0p" which will use PEBS to sample
retired instructions.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100304140100.468665803@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'perf-probes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Issue at least one memory barrier in stop_machine_text_poke()
perf probe: Correct probe syntax on command line help
perf probe: Add lazy line matching support
perf probe: Show more lines after last line
perf probe: Check function address range strictly in line finder
perf probe: Use libdw callback routines
perf probe: Use elfutils-libdw for analyzing debuginfo
perf probe: Rename probe finder functions
perf probe: Fix bugs in line range finder
perf probe: Update perf probe document
perf probe: Do not show --line option without dwarf support
kprobes: Add documents of jump optimization
kprobes/x86: Support kprobes jump optimization on x86
x86: Add text_poke_smp for SMP cross modifying code
kprobes/x86: Cleanup save/restore registers
kprobes/x86: Boost probes when reentering
kprobes: Jump optimization sysctl interface
kprobes: Introduce kprobes jump optimization
kprobes: Introduce generic insn_slot framework
kprobes/x86: Cleanup RELATIVEJUMP_INSTRUCTION to RELATIVEJUMP_OPCODE
Even though we don't register the counters until the child is right about
to exec(), we're still going to get at least a few events while the
fork()'d child is still executing 'perf' and in particular we're going to
get the MMAP events.
We can't distinguish the ones in the newly executed process because the
PID will be the same.
One way to solve this would be to have a PERF_RECORD_EXEC event, and when
this is seen 'perf' can flush it's map cache. We can't use
PERF_RECORD_COMM since that's generated by other things, not just exec().
Actually, thinking about it some more, using PERF_RECORD_COMM might be a
good enough approximation.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267196914-16238-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add lazy line matching support for specifying new probes.
This also changes the syntax of perf probe a bit. Now
perf probe accepts one of below probe event definitions.
1) Define event based on function name
[EVENT=]FUNC[@SRC][:RLN|+OFF|%return|;PTN] [ARG ...]
2) Define event based on source file with line number
[EVENT=]SRC:ALN [ARG ...]
3) Define event based on source file with lazy pattern
[EVENT=]SRC;PTN [ARG ...]
- New lazy matching pattern(PTN) follows ';' (semicolon). And it
must be put the end of the definition.
- So, @SRC is no longer the part which must be put at the end
of the definition.
Note that ';' (semicolon) can be interpreted as the end of
a command by the shell. This means that you need to quote it.
(anyway you will need to quote the lazy pattern itself too,
because it may contains other sensitive characters, like
'[',']' etc.).
Lazy matching
-------------
The lazy line matching is similar to glob matching except
ignoring spaces in both of pattern and target.
e.g.
'a=*' can matches 'a=b', 'a = b', 'a == b' and so on.
This provides some sort of flexibility and robustness to
probe point definitions against minor code changes.
(for example, actual 10th line of schedule() can be changed
easily by modifying schedule(), but the same line matching
'rq=cpu_rq*' may still exist.)
Changes in v3:
- Cast Dwarf_Addr to uintmax_t for printf-formats.
Changes in v2:
- Cast Dwarf_Addr to unsigned long long for printf-formats.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
LKML-Reference: <20100225133611.6725.45078.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Show 2 more lines after the last probe-able line.
This will clearly show the last closed-brace of
inline functions.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
LKML-Reference: <20100225133604.6725.76820.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Check (inlined) function address range strictly for
improving output of probe-able lines of inline functions.
Without this change, perf probe --line <function> sometimes
showed other inline function bodies too, because it didn't
filter out inlined functions.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
LKML-Reference: <20100225133557.6725.20697.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use libdw callback functions aggressively, and remove
local tree-search API. This change simplifies the code.
Changes in v3:
- Cast Dwarf_Addr to uintmax_t for printf-formats.
Changes in v2:
- Cast Dwarf_Addr to unsigned long long for printf-formats.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
LKML-Reference: <20100225133549.6725.81499.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Newer gcc introduces newer & richer debuginfo, and only libdw
in elfutils project can support it. So perf probe moves onto
elfutils-libdw from libdwarf.
Changes in v3:
- Cast Dwarf_Addr/Dwarf_Word to uintmax_t for printf-formats.
- Recover a sign-prefix which was removed in v2 by mistake.
Changes in v2:
- Fix a type-casting bug in Makefile.
- Cast Dwarf_Addr/Dwarf_Word to unsigned long long for printf-formats.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
LKML-Reference: <20100225133542.6725.34724.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix find_line_range_by_line() to init line_list and remove
misconseptional found marking which should be done when
real lines are found (if there is no lines probe-able,
find_line_range() should return 0).
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
LKML-Reference: <20100225133527.6725.52418.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Because symbol->end is not fixed up at symbol_filter time, only
after all symbols for a DSO are loaded, and that, for asm
symbols, may be bogus, causing segfaults when hits happen in
these symbols.
Reported-by: David Miller <davem@davemloft.net>
Reported-by: Anton Blanchard <anton@samba.org>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: <stable@kernel.org> # for .33.x. Does not apply cleanly, needs backport.
LKML-Reference: <20100225155740.GB8553@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Be more clear about DSO long names and tell from which file
kernel symbols were obtained, all in --verbose mode:
[root@mica ~]# perf report -v > /dev/null
Looking at the vmlinux_path (5 entries long)
Using /lib/modules/2.6.33-rc8-tip-00777-g0918527-dirty/build/vmlinux for symbols
[root@mica ~]# mv /lib/modules/2.6.33-rc8-tip-00777-g0918527-dirty/build/vmlinux /tmp/dd
[root@mica ~]# perf report -v > /dev/null
Looking at the vmlinux_path (5 entries long)
Using /proc/kallsyms for symbols
[root@mica ~]#
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1266866139-6361-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In function dso__split_kallsyms(), curr_map saves the return value
of map__new2. So check it instead of var map after the call returns.
Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: <stable@kernel.org> # for .33.x
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1267066851.1726.9.camel@localhost>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If we know the size of a tuple in advance, there's no need to resize
it - start out with the known size in the first place.
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Keiichi KII <k-keiichi@bx.jp.nec.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1266822779.6426.4.camel@tropicana>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Add base support for Python scripting to perf trace.
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Keiichi KII <k-keiichi@bx.jp.nec.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1264580883-15324-6-git-send-email-tzanussi@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Create a scripting-engines directory to contain scripting engine
implementation code, in anticipation of the addition of new scripting
support. Also removes trace-event-perl.h.
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Keiichi KII <k-keiichi@bx.jp.nec.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1264580883-15324-5-git-send-email-tzanussi@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
This stuff is needed by all scripting engines; move it from the Perl
engine source to a more common place.
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Keiichi KII <k-keiichi@bx.jp.nec.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1264580883-15324-4-git-send-email-tzanussi@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Clear struct probe_point before using it in
show_perf_probe_events(), and set pp->found counter correctly in
synthesize_perf_probe_point(). Without this initialization,
clear_probe_point() will free random addresses.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
LKML-Reference: <20100218181652.26547.57790.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
As the parent comm then is worthless, confusing users about the
thread where the sample really happened, leading to think that
the sample happened in the parent, not where it really happened,
in the children of a thread for which a PERF_RECORD_COMM event
was not received.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1266627727-19715-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In 2161db9 we stopped failing when not finding modules when
asked too, but then the kernel maps (just one, for vmlinux)
wasn't having its ->end field correctly set up, so symbols were
not being found for the vmlinux map because its range was 0-0.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1266702793-29434-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
cpumode bits are defined as such:
#define PERF_RECORD_MISC_KERNEL (1 << 0)
#define PERF_RECORD_MISC_USER (2 << 0)
#define PERF_RECORD_MISC_HYPERVISOR (3 << 0)
We need to compare against the complete value of cpumode,
otherwise hypervisor samples get incorrectly attributed as
userspace.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
LKML-Reference: <20100209034304.GA3702@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
First, for programs and prelinked libraries, annotate code was
fooled by objdump output IPs (src->eip in the code) being
wrongly converted to absolute IPs. In such case there were no
conversion needed, but in
src->eip = strtoull(src->line, NULL, 16);
src->eip = map->unmap_ip(map, src->eip); // = eip + map->start - map->pgoff
we were reading absolute address from objdump (e.g. 8048604) and
then almost doubling it, because eip & map->start are
approximately close for small programs.
Needless to say, that later, in record_precise_ip() there was no
matching with real runtime IPs.
And second, like with `perf annotate` the problem with
non-prelinked *.so was that we were doing rip -> objdump address
conversion wrong.
Also, because unlike `perf annotate`, `perf top` code does
annotation based on absolute IPs for performance reasons(*), new
helper for mapping objdump addresse to IP is introduced.
(*) we get samples info in absolute IPs, and since we do lots of
hit-testing on absolute IPs at runtime in record_precise_ip(), it's
better to convert objdump addresses to IPs once and do no conversion
at runtime.
I also had to fix how objdump output is parsed (with hardcoded
8/16 characters format, which was inappropriate for ET_DYN dsos
with small addresses like '4ac')
Also note, that not all objdump output lines has associtated
IPs, e.g. look at source lines here:
000004ac <my_strlen>:
extern "C"
int my_strlen(const char *s)
4ac: 55 push %ebp
4ad: 89 e5 mov %esp,%ebp
4af: 83 ec 10 sub $0x10,%esp
{
int len = 0;
4b2: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%ebp)
4b9: eb 08 jmp 4c3 <my_strlen+0x17>
while (*s) {
++len;
4bb: 83 45 fc 01 addl $0x1,-0x4(%ebp)
++s;
4bf: 83 45 08 01 addl $0x1,0x8(%ebp)
So we mark them with eip=0, and ignore such lines in annotate
lookup code.
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
[ Note: one hunk of this patch was applied by Mike in 57d8188 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1265550376-12665-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
perf top and perf record refuses to initialize on non-modular kernels:
refuse to initialize:
$ perf top -v
map_groups__set_modules_path_dir: cannot open /lib/modules/2.6.33-rc6-tip-00586-g398dde3-dirty/
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1265223128-11786-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Setting _FILE_OFFSET_BITS and using O_LARGEFILE, lseek64, etc,
is redundant. Thanks H. Peter Anvin for pointing it out.
So, this patch removes O_LARGEFILE, lseek64, etc.
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4B6A8972.3070605@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The problem was we were incorrectly calculating objdump
addresses for sym->start and sym->end, look:
For simple ET_DYN type DSO (*.so) with one function, objdump -dS
output is something like this:
000004ac <my_strlen>:
int my_strlen(const char *s)
4ac: 55 push %ebp
4ad: 89 e5 mov %esp,%ebp
4af: 83 ec 10 sub $0x10,%esp
{
i.e. we have relative-to-dso-mapping IPs (=RIP) there.
For ET_EXEC type and probably for prelinked libs as well (sorry
can't test - I don't use prelink) objdump outputs absolute IPs,
e.g.
08048604 <zz_strlen>:
extern "C"
int zz_strlen(const char *s)
8048604: 55 push %ebp
8048605: 89 e5 mov %esp,%ebp
8048607: 83 ec 10 sub $0x10,%esp
{
So, if sym->start is always relative to dso mapping(*), we'll
have to unmap it for ET_EXEC like cases, and leave as is for
ET_DYN cases.
(*) and it is - we've explicitely made it relative. Look for
adjust_symbols handling in dso__load_sym()
Previously we were always unmapping sym->start and for ET_DYN
dsos resulting addresses were wrong, and so objdump output was
empty.
The end result was that perf annotate output for symbols from
non-prelinked *.so had always 0.00% percents only, which is
wrong.
To fix it, let's introduce a helper for converting rip to
objdump address, and also let's document what map_ip() and
unmap_ip() do -- I had to study sources for several hours to
understand it.
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1265223128-11786-8-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Not to pollute too much 'perf annotate' debugging sessions.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1265223128-11786-7-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We want to stream events as fast as possible to perf.data, and
also in the future we want to have splice working, when no
interception will be possible.
Using build_id__mark_dso_hit_ops to create the list of DSOs that
back MMAPs we also optimize disk usage in the build-id cache by
only caching DSOs that had hits.
Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1265223128-11786-6-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Because 'perf record' will have to find the build-ids in after
we stop recording, so as to reduce even more the impact in the
workload while we do the measurement.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1265223128-11786-5-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We can check using strcmp, most DSOs don't start with '[' so the
test is cheap enough and we had to test it there anyway since
when reading perf.data files we weren't calling the routine that
created this global variable and thus weren't setting it as
"loaded", which was causing a bogus:
Failed to open [vdso], continuing without symbols
Message as the first line of 'perf report'.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1265223128-11786-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
While debugging a problem reported by Pekka Enberg by printing
the IP and all the maps for a thread when we don't find a map
for an IP I noticed that dso__load_sym needs to fixup these
extra maps it creates to hold symbols in different ELF sections
than the main kernel one.
Now we're back showing things like:
[root@doppio linux-2.6-tip]# perf report | grep vsyscall
0.02% mutt [kernel.kallsyms].vsyscall_fn [.] vread_hpet
0.01% named [kernel.kallsyms].vsyscall_fn [.] vread_hpet
0.01% NetworkManager [kernel.kallsyms].vsyscall_fn [.] vread_hpet
0.01% gconfd-2 [kernel.kallsyms].vsyscall_0 [.] vgettimeofday
0.01% hald-addon-rfki [kernel.kallsyms].vsyscall_fn [.] vread_hpet
0.00% dbus-daemon [kernel.kallsyms].vsyscall_fn [.] vread_hpet
[root@doppio linux-2.6-tip]#
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1265223128-11786-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
I noticed while writing the first test in 'perf regtest' that to
just test the symbol handling routines one needs to create a
perf session, that is a layer centered on a perf.data file,
events, etc, so I untied these layers.
This reduces the complexity for the users as the number of
parameters to most of the symbols and session APIs now was
reduced while not adding more state to all the map instances by
only having data that is needed to split the kernel (kallsyms
and ELF symtab sections) maps and do vmlinux relocation on the
main kernel map.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1265223128-11786-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Open perf data file with O_LARGEFILE flag since its size is
easily larger that 2G.
For example:
# rm -rf perf.data
# ./perf kmem record sleep 300
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 3142.147 MB perf.data
(~137282513 samples) ]
# ll -h perf.data
-rw------- 1 root root 3.1G .....
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4B68F32A.9040203@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
linux/hash.h, hash header of kernel, is also useful for perf.
util/include/linuxhash.h includes linux/hash.h, so we can use
hash facilities (e.g. hash_long()) in perf now.
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1264851813-8413-3-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch is required to test the next patch for perf lock.
At 064739bc4b ,
support for the modifier "__data_loc" of format is added.
But, when I wanted to parse format of lock_acquired (or some
event else), raw_field_ptr() did not returned correct pointer.
So I modified raw_field_ptr() like this patch. Then
raw_field_ptr() works well.
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <1264851813-8413-2-git-send-email-mitake@dcl.info.waseda.ac.jp>
[ v3: fixed minor stylistic detail ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Removing one extra step needed in the tools that need this,
fixing a bug in 'perf probe' where this was not being done.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1264633557-17597-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
To make it clear and allow for direct usage by, for instance,
regression test suites.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1264633557-17597-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
So that we can call it directly from regression tests, and also
to reduce the size of dso__load_kernel_sym(), making it more
clear.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1264633557-17597-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>