Commit Graph

1798 Commits

Author SHA1 Message Date
Masami Hiramatsu 221d061182 perf probe: Fix to search local variables in appropriate scope
Fix perf probe to search local variables in appropriate local inlined
function scope. For example, pre_schedule() has only 2 local variables,
as below;

$ perf probe -L pre_schedule
<pre_schedule@/home/mhiramat/ksrc/linux-2.6/kernel/sched.c:0>
      0  static inline void pre_schedule(struct rq *rq, struct task_struct *prev)
         {
      2         if (prev->sched_class->pre_schedule)
      3                 prev->sched_class->pre_schedule(rq, prev);
         }

However, current perf probe shows 4 local variables on pre_schedule(),
because it searches variables in the caller(schedule()) scope.

$ perf probe -V pre_schedule
Available variables at pre_schedule
        @<schedule+445>
                int     cpu
                long unsigned int*      switch_count
                struct rq*      rq
                struct task_struct*     prev

This patch fixes this issue by searching variables in the local scope of
the instance of inlined function. Here is the result.

$ perf probe -V pre_schedule
Available variables at pre_schedule
        @<schedule+445>
                struct rq*      rq
                struct task_struct*     prev

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110811110259.19900.85664.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 09:28:45 -03:00
Masami Hiramatsu 13e27d7686 perf probe: Warn when more than one line are given
Check multiple --lines option and print warning informing that only the
first specified --line option is valid.

Changes from the 1st post:

- Accept only the first option instead of the last.
- Fix warning message according to David's comment.
- Mark as a bugfix.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110811110253.19900.96192.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 09:27:11 -03:00
Masami Hiramatsu 36c0c588b9 perf probe: Fix to walk all inline instances
Fix line-range collector to walk all instances of inlined function,
because some execution paths can be optimized out depending on the
function argument of instances.

E.g.)
inline_func (arg) {
	if (arg)
		do_something;
	else
		do_another;
}

func_A() {
	inline_func(1)
}

func_B() {
	inline_func(0)
}

In this case, func_A may have only do_something code and func_B may have
only do_another.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110811110247.19900.93702.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 09:25:38 -03:00
Masami Hiramatsu b0e9cb2802 perf probe: Fix to search nested inlined functions in CU
Fix perf probe to walk through the lines of all nested inlined function
call sites and declared lines when a whole CU is passed to the line
walker.

The die_walk_lines() can have two different type of DIEs, subprogram (or
inlined-subroutine) DIE and CU DIE.

If a caller passes a subprogram DIE, this means that the walker walk on
lines of given subprogram. In this case, it just needs to search on
direct children of DIE tree for finding call-site information of inlined
function which directly called from given subprogram.

On the other hand, if a caller passes a CU DIE to the walker, this means
that the walker have to walk on all lines in the source files included
in given CU DIE. In this case, it has to search whole DIE trees of all
subprograms to find the call-site information of all nested inlined
functions.

Without this patch:

$ perf probe --line kernel/cpu.c:151-157
</home/mhiramat/ksrc/linux-2.6/kernel/cpu.c:151>

         static int cpu_notify(unsigned long val, void *v)
         {
    154         return __cpu_notify(val, v, -1, NULL);
         }

With this:
$ perf probe --line kernel/cpu.c:151-157
</home/mhiramat/ksrc/linux-2.6/kernel/cpu.c:151>

    152  static int cpu_notify(unsigned long val, void *v)
         {
    154         return __cpu_notify(val, v, -1, NULL);
         }

As you can see, --line option with source line range shows the declared
lines as probe-able.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110811110241.19900.34994.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 09:23:39 -03:00
Masami Hiramatsu a128405c6b perf probe: Fix line walker to check CU correctly
Fix line walker to check whether a given DIE is CU or not.

Actually this function accepts CU, subprogram and inlined_subroutine
DIEs.

Without this fix, perf probe always fails to analyze lines on inlined
functions;

$ perf probe -L pre_schedule
Debuginfo analysis failed. (-2)
  Error: Failed to show lines. (-2)

This fixes that bug, as below.

$ perf probe -L pre_schedule
<pre_schedule@/home/mhiramat/ksrc/linux-2.6/kernel/sched.c:0>
      0  static inline void pre_schedule(struct rq *rq, struct task_struct *prev
         {
      2         if (prev->sched_class->pre_schedule)
      3                 prev->sched_class->pre_schedule(rq, prev);
         }

         /* rq->lock is NOT held, but preemption is disabled */

Changes from v1:
 - Update against current tip tree.(Fix dwarf-aux.c)

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110811110235.19900.20614.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 09:22:46 -03:00
Masami Hiramatsu 8afa2a707d perf probe: Fix a memory leak for scopes array
Fix a memory leak for scopes array when it finds a variable in the
global scope.

Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110811110229.19900.63019.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 09:21:15 -03:00
Vasiliy Kulikov e9b52ef222 perf: fix temporary file ownership check
A file in /tmp/ might be a symlink, so lstat() should be used instead of
stat().

Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20110811205537.GA22864@albatros
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 08:28:17 -03:00
Jiri Olsa f57b05ed53 perf report: Use properly build_id kernel binaries
If we bring the recorded perf data together with kernel binary from another
machine using:

	on server A:
	perf archive

	on server B:
	tar xjvf perf.data.tar.bz2 -C ~/.debug

the build_id kernel dso is not properly recognized during the "perf report"
command on server B.

The reason is, that build_id dsos are added during the session initialization,
while the kernel maps are created during the sample event processing.

The machine__create_kernel_maps functions ends up creating new dso object for
kernel, but it does not check if we already have one added by build_id
processing.

Also the build_id reading ABI quirk added in commit:

 - commit b25114817a
   perf build-id: Add quirk to deal with perf.data file format breakage

populates the "struct build_id_event::pid" with 0, which
is later interpreted as DEFAULT_GUEST_KERNEL_ID.

This is not always correct, so it's better to guess the pid
value based on the "struct build_id_event::header::misc" value.

- Tested with data generated on x86 kernel version v2.6.34
  and reported back on x86_64 current kernel.
- Not tested for guest kernel case.

Note the problem stays for PERF_RECORD_MMAP events recorded by perf that
does not use proper pid (HOST_KERNEL_ID/DEFAULT_GUEST_KERNEL_ID). They are
misinterpreted within the current perf code. Probably there's not much we
can do about that.

Cc: Avi Kivity <avi@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20110601194346.GB1934@jolsa.brq.redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-11 08:58:03 -03:00
Arnaldo Carvalho de Melo fc8ed7be73 perf top browser: Remove spurious helpline update
It will be immediately replaced in perf_top_browser__run.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-q7e2jzb44elqpkvdllk94x0i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-10 12:42:26 -03:00
Ingo Molnar 7676ebbaf2 Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent 2011-08-10 10:20:52 +02:00
Pekka Enberg 981c125269 perf symbols: Check '/tmp/perf-' symbol file ownership
The external symbol files are generated by JIT compilers, for example, but we
need to make sure they're ours before injecting them to 'perf report'.

Requested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1312919658-17158-1-git-send-email-penberg@kernel.org
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-09 15:23:08 -03:00
Jiri Olsa 580cabed88 perf sched: Usage leftover from trace -> script rename
The 'perf sched' command usage still showing 'trace' command instead of
the 'script' command.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110809124651.GD2056@jolsa.brq.redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-09 13:32:12 -03:00
Jiri Olsa 4c09bafae3 perf sched: Do not delete session object prematurely
The session object is released prematurely when processing events for
latency command. The session's thread objects are used within the
output_lat_thread function.

Runnning following commands:

 # perf sched record
 # perf sched latency

the latter displays incorrect data and might cause access violation.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1312837414-3819-1-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-09 13:31:38 -03:00
Arnaldo Carvalho de Melo 069e3725dd perf tools: Check $HOME/.perfconfig ownership
Just like we do already for perf.data files.

Requested-by: Ingo Molnar <mingo@elte.hu>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Christian Ohm <chr.ohm@gmx.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jonathan Nieder <jrnieder@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qgokmxsmvppwpc5404qhyk7e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-09 12:42:13 -03:00
Ingo Molnar e710574de1 Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent 2011-08-09 16:44:27 +02:00
Jiri Olsa 9941c96ad8 perf tools: Add support to install perf python extension
Adding install-python_ext target to install python extension related
files.  Installation directory is governed by python distutils package
and follows the DESTDIR variable settings.

Also moving python extension build output into '$(O)python_ext_build'
directory and making it configurable via PYTHON_EXTBUILD variable.

Keeping the '$(O)python/perf.so' file, so it could be used for testing
as of until now.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110722113307.GA1931@jolsa.brq.redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-08 12:54:26 -03:00
Jonathan Nieder aba8d05607 perf tools: do not look at ./config for configuration
In addition to /etc/perfconfig and $HOME/.perfconfig, perf looks for
configuration in the file ./config, imitating git which looks at
$GIT_DIR/config.  If ./config is not a perf configuration file, it
fails, or worse, treats it as a configuration file and changes behavior
in some unexpected way.

"config" is not an unusual name for a file to be lying around and perf
does not have a private directory dedicated for its own use, so let's
just stop looking for configuration in the cwd.  Callers needing
context-sensitive configuration can use the PERF_CONFIG environment
variable.

Requested-by: Christian Ohm <chr.ohm@gmx.net>
Cc: 632923@bugs.debian.org
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Christian Ohm <chr.ohm@gmx.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110805165838.GA7237@elie.gateway.2wire.net
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-08 09:46:32 -03:00
Kusanagi Kouichi 8b7e0b34b8 perf tools: Make clean leaves some files
Use LIB_OBJS and BUILTIN_OBJS for .o files.

LIB_FILE is already prefixed with OUTPUT.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110807083932.9C0E514C03B@msa103.auone-net.jp
Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-08 09:43:22 -03:00
Zhu Yanhai cf8dc9ff29 perf lock: Dropping unsupported ':r' modifier
Looks to me like the :r modifier is not supported anymore, so remove it
from the list of events. Without this fix 'perf lock record' doesn't
work.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Zhu Yanhai <gaoyang.zyh@taobao.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1312035232-9534-1-git-send-email-gaoyang.zyh@taobao.com
Signed-off-by: Zhu Yanhai <gaoyang.zyh@taobao.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-08 09:41:35 -03:00
Jovi Zhang ce27a443d1 perf probe: Fix coredump introduced by probe module option
perf will coredump if the user doesn't give the "-m" option in probe
command, this patch fixes it.

[root@localhost perf]# ./perf probe --add='PROBE'
Segmentation fault (core dumped)

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1311602888-2389-1-git-send-email-bookjovi@gmail.com
Signed-off-by: Jovi Zhang <bookjovi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-08 09:35:41 -03:00
Ingo Molnar 6d158f3ec5 Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent 2011-08-05 10:35:55 +02:00
Arnaldo Carvalho de Melo 00894ce9b8 perf report: Use ui__warning in some more places
So that we get a proper warning in the TUI in cases like:

 $ perf report --stdio -g fractal,0.5,caller --sort pid
 Selected -g but no callchain data. Did you call 'perf record' without -g?
 $

The --stdio case is ok because it uses fprintf, ui__warning is needed to
figure out if --stdio or --tui is being used.

Cc: Arun Sharma <asharma@fb.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sam Liao <phyomh@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ag9fz2wd17mbbfjsbznq1wms@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-03 12:33:24 -03:00
Arnaldo Carvalho de Melo 3e9f45a7a4 perf python: Add PERF_RECORD_{LOST,READ,SAMPLE} routine tables
So those friggin "spurious" PERF_RECORD_MMAP events were actually a
brain fart copy'n'paste error in the python binding, doh. I.e. they
weren't MMAPs, just SAMPLEs.

Fix it by providing routines for these events instead of using the MMAP
ones.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-b0rc8y5jd03f9f11kftodvkm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-07-25 17:13:27 -03:00
Arnaldo Carvalho de Melo 4152ab377b perf evlist: Introduce 'disable' method
To remove the last case of access to the FD() macro outside the library.

Inspired by a patch by Borislav that moved the FD() macro to util.h, for
namespace concerns I rather preferred to constrain it to ev{sel,list}.c.

Cc: Borislav Petkov <bp@amd64.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qn893qsstcg366tkucu649qj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-07-25 11:06:19 -03:00
Linus Torvalds c0c463d34a Merge branches 'x86-urgent-for-linus', 'core-debug-for-linus', 'irq-core-for-linus' and 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  um: Make rwsem.S depend on CONFIG_RWSEM_XCHGADD_ALGORITHM

* 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  debug: Make CONFIG_EXPERT select CONFIG_DEBUG_KERNEL to unhide debug options

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  genirq: Remove unused CHECK_IRQ_PER_CPU()

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf tools, x86: Fix 32-bit compile on 64-bit system
2011-07-23 10:33:08 -07:00
Han Pingtian 4f9bae351d perf buildid-cache: Zero out buffer of filenames when adding/removing buildid
The readlink() function doesn't append a null byte to buf. So we should
zero out buf with zalloc(). Or we'll see sometimes error like this:

[root@intel-s3e36-01]~# /usr/bin/perf buildid-cache -a /lib/modules/2.6.32-130.el6.x86_64/kernel/crypto/twofish_common.ko -v
Adding f64ba8efd5f53c7ad332fc17db1d21de309038e1 /lib/modules/2.6.32-130.el6.x86_64/kernel/crypto/twofish_common.ko: Ok
[root@intel-s3e36-01]~# /usr/bin/perf buildid-cache -r /lib/modules/2.6.32-130.el6.x86_64/kernel/crypto/twofish_common.ko -v
Removing f64ba8efd5f53c7ad332fc17db1d21de309038e1 /lib/modules/2.6.32-130.el6.x86_64/kernel/crypto/twofish_common.ko: FAIL
/lib/modules/2.6.32-130.el6.x86_64/kernel/crypto/twofish_common.ko wasn't in the cache

The change in build_id_cache__add_s() is a defense.

Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20110718031314.GA5802@hpt.nay.redhat.com
Signed-off-by: Han Pingtian <phan@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-07-22 08:59:26 -03:00
David Ahern 08a4a43fc4 perf tools, x86: Fix 32-bit compile on 64-bit system
Builds for 32-bit perf binaries on a 64-bit host currently fail
with this error:

 [...]
 bench/../../../arch/x86/lib/memcpy_64.S: Assembler messages:
 bench/../../../arch/x86/lib/memcpy_64.S:29: Error: bad register name `%rdi'
 bench/../../../arch/x86/lib/memcpy_64.S:34: Error: invalid instruction suffix for `movs'
 bench/../../../arch/x86/lib/memcpy_64.S:50: Error: bad register name `%rdi'
 bench/../../../arch/x86/lib/memcpy_64.S:61: Error: bad register name `%rdi'
 ...

The problem is the detection of the host arch without considering passed in
flags. This change fixes 32-bit builds via:

make EXTRA_CFLAGS=-m32

and 64-bit builds still reference the memcpy_64.S.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1310420304-21452-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 13:42:30 +02:00
Jiri Olsa baf040a0d1 perf tools: Make test use the preset debugfs path
Use preset debugfs path instead of hardcoded one.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: acme@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1310635534-4013-4-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 10:41:14 +02:00
Jiri Olsa 13b62567e9 perf tools: Add automated tests for events parsing
Adding builtin test for parse_events function, which is
responsible for parsing/processing "-e" option for
stat/top/record commands.

This new test will run within the builtin test command suite
(perf test).

One or several tests were added for each type of event.
More tests could be added easily if needed.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: acme@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1310635534-4013-3-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 10:41:13 +02:00
Jiri Olsa f120f9d51b perf tools: De-opt the parse_events function
Moving out the option parameter from parse_events function,
and adding new parse_events_option function instead.

The option parameter is used only to carry "struct perf_evlist"
pointer for chaining new events. Putting it away, enable us
to call parse_events from other places without using the
option parameter.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: acme@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1310635534-4013-2-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 10:41:11 +02:00
David Ahern adc4bf9955 perf script: Fix display of IP address for non-callchain path
Non-callchain path is using al.addr which prints as:
  openssl 14564 17672.003587:       7862d _x86_64_AES_encrypt_compact

This should be sample->ip to print as:
  openssl 14564 17672.003587:  3f7867862d _x86_64_AES_encrypt_compact

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: acme@ghostprotocols.net
Cc: peterz@infradead.org
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1306768587-15376-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 10:09:28 +02:00
David Ahern eda3913bb7 perf tools: Fix endian conversion reading event attr from file header
The perf_event_attr struct has two __u32's at the top and
they need to be swapped individually.

With this change I was able to analyze a perf.data collected in a
32-bit PPC VM on an x86 system. I tested both 32-bit and 64-bit
binaries for the Intel analysis side; both read the PPC perf.data
file correctly.

-v2:
 - changed the existing perf_event__attr_swap() to swap only elements
   of perf_event_attr and exported it for use in swapping the
   attributes in the file header
 - updated swap_ops used for processing events

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: acme@ghostprotocols.net
Cc: peterz@infradead.org
Cc: paulus@samba.org
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1310754849-12474-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 09:57:36 +02:00
Jiri Olsa 0111919da2 perf tools: Add missing 'node' alias to the hw_cache[] array
Add "node" as a simple alias for NODE cache events.

The addition of NODE cache events broke the parse_alias
function, so any mismatched event caused the segfault, like:

  # ./perf stat -e krava ls

The hw_cache/hw_cache_op/hw_cache_result arrays needs to follow
PERF_COUNT_HW_CACHE_*MAX enums. Adding those MAXs to be size
of those arrays, so possible ommision in future wil not lead to
segfault.

Adding read/write/prefetch as allowed operations for node cache
event.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: acme@redhat.com
Link: http://lkml.kernel.org/r/20110713205818.GB7827@jolsa.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 09:54:51 +02:00
Masami Hiramatsu 14a8fd7cee perf probe: Support adding probes on offline kernel modules
Support adding probes on offline kernel modules. This enables
perf-probe to trace kernel-module init functions via perf-probe.
If user gives the path of module with -m option, perf-probe
expects the module is offline.
This feature works with --add, --funcs, and --vars.

E.g)
 # perf probe -m /lib/modules/`uname -r`/kernel/fs/btrfs/btrfs.ko \
   -a "extent_io_init:5 extent_state_cache"
 Add new events:
   probe:extent_io_init (on extent_io_init:5 with extent_state_cache)
   probe:extent_io_init_1 (on extent_io_init:5 with extent_state_cache)

 You can now use it on all perf tools, such as:

         perf record -e probe:extent_io_init_1 -aR sleep 1

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072751.6528.10230.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:25:12 -04:00
Masami Hiramatsu 190b57fcb9 perf probe: Add probed module in front of function
Add probed module name and ":" in front of function name
if -m module option is given. In the result, the symbol
name passed to kprobe-tracer becomes MODULE:FUNCTION,
so that kallsyms can solve it as a symbol in the module
correctly.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072745.6528.26416.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:19:08 -04:00
Masami Hiramatsu ff74178350 perf probe: Introduce debuginfo to encapsulate dwarf information
Introduce debuginfo to encapsulate dwarf information.
This new object allows us to reuse and expand debuginfo easily.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072739.6528.12438.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:14:19 -04:00
Masami Hiramatsu e0d153c690 perf-probe: Move dwarf library routines to dwarf-aux.{c, h}
Move dwarf library related routines to dwarf-aux.{c,h}.
This includes several minor changes.
- Add simple documents for each API.
- Rename die_find_real_subprogram() to die_find_realfunc()
- Rename line_walk_handler_t to line_walk_callback_t.
- Minor cleanups.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072727.6528.57647.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:10:17 -04:00
Masami Hiramatsu bcfc082150 perf probe: Remove redundant dwarf functions
Since there are dwarf_bitsize, dwarf_bitoffset and dwarf_bytesize
defined in libdw, we don't need die_get_bit_size, die_get_bit_offset
and die_get_byte_size anymore.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072721.6528.2747.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:04:47 -04:00
Masami Hiramatsu bad03ae476 perf probe: Move strtailcmp to string.c
Since strtailcmp() is enough generic, it should be defined in string.c.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072715.6528.10677.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:00:47 -04:00
Masami Hiramatsu baad2d3e69 perf probe: Rename DIE_FIND_CB_FOUND to DIE_FIND_CB_END
Since die_find/walk* callbacks use DIE_FIND_CB_FOUND for
both of failed and found cases, it should be "END"
instead "FOUND" for avoiding confusion.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reported-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/r/20110627072709.6528.45706.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 15:55:57 -04:00
Sonny Rao 259032bfe3 perf: Robustify proc and debugfs file recording
While attempting to create a timechart of boot up I found perf didn't
tolerate modules being loaded/unloaded.  This patch fixes this by
reading the file once and then writing the size read at the correct
point in the file.  It also simplifies the code somewhat.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Link: http://lkml.kernel.org/r/10011.1310614483@neuling.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-14 15:53:01 -04:00
Anton Blanchard 5d67be97f8 perf report/annotate/script: Add option to specify a CPU range
Add an option to perf report/annotate/script to specify which
CPUs to operate on. This enables us to take a single system wide
profile and analyse each CPU (or group of CPUs) in isolation.

This was useful when profiling a multiprocess workload where the
bottleneck was on one CPU but this was hidden in the overall
profile. Per process and per thread breakdowns didn't help
because multiple processes were running on each CPU and no
single process consumed an entire CPU.

The patch converts the list of CPUs returned by cpu_map__new
into a bitmap for fast lookup. I wanted to use -C to be
consistent with perf top/record/stat, but unfortunately perf
report already uses -C <comms>.

 v2: Incorporate suggestions from David Ahern:
	- Added -c to perf script
	- Check that SAMPLE_CPU is set when -c is used
	- Update documentation

 v3: Create perf_session__cpu_bitmap()

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-05 10:44:44 +02:00
Zhengyu He 3ae9a34d74 perf stat: Add noise output for csv mode
Previously, when you want perf-stat to output the statistics in
csv mode, no information of the noise will be printed out.

For example right now we output this --repeat information:

 ./perf stat -r3 -x, sleep 1
 1.164789,task-clock
 8,context-switches
 0,CPU-migrations
 219,page-faults
 3337800,cycles

With this patch, the output will be appended with an additional
entry for the noise value:

 ./perf stat -r3 -x, sleep 1
 1.164789,task-clock,3.75%
 8,context-switches,75.00%
 0,CPU-migrations,100.00%
 219,page-faults,0.00%
 3337800,cycles,3.36%

Signed-off-by: Zhengyu He <zhengyuh@google.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1308861942-4945-1-git-send-email-zhengyuh@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 12:52:40 +02:00
Ingo Molnar 343a031f3c Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core 2011-07-01 11:51:58 +02:00
Ingo Molnar 10e6962765 Merge commit 'v3.0-rc5' into perf/core
Merge reason: Pick up the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 10:28:46 +02:00
Frederic Weisbecker cb1955b86c perf tools: Only display parent field if explictly sorted
We don't need to display the parent field if the parent
sorting machinery is only used for parent filtering
(as in "-p foo").

However if parent filtering is used in combination with
explicit parent sorting ( -s parent), we want to
display it.

Result with:

  perf report -p kernel_thread -s parent

Before:

 # Overhead  Parent symbol
 # ........  .............
 #
     0.07%
            |
            --- ioread8
                ata_sff_check_status
                ata_sff_tf_load
                ata_sff_qc_issue
                ata_bmdma_qc_issue
                ata_qc_issue
                ata_scsi_translate
                ata_scsi_queuecmd
                scsi_dispatch_cmd
                scsi_request_fn
                __blk_run_queue
                __make_request
                generic_make_request
                submit_bio
                submit_bh
                journal_submit_commit_record
                jbd2_journal_commit_transaction
                kjournald2
                kthread
                kernel_thread_helpe

After:

 # Overhead  Parent symbol
 # ........  .............
 #
     0.07%  kernel_thread_helper
            |
            --- ioread8
                ata_sff_check_status
                ata_sff_tf_load
                ata_sff_qc_issue
                ata_bmdma_qc_issue
                ata_qc_issue
                ata_scsi_translate
                ata_scsi_queuecmd
                scsi_dispatch_cmd
                scsi_request_fn
                __blk_run_queue
                __make_request
                generic_make_request
                submit_bio
                submit_bh
                journal_submit_commit_record
                jbd2_journal_commit_transaction
                kjournald2
                kthread
                kernel_thread_helper

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:49 +02:00
Frederic Weisbecker fd8ea21276 perf tools: Allow sort dimensions to be registered more than once
So that the parent sort dimension can be registered twice: once
if we add it as an explicit sort dimension (-s parent) and twice
if we request a parent filter (-p foo).

We'll have only one parent sort dimension in the end but this
allows to override the default parent filter with we gave in "-p"
option. The goal of this is to prepare to allow the use of
"-s parent" and "-p foo" at the same time, ie: sort by filtered
parent.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:41 +02:00
Frederic Weisbecker e84d21227c perf tools: Don't display ignored entries on stdio ui
As for newt ui, don't display entries that have been marked
as ignored.

The practical current effect of this is to make parent
filtering really working. Before, entries that were ignored
were given a null parent but were still displayed. This
resulted in some weird effects:

 # Overhead      Command      Shared Object        Symbol
 # ........  ...........  .................  ............
 #
^A
                   |
                   --- __lock_acquire
                      |
                      |--95.97%-- lock_acquire
                      |          |
                      |          |--30.75%-- _raw_spin_lock

Discard these from the stdio display.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:33 +02:00
Frederic Weisbecker 2fd701bc78 perf tools: Remove sort print helpers declarations
These are probably some old leftovers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:19 +02:00
Frederic Weisbecker 872a878fb1 perf tools: Make sort operations static
These don't need to be globally visible.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:25:12 +02:00
Sam Liao d797fdc5c5 perf tools: Add inverted call graph report support.
Add "caller/callee" option to support inverted butterfly report,
in the inverted report (with caller option), the call graph start
from the callee's ancestor. Users can use such view to catch system's
performance bottleneck from a sysprof like view. Using this option
with specified sort order like pid gives us high level view of call
graph statistics.

Also add "-G" alias for inverted call graph.

Signed-off-by: Sam Liao <phyomh@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2011-06-30 00:24:30 +02:00
Linus Torvalds 8816ead9d8 Merge branches 'perf-urgent-for-linus', 'sched-urgent-for-linus', 'timers-urgent-for-linus' and 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  tools/perf: Fix static build of perf tool
  tracing: Fix regression in printk_formats file

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  generic-ipi: Fix kexec boot crash by initializing call_single_queue before enabling interrupts

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  clocksource: Make watchdog robust vs. interruption
  timerfd: Fix wakeup of processes when timer is cancelled on clock change

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, MAINTAINERS: Add x86 MCE people
  x86, efi: Do not reserve boot services regions within reserved areas
2011-06-19 09:00:18 -07:00
Linus Torvalds 357ed6b1a1 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rcu: Move RCU_BOOST #ifdefs to header file
  rcu: use softirq instead of kthreads except when RCU_BOOST=y
  rcu: Use softirq to address performance regression
  rcu: Simplify curing of load woes
2011-06-19 08:56:56 -07:00
Linus Torvalds 7cc2ed0589 Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
  kbuild: Call depmod.sh via shell
  perf: clear out make flags when calling kernel make kernelver
2011-06-16 10:26:58 -07:00
Ingo Molnar b4f9f2b64a Merge commit 'v3.0-rc3' into perf/core
Merge reason: add the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-16 13:23:22 +02:00
Mathias Krause 203db2952b tools/perf: Fix static build of perf tool
To build a statically linked version of the perf tool all needed
libraries must be added in the correct order to get the symbols
resolved. Currently this is broken when, e.g. python or newt
support is enabled -- libpython needs libpthread which is an
unconditional link dependency of the perf tool; libslang needs
libm, another unconditional dependency. To solve the problem in
the long run without the need to keep track of transitive
library dependencies, simply make the linker look at the EXTLIBS
multiple times until it has all symbols resolved.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/1308171818-20370-1-git-send-email-minipli@googlemail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-16 10:17:39 +02:00
Andy Whitcroft 37aa9a2eb4 perf: clear out make flags when calling kernel make kernelver
When generating the perf version from the kernel version using 'make
kernelver' it is necessary to clear out any MAKEFLAGS otherwise they may
trigger additional output which pollute the contents.

Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-06-15 22:12:55 +02:00
Shaohua Li 09223371de rcu: Use softirq to address performance regression
Commit a26ac2455ffcf3(rcu: move TREE_RCU from softirq to kthread)
introduced performance regression. In an AIM7 test, this commit degraded
performance by about 40%.

The commit runs rcu callbacks in a kthread instead of softirq. We observed
high rate of context switch which is caused by this. Out test system has
64 CPUs and HZ is 1000, so we saw more than 64k context switch per second
which is caused by RCU's per-CPU kthread.  A trace showed that most of
the time the RCU per-CPU kthread doesn't actually handle any callbacks,
but instead just does a very small amount of work handling grace periods.
This means that RCU's per-CPU kthreads are making the scheduler do quite
a bit of work in order to allow a very small amount of RCU-related
processing to be done.

Alex Shi's analysis determined that this slowdown is due to lock
contention within the scheduler.  Unfortunately, as Peter Zijlstra points
out, the scheduler's real-time semantics require global action, which
means that this contention is inherent in real-time scheduling.  (Yes,
perhaps someone will come up with a workaround -- otherwise, -rt is not
going to do well on large SMP systems -- but this patch will work around
this issue in the meantime.  And "the meantime" might well be forever.)

This patch therefore re-introduces softirq processing to RCU, but only
for core RCU work.  RCU callbacks are still executed in kthread context,
so that only a small amount of RCU work runs in softirq context in the
common case.  This should minimize ksoftirqd execution, allowing us to
skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Tested-by: "Alex,Shi" <alex.shi@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-06-14 15:25:39 -07:00
Linus Torvalds 6aecceccf5 Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
  perf: Use make kernelversion instead of parsing the Makefile
  kbuild: Hack for depmod not handling X.Y versions
  kbuild: Move depmod call to a separate script
  kbuild: Fix <linux/version.h> for empty SUBLEVEL or PATCHLEVEL
  kbuild: Fix KERNELVERSION for empty SUBLEVEL or PATCHLEVEL
  kbuild: silence Nothing to be done for 'all' message
2011-06-09 16:27:42 -07:00
Michal Marek 5d61b9fd19 perf: Use make kernelversion instead of parsing the Makefile
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-06-09 23:05:54 +02:00
Frederic Weisbecker b273fa9716 perf python: Fix argument name list of read_on_cpu()
Mandatory arguments need to be present in the argument name list, as
well as optional arguments, otherwise python barfs:

	# ./python/twatch.py
	Traceback (most recent call last):
	  File "./python/twatch.py", line 41, in <module>
	    main()
	  File "./python/twatch.py", line 32, in main
	    event = evlist.read_on_cpu(cpu)
	RuntimeError: more argument specifiers than keyword list entries

Hence, add cpu to the name list.

Cc: David Ahern <daahern@cisco.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/1301588863-20210-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03 10:09:22 -03:00
Arnaldo Carvalho de Melo 56722381b8 perf evlist: Don't die if sample_{id_all|type} is invalid
Fixes two more cases where the python binding would not load:

. Not finding die(), which it shouldn't anyway, not good to just stop the
  world because some particular perf.data file is invalid, just propagate
  the error to the caller.

. Not finding perf_sample_size: fix it by moving it from event.c to evsel,
  where it belongs, as most cases are moving to operate on an evsel object.o

One of the fixed problems:

[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
>>>
[root@emilia ~]#

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03 10:07:52 -03:00
Arnaldo Carvalho de Melo 9c850d6c4b perf python: Use exception to propagate errors
We were using pr_debug to tell the user about not being able to parse a sample
where we should really use the python way of reporting errors: exceptions.

Fixes this problem:

[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
>>>
[root@emilia ~]

As we want to keep the objects linked in the python binding (and in the future
in a shared library) minimal.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-m9dba9kaluas0kq8r58z191c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03 10:07:01 -03:00
Arnaldo Carvalho de Melo d21cc9f67d perf evlist: Remove dependency on debug routines
So far we avoided having to link debug.o in the python binding, keep it
that way by not using ui__warning() in evlist.c.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4wtew8hd3g7ejnlehtspys2t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03 10:05:23 -03:00
David Ahern 7cec092238 perf script: Add printing of sample address
Resolve to a function or variable if possible and if the sym option is
enabled.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306782503-22002-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:31:01 -03:00
David Ahern 610723f24e perf script: Make printing of dso a separate field option
The 'sym' option displays both the function name and the DSO it comes
from. Split the display of the dso into a separate option.  This allows
display of the ip address and symbol without the dso, thus shortening
line lengths - and decluttering the output a bit.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306528124-25861-3-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:29:14 -03:00
David Ahern 787bef174f perf script: "sym" field really means show IP data
Currently the "sym" output field is used to dump instruction pointers
and callchain stack. Sample addresses can also be converted to symbols,
so the meaning of "sym" needs to be fixed. This patch adds an "ip"
option and if it is selected the user can also opt to dump symbols for
them. If the user opts to dump IP without syms only the address is
shown.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306528124-25861-2-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:28:34 -03:00
David Ahern 2cee77c450 perf stat: clarify unsupported events from uncounted events
perf stat continues running even if the event list contains counters
that are not supported. The resulting output then contains <not counted>
for those events which gets confusing as to which events are supported,
but not counted and which are not supported.

Before:

perf stat -ddd -- sleep 1

      Performance counter stats for 'sleep 1':

          0.571283 task-clock                #    0.001 CPUs utilized
                 1 context-switches          #    0.002 M/sec
                 0 CPU-migrations            #    0.000 M/sec
               157 page-faults               #    0.275 M/sec
         1,037,707 cycles                    #    1.816 GHz
     <not counted> stalled-cycles-frontend
     <not counted> stalled-cycles-backend
           654,499 instructions              #    0.63  insns per cycle
           136,129 branches                  #  238.286 M/sec
     <not counted> branch-misses
     <not counted> L1-dcache-loads
     <not counted> L1-dcache-load-misses
     <not counted> LLC-loads
     <not counted> LLC-load-misses
     <not counted> L1-icache-loads
     <not counted> L1-icache-load-misses
     <not counted> dTLB-loads
     <not counted> dTLB-load-misses
     <not counted> iTLB-loads
     <not counted> iTLB-load-misses
     <not counted> L1-dcache-prefetches
     <not counted> L1-dcache-prefetch-misses

       1.001004836 seconds time elapsed

After:

perf stat -ddd -- sleep 1

 Performance counter stats for 'sleep 1':

          1.350326 task-clock                #    0.001 CPUs utilized
                 2 context-switches          #    0.001 M/sec
                 0 CPU-migrations            #    0.000 M/sec
               157 page-faults               #    0.116 M/sec
            11,986 cycles                    #    0.009 GHz
   <not supported> stalled-cycles-frontend
   <not supported> stalled-cycles-backend
           496,986 instructions              #   41.46  insns per cycle
           138,065 branches                  #  102.246 M/sec
             7,245 branch-misses             #    5.25% of all branches
     <not counted> L1-dcache-loads
     <not counted> L1-dcache-load-misses
     <not counted> LLC-loads
     <not counted> LLC-load-misses
     <not counted> L1-icache-loads
     <not counted> L1-icache-load-misses
     <not counted> dTLB-loads
     <not counted> dTLB-load-misses
     <not counted> iTLB-loads
     <not counted> iTLB-load-misses
     <not counted> L1-dcache-prefetches
   <not supported> L1-dcache-prefetch-misses

       1.002397333 seconds time elapsed

v1->v2:
changed supported type from int to bool

v2->v3
fixed vertical alignment of new struct element

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306767359-13221-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:26:15 -03:00
Frederic Weisbecker 64348153c6 perf python: Cleanup useless double NULL termination in method arg names
The list of methods argument names only needs to be NULL terminated
once. Remove the second ones.

Cc: David Ahern <daahern@cisco.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/1301588863-20210-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:21:26 -03:00
Frederic Weisbecker e95cc02880 perf python: Fix argument name list of read_on_cpu()
Mandatory arguments need to be present in the argument name list, as
well as optional arguments, otherwise python barfs:

	# ./python/twatch.py
	Traceback (most recent call last):
	  File "./python/twatch.py", line 41, in <module>
	    main()
	  File "./python/twatch.py", line 32, in main
	    event = evlist.read_on_cpu(cpu)
	RuntimeError: more argument specifiers than keyword list entries

Hence, add cpu to the name list.

Cc: David Ahern <daahern@cisco.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/1301588863-20210-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:21:07 -03:00
Arnaldo Carvalho de Melo c2a70653af perf evlist: Don't die if sample_{id_all|type} is invalid
Fixes two more cases where the python binding would not load:

. Not finding die(), which it shouldn't anyway, not good to just stop the
  world because some particular perf.data file is invalid, just propagate
  the error to the caller.

. Not finding perf_sample_size: fix it by moving it from event.c to evsel,
  where it belongs, as most cases are moving to operate on an evsel object.o

One of the fixed problems:

[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
>>>
[root@emilia ~]#

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 11:04:54 -03:00
Arnaldo Carvalho de Melo 5c6970af2f perf python: Use exception to propagate errors
We were using pr_debug to tell the user about not being able to parse a sample
where we should really use the python way of reporting errors: exceptions.

Fixes this problem:

[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
>>>
[root@emilia ~]

As we want to keep the objects linked in the python binding (and in the future
in a shared library) minimal.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-m9dba9kaluas0kq8r58z191c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 10:55:10 -03:00
Arnaldo Carvalho de Melo bccdaba044 perf evlist: Remove dependency on debug routines
So far we avoided having to link debug.o in the python binding, keep it
that way by not using ui__warning() in evlist.c.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4wtew8hd3g7ejnlehtspys2t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 10:41:41 -03:00
Arnaldo Carvalho de Melo e4a338d05d perf top: Don't stop if no kernel symtab is found
We now just warn the user about the fact and go on providing just
userspace samples.

This fixes a problem when no vmlinux is explicetely passed by the user,
thus symbol_conf.vmlinux_name is NULL, no suitable vmlinux is found, and
then we get:

 aldebaran:~> perf top -p 7557
 [kernel.kallsyms] with build id 44d9a989eabbd79e486bc079d6b743d397c204e0
 not found, continuing without symbols
 The (null) file can't be used

Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-cj2g81hn64wv2bipmqk4fy2m@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27 16:02:29 -03:00
Arnaldo Carvalho de Melo 5f6f558097 perf top: Handle kptr_restrict
Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-cyl5zmi1nu35vyu7l5im2pyv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27 16:02:25 -03:00
Arnaldo Carvalho de Melo 59fb1ee95e perf top: Remove unused macro
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-weqbs0tkk2u0qp1xxdxxosfg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27 16:02:20 -03:00
David Ahern 4af4c9550c perf events: initialize fd array to -1 instead of 0
perf_evsel__alloc_fd allocates an array of file descriptors with the
memory initialized to 0. The array has dimensions for cpus and threads.

Later, __perf_evsel__open calls sys_perf_event_open for each cpu and thread
dimensions. If the open fails for any of the cpus or threads then the fd's
for this event are closed and the fd entry in the array is set to -1. Now,
if the first attempt fails for the event (e.g., the event is not supported)
the remaining dimensions (cpu > 0 and thread > 0) are not touched and left
at the initialized value of 0.

builtin-stat catches ENOENT and ENOSYS failures and allows the command to
continue. The end result is that stat attempts to read from an fd of 0 which
of course is stdin and so the command hangs until you type ctrl-D.

Resolve by initializing the array to -1 since an fd < 0 is already
handled.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306511914-8016-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27 16:02:12 -03:00
Arnaldo Carvalho de Melo 646aaea615 perf tools: Make sure kptr_restrict warnings fit 80 col terms
Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-i1p8vrhq7xveyui6t1sc914e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27 16:02:09 -03:00
Arnaldo Carvalho de Melo 75911c9bd1 perf tools: Fix build on older systems
Where /usr/include/linux/const.h is not present, e.g. RHEL5.

Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-ypcw2mu0w7dl1rrc6ncz3pee@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-26 11:16:29 -03:00
Arnaldo Carvalho de Melo ec80fde746 perf symbols: Handle /proc/sys/kernel/kptr_restrict
Perf uses /proc/modules to figure out where kernel modules are loaded.

With the advent of kptr_restrict, non root users get zeroes for all module
start addresses.

So check if kptr_restrict is non zero and don't generate the syntethic
PERF_RECORD_MMAP events for them.

Warn the user about it in perf record and in perf report.

In perf report the reference relocation symbol being zero means that
kptr_restrict was set, thus /proc/kallsyms has only zeroed addresses, so don't
use it to fixup symbol addresses when using a valid kallsyms (in the buildid
cache) or vmlinux (in the vmlinux path) build-id located automatically or
specified by the user.

Provide an explanation about it in 'perf report' if kernel samples were taken,
checking if a suitable vmlinux or kallsyms was found/specified.

Restricted /proc/kallsyms don't go to the buildid cache anymore.

Example:

 [acme@emilia ~]$ perf record -F 100000 sleep 1

 WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check
 /proc/sys/kernel/kptr_restrict.

 Samples in kernel functions may not be resolved if a suitable vmlinux file is
 not found in the buildid cache or in the vmlinux path.

 Samples in kernel modules won't be resolved at all.

 If some relocation was applied (e.g. kexec) symbols may be misresolved even
 with a suitable vmlinux or kallsyms file.

 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.005 MB perf.data (~231 samples) ]
 [acme@emilia ~]$

 [acme@emilia ~]$ perf report --stdio
 Kernel address maps (/proc/{kallsyms,modules}) were restricted,
 check /proc/sys/kernel/kptr_restrict before running 'perf record'.

 If some relocation was applied (e.g. kexec) symbols may be misresolved.

 Samples in kernel modules can't be resolved as well.

 # Events: 13  cycles
 #
 # Overhead  Command      Shared Object                 Symbol
 # ........  .......  .................  .....................
 #
    20.24%    sleep  [kernel.kallsyms]  [k] page_fault
    20.04%    sleep  [kernel.kallsyms]  [k] filemap_fault
    19.78%    sleep  [kernel.kallsyms]  [k] __lru_cache_add
    19.69%    sleep  ld-2.12.so         [.] memcpy
    14.71%    sleep  [kernel.kallsyms]  [k] dput
     4.70%    sleep  [kernel.kallsyms]  [k] flush_signal_handlers
     0.73%    sleep  [kernel.kallsyms]  [k] perf_event_comm
     0.11%    sleep  [kernel.kallsyms]  [k] native_write_msr_safe

 #
 # (For a higher level overview, try: perf report --sort comm,dso)
 #
 [acme@emilia ~]$

This is because it found a suitable vmlinux (build-id checked) in
/lib/modules/2.6.39-rc7+/build/vmlinux (use -v in perf report to see the long
file name).

If we remove that file from the vmlinux path:

 [root@emilia ~]# mv /lib/modules/2.6.39-rc7+/build/vmlinux \
		     /lib/modules/2.6.39-rc7+/build/vmlinux.OFF
 [acme@emilia ~]$ perf report --stdio
 [kernel.kallsyms] with build id 57298cdbe0131f6871667ec0eaab4804dcf6f562
 not found, continuing without symbols

 Kernel address maps (/proc/{kallsyms,modules}) were restricted, check
 /proc/sys/kernel/kptr_restrict before running 'perf record'.

 As no suitable kallsyms nor vmlinux was found, kernel samples can't be
 resolved.

 Samples in kernel modules can't be resolved as well.

 # Events: 13  cycles
 #
 # Overhead  Command      Shared Object  Symbol
 # ........  .......  .................  ......
 #
    80.31%    sleep  [kernel.kallsyms]  [k] 0xffffffff8103425a
    19.69%    sleep  ld-2.12.so         [.] memcpy

 #
 # (For a higher level overview, try: perf report --sort comm,dso)
 #
 [acme@emilia ~]$

Reported-by: Stephane Eranian <eranian@google.com>
Suggested-by: David Miller <davem@davemloft.net>
Cc: Dave Jones <davej@redhat.com>
Cc: David Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Kees Cook <kees.cook@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-mt512joaxxbhhp1odop04yit@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-26 11:15:25 -03:00
Jesper Juhl ea7659fb2b perf: Remove duplicate headers
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: trivial@kernel.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1105261011290.17400@swampdragon.chaosbits.net
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-26 13:49:57 +02:00
Linus Torvalds 5214638384 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf tools: Fix sample type size calculation in 32 bits archs
  profile: Use vzalloc() rather than vmalloc() & memset()
2011-05-23 21:20:48 -07:00
Frederic Weisbecker 0f61f3e4db perf tools: Fix sample type size calculation in 32 bits archs
The shift used here to count the number of bits set in
the mask doesn't work above the low part for archs that
are not 64 bits.

Fix the constant used for the shift.

This fixes a 32-bit perf top failure reported by Eric Dumazet:

	Can't parse sample, err = -14
	Can't parse sample, err = -14
	...

Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com
Link: http://lkml.kernel.org/r/1306200686-17317-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-24 04:33:24 +02:00
Linus Torvalds 19504828b4 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf tools: Fix sample size bit operations
  perf tools: Fix ommitted mmap data update on remap
  watchdog: Change the default timeout and configure nmi watchdog period based on watchdog_thresh
  watchdog: Disable watchdog when thresh is zero
  watchdog: Only disable/enable watchdog if neccessary
  watchdog: Fix rounding bug in get_sample_period()
  perf tools: Propagate event parse error handling
  perf tools: Robustify dynamic sample content fetch
  perf tools: Pre-check sample size before parsing
  perf tools: Move evlist sample helpers to evlist area
  perf tools: Remove junk code in mmap size handling
  perf tools: Check we are able to read the event size on mmap
2011-05-23 09:25:52 -07:00
Linus Torvalds 57d19e80f4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  b43: fix comment typo reqest -> request
  Haavard Skinnemoen has left Atmel
  cris: typo in mach-fs Makefile
  Kconfig: fix copy/paste-ism for dell-wmi-aio driver
  doc: timers-howto: fix a typo ("unsgined")
  perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
  md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
  treewide: fix a few typos in comments
  regulator: change debug statement be consistent with the style of the rest
  Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
  audit: acquire creds selectively to reduce atomic op overhead
  rtlwifi: don't touch with treewide double semicolon removal
  treewide: cleanup continuations and remove logging message whitespace
  ath9k_hw: don't touch with treewide double semicolon removal
  include/linux/leds-regulator.h: fix syntax in example code
  tty: fix typo in descripton of tty_termios_encode_baud_rate
  xtensa: remove obsolete BKL kernel option from defconfig
  m68k: fix comment typo 'occcured'
  arch:Kconfig.locks Remove unused config option.
  treewide: remove extra semicolons
  ...
2011-05-23 09:12:26 -07:00
Frederic Weisbecker 3cb6d15408 perf tools: Fix sample size bit operations
What we want is to count the number of bits in the mask,
not some other random operation written in the middle
of the night.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1306148788-6179-2-git-send-email-fweisbec@gmail.com
[ Fixed perf_event__names[] alignment which was nearby and hurting my eyes ... ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-23 13:26:36 +02:00
Frederic Weisbecker 998bedc8c5 perf tools: Fix ommitted mmap data update on remap
Commit eac9eacee1 "perf tools: Check we are able to read the event
size on mmap" brought a check to ensure we can read the size of the
event before dereferencing it, and do a remap otherwise to move the
buffer forward.

However that remap was ommitting all the necessary work to
update the new page offset, head, and to unmap previous pages,
etc...

To fix this, gather all the code that fetches the event in a
seperate helper which does all the necessary checks about the
header/event size and tells us anytime a remap is needed.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1306148788-6179-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-23 13:22:57 +02:00
Ingo Molnar 3ac1bbcf13 Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent
Conflicts:
	tools/perf/builtin-top.c

Semantic conflict:
	util/include/linux/list.h        # fix prefetch.h removal fallout

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-22 10:10:01 +02:00
Frederic Weisbecker 5538becaec perf tools: Propagate event parse error handling
Better handle event parsing error by propagating the details
in upper layers or by dumping some failure message. So that
the user knows he has some crazy events in the batch.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:38:49 +02:00
Frederic Weisbecker 98e1da905c perf tools: Robustify dynamic sample content fetch
Ensure the size of the dynamic fields such as callchains
or raw events don't overlap the whole event boundaries.

This prevents from dereferencing junk if the given size of
the callchain goes too eager.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:38:48 +02:00
Frederic Weisbecker a285412479 perf tools: Pre-check sample size before parsing
Check that the total size of the sample fields having a fixed
size do not exceed the one of the whole event. This robustifies
the sample parsing.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:38:36 +02:00
Frederic Weisbecker 74429964d8 perf tools: Move evlist sample helpers to evlist area
These APIs should belong to evlist.c as they may not be
exclusively tied to the headers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com
2011-05-22 03:12:29 +02:00
Frederic Weisbecker dd5f5fd108 perf tools: Remove junk code in mmap size handling
size is overriden later and used only then. Those
lines are only junk, probably a leftover.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:12:28 +02:00
Frederic Weisbecker eac9eacee1 perf tools: Check we are able to read the event size on mmap
Check we have enough mmaped space to read the current event
size from its headers, otherwise we may dereference some
hell there.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:12:13 +02:00
Linus Torvalds 268bb0ce3e sanitize <linux/prefetch.h> usage
Commit e66eed651f ("list: remove prefetching from regular list
iterators") removed the include of prefetch.h from list.h, which
uncovered several cases that had apparently relied on that rather
obscure header file dependency.

So this fixes things up a bit, using

   grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]')
   grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]')

to guide us in finding files that either need <linux/prefetch.h>
inclusion, or have it despite not needing it.

There are more of them around (mostly network drivers), but this gets
many core ones.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-20 12:50:29 -07:00
Linus Torvalds eb04f2f04e Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits)
  Revert "rcu: Decrease memory-barrier usage based on semi-formal proof"
  net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree
  batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu
  batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree()
  batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu
  net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu()
  net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu()
  net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu()
  perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu()
  perf,rcu: convert call_rcu(free_ctx) to kfree_rcu()
  net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(net_generic_release) to kfree_rcu()
  net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu()
  net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu()
  security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu()
  net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu()
  net,rcu: convert call_rcu(xps_map_release) to kfree_rcu()
  net,rcu: convert call_rcu(rps_map_release) to kfree_rcu()
  ...
2011-05-19 18:14:34 -07:00
Linus Torvalds 80fe02b5da Merge branches 'sched-core-for-linus' and 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
  sched: Fix and optimise calculation of the weight-inverse
  sched: Avoid going ahead if ->cpus_allowed is not changed
  sched, rt: Update rq clock when unthrottling of an otherwise idle CPU
  sched: Remove unused parameters from sched_fork() and wake_up_new_task()
  sched: Shorten the construction of the span cpu mask of sched domain
  sched: Wrap the 'cfs_rq->nr_spread_over' field with CONFIG_SCHED_DEBUG
  sched: Remove unused 'this_best_prio arg' from balance_tasks()
  sched: Remove noop in alloc_rt_sched_group()
  sched: Get rid of lock_depth
  sched: Remove obsolete comment from scheduler_tick()
  sched: Fix sched_domain iterations vs. RCU
  sched: Next buddy hint on sleep and preempt path
  sched: Make set_*_buddy() work on non-task entities
  sched: Remove need_migrate_task()
  sched: Move the second half of ttwu() to the remote cpu
  sched: Restructure ttwu() some more
  sched: Rename ttwu_post_activation() to ttwu_do_wakeup()
  sched: Remove rq argument from ttwu_stat()
  sched: Remove rq->lock from the first half of ttwu()
  sched: Drop rq->lock from sched_exec()
  ...

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Fix rt_rq runtime leakage bug
2011-05-19 17:41:22 -07:00
Linus Torvalds df48d8716e Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (107 commits)
  perf stat: Add more cache-miss percentage printouts
  perf stat: Add -d -d and -d -d -d options to show more CPU events
  ftrace/kbuild: Add recordmcount files to force full build
  ftrace: Add self-tests for multiple function trace users
  ftrace: Modify ftrace_set_filter/notrace to take ops
  ftrace: Allow dynamically allocated function tracers
  ftrace: Implement separate user function filtering
  ftrace: Free hash with call_rcu_sched()
  ftrace: Have global_ops store the functions that are to be traced
  ftrace: Add ops parameter to ftrace_startup/shutdown functions
  ftrace: Add enabled_functions file
  ftrace: Use counters to enable functions to trace
  ftrace: Separate hash allocation and assignment
  ftrace: Create a global_ops to hold the filter and notrace hashes
  ftrace: Use hash instead for FTRACE_FL_FILTER
  ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions
  perf bench, x86: Add alternatives-asm.h wrapper
  x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit
  x86, mem: memset_64.S: Optimize memset by enhanced REP MOVSB/STOSB
  x86, mem: memmove_64.S: Optimize memmove by enhanced REP MOVSB/STOSB
  ...
2011-05-19 17:36:08 -07:00
Ingo Molnar c3305257cd perf stat: Add more cache-miss percentage printouts
Print out the cache-miss percentage as well if the cache refs were
collected, for all the generic cache event types.

Before:

   11,103,723,230 dTLB-loads                #  622.471 M/sec                    ( +-  0.30% )
       87,065,337 dTLB-load-misses          #    4.881 M/sec                    ( +-  0.90% )

After:

   11,353,713,242 dTLB-loads                #  626.020 M/sec                    ( +-  0.35% )
      113,393,472 dTLB-load-misses          #    1.00% of all dTLB cache hits   ( +-  0.49% )

Also ASCII color highlight too high percentages, them when it's executed on the console.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-lkhwxsevdbd9a8nymx0vxc3y@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-19 14:30:50 +02:00
Ingo Molnar 2cba3ffb9a perf stat: Add -d -d and -d -d -d options to show more CPU events
Print even more detailed statistics if requested via perf stat -d:

       -d:          detailed events, L1 and LLC data cache
    -d -d:     more detailed events, dTLB and iTLB events
 -d -d -d:     very detailed events, adding prefetch events

Full output looks like this now:

 Performance counter stats for '/home/mingo/hackbench 10' (5 runs):

       1703.674707 task-clock                #    8.709 CPUs utilized            ( +-  4.19% )
            49,068 context-switches          #    0.029 M/sec                    ( +- 16.66% )
             8,303 CPU-migrations            #    0.005 M/sec                    ( +- 24.90% )
            17,397 page-faults               #    0.010 M/sec                    ( +-  0.46% )
     2,345,389,239 cycles                    #    1.377 GHz                      ( +-  4.61% ) [55.90%]
     1,884,503,527 stalled-cycles-frontend   #   80.35% frontend cycles idle     ( +-  5.67% ) [50.39%]
       743,919,737 stalled-cycles-backend    #   31.72% backend  cycles idle     ( +-  8.75% ) [49.91%]
     1,314,416,379 instructions              #    0.56  insns per cycle
                                             #    1.43  stalled cycles per insn  ( +-  2.53% ) [60.87%]
       272,592,567 branches                  #  160.003 M/sec                    ( +-  1.74% ) [56.56%]
         3,794,846 branch-misses             #    1.39% of all branches          ( +-  6.59% ) [58.50%]
       449,982,778 L1-dcache-loads           #  264.125 M/sec                    ( +-  2.47% ) [49.88%]
        22,404,961 L1-dcache-load-misses     #    4.98% of all L1-dcache hits    ( +-  6.08% ) [55.05%]
         6,204,750 LLC-loads                 #    3.642 M/sec                    ( +-  8.91% ) [43.75%]
         1,837,411 LLC-load-misses           #    1.078 M/sec                    ( +-  7.27% ) [12.07%]
       411,440,421 L1-icache-loads           #  241.502 M/sec                    ( +-  5.60% ) [36.52%]
        27,556,832 L1-icache-load-misses     #   16.175 M/sec                    ( +-  7.46% ) [46.72%]
       464,067,627 dTLB-loads                #  272.392 M/sec                    ( +-  4.46% ) [54.17%]
        10,765,648 dTLB-load-misses          #    6.319 M/sec                    ( +-  3.18% ) [48.68%]
     1,273,080,386 iTLB-loads                #  747.256 M/sec                    ( +-  3.38% ) [47.53%]
           117,481 iTLB-load-misses          #    0.069 M/sec                    ( +- 14.99% ) [47.01%]
         4,590,653 L1-dcache-prefetches      #    2.695 M/sec                    ( +-  4.49% ) [46.19%]
         1,712,660 L1-dcache-prefetch-misses #    1.005 M/sec                    ( +-  3.75% ) [44.82%]

        0.195622057  seconds time elapsed  ( +-  6.84% )

Also clean up the attribute construction code to be appending, and factor
it out into add_default_attributes().

Tweak the coverage percentage printout a bit, so that it's easier to view it
alongside the +- sttddev colum.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-to3kgu04449s64062val8b62@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-19 14:29:51 +02:00