Commit Graph

52 Commits

Author SHA1 Message Date
Adrian Hunter 47e780848e perf script: Add 'synth' field for synthesized event payloads
Add a field to display the content the raw_data of a synthesized event.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-22-git-send-email-adrian.hunter@intel.com
[ Resolved conflict with 106dacd86f ("perf script: Support -F brstackoff,dso") ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27 12:19:10 -03:00
Mark Santaniello 106dacd86f perf script: Support -F brstackoff,dso
The idea here is to make AutoFDO easier in cloud environment with ASLR.
It's easiest to show how this is useful by example. I built a small test
akin to "while(1) { do_nothing(); }" where the do_nothing function is
loaded from a dso:

  $ cat burncpu.cpp
  #include <dlfcn.h>

  int main() {
    void* handle = dlopen("./dso.so", RTLD_LAZY);
    if (!handle) return -1;

    typedef void (*fp)();
    fp do_nothing = (fp) dlsym(handle, "do_nothing");

    while(1) {
      do_nothing();
    }
  }

  $ cat dso.cpp
  extern "C" void do_nothing() {}

  $ cat build.sh
  #!/bin/bash
  g++ -shared dso.cpp -o dso.so
  g++ burncpu.cpp -o burncpu -ldl

I sampled the execution of this program with perf record -b.

Using the existing "brstack,dso", we get absolute addresses that are
affected by ASLR, and could be different on different hosts. The address
does not uniquely identify a branch/target in the binary:

  $ perf script -F brstack,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
  0x7f967139b6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0

Using the existing "brstacksym,dso" is a little better, because the
symbol plus offset and dso name *does* uniquely identify a branch/target
in the binary.  Ultimately, however, AutoFDO wants a simple offset into
the binary, so we'd have to undo all the work perf did to symbolize in
the first place:

  $ perf script -F brstacksym,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
  do_nothing+0x5(/tmp/burncpu/dso.so)/main+0x44(/tmp/burncpu/exe)/P/-/-/0

With the new "brstackoff,dso" we get what we need: a simple offset into a
specific dso/binary that uniquely identifies a branch/target:
  $ perf script -F brstackoff,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
  0x6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0

Signed-off-by: Mark Santaniello <marksan@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170619163825.2012979-2-marksan@fb.com
[ Updated documentation about 'brstackoff' using text from above ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19 22:05:46 -03:00
Andi Kleen 36ce565114 perf script: Allow adding and removing fields
With 'perf script' it is common that we just want to add or remove a field.

Currently this requires figuring out the long list of default fields and
specifying them first, and then adding/removing the new field.

This patch adds a new + - syntax to merely add or remove fields,
that allows more succint and clearer command lines

For example to remove the comm field from PMU samples:

Previously

  $ perf script -F tid,cpu,time,event,sym,ip,dso,period | head -1
  swapper  0 [000] 504345.383126:          1 cycles:  ffffffff90060c66 native_write_msr ([kernel.kallsyms])

with the new syntax

  perf script -F -comm | head -1
  0 [000] 504345.383126:          1 cycles:  ffffffff90060c66 native_write_msr ([kernel.kallsyms])

The new syntax cannot be mixed with normal overriding.

v2: Fix example in description. Use tid vs pid. No functional changes.
v3: Don't skip initialization when user specified explicit type.
v4: Rebase. Remove empty line.

Committer testing:

  # perf record -a usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.748 MB perf.data (14 samples) ]

Without a explicit field list specified via -F, defaults to:

  # perf script | head -2
      perf 6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
   swapper    0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
  #

Which is equivalent to:

  # perf script -F comm,tid,cpu,time,period,event,ip,sym,dso | head -2
      perf 6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
   swapper    0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
  #

So if we want to remove the comm, as in your original example, we would have to
figure out the default field list and remove ' comm' from it:

  # perf script -F tid,cpu,time,period,event,ip,sym,dso | head -2
   6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
      0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
  #

With your patch this becomes simpler, one can remove fields by prefixing them
with '-':

  # perf script -F -comm | head -2
  6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
     0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
  #

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Link: http://lkml.kernel.org/r/20170602154810.15875-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19 15:14:58 -03:00
Namhyung Kim 325fbff51f perf script: Add --inline option for debugging
The --inline option is to show inlined functions in callchains.

For example:

  $ perf script
  a.out  5644 11611.467597:     309961 cycles:u:
                     790 main (/home/namhyung/tmp/perf/a.out)
                   20511 __libc_start_main (/usr/lib/libc-2.25.so)
                     8ba _start (/home/namhyung/tmp/perf/a.out)
  ...

  $ perf script --inline
  a.out  5644 11611.467597:     309961 cycles:u:
                     790 main (/home/namhyung/tmp/perf/a.out)
                         std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()
                         std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
                         std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
                         main
                   20511 __libc_start_main (/usr/lib/libc-2.25.so)
                     8ba _start (/home/namhyung/tmp/perf/a.out)
  ...

Reviewed-and-tested-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-5-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-24 08:41:48 +02:00
Andi Kleen 48d02a1d5c perf script: Add 'brstackinsn' for branch stacks
Implement printing instruction sequences as hex dump for branch stacks.

This relies on the x86 instruction decoder used by the PT decoder to
find the lengths of instructions to dump them individually.

This is good enough for pattern matching.

This allows to study hot paths for individual samples, together with
branch misprediction and cycle count / IPC information if available (on
Skylake systems).

  % perf record -b ...
  % perf script -F brstackinsn
  ...
    read_hpet+67:
          ffffffff9905b843        insn: 74 ea                     # PRED
          ffffffff9905b82f        insn: 85 c9
          ffffffff9905b831        insn: 74 12
          ffffffff9905b833        insn: f3 90
          ffffffff9905b835        insn: 48 8b 0f
          ffffffff9905b838        insn: 48 89 ca
          ffffffff9905b83b        insn: 48 c1 ea 20
          ffffffff9905b83f        insn: 39 f2
          ffffffff9905b841        insn: 89 d0
          ffffffff9905b843        insn: 74 ea                     # PRED

Only works when no special branch filters are specified.

Occasionally the path does not reach up to the sample IP, as the LBRs
may be frozen before executing a final jump. In this case we print a
special message.

The instruction dumper piggy backs on the existing infrastructure from
the IP PT decoder.

An earlier iteration of this patch relied on a disassembler, but this
version only uses the existing instruction decoder.

Committer note:

Added hint about how to get suitable perf.data files for use with
'-F brstackinsm':

  $ perf record usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ]
  $
  $ perf script -F brstackinsn
  Display of branch stack assembler requested, but non all-branch filter set
  Hint: run 'perf record -b ...'
  $

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: http://lkml.kernel.org/r/20170223234634.583-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-16 09:24:35 -03:00
Hari Bathini 96a44bbccd perf script: Add script print support for namespace events
Introduce a new option to display events of type PERF_RECORD_NAMESPACES
and update perf-script documentation accordingly.

Shown below is output (trimmed) of perf script command with the newly
introduced option, on perf.data generated with perf record command using
--namespaces option.

  $ perf script --show-namespace-events
      swapper   0 [000]     0.000000: PERF_RECORD_NAMESPACES 1/1 - nr_namespaces: 7
                [0/net: 3/0xf000001c, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
                 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
      swapper   0 [000]     0.000000: PERF_RECORD_NAMESPACES 2/2 - nr_namespaces: 7
                [0/net: 3/0xf000001c, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
                 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]

Commiter notes:

Testing it:

Investigating that double PERF_RECORD_NAMESPACES for the 19155
pid/tid... Its more than that, there are two PERF_RECORD_COMM as well,
and with zeroed timestamps, so probably a synthesizing artifact...

  # perf script --show-task --show-namespace
  <SNIP>
      perf     0 [000]     0.000000: PERF_RECORD_COMM: perf:19154/19154
      perf     0 [000]     0.000000: PERF_RECORD_FORK(19155:19155):(19154:19154)
      perf     0 [000]     0.000000: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7
          [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
           4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
      perf     0 [000]     0.000000: PERF_RECORD_COMM: perf:19155/19155
      perf     0 [000]     0.000000: PERF_RECORD_COMM: perf:19155/19155
      perf     0 [000]     0.000000: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7
          [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
           4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
   swapper     0 [000]  3110.881834:          1 cycles:  ffffffffa7060bf6 native_write_msr (/lib/modules/4.11.0-rc1+/build/vmlinux)

  <SNIP>

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/148891932627.25309.1941587059154176221.stgit@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-14 15:17:36 -03:00
Michael Petlan 5c64f99b1d perf script: Fix man page about --dump-raw-trace option
The "--dump-raw-script" is not a valid option, replace it with the valid
one, "--dump-raw-trace"

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 133dc4c39c ("perf: Rename 'perf trace' to 'perf script'")
LPU-Reference: 728644547.14560155.1484320012612.JavaMail.zimbra@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 14:59:15 -03:00
David Ahern a91f4c473f perf script: Add option to specify time window of interest
Add option to allow user to control analysis window. e.g., collect data
for some amount of time and analyze a segment of interest within that
window.

Committer notes:

Testing it:

  # perf evlist -v
  cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
  #
  # perf script --hide-call-graph | head -15
    swapper    0 [0] 9693.370039:      1 cycles:ppp: ffffffffb90072ad x86_pmu_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370044:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370046:      7 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370048:    126 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370049:   2701 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370051:  58823 cycles:ppp: ffffffffb90cd2e0 idle_cpu (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370059:      1 cycles:ppp: ffffffffb91a713a ctx_resched (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370062:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370064:     13 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370065:    250 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370067:   5269 cycles:ppp: ffffffffb902fe79 sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370069: 114602 cycles:ppp: ffffffffb90c1c5a atomic_notifier_call_chain (.../4.8.8-300.fc25.x86_64/vmlinux)
       perf 5124 [2] 9693.370076:      1 cycles:ppp: ffffffffb91a76c1 __perf_event_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
       perf 5124 [2] 9693.370091:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
       perf 5124 [2] 9693.370095:      3 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
  #
  # perf script --hide-call-graph --time ,9693.370048
    swapper    0 [0] 9693.370039:      1 cycles:ppp: ffffffffb90072ad x86_pmu_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370044:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370046:      7 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
  # perf script --hide-call-graph --time 9693.370064,9693.370076
    swapper    0 [1] 9693.370064:     13 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370065:    250 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370067:   5269 cycles:ppp: ffffffffb902fe79 sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370069: 114602 cycles:ppp: ffffffffb90c1c5a atomic_notifier_call_chain (.../4.8.8-300.fc25.x86_64/vmlinux)
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-4-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:02:45 -03:00
David Ahern 64eff7d9c4 perf script: Add option to stop printing callchain
Allow user to specify list of symbols which cause the dump of callchains
to stop at that symbol.

Committer notes:

Testing it:

  # perf record -ag usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.177 MB perf.data (33 samples) ]
  #
  # # Without it:
  #
  # perf script
  swapper   0 [000]  9693.370039:          1 cycles:ppp:
                  2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                 137ffeb start_kernel ([kernel.vmlinux].init.text)
                 137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text)
                 137f419 x86_64_start_kernel ([kernel.vmlinux].init.text)

  swapper   0 [000]  9693.370044:          1 cycles:ppp:
                  20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                 137ffeb start_kernel ([kernel.vmlinux].init.text)
                 137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text)
  #
  # # Using it to see just what are the calls from the 'remote_function' function:
  #
  # perf script --stop-bt remote_function
  swapper   0 [000]  9693.370039:          1 cycles:ppp:
                  2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)

  swapper   0 [000]  9693.370044:          1 cycles:ppp:
                  20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480104021-36275-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-29 13:06:19 -03:00
Andi Kleen 224e2c977b perf script: Support insn and insnlen
When looking at Intel PT traces with perf script it is useful to have
some indication of the instruction. Dump the instruction bytes and
instruction length, which can be used for simple pattern analysis in
scripts.

% perf record -e intel_pt// foo
% perf script --itrace=i0ns -F ip,insn,insnlen
 ffffffff8101232f ilen: 5 insn: 0f 1f 44 00 00
 ffffffff81012334 ilen: 1 insn: 5b
 ffffffff81012335 ilen: 1 insn: 5d
 ffffffff81012336 ilen: 1 insn: c3
 ffffffff810123e3 ilen: 1 insn: 5b
 ffffffff810123e4 ilen: 2 insn: 41 5c
 ffffffff810123e6 ilen: 1 insn: 5d
 ffffffff810123e7 ilen: 1 insn: c3
 ffffffff810124a6 ilen: 2 insn: 31 c0
 ffffffff810124a8 ilen: 9 insn: 41 83 bc 24 a8 01 00 00 01
 ffffffff810124b1 ilen: 2 insn: 75 87
...

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/1475847747-30994-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-10-24 11:07:30 -03:00
Brendan Gregg bcdc09af3e perf script: Add 'bpf-output' field to usage message
This adds the 'bpf-output' field to the perf script usage message, and docs.

Signed-off-by: Brendan Gregg <bgregg@netflix.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1470192469-11910-4-git-send-email-bgregg@netflix.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-08-09 10:46:43 -03:00
Adrian Hunter e216708d98 perf script: Add callindent option
Based on patches from Andi Kleen.

When printing PT instruction traces with perf script it is rather useful
to see some indentation for the call tree. This patch adds a new
callindent field to perf script that prints spaces for the function call
stack depth.

We already have code to track the function call stack for PT, that we
can reuse with minor modifications.

The resulting output is not quite as nice as ftrace yet, but a lot
better than what was there before.

Note there are some corner cases when the thread stack gets code
confused and prints incorrect indentation. Even with that it is fairly
useful.

When displaying kernel code traces it is recommended to run as root, as
otherwise perf doesn't understand the kernel addresses properly, and may
not reset the call stack correctly on kernel boundaries.

Example output:

	sudo perf-with-kcore record eg2 -a -e intel_pt// -- sleep 1
	sudo perf-with-kcore script eg2 --ns -F callindent,time,comm,pid,sym,ip,addr,flags,cpu --itrace=cre | less
	...
         swapper     0 [000]  5830.389116586:   call        irq_exit                                                     ffffffff8104d620 smp_call_function_single_interrupt+0x30 => ffffffff8107e720 irq_exit
         swapper     0 [000]  5830.389116586:   call            idle_cpu                                                 ffffffff8107e769 irq_exit+0x49 => ffffffff810a3970 idle_cpu
         swapper     0 [000]  5830.389116586:   return          idle_cpu                                                 ffffffff810a39b7 idle_cpu+0x47 => ffffffff8107e76e irq_exit
         swapper     0 [000]  5830.389116586:   call            tick_nohz_irq_exit                                       ffffffff8107e7bd irq_exit+0x9d => ffffffff810f2fc0 tick_nohz_irq_exit
         swapper     0 [000]  5830.389116919:   call                __tick_nohz_idle_enter                               ffffffff810f2fe0 tick_nohz_irq_exit+0x20 => ffffffff810f28d0 __tick_nohz_idle_enter
         swapper     0 [000]  5830.389116919:   call                    ktime_get                                        ffffffff810f28f1 __tick_nohz_idle_enter+0x21 => ffffffff810e9ec0 ktime_get
         swapper     0 [000]  5830.389116919:   call                        read_tsc                                     ffffffff810e9ef6 ktime_get+0x36 => ffffffff81035070 read_tsc
         swapper     0 [000]  5830.389116919:   return                      read_tsc                                     ffffffff81035084 read_tsc+0x14 => ffffffff810e9efc ktime_get
         swapper     0 [000]  5830.389116919:   return                  ktime_get                                        ffffffff810e9f46 ktime_get+0x86 => ffffffff810f28f6 __tick_nohz_idle_enter
         swapper     0 [000]  5830.389116919:   call                    sched_clock_idle_sleep_event                     ffffffff810f290b __tick_nohz_idle_enter+0x3b => ffffffff810a7380 sched_clock_idle_sleep_event
         swapper     0 [000]  5830.389116919:   call                        sched_clock_cpu                              ffffffff810a738b sched_clock_idle_sleep_event+0xb => ffffffff810a72e0 sched_clock_cpu
         swapper     0 [000]  5830.389116919:   call                            sched_clock                              ffffffff810a734d sched_clock_cpu+0x6d => ffffffff81035750 sched_clock
         swapper     0 [000]  5830.389116919:   call                                native_sched_clock                   ffffffff81035754 sched_clock+0x4 => ffffffff81035640 native_sched_clock
         swapper     0 [000]  5830.389116919:   return                              native_sched_clock                   ffffffff8103568c native_sched_clock+0x4c => ffffffff81035759 sched_clock
         swapper     0 [000]  5830.389116919:   return                          sched_clock                              ffffffff8103575c sched_clock+0xc => ffffffff810a7352 sched_clock_cpu
         swapper     0 [000]  5830.389116919:   return                      sched_clock_cpu                              ffffffff810a7356 sched_clock_cpu+0x76 => ffffffff810a7390 sched_clock_idle_sleep_event
         swapper     0 [000]  5830.389116919:   return                  sched_clock_idle_sleep_event                     ffffffff810a7391 sched_clock_idle_sleep_event+0x11 => ffffffff810f2910 __tick_nohz_idle_enter
	...

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1466689258-28493-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-06-23 17:04:26 -03:00
Adrian Hunter 055cd33d93 perf script: Print sample flags more nicely
The flags field is synthesized and may have a value when Instruction
Trace decoding. The flags are "bcrosyiABEx" which stand for branch,
call, return, conditional, system, asynchronous, interrupt, transaction
abort, trace begin, trace end, and in transaction, respectively.

Change the display so that known combinations of flags are printed more
nicely e.g.: "call" for "bc", "return" for "br", "jcc" for "bo", "jmp"
for "b", "int" for "bci", "iret" for "bri", "syscall" for "bcs",
"sysret" for "brs", "async" for "by", "hw int" for "bcyi", "tx abrt" for
"bA", "tr strt" for "bB", "tr end" for "bE".

However the "x" flag will be displayed separately in those cases e.g.
"jcc (x)" for a condition branch within a transaction.

Example:

    perf record -e intel_pt//u ls
    perf script --ns -F comm,cpu,pid,tid,time,ip,addr,sym,dso,symoff,flags
    ...
    ls  3689/3689  [001]  2062.020965237:   jcc          7f06a958847a _dl_sysdep_start+0xfa (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a9588450 _dl_sysdep_start+0xd0 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020965237:   jmp          7f06a9588461 _dl_sysdep_start+0xe1 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a95885a0 _dl_sysdep_start+0x220 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020965237:   jmp          7f06a95885a4 _dl_sysdep_start+0x224 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a9588470 _dl_sysdep_start+0xf0 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020965904:   call         7f06a95884c3 _dl_sysdep_start+0x143 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a9589140 brk+0x0 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020965904:   syscall      7f06a958914a brk+0xa (/lib/x86_64-linux-gnu/ld-2.19.so) =>                0 [unknown] ([unknown])
    ls  3689/3689  [001]  2062.020966237:   tr strt                 0 [unknown] ([unknown]) =>     7f06a958914c brk+0xc (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020966237:   return       7f06a9589165 brk+0x25 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a95884c8 _dl_sysdep_start+0x148 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020966237:   jcc          7f06a95884d7 _dl_sysdep_start+0x157 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a95885f0 _dl_sysdep_start+0x270 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020966237:   call         7f06a95885f0 _dl_sysdep_start+0x270 (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a958ac50 strlen+0x0 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ls  3689/3689  [001]  2062.020966237:   jcc          7f06a958ac6e strlen+0x1e (/lib/x86_64-linux-gnu/ld-2.19.so) =>     7f06a958ac60 strlen+0x10 (/lib/x86_64-linux-gnu/ld-2.19.so)
    ...

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1466689258-28493-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-06-23 16:36:59 -03:00
Adrian Hunter cbb0bba9f3 perf script: Fix documentation of '-f' when it should be '-F'
The documentation for perf script mixes up '-f' and '-F'. Fix it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/None
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-06-21 13:18:33 -03:00
Arnaldo Carvalho de Melo fe176085a4 perf tools: Fix usage of max_stack sysctl
We cannot limit processing stacks from the current value of the sysctl,
as we may be processing perf.data files, possibly from other machines.

Instead use the old PERF_MAX_STACK_DEPTH, the sysctl default, that can
be overriden using --max-stack or equivalent.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Fixes: 4cb93446c5 ("perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack")
Link: http://lkml.kernel.org/n/tip-eqeutsr7n7wy0c36z24ytvii@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:56 -03:00
Arnaldo Carvalho de Melo 4cb93446c5 perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack
There is an upper limit to what tooling considers a valid callchain,
and it was tied to the hardcoded value in the kernel,
PERF_MAX_STACK_DEPTH (127), now that this can be tuned via a sysctl,
make it read it and use that as the upper limit, falling back to
PERF_MAX_STACK_DEPTH for kernels where this sysctl isn't present.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-yjqsd30nnkogvj5oyx9ghir9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-27 10:29:07 -03:00
Arnaldo Carvalho de Melo 6125cc8dac perf script: Add --max-stack knob
Works just like with 'perf report'. In some cases we may want to have
more than 127 entries, the default maximum.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-mqkz2p5ok2978gztb0vsnocc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-14 19:46:57 -03:00
Jiri Olsa e0be62cc03 perf tools: Make -f/--force option documentation consistent across tools
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1458823940-24583-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 11:14:08 -03:00
Stephane Eranian dc323ce8e7 perf script: Enable printing of branch stack
This patch improves perf script by enabling printing of the
branch stack via the 'brstack' and 'brstacksym' arguments to
the field selection option -F. The option is off by default
and operates only if the perf.data file has branch stack content.

The branches are printed in to/from pairs. The most recent branch
is printed first. The number of branch entries vary based on the
underlying hardware and filtering used.

The brstack prints FROM/TO addresses in raw hexadecimal format.
The brstacksym prints FROM/TO addresses in symbolic form wherever
possible.

 $ perf script -F ip,brstack
  5d3000 0x401aa0/0x5d2000/M/-/-/-/0 ...

 $ perf script -F ip,brstacksym
  4011e0 noploop+0x0/noploop+0x0/P/-/-/0

The notation F/T/M/X/A/C describes the attributes of the branch.
F=from, T=to, M/P=misprediction/prediction, X=TSX, A=TSX abort, C=cycles (SKL)

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yuanfang Chen <cyfmxc@gmail.com>
Link: http://lkml.kernel.org/r/1441039273-16260-5-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-29 17:16:20 -03:00
Adrian Hunter 83e1986032 perf script: Allow time to be displayed in nanoseconds
Add option --ns to display time to 9 decimal places.  That is useful in
some cases, for example when using Intel PT cycle accurate mode.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 16:46:05 -03:00
Stephane Eranian fc36f9485a perf script: Enable printing of interrupted machine state
This patch adds the output of the interrupted machine state (iregs) to
perf script. It presents them  as NAME:VALUE so this is easy to parse
during post processing.

To capture the interrupted machine state:
   $ perf record -I ....

to display iregs, use the -F option:

   $ perf script -F ip,iregs
   40afc2   AX:0x6c5770    BX:0x1e    CX:0x5f4d80a    DX:0x101010101010101    SI:0x1

Signed-off-by: Stephane Eranian <eranian@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1441039273-16260-2-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-31 17:51:07 -03:00
Mark Drayton 77e0070da4 perf script: Add --[no-]-demangle/--[no-]-demangle-kernel
Sometimes when post-processing output from `perf script` one does not
want to demangle C++ symbol names. Add an option to allow this.

Also add --[no-]demangle-kernel to be consistent with top/report/probe.

Signed-off-by: Mark Drayton <mbd@fb.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1440616695-32340-1-git-send-email-scientist@fb.com
Signed-off-by: Yannick Brosseau <scientist@fb.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-28 11:47:40 -03:00
Adrian Hunter 60b88d8743 perf tools: Put itrace options into an asciidoc include
perf script, report and inject all have the same itrace options. Put
them into an asciidoc include file.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-10-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-21 11:40:44 -03:00
Andi Kleen a9710ba091 perf tools: Support full source file paths for srcline
For perf report/script srcline currently only the base file name of the
source file is printed. This is a good default because it usually fits
on the screen.

But in some cases we want to know the full file name, for example to
aggregate hits per file.

In the later case we need more than the base file name to resolve file
naming collisions: for example the kernel source has ~70 files named
"core.c"

It's also useful as input to post processing tools which want to point
to the right file.

Add a flag to allow full file name output.

Add an option to perf report/script to enable this option.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1438986245-15191-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-10 11:58:05 -03:00
Adrian Hunter 7c14898ba9 perf script: Add option --show-switch-events
Add option --show-switch-events to show switch events in a similar
fashion to --show-task-events and --show-mmap-events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1437471846-26995-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-07-23 22:51:14 -03:00
Adrian Hunter 53c76b0e9e perf auxtrace: Add option to synthesize events for transactions
Add AUX area tracing option 'x' to synthesize events for transactions.
This will be used by Intel PT to synthesize an event record for each TSX
start, commit or abort.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1430404667-10593-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-05-05 18:12:55 -03:00
Adrian Hunter 400ea6d327 perf script: Add field option 'flags' to print sample flags
Instruction tracing will typically have access to information about the
instruction being executed for a particular ip sample.  Some of that
information will be available in the 'flags' member of struct
perf_sample.

With the addition of transactions events synthesis to Instruction
Tracing options, there is a need to be able easily to see the flags
because they show whether the ip is at the start, commit or abort of a
tranasaction.

Consequently add an option to display the flags.

The flags are "bcrosyiABEx" which stand for branch, call, return,
conditional, system, asynchronous, interrupt, transaction abort, trace
begin, trace end, and in transaction, respectively.

Example using Intel PT:

perf script -fip,time,event,sym,addr,flags

...
 1288.721584105: branches:u:   bo              401146 main =>           401152 main
 1288.721584105: transactions:   x                   0           401164 main
 1288.721584105: branches:u:   bx              40117c main =>           40119b main
 1288.721584105: branches:u:   box             4011a4 main =>           40117e main
 1288.721584105: branches:u:   bcx             401187 main =>           401094 g
...
 1288.721591645: branches:u:   bx              4010c4 g =>           4010cb g
 1288.721591645: branches:u:   brx             4010cc g =>           401189 main
 1288.721591645: transactions:                       0           4011a6 main
 1288.721593199: branches:u:   b               4011a9 main =>           4011af main
 1288.721593199: branches:u:   bo              4011bc main =>           40113e main
 1288.721593199: branches:u:   b               401150 main =>           40115a main
 1288.721593199: transactions:   x                   0           401164 main
 1288.721593199: branches:u:   bx              40117c main =>           40119b main
 1288.721593199: branches:u:   box             4011a4 main =>           40117e main
 1288.721593199: branches:u:   bcx             401187 main =>           40105e f
...
 1288.722284747: branches:u:   brx             401093 f =>           401189 main
 1288.722284747: branches:u:   box             4011a4 main =>           40117e main
 1288.722284747: branches:u:   bcx             401187 main =>           40105e f
 1288.722285883: transactions:   bA                  0           401071 f
 1288.722285883: branches:u:   bA              401071 f =>           40116a main
 1288.722285883: branches:u:   bE              40116a main =>                0 [unknown]
 1288.722297174: branches:u:   bB                   0 [unknown] =>           40116a main
...

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1428594864-29309-26-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-04-29 10:37:57 -03:00
Adrian Hunter 7a680eb990 perf script: Add Instruction Tracing support
Add support for decoding an AUX area assuming it contains instruction
tracing data.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1428594864-29309-17-git-send-email-adrian.hunter@intel.com
[ Do not use -Z as an alternative to --itrace ]
[ Fixed initialization of itrace_synth_opts struct fields on older gcc versions ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-04-29 10:37:56 -03:00
David Ahern e03eaa400c perf tools: Add pid/tid filtering to report and script commands
The 'record' and 'top' tools already allow a user to specify a CSV of
pids and/or tids of tasks to collect data.

Add those options to the 'report' and 'script' analysis commands to only
consider samples related to the given pids/tids.

This is also inline with the existing comm option.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1427212361-7066-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-24 13:02:46 -03:00
Arnaldo Carvalho de Melo 48000a1aed perf tools: Remove EOL whitespaces
Janitorial stuff: boredom moment.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-u70i7shys3kths4hzru72bha@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-21 13:24:31 -03:00
Jiri Olsa 535aeaae7d perf script: Add period data column
Adding period data column to be displayed in perf script.  It's possible
to get period values using -f option, like:

  $ perf script -f comm,tid,time,period,ip,sym,dso
          :26019 26019 52414.329088:       3707  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
          :26019 26019 52414.329088:         44  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
          :26019 26019 52414.329093:       1987  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
          :26019 26019 52414.329093:          6  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
              ls 26019 52414.329442:     537558        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
              ls 26019 52414.329442:       2099        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
              ls 26019 52414.330181:    1242100        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
              ls 26019 52414.330181:       3774        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
              ls 26019 52414.331427:    1083662  ffffffff810c7dc2 update_curr ([kernel.kallsyms])
              ls 26019 52414.331427:        360  ffffffff810c7dc2 update_curr ([kernel.kallsyms])

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: "Jen-Cheng(Tommy) Huang" <tommy24@gatech.edu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1408977943-16594-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-17 15:21:30 -03:00
Masanari Iida 96355f2cfb perf Documentation: Fix typos in perf/Documentation
This patch fix spelling typos found in tool/perf/Documentation.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Randy Dunlap <rdunlap@infradead.org>
Link: http://lkml.kernel.org/r/1410275930-17207-1-git-send-email-standby24x7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-15 17:39:02 -03:00
Jiri Olsa e90debddf8 perf script: Add --header/--header-only options
Currently the perf.data header is always displayed for stdio output,
which is no always useful.

Disabling header information by default and adding following options to
control header output:

  --header      - display header information
  --header-only - display header information only w/o further
                  processing

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-0ehaawv5xc83w6ag03c5hi10@git.kernel.org
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1386583370-1699-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-10 16:51:07 -03:00
Adrian Hunter cc8fae1d81 perf script: Add an option to print the source line number
Add field 'srcline' that displays the source file name and line number
associated with the sample ip.  The information displayed is the same as
from addr2line.

 $ perf script -f comm,tid,pid,time,ip,sym,dso,symoff,srcline
            grep 10701/10701 2497321.421013:  ffffffff81043ffa native_write_msr_safe+0xa ([kernel.kallsyms])
  /usr/src/debug/kernel-3.9.fc17/linux-3.9.10-100.fc17.x86_64/arch/x86/include/asm/msr.h:95
            grep 10701/10701 2497321.421984:  ffffffff8165b6b3 _raw_spin_lock+0x13 ([kernel.kallsyms])
  /usr/src/debug/kernel-3.9.fc17/linux-3.9.10-100.fc17.x86_64/arch/x86/include/asm/spinlock.h:54
            grep 10701/10701 2497321.421990:  ffffffff810b64b3 tick_sched_timer+0x53 ([kernel.kallsyms])
  /usr/src/debug/kernel-3.9.fc17/linux-3.9.10-100.fc17.x86_64/kernel/time/tick-sched.c:840
            grep 10701/10701 2497321.421992:  ffffffff8106f63f run_timer_softirq+0x2f ([kernel.kallsyms])
  /usr/src/debug/kernel-3.9.fc17/linux-3.9.10-100.fc17.x86_64/kernel/timer.c:1372

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1386315778-11633-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-09 14:47:15 -03:00
Namhyung Kim ba1ddf42f3 perf script: Print mmap[2] events also
If --show-mmap-events option is given, also print internal MMAP and
MMAP2 events.  It would be helpful for debugging.

  $ perf script --show-mmap-events
  ...
           sleep  9486 [009] 3350640.335531: PERF_RECORD_MMAP 9486/9486: [0x400000(0x6000) @ 0]: x /usr/bin/sleep
           sleep  9486 [009] 3350640.335542: PERF_RECORD_MMAP 9486/9486: [0x3153a00000(0x223000) @ 0]: x /usr/lib64/ld-2.17.so
           sleep  9486 [009] 3350640.335553: PERF_RECORD_MMAP 9486/9486: [0x7fff8b5fe000(0x2000) @ 0x7fff8b5fe000]: x [vdso]
           sleep  9486 [009] 3350640.335643: PERF_RECORD_MMAP 9486/9486: [0x3153e00000(0x3c0000) @ 0]: x /usr/lib64/libc-2.17.so

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1385456066-26592-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-27 14:58:38 -03:00
Namhyung Kim ad7ebb9a48 perf script: Print comm, fork and exit events also
If --show-task-events option is given, also print internal COMM, FORK
and EXIT events.  It would be helpful for debugging.

  $ perf script --show-task-events
  ...
         swapper     0 [009] 3350640.335261: sched:sched_switch: prev_comm=swapper/9
           sleep  9486 [009] 3350640.335509: PERF_RECORD_COMM: sleep:9486
           sleep  9486 [009] 3350640.335806: sched:sched_stat_runtime: comm=sleep pid=9486
         firefox  2635 [003] 3350641.275896: PERF_RECORD_FORK(2635:9487):(2635:2635)
         firefox  2635 [003] 3350641.275896: sched:sched_process_fork: comm=firefox pid=2635
           sleep  9486 [009] 3350641.336009: PERF_RECORD_EXIT(9486:9486):(9486:9486)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1385455873-25865-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-27 14:58:38 -03:00
Akihiro Nagai 0bc8d20580 perf script: Add option resolving vmlinux path
Add the option get the path of [kernel.kallsyms].
Specify '--show-kernel-path' option to use this function.
This patch enables other applications to use this output easily.

Without --show-kernel-path  option

ffffffff81467612 irq_return ([kernel.kallsyms])
ffffffff81467612 irq_return ([kernel.kallsyms])
    7f24fc02a6b3 _start (/lib64/ld-2.14.so)
[snip]

With --show-kernel-path option

ffffffff81467612 irq_return (/lib/modules/3.2.0+/build/vmlinux)
ffffffff81467612 irq_return (/lib/modules/3.2.0+/build/vmlinux)
    7f24fc02a6b3 _start (/lib64/ld-2.14.so)
[snip]

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20120130044320.2384.73322.stgit@linux3
Signed-off-by: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-01-30 18:13:07 -02:00
Akihiro Nagai a978f2ab41 perf script: Add the offset field specifier
Add the offset field specifier 'symoff' to show the offset from
the symbols in the output of perf-script. We can get the more
detailed address information.

Output sample:
ffffffff81467612 irq_return+0x0 => 301ec016b0 _start+0x0
ffffffff81467612 irq_return+0x0 => 301ec016b0 _start+0x0
      301ec016b3 _start+0x3     => 301ec04b70 _dl_start+0x0
ffffffff81467612 irq_return+0x0 => 301ec04b70 _dl_start+0x0
ffffffff81467612 irq_return+0x0 => 301ec04b96 _dl_start+0x26
ffffffff81467612 irq_return+0x0 => 301ec04b9d _dl_start+0x2d
      301ec04beb _dl_start+0x7b => 301ec04c0d _dl_start+0x9d
      301ec04c11 _dl_start+0xa1 => 301ec04bf0 _dl_start+0x80
[snip]

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20120130044314.2384.67094.stgit@linux3
Signed-off-by: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-01-30 18:09:21 -02:00
Robert Richter efad14150a perf report: Accept fifos as input file
The default input file for perf report is not handled the same way as
perf record does it for its output file. This leads to unexpected
behavior of perf report, etc. E.g.:

 # perf record -a -e cpu-cycles sleep 2 | perf report | cat
 failed to open perf.data: No such file or directory  (try 'perf record' first)

While perf record writes to a fifo, perf report expects perf.data to be
read. This patch changes this to accept fifos as input file.

Applies to the following commands:

 perf annotate
 perf buildid-list
 perf evlist
 perf kmem
 perf lock
 perf report
 perf sched
 perf script
 perf timechart

Also fixes char const* -> const char* type declaration for filename
strings.

v2:
* Prevent potential null pointer access to input_name in
  builtin-report.c. Needed due to removal of patch "perf report: Setup
  browser if stdout is a pipe"

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323248577-11268-5-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-12-23 17:01:03 -02:00
David Ahern e7984b7bee perf script: Add comm filtering option
Allows collecting events system wide and then pulling out events for a
specific task name(s). e.g,

    perf script -c gnome-shell,gnome-terminal

Applies on top of:
    https://lkml.org/lkml/2011/11/13/74

v2->v3
- update Documentation

v1->v2
- use comm_list from symbol_conf

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1321894972-24246-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28 11:48:23 -02:00
David Ahern c8e6672035 perf tools: make -C consistent across commands (for cpu list arg)
Currently the meaning of -C varies by perf command: for perf-top,
perf-stat, perf-record it means cpu list. For perf-report it means comm
list. Then perf-annotate, perf-report and perf-script use -c for cpu
list.

Fix annotate, report and script to use -C for cpu list to be consistent
with top, stat and record. This means report needs to use -c for comm
list which does introduce a backward compatibility change.

v1 -> v2
- update perf-script.txt too

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1321209008-7004-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28 11:45:53 -02:00
Stephane Eranian fbe96f29ce perf tools: Make perf.data more self-descriptive (v8)
The goal of this patch is to include more information about the host
environment into the perf.data so it is more self-descriptive. Overtime,
profiles are captured on various machines and it becomes hard to track
what was recorded, on what machine and when.

This patch provides a way to solve this by extending the perf.data file
with basic information about the host machine. To add those extensions,
we leverage the feature bits capabilities of the perf.data format.  The
change is backward compatible with existing perf.data files.

We define the following useful new extensions:
 - HEADER_HOSTNAME: the hostname
 - HEADER_OSRELEASE: the kernel release number
 - HEADER_ARCH: the hw architecture
 - HEADER_CPUDESC: generic CPU description
 - HEADER_NRCPUS: number of online/avail cpus
 - HEADER_CMDLINE: perf command line
 - HEADER_VERSION: perf version
 - HEADER_TOPOLOGY: cpu topology
 - HEADER_EVENT_DESC: full event description (attrs)
 - HEADER_CPUID: easy-to-parse low level CPU identication

The small granularity for the entries is to make it easier to extend
without breaking backward compatiblity. Many entries are provided as
ASCII strings.

Perf report/script have been modified to print the basic information as
easy-to-parse ASCII strings. Extended information about CPU and NUMA
topology may be requested with the -I option.

Thanks to David Ahern for reviewing and testing the many versions of
this patch.

 $ perf report --stdio
 # ========
 # captured on : Mon Sep 26 15:22:14 2011
 # hostname : quad
 # os release : 3.1.0-rc4-tip
 # perf version : 3.1.0-rc4
 # arch : x86_64
 # nrcpus online : 4
 # nrcpus avail : 4
 # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
 # cpuid : GenuineIntel,6,15,11
 # total memory : 8105360 kB
 # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
 # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
 # HEADER_CPU_TOPOLOGY info available, use -I to display
 # HEADER_NUMA_TOPOLOGY info available, use -I to display
 # ========
 #
 ...

 $ perf report --stdio -I
 # ========
 # captured on : Mon Sep 26 15:22:14 2011
 # hostname : quad
 # os release : 3.1.0-rc4-tip
 # perf version : 3.1.0-rc4
 # arch : x86_64
 # nrcpus online : 4
 # nrcpus avail : 4
 # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
 # cpuid : GenuineIntel,6,15,11
 # total memory : 8105360 kB
 # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
 # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
 # sibling cores   : 0-3
 # sibling threads : 0
 # sibling threads : 1
 # sibling threads : 2
 # sibling threads : 3
 # node0 meminfo  : total = 8320608 kB, free = 7571024 kB
 # node0 cpu list : 0-3
 # ========
 #
 ...

Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20110930134040.GA5575@quad
Signed-off-by: Stephane Eranian <eranian@google.com>
[ committer notes: Use --show-info in the tools as was in the docs, rename
  perf_header_fprintf_info to perf_file_section__fprintf_info, fixup
  conflict with f69b64f7 "perf: Support setting the disassembler style" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-07 17:01:24 -03:00
Anton Blanchard 5d67be97f8 perf report/annotate/script: Add option to specify a CPU range
Add an option to perf report/annotate/script to specify which
CPUs to operate on. This enables us to take a single system wide
profile and analyse each CPU (or group of CPUs) in isolation.

This was useful when profiling a multiprocess workload where the
bottleneck was on one CPU but this was hidden in the overall
profile. Per process and per thread breakdowns didn't help
because multiple processes were running on each CPU and no
single process consumed an entire CPU.

The patch converts the list of CPUs returned by cpu_map__new
into a bitmap for fast lookup. I wanted to use -C to be
consistent with perf top/record/stat, but unfortunately perf
report already uses -C <comms>.

 v2: Incorporate suggestions from David Ahern:
	- Added -c to perf script
	- Check that SAMPLE_CPU is set when -c is used
	- Update documentation

 v3: Create perf_session__cpu_bitmap()

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-05 10:44:44 +02:00
David Ahern 7cec092238 perf script: Add printing of sample address
Resolve to a function or variable if possible and if the sym option is
enabled.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306782503-22002-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:31:01 -03:00
David Ahern 610723f24e perf script: Make printing of dso a separate field option
The 'sym' option displays both the function name and the DSO it comes
from. Split the display of the dso into a separate option.  This allows
display of the ip address and symbol without the dso, thus shortening
line lengths - and decluttering the output a bit.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306528124-25861-3-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:29:14 -03:00
David Ahern 787bef174f perf script: "sym" field really means show IP data
Currently the "sym" output field is used to dump instruction pointers
and callchain stack. Sample addresses can also be converted to symbols,
so the meaning of "sym" needs to be fixed. This patch adds an "ip"
option and if it is selected the user can also opt to dump symbols for
them. If the user opts to dump IP without syms only the address is
shown.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306528124-25861-2-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:28:34 -03:00
Arnaldo Carvalho de Melo 176fcc5c5f perf script: Add more documentation about the -f/--fields parameters
Using the commit log for 2c9e45f.

Cc: David Ahern <daahern@cisco.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-30 15:30:43 -03:00
David Ahern 1424dc9680 perf script: Add support for H/W and S/W events
Custom fields set for each type by prepending field argument with type.
For file with multiple event types (e.g., trace and S/W) display of an
event type suppressed by setting output fields to "".

e.g.,
perf record -ga -e sched:sched_switch -e cpu-clock -c 10000000 -R -- sleep 1
perf script

openssl 11496 [000]  9711.807107: cpu-clock-msecs:
        ffffffff810c22dc arch_local_irq_restore ([kernel.kallsyms])
        ffffffff810c518c __alloc_pages_nodemask ([kernel.kallsyms])
        ffffffff810297b2 pte_alloc_one ([kernel.kallsyms])
        ffffffff810d8b98 __pte_alloc ([kernel.kallsyms])
        ffffffff810daf07 handle_mm_fault ([kernel.kallsyms])
        ffffffff8138763a do_page_fault ([kernel.kallsyms])
        ffffffff81384a65 page_fault ([kernel.kallsyms])
            7f6130507d70 asn1_check_tlen (/lib64/libcrypto.so.1.0.0c)
                       0  ()

         openssl 11496 [000]  9711.808042: sched_switch: prev_comm=openssl ...
     kworker/0:0     4 [000]  9711.808067: sched_switch: prev_comm=kworker/...
         swapper     0 [001]  9711.808090: sched_switch: prev_comm=kworker/...
            sshd 11451 [001]  9711.808185: sched_switch: prev_comm=sshd pre...
swapper     0 [001]  9711.816155: cpu-clock-msecs:
        ffffffff81023609 native_safe_halt ([kernel.kallsyms])
        ffffffff8100132a cpu_idle ([kernel.kallsyms])
        ffffffff8137cf9b start_secondary ([kernel.kallsyms])

openssl 11496 [000]  9711.817104: cpu-clock-msecs:
            7f61304ad723 AES_cbc_encrypt (/lib64/libcrypto.so.1.0.0c)
            7fff3402f950  ()
        12f0debc9a785634  ()

swapper     0 [001]  9711.826155: cpu-clock-msecs:
        ffffffff81023609 native_safe_halt ([kernel.kallsyms])
        ffffffff8100132a cpu_idle ([kernel.kallsyms])
        ffffffff8137cf9b start_secondary ([kernel.kallsyms])

To suppress trace events within the file and use default output for S/W events:
perf script -f trace:

or to suppress S/W events and do default display for trace events:
perf script -f sw:

Custom field selections:
perf script -f sw:comm,tid,time -f trace:time,trace

         openssl 11496  9711.797162:
         swapper     0  9711.807071:
         openssl 11496  9711.807107:
 9711.808042: prev_comm=openssl prev_pid=11496 prev_prio=120 prev_state=R ...
 9711.808067: prev_comm=kworker/0:0 prev_pid=4 prev_prio=120 prev_state=S ...
 9711.808090: prev_comm=kworker/0:0 prev_pid=0 prev_prio=120 prev_state=R ...
 9711.808185: prev_comm=sshd prev_pid=11451 prev_prio=120 prev_state=S ==>...
         swapper     0  9711.816155:
         openssl 11496  9711.817104:
         swapper     0  9711.826155:

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1299734608-5223-7-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-14 17:07:20 -03:00
David Ahern c0230b2bfb perf script: Add support for dumping symbols
Add option to dump symbols found in events.

e.g., perf script -f comm,pid,tid,time,trace,sym

swapper     0/0       537.037184: prev_comm=swapper prev_pid=0 prev_prio=120...
        ffffffff81030350 perf_trace_sched_switch ([kernel.kallsyms])
        ffffffff81382ac5 schedule ([kernel.kallsyms])
        ffffffff8100134a cpu_idle ([kernel.kallsyms])
        ffffffff81370b39 rest_init ([kernel.kallsyms])
        ffffffff81696c23 start_kernel ([kernel.kallsyms].init.text)
        ffffffff816962af x86_64_start_reservations ([kernel.kallsyms].init.text)
        ffffffff816963b9 x86_64_start_kernel ([kernel.kallsyms].init.text)

sshd  1675/1675    537.037309: prev_comm=sshd prev_pid=1675 prev_prio=120...
        ffffffff81030350 perf_trace_sched_switch ([kernel.kallsyms])
        ffffffff81382ac5 schedule ([kernel.kallsyms])
        ffffffff813837aa schedule_hrtimeout_range_clock ([kernel.kallsyms])
        ffffffff81383886 schedule_hrtimeout_range ([kernel.kallsyms])
        ffffffff8110c4f9 poll_schedule_timeout ([kernel.kallsyms])
        ffffffff8110cd20 do_select ([kernel.kallsyms])
        ffffffff8110ced8 core_sys_select ([kernel.kallsyms])
        ffffffff8110d00d sys_select ([kernel.kallsyms])
        ffffffff81002bc2 system_call ([kernel.kallsyms])
            7f1647e56e93 __GI_select (/lib64/libc-2.12.90.so)

netstat  1692/1692    537.038664: prev_comm=netstat prev_pid=1692 prev_prio=...
        ffffffff81030350 perf_trace_sched_switch ([kernel.kallsyms])
        ffffffff81382ac5 schedule ([kernel.kallsyms])
        ffffffff81002c3a sysret_careful ([kernel.kallsyms])
            7f7a6cd1b210 __GI___libc_read (/lib64/libc-2.12.90.so)

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1299734608-5223-6-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-14 17:06:50 -03:00
David Ahern 745f43e343 perf script: Support custom field selection for output
Allow a user to select which fields to print to stdout for event data.
Options include comm (command name), tid (thread id), pid (process id),
time (perf timestamp), cpu, event (for event name), and trace (for
trace data).

Default is set to maintain compatibility with current output; this
feature does alter output format slightly -- no '-' between command
and pid/tid.

Thanks to Frederic Weisbecker for detailed suggestions on this approach.

Examples (output compressed)

1. trace, default format

perf record -ga -e sched:sched_switch
perf script

swapper    0 [000] 537.037184: sched_switch: prev_comm=swapper prev_pid=0...
   sshd 1675 [000] 537.037309: sched_switch: prev_comm=sshd prev_pid=1675...
netstat 1692 [001] 537.038664: sched_switch: prev_comm=netstat prev_pid=1692...

2. trace, custom format

perf record -ga -e sched:sched_switch
perf script -f comm,pid,time,trace     <--- omitting cpu and event name

swapper    0 537.037184: prev_comm=swapper prev_pid=0 prev_prio=120 ...
   sshd 1675 537.037309: prev_comm=sshd prev_pid=1675 prev_prio=120 ...
netstat 1692 537.038664: prev_comm=netstat prev_pid=1692 prev_prio=120 ...

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1299734608-5223-5-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-14 17:06:16 -03:00