Commit Graph

18623 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo f7004f5990 perf evsel: Switch to libperf's cpumap.h
We don't need what is in perf's util/cpumap.h, just the struct cpu_map
that is in libperf's internal/cpumap.h file to cover this one case:

  tools/perf/util/evsel.h:215:27: error: dereferencing pointer to incomplete type ‘struct perf_cpu_map’
  215 |  return evsel__cpus(evsel)->nr;

So switch to libperf's cpumap.h and add some missing struct foward
declarations and include sys/types.h to get pid_t.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ufjkpohijti05ggk69s91ktf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:57 -03:00
Arnaldo Carvalho de Melo 1028f96226 perf x86 kvm-stat: Add missing string.h header
It uses strcmp(), strstr() and was getting the required string.h header
by luck, from evsel.h -> cpumap.h -> debug.h -> string.h, add the
missing header.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-qrz8hhvrhwnmt5ocfwk4br5d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:57 -03:00
Arnaldo Carvalho de Melo 43cc5d5ecb perf evsel: util/evsel.h needs stdio.h as it uses FILE
And it was getting it by luck from util/cpumap.h that shouldn't be
included in util/evsel.h as it only needs what is in libperf, i.e.
struct cpu_map, that is in internal/cpumap.h, so add stdio.h before
we fix that.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-2ywx5sl031tj3zske7c7edgv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:57 -03:00
Arnaldo Carvalho de Melo a06b7f422d perf evsel: Remove needless stddef.h from util/evsel.h
We added it in 07ac002f2f ("perf evsel: Introduce is_group_member
method") but we already ditched that function, and there was nothing
else left that needed NULL nor anything else from stddef.h, ditch it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-1zy0xfsy61x81f3fpyx5znco@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:57 -03:00
Arnaldo Carvalho de Melo ddee688a83 perf evsel: Remove needless counts.h header from util/evsel.h
We need only a struct forward declaration, so prune the header
dependency tree a bit more.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-oqvgf04w4ku8xasrz79zquim@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:57 -03:00
Arnaldo Carvalho de Melo 69714a4e39 perf evsel: Add missing perf/evsel.h header in util/evsel.h
Since util/evsel.h uses perf_evsel__cpus() that has its prototype in
libperf's perf/evsel.h file, we need it explicitely included.

This was working by luck as util/evsel.h includes counts.h, but that is
not necessary, just some forward declarations, so, before we remove
counts.h from util/evsel.h, add what is realli needed.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-nfb9e0t4jm9zhvr0q86hc29d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:57 -03:00
Arnaldo Carvalho de Melo 430482c2e3 perf scripting python: Add missing counts.h header
It is getting this via evsel.h, that don't strictly need counts.h, just
forward declarations for some structs, so add it here before we remove
it from there.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-6bxk3ltwkw91qcld2ot86bgg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:57 -03:00
Arnaldo Carvalho de Melo bfc49182c6 perf stat: Add missing counts.h
It is getting this via evsel.h, that don't strictly need counts.h, just
forward declarations for some structs, so add it here before we remove
it from there.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-jwcbm9gv9llloe3he5qkdefs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:57 -03:00
Arnaldo Carvalho de Melo e4aec1b1bd perf tests: Add missing counts.h
Those are getting counts.h via evsel.h, that don't strictly need
counts.h, just forward declarations for some structs, so add it here
before we remove it from there.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-phldqlfxxu563txja7evd4zt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:57 -03:00
Arnaldo Carvalho de Melo 0f31c0195c perf script: Add missing counts.h
It is getting this via evsel.h, that don't strictly need counts.h, just
forward declarations for some structs, so add it here before we remove
it from there.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-q4shpvlxyjqz7val1hyrdak9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:56 -03:00
Arnaldo Carvalho de Melo e14e5497d5 perf evlist: Add missing xyarray.h header
It gets it very indirectly, via evsel.h -> counts.h, and since counts.h
doesn't need xyarray.h at all, add it here before we remove it there.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-hkizv6gojwfklj9ezaiiztll@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:56 -03:00
Arnaldo Carvalho de Melo 964f384989 perf bpf: Add missing xyarray.h header
This was being obtained indirectly via evsel.h -> counts.h, since we
don't need xyarray in counts.h, we need to add it here explicitely
before removing it from counts.h.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-jirmxg527i82yz31bwad9we7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:56 -03:00
Arnaldo Carvalho de Melo 2d64ae9b85 perf counts: Add missing headers needed for types used
We get these by sheer luck, since we're cleaning unneeded headers use,
this needs to be done first to avoid breakage down the line.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-p7bncbi53t4p2kobkbmu86a4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:56 -03:00
Arnaldo Carvalho de Melo 7646602401 perf evsel: Move xyarray.h from evsel.c to evsel.h to reduce include dep tree
All we need in util/evsel.h is the foward declaration of 'struct
xyarray', not the internal/xyarray.h, that can be moved to util/evsel.c
and then we reduce the header dependency tree.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-wwqce6ixwcyq6yzx3ljrdm80@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:56 -03:00
Arnaldo Carvalho de Melo 0b8026e8fb perf metricgroup: Remove needless includes from metricgroup.h
There we need just some struct forward declarations, do that instead and
add the includes needed by metricgroup.c.

That should help with needless rebuilds when changing the removed
headers from metricgroup.h.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-1fkskjws6imir2hhztqhdyb0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:56 -03:00
Arnaldo Carvalho de Melo e740ca86f3 perf kvm s390: Add missing string.h header
It uses strstr(), needs to include string.h or its not going to build
when we remove string.h from the place it is getting from indirectly, by
luck.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-72y0i0uiaqght5b83e3ae7p4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:56 -03:00
Arnaldo Carvalho de Melo 45a2c0ccf6 perf arm64: Add missing debug.h header
This file uses pr_debug() but isn't including debug.h, getting it by
luck, fix it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-t7pisnsdfh88kclpw52jcwl7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 15:09:18 -03:00
Jiri Olsa b4df75de3b libperf: Move perf's cpu_map__idx() to perf_cpu_map__idx()
As an internal function that will be used by both perf and libperf, but
is not exported at this point.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190822111141.25823-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 11:18:45 -03:00
Jiri Olsa 315c0a1f0c libperf: Move perf's cpu_map__empty() to perf_cpu_map__empty()
So it's part of the libperf library as one of basic functions operating
on the perf_cpu_map class.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190822111141.25823-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 11:17:03 -03:00
Jiri Olsa 6549cd8f2c perf tools: Use perf_cpu_map__nr instead of cpu_map__nr
Switch the rest of the perf code to use libperf's perf_cpu_map__nr(),
which is the same as current cpu_map__nr() and remove the cpu_map__nr()
function.

Link: http://lkml.kernel.org/n/tip-6e0guy75clis7nm0xpuz9fga@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190822111141.25823-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 11:14:54 -03:00
Jiri Olsa db9a5fd02a tools headers: Add missing perf_event.h include
We need perf_event.h include for 'struct perf_event_mmap_page'.

Link: http://lkml.kernel.org/n/tip-bolqkmqajexhccjb0ib0an8w@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190822111141.25823-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 11:12:36 -03:00
Ingo Molnar 4e92b18e5b perf/core improvements and fixes:
callchains:
 
    Alexey Budankov:
 
   - Allow collecting LBR together with DWARF callchains, for workloads
     where the userspace stack size collected is not big enough for
     pure DWARF based unwinding.
 
   - Dump the LBR call stack in 'perf report -D'.
 
 perf top:
 
   Arnaldo Carvalho de Melo:
 
   - Show visual cue at start to state that the minimal set of samples
     are being collected prior to sorting/bucketizing/displaying.
 
 CoreSight (ARM hardware tracing):
 
   Leo Yan:
 
   - Support sample flags 'insn' and 'insnlen'.
 
 core:
 
   Adrian Hunter:
 
   - Add comment for 'idx' member in 'struct perf_sample_id.
 
 tools headers:
 
   Arnaldo Carvalho de Melo:
 
   - Synchronize linux/bits.h, which required grabbing a copy of the kernel
     const.h headers and some changes in the ordering of header directories.
 
   - Sync x86's asm/cpufeatures.h with the with the kernel, no change in
     any of the tools.
 
 libperf:
 
   Jiri Olsa:
 
   - Fix arch include paths.
 
 libtraceevent:
 
   Steven Rostedt (VMware):
 
   - Fix "robust" test of do_generate_dynamic_list_file.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXVxFFAAKCRCyPKLppCJ+
 JwebAQCp3nc/q4rpE6v9W8b1uMAsmELl5q8D+74/1ZWT4dqPFAEA9ZmtsIi/7ZjV
 Dq3403hFUfk1kDacEvUXKgzi3PMq+Qc=
 =DK9x
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-5.4-20190820' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

callchains:

   Alexey Budankov:

  - Allow collecting LBR together with DWARF callchains, for workloads
    where the userspace stack size collected is not big enough for
    pure DWARF based unwinding.

  - Dump the LBR call stack in 'perf report -D'.

perf top:

  Arnaldo Carvalho de Melo:

  - Show visual cue at start to state that the minimal set of samples
    are being collected prior to sorting/bucketizing/displaying.

CoreSight (ARM hardware tracing):

  Leo Yan:

  - Support sample flags 'insn' and 'insnlen'.

core:

  Adrian Hunter:

  - Add comment for 'idx' member in 'struct perf_sample_id.

tools headers:

  Arnaldo Carvalho de Melo:

  - Synchronize linux/bits.h, which required grabbing a copy of the kernel
    const.h headers and some changes in the ordering of header directories.

  - Sync x86's asm/cpufeatures.h with the with the kernel, no change in
    any of the tools.

libperf:

  Jiri Olsa:

  - Fix arch include paths.

libtraceevent:

  Steven Rostedt (VMware):

  - Fix "robust" test of do_generate_dynamic_list_file.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-08-20 21:38:22 +02:00
Ingo Molnar 51c359c2fd Linux 5.3-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl1Zw6ceHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGbiUH/0kqDBzkpne1odxW
 LeAPtTgmxDbcOE/bgIk374e95mn3EP3arna01BLc5ztwkQ521f4Iw5mKW5InZcyu
 3/IvpYeQUcdazphWSu72VnUZ8QfYNh4NJDjAx6iyliQ1NpJF9LLYLWWjlqwGbWHQ
 USbwp7A+56m1AWWmce2r50DK7jEZShKxRBQrXNXtvn8+YaVMvmdZpT6ejyG52J+4
 zr9yYrT9sa5jcPGPnWN/sx03/BPij+yOFKKe8L9vprb3uEmNKPvqtAbUpI0QYw6j
 T+eZELLxAOsUk84kxQyTLCU/GMesP6hIaE93HlpmgcQkBBzK7H5SBN37r8OJjOeS
 IXlJX4c=
 =9Iey
 -----END PGP SIGNATURE-----

Merge tag 'v5.3-rc5' into perf/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-08-20 21:38:05 +02:00
Jiri Olsa b81d39c7a1 libperf: Fix arch include paths
Guenter Roeck reported problem with compilation when the ARCH is
specified:

  $ make ARCH=x86_64
  In file included from tools/include/asm/atomic.h:6:0,
                   from include/linux/atomic.h:5,
                   from tools/include/linux/refcount.h:41,
                   from cpumap.c:4: tools/include/asm/../../arch/x86/include/asm/atomic.h:11:10:
  fatal error: asm/cmpxchg.h: No such file or directory

The problem is that we don't use SRCARCH (the sanitized ARCH version)
and we don't get the proper include path.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 3143504918 ("libperf: Make libperf.a part of the perf build")
Link: http://lkml.kernel.org/r/20190820124624.GG24105@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:29:36 -03:00
Arnaldo Carvalho de Melo 42fc2e9ef9 tools headers: Fixup bitsperlong per arch includes
We were getting the file by luck, from one of the paths in -I, fix it to
get it from the proper place:

  $ cd tools/include/uapi/asm/
  [acme@quaco asm]$ grep include bitsperlong.h
  #include "../../arch/x86/include/uapi/asm/bitsperlong.h"
  #include "../../arch/arm64/include/uapi/asm/bitsperlong.h"
  #include "../../arch/powerpc/include/uapi/asm/bitsperlong.h"
  #include "../../arch/s390/include/uapi/asm/bitsperlong.h"
  #include "../../arch/sparc/include/uapi/asm/bitsperlong.h"
  #include "../../arch/mips/include/uapi/asm/bitsperlong.h"
  #include "../../arch/ia64/include/uapi/asm/bitsperlong.h"
  #include "../../arch/riscv/include/uapi/asm/bitsperlong.h"
  #include "../../arch/alpha/include/uapi/asm/bitsperlong.h"
  #include <asm-generic/bitsperlong.h>
  $ ls -la ../../arch/x86/include/uapi/asm/bitsperlong.h
  ls: cannot access '../../arch/x86/include/uapi/asm/bitsperlong.h': No such file or directory
  $ ls -la ../../../arch/*/include/uapi/asm/bitsperlong.h
  -rw-rw-r--. 1 237 ../../../arch/alpha/include/uapi/asm/bitsperlong.h
  -rw-rw-r--. 1 841 ../../../arch/arm64/include/uapi/asm/bitsperlong.h
  -rw-rw-r--. 1 966 ../../../arch/hexagon/include/uapi/asm/bitsperlong.h
  -rw-rw-r--. 1 234 ../../../arch/ia64/include/uapi/asm/bitsperlong.h
  -rw-rw-r--. 1 100 ../../../arch/microblaze/include/uapi/asm/bitsperlong.h
  -rw-rw-r--. 1 244 ../../../arch/mips/include/uapi/asm/bitsperlong.h
  -rw-rw-r--. 1 352 ../../../arch/parisc/include/uapi/asm/bitsperlong.h
  -rw-rw-r--. 1 312 ../../../arch/powerpc/include/uapi/asm/bitsperlong.h
  -rw-rw-r--. 1 353 ../../../arch/riscv/include/uapi/asm/bitsperlong.h
  -rw-rw-r--. 1 292 ../../../arch/s390/include/uapi/asm/bitsperlong.h
  -rw-rw-r--. 1 323 ../../../arch/sparc/include/uapi/asm/bitsperlong.h
  -rw-rw-r--. 1 320 ../../../arch/x86/include/uapi/asm/bitsperlong.h
  $

Found while fixing some other problem, before it was escaping the
tools/ chroot and using stuff in the kernel sources:

    CC       /tmp/build/perf/util/find_bit.o
In file included from /git/linux/tools/include/../../arch/x86/include/uapi/asm/bitsperlong.h:11,
                 from /git/linux/tools/include/uapi/asm/bitsperlong.h:3,
                 from /git/linux/tools/include/linux/bits.h:6,
                 from /git/linux/tools/include/linux/bitops.h:13,
                 from ../lib/find_bit.c:17:

  # cd /git/linux/tools/include/../../arch/x86/include/uapi/asm/
  # pwd
  /git/linux/arch/x86/include/uapi/asm
  #

Now it is getting the one we want it to, i.e. the one inside tools/:

    CC       /tmp/build/perf/util/find_bit.o
  In file included from /git/linux/tools/arch/x86/include/uapi/asm/bitsperlong.h:11,
                   from /git/linux/tools/include/linux/bits.h:6,
                   from /git/linux/tools/include/linux/bitops.h:13,

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-8f8cfqywmf6jk8a3ucr0ixhu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:23:00 -03:00
Arnaldo Carvalho de Melo 5c959b6d8f perf top: Show info message while collecting samples
Give visual cue about what is happening while initially collecting the
minimal set of samples to collect/sort/display.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-xcui60p1v6ozijfam2o89ya8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:22:39 -03:00
Arnaldo Carvalho de Melo 2284cf8074 perf ui browser: Allow specifying message to show when no samples are available to display
The 'perf top' tool will use that to avoid having a initial blank screen
while collecting the minimum number of samples to sort and display.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-89ciceg8cy4442he3t0jzo3f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:22:18 -03:00
Arnaldo Carvalho de Melo 9b01611934 perf ui: Introduce non-interactive ui__info_window() function
Sometimes we want just to print a message on the center of the screen,
like in 'perf top' while we wait for the minimum amount of samples to be
collected before sorting and showing them.

Also expose __ui__info_window() as an optimization for cases where such
message is to be printed while holding the ui lock.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-uat0f89vfwl2w52kv9wzwd8a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:21:49 -03:00
Arnaldo Carvalho de Melo 9e79ff77e4 perf ui: Make 'exit_msg' optional in ui__question_window()
We will not need it when refactoring this function to be
non-interactive, so make it optional.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-pnx1dn17bsz7lqt9ty95nnjx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:21:27 -03:00
Leo Yan a4973d8f7b perf cs-etm: Support sample flags 'insn' and 'insnlen'
The synthetic branch and instruction samples are missed to set
instruction related info, thus the perf tool fails to display samples
with flags '-F,+insn,+insnlen'.

The CoreSight trace decoder provides sufficient information to decide
the instruction size based on the ISA type: A64/A32 instructions are
32-bit size, but one exception is the T32 instruction size, which might
be 32-bit or 16-bit.

This patch handles these cases and it reads the instruction values from
DSO file; thus can support the flags '-F,+insn,+insnlen'.

Before:

  # perf script -F,insn,insnlen,ip,sym
                0 [unknown] ilen: 0
     ffff97174044 _start ilen: 0
     ffff97174938 _dl_start ilen: 0
     ffff97174938 _dl_start ilen: 0
     ffff97174938 _dl_start ilen: 0
     ffff97174938 _dl_start ilen: 0
     ffff97174938 _dl_start ilen: 0
     ffff97174938 _dl_start ilen: 0
     ffff97174938 _dl_start ilen: 0
     ffff97174938 _dl_start ilen: 0

  [...]

After:

  # perf script -F,insn,insnlen,ip,sym
                0 [unknown] ilen: 0
     ffff97174044 _start ilen: 4 insn: 2f 02 00 94
     ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
     ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
     ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
     ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
     ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
     ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
     ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
     ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54

  [...]

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190815082854.18191-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:20:52 -03:00
Alexey Budankov 10ccbc1cc0 perf report: Prefer DWARF callstacks to LBR ones when captured both
Display DWARF based callchains when the perf.data file contains raw thread
stack data as LBR callstack data.

Commiter testing:

This changes the output from the branch stack based one, i.e. without
this patch, for the same file as in the previous csets:

  # perf report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 13  of event 'cycles'
  # Event count (approx.): 13
  #
  # Overhead  Command  Source Shared Object  Source Symbol                Target Symbol                              Basic Block Cycles
  # ........  .......  ....................  ...........................  .........................................  ..................
  #
       7.69%  ls       libpthread-2.29.so    [.] _init                    [.] __pthread_initialize_minimal_internal  6827
       7.69%  ls       ld-2.29.so            [k] _start                   [k] _dl_start                              -
       7.69%  ls       ld-2.29.so            [.] _dl_start_user           [.] _dl_init                               -24790
       7.69%  ls       ld-2.29.so            [k] _dl_start                [k] _dl_sysdep_start                       278
       7.69%  ls       ld-2.29.so            [k] dl_main                  [k] _dl_map_object_deps                    15581
       7.69%  ls       ld-2.29.so            [k] open_verify.constprop.0  [k] lseek64                                4228
       7.69%  ls       ld-2.29.so            [k] _dl_map_object           [k] open_verify.constprop.0                55
       7.69%  ls       ld-2.29.so            [k] openaux                  [k] _dl_map_object                         67
       7.69%  ls       ld-2.29.so            [k] _dl_map_object_deps      [k] 0x00007f441b57c090                     112
       7.69%  ls       ld-2.29.so            [.] call_init.part.0         [.] _init                                  334
       7.69%  ls       ld-2.29.so            [.] _dl_init                 [.] call_init.part.0                       383
       7.69%  ls       ld-2.29.so            [k] _dl_sysdep_start         [k] dl_main                                45
       7.69%  ls       ld-2.29.so            [k] _dl_catch_exception      [k] openaux                                116

  #
  # (Tip: For memory address profiling, try: perf mem record / perf mem report)
  #

To the one that shows call chains:

  # perf report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 10  of event 'cycles'
  # Event count (approx.): 3204047
  #
  # Children      Self  Command  Shared Object       Symbol
  # ........  ........  .......  ..................  .........................................
  #
      55.01%     0.00%  ls       [kernel.vmlinux]    [k] entry_SYSCALL_64_after_hwframe
              |
              ---entry_SYSCALL_64_after_hwframe
                 do_syscall_64
                 |
                  --16.01%--__x64_sys_execve
                            __do_execve_file.isra.0
                            search_binary_handler
                            load_elf_binary
                            elf_map
                            vm_mmap_pgoff
                            do_mmap
                            mmap_region
                            perf_event_mmap
                            perf_iterate_sb
                            perf_iterate_ctx
                            perf_event_mmap_output
                            perf_output_copy
                            memcpy_erms

      55.01%    39.00%  ls       [kernel.vmlinux]    [k] do_syscall_64
              |
              |--39.00%--0xffffffffffffffff
              |          _dl_map_object
              |          open_verify.constprop.0
              |          __lseek64 (inlined)
              |          entry_SYSCALL_64_after_hwframe
              |          do_syscall_64
              |
               --16.01%--do_syscall_64
                         __x64_sys_execve
                         __do_execve_file.isra.0
                         search_binary_handler
                         load_elf_binary
                         elf_map
                         vm_mmap_pgoff
                         do_mmap
                         mmap_region
                         perf_event_mmap
                         perf_iterate_sb
                         perf_iterate_ctx
                         perf_event_mmap_output
                         perf_output_copy
                         memcpy_erms

      42.95%    42.95%  ls       libpthread-2.29.so  [.] __pthread_initialize_minimal_internal
              |
              ---_init
                 __pthread_initialize_minimal_internal

      42.95%     0.00%  ls       libpthread-2.29.so  [.] _init
              |
              ---_init
                 __pthread_initialize_minimal_internal

  <SNIP>

  #
  # (Tip: Profiling branch (mis)predictions with: perf record -b / perf report)
  #
  #

The branch stack view be explicitely selected using:

  # perf report -h branch-stack

   Usage: perf report [<options>]

      -b, --branch-stack    use branch records for per branch histogram filling

  #

I.e. after this patch:

  # perf report -b --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 13  of event 'cycles'
  # Event count (approx.): 13
  #
  # Overhead  Command  Source Shared Object  Source Symbol                Target Symbol                              Basic Block Cycles
  # ........  .......  ....................  ...........................  .........................................  ..................
  #
       7.69%  ls       libpthread-2.29.so    [.] _init                    [.] __pthread_initialize_minimal_internal  6827
       7.69%  ls       ld-2.29.so            [k] _start                   [k] _dl_start                              -
       7.69%  ls       ld-2.29.so            [.] _dl_start_user           [.] _dl_init                               -24790
       7.69%  ls       ld-2.29.so            [k] _dl_start                [k] _dl_sysdep_start                       278
       7.69%  ls       ld-2.29.so            [k] dl_main                  [k] _dl_map_object_deps                    15581
       7.69%  ls       ld-2.29.so            [k] open_verify.constprop.0  [k] lseek64                                4228
       7.69%  ls       ld-2.29.so            [k] _dl_map_object           [k] open_verify.constprop.0                55
       7.69%  ls       ld-2.29.so            [k] openaux                  [k] _dl_map_object                         67
       7.69%  ls       ld-2.29.so            [k] _dl_map_object_deps      [k] 0x00007f441b57c090                     112
       7.69%  ls       ld-2.29.so            [.] call_init.part.0         [.] _init                                  334
       7.69%  ls       ld-2.29.so            [.] _dl_init                 [.] call_init.part.0                       383
       7.69%  ls       ld-2.29.so            [k] _dl_sysdep_start         [k] dl_main                                45
       7.69%  ls       ld-2.29.so            [k] _dl_catch_exception      [k] openaux                                116

  #
  # (Tip: Show current config key-value pairs: perf config --list)
  #
  #

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/ccbd9583-82f4-dec5-7e84-64bf56e351fb@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:20:16 -03:00
Alexey Budankov d2720c3dad perf report: Dump LBR callstack data by -D jointly with thread stack
Make perf report -D command print captured LBR callstack chain when it is
collected together with raw thread stack data:

  2752673087247083 0x5d10 [0x548]: PERF_RECORD_SAMPLE(IP, 0x4002): 5841/5841: 0x40121f period: 1543862 addr: 0
  ... FP chain: nr:0
  ... branch callstack: nr:3
  .....  0: 00000000004011d0
  .....  1: 00007f393c388411
  .....  2: 0000000000401098
  ... user regs: mask 0xff0fff ABI 64-bit
  .... AX    0x34e7
  .... BX    0x7fff5f6dd3c0
  .... CX    0xffffffff
  .... DX    0x34e6
  .... SI    0x7f393c5268d0
  .... DI    0x0
  .... BP    0x401260
  .... SP    0x7fff5f6dd3c0
  .... IP    0x40121f
  .... FLAGS 0x29f
  .... CS    0x33
  .... SS    0x2b
  .... R8    0x7f393c526800
  .... R9    0x7f393c525da0
  .... R10   0xfffffffffffff70a
  .... R11   0x246
  .... R12   0x401070
  .... R13   0x7fff5f6ddcb0
  .... R14   0x0
  .... R15   0x0
  ... ustack: size 1024, offset 0x130
   . data_src: 0x5080021
   ... thread: stack_test:5841
   ...... dso: /root/abudanko/stacks/stack_test

Committer testing:

  # perf record -g --call-graph dwarf,1024 -j stack,u ls > /dev/null
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.042 MB perf.data (10 samples) ]
  #

Before:

  # perf report -D |& grep PERF_RECORD_SAMPLE -A28 | tail -29
  67538909824483 0xa7a0 [0x560]: PERF_RECORD_SAMPLE(IP, 0x4002): 9721/9721: 0x7f441b2b1e20 period: 1376095 addr: 0
  ... FP chain: nr:0
  ... user regs: mask 0xff0fff ABI 64-bit
  .... AX    0x7f441b2b1000
  .... BX    0x7f441b55b970
  .... CX    0x7fff6e2db218
  .... DX    0x7fff6e2db218
  .... SI    0x7fff6e2db208
  .... DI    0x1
  .... BP    0x1
  .... SP    0x7fff6e2db178
  .... IP    0x7f441b2b1e20
  .... FLAGS 0x20a
  .... CS    0x33
  .... SS    0x2b
  .... R8    0x1
  .... R9    0x7f441b371c18
  .... R10   0x7f441b5a5f10
  .... R11   0x202
  .... R12   0x7fff6e2db208
  .... R13   0x7fff6e2db218
  .... R14   0x7f441b5a7150
  .... R15   0x0
  ... ustack: size 1024, offset 0x148
   . data_src: 0x5080021
   ... thread: ls:9721
   ...... dso: /usr/lib64/libpthread-2.29.so

  0xad00 [0x60]: event: 10
  #

After:

  # perf report -D |& grep PERF_RECORD_SAMPLE -A31 | tail -32
  67538909824483 0xa7a0 [0x560]: PERF_RECORD_SAMPLE(IP, 0x4002): 9721/9721: 0x7f441b2b1e20 period: 1376095 addr: 0
  ... FP chain: nr:0
  ... branch callstack: nr:4
  .....  0: 00007f441b2b1e20
  .....  1: 00007f441b58af1a
  .....  2: 00007f441b58b0e1
  .....  3: 00007f441b57c145
  ... user regs: mask 0xff0fff ABI 64-bit
  .... AX    0x7f441b2b1000
  .... BX    0x7f441b55b970
  .... CX    0x7fff6e2db218
  .... DX    0x7fff6e2db218
  .... SI    0x7fff6e2db208
  .... DI    0x1
  .... BP    0x1
  .... SP    0x7fff6e2db178
  .... IP    0x7f441b2b1e20
  .... FLAGS 0x20a
  .... CS    0x33
  .... SS    0x2b
  .... R8    0x1
  .... R9    0x7f441b371c18
  .... R10   0x7f441b5a5f10
  .... R11   0x202
  .... R12   0x7fff6e2db208
  .... R13   0x7fff6e2db218
  .... R14   0x7f441b5a7150
  .... R15   0x0
  ... ustack: size 1024, offset 0x148
   . data_src: 0x5080021
   ... thread: ls:9721
   ...... dso: /usr/lib64/libpthread-2.29.so
  #

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/aa82e5dd-def2-0ca8-a064-db9e2e8ad076@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:19:44 -03:00
Alexey Budankov 2566349648 perf record: Enable LBR callstack capture jointly with thread stack
Enable '-j stack' applicability together with '--call-graph dwarf'
option so thread stack data and LBR call stack could be captured
jointly:

  $ perf record -g --call-graph dwarf,1024 -j stack,u -- stack_test

Collected LBR call stack can be used to augment DWARF call stack
calculated from the raw thread stack data and to provide more
comprehensive call stack information for cases when collected SIZE is
not enough to cover complete thread stack.

Such cases are typical for workloads that allocate large arrays of data
on its threads stacks or the possible SIZE to collect can't be large
enough due to workload nature or system configuration and this is where
hardware captured LBR call stacks can provide missing stack frames.
Possible DWARF plus LBR call stacks consolidation algorithm description
follows.

With this patch set perf report command UI currently ignores collected
LBR call stack data and still provides DWARF based call stacks
information.

  ===========================================================================

  Overview:

   Legend:

   THS - thread stack
   CTX - thread register context
   SWS - software stack
   SSF - skipped stack frames
   PSS - Perf sample stack

   ip,sp,bp - HW registers values
   d        - allocated stack regions
   kip      - ip address in the kernel space
   K        - captured thread stack size

        THS

       -----
       |   |<-stack bottom
        ...
       |---|
       |ip4|
       |---|         PSS = SWS(THS(K))
       |   |
   --> |   |
   |   |d3 |                  user/
   |   |---|         user PSS kernel PSS
   |   |ip3|         ------   ------
   |   |---|         |SSF |   |SSF |
   |   |   |          ....     ....
   |   |   |         ------   ------
   |   |d2 |         | -1 |   | -1 |
       |---|   user  ------   ------
   K   |ip2|   CTX   |ip3 |   |ip3 |
       |---|         |----|   |----|
   |   |d1 |   ...   |ip2 | , |ip2 |
   |   |---|  |---|  |----|   |----|
   |   |ip1|  |bp0|  |ip1 |   |ip1 |
   |   |---|  |---|  |----|   |----|
   |   |   |  |ip0|->|ip0 |   |ip0 |<-user stack top
   |   |   |  |---|  ------   ------
   |   |   |<-|sp0|<-stack    |kip0|<-kernel stack bottom
   --> -----  -----   top     |----|
                              |kip1|
                              |----|
		              |kip2|
		              |----|
                               ....
			      |    |<-kernel stack top
                              ------

  Algorithm details:

   Legend:

   HWS - hardware stack
   K-SWS - kernel software stack

			 BRANCH
			 TABLE

		 HWS      ip   ip
			  from to
		 ------  -----------
		 |ip7`|  |ip7`|    |
		 |----|  |----|----|
		 |ip6`|  |ip6`|    |
	user PSS |----|  |----|----|
		 |ip5`|  |ip5`|    |
	------   |----|  |----|----|
	| -1 |   |ip4`|  |ip4`|    |
	------   |----|  |----|----|
	|ip3 |~~~|ip3`|  |ip3`|    |
	|----|   |----|  |----|----|
	|ip2 |~~~|ip2`|  |ip2`|    |
	|----| 	 |----|  |----|----|
	|ip1 |~~~|ip1`|  |ip1`|ip0`|
	|----| 	 |----|  -----------
	|ip0 |~~~|ip0`|<---------'
	------   ------

	1. if (sym(ipj) == sym(ipj`)), j=0-3 ===> user PSS
	2. ipj`                      , j=4-7 ===> user PSS

  Augmented PSS = A_SWS(SWS(THS(K)), HWS):

	         user/
       user PSS  kernel PSS

	------   ------
	|ip7`|   |ip7`|<-user PSS bottom
	|----|   |----|
	|ip6`|   |ip6`|
	|----|   |----|
    HWS	|ip5`|   |ip5`|
	|----|   |----|
	|ip4`|   |ip4`|
	------   ------
	|ip3 |   |ip3 |
	|----|   |----|
    SWS |ip2 |   |ip2 |
	|----|   |----|
	|ip1 |   |ip1 |
	|----|   |----|
	|ip0 |   |ip0 |<-user PSS top
	------   ------
		 |kip0|<-kernel PSS bottom
		 |----|
		 |kip1|
	   K-SWS |----|
		 |kip2|
		 |----|
		 |kip3|<-kernel PSS top
		 ------

                  APSS

Committer testing:

Before:

  # perf record -g --call-graph dwarf,1024 -j stack,u ls > /dev/null
  unknown branch filter stack, check man page

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

      -j, --branch-filter <branch filter mask>
                            branch stack filter modes
  # perf record -g --call-graph dwarf,1024 -j u ls > /dev/null
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.054 MB perf.data (12 samples) ]
  # perf evlist -v
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CALLCHAIN|PERIOD|BRANCH_STACK|REGS_USER|STACK_USER|DATA_SRC, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY, sample_regs_user: 0xff0fff, sample_stack_user: 1024
   #

After:

  # perf record -g --call-graph dwarf,1024 -j stack,u ls > /dev/null
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.044 MB perf.data (11 samples) ]
  [root@quaco ~]# perf evlist -v
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CALLCHAIN|PERIOD|BRANCH_STACK|REGS_USER|STACK_USER|DATA_SRC, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: USER|CALL_STACK, sample_regs_user: 0xff0fff, sample_stack_user: 1024
  #

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/e9e00090-66fb-d2a4-c90f-1d12344f7788@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:18:58 -03:00
Steven Rostedt (VMware) 82a2f88458 tools lib traceevent: Fix "robust" test of do_generate_dynamic_list_file
The tools/lib/traceevent/Makefile had a test added to it to detect a failure
of the "nm" when making the dynamic list file (whatever that is). The
problem is that the test sorts the values "U W w" and some versions of sort
will place "w" ahead of "W" (even though it has a higher ASCII value, and
break the test.

Add 'tr "w" "W"' to merge the two and not worry about the ordering.

Reported-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michal rarek <mmarek@suse.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org
Fixes: 6467753d61 ("tools lib traceevent: Robustify do_generate_dynamic_list_file")
Link: http://lkml.kernel.org/r/20190805130150.25acfeb1@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:18:54 -03:00
Adrian Hunter 3c84e65a53 perf evsel: Add comment for 'idx' member in 'struct perf_sample_id
The 'idx' member was added as preparation for AUX area sampling. Add a
comment to describe why.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/83ff264f-84c3-5372-8976-dd9293d20c6f@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:17:45 -03:00
Arnaldo Carvalho de Melo 0ac10d87a5 tools arch x86: Sync asm/cpufeatures.h with the with the kernel
To pick up the changes in:

  f36cf386e3 ("x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS")
  18ec54fdd6 ("x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations")

That don't affect anything in tools/.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/n/tip-860dq1qie2cpnfghlpcnxrzr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:17:09 -03:00
Arnaldo Carvalho de Melo b658911731 tools headers: Synchronize linux/bits.h with the kernel sources
To pick up the changes in this cset:

  95b980d62d ("linux/bits.h: make BIT(), GENMASK(), and friends available in assembly")

To address this tools/perf build warning:

  Warning: Kernel ABI header at 'tools/include/linux/bits.h' differs from latest version at 'include/linux/bits.h'
  diff -u tools/include/linux/bits.h include/linux/bits.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-1if3iga5r3di6oyddgxsr225@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:09:21 -03:00
Arnaldo Carvalho de Melo aaa6ef8aa8 tools headers: Grab copy of linux/const.h, needed by linux/bits.h
So that can update the copy of linux/bits.h that now uses macros defined
in const.h and that are not available in older systems.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-c2qfcbl58hxyfb5u5xivp7is@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:08:23 -03:00
Arnaldo Carvalho de Melo 146dc30363 perf tools: tools/include should come before tools/uapi/include
The next cset will grap const.h copies from the kernel to keep bits.h
in sync as it started to use linux/const.h, that in turn includes
uapi/linux/const.h.

So now we have a file with the same name in tools/include and
tools/uapi/include, and one includes the other, we need to have
tools/include/uapi/ after tools/include/ for this to work, fix it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-qzjqxa1wdrt51kwadyqawnuj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:07:22 -03:00
Arnaldo Carvalho de Melo 6e98bc349e tools headers: Add limits.h to access __WORDSIZE
We need to make sure limits.h is included before checking if we can use
__WORDSIZE, do it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-5yfoed4rnsck2n3cwhm9mvth@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:03:05 -03:00
Linus Torvalds 85d8d3b172 - A few fixes for the userspace hyper-v tools from Adrian Vladu.
- A fix for the hyper-v MAINTAINERs entry from Lan Tianyu.
 - Fix for SPDX license identifier in the userspace tools from Nishad
 Kamdar.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE4n5dijQDou9mhzu83qZv95d3LNwFAl1YXCEACgkQ3qZv95d3
 LNy/cxAAjyyxIPmu7bUi9jN9YHbEQjCRQJBMvm+/hfqoiiWZa/GglfWw5EY+zAJi
 pHzWyMiOTeubgiMpA17yE0QK6iigPcxN7mOmcrmuWuKIgJthLZFXPhaCrx8apEty
 Ovm9qNjFulpBks/U2r+Pfi8rj7gLqJ5a5qGydYvlrJOTfIRXDYU93HAAdJutraqF
 w34sWxpQjKaTxva+JK7Em2/UCXrCNQ8Tzz7E3cLf20Z+PDGgkHAIWF8p+SAlosOd
 GKzYrW8z7yr1pwcwyLW6yEMlKMENq66pRkk1s0oUbwgQuCmb8HllWazIRkXAAHiD
 1I6r4A2aH7VemBxGKoAH0ENXB5MX5hTmyjOXONM4wqa/24nQsm9mjpuDY/aIRngZ
 3qCsQ/4xpZpKJxccY7DzAcpT7VPSpMM9patAk2C+QXbYIg/HpWfY+42OsnwjnjcE
 D89azjnhIeyOWOFvKxD187hPXd/lfoReC+czCisJvHxR4OPOP1chIUx6VZ25e91d
 gzJqggippL5+QgSt2pYtf6INlsMThvW/2/LLXGxR/r+CEpVAfXFlLBoz7OfwElZ2
 m2vdbwo+0aIOVHL51594Wc78yZM2EQxlMNwt8l7mKTDxxwSlIcC4VHX8r46Df8AA
 GgFSG2pnMNID/a9Rhr7KWzXnaBlALKfGKypz9BPtLKasHal32X8=
 =iKcM
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull Hyper-V fixes from Sasha Levin:

 - A few fixes for the userspace hyper-v tools from Adrian Vladu.

 - A fix for the hyper-v MAINTAINERs entry from Lan Tianyu.

 - Fix for SPDX license identifier in the userspace tools from Nishad
   Kamdar.

* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  MAINTAINERS: Fix Hyperv vIOMMU driver file name
  tools: hv: Use the correct style for SPDX License Identifier
  tools: hv: fix typos in toolchain
  tools: hv: fix KVP and VSS daemons exit code
  tools: hv: fixed Python pep8/flake8 warnings for lsvmbus
2019-08-17 19:31:30 -07:00
Adrian Vladu 2d35c66036 tools: hv: fix typos in toolchain
Fix typos in the HyperV toolchain.

Signed-off-by: Adrian Vladu <avladu@cloudbasesolutions.com>

Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Alessandro Pilotti <apilotti@cloudbasesolutions.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-17 15:29:28 -04:00
Adrian Vladu b099515607 tools: hv: fix KVP and VSS daemons exit code
HyperV KVP and VSS daemons should exit with 0 when the '--help'
or '-h' flags are used.

Signed-off-by: Adrian Vladu <avladu@cloudbasesolutions.com>

Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Alessandro Pilotti <apilotti@cloudbasesolutions.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-17 15:29:23 -04:00
Adrian Vladu 5912e791f3 tools: hv: fixed Python pep8/flake8 warnings for lsvmbus
Fixed pep8/flake8 python style code for lsvmbus tool.

The TAB indentation was on purpose ignored (pep8 rule W191) to make
sure the code is complying with the Linux code guideline.
The following command doe not show any warnings now:
pep8 --ignore=W191 lsvmbus
flake8 --ignore=W191 lsvmbus

Signed-off-by: Adrian Vladu <avladu@cloudbasesolutions.com>

Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Alessandro Pilotti <apilotti@cloudbasesolutions.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-17 15:29:18 -04:00
John Keeping e2736219e6 perf unwind: Remove unnecessary test
If dwarf_callchain_users is false, then unwind__prepare_access() will
not set unwind_libunwind_ops so the remaining test here is sufficient.

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: john keeping <john@metanate.com>
Link: http://lkml.kernel.org/r/20190815100146.28842-3-john@metanate.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-16 12:30:14 -03:00
John Keeping e8ba2906f6 perf unwind: Fix libunwind when tid != pid
Commit e5adfc3e7e ("perf map: Synthesize maps only for thread group
leader") changed the recording side so that we no longer get mmap events
for threads other than the thread group leader (when synthesising these
events for threads which exist before perf is started).

When a file recorded after this change is loaded, the lack of mmap
records mean that unwinding is not set up for any other threads.

This can be seen in a simple record/report scenario:

	perf record --call-graph=dwarf -t $TID
	perf report

If $TID is a process ID then the report will show call graphs, but if
$TID is a secondary thread the output is as if --call-graph=none was
specified.

Following the rationale in that commit, move the libunwind fields into
struct map_groups and update the libunwind functions to take this
instead of the struct thread.  This is only required for
unwind__finish_access which must now be called from map_groups__delete
and the others are changed for symmetry.

Note that unwind__get_entries keeps the thread argument since it is
required for symbol lookup and the libdw unwind provider uses the thread
ID.

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: e5adfc3e7e ("perf map: Synthesize maps only for thread group leader")
Link: http://lkml.kernel.org/r/20190815100146.28842-2-john@metanate.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-16 12:25:57 -03:00
John Keeping ab6cd0e527 perf map: Use zalloc for map_groups
In the next commit we will add new fields to map_groups and we need
these to be null if no value is assigned.  The simplest way to achieve
this is to request zeroed memory from the allocator.

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: john keeping <john@metanate.com>
Link: http://lkml.kernel.org/r/20190815100146.28842-1-john@metanate.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-16 12:25:23 -03:00
Arnaldo Carvalho de Melo ef4b1a539f perf report: Add --switch-on/--switch-off events
Since 'perf top' shares the histogram browser with 'perf report', then
the same explanation in the previous cset applies.

An additional example uses a pair of SDT events available for systemtap:

  # perf probe --exec=/usr/bin/stap '%*:*'
  Added new events:
    sdt_stap:benchmark__thread__start (on %* in /usr/bin/stap)
    sdt_stap:benchmark   (on %* in /usr/bin/stap)
    sdt_stap:benchmark__thread__end (on %* in /usr/bin/stap)
    sdt_stap:pass6__start (on %* in /usr/bin/stap)
    sdt_stap:pass6__end  (on %* in /usr/bin/stap)
    sdt_stap:pass5__start (on %* in /usr/bin/stap)
    sdt_stap:pass5__end  (on %* in /usr/bin/stap)
    sdt_stap:pass0__start (on %* in /usr/bin/stap)
    sdt_stap:pass0__end  (on %* in /usr/bin/stap)
    sdt_stap:pass1a__start (on %* in /usr/bin/stap)
    sdt_stap:pass1b__start (on %* in /usr/bin/stap)
    sdt_stap:pass1__end  (on %* in /usr/bin/stap)
    sdt_stap:pass2__start (on %* in /usr/bin/stap)
    sdt_stap:pass2__end  (on %* in /usr/bin/stap)
    sdt_stap:pass3__start (on %* in /usr/bin/stap)
    sdt_stap:pass3__end  (on %* in /usr/bin/stap)
    sdt_stap:pass4__start (on %* in /usr/bin/stap)
    sdt_stap:pass4__end  (on %* in /usr/bin/stap)
    sdt_stap:benchmark__start (on %* in /usr/bin/stap)
    sdt_stap:benchmark__end (on %* in /usr/bin/stap)
    sdt_stap:cache__get  (on %* in /usr/bin/stap)
    sdt_stap:cache__clean (on %* in /usr/bin/stap)
    sdt_stap:cache__add__module (on %* in /usr/bin/stap)
    sdt_stap:cache__add__source (on %* in /usr/bin/stap)
    sdt_stap:stap_system__complete (on %* in /usr/bin/stap)
    sdt_stap:stap_system__start (on %* in /usr/bin/stap)
    sdt_stap:stap_system__spawn (on %* in /usr/bin/stap)
    sdt_stap:stap_system__fork (on %* in /usr/bin/stap)
    sdt_stap:intern_string (on %* in /usr/bin/stap)
    sdt_stap:client__start (on %* in /usr/bin/stap)
    sdt_stap:client__end (on %* in /usr/bin/stap)

  You can now use it in all perf tools, such as:

  	perf record -e sdt_stap:client__end -aR sleep 1

  #

From these we're use the two below to run systemtap's test suite:

  # perf record -e sdt_stap:pass2__*,cycles:P make installcheck > /dev/null
  ^C[ perf record: Woken up 8 times to write data ]
  [ perf record: Captured and wrote 2.691 MB perf.data (39638 samples) ]
  Terminated
  # perf script | grep sdt_stap
              stap 28979 [000] 19424.302660: sdt_stap:pass2__start: (561b9a537de3) arg1=140730364262544
              stap 28979 [000] 19424.333083:   sdt_stap:pass2__end: (561b9a53a9e1) arg1=140730364262544
              stap 29045 [006] 19424.933460: sdt_stap:pass2__start: (563edddcede3) arg1=140722674883152
              stap 29045 [006] 19424.963794:   sdt_stap:pass2__end: (563edddd19e1) arg1=140722674883152
  # perf script | grep cycles |  wc -l
  39634
  #

Looking at the whole perf.data file:

  [root@quaco testsuite]# perf report | grep cycles:P -A25
  # Samples: 39K of event 'cycles:P'
  # Event count (approx.): 34044267368
  #
  # Overhead  Command  Shared Object         Symbol
  # ........  .......  ....................  ................................
  #
       3.50%  cc1      cc1                   [.] ht_lookup_with_hash
       3.04%  cc1      cc1                   [.] _cpp_lex_token
       2.11%  cc1      cc1                   [.] ggc_internal_alloc
       1.83%  cc1      cc1                   [.] cpp_get_token_with_location
       1.68%  cc1      libc-2.29.so          [.] _int_malloc
       1.41%  cc1      cc1                   [.] linemap_position_for_column
       1.25%  cc1      cc1                   [.] ggc_internal_cleared_alloc
       1.20%  cc1      cc1                   [.] c_lex_with_flags
       1.18%  cc1      cc1                   [.] get_combined_adhoc_loc
       1.05%  cc1      libc-2.29.so          [.] malloc
       1.01%  cc1      libc-2.29.so          [.] _int_free
       0.96%  stap     stap                  [.] std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Identity, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, stringtable_hash, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_insert<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__detail::_AllocNode<std::allocator<std::__detail::_Hash_node<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, true> > > >
       0.78%  stap     stap                  [.] lexer::scan
       0.74%  cc1      cc1                   [.] _cpp_lex_direct
       0.70%  cc1      cc1                   [.] pop_scope
       0.70%  cc1      cc1                   [.] c_parser_declspecs
       0.69%  stap     libc-2.29.so          [.] _int_malloc
       0.68%  cc1      cc1                   [.] htab_find_slot
       0.68%  cc1      [kernel.vmlinux]      [k] prepare_exit_to_usermode
       0.64%  cc1      [kernel.vmlinux]      [k] clear_page_erms
  [root@quaco testsuite]#

And now only what happens in slices demarcated by those start/end SDT
events:

  [root@quaco testsuite]# perf report --switch-on=sdt_stap:pass2__start --switch-off=sdt_stap:pass2__end | grep cycles:P -A100
  # Samples: 240  of event 'cycles:P'
  # Event count (approx.): 206491934
  #
  # Overhead  Command  Shared Object        Symbol
  # ........  .......  ...................  ................................................
  #
      38.99%  stap     stap                 [.] systemtap_session::register_library_aliases
      19.47%  stap     stap                 [.] match_key::operator<
      15.01%  stap     libc-2.29.so         [.] __memcmp_avx2_movbe
       5.19%  stap     libc-2.29.so         [.] _int_malloc
       2.50%  stap     libstdc++.so.6.0.26  [.] std::_Rb_tree_insert_and_rebalance
       2.30%  stap     stap                 [.] match_node::build_no_more
       2.07%  stap     libc-2.29.so         [.] malloc
       1.66%  stap     stap                 [.] std::_Rb_tree<match_key, std::pair<match_key const, match_node*>, std::_Select1st<std::pair<match_key const, match_node*> >, std::less<match_key>, std::allocator<std::pair<match_key const, match_node*> > >::find
       1.66%  stap     stap                 [.] match_node::bind
       1.58%  stap     [kernel.vmlinux]     [k] prepare_exit_to_usermode
       1.17%  stap     [kernel.vmlinux]     [k] native_irq_return_iret
       0.87%  stap     stap                 [.] 0x0000000000032ec4
       0.77%  stap     libstdc++.so.6.0.26  [.] std::_Rb_tree_increment
       0.47%  stap     stap                 [.] std::vector<derived_probe_builder*, std::allocator<derived_probe_builder*> >::_M_realloc_insert<derived_probe_builder* const&>
       0.47%  stap     [kernel.vmlinux]     [k] get_page_from_freelist
       0.47%  stap     [kernel.vmlinux]     [k] swapgs_restore_regs_and_return_to_usermode
       0.47%  stap     [kernel.vmlinux]     [k] do_user_addr_fault
       0.46%  stap     [kernel.vmlinux]     [k] __pagevec_lru_add_fn
       0.46%  stap     stap                 [.] std::_Rb_tree<match_key, std::pair<match_key const, match_node*>, std::_Select1st<std::pair<match_key const, match_node*> >, std::less<match_key>, std::allocator<std::pair<match_key const, match_node*> > >::_M_emplace_unique<std::pair<match_key, match_node*> >
       0.42%  stap     libstdc++.so.6.0.26  [.] 0x00000000000c18fa
       0.40%  stap     [kernel.vmlinux]     [k] interrupt_entry
       0.40%  stap     [kernel.vmlinux]     [k] update_load_avg
       0.40%  stap     [kernel.vmlinux]     [k] __intel_pmu_disable_all
       0.40%  stap     [kernel.vmlinux]     [k] clear_page_erms
       0.39%  stap     [kernel.vmlinux]     [k] __mod_node_page_state
       0.39%  stap     [kernel.vmlinux]     [k] error_entry
       0.39%  stap     [kernel.vmlinux]     [k] sync_regs
       0.38%  stap     [kernel.vmlinux]     [k] __handle_mm_fault
       0.38%  stap     stap                 [.] derive_probes

  #
  # (Tip: System-wide collection from all CPUs: perf record -a)
  #
  [root@quaco testsuite]#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-408hvumcnyn93a0auihnawew@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-16 12:14:33 -03:00
Arnaldo Carvalho de Melo 2f53ae347f perf top: Add --switch-on/--switch-off events
Just like 'perf trace' and 'perf script', should be useful for instance
to only consider samples after the initialization phase of some
workload.

The man page has some examples and considerations about its current
interface, that still doesn't handle the on/off events in a special way,
behaving just like when multiple events are specified, i.e.:

- In non-group mode (when the event list is not enclosed in {}) show a
  a menu to allow choosing which event the user wants to see in the
  histograms browser

- In group mode, be it using {} or asking for --group, show one column
  per event.

Try for instance:

  # perf top -e '{cycles,instructions,probe:icmp_rcv}' --switch-on=probe:icmp_rcv

Replace probe:icmp_rcv, that I put in place using:

  # perf probe icmp_rcv:59

To hit when broadcast packets arrive, with a probe installed after an
initialization phase is over or after some other point of interest, some
garbage collection, etc, and also use --switch-off, for instance, on a
probe installed after said garbage collection is over.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-c7q7qjeqtyvc9mkeipxza6ne@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-15 16:03:26 -03:00
Arnaldo Carvalho de Melo 22ac4318ad perf trace: Add --switch-on/--switch-off events
Just like with 'perf script':

  # perf trace -e sched:*,syscalls:*sleep* sleep 1
       0.000 :28345/28345 sched:sched_waking:comm=perf pid=28346 prio=120 target_cpu=005
       0.005 :28345/28345 sched:sched_wakeup:perf:28346 [120] success=1 CPU:005
       0.383 sleep/28346 sched:sched_process_exec:filename=/usr/bin/sleep pid=28346 old_pid=28346
       0.613 sleep/28346 sched:sched_stat_runtime:comm=sleep pid=28346 runtime=607375 [ns] vruntime=23289041218 [ns]
       0.689 sleep/28346 syscalls:sys_enter_nanosleep:rqtp: 0x7ffc491789b0
       0.693 sleep/28346 sched:sched_stat_runtime:comm=sleep pid=28346 runtime=72021 [ns] vruntime=23289113239 [ns]
       0.694 sleep/28346 sched:sched_switch:sleep:28346 [120] S ==> swapper/5:0 [120]
    1000.787 :0/0 sched:sched_waking:comm=sleep pid=28346 prio=120 target_cpu=005
    1000.824 :0/0 sched:sched_wakeup:sleep:28346 [120] success=1 CPU:005
    1000.908 sleep/28346 syscalls:sys_exit_nanosleep:0x0
    1001.218 sleep/28346 sched:sched_process_exit:comm=sleep pid=28346 prio=120
  # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep sleep 1
       0.000 sleep/28349 sched:sched_stat_runtime:comm=sleep pid=28349 runtime=603036 [ns] vruntime=23873537697 [ns]
       0.001 sleep/28349 sched:sched_switch:sleep:28349 [120] S ==> swapper/4:0 [120]
    1000.392 :0/0 sched:sched_waking:comm=sleep pid=28349 prio=120 target_cpu=004
    1000.443 :0/0 sched:sched_wakeup:sleep:28349 [120] success=1 CPU:004
    1000.540 sleep/28349 syscalls:sys_exit_nanosleep:0x0
    1000.852 sleep/28349 sched:sched_process_exit:comm=sleep pid=28349 prio=120
  # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep sleep 1
       0.000 sleep/28352 sched:sched_stat_runtime:comm=sleep pid=28352 runtime=610543 [ns] vruntime=24811686681 [ns]
       0.001 sleep/28352 sched:sched_switch:sleep:28352 [120] S ==> swapper/0:0 [120]
    1000.397 :0/0 sched:sched_waking:comm=sleep pid=28352 prio=120 target_cpu=000
    1000.440 :0/0 sched:sched_wakeup:sleep:28352 [120] success=1 CPU:000
  #
  # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep --show-on-off sleep 1
       0.000 sleep/28367 syscalls:sys_enter_nanosleep:rqtp: 0x7fffd1a25fc0
       0.004 sleep/28367 sched:sched_stat_runtime:comm=sleep pid=28367 runtime=628760 [ns] vruntime=22170052672 [ns]
       0.005 sleep/28367 sched:sched_switch:sleep:28367 [120] S ==> swapper/2:0 [120]
    1000.367 :0/0 sched:sched_waking:comm=sleep pid=28367 prio=120 target_cpu=002
    1000.412 :0/0 sched:sched_wakeup:sleep:28367 [120] success=1 CPU:002
    1000.512 sleep/28367 syscalls:sys_exit_nanosleep:0x0
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-t3ngpt1brcc1fm9gep9gxm4q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-15 12:26:21 -03:00