mirror of https://gitee.com/openkylin/linux.git
18605 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
Jiri Olsa | 315c0a1f0c |
libperf: Move perf's cpu_map__empty() to perf_cpu_map__empty()
So it's part of the libperf library as one of basic functions operating on the perf_cpu_map class. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190822111141.25823-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Jiri Olsa | 6549cd8f2c |
perf tools: Use perf_cpu_map__nr instead of cpu_map__nr
Switch the rest of the perf code to use libperf's perf_cpu_map__nr(), which is the same as current cpu_map__nr() and remove the cpu_map__nr() function. Link: http://lkml.kernel.org/n/tip-6e0guy75clis7nm0xpuz9fga@git.kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190822111141.25823-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Jiri Olsa | db9a5fd02a |
tools headers: Add missing perf_event.h include
We need perf_event.h include for 'struct perf_event_mmap_page'. Link: http://lkml.kernel.org/n/tip-bolqkmqajexhccjb0ib0an8w@git.kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190822111141.25823-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Ingo Molnar | 4e92b18e5b |
perf/core improvements and fixes:
callchains: Alexey Budankov: - Allow collecting LBR together with DWARF callchains, for workloads where the userspace stack size collected is not big enough for pure DWARF based unwinding. - Dump the LBR call stack in 'perf report -D'. perf top: Arnaldo Carvalho de Melo: - Show visual cue at start to state that the minimal set of samples are being collected prior to sorting/bucketizing/displaying. CoreSight (ARM hardware tracing): Leo Yan: - Support sample flags 'insn' and 'insnlen'. core: Adrian Hunter: - Add comment for 'idx' member in 'struct perf_sample_id. tools headers: Arnaldo Carvalho de Melo: - Synchronize linux/bits.h, which required grabbing a copy of the kernel const.h headers and some changes in the ordering of header directories. - Sync x86's asm/cpufeatures.h with the with the kernel, no change in any of the tools. libperf: Jiri Olsa: - Fix arch include paths. libtraceevent: Steven Rostedt (VMware): - Fix "robust" test of do_generate_dynamic_list_file. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXVxFFAAKCRCyPKLppCJ+ JwebAQCp3nc/q4rpE6v9W8b1uMAsmELl5q8D+74/1ZWT4dqPFAEA9ZmtsIi/7ZjV Dq3403hFUfk1kDacEvUXKgzi3PMq+Qc= =DK9x -----END PGP SIGNATURE----- Merge tag 'perf-core-for-mingo-5.4-20190820' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: callchains: Alexey Budankov: - Allow collecting LBR together with DWARF callchains, for workloads where the userspace stack size collected is not big enough for pure DWARF based unwinding. - Dump the LBR call stack in 'perf report -D'. perf top: Arnaldo Carvalho de Melo: - Show visual cue at start to state that the minimal set of samples are being collected prior to sorting/bucketizing/displaying. CoreSight (ARM hardware tracing): Leo Yan: - Support sample flags 'insn' and 'insnlen'. core: Adrian Hunter: - Add comment for 'idx' member in 'struct perf_sample_id. tools headers: Arnaldo Carvalho de Melo: - Synchronize linux/bits.h, which required grabbing a copy of the kernel const.h headers and some changes in the ordering of header directories. - Sync x86's asm/cpufeatures.h with the with the kernel, no change in any of the tools. libperf: Jiri Olsa: - Fix arch include paths. libtraceevent: Steven Rostedt (VMware): - Fix "robust" test of do_generate_dynamic_list_file. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Ingo Molnar | 51c359c2fd |
Linux 5.3-rc5
-----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl1Zw6ceHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGbiUH/0kqDBzkpne1odxW LeAPtTgmxDbcOE/bgIk374e95mn3EP3arna01BLc5ztwkQ521f4Iw5mKW5InZcyu 3/IvpYeQUcdazphWSu72VnUZ8QfYNh4NJDjAx6iyliQ1NpJF9LLYLWWjlqwGbWHQ USbwp7A+56m1AWWmce2r50DK7jEZShKxRBQrXNXtvn8+YaVMvmdZpT6ejyG52J+4 zr9yYrT9sa5jcPGPnWN/sx03/BPij+yOFKKe8L9vprb3uEmNKPvqtAbUpI0QYw6j T+eZELLxAOsUk84kxQyTLCU/GMesP6hIaE93HlpmgcQkBBzK7H5SBN37r8OJjOeS IXlJX4c= =9Iey -----END PGP SIGNATURE----- Merge tag 'v5.3-rc5' into perf/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Jiri Olsa | b81d39c7a1 |
libperf: Fix arch include paths
Guenter Roeck reported problem with compilation when the ARCH is
specified:
$ make ARCH=x86_64
In file included from tools/include/asm/atomic.h:6:0,
from include/linux/atomic.h:5,
from tools/include/linux/refcount.h:41,
from cpumap.c:4: tools/include/asm/../../arch/x86/include/asm/atomic.h:11:10:
fatal error: asm/cmpxchg.h: No such file or directory
The problem is that we don't use SRCARCH (the sanitized ARCH version)
and we don't get the proper include path.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes:
|
|
Arnaldo Carvalho de Melo | 42fc2e9ef9 |
tools headers: Fixup bitsperlong per arch includes
We were getting the file by luck, from one of the paths in -I, fix it to get it from the proper place: $ cd tools/include/uapi/asm/ [acme@quaco asm]$ grep include bitsperlong.h #include "../../arch/x86/include/uapi/asm/bitsperlong.h" #include "../../arch/arm64/include/uapi/asm/bitsperlong.h" #include "../../arch/powerpc/include/uapi/asm/bitsperlong.h" #include "../../arch/s390/include/uapi/asm/bitsperlong.h" #include "../../arch/sparc/include/uapi/asm/bitsperlong.h" #include "../../arch/mips/include/uapi/asm/bitsperlong.h" #include "../../arch/ia64/include/uapi/asm/bitsperlong.h" #include "../../arch/riscv/include/uapi/asm/bitsperlong.h" #include "../../arch/alpha/include/uapi/asm/bitsperlong.h" #include <asm-generic/bitsperlong.h> $ ls -la ../../arch/x86/include/uapi/asm/bitsperlong.h ls: cannot access '../../arch/x86/include/uapi/asm/bitsperlong.h': No such file or directory $ ls -la ../../../arch/*/include/uapi/asm/bitsperlong.h -rw-rw-r--. 1 237 ../../../arch/alpha/include/uapi/asm/bitsperlong.h -rw-rw-r--. 1 841 ../../../arch/arm64/include/uapi/asm/bitsperlong.h -rw-rw-r--. 1 966 ../../../arch/hexagon/include/uapi/asm/bitsperlong.h -rw-rw-r--. 1 234 ../../../arch/ia64/include/uapi/asm/bitsperlong.h -rw-rw-r--. 1 100 ../../../arch/microblaze/include/uapi/asm/bitsperlong.h -rw-rw-r--. 1 244 ../../../arch/mips/include/uapi/asm/bitsperlong.h -rw-rw-r--. 1 352 ../../../arch/parisc/include/uapi/asm/bitsperlong.h -rw-rw-r--. 1 312 ../../../arch/powerpc/include/uapi/asm/bitsperlong.h -rw-rw-r--. 1 353 ../../../arch/riscv/include/uapi/asm/bitsperlong.h -rw-rw-r--. 1 292 ../../../arch/s390/include/uapi/asm/bitsperlong.h -rw-rw-r--. 1 323 ../../../arch/sparc/include/uapi/asm/bitsperlong.h -rw-rw-r--. 1 320 ../../../arch/x86/include/uapi/asm/bitsperlong.h $ Found while fixing some other problem, before it was escaping the tools/ chroot and using stuff in the kernel sources: CC /tmp/build/perf/util/find_bit.o In file included from /git/linux/tools/include/../../arch/x86/include/uapi/asm/bitsperlong.h:11, from /git/linux/tools/include/uapi/asm/bitsperlong.h:3, from /git/linux/tools/include/linux/bits.h:6, from /git/linux/tools/include/linux/bitops.h:13, from ../lib/find_bit.c:17: # cd /git/linux/tools/include/../../arch/x86/include/uapi/asm/ # pwd /git/linux/arch/x86/include/uapi/asm # Now it is getting the one we want it to, i.e. the one inside tools/: CC /tmp/build/perf/util/find_bit.o In file included from /git/linux/tools/arch/x86/include/uapi/asm/bitsperlong.h:11, from /git/linux/tools/include/linux/bits.h:6, from /git/linux/tools/include/linux/bitops.h:13, Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-8f8cfqywmf6jk8a3ucr0ixhu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 5c959b6d8f |
perf top: Show info message while collecting samples
Give visual cue about what is happening while initially collecting the minimal set of samples to collect/sort/display. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-xcui60p1v6ozijfam2o89ya8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 2284cf8074 |
perf ui browser: Allow specifying message to show when no samples are available to display
The 'perf top' tool will use that to avoid having a initial blank screen while collecting the minimum number of samples to sort and display. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-89ciceg8cy4442he3t0jzo3f@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 9b01611934 |
perf ui: Introduce non-interactive ui__info_window() function
Sometimes we want just to print a message on the center of the screen, like in 'perf top' while we wait for the minimum amount of samples to be collected before sorting and showing them. Also expose __ui__info_window() as an optimization for cases where such message is to be printed while holding the ui lock. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-uat0f89vfwl2w52kv9wzwd8a@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 9e79ff77e4 |
perf ui: Make 'exit_msg' optional in ui__question_window()
We will not need it when refactoring this function to be non-interactive, so make it optional. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-pnx1dn17bsz7lqt9ty95nnjx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Leo Yan | a4973d8f7b |
perf cs-etm: Support sample flags 'insn' and 'insnlen'
The synthetic branch and instruction samples are missed to set instruction related info, thus the perf tool fails to display samples with flags '-F,+insn,+insnlen'. The CoreSight trace decoder provides sufficient information to decide the instruction size based on the ISA type: A64/A32 instructions are 32-bit size, but one exception is the T32 instruction size, which might be 32-bit or 16-bit. This patch handles these cases and it reads the instruction values from DSO file; thus can support the flags '-F,+insn,+insnlen'. Before: # perf script -F,insn,insnlen,ip,sym 0 [unknown] ilen: 0 ffff97174044 _start ilen: 0 ffff97174938 _dl_start ilen: 0 ffff97174938 _dl_start ilen: 0 ffff97174938 _dl_start ilen: 0 ffff97174938 _dl_start ilen: 0 ffff97174938 _dl_start ilen: 0 ffff97174938 _dl_start ilen: 0 ffff97174938 _dl_start ilen: 0 ffff97174938 _dl_start ilen: 0 [...] After: # perf script -F,insn,insnlen,ip,sym 0 [unknown] ilen: 0 ffff97174044 _start ilen: 4 insn: 2f 02 00 94 ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54 ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54 ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54 ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54 ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54 ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54 ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54 ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54 [...] Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Robert Walker <robert.walker@arm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190815082854.18191-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Alexey Budankov | 10ccbc1cc0 |
perf report: Prefer DWARF callstacks to LBR ones when captured both
Display DWARF based callchains when the perf.data file contains raw thread stack data as LBR callstack data. Commiter testing: This changes the output from the branch stack based one, i.e. without this patch, for the same file as in the previous csets: # perf report --stdio # To display the perf.data header info, please use --header/--header-only options. # # Total Lost Samples: 0 # # Samples: 13 of event 'cycles' # Event count (approx.): 13 # # Overhead Command Source Shared Object Source Symbol Target Symbol Basic Block Cycles # ........ ....... .................... ........................... ......................................... .................. # 7.69% ls libpthread-2.29.so [.] _init [.] __pthread_initialize_minimal_internal 6827 7.69% ls ld-2.29.so [k] _start [k] _dl_start - 7.69% ls ld-2.29.so [.] _dl_start_user [.] _dl_init -24790 7.69% ls ld-2.29.so [k] _dl_start [k] _dl_sysdep_start 278 7.69% ls ld-2.29.so [k] dl_main [k] _dl_map_object_deps 15581 7.69% ls ld-2.29.so [k] open_verify.constprop.0 [k] lseek64 4228 7.69% ls ld-2.29.so [k] _dl_map_object [k] open_verify.constprop.0 55 7.69% ls ld-2.29.so [k] openaux [k] _dl_map_object 67 7.69% ls ld-2.29.so [k] _dl_map_object_deps [k] 0x00007f441b57c090 112 7.69% ls ld-2.29.so [.] call_init.part.0 [.] _init 334 7.69% ls ld-2.29.so [.] _dl_init [.] call_init.part.0 383 7.69% ls ld-2.29.so [k] _dl_sysdep_start [k] dl_main 45 7.69% ls ld-2.29.so [k] _dl_catch_exception [k] openaux 116 # # (Tip: For memory address profiling, try: perf mem record / perf mem report) # To the one that shows call chains: # perf report --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 10 of event 'cycles' # Event count (approx.): 3204047 # # Children Self Command Shared Object Symbol # ........ ........ ....... .................. ......................................... # 55.01% 0.00% ls [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe | ---entry_SYSCALL_64_after_hwframe do_syscall_64 | --16.01%--__x64_sys_execve __do_execve_file.isra.0 search_binary_handler load_elf_binary elf_map vm_mmap_pgoff do_mmap mmap_region perf_event_mmap perf_iterate_sb perf_iterate_ctx perf_event_mmap_output perf_output_copy memcpy_erms 55.01% 39.00% ls [kernel.vmlinux] [k] do_syscall_64 | |--39.00%--0xffffffffffffffff | _dl_map_object | open_verify.constprop.0 | __lseek64 (inlined) | entry_SYSCALL_64_after_hwframe | do_syscall_64 | --16.01%--do_syscall_64 __x64_sys_execve __do_execve_file.isra.0 search_binary_handler load_elf_binary elf_map vm_mmap_pgoff do_mmap mmap_region perf_event_mmap perf_iterate_sb perf_iterate_ctx perf_event_mmap_output perf_output_copy memcpy_erms 42.95% 42.95% ls libpthread-2.29.so [.] __pthread_initialize_minimal_internal | ---_init __pthread_initialize_minimal_internal 42.95% 0.00% ls libpthread-2.29.so [.] _init | ---_init __pthread_initialize_minimal_internal <SNIP> # # (Tip: Profiling branch (mis)predictions with: perf record -b / perf report) # # The branch stack view be explicitely selected using: # perf report -h branch-stack Usage: perf report [<options>] -b, --branch-stack use branch records for per branch histogram filling # I.e. after this patch: # perf report -b --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 13 of event 'cycles' # Event count (approx.): 13 # # Overhead Command Source Shared Object Source Symbol Target Symbol Basic Block Cycles # ........ ....... .................... ........................... ......................................... .................. # 7.69% ls libpthread-2.29.so [.] _init [.] __pthread_initialize_minimal_internal 6827 7.69% ls ld-2.29.so [k] _start [k] _dl_start - 7.69% ls ld-2.29.so [.] _dl_start_user [.] _dl_init -24790 7.69% ls ld-2.29.so [k] _dl_start [k] _dl_sysdep_start 278 7.69% ls ld-2.29.so [k] dl_main [k] _dl_map_object_deps 15581 7.69% ls ld-2.29.so [k] open_verify.constprop.0 [k] lseek64 4228 7.69% ls ld-2.29.so [k] _dl_map_object [k] open_verify.constprop.0 55 7.69% ls ld-2.29.so [k] openaux [k] _dl_map_object 67 7.69% ls ld-2.29.so [k] _dl_map_object_deps [k] 0x00007f441b57c090 112 7.69% ls ld-2.29.so [.] call_init.part.0 [.] _init 334 7.69% ls ld-2.29.so [.] _dl_init [.] call_init.part.0 383 7.69% ls ld-2.29.so [k] _dl_sysdep_start [k] dl_main 45 7.69% ls ld-2.29.so [k] _dl_catch_exception [k] openaux 116 # # (Tip: Show current config key-value pairs: perf config --list) # # Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ccbd9583-82f4-dec5-7e84-64bf56e351fb@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Alexey Budankov | d2720c3dad |
perf report: Dump LBR callstack data by -D jointly with thread stack
Make perf report -D command print captured LBR callstack chain when it is collected together with raw thread stack data: 2752673087247083 0x5d10 [0x548]: PERF_RECORD_SAMPLE(IP, 0x4002): 5841/5841: 0x40121f period: 1543862 addr: 0 ... FP chain: nr:0 ... branch callstack: nr:3 ..... 0: 00000000004011d0 ..... 1: 00007f393c388411 ..... 2: 0000000000401098 ... user regs: mask 0xff0fff ABI 64-bit .... AX 0x34e7 .... BX 0x7fff5f6dd3c0 .... CX 0xffffffff .... DX 0x34e6 .... SI 0x7f393c5268d0 .... DI 0x0 .... BP 0x401260 .... SP 0x7fff5f6dd3c0 .... IP 0x40121f .... FLAGS 0x29f .... CS 0x33 .... SS 0x2b .... R8 0x7f393c526800 .... R9 0x7f393c525da0 .... R10 0xfffffffffffff70a .... R11 0x246 .... R12 0x401070 .... R13 0x7fff5f6ddcb0 .... R14 0x0 .... R15 0x0 ... ustack: size 1024, offset 0x130 . data_src: 0x5080021 ... thread: stack_test:5841 ...... dso: /root/abudanko/stacks/stack_test Committer testing: # perf record -g --call-graph dwarf,1024 -j stack,u ls > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.042 MB perf.data (10 samples) ] # Before: # perf report -D |& grep PERF_RECORD_SAMPLE -A28 | tail -29 67538909824483 0xa7a0 [0x560]: PERF_RECORD_SAMPLE(IP, 0x4002): 9721/9721: 0x7f441b2b1e20 period: 1376095 addr: 0 ... FP chain: nr:0 ... user regs: mask 0xff0fff ABI 64-bit .... AX 0x7f441b2b1000 .... BX 0x7f441b55b970 .... CX 0x7fff6e2db218 .... DX 0x7fff6e2db218 .... SI 0x7fff6e2db208 .... DI 0x1 .... BP 0x1 .... SP 0x7fff6e2db178 .... IP 0x7f441b2b1e20 .... FLAGS 0x20a .... CS 0x33 .... SS 0x2b .... R8 0x1 .... R9 0x7f441b371c18 .... R10 0x7f441b5a5f10 .... R11 0x202 .... R12 0x7fff6e2db208 .... R13 0x7fff6e2db218 .... R14 0x7f441b5a7150 .... R15 0x0 ... ustack: size 1024, offset 0x148 . data_src: 0x5080021 ... thread: ls:9721 ...... dso: /usr/lib64/libpthread-2.29.so 0xad00 [0x60]: event: 10 # After: # perf report -D |& grep PERF_RECORD_SAMPLE -A31 | tail -32 67538909824483 0xa7a0 [0x560]: PERF_RECORD_SAMPLE(IP, 0x4002): 9721/9721: 0x7f441b2b1e20 period: 1376095 addr: 0 ... FP chain: nr:0 ... branch callstack: nr:4 ..... 0: 00007f441b2b1e20 ..... 1: 00007f441b58af1a ..... 2: 00007f441b58b0e1 ..... 3: 00007f441b57c145 ... user regs: mask 0xff0fff ABI 64-bit .... AX 0x7f441b2b1000 .... BX 0x7f441b55b970 .... CX 0x7fff6e2db218 .... DX 0x7fff6e2db218 .... SI 0x7fff6e2db208 .... DI 0x1 .... BP 0x1 .... SP 0x7fff6e2db178 .... IP 0x7f441b2b1e20 .... FLAGS 0x20a .... CS 0x33 .... SS 0x2b .... R8 0x1 .... R9 0x7f441b371c18 .... R10 0x7f441b5a5f10 .... R11 0x202 .... R12 0x7fff6e2db208 .... R13 0x7fff6e2db218 .... R14 0x7f441b5a7150 .... R15 0x0 ... ustack: size 1024, offset 0x148 . data_src: 0x5080021 ... thread: ls:9721 ...... dso: /usr/lib64/libpthread-2.29.so # Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/aa82e5dd-def2-0ca8-a064-db9e2e8ad076@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Alexey Budankov | 2566349648 |
perf record: Enable LBR callstack capture jointly with thread stack
Enable '-j stack' applicability together with '--call-graph dwarf' option so thread stack data and LBR call stack could be captured jointly: $ perf record -g --call-graph dwarf,1024 -j stack,u -- stack_test Collected LBR call stack can be used to augment DWARF call stack calculated from the raw thread stack data and to provide more comprehensive call stack information for cases when collected SIZE is not enough to cover complete thread stack. Such cases are typical for workloads that allocate large arrays of data on its threads stacks or the possible SIZE to collect can't be large enough due to workload nature or system configuration and this is where hardware captured LBR call stacks can provide missing stack frames. Possible DWARF plus LBR call stacks consolidation algorithm description follows. With this patch set perf report command UI currently ignores collected LBR call stack data and still provides DWARF based call stacks information. =========================================================================== Overview: Legend: THS - thread stack CTX - thread register context SWS - software stack SSF - skipped stack frames PSS - Perf sample stack ip,sp,bp - HW registers values d - allocated stack regions kip - ip address in the kernel space K - captured thread stack size THS ----- | |<-stack bottom ... |---| |ip4| |---| PSS = SWS(THS(K)) | | --> | | | |d3 | user/ | |---| user PSS kernel PSS | |ip3| ------ ------ | |---| |SSF | |SSF | | | | .... .... | | | ------ ------ | |d2 | | -1 | | -1 | |---| user ------ ------ K |ip2| CTX |ip3 | |ip3 | |---| |----| |----| | |d1 | ... |ip2 | , |ip2 | | |---| |---| |----| |----| | |ip1| |bp0| |ip1 | |ip1 | | |---| |---| |----| |----| | | | |ip0|->|ip0 | |ip0 |<-user stack top | | | |---| ------ ------ | | |<-|sp0|<-stack |kip0|<-kernel stack bottom --> ----- ----- top |----| |kip1| |----| |kip2| |----| .... | |<-kernel stack top ------ Algorithm details: Legend: HWS - hardware stack K-SWS - kernel software stack BRANCH TABLE HWS ip ip from to ------ ----------- |ip7`| |ip7`| | |----| |----|----| |ip6`| |ip6`| | user PSS |----| |----|----| |ip5`| |ip5`| | ------ |----| |----|----| | -1 | |ip4`| |ip4`| | ------ |----| |----|----| |ip3 |~~~|ip3`| |ip3`| | |----| |----| |----|----| |ip2 |~~~|ip2`| |ip2`| | |----| |----| |----|----| |ip1 |~~~|ip1`| |ip1`|ip0`| |----| |----| ----------- |ip0 |~~~|ip0`|<---------' ------ ------ 1. if (sym(ipj) == sym(ipj`)), j=0-3 ===> user PSS 2. ipj` , j=4-7 ===> user PSS Augmented PSS = A_SWS(SWS(THS(K)), HWS): user/ user PSS kernel PSS ------ ------ |ip7`| |ip7`|<-user PSS bottom |----| |----| |ip6`| |ip6`| |----| |----| HWS |ip5`| |ip5`| |----| |----| |ip4`| |ip4`| ------ ------ |ip3 | |ip3 | |----| |----| SWS |ip2 | |ip2 | |----| |----| |ip1 | |ip1 | |----| |----| |ip0 | |ip0 |<-user PSS top ------ ------ |kip0|<-kernel PSS bottom |----| |kip1| K-SWS |----| |kip2| |----| |kip3|<-kernel PSS top ------ APSS Committer testing: Before: # perf record -g --call-graph dwarf,1024 -j stack,u ls > /dev/null unknown branch filter stack, check man page Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -j, --branch-filter <branch filter mask> branch stack filter modes # perf record -g --call-graph dwarf,1024 -j u ls > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.054 MB perf.data (12 samples) ] # perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CALLCHAIN|PERIOD|BRANCH_STACK|REGS_USER|STACK_USER|DATA_SRC, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY, sample_regs_user: 0xff0fff, sample_stack_user: 1024 # After: # perf record -g --call-graph dwarf,1024 -j stack,u ls > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.044 MB perf.data (11 samples) ] [root@quaco ~]# perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CALLCHAIN|PERIOD|BRANCH_STACK|REGS_USER|STACK_USER|DATA_SRC, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: USER|CALL_STACK, sample_regs_user: 0xff0fff, sample_stack_user: 1024 # Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/e9e00090-66fb-d2a4-c90f-1d12344f7788@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Steven Rostedt (VMware) | 82a2f88458 |
tools lib traceevent: Fix "robust" test of do_generate_dynamic_list_file
The tools/lib/traceevent/Makefile had a test added to it to detect a failure
of the "nm" when making the dynamic list file (whatever that is). The
problem is that the test sorts the values "U W w" and some versions of sort
will place "w" ahead of "W" (even though it has a higher ASCII value, and
break the test.
Add 'tr "w" "W"' to merge the two and not worry about the ordering.
Reported-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michal rarek <mmarek@suse.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org
Fixes:
|
|
Adrian Hunter | 3c84e65a53 |
perf evsel: Add comment for 'idx' member in 'struct perf_sample_id
The 'idx' member was added as preparation for AUX area sampling. Add a comment to describe why. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/83ff264f-84c3-5372-8976-dd9293d20c6f@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 0ac10d87a5 |
tools arch x86: Sync asm/cpufeatures.h with the with the kernel
To pick up the changes in: |
|
Arnaldo Carvalho de Melo | b658911731 |
tools headers: Synchronize linux/bits.h with the kernel sources
To pick up the changes in this cset:
|
|
Arnaldo Carvalho de Melo | aaa6ef8aa8 |
tools headers: Grab copy of linux/const.h, needed by linux/bits.h
So that can update the copy of linux/bits.h that now uses macros defined in const.h and that are not available in older systems. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-c2qfcbl58hxyfb5u5xivp7is@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 146dc30363 |
perf tools: tools/include should come before tools/uapi/include
The next cset will grap const.h copies from the kernel to keep bits.h in sync as it started to use linux/const.h, that in turn includes uapi/linux/const.h. So now we have a file with the same name in tools/include and tools/uapi/include, and one includes the other, we need to have tools/include/uapi/ after tools/include/ for this to work, fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-qzjqxa1wdrt51kwadyqawnuj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 6e98bc349e |
tools headers: Add limits.h to access __WORDSIZE
We need to make sure limits.h is included before checking if we can use __WORDSIZE, do it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-5yfoed4rnsck2n3cwhm9mvth@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Linus Torvalds | 85d8d3b172 |
- A few fixes for the userspace hyper-v tools from Adrian Vladu.
- A fix for the hyper-v MAINTAINERs entry from Lan Tianyu. - Fix for SPDX license identifier in the userspace tools from Nishad Kamdar. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE4n5dijQDou9mhzu83qZv95d3LNwFAl1YXCEACgkQ3qZv95d3 LNy/cxAAjyyxIPmu7bUi9jN9YHbEQjCRQJBMvm+/hfqoiiWZa/GglfWw5EY+zAJi pHzWyMiOTeubgiMpA17yE0QK6iigPcxN7mOmcrmuWuKIgJthLZFXPhaCrx8apEty Ovm9qNjFulpBks/U2r+Pfi8rj7gLqJ5a5qGydYvlrJOTfIRXDYU93HAAdJutraqF w34sWxpQjKaTxva+JK7Em2/UCXrCNQ8Tzz7E3cLf20Z+PDGgkHAIWF8p+SAlosOd GKzYrW8z7yr1pwcwyLW6yEMlKMENq66pRkk1s0oUbwgQuCmb8HllWazIRkXAAHiD 1I6r4A2aH7VemBxGKoAH0ENXB5MX5hTmyjOXONM4wqa/24nQsm9mjpuDY/aIRngZ 3qCsQ/4xpZpKJxccY7DzAcpT7VPSpMM9patAk2C+QXbYIg/HpWfY+42OsnwjnjcE D89azjnhIeyOWOFvKxD187hPXd/lfoReC+czCisJvHxR4OPOP1chIUx6VZ25e91d gzJqggippL5+QgSt2pYtf6INlsMThvW/2/LLXGxR/r+CEpVAfXFlLBoz7OfwElZ2 m2vdbwo+0aIOVHL51594Wc78yZM2EQxlMNwt8l7mKTDxxwSlIcC4VHX8r46Df8AA GgFSG2pnMNID/a9Rhr7KWzXnaBlALKfGKypz9BPtLKasHal32X8= =iKcM -----END PGP SIGNATURE----- Merge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull Hyper-V fixes from Sasha Levin: - A few fixes for the userspace hyper-v tools from Adrian Vladu. - A fix for the hyper-v MAINTAINERs entry from Lan Tianyu. - Fix for SPDX license identifier in the userspace tools from Nishad Kamdar. * tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: MAINTAINERS: Fix Hyperv vIOMMU driver file name tools: hv: Use the correct style for SPDX License Identifier tools: hv: fix typos in toolchain tools: hv: fix KVP and VSS daemons exit code tools: hv: fixed Python pep8/flake8 warnings for lsvmbus |
|
Adrian Vladu | 2d35c66036 |
tools: hv: fix typos in toolchain
Fix typos in the HyperV toolchain. Signed-off-by: Adrian Vladu <avladu@cloudbasesolutions.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sasha Levin <sashal@kernel.org> Cc: Alessandro Pilotti <apilotti@cloudbasesolutions.com> Signed-off-by: Sasha Levin <sashal@kernel.org> |
|
Adrian Vladu | b099515607 |
tools: hv: fix KVP and VSS daemons exit code
HyperV KVP and VSS daemons should exit with 0 when the '--help' or '-h' flags are used. Signed-off-by: Adrian Vladu <avladu@cloudbasesolutions.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sasha Levin <sashal@kernel.org> Cc: Alessandro Pilotti <apilotti@cloudbasesolutions.com> Signed-off-by: Sasha Levin <sashal@kernel.org> |
|
Adrian Vladu | 5912e791f3 |
tools: hv: fixed Python pep8/flake8 warnings for lsvmbus
Fixed pep8/flake8 python style code for lsvmbus tool. The TAB indentation was on purpose ignored (pep8 rule W191) to make sure the code is complying with the Linux code guideline. The following command doe not show any warnings now: pep8 --ignore=W191 lsvmbus flake8 --ignore=W191 lsvmbus Signed-off-by: Adrian Vladu <avladu@cloudbasesolutions.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sasha Levin <sashal@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Alessandro Pilotti <apilotti@cloudbasesolutions.com> Signed-off-by: Sasha Levin <sashal@kernel.org> |
|
John Keeping | e2736219e6 |
perf unwind: Remove unnecessary test
If dwarf_callchain_users is false, then unwind__prepare_access() will not set unwind_libunwind_ops so the remaining test here is sufficient. Signed-off-by: John Keeping <john@metanate.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: john keeping <john@metanate.com> Link: http://lkml.kernel.org/r/20190815100146.28842-3-john@metanate.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
John Keeping | e8ba2906f6 |
perf unwind: Fix libunwind when tid != pid
Commit |
|
John Keeping | ab6cd0e527 |
perf map: Use zalloc for map_groups
In the next commit we will add new fields to map_groups and we need these to be null if no value is assigned. The simplest way to achieve this is to request zeroed memory from the allocator. Signed-off-by: John Keeping <john@metanate.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: john keeping <john@metanate.com> Link: http://lkml.kernel.org/r/20190815100146.28842-1-john@metanate.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | ef4b1a539f |
perf report: Add --switch-on/--switch-off events
Since 'perf top' shares the histogram browser with 'perf report', then the same explanation in the previous cset applies. An additional example uses a pair of SDT events available for systemtap: # perf probe --exec=/usr/bin/stap '%*:*' Added new events: sdt_stap:benchmark__thread__start (on %* in /usr/bin/stap) sdt_stap:benchmark (on %* in /usr/bin/stap) sdt_stap:benchmark__thread__end (on %* in /usr/bin/stap) sdt_stap:pass6__start (on %* in /usr/bin/stap) sdt_stap:pass6__end (on %* in /usr/bin/stap) sdt_stap:pass5__start (on %* in /usr/bin/stap) sdt_stap:pass5__end (on %* in /usr/bin/stap) sdt_stap:pass0__start (on %* in /usr/bin/stap) sdt_stap:pass0__end (on %* in /usr/bin/stap) sdt_stap:pass1a__start (on %* in /usr/bin/stap) sdt_stap:pass1b__start (on %* in /usr/bin/stap) sdt_stap:pass1__end (on %* in /usr/bin/stap) sdt_stap:pass2__start (on %* in /usr/bin/stap) sdt_stap:pass2__end (on %* in /usr/bin/stap) sdt_stap:pass3__start (on %* in /usr/bin/stap) sdt_stap:pass3__end (on %* in /usr/bin/stap) sdt_stap:pass4__start (on %* in /usr/bin/stap) sdt_stap:pass4__end (on %* in /usr/bin/stap) sdt_stap:benchmark__start (on %* in /usr/bin/stap) sdt_stap:benchmark__end (on %* in /usr/bin/stap) sdt_stap:cache__get (on %* in /usr/bin/stap) sdt_stap:cache__clean (on %* in /usr/bin/stap) sdt_stap:cache__add__module (on %* in /usr/bin/stap) sdt_stap:cache__add__source (on %* in /usr/bin/stap) sdt_stap:stap_system__complete (on %* in /usr/bin/stap) sdt_stap:stap_system__start (on %* in /usr/bin/stap) sdt_stap:stap_system__spawn (on %* in /usr/bin/stap) sdt_stap:stap_system__fork (on %* in /usr/bin/stap) sdt_stap:intern_string (on %* in /usr/bin/stap) sdt_stap:client__start (on %* in /usr/bin/stap) sdt_stap:client__end (on %* in /usr/bin/stap) You can now use it in all perf tools, such as: perf record -e sdt_stap:client__end -aR sleep 1 # From these we're use the two below to run systemtap's test suite: # perf record -e sdt_stap:pass2__*,cycles:P make installcheck > /dev/null ^C[ perf record: Woken up 8 times to write data ] [ perf record: Captured and wrote 2.691 MB perf.data (39638 samples) ] Terminated # perf script | grep sdt_stap stap 28979 [000] 19424.302660: sdt_stap:pass2__start: (561b9a537de3) arg1=140730364262544 stap 28979 [000] 19424.333083: sdt_stap:pass2__end: (561b9a53a9e1) arg1=140730364262544 stap 29045 [006] 19424.933460: sdt_stap:pass2__start: (563edddcede3) arg1=140722674883152 stap 29045 [006] 19424.963794: sdt_stap:pass2__end: (563edddd19e1) arg1=140722674883152 # perf script | grep cycles | wc -l 39634 # Looking at the whole perf.data file: [root@quaco testsuite]# perf report | grep cycles:P -A25 # Samples: 39K of event 'cycles:P' # Event count (approx.): 34044267368 # # Overhead Command Shared Object Symbol # ........ ....... .................... ................................ # 3.50% cc1 cc1 [.] ht_lookup_with_hash 3.04% cc1 cc1 [.] _cpp_lex_token 2.11% cc1 cc1 [.] ggc_internal_alloc 1.83% cc1 cc1 [.] cpp_get_token_with_location 1.68% cc1 libc-2.29.so [.] _int_malloc 1.41% cc1 cc1 [.] linemap_position_for_column 1.25% cc1 cc1 [.] ggc_internal_cleared_alloc 1.20% cc1 cc1 [.] c_lex_with_flags 1.18% cc1 cc1 [.] get_combined_adhoc_loc 1.05% cc1 libc-2.29.so [.] malloc 1.01% cc1 libc-2.29.so [.] _int_free 0.96% stap stap [.] std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Identity, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, stringtable_hash, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_insert<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__detail::_AllocNode<std::allocator<std::__detail::_Hash_node<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, true> > > > 0.78% stap stap [.] lexer::scan 0.74% cc1 cc1 [.] _cpp_lex_direct 0.70% cc1 cc1 [.] pop_scope 0.70% cc1 cc1 [.] c_parser_declspecs 0.69% stap libc-2.29.so [.] _int_malloc 0.68% cc1 cc1 [.] htab_find_slot 0.68% cc1 [kernel.vmlinux] [k] prepare_exit_to_usermode 0.64% cc1 [kernel.vmlinux] [k] clear_page_erms [root@quaco testsuite]# And now only what happens in slices demarcated by those start/end SDT events: [root@quaco testsuite]# perf report --switch-on=sdt_stap:pass2__start --switch-off=sdt_stap:pass2__end | grep cycles:P -A100 # Samples: 240 of event 'cycles:P' # Event count (approx.): 206491934 # # Overhead Command Shared Object Symbol # ........ ....... ................... ................................................ # 38.99% stap stap [.] systemtap_session::register_library_aliases 19.47% stap stap [.] match_key::operator< 15.01% stap libc-2.29.so [.] __memcmp_avx2_movbe 5.19% stap libc-2.29.so [.] _int_malloc 2.50% stap libstdc++.so.6.0.26 [.] std::_Rb_tree_insert_and_rebalance 2.30% stap stap [.] match_node::build_no_more 2.07% stap libc-2.29.so [.] malloc 1.66% stap stap [.] std::_Rb_tree<match_key, std::pair<match_key const, match_node*>, std::_Select1st<std::pair<match_key const, match_node*> >, std::less<match_key>, std::allocator<std::pair<match_key const, match_node*> > >::find 1.66% stap stap [.] match_node::bind 1.58% stap [kernel.vmlinux] [k] prepare_exit_to_usermode 1.17% stap [kernel.vmlinux] [k] native_irq_return_iret 0.87% stap stap [.] 0x0000000000032ec4 0.77% stap libstdc++.so.6.0.26 [.] std::_Rb_tree_increment 0.47% stap stap [.] std::vector<derived_probe_builder*, std::allocator<derived_probe_builder*> >::_M_realloc_insert<derived_probe_builder* const&> 0.47% stap [kernel.vmlinux] [k] get_page_from_freelist 0.47% stap [kernel.vmlinux] [k] swapgs_restore_regs_and_return_to_usermode 0.47% stap [kernel.vmlinux] [k] do_user_addr_fault 0.46% stap [kernel.vmlinux] [k] __pagevec_lru_add_fn 0.46% stap stap [.] std::_Rb_tree<match_key, std::pair<match_key const, match_node*>, std::_Select1st<std::pair<match_key const, match_node*> >, std::less<match_key>, std::allocator<std::pair<match_key const, match_node*> > >::_M_emplace_unique<std::pair<match_key, match_node*> > 0.42% stap libstdc++.so.6.0.26 [.] 0x00000000000c18fa 0.40% stap [kernel.vmlinux] [k] interrupt_entry 0.40% stap [kernel.vmlinux] [k] update_load_avg 0.40% stap [kernel.vmlinux] [k] __intel_pmu_disable_all 0.40% stap [kernel.vmlinux] [k] clear_page_erms 0.39% stap [kernel.vmlinux] [k] __mod_node_page_state 0.39% stap [kernel.vmlinux] [k] error_entry 0.39% stap [kernel.vmlinux] [k] sync_regs 0.38% stap [kernel.vmlinux] [k] __handle_mm_fault 0.38% stap stap [.] derive_probes # # (Tip: System-wide collection from all CPUs: perf record -a) # [root@quaco testsuite]# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-408hvumcnyn93a0auihnawew@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 2f53ae347f |
perf top: Add --switch-on/--switch-off events
Just like 'perf trace' and 'perf script', should be useful for instance to only consider samples after the initialization phase of some workload. The man page has some examples and considerations about its current interface, that still doesn't handle the on/off events in a special way, behaving just like when multiple events are specified, i.e.: - In non-group mode (when the event list is not enclosed in {}) show a a menu to allow choosing which event the user wants to see in the histograms browser - In group mode, be it using {} or asking for --group, show one column per event. Try for instance: # perf top -e '{cycles,instructions,probe:icmp_rcv}' --switch-on=probe:icmp_rcv Replace probe:icmp_rcv, that I put in place using: # perf probe icmp_rcv:59 To hit when broadcast packets arrive, with a probe installed after an initialization phase is over or after some other point of interest, some garbage collection, etc, and also use --switch-off, for instance, on a probe installed after said garbage collection is over. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-c7q7qjeqtyvc9mkeipxza6ne@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 22ac4318ad |
perf trace: Add --switch-on/--switch-off events
Just like with 'perf script': # perf trace -e sched:*,syscalls:*sleep* sleep 1 0.000 :28345/28345 sched:sched_waking:comm=perf pid=28346 prio=120 target_cpu=005 0.005 :28345/28345 sched:sched_wakeup:perf:28346 [120] success=1 CPU:005 0.383 sleep/28346 sched:sched_process_exec:filename=/usr/bin/sleep pid=28346 old_pid=28346 0.613 sleep/28346 sched:sched_stat_runtime:comm=sleep pid=28346 runtime=607375 [ns] vruntime=23289041218 [ns] 0.689 sleep/28346 syscalls:sys_enter_nanosleep:rqtp: 0x7ffc491789b0 0.693 sleep/28346 sched:sched_stat_runtime:comm=sleep pid=28346 runtime=72021 [ns] vruntime=23289113239 [ns] 0.694 sleep/28346 sched:sched_switch:sleep:28346 [120] S ==> swapper/5:0 [120] 1000.787 :0/0 sched:sched_waking:comm=sleep pid=28346 prio=120 target_cpu=005 1000.824 :0/0 sched:sched_wakeup:sleep:28346 [120] success=1 CPU:005 1000.908 sleep/28346 syscalls:sys_exit_nanosleep:0x0 1001.218 sleep/28346 sched:sched_process_exit:comm=sleep pid=28346 prio=120 # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep sleep 1 0.000 sleep/28349 sched:sched_stat_runtime:comm=sleep pid=28349 runtime=603036 [ns] vruntime=23873537697 [ns] 0.001 sleep/28349 sched:sched_switch:sleep:28349 [120] S ==> swapper/4:0 [120] 1000.392 :0/0 sched:sched_waking:comm=sleep pid=28349 prio=120 target_cpu=004 1000.443 :0/0 sched:sched_wakeup:sleep:28349 [120] success=1 CPU:004 1000.540 sleep/28349 syscalls:sys_exit_nanosleep:0x0 1000.852 sleep/28349 sched:sched_process_exit:comm=sleep pid=28349 prio=120 # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep sleep 1 0.000 sleep/28352 sched:sched_stat_runtime:comm=sleep pid=28352 runtime=610543 [ns] vruntime=24811686681 [ns] 0.001 sleep/28352 sched:sched_switch:sleep:28352 [120] S ==> swapper/0:0 [120] 1000.397 :0/0 sched:sched_waking:comm=sleep pid=28352 prio=120 target_cpu=000 1000.440 :0/0 sched:sched_wakeup:sleep:28352 [120] success=1 CPU:000 # # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep --show-on-off sleep 1 0.000 sleep/28367 syscalls:sys_enter_nanosleep:rqtp: 0x7fffd1a25fc0 0.004 sleep/28367 sched:sched_stat_runtime:comm=sleep pid=28367 runtime=628760 [ns] vruntime=22170052672 [ns] 0.005 sleep/28367 sched:sched_switch:sleep:28367 [120] S ==> swapper/2:0 [120] 1000.367 :0/0 sched:sched_waking:comm=sleep pid=28367 prio=120 target_cpu=002 1000.412 :0/0 sched:sched_wakeup:sleep:28367 [120] success=1 CPU:002 1000.512 sleep/28367 syscalls:sys_exit_nanosleep:0x0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-t3ngpt1brcc1fm9gep9gxm4q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 8b3c9ea7bf |
perf evswitch: Add hint when not finding specified on/off events
If the user specifies a on or off switch event and it isn't in the perf.data file, provide a hint about how to see the events in the perf.data evlist: # perf script --switch-on=syscall:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep ERROR: event_on event not found (syscall:sys_enter_nanosleep) HINT: use 'perf evlist' to see the available event names # # perf evlist sched:sched_kthread_stop sched:sched_kthread_stop_ret sched:sched_waking sched:sched_wakeup sched:sched_wakeup_new sched:sched_switch sched:sched_migrate_task sched:sched_process_free sched:sched_process_exit sched:sched_wait_task sched:sched_process_wait sched:sched_process_fork sched:sched_process_exec sched:sched_stat_wait sched:sched_stat_sleep sched:sched_stat_iowait sched:sched_stat_blocked sched:sched_stat_runtime sched:sched_pi_setprio sched:sched_move_numa sched:sched_stick_numa sched:sched_swap_numa sched:sched_wake_idle_without_ipi syscalls:sys_enter_clock_nanosleep syscalls:sys_exit_clock_nanosleep syscalls:sys_enter_nanosleep syscalls:sys_exit_nanosleep # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events # # perf script --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep sleep 20919 [001] 109866.144411: sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [ns] sleep 20919 [001] 109866.144412: sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120] swapper 0 [001] 109867.144568: sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001 swapper 0 [001] 109867.144586: sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-iijjvdlyad973oskdq8gmi5w@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | c9a4269930 |
perf evswitch: Move enoent error message printing to separate function
Allows adding hints there, will be done in followup patch. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-1kvrdi7weuz3hxycwvarcu6v@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 124e02be72 |
perf evswitch: Introduce init() method to set the on/off evsels from the command line
Another step in having all the boilerplate in just one place to then use in the other tools. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-snreb1wmwyjei3eefwotxp1l@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | add3a719c9 |
perf evswitch: Introduce OPTS_EVSWITCH() for cmd line processing
All tools will want those, so provide a convenient way to get them. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-v16pe3sbf3wjmn152u18f649@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 0b495b1215 |
perf evswitch: Add the names of on/off events
So that we can have macros for the OPT_ entries and also for finding those in an evlist, this way other tools will use this very easily. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-q0og1xoqqi0w38ve5u0a43k2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 8829e56fa0 |
perf evswitch: Move switch logic to use in other tools
Now other tools that want switching can use an evswitch for that, just set it up and add it to the PERF_RECORD_SAMPLE processing function. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-b1trj1q97qwfv251l66q3noj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | d236044272 |
perf evswitch: Move struct to a separate header to use in other tools
Now that we see that the simple userspace-based "slicing" of events using delimiter events ("markers") works, lets move it to a separate header to make it available to other tools, next step will be having the switch on/off check done at the PERF_RECORD_SAMPLE processing function moved too. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-z0cyi9ifzlr37cedr9xztc1k@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | dd41f660c0 |
perf script: Allow specifying event to switch off processing of other events
Counterpart of --switch-on: # perf record -e sched:*,syscalls:sys_*_nanosleep sleep 1 [ perf record: Woken up 36 times to write data ] [ perf record: Captured and wrote 0.032 MB perf.data (10 samples) ] # # perf script :20918 20918 [002] 109866.143696: sched:sched_waking: comm=perf pid=20919 prio=120 target_cpu=001 :20918 20918 [002] 109866.143702: sched:sched_wakeup: perf:20919 [120] success=1 CPU:001 sleep 20919 [001] 109866.144081: sched:sched_process_exec: filename=/usr/bin/sleep pid=20919 old_pid=20919 sleep 20919 [001] 109866.144408: syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000 sleep 20919 [001] 109866.144411: sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n> sleep 20919 [001] 109866.144412: sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120] swapper 0 [001] 109867.144568: sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001 swapper 0 [001] 109867.144586: sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001 sleep 20919 [001] 109867.144614: syscalls:sys_exit_nanosleep: 0x0 sleep 20919 [001] 109867.144753: sched:sched_process_exit: comm=sleep pid=20919 prio=120 # # perf script --switch-off syscalls:sys_exit_nanosleep :20918 20918 [002] 109866.143696: sched:sched_waking: comm=perf pid=20919 prio=120 target_cpu=001 :20918 20918 [002] 109866.143702: sched:sched_wakeup: perf:20919 [120] success=1 CPU:001 sleep 20919 [001] 109866.144081: sched:sched_process_exec: filename=/usr/bin/sleep pid=20919 old_pid=20919 sleep 20919 [001] 109866.144408: syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000 sleep 20919 [001] 109866.144411: sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n> sleep 20919 [001] 109866.144412: sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120] swapper 0 [001] 109867.144568: sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001 swapper 0 [001] 109867.144586: sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001 sleep 20919 [001] 109867.144753: sched:sched_process_exit: comm=sleep pid=20919 prio=120 # # perf script --switch-on syscalls:sys_enter_nanosleep --switch-off syscalls:sys_exit_nanosleep sleep 20919 [001] 109866.144411: sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n> sleep 20919 [001] 109866.144412: sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120] swapper 0 [001] 109867.144568: sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001 swapper 0 [001] 109867.144586: sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001 # # perf script --switch-on syscalls:sys_enter_nanosleep --switch-off syscalls:sys_exit_nanosleep --show-on-off sleep 20919 [001] 109866.144408: syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000 sleep 20919 [001] 109866.144411: sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n> sleep 20919 [001] 109866.144412: sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120] swapper 0 [001] 109867.144568: sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001 swapper 0 [001] 109867.144586: sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001 sleep 20919 [001] 109867.144614: syscalls:sys_exit_nanosleep: 0x0 # Now think about using this together with 'perf probe' to create custom on/off events in your app :-) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-li3j01c4tmj9kw6ydsl8swej@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 6469eb6dff |
perf script: Allow showing the --switch-on event
One may want to see the --switch-on event as well, allow for that, using the previous cset example: # perf script --switch-on syscalls:sys_enter_nanosleep --show-on-off sleep 13638 [001] 108237.582286: syscalls:sys_enter_nanosleep: rqtp: 0x7fff1948ac40, rmtp: 0x00000000 sleep 13638 [001] 108237.582289: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns] sleep 13638 [001] 108237.582291: sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120] swapper 0 [001] 108238.582428: sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001 swapper 0 [001] 108238.582458: sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001 sleep 13638 [001] 108238.582698: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns] sleep 13638 [001] 108238.582782: sched:sched_process_exit: comm=sleep pid=13638 prio=120 # # perf script --switch-on syscalls:sys_enter_nanosleep sleep 13638 [001] 108237.582289: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns] sleep 13638 [001] 108237.582291: sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120] swapper 0 [001] 108238.582428: sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001 swapper 0 [001] 108238.582458: sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001 sleep 13638 [001] 108238.582698: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns] sleep 13638 [001] 108238.582782: sched:sched_process_exit: comm=sleep pid=13638 prio=120 # Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Cc: Florian Weimer <fweimer@redhat.com> Link: https://lkml.kernel.org/n/tip-0omwwoywj1v63gu8cz0tr0cy@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | f90a24171a |
perf script: Allow specifying event to switch on processing of other events
Sometime we want to only consider events after something happens, so allow discarding events till such events is found, e.g.: Record all scheduler tracepoints and the sys_enter_nanosleep syscall event for the 'sleep 1' workload: # perf record -e sched:*,syscalls:sys_enter_nanosleep sleep 1 [ perf record: Woken up 31 times to write data ] [ perf record: Captured and wrote 0.032 MB perf.data (10 samples) ] # So we have these events in the generated perf data file: # perf evlist sched:sched_kthread_stop sched:sched_kthread_stop_ret sched:sched_waking sched:sched_wakeup sched:sched_wakeup_new sched:sched_switch sched:sched_migrate_task sched:sched_process_free sched:sched_process_exit sched:sched_wait_task sched:sched_process_wait sched:sched_process_fork sched:sched_process_exec sched:sched_stat_wait sched:sched_stat_sleep sched:sched_stat_iowait sched:sched_stat_blocked sched:sched_stat_runtime sched:sched_pi_setprio sched:sched_move_numa sched:sched_stick_numa sched:sched_swap_numa sched:sched_wake_idle_without_ipi syscalls:sys_enter_nanosleep # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events # Then show all of the events that actually took place in this 'perf record' session: # perf script :13637 13637 [002] 108237.581529: sched:sched_waking: comm=perf pid=13638 prio=120 target_cpu=001 :13637 13637 [002] 108237.581537: sched:sched_wakeup: perf:13638 [120] success=1 CPU:001 sleep 13638 [001] 108237.581992: sched:sched_process_exec: filename=/usr/bin/sleep pid=13638 old_pid=13638 sleep 13638 [001] 108237.582286: syscalls:sys_enter_nanosleep: rqtp: 0x7fff1948ac40, rmtp: 0x00000000 sleep 13638 [001] 108237.582289: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns] sleep 13638 [001] 108237.582291: sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120] swapper 0 [001] 108238.582428: sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001 swapper 0 [001] 108238.582458: sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001 sleep 13638 [001] 108238.582698: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns] sleep 13638 [001] 108238.582782: sched:sched_process_exit: comm=sleep pid=13638 prio=120 # Now lets see only the ones that took place after a certain "marker": # perf script --switch-on syscalls:sys_enter_nanosleep sleep 13638 [001] 108237.582289: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns] sleep 13638 [001] 108237.582291: sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120] swapper 0 [001] 108238.582428: sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001 swapper 0 [001] 108238.582458: sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001 sleep 13638 [001] 108238.582698: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns] sleep 13638 [001] 108238.582782: sched:sched_process_exit: comm=sleep pid=13638 prio=120 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-f1oo0ufdhrkx6nhy2lj1ierm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Haiyan Song | 11e54d35e6 |
perf vendor events intel: Add Tremontx event file v1.02
Add a Intel event file for perf. Signed-off-by: Haiyan Song <haiyanx.song@intel.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190815035942.30602-1-haiyanx.song@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 1cd8fa288e |
perf ui: No need to set ui_browser to 1 twice
We need to do it only when fallbacking from GTK to the TUI. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-dda0acxqef1k72n9z4myjbjt@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Tan Xiaojun | 0a4d8fb229 |
perf record: Support aarch64 random socket_id assignment
Same as in the commit
|
|
Vince Weaver | 3143906c27 |
perf.data documentation: Clarify HEADER_SAMPLE_TOPOLOGY format
The perf.data file format documentation for HEADER_SAMPLE_TOPOLOGY specifies the layout in a confusing manner that doesn't match the rest of the document. This patch attempts to describe things consistent with the rest of the file. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Chong Jiang <chongjiang@chromium.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1908011425240.14303@macbook-air Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Andy Shevchenko | 38fe26b46f |
tools: Keep list of tools in alphabetical order
When `make help` is executed it lists the possible tools to build, though couple of entries is kept unordered. Fix it here. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Yonghong Song <yhs@fb.com> Link: https://lkml.kernel.org/n/tip-0ke3p64ksa0hnbueh52n3v3q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | acb9f2d475 |
perf evsel: Provide meaningful warning when trying to use 'aux_output' on older kernels
Just like we do with the 'write_backwards' feature: Before: # perf record -e {intel_pt/branch=0/,cycles/aux-output/ppp} uname Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles/aux-output/ppp). /bin/dmesg | grep -i perf may provide additional information. # After: # perf record -e {intel_pt/branch=0/,cycles/aux-output/ppp} uname Error: The 'aux_output' feature is not supported, update the kernel. # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/n/tip-wgjsjroe1e150c0metgwmqwd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Adrian Hunter | 243384dd25 |
perf intel-pt: Add brief documentation for PEBS via Intel PT
Document how to select PEBS via Intel PT and how to display synthesized PEBS samples. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190806084606.4021-8-alexander.shishkin@linux.intel.com Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> [ Update the example to use a group with intel_pt// as the group leader, as per Alex comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Adrian Hunter | 1b9921546a |
perf tools: Add aux-output config term
Expose the aux_output attribute flag to the user to configure, by adding a config term 'aux-output'. For events that support it, selection of 'aux-output' causes the generation of AUX records instead of event records. This requires that an AUX area event is also provided. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190806084606.4021-7-alexander.shishkin@linux.intel.com Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |