linux/tools/perf/Documentation/perf-diff.txt

228 lines
6.5 KiB
Plaintext
Raw Normal View History

perf diff: Introduce tool to show performance difference I guess it is enough to show some examples: [root@doppio linux-2.6-tip]# rm -f perf.data* [root@doppio linux-2.6-tip]# ls -la perf.data* ls: cannot access perf.data*: No such file or directory [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74440 2009-12-14 20:03 perf.data [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2692 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74280 2009-12-14 20:03 perf.data -rw------- 1 root root 74440 2009-12-14 20:03 perf.data.old [root@doppio linux-2.6-tip]# perf diff | head -5 1 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 -15307806 [kernel.kallsyms] __kmalloc 3 +1 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -p | head -5 1 +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 [kernel.kallsyms] __kmalloc 3 +1 /lib64/libc-2.10.1.so __GI_memmove 4 +4 /lib64/libc-2.10.1.so _int_malloc 5 +7 -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -v | head -5 1 361449551 326454971 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 151009241 135701435 -15307806 [kernel.kallsyms] __kmalloc 3 +1 101805328 105471269 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 78041440 101550435 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 59536172 98074985 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -vp | head -5 1 9.00% 8.00% +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 3.00% 3.00% [kernel.kallsyms] __kmalloc 3 +1 2.00% 2.00% /lib64/libc-2.10.1.so __GI_memmove 4 +4 2.00% 2.00% /lib64/libc-2.10.1.so _int_malloc 5 +7 1.00% 2.00% -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# This should be enough for diffs where the system is non volatile, i.e. when one doesn't updates binaries. For volatile environments, stay tuned for the next perf tool feature: a buildid cache populated by 'perf record', managed by 'perf buildid-cache' a-la ccache, and used by all the report tools. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> LKML-Reference: <1260828571-3613-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 06:09:31 +08:00
perf-diff(1)
============
perf diff: Introduce tool to show performance difference I guess it is enough to show some examples: [root@doppio linux-2.6-tip]# rm -f perf.data* [root@doppio linux-2.6-tip]# ls -la perf.data* ls: cannot access perf.data*: No such file or directory [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74440 2009-12-14 20:03 perf.data [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2692 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74280 2009-12-14 20:03 perf.data -rw------- 1 root root 74440 2009-12-14 20:03 perf.data.old [root@doppio linux-2.6-tip]# perf diff | head -5 1 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 -15307806 [kernel.kallsyms] __kmalloc 3 +1 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -p | head -5 1 +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 [kernel.kallsyms] __kmalloc 3 +1 /lib64/libc-2.10.1.so __GI_memmove 4 +4 /lib64/libc-2.10.1.so _int_malloc 5 +7 -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -v | head -5 1 361449551 326454971 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 151009241 135701435 -15307806 [kernel.kallsyms] __kmalloc 3 +1 101805328 105471269 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 78041440 101550435 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 59536172 98074985 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -vp | head -5 1 9.00% 8.00% +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 3.00% 3.00% [kernel.kallsyms] __kmalloc 3 +1 2.00% 2.00% /lib64/libc-2.10.1.so __GI_memmove 4 +4 2.00% 2.00% /lib64/libc-2.10.1.so _int_malloc 5 +7 1.00% 2.00% -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# This should be enough for diffs where the system is non volatile, i.e. when one doesn't updates binaries. For volatile environments, stay tuned for the next perf tool feature: a buildid cache populated by 'perf record', managed by 'perf buildid-cache' a-la ccache, and used by all the report tools. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> LKML-Reference: <1260828571-3613-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 06:09:31 +08:00
NAME
----
perf-diff - Read perf.data files and display the differential profile
perf diff: Introduce tool to show performance difference I guess it is enough to show some examples: [root@doppio linux-2.6-tip]# rm -f perf.data* [root@doppio linux-2.6-tip]# ls -la perf.data* ls: cannot access perf.data*: No such file or directory [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74440 2009-12-14 20:03 perf.data [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2692 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74280 2009-12-14 20:03 perf.data -rw------- 1 root root 74440 2009-12-14 20:03 perf.data.old [root@doppio linux-2.6-tip]# perf diff | head -5 1 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 -15307806 [kernel.kallsyms] __kmalloc 3 +1 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -p | head -5 1 +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 [kernel.kallsyms] __kmalloc 3 +1 /lib64/libc-2.10.1.so __GI_memmove 4 +4 /lib64/libc-2.10.1.so _int_malloc 5 +7 -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -v | head -5 1 361449551 326454971 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 151009241 135701435 -15307806 [kernel.kallsyms] __kmalloc 3 +1 101805328 105471269 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 78041440 101550435 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 59536172 98074985 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -vp | head -5 1 9.00% 8.00% +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 3.00% 3.00% [kernel.kallsyms] __kmalloc 3 +1 2.00% 2.00% /lib64/libc-2.10.1.so __GI_memmove 4 +4 2.00% 2.00% /lib64/libc-2.10.1.so _int_malloc 5 +7 1.00% 2.00% -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# This should be enough for diffs where the system is non volatile, i.e. when one doesn't updates binaries. For volatile environments, stay tuned for the next perf tool feature: a buildid cache populated by 'perf record', managed by 'perf buildid-cache' a-la ccache, and used by all the report tools. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> LKML-Reference: <1260828571-3613-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 06:09:31 +08:00
SYNOPSIS
--------
[verse]
'perf diff' [baseline file] [data file1] [[data file2] ... ]
perf diff: Introduce tool to show performance difference I guess it is enough to show some examples: [root@doppio linux-2.6-tip]# rm -f perf.data* [root@doppio linux-2.6-tip]# ls -la perf.data* ls: cannot access perf.data*: No such file or directory [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74440 2009-12-14 20:03 perf.data [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2692 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74280 2009-12-14 20:03 perf.data -rw------- 1 root root 74440 2009-12-14 20:03 perf.data.old [root@doppio linux-2.6-tip]# perf diff | head -5 1 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 -15307806 [kernel.kallsyms] __kmalloc 3 +1 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -p | head -5 1 +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 [kernel.kallsyms] __kmalloc 3 +1 /lib64/libc-2.10.1.so __GI_memmove 4 +4 /lib64/libc-2.10.1.so _int_malloc 5 +7 -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -v | head -5 1 361449551 326454971 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 151009241 135701435 -15307806 [kernel.kallsyms] __kmalloc 3 +1 101805328 105471269 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 78041440 101550435 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 59536172 98074985 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -vp | head -5 1 9.00% 8.00% +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 3.00% 3.00% [kernel.kallsyms] __kmalloc 3 +1 2.00% 2.00% /lib64/libc-2.10.1.so __GI_memmove 4 +4 2.00% 2.00% /lib64/libc-2.10.1.so _int_malloc 5 +7 1.00% 2.00% -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# This should be enough for diffs where the system is non volatile, i.e. when one doesn't updates binaries. For volatile environments, stay tuned for the next perf tool feature: a buildid cache populated by 'perf record', managed by 'perf buildid-cache' a-la ccache, and used by all the report tools. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> LKML-Reference: <1260828571-3613-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 06:09:31 +08:00
DESCRIPTION
-----------
This command displays the performance difference amongst two or more perf.data
files captured via perf record.
perf diff: Introduce tool to show performance difference I guess it is enough to show some examples: [root@doppio linux-2.6-tip]# rm -f perf.data* [root@doppio linux-2.6-tip]# ls -la perf.data* ls: cannot access perf.data*: No such file or directory [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74440 2009-12-14 20:03 perf.data [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2692 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74280 2009-12-14 20:03 perf.data -rw------- 1 root root 74440 2009-12-14 20:03 perf.data.old [root@doppio linux-2.6-tip]# perf diff | head -5 1 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 -15307806 [kernel.kallsyms] __kmalloc 3 +1 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -p | head -5 1 +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 [kernel.kallsyms] __kmalloc 3 +1 /lib64/libc-2.10.1.so __GI_memmove 4 +4 /lib64/libc-2.10.1.so _int_malloc 5 +7 -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -v | head -5 1 361449551 326454971 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 151009241 135701435 -15307806 [kernel.kallsyms] __kmalloc 3 +1 101805328 105471269 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 78041440 101550435 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 59536172 98074985 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -vp | head -5 1 9.00% 8.00% +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 3.00% 3.00% [kernel.kallsyms] __kmalloc 3 +1 2.00% 2.00% /lib64/libc-2.10.1.so __GI_memmove 4 +4 2.00% 2.00% /lib64/libc-2.10.1.so _int_malloc 5 +7 1.00% 2.00% -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# This should be enough for diffs where the system is non volatile, i.e. when one doesn't updates binaries. For volatile environments, stay tuned for the next perf tool feature: a buildid cache populated by 'perf record', managed by 'perf buildid-cache' a-la ccache, and used by all the report tools. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> LKML-Reference: <1260828571-3613-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 06:09:31 +08:00
If no parameters are passed it will assume perf.data.old and perf.data.
perf diff: Make diff command work with evsel hists Putting 'perf diff' command back on track with the 'latest' evsel hists changes. Each evsel has its own 'hists' object gathering stats for the particular event. While currently counts are accumulated for the whole session regardless of the events diversification within compared sessions. The 'perf diff' command now outputs all matching events within compared sessions (with event name specified). The per event diff output stays the same. $ ./perf diff # Event 'cycles' # # Baseline Delta Shared Object Symbol # ........ .......... ................. .............................. # 0.00% +15.14% [kernel.kallsyms] [k] __wake_up 0.00% +13.38% [kernel.kallsyms] [k] ext4fs_dirhash ... SNIP 0.00% +0.42% [kernel.kallsyms] [k] local_clock 0.17% -0.05% [kernel.kallsyms] [k] native_write_msr_safe # Event 'faults' # # Baseline Delta Shared Object Symbol # ........ .......... ................. .............................. # 0.00% +79.12% ld-2.15.so [.] _dl_relocate_object 0.00% +11.62% ld-2.15.so [.] openaux Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1346946426-13496-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-06 23:46:55 +08:00
The differential profile is displayed only for events matching both
specified perf.data files.
perf diff: Support for different binaries Currently, the perf diff only works with same binaries. That's because it compares the symbol start address. It doesn't work if the perf.data comes from different binaries. This patch matches the symbol names. Actually, perf diff once intended to compare the symbol names. The commit as below can look for a pair by name. 604c5c92972d (perf diff: Change the default sort order to "dso,symbol") However, at that time, perf diff used a global list of dsos. That means the binaries which has same name can only be loaded once. That's a problem for comparing different binaries. For example, we have an old binary and an updated binary. They very likely have same name and most of the functions, so only dsos from old binary will be loaded. When processing the data from updated binary, perf still use the symbol information from old binary. That's wrong. Then the commit as below used IP to replace symbol name. 9c443dfdd31e ("perf diff: Fix support for all --sort combinations") >From that time, perf diff starts to compare the symbol address. The global dsos is discarded from a patch in 2010. a1645ce12adb ("perf: 'perf kvm' tool for monitoring guest performance from host") However, at that time, perf diff already compared by address. So perf diff cannot work for different binaries as well. This patch actually rolls back the perf diff to original design. The document is also changed, so everybody knows the original design is to compare the symbol names. Here are some examples: The only difference between example_v1.c and example_v2.c is the location of f2 and f3. There is no change in behavior, but the previous perf diff display the wrong differential profile. example_v1.c noinline void f3(void) { volatile int i; for (i = 0; i < 10000;) { if(i%2) i++; else i++; } } noinline void f2(void) { volatile int a = 100, b, c; for (b = 0; b < 10000; b++) c = a * b; } noinline void f1(void) { f2(); f3(); } int main() { int i; for (i = 0; i < 100000; i++) f1(); } example_v2.c noinline void f2(void) { volatile int a = 100, b, c; for (b = 0; b < 10000; b++) c = a * b; } noinline void f3(void) { volatile int i; for (i = 0; i < 10000;) { if(i%2) i++; else i++; } } noinline void f1(void) { f2(); f3(); } int main() { int i; for (i = 0; i < 100000; i++) f1(); } [lk@localhost perf_diff]$ gcc example_v1.c -o example [lk@localhost perf_diff]$ perf record -o example_v1.data ./example [ perf record: Woken up 4 times to write data ] [ perf record: Captured and wrote 0.813 MB example_v1.data (~35522 samples) ] [lk@localhost perf_diff]$ gcc example_v2.c -o example [lk@localhost perf_diff]$ perf record -o example_v2.data ./example [ perf record: Woken up 4 times to write data ] [ perf record: Captured and wrote 0.824 MB example_v2.data (~36015 samples) ] Old perf diff result: [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data Event 'cycles' Baseline Delta Shared Object Symbol ........ ....... ................ ............................... [kernel.vmlinux] [k] __perf_event_task_sched_out 0.00% [kernel.vmlinux] [k] apic_timer_interrupt [kernel.vmlinux] [k] idle_cpu [kernel.vmlinux] [k] intel_pstate_timer_func [kernel.vmlinux] [k] native_read_msr_safe 0.00% [kernel.vmlinux] [k] native_read_tsc 0.00% [kernel.vmlinux] [k] native_write_msr_safe [kernel.vmlinux] [k] ntp_tick_length 0.00% [kernel.vmlinux] [k] rb_erase 0.00% [kernel.vmlinux] [k] tick_sched_timer 0.00% [kernel.vmlinux] [k] unmap_single_vma 0.00% [kernel.vmlinux] [k] update_wall_time 0.00% example [.] f1 46.24% example [.] f2 53.71% -7.55% example [.] f3 +53.81% example [.] f3 0.02% example [.] main New perf diff result: [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data [kernel.vmlinux] [k] __perf_event_task_sched_out 0.00% [kernel.vmlinux] [k] apic_timer_interrupt [kernel.vmlinux] [k] idle_cpu [kernel.vmlinux] [k] intel_pstate_timer_func [kernel.vmlinux] [k] native_read_msr_safe 0.00% [kernel.vmlinux] [k] native_read_tsc 0.00% [kernel.vmlinux] [k] native_write_msr_safe [kernel.vmlinux] [k] ntp_tick_length 0.00% [kernel.vmlinux] [k] rb_erase 0.00% [kernel.vmlinux] [k] tick_sched_timer 0.00% [kernel.vmlinux] [k] unmap_single_vma 0.00% [kernel.vmlinux] [k] update_wall_time 0.00% example [.] f1 46.24% -0.08% example [.] f2 53.71% +0.11% example [.] f3 0.02% example [.] main Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1423460384-11645-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-09 13:39:44 +08:00
If no parameters are passed the samples will be sorted by dso and symbol.
As the perf.data files could come from different binaries, the symbols addresses
could vary. So perf diff is based on the comparison of the files and
symbols name.
perf diff: Introduce tool to show performance difference I guess it is enough to show some examples: [root@doppio linux-2.6-tip]# rm -f perf.data* [root@doppio linux-2.6-tip]# ls -la perf.data* ls: cannot access perf.data*: No such file or directory [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74440 2009-12-14 20:03 perf.data [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2692 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74280 2009-12-14 20:03 perf.data -rw------- 1 root root 74440 2009-12-14 20:03 perf.data.old [root@doppio linux-2.6-tip]# perf diff | head -5 1 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 -15307806 [kernel.kallsyms] __kmalloc 3 +1 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -p | head -5 1 +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 [kernel.kallsyms] __kmalloc 3 +1 /lib64/libc-2.10.1.so __GI_memmove 4 +4 /lib64/libc-2.10.1.so _int_malloc 5 +7 -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -v | head -5 1 361449551 326454971 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 151009241 135701435 -15307806 [kernel.kallsyms] __kmalloc 3 +1 101805328 105471269 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 78041440 101550435 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 59536172 98074985 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -vp | head -5 1 9.00% 8.00% +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 3.00% 3.00% [kernel.kallsyms] __kmalloc 3 +1 2.00% 2.00% /lib64/libc-2.10.1.so __GI_memmove 4 +4 2.00% 2.00% /lib64/libc-2.10.1.so _int_malloc 5 +7 1.00% 2.00% -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# This should be enough for diffs where the system is non volatile, i.e. when one doesn't updates binaries. For volatile environments, stay tuned for the next perf tool feature: a buildid cache populated by 'perf record', managed by 'perf buildid-cache' a-la ccache, and used by all the report tools. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> LKML-Reference: <1260828571-3613-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 06:09:31 +08:00
OPTIONS
-------
-D::
--dump-raw-trace::
Dump raw trace in ASCII.
--kallsyms=<file>::
kallsyms pathname
-m::
--modules::
Load module symbols. WARNING: use only with -k and LIVE kernel
perf diff: Use perf_session__fprintf_hists just like 'perf record' That means that almost everything you can do with 'perf report' can be done with 'perf diff', for instance: $ perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] $ perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2687 samples) ] perf diff | head -8 9.02% +1.00% find libc-2.10.1.so [.] _IO_vfprintf_internal 2.91% -1.00% find [kernel] [k] __kmalloc 2.85% -1.00% find [kernel] [k] ext4_htree_store_dirent 1.99% -1.00% find [kernel] [k] _atomic_dec_and_lock 2.44% find [kernel] [k] half_md4_transform $ So if you want to zoom into libc: $ perf diff --dsos libc-2.10.1.so | head -8 37.34% find [.] _IO_vfprintf_internal 10.34% find [.] __GI_memmove 8.25% +2.00% find [.] _int_malloc 5.07% -1.00% find [.] __GI_mempcpy 7.62% +2.00% find [.] _int_free $ And if there were multiple commands using libc, it is also possible to aggregate them all by using --sort symbol: $ perf diff --dsos libc-2.10.1.so --sort symbol | head -8 37.34% [.] _IO_vfprintf_internal 10.34% [.] __GI_memmove 8.25% +2.00% [.] _int_malloc 5.07% -1.00% [.] __GI_mempcpy 7.62% +2.00% [.] _int_free $ The displacement column now is off by default, to use it: perf diff -m --dsos libc-2.10.1.so --sort symbol | head -8 37.34% [.] _IO_vfprintf_internal 10.34% [.] __GI_memmove 8.25% +2.00% [.] _int_malloc 5.07% -1.00% +2 [.] __GI_mempcpy 7.62% +2.00% -1 [.] _int_free $ Using -t/--field-separator can be used for scripting: $ perf diff -t, -m --dsos libc-2.10.1.so --sort symbol | head -8 37.34, , ,[.] _IO_vfprintf_internal 10.34, , ,[.] __GI_memmove 8.25,+2.00%, ,[.] _int_malloc 5.07,-1.00%, +2,[.] __GI_mempcpy 7.62,+2.00%, -1,[.] _int_free 6.99,+1.00%, -1,[.] _IO_new_file_xsputn 1.89,-2.00%, +4,[.] __readdir64 $ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260978567-550-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 23:49:27 +08:00
-d::
--dsos=::
Only consider symbols in these dsos. CSV that understands
file://filename entries. This option will affect the percentage
of the Baseline/Delta column. See --percentage for more info.
perf diff: Use perf_session__fprintf_hists just like 'perf record' That means that almost everything you can do with 'perf report' can be done with 'perf diff', for instance: $ perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] $ perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2687 samples) ] perf diff | head -8 9.02% +1.00% find libc-2.10.1.so [.] _IO_vfprintf_internal 2.91% -1.00% find [kernel] [k] __kmalloc 2.85% -1.00% find [kernel] [k] ext4_htree_store_dirent 1.99% -1.00% find [kernel] [k] _atomic_dec_and_lock 2.44% find [kernel] [k] half_md4_transform $ So if you want to zoom into libc: $ perf diff --dsos libc-2.10.1.so | head -8 37.34% find [.] _IO_vfprintf_internal 10.34% find [.] __GI_memmove 8.25% +2.00% find [.] _int_malloc 5.07% -1.00% find [.] __GI_mempcpy 7.62% +2.00% find [.] _int_free $ And if there were multiple commands using libc, it is also possible to aggregate them all by using --sort symbol: $ perf diff --dsos libc-2.10.1.so --sort symbol | head -8 37.34% [.] _IO_vfprintf_internal 10.34% [.] __GI_memmove 8.25% +2.00% [.] _int_malloc 5.07% -1.00% [.] __GI_mempcpy 7.62% +2.00% [.] _int_free $ The displacement column now is off by default, to use it: perf diff -m --dsos libc-2.10.1.so --sort symbol | head -8 37.34% [.] _IO_vfprintf_internal 10.34% [.] __GI_memmove 8.25% +2.00% [.] _int_malloc 5.07% -1.00% +2 [.] __GI_mempcpy 7.62% +2.00% -1 [.] _int_free $ Using -t/--field-separator can be used for scripting: $ perf diff -t, -m --dsos libc-2.10.1.so --sort symbol | head -8 37.34, , ,[.] _IO_vfprintf_internal 10.34, , ,[.] __GI_memmove 8.25,+2.00%, ,[.] _int_malloc 5.07,-1.00%, +2,[.] __GI_mempcpy 7.62,+2.00%, -1,[.] _int_free 6.99,+1.00%, -1,[.] _IO_new_file_xsputn 1.89,-2.00%, +4,[.] __readdir64 $ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260978567-550-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 23:49:27 +08:00
-C::
--comms=::
Only consider symbols in these comms. CSV that understands
file://filename entries. This option will affect the percentage
of the Baseline/Delta column. See --percentage for more info.
perf diff: Use perf_session__fprintf_hists just like 'perf record' That means that almost everything you can do with 'perf report' can be done with 'perf diff', for instance: $ perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] $ perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2687 samples) ] perf diff | head -8 9.02% +1.00% find libc-2.10.1.so [.] _IO_vfprintf_internal 2.91% -1.00% find [kernel] [k] __kmalloc 2.85% -1.00% find [kernel] [k] ext4_htree_store_dirent 1.99% -1.00% find [kernel] [k] _atomic_dec_and_lock 2.44% find [kernel] [k] half_md4_transform $ So if you want to zoom into libc: $ perf diff --dsos libc-2.10.1.so | head -8 37.34% find [.] _IO_vfprintf_internal 10.34% find [.] __GI_memmove 8.25% +2.00% find [.] _int_malloc 5.07% -1.00% find [.] __GI_mempcpy 7.62% +2.00% find [.] _int_free $ And if there were multiple commands using libc, it is also possible to aggregate them all by using --sort symbol: $ perf diff --dsos libc-2.10.1.so --sort symbol | head -8 37.34% [.] _IO_vfprintf_internal 10.34% [.] __GI_memmove 8.25% +2.00% [.] _int_malloc 5.07% -1.00% [.] __GI_mempcpy 7.62% +2.00% [.] _int_free $ The displacement column now is off by default, to use it: perf diff -m --dsos libc-2.10.1.so --sort symbol | head -8 37.34% [.] _IO_vfprintf_internal 10.34% [.] __GI_memmove 8.25% +2.00% [.] _int_malloc 5.07% -1.00% +2 [.] __GI_mempcpy 7.62% +2.00% -1 [.] _int_free $ Using -t/--field-separator can be used for scripting: $ perf diff -t, -m --dsos libc-2.10.1.so --sort symbol | head -8 37.34, , ,[.] _IO_vfprintf_internal 10.34, , ,[.] __GI_memmove 8.25,+2.00%, ,[.] _int_malloc 5.07,-1.00%, +2,[.] __GI_mempcpy 7.62,+2.00%, -1,[.] _int_free 6.99,+1.00%, -1,[.] _IO_new_file_xsputn 1.89,-2.00%, +4,[.] __readdir64 $ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260978567-550-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 23:49:27 +08:00
-S::
--symbols=::
Only consider these symbols. CSV that understands
file://filename entries. This option will affect the percentage
of the Baseline/Delta column. See --percentage for more info.
perf diff: Use perf_session__fprintf_hists just like 'perf record' That means that almost everything you can do with 'perf report' can be done with 'perf diff', for instance: $ perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] $ perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2687 samples) ] perf diff | head -8 9.02% +1.00% find libc-2.10.1.so [.] _IO_vfprintf_internal 2.91% -1.00% find [kernel] [k] __kmalloc 2.85% -1.00% find [kernel] [k] ext4_htree_store_dirent 1.99% -1.00% find [kernel] [k] _atomic_dec_and_lock 2.44% find [kernel] [k] half_md4_transform $ So if you want to zoom into libc: $ perf diff --dsos libc-2.10.1.so | head -8 37.34% find [.] _IO_vfprintf_internal 10.34% find [.] __GI_memmove 8.25% +2.00% find [.] _int_malloc 5.07% -1.00% find [.] __GI_mempcpy 7.62% +2.00% find [.] _int_free $ And if there were multiple commands using libc, it is also possible to aggregate them all by using --sort symbol: $ perf diff --dsos libc-2.10.1.so --sort symbol | head -8 37.34% [.] _IO_vfprintf_internal 10.34% [.] __GI_memmove 8.25% +2.00% [.] _int_malloc 5.07% -1.00% [.] __GI_mempcpy 7.62% +2.00% [.] _int_free $ The displacement column now is off by default, to use it: perf diff -m --dsos libc-2.10.1.so --sort symbol | head -8 37.34% [.] _IO_vfprintf_internal 10.34% [.] __GI_memmove 8.25% +2.00% [.] _int_malloc 5.07% -1.00% +2 [.] __GI_mempcpy 7.62% +2.00% -1 [.] _int_free $ Using -t/--field-separator can be used for scripting: $ perf diff -t, -m --dsos libc-2.10.1.so --sort symbol | head -8 37.34, , ,[.] _IO_vfprintf_internal 10.34, , ,[.] __GI_memmove 8.25,+2.00%, ,[.] _int_malloc 5.07,-1.00%, +2,[.] __GI_mempcpy 7.62,+2.00%, -1,[.] _int_free 6.99,+1.00%, -1,[.] _IO_new_file_xsputn 1.89,-2.00%, +4,[.] __readdir64 $ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260978567-550-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 23:49:27 +08:00
-s::
--sort=::
Sort by key(s): pid, comm, dso, symbol, cpu, parent, srcline.
Please see description of --sort in the perf-report man page.
perf diff: Use perf_session__fprintf_hists just like 'perf record' That means that almost everything you can do with 'perf report' can be done with 'perf diff', for instance: $ perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] $ perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2687 samples) ] perf diff | head -8 9.02% +1.00% find libc-2.10.1.so [.] _IO_vfprintf_internal 2.91% -1.00% find [kernel] [k] __kmalloc 2.85% -1.00% find [kernel] [k] ext4_htree_store_dirent 1.99% -1.00% find [kernel] [k] _atomic_dec_and_lock 2.44% find [kernel] [k] half_md4_transform $ So if you want to zoom into libc: $ perf diff --dsos libc-2.10.1.so | head -8 37.34% find [.] _IO_vfprintf_internal 10.34% find [.] __GI_memmove 8.25% +2.00% find [.] _int_malloc 5.07% -1.00% find [.] __GI_mempcpy 7.62% +2.00% find [.] _int_free $ And if there were multiple commands using libc, it is also possible to aggregate them all by using --sort symbol: $ perf diff --dsos libc-2.10.1.so --sort symbol | head -8 37.34% [.] _IO_vfprintf_internal 10.34% [.] __GI_memmove 8.25% +2.00% [.] _int_malloc 5.07% -1.00% [.] __GI_mempcpy 7.62% +2.00% [.] _int_free $ The displacement column now is off by default, to use it: perf diff -m --dsos libc-2.10.1.so --sort symbol | head -8 37.34% [.] _IO_vfprintf_internal 10.34% [.] __GI_memmove 8.25% +2.00% [.] _int_malloc 5.07% -1.00% +2 [.] __GI_mempcpy 7.62% +2.00% -1 [.] _int_free $ Using -t/--field-separator can be used for scripting: $ perf diff -t, -m --dsos libc-2.10.1.so --sort symbol | head -8 37.34, , ,[.] _IO_vfprintf_internal 10.34, , ,[.] __GI_memmove 8.25,+2.00%, ,[.] _int_malloc 5.07,-1.00%, +2,[.] __GI_mempcpy 7.62,+2.00%, -1,[.] _int_free 6.99,+1.00%, -1,[.] _IO_new_file_xsputn 1.89,-2.00%, +4,[.] __readdir64 $ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260978567-550-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 23:49:27 +08:00
-t::
--field-separator=::
Use a special separator character and don't pad with spaces, replacing
all occurrences of this separator in symbol names (and other output)
perf diff: Use perf_session__fprintf_hists just like 'perf record' That means that almost everything you can do with 'perf report' can be done with 'perf diff', for instance: $ perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] $ perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2687 samples) ] perf diff | head -8 9.02% +1.00% find libc-2.10.1.so [.] _IO_vfprintf_internal 2.91% -1.00% find [kernel] [k] __kmalloc 2.85% -1.00% find [kernel] [k] ext4_htree_store_dirent 1.99% -1.00% find [kernel] [k] _atomic_dec_and_lock 2.44% find [kernel] [k] half_md4_transform $ So if you want to zoom into libc: $ perf diff --dsos libc-2.10.1.so | head -8 37.34% find [.] _IO_vfprintf_internal 10.34% find [.] __GI_memmove 8.25% +2.00% find [.] _int_malloc 5.07% -1.00% find [.] __GI_mempcpy 7.62% +2.00% find [.] _int_free $ And if there were multiple commands using libc, it is also possible to aggregate them all by using --sort symbol: $ perf diff --dsos libc-2.10.1.so --sort symbol | head -8 37.34% [.] _IO_vfprintf_internal 10.34% [.] __GI_memmove 8.25% +2.00% [.] _int_malloc 5.07% -1.00% [.] __GI_mempcpy 7.62% +2.00% [.] _int_free $ The displacement column now is off by default, to use it: perf diff -m --dsos libc-2.10.1.so --sort symbol | head -8 37.34% [.] _IO_vfprintf_internal 10.34% [.] __GI_memmove 8.25% +2.00% [.] _int_malloc 5.07% -1.00% +2 [.] __GI_mempcpy 7.62% +2.00% -1 [.] _int_free $ Using -t/--field-separator can be used for scripting: $ perf diff -t, -m --dsos libc-2.10.1.so --sort symbol | head -8 37.34, , ,[.] _IO_vfprintf_internal 10.34, , ,[.] __GI_memmove 8.25,+2.00%, ,[.] _int_malloc 5.07,-1.00%, +2,[.] __GI_mempcpy 7.62,+2.00%, -1,[.] _int_free 6.99,+1.00%, -1,[.] _IO_new_file_xsputn 1.89,-2.00%, +4,[.] __readdir64 $ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260978567-550-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 23:49:27 +08:00
with a '.' character, that thus it's the only non valid separator.
perf diff: Introduce tool to show performance difference I guess it is enough to show some examples: [root@doppio linux-2.6-tip]# rm -f perf.data* [root@doppio linux-2.6-tip]# ls -la perf.data* ls: cannot access perf.data*: No such file or directory [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74440 2009-12-14 20:03 perf.data [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2692 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74280 2009-12-14 20:03 perf.data -rw------- 1 root root 74440 2009-12-14 20:03 perf.data.old [root@doppio linux-2.6-tip]# perf diff | head -5 1 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 -15307806 [kernel.kallsyms] __kmalloc 3 +1 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -p | head -5 1 +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 [kernel.kallsyms] __kmalloc 3 +1 /lib64/libc-2.10.1.so __GI_memmove 4 +4 /lib64/libc-2.10.1.so _int_malloc 5 +7 -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -v | head -5 1 361449551 326454971 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 151009241 135701435 -15307806 [kernel.kallsyms] __kmalloc 3 +1 101805328 105471269 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 78041440 101550435 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 59536172 98074985 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -vp | head -5 1 9.00% 8.00% +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 3.00% 3.00% [kernel.kallsyms] __kmalloc 3 +1 2.00% 2.00% /lib64/libc-2.10.1.so __GI_memmove 4 +4 2.00% 2.00% /lib64/libc-2.10.1.so _int_malloc 5 +7 1.00% 2.00% -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# This should be enough for diffs where the system is non volatile, i.e. when one doesn't updates binaries. For volatile environments, stay tuned for the next perf tool feature: a buildid cache populated by 'perf record', managed by 'perf buildid-cache' a-la ccache, and used by all the report tools. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> LKML-Reference: <1260828571-3613-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 06:09:31 +08:00
-v::
--verbose::
Be verbose, for instance, show the raw counts in addition to the
perf diff: Introduce tool to show performance difference I guess it is enough to show some examples: [root@doppio linux-2.6-tip]# rm -f perf.data* [root@doppio linux-2.6-tip]# ls -la perf.data* ls: cannot access perf.data*: No such file or directory [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74440 2009-12-14 20:03 perf.data [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2692 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74280 2009-12-14 20:03 perf.data -rw------- 1 root root 74440 2009-12-14 20:03 perf.data.old [root@doppio linux-2.6-tip]# perf diff | head -5 1 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 -15307806 [kernel.kallsyms] __kmalloc 3 +1 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -p | head -5 1 +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 [kernel.kallsyms] __kmalloc 3 +1 /lib64/libc-2.10.1.so __GI_memmove 4 +4 /lib64/libc-2.10.1.so _int_malloc 5 +7 -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -v | head -5 1 361449551 326454971 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 151009241 135701435 -15307806 [kernel.kallsyms] __kmalloc 3 +1 101805328 105471269 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 78041440 101550435 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 59536172 98074985 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -vp | head -5 1 9.00% 8.00% +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 3.00% 3.00% [kernel.kallsyms] __kmalloc 3 +1 2.00% 2.00% /lib64/libc-2.10.1.so __GI_memmove 4 +4 2.00% 2.00% /lib64/libc-2.10.1.so _int_malloc 5 +7 1.00% 2.00% -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# This should be enough for diffs where the system is non volatile, i.e. when one doesn't updates binaries. For volatile environments, stay tuned for the next perf tool feature: a buildid cache populated by 'perf record', managed by 'perf buildid-cache' a-la ccache, and used by all the report tools. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> LKML-Reference: <1260828571-3613-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 06:09:31 +08:00
diff.
perf diff: Add -q/--quiet option The -q/--quiet option is to suppress any message. Sometimes users just want to see the numbers and it can be used for that case. Committer notes: Before: # perf diff | head -10 Failed to open /tmp/perf-6678.map, continuing without symbols Failed to open /tmp/perf-6678.map, continuing without symbols Failed to open /tmp/perf-2646.map, continuing without symbols # Event 'cycles' # # Baseline Delta Abs Shared Object Symbol # ........ ......... .......................... ............................................ # 5.36% -1.76% [kernel.vmlinux] [k] intel_idle 2.80% +1.48% firefox [.] 0x00000000000101fe 57.12% -1.25% libxul.so [.] 0x00000000009bea92 1.36% -1.11% [kernel.vmlinux] [k] __schedule 4.26% -1.00% perf-6678.map [.] 0x00007fac4b0e9320 After: # perf diff -q | head -10 5.36% -1.76% [kernel.vmlinux] [k] intel_idle 2.80% +1.48% firefox [.] 0x00000000000101fe 57.12% -1.25% libxul.so [.] 0x00000000009bea92 1.36% -1.11% [kernel.vmlinux] [k] __schedule 4.26% -1.00% perf-6678.map [.] 0x00007fac4b0e9320 1.86% +0.95% [kernel.vmlinux] [k] update_blocked_averages 0.80% -0.70% [kernel.vmlinux] [k] native_sched_clock 0.74% -0.58% [kernel.vmlinux] [k] native_write_msr 0.76% -0.56% qemu-system-x86_64 [.] 0x00000000002395c0 +0.54% libpulsecommon-10.0.so [.] 0x000000000002d91b # Signed-off-by: Namhyung Kim <namhyung@kernel.org> Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170217081742.17417-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-17 16:17:40 +08:00
-q::
--quiet::
Do not show any message. (Suppress -v)
-f::
--force::
Don't do ownership validation.
--symfs=<directory>::
Look for files with symbols relative to this directory.
-b::
--baseline-only::
Show only items with match in baseline.
-c::
--compute::
Differential computation selection - delta, ratio, wdiff, delta-abs
(default is delta-abs). Default can be changed using diff.compute
config option. See COMPARISON METHODS section for more info.
-p::
--period::
Show period values for both compared hist entries.
-F::
--formula::
Show formula for given computation.
-o::
--order::
Specify compute sorting column number. 0 means sorting by baseline
overhead and 1 (default) means sorting by computed value of column 1
(data from the first file other base baseline). Values more than 1
can be used only if enough data files are provided.
The default value can be set using the diff.order config option.
--percentage::
Determine how to display the overhead percentage of filtered entries.
Filters can be applied by --comms, --dsos and/or --symbols options.
"relative" means it's relative to filtered entries only so that the
sum of shown entries will be always 100%. "absolute" means it retains
the original value before and after the filter is applied.
COMPARISON
----------
The comparison is governed by the baseline file. The baseline perf.data
file is iterated for samples. All other perf.data files specified on
the command line are searched for the baseline sample pair. If the pair
is found, specified computation is made and result is displayed.
All samples from non-baseline perf.data files, that do not match any
baseline entry, are displayed with empty space within baseline column
and possible computation results (delta) in their related column.
Example files samples:
- file A with samples f1, f2, f3, f4, f6
- file B with samples f2, f4, f5
- file C with samples f1, f2, f5
Example output:
x - computation takes place for pair
b - baseline sample percentage
- perf diff A B C
baseline/A compute/B compute/C samples
---------------------------------------
b x f1
b x x f2
b f3
b x f4
b f6
x x f5
- perf diff B A C
baseline/B compute/A compute/C samples
---------------------------------------
b x x f2
b x f4
b x f5
x x f1
x f3
x f6
- perf diff C B A
baseline/C compute/B compute/A samples
---------------------------------------
b x f1
b x x f2
b x f5
x f3
x x f4
x f6
COMPARISON METHODS
------------------
delta
~~~~~
If specified the 'Delta' column is displayed with value 'd' computed as:
d = A->period_percent - B->period_percent
with:
- A/B being matching hist entry from data/baseline file specified
(or perf.data/perf.data.old) respectively.
- period_percent being the % of the hist entry period value within
single data file
- with filtering by -C, -d and/or -S, period_percent might be changed
relative to how entries are filtered. Use --percentage=absolute to
prevent such fluctuation.
perf diff: Add 'delta-abs' compute method The 'delta-abs' compute method is same as 'delta' but shows entries with bigger absolute delta first instead of sorting numerically. This is only useful together with -o option. Below is default output (-c delta): $ perf diff -o 1 -c delta | grep -v ^# | head 42.22% +4.97% [kernel.kallsyms] [k] cfb_imageblit 0.62% +1.23% [kernel.kallsyms] [k] mutex_lock +1.15% [kernel.kallsyms] [k] copy_user_generic_string 2.40% +0.95% [kernel.kallsyms] [k] bit_putcs 0.31% +0.79% [kernel.kallsyms] [k] link_path_walk +0.64% [kernel.kallsyms] [k] kmem_cache_alloc 0.00% +0.57% [kernel.kallsyms] [k] __rcu_read_unlock +0.45% [kernel.kallsyms] [k] alloc_set_pte 0.16% +0.45% [kernel.kallsyms] [k] menu_select +0.41% ld-2.24.so [.] do_lookup_x Now with 'delta-abs' it shows entries have bigger delta value either positive or negative. $ perf diff -o 1 -c delta-abs | grep -v ^# | head 42.22% +4.97% [kernel.kallsyms] [k] cfb_imageblit 12.72% -3.01% [kernel.kallsyms] [k] intel_idle 9.72% -1.31% [unknown] [.] 0x0000000000411343 0.62% +1.23% [kernel.kallsyms] [k] mutex_lock 2.40% +0.95% [kernel.kallsyms] [k] bit_putcs 0.31% +0.79% [kernel.kallsyms] [k] link_path_walk 1.35% -0.71% [kernel.kallsyms] [k] smp_call_function_single 0.00% +0.57% [kernel.kallsyms] [k] __rcu_read_unlock 0.16% +0.45% [kernel.kallsyms] [k] menu_select 0.72% -0.44% [kernel.kallsyms] [k] lookup_fast Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20170210073614.24584-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-10 15:36:11 +08:00
delta-abs
~~~~~~~~~
Same as 'delta` method, but sort the result with the absolute values.
ratio
~~~~~
If specified the 'Ratio' column is displayed with value 'r' computed as:
r = A->period / B->period
with:
- A/B being matching hist entry from data/baseline file specified
(or perf.data/perf.data.old) respectively.
- period being the hist entry period value
wdiff:WEIGHT-B,WEIGHT-A
~~~~~~~~~~~~~~~~~~~~~~~
If specified the 'Weighted diff' column is displayed with value 'd' computed as:
d = B->period * WEIGHT-A - A->period * WEIGHT-B
- A/B being matching hist entry from data/baseline file specified
(or perf.data/perf.data.old) respectively.
- period being the hist entry period value
- WEIGHT-A/WEIGHT-B being user supplied weights in the the '-c' option
behind ':' separator like '-c wdiff:1,2'.
- WEIGHT-A being the weight of the data file
- WEIGHT-B being the weight of the baseline data file
perf diff: Introduce tool to show performance difference I guess it is enough to show some examples: [root@doppio linux-2.6-tip]# rm -f perf.data* [root@doppio linux-2.6-tip]# ls -la perf.data* ls: cannot access perf.data*: No such file or directory [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2699 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74440 2009-12-14 20:03 perf.data [root@doppio linux-2.6-tip]# perf record -f find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.062 MB perf.data (~2692 samples) ] [root@doppio linux-2.6-tip]# ls -la perf.data* -rw------- 1 root root 74280 2009-12-14 20:03 perf.data -rw------- 1 root root 74440 2009-12-14 20:03 perf.data.old [root@doppio linux-2.6-tip]# perf diff | head -5 1 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 -15307806 [kernel.kallsyms] __kmalloc 3 +1 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -p | head -5 1 +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 [kernel.kallsyms] __kmalloc 3 +1 /lib64/libc-2.10.1.so __GI_memmove 4 +4 /lib64/libc-2.10.1.so _int_malloc 5 +7 -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -v | head -5 1 361449551 326454971 -34994580 /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 151009241 135701435 -15307806 [kernel.kallsyms] __kmalloc 3 +1 101805328 105471269 +3665941 /lib64/libc-2.10.1.so __GI_memmove 4 +4 78041440 101550435 +23508995 /lib64/libc-2.10.1.so _int_malloc 5 +7 59536172 98074985 +38538813 [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# perf diff -vp | head -5 1 9.00% 8.00% +1.00% /lib64/libc-2.10.1.so _IO_vfprintf_internal 2 3.00% 3.00% [kernel.kallsyms] __kmalloc 3 +1 2.00% 2.00% /lib64/libc-2.10.1.so __GI_memmove 4 +4 2.00% 2.00% /lib64/libc-2.10.1.so _int_malloc 5 +7 1.00% 2.00% -1.00% [kernel.kallsyms] __d_lookup [root@doppio linux-2.6-tip]# This should be enough for diffs where the system is non volatile, i.e. when one doesn't updates binaries. For volatile environments, stay tuned for the next perf tool feature: a buildid cache populated by 'perf record', managed by 'perf buildid-cache' a-la ccache, and used by all the report tools. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> LKML-Reference: <1260828571-3613-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 06:09:31 +08:00
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-report[1]