User visible:
- Support cross unwinding, i.e. collecting '--call-graph dwarf' perf.data files
in one machine and then doing analysis in another machine of a different
hardware architecture. This enables, for instance, to do:
perf record -a --call-graph dwarf
on a x86-32 or aarch64 system and then do 'perf report' on it on a
x86_64 workstation. (He Kuang)
- Fix crash in build_id_cache__kallsyms_path(), recent regression (Wang Nan)
Infrastructure:
- Make tools/lib/bpf use the IS_ERR return facility consistently and also stop
using the _get_ term for non-reference count methods (Arnaldo Carvalho de Melo)
- 'perf config' refactorings (Taeung Song)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJXVyVuAAoJENZQFvNTUqpAcXAP/R+mXODq3brY9C9N4/ujnAvc
vu/WHq7J1kbL8JrAsUHos0avPCZJigibx3rzG6ZuV3SqOpvKBc7hy+nSOrRyXgT3
QG8mMpdhgyexCYRkd8nELAxS2q1E6XzeHJ5Ot5gcHx+N1QCCTm9xZ4++m5ee/Juo
0ceud29dljmjjTehAzEht5N4fgRPavAwjO8DTYVQF0+hMPbGQE7EAS7LF+mDTFxY
UAe7n1buyBm4Y/OLXNP79zQG78eSktau3amUiZ/W3owJkhbUrE7trfWNnNPCYQVU
ySVm0xbmy8fdhm6XNxCem+ya9SZ1ROF71A0FiDWlShioutcKBbustBo5UO82Rjxg
htHLOoE08BqniIWrqg8VCG2SP8PjzC9JrjxsppnZd4UgfgvwJ0hmRZ++h5dN1HFk
/RK2GJTot4EaxR9etkWG9R1sCBLAtJy+MLO/oHMxpR8Rw/1RlJRPV7amnLMvrmdx
wRlEVa+OkV0Yb968aai9DObqOkNmNSMLXXQ5ez5T01h2+VjA/PV7UtJ7z0O7Nk4L
lyn6e1IE5uko1oaPJqNhfnu7imq1AhRt8ysXfEHF4RD+eIlUMTZNxRzfyruKyJXo
YUxpfVGho60pYKMYHjriH966wWxCmWZhhY2qERbt3A9ECkVuagUWiFwx/u5ngiG1
BWxVkGtBh1yUKuttY7We
=DvZE
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-20160607' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Support cross unwinding, i.e. collecting '--call-graph dwarf' perf.data files
in one machine and then doing analysis in another machine of a different
hardware architecture. This enables, for instance, to do:
perf record -a --call-graph dwarf
on a x86-32 or aarch64 system and then do 'perf report' on it on a
x86_64 workstation. (He Kuang)
- Fix crash in build_id_cache__kallsyms_path(), recent regression (Wang Nan)
Infrastructure changes:
- Make tools/lib/bpf use the IS_ERR return facility consistently and also stop
using the _get_ term for non-reference count methods (Arnaldo Carvalho de Melo)
- 'perf config' refactorings (Taeung Song)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
User visible:
- Tooling support for TopDown counters, recently added to the kernel (Andi Kleen)
- Show call graphs in 'perf script' when 1st event doesn't have it but some other has (He Kuang)
- Fix terminal cleanup when handling invalid .perfconfig files in 'perf top' (Taeung Song)
Build fixes:
- Respect CROSS_COMPILE for the linker in libapi (Lucas Stach)
Infrastructure:
- Fix perf_evlist__alloc_mmap() failure path (Wang Nan)
- Provide way to extract integer value from format_field (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJXVeP9AAoJENZQFvNTUqpAe8oP/10gbODr38JjOwrgARoux7so
CVLpaxI21tpfEd3RTSPQSHx7wbph1luWy/GI2KDu39futg5FRDKNzVK7Ciy5VUx6
dW2BKhDjQb97G6X62Zp8rsw7jwNN775ot8E7XFzEMl6bHVRkU0bO4cBM8rMh65NW
dP4UMhE65hEkInxPpr87SYyxgKoWAYao+ZXFyIXzqApUZJ2Enqp7uNwDt5x4zedh
3d9y7EUQpaDmXcdtcSFfsZCa5B+3/5HKYWksX6ofDcwvTZUTtjEslAPH9Xt+dmYn
x+HoMvM3ZX33UeYfTiPYjEAmOUdis8ZYvbSf0XdtpYTNUsiw+YgUcBbWsYOZrdvk
N+dCRoVlaU4BBn1UpPWu1o0BC/uBgVjkWbXsvVSZC73cJXOoB1PbMzHTLvX22KxA
gcGOG5EpgKKf+vz3NwwRNRhmjIh97ChA/OOP8TYIBEWFheOtBvnHJvRawG4vDR09
4Ywg2uecbZfN+UI3qv7kis9TvV03RWgpCCWMClAmrm3RGqNFfGQGIEoRucwcUGK7
TzfvU2ycetw6hizoLgXR8gyYhhkEiZ5rhogvx+5SxVh+g9tSxLrWZxhU6jDH5zRp
vAJ1hUFkafOC60qGEkfGVnFdo1zLWhhGmN1W5mzSsYfWZCVuhDduD3i10UfX4dip
3gNEajV6R96WEEuGeIbs
=60TB
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-20160606' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Tooling support for TopDown counters, recently added to the kernel (Andi Kleen)
- Show call graphs in 'perf script' when 1st event doesn't have it but some other has (He Kuang)
- Fix terminal cleanup when handling invalid .perfconfig files in 'perf top' (Taeung Song)
Build fixes:
- Respect CROSS_COMPILE for the linker in libapi (Lucas Stach)
Infrastructure changes:
- Fix perf_evlist__alloc_mmap() failure path (Wang Nan)
- Provide way to extract integer value from format_field (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Export normalize_arch() function, so other part of perf can get
normalized form of arch string.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1464924803-22214-10-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
CONFIG_LIBUNWIND/NO_LIBUNWIND are changed to CONFIG_LOCAL_LIBUNWIND/
NO_LOCAL_LIBUNWIND for retaining local unwind features. The new
CONFIG_LIBUNWIND stands for either local or remote or both unwind are
supported, and NO_LIBUNWIND means that neither local nor remote unwind
is supported.
LIBUNWIND_LIBS is eliminated in LDFLAGS if local libunwind is not
supported.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1464924803-22214-7-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add basic plumbing for TopDown in perf stat
TopDown is intended to replace the frontend cycles idle/ backend cycles
idle metrics in standard perf stat output. These metrics are not
reliable in many workloads, due to out of order effects.
This implements a new --topdown mode in perf stat (similar to
--transaction) that measures the pipe line bottlenecks using
standardized formulas. The measurement can be all done with 5 counters
(one fixed counter)
The result are four metrics:
FrontendBound, BackendBound, BadSpeculation, Retiring
that describe the CPU pipeline behavior on a high level.
The full top down methology has many hierarchical metrics. This
implementation only supports level 1 which can be collected without
multiplexing. A full implementation of top down on top of perf is
available in pmu-tools toplev. (http://github.com/andikleen/pmu-tools)
The current version works on Intel Core CPUs starting with Sandy Bridge,
and Atom CPUs starting with Silvermont. In principle the generic
metrics should be also implementable on other out of order CPUs.
TopDown level 1 uses a set of abstracted metrics which are generic to
out of order CPU cores (although some CPUs may not implement all of
them):
topdown-total-slots Available slots in the pipeline
topdown-slots-issued Slots issued into the pipeline
topdown-slots-retired Slots successfully retired
topdown-fetch-bubbles Pipeline gaps in the frontend
topdown-recovery-bubbles Pipeline gaps during recovery
from misspeculation
These metrics then allow to compute four useful metrics:
FrontendBound, BackendBound, Retiring, BadSpeculation.
Add a new --topdown options to enable events. When --topdown is
specified set up events for all topdown events supported by the kernel.
Add topdown-* as a special case to the event parser, as is needed for
all events containing -.
The actual code to compute the metrics is in follow-on patches.
v2: Use standard sysctl read function.
v3: Move x86 specific code to arch/
v4: Enable --metric-only implicitly for topdown.
v5: Add --single-thread option to not force per core mode
v6: Fix output order of topdown metrics
v7: Allow combining with -d
v8: Remove --single-thread again
v9: Rename functions, adding arch_ and topdown_.
v10: Expand man page and describe TopDown better
Paste intro into commit description.
Print error when malloc fails.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1464119559-17203-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is possible that all events in an evlist are overwritable.
perf_event__synth_time_conv() should not crash in this case.
record__pick_pc() is used to check avaliability.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464056944-166978-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Highlights:
- Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V
- Live patching support for ppc64le (also merged via livepatching.git)
Various cleanups & minor fixes from:
- Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie, Lennart
Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael
Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras, Rashmica Gupta,
Russell Currey, Suraj Jitindar Singh, Thiago Jung Bauermann, Valentin
Rothberg, Vipin K Parashar.
General:
- Update LMB associativity index during DLPAR add/remove from Nathan Fontenot
- Fix branching to OOL handlers in relocatable kernel from Hari Bathini
- Add support for userspace Power9 copy/paste from Chris Smart
- Always use STRICT_MM_TYPECHECKS from Michael Ellerman
- Add mask of possible MMU features from Michael Ellerman
PCI:
- Enable pass through of NVLink to guests from Alexey Kardashevskiy
- Cleanups in preparation for powernv PCI hotplug from Gavin Shan
- Don't report error in eeh_pe_reset_and_recover() from Gavin Shan
- Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan
- Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" from Guilherme G. Piccoli
- Remove the dependency on EEH struct in DDW mechanism from Guilherme G. Piccoli
selftests:
- Test cp_abort during context switch from Chris Smart
- Add several tests for transactional memory support from Rashmica Gupta
perf:
- Add support for sampling interrupt register state from Anju T
- Add support for unwinding perf-stackdump from Chandan Kumar
cxl:
- Configure the PSL for two CAPI ports on POWER8NVL from Philippe Bergheaud
- Allow initialization on timebase sync failures from Frederic Barrat
- Increase timeout for detection of AFU mmio hang from Frederic Barrat
- Handle num_of_processes larger than can fit in the SPA from Ian Munsie
- Ensure PSL interrupt is configured for contexts with no AFU IRQs from Ian Munsie
- Add kernel API to allow a context to operate with relocate disabled from Ian Munsie
- Check periodically the coherent platform function's state from Christophe Lombard
Freescale:
- Updates from Scott: "Contains 86xx fixes, minor device tree fixes, an erratum
workaround, and a kconfig dependency fix."
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXPsGzAAoJEFHr6jzI4aWAVoAP/iKdrDe0eYHlVAE9SqnbsiZs
lgDxdsC8P3fsmP1G9o/HkKhC82zHl/La8Ztz8dtqa+LkSzbfliWP1ztJsI7GsBFo
tyCKzWnX9Rwvd3meHu/o/SQ29TNLm/PbPyyRqpj5QPbJ8XCXkAXR7ZZZqjvcMsJW
/AgIr7Cgf53tl9oZzzl/c7CnNHhMq+NBdA71vhWtUx+T97wfJEGyKW6HhZyHDbEU
iAki7fu77ZpEqC/Fh9swf0dCGBJ+a132NoMVo0AdV7EQLznUYlQpQEqa+1PyHZOP
/ArOzf2mDg6m3PfCo1eiB07v8PnVZ3llEUbVAJNg3GUxbE4SHrqq/kwm0iElm3p/
DvFxerCwdX9vmskJX4wDs+pSZRabXYj9XVMptsgFzA4joWrqqb7mBHqaort88YcY
YSljEt1bHyXmiJ+dBya40qARsWUkCVN7ZgEzdxckq0KI3w7g2tqpqIbO2lClWT6t
B3GpqQ4jp34+d1M14FB91fIGK7tMvOhSInE0Mv9+tPvRsepXqiiU/SwdAtRlr3m2
zs/K+4FYcVjJ3Rmpgc+tI38PbZxHe212I35YN6L1LP+4ZfAtzz0NyKdooTIBtkbO
19pX4WbBjKq8zK+YutrySncBIrbnI6VjW51vtRhgVKZliPFO/6zKagyU6FbxM+E5
udQES+t3F/9gvtxgxtDe
=YvyQ
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Highlights:
- Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V
- Live patching support for ppc64le (also merged via livepatching.git)
Various cleanups & minor fixes from:
- Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie,
Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring,
Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras,
Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung
Bauermann, Valentin Rothberg, Vipin K Parashar.
General:
- Update LMB associativity index during DLPAR add/remove from Nathan
Fontenot
- Fix branching to OOL handlers in relocatable kernel from Hari Bathini
- Add support for userspace Power9 copy/paste from Chris Smart
- Always use STRICT_MM_TYPECHECKS from Michael Ellerman
- Add mask of possible MMU features from Michael Ellerman
PCI:
- Enable pass through of NVLink to guests from Alexey Kardashevskiy
- Cleanups in preparation for powernv PCI hotplug from Gavin Shan
- Don't report error in eeh_pe_reset_and_recover() from Gavin Shan
- Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan
- Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
from Guilherme G Piccoli
- Remove the dependency on EEH struct in DDW mechanism from Guilherme
G Piccoli
selftests:
- Test cp_abort during context switch from Chris Smart
- Add several tests for transactional memory support from Rashmica
Gupta
perf:
- Add support for sampling interrupt register state from Anju T
- Add support for unwinding perf-stackdump from Chandan Kumar
cxl:
- Configure the PSL for two CAPI ports on POWER8NVL from Philippe
Bergheaud
- Allow initialization on timebase sync failures from Frederic Barrat
- Increase timeout for detection of AFU mmio hang from Frederic
Barrat
- Handle num_of_processes larger than can fit in the SPA from Ian
Munsie
- Ensure PSL interrupt is configured for contexts with no AFU IRQs
from Ian Munsie
- Add kernel API to allow a context to operate with relocate disabled
from Ian Munsie
- Check periodically the coherent platform function's state from
Christophe Lombard
Freescale:
- Updates from Scott: "Contains 86xx fixes, minor device tree fixes,
an erratum workaround, and a kconfig dependency fix."
* tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (192 commits)
powerpc/86xx: Fix PCI interrupt map definition
powerpc/86xx: Move pci1 definition to the include file
powerpc/fsl: Fix build of the dtb embedded kernel images
powerpc/fsl: Fix rcpm compatible string
powerpc/fsl: Remove FSL_SOC dependency from FSL_LBC
powerpc/fsl-pci: Add a workaround for PCI 5 errata
powerpc/fsl: Fix SPI compatible on t208xrdb and t1040rdb
powerpc/powernv/npu: Add PE to PHB's list
powerpc/powernv: Fix insufficient memory allocation
powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism
Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner()
powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover()
powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover()
powerpc/eeh: Don't report error in eeh_pe_reset_and_recover()
Revert "powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()"
powerpc/powernv/npu: Enable NVLink pass through
powerpc/powernv/npu: Rework TCE Kill handling
powerpc/powernv/npu: Add set/unset window helpers
powerpc/powernv/ioda2: Export debug helper pe_level_printk()
...
Adds support for unwinding user stack dump by linking with libunwind.
Signed-off-by: Chandan Kumar <chandan.kumar@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
ppc64le functions have a Global Entry Point (GEP) and a Local Entry
Point (LEP). While placing a probe, we always prefer the LEP since it
catches function calls through both the GEP and the LEP. In order to do
this, we fixup the function entry points during elf symbol table lookup
to point to the LEPs. This works, but breaks 'perf test kallsyms' since
the symbols loaded from the symbol table (pointing to the LEP) do not
match the symbols in kallsyms.
To fix this, we do not adjust all the symbols during symbol table load.
Instead, we note down st_other in a newly introduced arch-specific
member of perf symbol structure, and later use this to adjust the probe
trace point.
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Cc: Mark Wielaard <mjw@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/6be7c2b17e370100c2f79dd444509df7929bdd3e.1460451721.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So far, we used to treat probe point offsets as being offset from the
LEP. However, userspace applications (objdump/readelf) always show
disassembly and offsets from the function GEP. This is confusing to the
user as we will end up probing at an address different from what the
user expects when looking at the function disassembly with
readelf/objdump. Fix this by changing how we modify probe address with
perf.
If only the function name is provided, we assume the user needs the LEP.
Otherwise, if an offset is specified, we assume that the user knows the
exact address to probe based on function disassembly, and so we just
place the probe from the GEP offset.
Finally, kretprobe was also broken with kallsyms as we were trying to
specify an offset. This patch also fixes that issue.
Reported-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Cc: Mark Wielaard <mjw@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/75df860aad8216bf4b9bcd10c6351ecc0e3dee54.1460451721.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduced in commit 4babf2c5ef ("x86: wire up preadv2 and pwritev2").
This will make 'perf trace' aware of them.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vojoylgce2cetsy36446s5ny@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add sample_reg_mask array with pt_regs registers.
This is needed for printing supported regs ( -I? option).
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Instead receive a callchain_param pointer to configure callchain
aspects, not doing so if NULL is passed.
This will allow fine grained control over which evsels in an evlist
gets callchains enabled.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2mupip6khc92mh5x4nw9to82@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To fix the build on Fedora Rawhide (gcc 6.0.0 20160311 (Red Hat 6.0.0-0.17):
CC /tmp/build/perf/arch/x86/util/dwarf-regs.o
arch/x86/util/dwarf-regs.c:66:36: error: 'x86_32_regoffset_table' defined but not used [-Werror=unused-const-variable=]
static const struct pt_regs_offset x86_32_regoffset_table[] = {
^~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-fghuksc1u8ln82bof4lwcj0o@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We used libaudit to map ids to syscall names and vice-versa, but that
imposes a delay in supporting new syscalls, having to wait for libaudit
to get those new syscalls on its tables.
To remove that delay, for x86_64 initially, grab a copy of
arch/x86/entry/syscalls/syscall_64.tbl and use it to generate those
tables.
Syscalls currently not available in audit-libs:
# trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd
Error: Invalid syscall copy_file_range, membarrier, mlock2, pread64, pwrite64, timerfd_create, userfaultfd
Hint: try 'perf list syscalls:sys_enter_*'
Hint: and: 'man syscalls'
#
With this patch:
# trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd
8505.733 ( 0.010 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 36
8506.688 ( 0.005 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 40
30023.097 ( 0.025 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63ae382000, count: 4096, pos: 529592320) = 4096
31268.712 ( 0.028 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afd8b000, count: 4096, pos: 2314133504) = 4096
31268.854 ( 0.016 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afda2000, count: 4096, pos: 2314137600) = 4096
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-51xfjbxevdsucmnbc4ka5r88@git.kernel.org
[ Added make dep for 'prepare' in 'LIBPERF_IN', fix by Wang Nan to fix parallell build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For Intel PT / BTS, define the environment variable that selects TSC
timestamps in the jitdump file.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1457426333-30260-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT uses TSC as a timestamp, so add support for using TSC instead
of the monotonic clock. Use of TSC is selected by an environment
variable "JITDUMP_USE_ARCH_TIMESTAMP" and flagged in the jitdump file
with flag JITDUMP_FLAGS_ARCH_TIMESTAMP.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457426330-30226-1-git-send-email-adrian.hunter@intel.com
[ Added the fixup from He Kuang to make it build on other arches, ]
[ such as aarch64, to avoid inserting this bisectiong breakage upstream ]
Link: http://lkml.kernel.org/r/1459482572-129494-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT uses the time members from the perf_event_mmap_page to convert
between TSC and perf time.
Due to a lack of foresight when Intel PT was implemented, those time
members were recorded in the (implementation dependent) AUXTRACE_INFO
event, the structure of which is generally inaccessible outside of the
Intel PT decoder. However now the conversion between TSC and perf time
is needed when processing a jitdump file when Intel PT has been used for
tracing.
So add a user event to record the time members. 'perf record' will
synthesize the event if the information is available. And session
processing will put a copy of the event on the session so that tools
like 'perf inject' can easily access it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1457426324-30158-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 531d241063 ("perf tools: Do not include stringify.h from the
kernel sources") seems to have accidentially removed the inclusion of
"util/header.h" from "arch/powerpc/util/header.c".
"util/header.h" provides the prototype for get_cpuid() and is needed to
build perf on Powerpc:
arch/powerpc/util/header.c:17:1: error: no previous prototype for 'get_cpuid' [-Werror=missing-prototypes]
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 531d241063 ("perf tools: Do not include stringify.h from the kernel sources")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
[ Included "util.h" too, to get the scnprintf() prototype ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use instead the copy just made to tools/include/linux/.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-q736w12nwy98x5ox2hamp5ow@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Build jitdump only on architectures defined in util/genelf.h file, to avoid
breaking the build on such arches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Davidlohr Bueso <dbueso@suse.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Mel Gorman <mgorman@suse.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20160310164113.GA11357@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit b9511cd761 ("perf/x86: Fix time_shift in perf_event_mmap_page")
altered the time conversion algorithms documented in the perf_event.h
header file, to use 64-bit shifts. That was done to make the code more
future-proof (i.e. some time in the future a 32-bit shift could be
allowed). Reflect those changes in perf tools.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457005856-6143-9-git-send-email-adrian.hunter@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The sa_flags field is not being initialized, so a garbage value is being
passed to sigaction. Initialize it to zero.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1456923322-29697-1-git-send-email-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To follow convention used in other tools/perf/ areas. Also remove the
need to check if it is NULL before calling the destructor, again, to
follow convention that goes back to free().
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-w6owu7rb8a46gvunlinxaqwx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Powerpc provides hcall events that also provides insights into guest
behaviour. Enhance perf kvm stat to record and analyze hcall events.
- To trace hcall events :
perf kvm stat record
- To show the results :
perf kvm stat report --event=hcall
The result shows the number of hypervisor calls from the guest grouped
by their respective reasons displayed with the frequency.
This patch makes use of two additional tracepoints
"kvm_hv:kvm_hcall_enter" and "kvm_hv:kvm_hcall_exit". To map the hcall
codes to their respective names, it needs a mapping. Such mapping is
added in this patch in book3s_hcalls.h.
# pgrep qemu
A sample output :
19378
60515
2 VMs running.
# perf kvm stat record -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 4.153 MB perf.data.guest (39624
samples) ]
# perf kvm stat report -p 60515 --event=hcall
Analyze events for all VMs, all VCPUs:
HCALL-EVENT Samples Samples% Time% MinTime MaxTime AvgTime
H_IPI 822 66.08% 88.10% 0.63us 11.38us 2.05us (+- 1.42%)
H_SEND_CRQ 144 11.58% 3.77% 0.41us 0.88us 0.50us (+- 1.47%)
H_VIO_SIGNAL 118 9.49% 2.86% 0.37us 0.83us 0.47us (+- 1.43%)
H_PUT_TERM_CHAR 76 6.11% 2.07% 0.37us 0.90us 0.52us (+- 2.43%)
H_GET_TERM_CHAR 74 5.95% 2.23% 0.37us 1.70us 0.58us (+- 4.77%)
H_RTAS 6 0.48% 0.85% 1.10us 9.25us 2.70us (+-48.57%)
H_PERFMON 4 0.32% 0.12% 0.41us 0.96us 0.59us (+-20.92%)
Total Samples:1244, Total events handled time:1916.69us.
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1453962787-15376-4-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf kvm can be used to analyze guest exit reasons. This support already
exists in x86. Hence, porting it to powerpc.
- To trace KVM events :
perf kvm stat record
If many guests are running, we can track for a specific guest by using
--pid as in : perf kvm stat record --pid <pid>
- To see the results :
perf kvm stat report
The result shows the number of exits (from the guest context to
host/hypervisor context) grouped by their respective exit reasons with
their frequency.
Since, different powerpc machines have different KVM tracepoints, this
patch discovers the available tracepoints dynamically and accordingly
looks for them. If any single tracepoint is not present, this support
won't be enabled for reporting. To record, this will fail if any of the
events we are looking to record isn't available. Right now, its only
supported on PowerPC Book3S_HV architectures.
To analyze the different exits, group them and present them (in a slight
descriptive way) to the user, we need a mapping between the "exit code"
(dumped in the kvm_guest_exit tracepoint data) and to its related
Interrupt vector description (exit reason). This patch adds this mapping
in book3s_hv_exits.h.
It records on two available KVM tracepoints for book3s_hv:
"kvm_hv:kvm_guest_exit" and "kvm_hv:kvm_guest_enter".
Here is a sample o/p:
# pgrep qemu
19378
60515
2 Guests are running on the host.
# perf kvm stat record -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 4.153 MB perf.data.guest (39624
samples) ]
# perf kvm stat report -p 60515
Analyze events for pid(s) 60515, all VCPUs:
VM-EXIT Samples Samples% Time% MinTime MaxTime Avg time
SYSCALL 9141 63.67% 7.49% 1.26us 5782.39us 9.87us (+- 6.46%)
H_DATA_STORAGE 4114 28.66% 5.07% 1.72us 4597.68us 14.84us (+-20.06%)
HV_DECREMENTER 418 2.91% 4.26% 0.70us 30002.22us 122.58us (+-70.29%)
EXTERNAL 392 2.73% 0.06% 0.64us 104.10us 1.94us (+-18.83%)
RETURN_TO_HOST 287 2.00% 83.11% 1.53us 124240.15us 3486.52us (+-16.81%)
H_INST_STORAGE 5 0.03% 0.00% 1.88us 3.73us 2.39us (+-14.20%)
Total Samples:14357, Total events handled time:1203918.42us.
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1453962787-15376-3-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch removes the "const" qualifier from kvm_events_tp declaration
to account for the fact that some architectures may need to update this
variable dynamically. For instance, powerpc will need to update this
variable dynamically depending on the machine type.
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1453962787-15376-2-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Its better to remove the dependency on uapi/kvm_perf.h to allow dynamic
discovery of kvm events (if its needed). To do this, some extern
variables have been introduced with which we can keep the generic
functions generic.
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Acked-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1453962787-15376-1-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On some architecture the size of the private header may be dependent on
the number of tracers used in the session. As such adding a "struct
perf_evlist *" parameter, which should contain all the required
information.
Also adjusting the existing client of the interface to take the new
parameter into account.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Chunyan Zhang <zhang.chunyan@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: Mike Leach <mike.leach@arm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rabin Vincent <rabin@rab.in>
Cc: Tor Jeremiassen <tor@ti.com>
Link: http://lkml.kernel.org/r/1452807977-8069-22-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The while loop was spinning. Fix by removing a semicolon.
The issue was pointed out by gcc-6's -Wmisleading-indentation.
Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 035827e9f2 ("perf tests: Add Intel CQM test")
Link: http://lkml.kernel.org/r/20151214154335.GA1409@x4
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Replacing them with perf_evsel__(enable|disable).
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before:
$ perf test -v cqm
48: Test intel cqm nmi context read :
--- start ---
test child forked, pid 1681
parse_events failed
test child finished with -2
---- end ----
Test intel cqm nmi context read: Skip
$
After:
$ perf test -v cqm
48: Test intel cqm nmi context read :
--- start ---
test child forked, pid 1681
parse_events failed, is "intel_cqm/llc_occupancy/" available?
test child finished with -2
---- end ----
Test intel cqm nmi context read: Skip
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-eidpiv5x4nkbsx37xwikbnir@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were asking for a 4kHz sample_freq, making the test fail needlessly
when the system reduced /proc/sys/kernel/perf_event_max_sample_rate
below that.
In this test we only look at the PERF_SAMPLE_TIME fields in PERF_RECORD_
meta events, no need to set sample_freq.
Thanks to Namhyung for suggesting that max_sample_rate could be the
reason for the test failure, seeing the 'perf test -vv' output I sent.
Before:
# echo 1000 > /proc/sys/kernel/perf_event_max_sample_rate
# perf test TSC
45: Test converting perf time to TSC : FAILED!
After:
# perf test TSC
45: Test converting perf time to TSC : Ok
# cat /proc/sys/kernel/perf_event_max_sample_rate
1000
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-lcob05qhawkuvsyuu9g1fld5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the subcommand-related files from perf to a new library named
libsubcmd.a.
Since we're moving files anyway, go ahead and rename 'exec_cmd.*' to
'exec-cmd.*' to be consistent with the naming of all the other files.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/c0a838d4c878ab17fee50998811612b2281355c1.1450193761.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some tests have sub-tests we want to run, so allow passing this.
Wang tried to avoid having to touch all tests, but then, having the
test.func in an anonymous union makes the build fail on older compilers,
like the one in RHEL6, where:
test a = {
.func = foo,
};
fails.
To fix it leave the func pointer in the main structure and pass the subtest
index to all tests, end result function is the same, but we have just one
function pointer, not two, with and without the subtest index as an argument.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5genj0ficwdmelpoqlds0u4y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
regs_query_register_offset() in dwarf-regs.c is required by BPF
prologue. This patch compiles it if CONFIG_BPF_PROLOGUE is on to avoid
build failure when CONFIG_BPF_PROLOGUE is on but CONFIG_DWARF is not
set.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1447675815-166222-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Peter reports that it's possible to trigger a WARN_ON_ONCE() in the
Intel CQM code by combining a hardware event and an Intel CQM
(software) event into a group. Unfortunately, the perf tools are not
able to create this bundle and we need to manually construct a test
case.
For posterity, record Peter's proof of concept test case in tools/perf
so that it presents a model for how we can perform architecture
specific tests, or "arch tests", in perf in the future.
The particular issue triggered in the test case is that when the
counter for the hardware event overflows and triggers a PMI we'll read
both the hardware event and the software event counters.
Unfortunately, for CQM that involves performing an IPI to read the CQM
event counters on all sockets, which in NMI context triggers the
WARN_ON_ONCE().
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>
Cc: Vince Weaver <vince@deater.net>
Link: http://lkml.kernel.org/r/1437490509-15373-1-git-send-email-matt@codeblueprint.co.uk
Link: http://lkml.kernel.org/n/tip-3p4ra0u8vzm7m289a1m799kf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move out the x86-specific tests into tools/perf/arch/x86/tests and
define an 'arch_tests' array, which is the list of tests that only apply
to the build architecture.
We can also now begin to get rid of some of the #ifdef code that is
present in the generic perf tests.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>
Cc: Vince Weaver <vince@deater.net>
Link: http://lkml.kernel.org/n/tip-9s68h4ptg06ah0lgnjz55mqn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tests that only make sense for some architectures currently live in
the same place as the generic tests. Move out the x86-specific tests
into tools/perf/arch/x86/tests and define an 'arch_tests' array, which
is the list of tests that only apply to the build architecture.
The main idea is to encourage developers to add arch tests to build
out perf's test coverage, without dumping everything in
tools/perf/tests.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>
Cc: Vince Weaver <vince@deater.net>
Link: http://lkml.kernel.org/n/tip-p4uc1c15ssbj8xj7ku5slpa6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
regs_query_register_offset() is a helper function which converts
register name like "%rax" to offset of a register in 'struct pt_regs',
which is required by BPF prologue generator. Since the function is
identical, try to reuse the code in arch/x86/kernel/ptrace.c.
Comment inside dwarf-regs.c list the differences between this
implementation and kernel code.
get_arch_regstr() switches to regoffset_table and the old string table
is dropped.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1441523623-152703-20-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In ce80d3bef9 ("perf tools: Rename perf_session_env to perf_env") we
forgot to rename a few functions to the "perf_env" prefix, do it now.
Tested-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-b3ui3z6ock89z1814pu2er98@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for selecting and processing PERF_RECORD_SWITCH events for
use by Intel PT. If they are available, they will be used in preference
to sched_switch events.
This enables an unprivileged user to trace multi-threaded or
multi-process workloads with any level of perf_event_paranoid. However
it depends on kernel support for PERF_RECORD_SWITCH.
Without this patch, tracing a multi-threaded workload will decode
without error but all the data will be attributed to the main thread.
Without this patch, tracing a multi-process workload will result in
decoder errors because the decoder will not know which executable is
executing.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1439458857-30636-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch makes perf compile on non x86 platforms by defining a weak
symbol for sample_reg_masks[] in util/perf_regs.c.
The patch also moves the REG() and REG_END() macros into the
util/per_regs.h header file. The macros are renamed to
SMPL_REG/SMPL_REG_END to avoid clashes with other header files.
Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1441099814-26783-1-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is a problem in the dwarf-regs.c files for sh, sparc and x86 where
it is possible to make an out-of-bounds array access when searching for
register names.
This patch fixes it by replacing '<=' to '<', so when register (number
== XXX_MAX_REGS), get_arch_regstr() will return NULL.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Reviewed-by: Matt Fleming <matt@console-pimps.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@huawei.com
Link: http://lkml.kernel.org/r/1441078184-105038-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>