perf tools changes for v5.13: 1st batch
perf stat:

- Add support for hybrid PMUs to support systems such as Intel
  Alderlake and its BIG/little core/atom cpus.
- Introduce 'bperf' to share hardware PMCs with BPF.
- New --iostat option to collect and present IO stats on Intel hardware.

  This functionality is based on recently introduced sysfs attributes for
  Intel® Xeon® Scalable processor family (code name Skylake-SP) in commit
  bb42b3d397 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON
  mapping")

  It is intended to provide four I/O performance metrics in MB per each
  PCIe root port:

  - Inbound Read: I/O devices below root port read from the host memory
  - Inbound Write: I/O devices below root port write to the host memory
  - Outbound Read: CPU reads from I/O devices below root port
  - Outbound Write: CPU writes to I/O devices below root port

- Align CSV output for summary.
- Clarify --null use cases: Assess raw overhead of 'perf stat' or
  measure just wall clock time.
- Improve readability of shadow stats.

perf record:

- Change the COMM when starting the workload so that --exclude-perf
  doesn't seem to be not honoured.
- Improve 'Workload failed' message printing events + what was exec'ed.
- Fix cross-arch support for TIME_CONV.

perf report:

- Add option to disable raw event ordering.
- Dump the contents of PERF_RECORD_TIME_CONV in 'perf report -D'.
- Improvements to --stat output, that shows information about
  PERF_RECORD_ events.
- Preserve identifier id in OCaml demangler.

perf annotate:

- Show full source location with 'l' hotkey in the 'perf annotate' TUI.
- Add line number like in TUI and source location at EOL to the 'perf
  annotate' --stdio mode.
- Add --demangle and --demangle-kernel to 'perf annotate'.
- Allow configuring annotate.demangle{,_kernel} in 'perf config'.
- Fix sample events lost in stdio mode.

perf data:

- Allow converting a perf.data file to JSON.

libperf:

- Add support for user space counter access.
- Update topdown documentation to permit rdpmc calls.

perf test:

- Add 'perf test' for 'perf stat' CSV output.
- Add 'perf test' entries to test the hybrid PMU support.
- Cleanup 'perf test daemon' if its 'perf test' is interrupted.
- Handle metric reuse in pmu-events parsing 'perf test' entry.
- Add test for PE executable support.
- Add timeout for wait for daemon start in its 'perf test' entries.

Build:

- Enable libtraceevent dynamic linking.
- Improve feature detection output.
- Fix caching of feature checks.
- First round of updates for tools copies of kernel headers.
- Enable warnings when compiling BPF programs.

Vendor specific events:

- Intel:
  - Add missing skylake & icelake model numbers.
- arm64:
  - Add Hisi hip08 L1, L2 and L3 metrics.
  - Add Fujitsu A64FX PMU events.
- PowerPC:
  - Initial JSON/events list for power10 platform.
  - Remove unsupported power9 metrics.
- AMD:
  - Add Zen3 events.
  - Fix broken L2 Cache Hits from L2 HWPF metric.
  - Use lowercases for all the eventcodes and umasks.

Hardware tracing:

- arm64:
  - Update CoreSight ETM metadata format.
  - Fix bitmap for CS-ETM option.
  - Support PID tracing in config.
  - Detect pid in VMID for kernel running at EL2.

Arch specific:

- MIPS:
  - Support MIPS unwinding and dwarf-regs.
  - Generate mips syscalls_n64.c syscall table.
- PowerPC:
  - Add support for PERF_SAMPLE_WEIGHT_STRUCT on PowerPC.
  - Support pipeline stage cycles for powerpc.

libbeauty:

- Fix fsconfig generator.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----

iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYIshAwAKCRCyPKLppCJ+
J8oWAP9c1POclDQ7AZDe5/t/InZYSQKJFIku1sE1SNCSOupy7wEAuPBtaN7wDaRj
BFBibfUGd4MNzLPvMMHneIhSY3DgJwg=
=FLLr
-----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v5.13-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tool updates from Arnaldo Carvalho de Melo; the merge message
repeats the list above.

* tag 'perf-tools-for-v5.13-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (132 commits)
  perf build: Defer printing detected features to the end of all feature checks
  tools build: Allow deferring printing the results of feature detection
  perf build: Regenerate the FEATURE_DUMP file after extra feature checks
  perf session: Dump PERF_RECORD_TIME_CONV event
  perf session: Add swap operation for event TIME_CONV
  perf jit: Let convert_timestamp() to be backwards-compatible
  perf tools: Change fields type in perf_record_time_conv
  perf tools: Enable libtraceevent dynamic linking
  perf Documentation: Document intel-hybrid support
  perf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid
  perf tests: Support 'Convert perf time to TSC' test for hybrid
  perf tests: Support 'Session topology' test for hybrid
  perf tests: Support 'Parse and process metrics' test for hybrid
  perf tests: Support 'Track with sched_switch' test for hybrid
  perf tests: Skip 'Setup struct perf_event_attr' test for hybrid
  perf tests: Add hybrid cases for 'Roundtrip evsel->name' test
  perf tests: Add hybrid cases for 'Parse event definition strings' test
  perf record: Uniquify hybrid event name
  perf stat: Warn group events from different hybrid PMU
  perf stat: Filter out unmatched aggregation for hybrid event
  ...
commit 10a3efd0fe
@@ -14290,8 +14290,10 @@ R:	Mark Rutland <mark.rutland@arm.com>
 R:	Alexander Shishkin <alexander.shishkin@linux.intel.com>
 R:	Jiri Olsa <jolsa@redhat.com>
 R:	Namhyung Kim <namhyung@kernel.org>
+L:	linux-perf-users@vger.kernel.org
 L:	linux-kernel@vger.kernel.org
 S:	Supported
+W:	https://perf.wiki.kernel.org/
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf/core
 F:	arch/*/events/*
 F:	arch/*/events/*/*
@@ -52,6 +52,7 @@ FEATURE_TESTS_BASIC := \
 	libpython-version \
 	libslang \
 	libslang-include-subdir \
+	libtraceevent \
 	libcrypto \
 	libunwind \
 	pthread-attr-setaffinity-np \
@@ -239,17 +240,24 @@ ifeq ($(VF),1)
   feature_verbose := 1
 endif
 
-ifeq ($(feature_display),1)
-  $(info )
-  $(info Auto-detecting system features:)
-  $(foreach feat,$(FEATURE_DISPLAY),$(call feature_print_status,$(feat),))
-  ifneq ($(feature_verbose),1)
+feature_display_entries = $(eval $(feature_display_entries_code))
+define feature_display_entries_code
+  ifeq ($(feature_display),1)
+    $(info )
+    $(info Auto-detecting system features:)
+    $(foreach feat,$(FEATURE_DISPLAY),$(call feature_print_status,$(feat),))
+    ifneq ($(feature_verbose),1)
+      $(info )
+    endif
+  endif
+
+  ifeq ($(feature_verbose),1)
+    TMP := $(filter-out $(FEATURE_DISPLAY),$(FEATURE_TESTS))
+    $(foreach feat,$(TMP),$(call feature_print_status,$(feat),))
     $(info )
   endif
-endif
+endef
 
-ifeq ($(feature_verbose),1)
-  TMP := $(filter-out $(FEATURE_DISPLAY),$(FEATURE_TESTS))
-  $(foreach feat,$(TMP),$(call feature_print_status,$(feat),))
-  $(info )
+ifeq ($(FEATURE_DISPLAY_DEFERRED),)
+  $(call feature_display_entries)
 endif
@@ -36,6 +36,7 @@ FILES= \
 	test-libpython-version.bin \
 	test-libslang.bin \
 	test-libslang-include-subdir.bin \
+	test-libtraceevent.bin \
 	test-libcrypto.bin \
 	test-libunwind.bin \
 	test-libunwind-debug-frame.bin \
@@ -196,6 +197,9 @@ $(OUTPUT)test-libslang.bin:
 $(OUTPUT)test-libslang-include-subdir.bin:
 	$(BUILD) -lslang
 
+$(OUTPUT)test-libtraceevent.bin:
+	$(BUILD) -ltraceevent
+
 $(OUTPUT)test-libcrypto.bin:
 	$(BUILD) -lcrypto
 
@@ -0,0 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <traceevent/trace-seq.h>
+
+int main(void)
+{
+	int rv = 0;
+	struct trace_seq s;
+	trace_seq_init(&s);
+	rv += !(s.state == TRACE_SEQ__GOOD);
+	trace_seq_destroy(&s);
+	return rv;
+}
@@ -0,0 +1,75 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_MATH64_H
+#define _LINUX_MATH64_H
+
+#include <linux/types.h>
+
+#ifdef __x86_64__
+static inline u64 mul_u64_u64_div64(u64 a, u64 b, u64 c)
+{
+	u64 q;
+
+	asm ("mulq %2; divq %3" : "=a" (q)
+				: "a" (a), "rm" (b), "rm" (c)
+				: "rdx");
+
+	return q;
+}
+#define mul_u64_u64_div64 mul_u64_u64_div64
+#endif
+
+#ifdef __SIZEOF_INT128__
+static inline u64 mul_u64_u32_shr(u64 a, u32 b, unsigned int shift)
+{
+	return (u64)(((unsigned __int128)a * b) >> shift);
+}
+
+#else
+
+#ifdef __i386__
+static inline u64 mul_u32_u32(u32 a, u32 b)
+{
+	u32 high, low;
+
+	asm ("mull %[b]" : "=a" (low), "=d" (high)
+			 : [a] "a" (a), [b] "rm" (b) );
+
+	return low | ((u64)high) << 32;
+}
+#else
+static inline u64 mul_u32_u32(u32 a, u32 b)
+{
+	return (u64)a * b;
+}
+#endif
+
+static inline u64 mul_u64_u32_shr(u64 a, u32 b, unsigned int shift)
+{
+	u32 ah, al;
+	u64 ret;
+
+	al = a;
+	ah = a >> 32;
+
+	ret = mul_u32_u32(al, b) >> shift;
+	if (ah)
+		ret += mul_u32_u32(ah, b) << (32 - shift);
+
+	return ret;
+}
+
+#endif	/* __SIZEOF_INT128__ */
+
+#ifndef mul_u64_u64_div64
+static inline u64 mul_u64_u64_div64(u64 a, u64 b, u64 c)
+{
+	u64 quot, rem;
+
+	quot = a / c;
+	rem = a % c;
+
+	return quot * b + (rem * b) / c;
+}
+#endif
+
+#endif /* _LINUX_MATH64_H */
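Review note: the fallback mul_u64_u32_shr() in this header (the branch taken when no 128-bit integer type is available) splits the 64-bit operand into 32-bit halves. The sketch below restates that arithmetic on its own, using <stdint.h> types. The name mul_shr_split is hypothetical and not part of the header, and the code assumes shift is at most 32 and that the result fits in 64 bits.

```c
#include <stdint.h>

/*
 * a * b = al * b + ((ah * b) << 32), so for shift <= 32 the shifted
 * product equals ((al * b) >> shift) + ((ah * b) << (32 - shift)).
 * The second term is an exact multiple of 2^shift, so no carry is lost.
 */
static inline uint64_t mul_shr_split(uint64_t a, uint32_t b, unsigned int shift)
{
	uint32_t al = (uint32_t)a;         /* low 32 bits of a */
	uint32_t ah = (uint32_t)(a >> 32); /* high 32 bits of a */
	uint64_t ret = ((uint64_t)al * b) >> shift;

	if (ah)
		ret += ((uint64_t)ah * b) << (32 - shift);
	return ret;
}
```

For instance, mul_shr_split(0x100000000ULL, 3, 4) gives 0x30000000, which is the full product 0x300000000 shifted right by four bits.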
@@ -61,6 +61,9 @@ typedef __u32 __bitwise __be32;
 typedef __u64 __bitwise __le64;
 typedef __u64 __bitwise __be64;
 
+typedef __u16 __bitwise __sum16;
+typedef __u32 __bitwise __wsum;
+
 typedef struct {
 	int counter;
 } atomic_t;
@@ -37,6 +37,21 @@ enum perf_type_id {
 	PERF_TYPE_MAX,				/* non-ABI */
 };
 
+/*
+ * attr.config layout for type PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE
+ * PERF_TYPE_HARDWARE:			0xEEEEEEEE000000AA
+ *					AA: hardware event ID
+ *					EEEEEEEE: PMU type ID
+ * PERF_TYPE_HW_CACHE:			0xEEEEEEEE00DDCCBB
+ *					BB: hardware cache ID
+ *					CC: hardware cache op ID
+ *					DD: hardware cache op result ID
+ *					EEEEEEEE: PMU type ID
+ * If the PMU type ID is 0, the PERF_TYPE_RAW will be applied.
+ */
+#define PERF_PMU_TYPE_SHIFT		32
+#define PERF_HW_EVENT_MASK		0xffffffff
+
 /*
  * Generalized performance event event_id types, used by the
  * attr.event_id parameter of the sys_perf_event_open()
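Review note: the attr.config layout comment added in this hunk can be exercised with a small sketch. The helper names (hybrid_hw_config, hybrid_cache_config, config_pmu_type, config_event) are illustrative only and do not exist in the kernel headers. The two macros are redefined under guards so the block compiles against older system headers, and any PMU type ID passed in is a placeholder for the value a real system reports.

```c
#include <stdint.h>

/* Same values as the macros added by this hunk; guarded in case the
 * installed <linux/perf_event.h> already defines them. */
#ifndef PERF_PMU_TYPE_SHIFT
#define PERF_PMU_TYPE_SHIFT 32
#endif
#ifndef PERF_HW_EVENT_MASK
#define PERF_HW_EVENT_MASK 0xffffffff
#endif

/* PERF_TYPE_HARDWARE layout: 0xEEEEEEEE000000AA */
static inline uint64_t hybrid_hw_config(uint32_t pmu_type, uint8_t hw_id)
{
	return ((uint64_t)pmu_type << PERF_PMU_TYPE_SHIFT) | hw_id;
}

/* PERF_TYPE_HW_CACHE layout: 0xEEEEEEEE00DDCCBB */
static inline uint64_t hybrid_cache_config(uint32_t pmu_type, uint8_t cache_id,
					   uint8_t op_id, uint8_t result_id)
{
	return ((uint64_t)pmu_type << PERF_PMU_TYPE_SHIFT) |
	       ((uint64_t)result_id << 16) | ((uint64_t)op_id << 8) | cache_id;
}

/* Upper 32 bits: PMU type ID (0 means no specific PMU was requested). */
static inline uint32_t config_pmu_type(uint64_t config)
{
	return (uint32_t)(config >> PERF_PMU_TYPE_SHIFT);
}

/* Lower 32 bits: the generic event encoding. */
static inline uint64_t config_event(uint64_t config)
{
	return config & PERF_HW_EVENT_MASK;
}
```

With a placeholder PMU type of 4, hybrid_cache_config(4, 0x01, 0x02, 0x03) produces 0x0000000400030201, and config_pmu_type() and config_event() recover 4 and 0x30201 from it.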
@@ -136,6 +136,9 @@ SYNOPSIS
                         struct perf_thread_map *threads);
   void perf_evsel__close(struct perf_evsel *evsel);
   void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu);
+  int perf_evsel__mmap(struct perf_evsel *evsel, int pages);
+  void perf_evsel__munmap(struct perf_evsel *evsel);
+  void *perf_evsel__mmap_base(struct perf_evsel *evsel, int cpu, int thread);
   int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
                        struct perf_counts_values *count);
   int perf_evsel__enable(struct perf_evsel *evsel);
@@ -11,10 +11,12 @@
 #include <stdlib.h>
 #include <internal/xyarray.h>
 #include <internal/cpumap.h>
+#include <internal/mmap.h>
 #include <internal/threadmap.h>
 #include <internal/lib.h>
 #include <linux/string.h>
 #include <sys/ioctl.h>
+#include <sys/mman.h>
 
 void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr)
 {
@@ -38,6 +40,7 @@ void perf_evsel__delete(struct perf_evsel *evsel)
 }
 
 #define FD(e, x, y) (*(int *) xyarray__entry(e->fd, x, y))
+#define MMAP(e, x, y) (e->mmap ? ((struct perf_mmap *) xyarray__entry(e->mmap, x, y)) : NULL)
 
 int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
 {
@@ -55,6 +58,13 @@ int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
 	return evsel->fd != NULL ? 0 : -ENOMEM;
 }
 
+static int perf_evsel__alloc_mmap(struct perf_evsel *evsel, int ncpus, int nthreads)
+{
+	evsel->mmap = xyarray__new(ncpus, nthreads, sizeof(struct perf_mmap));
+
+	return evsel->mmap != NULL ? 0 : -ENOMEM;
+}
+
 static int
 sys_perf_event_open(struct perf_event_attr *attr,
 		    pid_t pid, int cpu, int group_fd,
@@ -156,6 +166,72 @@ void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu)
 	perf_evsel__close_fd_cpu(evsel, cpu);
 }
 
+void perf_evsel__munmap(struct perf_evsel *evsel)
+{
+	int cpu, thread;
+
+	if (evsel->fd == NULL || evsel->mmap == NULL)
+		return;
+
+	for (cpu = 0; cpu < xyarray__max_x(evsel->fd); cpu++) {
+		for (thread = 0; thread < xyarray__max_y(evsel->fd); thread++) {
+			int fd = FD(evsel, cpu, thread);
+			struct perf_mmap *map = MMAP(evsel, cpu, thread);
+
+			if (fd < 0)
+				continue;
+
+			perf_mmap__munmap(map);
+		}
+	}
+
+	xyarray__delete(evsel->mmap);
+	evsel->mmap = NULL;
+}
+
+int perf_evsel__mmap(struct perf_evsel *evsel, int pages)
+{
+	int ret, cpu, thread;
+	struct perf_mmap_param mp = {
+		.prot = PROT_READ | PROT_WRITE,
+		.mask = (pages * page_size) - 1,
+	};
+
+	if (evsel->fd == NULL || evsel->mmap)
+		return -EINVAL;
+
+	if (perf_evsel__alloc_mmap(evsel, xyarray__max_x(evsel->fd), xyarray__max_y(evsel->fd)) < 0)
+		return -ENOMEM;
+
+	for (cpu = 0; cpu < xyarray__max_x(evsel->fd); cpu++) {
+		for (thread = 0; thread < xyarray__max_y(evsel->fd); thread++) {
+			int fd = FD(evsel, cpu, thread);
+			struct perf_mmap *map = MMAP(evsel, cpu, thread);
+
+			if (fd < 0)
+				continue;
+
+			perf_mmap__init(map, NULL, false, NULL);
+
+			ret = perf_mmap__mmap(map, &mp, fd, cpu);
+			if (ret) {
+				perf_evsel__munmap(evsel);
+				return ret;
+			}
+		}
+	}
+
+	return 0;
+}
+
+void *perf_evsel__mmap_base(struct perf_evsel *evsel, int cpu, int thread)
+{
+	if (FD(evsel, cpu, thread) < 0 || MMAP(evsel, cpu, thread) == NULL)
+		return NULL;
+
+	return MMAP(evsel, cpu, thread)->base;
+}
+
 int perf_evsel__read_size(struct perf_evsel *evsel)
 {
 	u64 read_format = evsel->attr.read_format;
@@ -191,6 +267,10 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
 	if (FD(evsel, cpu, thread) < 0)
 		return -EINVAL;
 
+	if (MMAP(evsel, cpu, thread) &&
+	    !perf_mmap__read_self(MMAP(evsel, cpu, thread), count))
+		return 0;
+
 	if (readn(FD(evsel, cpu, thread), count->values, size) <= 0)
 		return -errno;
 
@@ -41,6 +41,7 @@ struct perf_evsel {
 	struct perf_cpu_map	*own_cpus;
 	struct perf_thread_map	*threads;
 	struct xyarray		*fd;
+	struct xyarray		*mmap;
 	struct xyarray		*sample_id;
 	u64			*id;
 	u32			 ids;
@@ -11,6 +11,7 @@
 #define PERF_SAMPLE_MAX_SIZE (1 << 16)
 
 struct perf_mmap;
+struct perf_counts_values;
 
 typedef void (*libperf_unmap_cb_t)(struct perf_mmap *map);
 
@@ -52,4 +53,6 @@ void perf_mmap__put(struct perf_mmap *map);
 
 u64 perf_mmap__read_head(struct perf_mmap *map);
 
+int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count);
+
 #endif /* __LIBPERF_INTERNAL_MMAP_H */
@@ -3,11 +3,32 @@
 #define __LIBPERF_INTERNAL_TESTS_H
 
 #include <stdio.h>
+#include <unistd.h>
 
 int tests_failed;
+int tests_verbose;
+
+static inline int get_verbose(char **argv, int argc)
+{
+	int c;
+	int verbose = 0;
+
+	while ((c = getopt(argc, argv, "v")) != -1) {
+		switch (c)
+		{
+		case 'v':
+			verbose = 1;
+			break;
+		default:
+			break;
+		}
+	}
+	return verbose;
+}
 
 #define __T_START					\
 do {							\
+	tests_verbose = get_verbose(argv, argc);	\
 	fprintf(stdout, "- running %s...", __FILE__);	\
 	fflush(NULL);					\
 	tests_failed = 0;				\
@@ -30,4 +51,15 @@ do {
 	}								\
 } while (0)
 
+#define __T_VERBOSE(...)						\
+do {									\
+	if (tests_verbose) {						\
+		if (tests_verbose == 1) {				\
+			fputc('\n', stderr);				\
+			tests_verbose++;				\
+		}							\
+		fprintf(stderr, ##__VA_ARGS__);				\
+	}								\
+} while (0)
+
 #endif /* __LIBPERF_INTERNAL_TESTS_H */
@@ -18,11 +18,18 @@ struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size);
 void xyarray__delete(struct xyarray *xy);
 void xyarray__reset(struct xyarray *xy);
 
-static inline void *xyarray__entry(struct xyarray *xy, int x, int y)
+static inline void *__xyarray__entry(struct xyarray *xy, int x, int y)
 {
 	return &xy->contents[x * xy->row_size + y * xy->entry_size];
 }
 
+static inline void *xyarray__entry(struct xyarray *xy, size_t x, size_t y)
+{
+	if (x >= xy->max_x || y >= xy->max_y)
+		return NULL;
+	return __xyarray__entry(xy, x, y);
+}
+
 static inline int xyarray__max_y(struct xyarray *xy)
 {
 	return xy->max_y;
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+#ifndef __LIBPERF_BPF_PERF_H
+#define __LIBPERF_BPF_PERF_H
+
+#include <linux/types.h>  /* for __u32 */
+
+/*
+ * bpf_perf uses a hashmap, the attr_map, to track all the leader programs.
+ * The hashmap is pinned in bpffs. flock() on this file is used to ensure
+ * no concurrent access to the attr_map.  The key of attr_map is struct
+ * perf_event_attr, and the value is struct perf_event_attr_map_entry.
+ *
+ * struct perf_event_attr_map_entry contains two __u32 IDs, bpf_link of the
+ * leader prog, and the diff_map. Each perf-stat session holds a reference
+ * to the bpf_link to make sure the leader prog is attached to sched_switch
+ * tracepoint.
+ *
+ * Since the hashmap only contains IDs of the bpf_link and diff_map, it
+ * does not hold any references to the leader program. Once all perf-stat
+ * sessions of these events exit, the leader prog, its maps, and the
+ * perf_events will be freed.
+ */
+struct perf_event_attr_map_entry {
+	__u32 link_id;
+	__u32 diff_map_id;
+};
+
+/* default attr_map name */
+#define BPF_PERF_DEFAULT_ATTR_MAP_PATH "perf_attr_map"
+
+#endif /* __LIBPERF_BPF_PERF_H */
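Review note: the header comment says flock() on the pinned attr_map serializes access between perf-stat sessions. The sketch below shows only that locking pattern, on an ordinary file, and is not the bperf implementation. The helper with_exclusive_lock, the callback bump_counter, and the path in the usage note are all hypothetical.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Example callback for the usage note: increments an int counter. */
static int bump_counter(void *arg)
{
	++*(int *)arg;
	return 0;
}

/*
 * Run fn(arg) while holding an exclusive flock() on `path`.
 * Returns -1 if the file cannot be opened or locked, otherwise
 * whatever fn returned. The file is created if it does not exist.
 */
static int with_exclusive_lock(const char *path, int (*fn)(void *), void *arg)
{
	int ret;
	int fd = open(path, O_RDONLY | O_CREAT, 0600);

	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_EX) != 0) {
		close(fd);
		return -1;
	}
	ret = fn(arg);          /* critical section: one holder at a time */
	flock(fd, LOCK_UN);
	close(fd);
	return ret;
}
```

A call such as with_exclusive_lock("/tmp/attr_map_demo.lock", bump_counter, &n) blocks until no other process holds the lock, runs the callback once, and releases the lock before returning.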
@@ -8,6 +8,8 @@
 #include <linux/bpf.h>
 #include <sys/types.h> /* pid_t */
 
+#define event_contains(obj, mem) ((obj).header.size > offsetof(typeof(obj), mem))
+
 struct perf_record_mmap {
 	struct perf_event_header header;
 	__u32			 pid, tid;
@@ -346,8 +348,9 @@ struct perf_record_time_conv {
 	__u64			 time_zero;
 	__u64			 time_cycles;
 	__u64			 time_mask;
-	bool			 cap_user_time_zero;
-	bool			 cap_user_time_short;
+	__u8			 cap_user_time_zero;
+	__u8			 cap_user_time_short;
+	__u8			 reserved[6];	/* For alignment */
 };
 
 struct perf_record_header_feature {
@@ -27,6 +27,9 @@ LIBPERF_API int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *
 				 struct perf_thread_map *threads);
 LIBPERF_API void perf_evsel__close(struct perf_evsel *evsel);
 LIBPERF_API void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu);
+LIBPERF_API int perf_evsel__mmap(struct perf_evsel *evsel, int pages);
+LIBPERF_API void perf_evsel__munmap(struct perf_evsel *evsel);
+LIBPERF_API void *perf_evsel__mmap_base(struct perf_evsel *evsel, int cpu, int thread);
 LIBPERF_API int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
 				 struct perf_counts_values *count);
 LIBPERF_API int perf_evsel__enable(struct perf_evsel *evsel);
@@ -23,6 +23,9 @@ LIBPERF_0.0.1 {
 		perf_evsel__disable;
 		perf_evsel__open;
 		perf_evsel__close;
+		perf_evsel__mmap;
+		perf_evsel__munmap;
+		perf_evsel__mmap_base;
 		perf_evsel__read;
 		perf_evsel__cpus;
 		perf_evsel__threads;
@@ -8,9 +8,11 @@
 #include <linux/perf_event.h>
 #include <perf/mmap.h>
 #include <perf/event.h>
+#include <perf/evsel.h>
 #include <internal/mmap.h>
 #include <internal/lib.h>
 #include <linux/kernel.h>
+#include <linux/math64.h>
 #include "internal.h"
 
 void perf_mmap__init(struct perf_mmap *map, struct perf_mmap *prev,
@@ -273,3 +275,89 @@ union perf_event *perf_mmap__read_event(struct perf_mmap *map)
 
 	return event;
 }
+
+#if defined(__i386__) || defined(__x86_64__)
+static u64 read_perf_counter(unsigned int counter)
+{
+	unsigned int low, high;
+
+	asm volatile("rdpmc" : "=a" (low), "=d" (high) : "c" (counter));
+
+	return low | ((u64)high) << 32;
+}
+
+static u64 read_timestamp(void)
+{
+	unsigned int low, high;
+
+	asm volatile("rdtsc" : "=a" (low), "=d" (high));
+
+	return low | ((u64)high) << 32;
+}
+#else
+static u64 read_perf_counter(unsigned int counter __maybe_unused) { return 0; }
+static u64 read_timestamp(void) { return 0; }
+#endif
+
+int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count)
+{
+	struct perf_event_mmap_page *pc = map->base;
+	u32 seq, idx, time_mult = 0, time_shift = 0;
+	u64 cnt, cyc = 0, time_offset = 0, time_cycles = 0, time_mask = ~0ULL;
+
+	if (!pc || !pc->cap_user_rdpmc)
+		return -1;
+
+	do {
+		seq = READ_ONCE(pc->lock);
+		barrier();
+
+		count->ena = READ_ONCE(pc->time_enabled);
+		count->run = READ_ONCE(pc->time_running);
+
+		if (pc->cap_user_time && count->ena != count->run) {
+			cyc = read_timestamp();
+			time_mult = READ_ONCE(pc->time_mult);
+			time_shift = READ_ONCE(pc->time_shift);
+			time_offset = READ_ONCE(pc->time_offset);
+
+			if (pc->cap_user_time_short) {
+				time_cycles = READ_ONCE(pc->time_cycles);
+				time_mask = READ_ONCE(pc->time_mask);
+			}
+		}
+
+		idx = READ_ONCE(pc->index);
+		cnt = READ_ONCE(pc->offset);
+		if (pc->cap_user_rdpmc && idx) {
+			s64 evcnt = read_perf_counter(idx - 1);
+			u16 width = READ_ONCE(pc->pmc_width);
+
+			evcnt <<= 64 - width;
+			evcnt >>= 64 - width;
+			cnt += evcnt;
+		} else
+			return -1;
+
+		barrier();
+	} while (READ_ONCE(pc->lock) != seq);
+
+	if (count->ena != count->run) {
+		u64 delta;
+
+		/* Adjust for cap_usr_time_short, a nop if not */
+		cyc = time_cycles + ((cyc - time_cycles) & time_mask);
+
+		delta = time_offset + mul_u64_u32_shr(cyc, time_mult, time_shift);
+
+		count->ena += delta;
+		if (idx)
+			count->run += delta;
+
+		cnt = mul_u64_u64_div64(cnt, count->ena, count->run);
+	}
+
+	count->val = cnt;
+
+	return 0;
+}
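Review note: two arithmetic steps in perf_mmap__read_self() can be checked in isolation: sign-extending the raw counter to pmc_width bits, and extrapolating the count by enabled/running time when the event was multiplexed. The sketch below uses hypothetical helper names (pmc_sign_extend, scale_count) that are not libperf API. It assumes 1 <= width <= 64, a non-zero running time, and a compiler that shifts negative values arithmetically, as GCC and Clang do.

```c
#include <stdint.h>

/*
 * Sign-extend a counter value that is only `width` bits wide: shift the
 * top counter bit up to bit 63, then shift back down arithmetically.
 */
static inline int64_t pmc_sign_extend(uint64_t raw, unsigned int width)
{
	unsigned int unused = 64 - width;

	return (int64_t)(raw << unused) >> unused;
}

/*
 * Extrapolate a count as cnt * enabled / running, using the same
 * quotient/remainder split as the generic mul_u64_u64_div64() fallback.
 * Assumes (cnt % running) * enabled does not overflow 64 bits.
 */
static inline uint64_t scale_count(uint64_t cnt, uint64_t enabled, uint64_t running)
{
	uint64_t quot = cnt / running;
	uint64_t rem = cnt % running;

	return quot * enabled + (rem * enabled) / running;
}
```

For a 48-bit counter, pmc_sign_extend(0xFFFFFFFFFFFFULL, 48) is -1, and an event that counted 1000 while running for half of its enabled time scales to scale_count(1000, 200, 100) == 2000.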
@@ -5,6 +5,8 @@ TESTS = test-cpumap test-threadmap test-evlist test-evsel
 TESTS_SO := $(addsuffix -so,$(TESTS))
 TESTS_A := $(addsuffix -a,$(TESTS))
 
+TEST_ARGS := $(if $(V),-v)
+
 # Set compile option CFLAGS
 ifdef EXTRA_CFLAGS
   CFLAGS := $(EXTRA_CFLAGS)
@@ -28,9 +30,9 @@ all: $(TESTS_A) $(TESTS_SO)
 
 run:
 	@echo "running static:"
-	@for i in $(TESTS_A); do ./$$i; done
+	@for i in $(TESTS_A); do ./$$i $(TEST_ARGS); done
 	@echo "running dynamic:"
-	@for i in $(TESTS_SO); do LD_LIBRARY_PATH=../ ./$$i; done
+	@for i in $(TESTS_SO); do LD_LIBRARY_PATH=../ ./$$i $(TEST_ARGS); done
 
 clean:
 	$(call QUIET_CLEAN, tests)$(RM) $(TESTS_A) $(TESTS_SO)
@@ -120,6 +120,70 @@ static int test_stat_thread_enable(void)
 	return 0;
 }
 
+static int test_stat_user_read(int event)
+{
+	struct perf_counts_values counts = { .val = 0 };
+	struct perf_thread_map *threads;
+	struct perf_evsel *evsel;
+	struct perf_event_mmap_page *pc;
+	struct perf_event_attr attr = {
+		.type	= PERF_TYPE_HARDWARE,
+		.config	= event,
+	};
+	int err, i;
+
+	threads = perf_thread_map__new_dummy();
+	__T("failed to create threads", threads);
+
+	perf_thread_map__set_pid(threads, 0, 0);
+
+	evsel = perf_evsel__new(&attr);
+	__T("failed to create evsel", evsel);
+
+	err = perf_evsel__open(evsel, NULL, threads);
+	__T("failed to open evsel", err == 0);
+
+	err = perf_evsel__mmap(evsel, 0);
+	__T("failed to mmap evsel", err == 0);
+
+	pc = perf_evsel__mmap_base(evsel, 0, 0);
+
+#if defined(__i386__) || defined(__x86_64__)
+	__T("userspace counter access not supported", pc->cap_user_rdpmc);
+	__T("userspace counter access not enabled", pc->index);
+	__T("userspace counter width not set", pc->pmc_width >= 32);
+#endif
+
+	perf_evsel__read(evsel, 0, 0, &counts);
+	__T("failed to read value for evsel", counts.val != 0);
+
+	for (i = 0; i < 5; i++) {
+		volatile int count = 0x10000 << i;
+		__u64 start, end, last = 0;
+
+		__T_VERBOSE("\tloop = %u, ", count);
+
+		perf_evsel__read(evsel, 0, 0, &counts);
+		start = counts.val;
+
+		while (count--) ;
+
+		perf_evsel__read(evsel, 0, 0, &counts);
+		end = counts.val;
+
+		__T("invalid counter data", (end - start) > last);
+		last = end - start;
+		__T_VERBOSE("count = %llu\n", end - start);
+	}
+
+	perf_evsel__munmap(evsel);
+	perf_evsel__close(evsel);
+	perf_evsel__delete(evsel);
+
+	perf_thread_map__put(threads);
+	return 0;
+}
+
 int main(int argc, char **argv)
 {
 	__T_START;
@@ -129,6 +193,8 @@ int main(int argc, char **argv)
 	test_stat_cpu();
 	test_stat_thread();
 	test_stat_thread_enable();
+	test_stat_user_read(PERF_COUNT_HW_INSTRUCTIONS);
+	test_stat_user_read(PERF_COUNT_HW_CPU_CYCLES);
 
 	__T_END;
 	return tests_failed == 0 ? 0 : -1;
@@ -20,6 +20,7 @@ perf.data.old
 output.svg
 perf-archive
 perf-with-kcore
+perf-iostat
 tags
 TAGS
 cscope*
@@ -0,0 +1,214 @@
+Intel hybrid support
+--------------------
+Support for Intel hybrid events within perf tools.
+
+For some Intel platforms, such as AlderLake, which is hybrid platform and
+it consists of atom cpu and core cpu. Each cpu has dedicated event list.
+Part of events are available on core cpu, part of events are available
+on atom cpu and even part of events are available on both.
+
+Kernel exports two new cpu pmus via sysfs:
+/sys/devices/cpu_core
+/sys/devices/cpu_atom
+
+The 'cpus' files are created under the directories. For example,
+
+cat /sys/devices/cpu_core/cpus
+0-15
+
+cat /sys/devices/cpu_atom/cpus
+16-23
+
+It indicates cpu0-cpu15 are core cpus and cpu16-cpu23 are atom cpus.
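Review note: a tool that wants to act on the 'cpus' files has to parse strings such as "0-15". The following is a minimal sketch, assuming the file holds a single "first-last" range or a lone CPU number; comma-separated lists are deliberately not handled. The helper names parse_cpu_range and cpu_range_count are hypothetical and are not perf functions.

```c
#include <stdio.h>

/*
 * Parse "first-last" (or a single CPU number) into *first and *last.
 * Returns 0 on success, -1 on malformed or empty input.
 */
static int parse_cpu_range(const char *s, int *first, int *last)
{
	int n = sscanf(s, "%d-%d", first, last);

	if (n == 1)             /* only one number present, e.g. "7" */
		*last = *first;
	return n >= 1 ? 0 : -1;
}

/* Number of CPUs in the range, or -1 if it cannot be parsed. */
static int cpu_range_count(const char *s)
{
	int first, last;

	if (parse_cpu_range(s, &first, &last) != 0 || last < first)
		return -1;
	return last - first + 1;
}
```

With the example values from the text, cpu_range_count("0-15\n") is 16 core cpus and cpu_range_count("16-23\n") is 8 atom cpus.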
|
||||
|
||||
Quickstart
|
||||
|
||||
List hybrid event
|
||||
-----------------
|
||||
|
||||
As before, use perf-list to list the symbolic event.
|
||||
|
||||
perf list
|
||||
|
||||
inst_retired.any
|
||||
[Fixed Counter: Counts the number of instructions retired. Unit: cpu_atom]
|
||||
inst_retired.any
|
||||
[Number of instructions retired. Fixed Counter - architectural event. Unit: cpu_core]
|
||||
|
||||
'Unit: xxx' is added to the brief description to indicate which pmu
the event belongs to. The same event name can be supported on
different pmus.
|
||||
|
||||
Enable hybrid event with a specific pmu
|
||||
---------------------------------------
|
||||
|
||||
To enable a core-only or atom-only event, the following syntax is supported:
|
||||
|
||||
cpu_core/<event name>/
|
||||
or
|
||||
cpu_atom/<event name>/
|
||||
|
||||
For example, count the 'cycles' event on core cpus.
|
||||
|
||||
perf stat -e cpu_core/cycles/
|
||||
|
||||
Create two events for one hardware event automatically
|
||||
------------------------------------------------------
|
||||
|
||||
When creating one event and the event is available on both atom and core,
|
||||
two events are created automatically. One is for atom, the other is for
|
||||
core. Most hardware events and cache events are available on both
|
||||
cpu_core and cpu_atom.
|
||||
|
||||
Hardware events have pre-defined configs (e.g. 0 for cycles).
But on a hybrid platform, the kernel needs to know where the event comes
from (atom or core). The original perf event type PERF_TYPE_HARDWARE
can't carry pmu information, so this type is extended to be a PMU-aware
type. The PMU type ID is stored at attr.config[63:32].
|
||||
|
||||
PMU type ID is retrieved from sysfs.
|
||||
/sys/devices/cpu_atom/type
|
||||
/sys/devices/cpu_core/type
|
||||
|
||||
The new attr.config layout for PERF_TYPE_HARDWARE:
|
||||
|
||||
PERF_TYPE_HARDWARE: 0xEEEEEEEE000000AA
|
||||
AA: hardware event ID
|
||||
EEEEEEEE: PMU type ID
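The layout above can be sketched as a pair of pack/unpack helpers. This is only an illustration of the bit layout described here; the function and macro names are hypothetical and are not taken from the kernel or perf headers.

```c
#include <stdint.h>

/* Hypothetical names, chosen for this sketch only. */
#define HYBRID_PMU_TYPE_SHIFT	32		/* PMU type ID in config[63:32] */
#define HYBRID_HW_EVENT_MASK	0xffULL		/* AA: hardware event ID */

/* Build attr.config for PERF_TYPE_HARDWARE: 0xEEEEEEEE000000AA */
uint64_t hybrid_hw_config(uint32_t pmu_type, uint8_t hw_event_id)
{
	return ((uint64_t)pmu_type << HYBRID_PMU_TYPE_SHIFT) | hw_event_id;
}

/* Extract the PMU type ID (EEEEEEEE) from attr.config */
uint32_t hybrid_config_pmu_type(uint64_t config)
{
	return (uint32_t)(config >> HYBRID_PMU_TYPE_SHIFT);
}

/* Extract the hardware event ID (AA) from attr.config */
uint8_t hybrid_config_hw_event(uint64_t config)
{
	return (uint8_t)(config & HYBRID_HW_EVENT_MASK);
}
```

For instance, a PMU type ID of 4 with hardware event ID 0 (cycles) packs to 0x400000000, and unpacking that value returns 4 and 0 again.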
|
||||
|
||||
Cache events are similar. The type PERF_TYPE_HW_CACHE is extended to be
a PMU-aware type. The PMU type ID is stored at attr.config[63:32].
|
||||
|
||||
The new attr.config layout for PERF_TYPE_HW_CACHE:
|
||||
|
||||
PERF_TYPE_HW_CACHE: 0xEEEEEEEE00DDCCBB
|
||||
BB: hardware cache ID
|
||||
CC: hardware cache op ID
|
||||
DD: hardware cache op result ID
|
||||
EEEEEEEE: PMU type ID
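As with hardware events, the cache layout can be sketched as a small packing helper. The function name below is hypothetical and the code only mirrors the byte positions listed above; it is not the kernel's or perf's implementation.

```c
#include <stdint.h>

/*
 * Build attr.config for PERF_TYPE_HW_CACHE: 0xEEEEEEEE00DDCCBB
 *   BB: hardware cache ID            (bits  7:0)
 *   CC: hardware cache op ID         (bits 15:8)
 *   DD: hardware cache op result ID  (bits 23:16)
 *   EEEEEEEE: PMU type ID            (bits 63:32)
 * (Hypothetical helper for illustration.)
 */
uint64_t hybrid_cache_config(uint32_t pmu_type, uint8_t cache_id,
			     uint8_t op_id, uint8_t result_id)
{
	return ((uint64_t)pmu_type << 32) |
	       ((uint64_t)result_id << 16) |
	       ((uint64_t)op_id << 8) |
	       (uint64_t)cache_id;
}
```

For example, PMU type ID 8 with cache ID 1, op ID 0 and result ID 0 gives 0x800000001.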
|
||||
|
||||
When enabling a hardware event without a specified pmu, such as
'perf stat -e cycles -a' (system-wide in this example), two events
are created automatically.
|
||||
|
||||
------------------------------------------------------------
|
||||
perf_event_attr:
|
||||
size 120
|
||||
config 0x400000000
|
||||
sample_type IDENTIFIER
|
||||
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
|
||||
disabled 1
|
||||
inherit 1
|
||||
exclude_guest 1
|
||||
------------------------------------------------------------
|
||||
|
||||
and
|
||||
|
||||
------------------------------------------------------------
|
||||
perf_event_attr:
|
||||
size 120
|
||||
config 0x800000000
|
||||
sample_type IDENTIFIER
|
||||
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
|
||||
disabled 1
|
||||
inherit 1
|
||||
exclude_guest 1
|
||||
------------------------------------------------------------
|
||||
|
||||
type 0 is PERF_TYPE_HARDWARE.
|
||||
0x4 in 0x400000000 indicates it's cpu_core pmu.
|
||||
0x8 in 0x800000000 indicates it's cpu_atom pmu (the atom pmu type id is not fixed).
|
||||
|
||||
The kernel creates 'cycles' (0x400000000) on cpu0-cpu15 (core cpus),
|
||||
and creates 'cycles' (0x800000000) on cpu16-cpu23 (atom cpus).
|
||||
|
||||
The perf-stat result displays two events:
|
||||
|
||||
Performance counter stats for 'system wide':
|
||||
|
||||
6,744,979 cpu_core/cycles/
|
||||
1,965,552 cpu_atom/cycles/
|
||||
|
||||
The first 'cycles' is the core event, the second 'cycles' is the atom event.
|
||||
|
||||
Thread mode example:
|
||||
--------------------
|
||||
|
||||
perf-stat reports scaled counts for hybrid events, with a percentage
displayed. The percentage is the event's running time divided by its enabled time.
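The relationship between the raw count, the two times and the displayed values can be sketched as follows. This is a minimal illustration that assumes the usual extrapolation of raw * enabled / running; the function names are hypothetical and the rounding is a choice made for this sketch.

```c
#include <stdint.h>

/* Percentage of the enabled time during which the event was running. */
double running_percent(uint64_t enabled, uint64_t running)
{
	return enabled ? 100.0 * (double)running / (double)enabled : 0.0;
}

/*
 * Extrapolate a raw count to the full enabled time.
 * Returns 0 if the event never ran, since nothing can be extrapolated.
 */
uint64_t scaled_count(uint64_t raw, uint64_t enabled, uint64_t running)
{
	if (!running)
		return 0;
	/* + 0.5 rounds to the nearest integer */
	return (uint64_t)((double)raw * (double)enabled / (double)running + 0.5);
}
```

An event that ran for a quarter of its enabled time and counted 100 would be reported as 400 with a percentage of 25%. The smaller the percentage, the more the reported value is an extrapolation rather than a measurement.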
|
||||
|
||||
In this example, 'triad_loop' runs on cpu16 (an atom cpu), yet the
scaled value reported for core cycles is 233,066,666 and the percentage is 0.43%.
|
||||
|
||||
perf stat -e cycles -- taskset -c 16 ./triad_loop
|
||||
|
||||
As before, two events are created.
|
||||
|
||||
------------------------------------------------------------
|
||||
perf_event_attr:
|
||||
size 120
|
||||
config 0x400000000
|
||||
sample_type IDENTIFIER
|
||||
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
|
||||
disabled 1
|
||||
inherit 1
|
||||
enable_on_exec 1
|
||||
exclude_guest 1
|
||||
------------------------------------------------------------
|
||||
|
||||
and
|
||||
|
||||
------------------------------------------------------------
|
||||
perf_event_attr:
|
||||
size 120
|
||||
config 0x800000000
|
||||
sample_type IDENTIFIER
|
||||
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
|
||||
disabled 1
|
||||
inherit 1
|
||||
enable_on_exec 1
|
||||
exclude_guest 1
|
||||
------------------------------------------------------------
|
||||
|
||||
Performance counter stats for 'taskset -c 16 ./triad_loop':
|
||||
|
||||
233,066,666 cpu_core/cycles/ (0.43%)
|
||||
604,097,080 cpu_atom/cycles/ (99.57%)
|
||||
|
||||
perf-record:
|
||||
------------
|
||||
|
||||
If no '-e' is specified in perf record on a hybrid platform,
it creates two default 'cycles' events and adds them to the event list.
One is for core, the other is for atom.
|
||||
|
||||
perf-stat:
|
||||
----------
|
||||
|
||||
If no '-e' is specified in perf stat on a hybrid platform, then
besides the software events, the following events are created and
added to the event list in order.
|
||||
|
||||
cpu_core/cycles/,
|
||||
cpu_atom/cycles/,
|
||||
cpu_core/instructions/,
|
||||
cpu_atom/instructions/,
|
||||
cpu_core/branches/,
|
||||
cpu_atom/branches/,
|
||||
cpu_core/branch-misses/,
|
||||
cpu_atom/branch-misses/
|
||||
|
||||
Of course, both perf-stat and perf-record support enabling a
hybrid event with a specific pmu.
|
||||
|
||||
e.g.
|
||||
perf stat -e cpu_core/cycles/
|
||||
perf stat -e cpu_atom/cycles/
|
||||
perf stat -e cpu_core/r1a/
|
||||
perf stat -e cpu_atom/L1-icache-loads/
|
||||
perf stat -e cpu_core/cycles/,cpu_atom/instructions/
|
||||
perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}'
|
||||
|
||||
But '{cpu_core/cycles/,cpu_atom/instructions/}' will return a
warning and disable grouping, because the pmus in the group do
not match (cpu_core vs. cpu_atom).
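The grouping rule can be illustrated with a small check on the event strings. This is a sketch of the rule only, not perf's actual implementation; both function names are hypothetical, and it assumes every event is written in the 'pmu/event/' form used above.

```c
#include <stdbool.h>
#include <string.h>

/* Length of the pmu prefix in "pmu/event/", or 0 if no pmu is given. */
size_t pmu_prefix_len(const char *event)
{
	const char *slash = strchr(event, '/');

	return slash ? (size_t)(slash - event) : 0;
}

/*
 * Return true if every event in the group names the same pmu.
 * (Hypothetical helper for illustration.)
 */
bool group_pmus_match(const char *const *events, size_t nr)
{
	size_t i, len0;

	if (nr == 0)
		return true;

	len0 = pmu_prefix_len(events[0]);
	for (i = 1; i < nr; i++) {
		size_t len = pmu_prefix_len(events[i]);

		/* Prefixes must have equal length and equal contents */
		if (len != len0 || strncmp(events[0], events[i], len0) != 0)
			return false;
	}
	return true;
}
```

Under this check, a group of cpu_core/cycles/ and cpu_core/instructions/ matches, while a group of cpu_core/cycles/ and cpu_atom/instructions/ does not.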
|
|
@ -124,6 +124,13 @@ OPTIONS
|
|||
--group::
|
||||
Show event group information together
|
||||
|
||||
--demangle::
|
||||
Demangle symbol names to human readable form. It's enabled by default,
|
||||
disable with --no-demangle.
|
||||
|
||||
--demangle-kernel::
|
||||
Demangle kernel symbol names to human readable form (for C++ kernels).
|
||||
|
||||
--percent-type::
|
||||
Set annotation percent type from following choices:
|
||||
global-period, local-period, global-hits, local-hits
|
||||
|
|
|
@ -57,7 +57,7 @@ OPTIONS
|
|||
-u::
|
||||
--update=::
|
||||
Update specified file of the cache. Note that this doesn't remove
|
||||
older entires since those may be still needed for annotating old
|
||||
older entries since those may be still needed for annotating old
|
||||
(or remote) perf.data. Only if there is already a cache which has
|
||||
exactly same build-id, that is replaced by new one. It can be used
|
||||
to update kallsyms and kernel dso to vmlinux in order to support
|
||||
|
|
|
@ -123,6 +123,7 @@ Given a $HOME/.perfconfig like this:
|
|||
queue-size = 0
|
||||
children = true
|
||||
group = true
|
||||
skip-empty = true
|
||||
|
||||
[llvm]
|
||||
dump-obj = true
|
||||
|
@ -393,6 +394,12 @@ annotate.*::
|
|||
|
||||
This option works with tui, stdio2 browsers.
|
||||
|
||||
annotate.demangle::
|
||||
Demangle symbol names to human readable form. Default is 'true'.
|
||||
|
||||
annotate.demangle_kernel::
|
||||
Demangle kernel symbol names to human readable form. Default is 'true'.
|
||||
|
||||
hist.*::
|
||||
hist.percentage::
|
||||
This option control the way to calculate overhead of filtered entries -
|
||||
|
@ -525,6 +532,10 @@ report.*::
|
|||
0.07% 0.00% noploop ld-2.15.so [.] strcmp
|
||||
0.03% 0.00% noploop [kernel.kallsyms] [k] timerqueue_del
|
||||
|
||||
report.skip-empty::
|
||||
This option can change default stat behavior with empty results.
|
||||
If it's set true, 'perf report --stat' will not show 0 stats.
|
||||
|
||||
top.*::
|
||||
top.children::
|
||||
Same as 'report.children'. So if it is enabled, the output of 'top'
|
||||
|
|
|
@ -17,7 +17,7 @@ Data file related processing.
|
|||
COMMANDS
|
||||
--------
|
||||
convert::
|
||||
Converts perf data file into another format (only CTF [1] format is support by now).
|
||||
Converts perf data file into another format.
|
||||
It's possible to set data-convert debug variable to get debug messages from conversion,
|
||||
like:
|
||||
perf --debug data-convert data convert ...
|
||||
|
@ -27,6 +27,9 @@ OPTIONS for 'convert'
|
|||
--to-ctf::
|
||||
Triggers the CTF conversion, specify the path of CTF data directory.
|
||||
|
||||
--to-json::
|
||||
Triggers JSON conversion. Specify the JSON filename to output.
|
||||
|
||||
--tod::
|
||||
Convert time to wall clock time.
|
||||
|
||||
|
|
|
@ -0,0 +1,88 @@
|
|||
perf-iostat(1)
|
||||
===============
|
||||
|
||||
NAME
|
||||
----
|
||||
perf-iostat - Show I/O performance metrics
|
||||
|
||||
SYNOPSIS
|
||||
--------
|
||||
[verse]
|
||||
'perf iostat' list
|
||||
'perf iostat' <ports> -- <command> [<options>]
|
||||
|
||||
DESCRIPTION
|
||||
-----------
|
||||
This mode is intended to provide four I/O performance metrics for each PCIe root port:
|
||||
|
||||
- Inbound Read - I/O devices below root port read from the host memory, in MB
|
||||
|
||||
- Inbound Write - I/O devices below root port write to the host memory, in MB
|
||||
|
||||
- Outbound Read - CPU reads from I/O devices below root port, in MB
|
||||
|
||||
- Outbound Write - CPU writes to I/O devices below root port, in MB
|
||||
|
||||
OPTIONS
|
||||
-------
|
||||
<command>...::
|
||||
Any command you can specify in a shell.
|
||||
|
||||
list::
|
||||
List all PCIe root ports.
|
||||
|
||||
<ports>::
|
||||
Select the root ports for monitoring. Comma-separated list is supported.
|
||||
|
||||
EXAMPLES
|
||||
--------
|
||||
|
||||
1. List all PCIe root ports (example for 2-S platform):
|
||||
|
||||
$ perf iostat list
|
||||
S0-uncore_iio_0<0000:00>
|
||||
S1-uncore_iio_0<0000:80>
|
||||
S0-uncore_iio_1<0000:17>
|
||||
S1-uncore_iio_1<0000:85>
|
||||
S0-uncore_iio_2<0000:3a>
|
||||
S1-uncore_iio_2<0000:ae>
|
||||
S0-uncore_iio_3<0000:5d>
|
||||
S1-uncore_iio_3<0000:d7>
|
||||
|
||||
2. Collect metrics for all PCIe root ports:
|
||||
|
||||
$ perf iostat -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
|
||||
357708+0 records in
|
||||
357707+0 records out
|
||||
375083606016 bytes (375 GB, 349 GiB) copied, 215.974 s, 1.7 GB/s
|
||||
|
||||
Performance counter stats for 'system wide':
|
||||
|
||||
port Inbound Read(MB) Inbound Write(MB) Outbound Read(MB) Outbound Write(MB)
|
||||
0000:00 1 0 2 3
|
||||
0000:80 0 0 0 0
|
||||
0000:17 352552 43 0 21
|
||||
0000:85 0 0 0 0
|
||||
0000:3a 3 0 0 0
|
||||
0000:ae 0 0 0 0
|
||||
0000:5d 0 0 0 0
|
||||
0000:d7 0 0 0 0
|
||||
|
||||
3. Collect metrics for comma-separated list of PCIe root ports:
|
||||
|
||||
$ perf iostat 0000:17,0:3a -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
|
||||
357708+0 records in
|
||||
357707+0 records out
|
||||
375083606016 bytes (375 GB, 349 GiB) copied, 197.08 s, 1.9 GB/s
|
||||
|
||||
Performance counter stats for 'system wide':
|
||||
|
||||
port Inbound Read(MB) Inbound Write(MB) Outbound Read(MB) Outbound Write(MB)
|
||||
0000:17 358559 44 0 22
|
||||
0000:3a 3 2 0 0
|
||||
|
||||
197.081983474 seconds time elapsed
|
||||
|
||||
SEE ALSO
|
||||
--------
|
||||
linkperf:perf-stat[1]
|
|
@ -695,6 +695,7 @@ measurements:
|
|||
wait -n ${perf_pid}
|
||||
exit $?
|
||||
|
||||
include::intel-hybrid.txt[]
|
||||
|
||||
SEE ALSO
|
||||
--------
|
||||
|
|
|
@ -112,6 +112,8 @@ OPTIONS
|
|||
- ins_lat: Instruction latency in core cycles. This is the global instruction
|
||||
latency
|
||||
- local_ins_lat: Local instruction latency version
|
||||
- p_stage_cyc: On powerpc, this presents the number of cycles spent in a
|
||||
pipeline stage. It is currently supported only on powerpc.
|
||||
|
||||
By default, comm, dso and symbol keys are used.
|
||||
(i.e. --sort comm,dso,symbol)
|
||||
|
@ -224,6 +226,9 @@ OPTIONS
|
|||
--dump-raw-trace::
|
||||
Dump raw trace in ASCII.
|
||||
|
||||
--disable-order::
|
||||
Disable raw trace ordering.
|
||||
|
||||
-g::
|
||||
--call-graph=<print_type,threshold[,print_limit],order,sort_key[,branch],value>::
|
||||
Display call chains using type, min percent threshold, print limit,
|
||||
|
@ -472,7 +477,7 @@ OPTIONS
|
|||
but probably we'll make the default not to show the switch-on/off events
|
||||
on the --group mode and if there is only one event besides the off/on ones,
|
||||
go straight to the histogram browser, just like 'perf report' with no events
|
||||
explicitely specified does.
|
||||
explicitly specified does.
|
||||
|
||||
--itrace::
|
||||
Options for decoding instruction tracing data. The options are:
|
||||
|
@ -566,6 +571,9 @@ include::itrace.txt[]
|
|||
sampled cycles
|
||||
'Avg Cycles' - block average sampled cycles
|
||||
|
||||
--skip-empty::
|
||||
Do not print 0 results in the --stat output.
|
||||
|
||||
include::callchain-overhead-calculation.txt[]
|
||||
|
||||
SEE ALSO
|
||||
|
|
|
@ -93,6 +93,19 @@ report::
|
|||
|
||||
1.102235068 seconds time elapsed
|
||||
|
||||
--bpf-counters::
|
||||
Use BPF programs to aggregate readings from perf_events. This
|
||||
allows multiple perf-stat sessions that are counting the same metric (cycles,
|
||||
instructions, etc.) to share hardware counters.
|
||||
To use BPF programs on common events by default, use
|
||||
"perf config stat.bpf-counter-events=<list_of_events>".
|
||||
|
||||
--bpf-attr-map::
|
||||
With option "--bpf-counters", different perf-stat sessions share
|
||||
information about shared BPF programs and maps via a pinned hashmap.
|
||||
Use "--bpf-attr-map" to specify the path of this pinned hashmap.
|
||||
The default path is /sys/fs/bpf/perf_attr_map.
|
||||
|
||||
ifdef::HAVE_LIBPFM[]
|
||||
--pfm-events events::
|
||||
Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net)
|
||||
|
@ -142,7 +155,10 @@ Do not aggregate counts across all monitored CPUs.
|
|||
|
||||
-n::
|
||||
--null::
|
||||
null run - don't start any counters
|
||||
null run - Don't start any counters.
|
||||
|
||||
This can be useful to measure just elapsed wall-clock time - or to assess the
|
||||
raw overhead of perf stat itself, without running any counters.
|
||||
|
||||
-v::
|
||||
--verbose::
|
||||
|
@ -468,6 +484,15 @@ convenient for post processing.
|
|||
--summary::
|
||||
Print summary for interval mode (-I).
|
||||
|
||||
--no-csv-summary::
|
||||
Don't print 'summary' in the first column for CSV summary output.
|
||||
This option must be used with -x and --summary.
|
||||
|
||||
This option can be enabled in perf config by setting the variable
|
||||
'stat.no-csv-summary'.
|
||||
|
||||
$ perf config stat.no-csv-summary=true
|
||||
|
||||
EXAMPLES
|
||||
--------
|
||||
|
||||
|
@ -527,6 +552,8 @@ The fields are in this order:
|
|||
|
||||
Additional metrics may be printed with all earlier fields being empty.
|
||||
|
||||
include::intel-hybrid.txt[]
|
||||
|
||||
SEE ALSO
|
||||
--------
|
||||
linkperf:perf-top[1], linkperf:perf-list[1]
|
||||
|
|
|
@ -317,7 +317,7 @@ Default is to monitor all CPUS.
|
|||
but probably we'll make the default not to show the switch-on/off events
|
||||
on the --group mode and if there is only one event besides the off/on ones,
|
||||
go straight to the histogram browser, just like 'perf top' with no events
|
||||
explicitely specified does.
|
||||
explicitly specified does.
|
||||
|
||||
--stitch-lbr::
|
||||
Show callgraph with stitched LBRs, which may have more complete
|
||||
|
|
|
@ -76,3 +76,15 @@ SEE ALSO
|
|||
linkperf:perf-stat[1], linkperf:perf-top[1],
|
||||
linkperf:perf-record[1], linkperf:perf-report[1],
|
||||
linkperf:perf-list[1]
|
||||
|
||||
linkperf:perf-annotate[1],linkperf:perf-archive[1],
|
||||
linkperf:perf-bench[1], linkperf:perf-buildid-cache[1],
|
||||
linkperf:perf-buildid-list[1], linkperf:perf-c2c[1],
|
||||
linkperf:perf-config[1], linkperf:perf-data[1], linkperf:perf-diff[1],
|
||||
linkperf:perf-evlist[1], linkperf:perf-ftrace[1],
|
||||
linkperf:perf-help[1], linkperf:perf-inject[1],
|
||||
linkperf:perf-intel-pt[1], linkperf:perf-kallsyms[1],
|
||||
linkperf:perf-kmem[1], linkperf:perf-kvm[1], linkperf:perf-lock[1],
|
||||
linkperf:perf-mem[1], linkperf:perf-probe[1], linkperf:perf-sched[1],
|
||||
linkperf:perf-script[1], linkperf:perf-test[1],
|
||||
linkperf:perf-trace[1], linkperf:perf-version[1]
|
||||
|
|
|
@ -72,6 +72,7 @@ For example, the perf_event_attr structure can be initialized with
|
|||
The Fixed counter 3 must be the leader of the group.
|
||||
|
||||
#include <linux/perf_event.h>
|
||||
#include <sys/mman.h>
|
||||
#include <sys/syscall.h>
|
||||
#include <unistd.h>
|
||||
|
||||
|
@ -95,6 +96,11 @@ int slots_fd = perf_event_open(&slots, 0, -1, -1, 0);
|
|||
if (slots_fd < 0)
|
||||
... error ...
|
||||
|
||||
/* Memory mapping the fd permits _rdpmc calls from userspace */
|
||||
void *slots_p = mmap(0, getpagesize(), PROT_READ, MAP_SHARED, slots_fd, 0);
|
||||
if (slots_p == MAP_FAILED)
|
||||
... error ...
|
||||
|
||||
/*
|
||||
* Open metrics event file descriptor for current task.
|
||||
* Set slots event as the leader of the group.
|
||||
|
@ -110,6 +116,14 @@ int metrics_fd = perf_event_open(&metrics, 0, -1, slots_fd, 0);
|
|||
if (metrics_fd < 0)
|
||||
... error ...
|
||||
|
||||
/* Memory mapping the fd permits _rdpmc calls from userspace */
|
||||
void *metrics_p = mmap(0, getpagesize(), PROT_READ, MAP_SHARED, metrics_fd, 0);
|
||||
if (metrics_p == MAP_FAILED)
|
||||
... error ...
|
||||
|
||||
Note: the file descriptors returned by the perf_event_open calls must be memory
|
||||
mapped to permit calls to the _rdpmc instruction. Permission may also be granted
|
||||
by writing the /sys/devices/cpu/rdpmc sysfs node.
|
||||
|
||||
The RDPMC instruction (or _rdpmc compiler intrinsic) can now be used
|
||||
to read slots and the topdown metrics at different points of the program:
|
||||
|
@ -141,6 +155,10 @@ as the parallelism and overlap in the CPU program execution will
|
|||
cause too much measurement inaccuracy. For example instrumenting
|
||||
individual basic blocks is definitely too fine grained.
|
||||
|
||||
_rdpmc calls should not be mixed with reading the metrics and slots counters
|
||||
through system calls, as the kernel will reset these counters after each system
|
||||
call.
|
||||
|
||||
Decoding metrics values
|
||||
=======================
|
||||
|
||||
|
|
|
@ -100,7 +100,10 @@ clean:
|
|||
# make -C tools/perf -f tests/make
|
||||
#
|
||||
build-test:
|
||||
@$(MAKE) SHUF=1 -f tests/make REUSE_FEATURES_DUMP=1 MK=Makefile SET_PARALLEL=1 --no-print-directory tarpkg out
|
||||
@$(MAKE) SHUF=1 -f tests/make REUSE_FEATURES_DUMP=1 MK=Makefile SET_PARALLEL=1 --no-print-directory tarpkg make_static make_with_gtk2 out
|
||||
|
||||
build-test-tarball:
|
||||
@$(MAKE) -f tests/make REUSE_FEATURES_DUMP=1 MK=Makefile SET_PARALLEL=1 --no-print-directory out
|
||||
|
||||
#
|
||||
# All other targets get passed through:
|
||||
|
|
|
@ -32,7 +32,7 @@ ifneq ($(NO_SYSCALL_TABLE),1)
|
|||
NO_SYSCALL_TABLE := 0
|
||||
endif
|
||||
else
|
||||
ifeq ($(SRCARCH),$(filter $(SRCARCH),powerpc arm64 s390))
|
||||
ifeq ($(SRCARCH),$(filter $(SRCARCH),powerpc arm64 s390 mips))
|
||||
NO_SYSCALL_TABLE := 0
|
||||
endif
|
||||
endif
|
||||
|
@ -87,6 +87,13 @@ ifeq ($(ARCH),s390)
|
|||
CFLAGS += -fPIC -I$(OUTPUT)arch/s390/include/generated
|
||||
endif
|
||||
|
||||
ifeq ($(ARCH),mips)
|
||||
NO_PERF_REGS := 0
|
||||
CFLAGS += -I$(OUTPUT)arch/mips/include/generated
|
||||
CFLAGS += -I../../arch/mips/include/uapi -I../../arch/mips/include/generated/uapi
|
||||
LIBUNWIND_LIBS = -lunwind -lunwind-mips
|
||||
endif
|
||||
|
||||
ifeq ($(NO_PERF_REGS),0)
|
||||
$(call detected,CONFIG_PERF_REGS)
|
||||
endif
|
||||
|
@ -292,6 +299,9 @@ ifneq ($(TCMALLOC),)
|
|||
endif
|
||||
|
||||
ifeq ($(FEATURES_DUMP),)
|
||||
# We will display at the end of this Makefile.config, using $(call feature_display_entries)
|
||||
# As we may retry some feature detection here, see the disassembler-four-args case, for instance
|
||||
FEATURE_DISPLAY_DEFERRED := 1
|
||||
include $(srctree)/tools/build/Makefile.feature
|
||||
else
|
||||
include $(FEATURES_DUMP)
|
||||
|
@ -1072,6 +1082,15 @@ ifdef LIBPFM4
|
|||
endif
|
||||
endif
|
||||
|
||||
ifdef LIBTRACEEVENT_DYNAMIC
|
||||
$(call feature_check,libtraceevent)
|
||||
ifeq ($(feature-libtraceevent), 1)
|
||||
EXTLIBS += -ltraceevent
|
||||
else
|
||||
dummy := $(error Error: No libtraceevent devel library found, please install libtraceevent-devel);
|
||||
endif
|
||||
endif
|
||||
|
||||
# Among the variables below, these:
|
||||
# perfexecdir
|
||||
# perf_include_dir
|
||||
|
@ -1208,3 +1227,13 @@ $(call detected_var,LIBDIR)
|
|||
$(call detected_var,GTK_CFLAGS)
|
||||
$(call detected_var,PERL_EMBED_CCOPTS)
|
||||
$(call detected_var,PYTHON_EMBED_CCOPTS)
|
||||
|
||||
# re-generate FEATURE-DUMP as we may have called feature_check, found out
|
||||
# extra libraries to add to LDFLAGS of some other test and then redo those
|
||||
# tests, see the block about libbfd, disassembler-four-args, for instance.
|
||||
$(shell rm -f $(FEATURE_DUMP_FILENAME))
|
||||
$(foreach feat,$(FEATURE_TESTS),$(shell echo "$(call feature_assign,$(feat))" >> $(FEATURE_DUMP_FILENAME)))
|
||||
|
||||
ifeq ($(feature_display),1)
|
||||
$(call feature_display_entries)
|
||||
endif
|
||||
|
|
|
@ -128,6 +128,8 @@ include ../scripts/utilities.mak
|
|||
#
|
||||
# Define BUILD_BPF_SKEL to enable BPF skeletons
|
||||
#
|
||||
# Define LIBTRACEEVENT_DYNAMIC to enable libtraceevent dynamic linking
|
||||
#
|
||||
|
||||
# As per kernel Makefile, avoid funny character set dependencies
|
||||
unexport LC_ALL
|
||||
|
@ -283,6 +285,7 @@ SCRIPT_SH =
|
|||
|
||||
SCRIPT_SH += perf-archive.sh
|
||||
SCRIPT_SH += perf-with-kcore.sh
|
||||
SCRIPT_SH += perf-iostat.sh
|
||||
|
||||
grep-libs = $(filter -l%,$(1))
|
||||
strip-libs = $(filter-out -l%,$(1))
|
||||
|
@ -309,7 +312,6 @@ endif
|
|||
|
||||
LIBTRACEEVENT = $(TE_PATH)libtraceevent.a
|
||||
export LIBTRACEEVENT
|
||||
|
||||
LIBTRACEEVENT_DYNAMIC_LIST = $(PLUGINS_PATH)libtraceevent-dynamic-list
|
||||
|
||||
#
|
||||
|
@ -374,12 +376,15 @@ endif
|
|||
|
||||
export PERL_PATH
|
||||
|
||||
PERFLIBS = $(LIBAPI) $(LIBTRACEEVENT) $(LIBSUBCMD) $(LIBPERF)
|
||||
PERFLIBS = $(LIBAPI) $(LIBSUBCMD) $(LIBPERF)
|
||||
ifndef NO_LIBBPF
|
||||
ifndef LIBBPF_DYNAMIC
|
||||
PERFLIBS += $(LIBBPF)
|
||||
endif
|
||||
endif
|
||||
ifndef LIBTRACEEVENT_DYNAMIC
|
||||
PERFLIBS += $(LIBTRACEEVENT)
|
||||
endif
|
||||
|
||||
# We choose to avoid "if .. else if .. else .. endif endif"
|
||||
# because maintaining the nesting to match is a pain. If
|
||||
|
@ -948,6 +953,8 @@ endif
|
|||
$(INSTALL) $(OUTPUT)perf-archive -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)'
|
||||
$(call QUIET_INSTALL, perf-with-kcore) \
|
||||
$(INSTALL) $(OUTPUT)perf-with-kcore -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)'
|
||||
$(call QUIET_INSTALL, perf-iostat) \
|
||||
$(INSTALL) $(OUTPUT)perf-iostat -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)'
|
||||
ifndef NO_LIBAUDIT
|
||||
$(call QUIET_INSTALL, strace/groups) \
|
||||
$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(STRACE_GROUPS_INSTDIR_SQ)'; \
|
||||
|
@ -1007,6 +1014,7 @@ python-clean:
|
|||
SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
|
||||
SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
|
||||
SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h
|
||||
SKELETONS += $(SKEL_OUT)/bperf_leader.skel.h $(SKEL_OUT)/bperf_follower.skel.h
|
||||
|
||||
ifdef BUILD_BPF_SKEL
|
||||
BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool
|
||||
|
@ -1021,7 +1029,7 @@ $(BPFTOOL): | $(SKEL_TMP_OUT)
|
|||
OUTPUT=$(SKEL_TMP_OUT)/ bootstrap
|
||||
|
||||
$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) | $(SKEL_TMP_OUT)
|
||||
$(QUIET_CLANG)$(CLANG) -g -O2 -target bpf $(BPF_INCLUDE) \
|
||||
$(QUIET_CLANG)$(CLANG) -g -O2 -target bpf -Wall -Werror $(BPF_INCLUDE) \
|
||||
-c $(filter util/bpf_skel/%.bpf.c,$^) -o $@ && $(LLVM_STRIP) -g $@
|
||||
|
||||
$(SKEL_OUT)/%.skel.h: $(SKEL_TMP_OUT)/%.bpf.o | $(BPFTOOL)
|
||||
|
@ -1041,7 +1049,7 @@ bpf-skel-clean:
|
|||
$(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
|
||||
|
||||
clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean
|
||||
$(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS)
|
||||
$(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(OUTPUT)perf-iostat $(LANG_BINDINGS)
|
||||
$(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
|
||||
$(Q)$(RM) $(OUTPUT).config-detected
|
||||
$(call QUIET_CLEAN, core-progs) $(RM) $(ALL_PROGRAMS) perf perf-read-vdso32 perf-read-vdsox32 $(OUTPUT)pmu-events/jevents $(OUTPUT)$(LIBJVMTI).so
|
||||
|
|
|
@ -67,6 +67,7 @@ static int cs_etm_set_context_id(struct auxtrace_record *itr,
|
|||
char path[PATH_MAX];
|
||||
int err = -EINVAL;
|
||||
u32 val;
|
||||
u64 contextid;
|
||||
|
||||
ptr = container_of(itr, struct cs_etm_recording, itr);
|
||||
cs_etm_pmu = ptr->cs_etm_pmu;
|
||||
|
@ -86,25 +87,59 @@ static int cs_etm_set_context_id(struct auxtrace_record *itr,
|
|||
goto out;
|
||||
}
|
||||
|
||||
/* User has configured for PID tracing, respects it. */
|
||||
contextid = evsel->core.attr.config &
|
||||
(BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_CTXTID2));
|
||||
|
||||
/*
|
||||
* TRCIDR2.CIDSIZE, bit [9-5], indicates whether contextID tracing
|
||||
* is supported:
|
||||
* 0b00000 Context ID tracing is not supported.
|
||||
* 0b00100 Maximum of 32-bit Context ID size.
|
||||
* All other values are reserved.
|
||||
* If user doesn't configure the contextid format, parse PMU format and
|
||||
* enable PID tracing according to the "contextid" format bits:
|
||||
*
|
||||
* If bit ETM_OPT_CTXTID is set, trace CONTEXTIDR_EL1;
|
||||
* If bit ETM_OPT_CTXTID2 is set, trace CONTEXTIDR_EL2.
|
||||
*/
|
||||
val = BMVAL(val, 5, 9);
|
||||
if (!val || val != 0x4) {
|
||||
err = -EINVAL;
|
||||
goto out;
|
||||
if (!contextid)
|
||||
contextid = perf_pmu__format_bits(&cs_etm_pmu->format,
|
||||
"contextid");
|
||||
|
||||
if (contextid & BIT(ETM_OPT_CTXTID)) {
|
||||
/*
|
||||
* TRCIDR2.CIDSIZE, bit [9-5], indicates whether contextID
|
||||
* tracing is supported:
|
||||
* 0b00000 Context ID tracing is not supported.
|
||||
* 0b00100 Maximum of 32-bit Context ID size.
|
||||
* All other values are reserved.
|
||||
*/
|
||||
val = BMVAL(val, 5, 9);
|
||||
if (!val || val != 0x4) {
|
||||
pr_err("%s: CONTEXTIDR_EL1 isn't supported\n",
|
||||
CORESIGHT_ETM_PMU_NAME);
|
||||
err = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
}
|
||||
|
||||
if (contextid & BIT(ETM_OPT_CTXTID2)) {
|
||||
/*
|
||||
* TRCIDR2.VMIDOPT[30:29] != 0 and
|
||||
* TRCIDR2.VMIDSIZE[14:10] == 0b00100 (32bit virtual contextid)
|
||||
* We can't support CONTEXTIDR in VMID if the size of the
|
||||
* virtual context id is < 32bit.
|
||||
* Any value of VMIDSIZE >= 4 (i.e., >= 32bit) is fine for us.
|
||||
*/
|
||||
if (!BMVAL(val, 29, 30) || BMVAL(val, 10, 14) < 4) {
|
||||
pr_err("%s: CONTEXTIDR_EL2 isn't supported\n",
|
||||
CORESIGHT_ETM_PMU_NAME);
|
||||
err = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
}
|
||||
|
||||
/* All good, let the kernel know */
|
||||
evsel->core.attr.config |= (1 << ETM_OPT_CTXTID);
|
||||
evsel->core.attr.config |= contextid;
|
||||
err = 0;
|
||||
|
||||
out:
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
|
@ -173,17 +208,17 @@ static int cs_etm_set_option(struct auxtrace_record *itr,
|
|||
!cpu_map__has(online_cpus, i))
|
||||
continue;
|
||||
|
||||
if (option & ETM_SET_OPT_CTXTID) {
|
||||
if (option & BIT(ETM_OPT_CTXTID)) {
|
||||
err = cs_etm_set_context_id(itr, evsel, i);
|
||||
if (err)
|
||||
goto out;
|
||||
}
|
||||
if (option & ETM_SET_OPT_TS) {
|
||||
if (option & BIT(ETM_OPT_TS)) {
|
||||
err = cs_etm_set_timestamp(itr, evsel, i);
|
||||
if (err)
|
||||
goto out;
|
||||
}
|
||||
if (option & ~(ETM_SET_OPT_MASK))
|
||||
if (option & ~(BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_TS)))
|
||||
/* Nothing else is currently supported */
|
||||
goto out;
|
||||
}
|
||||
|
@ -343,7 +378,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
|
|||
opts->auxtrace_mmap_pages = roundup_pow_of_two(sz);
|
||||
}
|
||||
|
||||
/* Snapshost size can't be bigger than the auxtrace area */
|
||||
/* Snapshot size can't be bigger than the auxtrace area */
|
||||
if (opts->auxtrace_snapshot_size >
|
||||
opts->auxtrace_mmap_pages * (size_t)page_size) {
|
||||
pr_err("Snapshot size %zu must not be greater than AUX area tracing mmap size %zu\n",
|
||||
|
@ -410,7 +445,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
|
|||
evsel__set_sample_bit(cs_etm_evsel, CPU);
|
||||
|
||||
err = cs_etm_set_option(itr, cs_etm_evsel,
|
||||
ETM_SET_OPT_CTXTID | ETM_SET_OPT_TS);
|
||||
BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_TS));
|
||||
if (err)
|
||||
goto out;
|
||||
}
|
||||
|
@ -489,7 +524,9 @@ static u64 cs_etmv4_get_config(struct auxtrace_record *itr)
|
|||
config |= BIT(ETM4_CFG_BIT_TS);
|
||||
if (config_opts & BIT(ETM_OPT_RETSTK))
|
||||
config |= BIT(ETM4_CFG_BIT_RETSTK);
|
||||
|
||||
if (config_opts & BIT(ETM_OPT_CTXTID2))
|
||||
config |= BIT(ETM4_CFG_BIT_VMID) |
|
||||
BIT(ETM4_CFG_BIT_VMID_OPT);
|
||||
return config;
|
||||
}
|
||||
|
||||
|
@ -576,7 +613,7 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
|
|||
struct auxtrace_record *itr,
|
||||
struct perf_record_auxtrace_info *info)
|
||||
{
|
||||
u32 increment;
|
||||
u32 increment, nr_trc_params;
|
||||
u64 magic;
|
||||
struct cs_etm_recording *ptr =
|
||||
container_of(itr, struct cs_etm_recording, itr);
|
||||
|
@ -611,6 +648,7 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
|
|||
|
||||
/* How much space was used */
|
||||
increment = CS_ETMV4_PRIV_MAX;
|
||||
nr_trc_params = CS_ETMV4_PRIV_MAX - CS_ETMV4_TRCCONFIGR;
|
||||
} else {
|
||||
magic = __perf_cs_etmv3_magic;
|
||||
/* Get configuration register */
|
||||
|
@ -628,11 +666,13 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
|
|||
|
||||
/* How much space was used */
|
||||
increment = CS_ETM_PRIV_MAX;
|
||||
nr_trc_params = CS_ETM_PRIV_MAX - CS_ETM_ETMCR;
|
||||
}
|
||||
|
||||
/* Build generic header portion */
|
||||
info->priv[*offset + CS_ETM_MAGIC] = magic;
|
||||
info->priv[*offset + CS_ETM_CPU] = cpu;
|
||||
info->priv[*offset + CS_ETM_NR_TRC_PARAMS] = nr_trc_params;
|
||||
/* Where the next CPU entry should start from */
|
||||
*offset += increment;
|
||||
}
|
||||
|
@ -678,7 +718,7 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
|
|||
|
||||
/* First fill out the session header */
|
||||
info->type = PERF_AUXTRACE_CS_ETM;
|
||||
info->priv[CS_HEADER_VERSION_0] = 0;
|
||||
info->priv[CS_HEADER_VERSION] = CS_HEADER_CURRENT_VERSION;
|
||||
info->priv[CS_PMU_TYPE_CPUS] = type << 32;
|
||||
info->priv[CS_PMU_TYPE_CPUS] |= nr_cpu;
|
||||
info->priv[CS_ETM_SNAPSHOT] = ptr->snapshot_mode;
|
||||
|
|
|
@ -2,6 +2,7 @@ perf-y += header.o
|
|||
perf-y += machine.o
|
||||
perf-y += perf_regs.o
|
||||
perf-y += tsc.o
|
||||
perf-y += pmu.o
|
||||
perf-y += kvm-stat.o
|
||||
perf-$(CONFIG_DWARF) += dwarf-regs.o
|
||||
perf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
|
||||
|
|
|
@ -1,8 +1,8 @@
|
|||
// SPDX-License-Identifier: GPL-2.0
|
||||
#include <errno.h>
|
||||
#include <memory.h>
|
||||
#include "../../util/evsel.h"
|
||||
#include "../../util/kvm-stat.h"
|
||||
#include "../../../util/evsel.h"
|
||||
#include "../../../util/kvm-stat.h"
|
||||
#include "arm64_exception_types.h"
|
||||
#include "debug.h"
|
||||
|
||||
|
|
|
@ -6,11 +6,11 @@
|
|||
#include "debug.h"
|
||||
#include "symbol.h"
|
||||
|
||||
/* On arm64, kernel text segment start at high memory address,
|
||||
/* On arm64, kernel text segment starts at high memory address,
|
||||
* for example 0xffff 0000 8xxx xxxx. Modules start at a low memory
|
||||
* address, like 0xffff 0000 00ax xxxx. When only samll amount of
|
||||
* address, like 0xffff 0000 00ax xxxx. When only small amount of
|
||||
* memory is used by modules, gap between end of module's text segment
|
||||
* and start of kernel text segment may be reach 2G.
|
||||
* and start of kernel text segment may reach 2G.
|
||||
* Therefore do not fill this gap and do not assign it to the kernel dso map.
|
||||
*/
|
||||
|
||||
|
|
|
@ -108,7 +108,7 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
|
|||
/* [sp], [sp, NUM] or [sp,NUM] */
|
||||
new_len = 7; /* + ( % s p ) NULL */
|
||||
|
||||
/* If the arugment is [sp], need to fill offset '0' */
|
||||
/* If the argument is [sp], need to fill offset '0' */
|
||||
if (rm[2].rm_so == -1)
|
||||
new_len += 1;
|
||||
else
|
||||
|
|
|
@ -0,0 +1,25 @@
|
|||
// SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
#include "../../../util/cpumap.h"
|
||||
#include "../../../util/pmu.h"
|
||||
|
||||
struct pmu_events_map *pmu_events_map__find(void)
|
||||
{
|
||||
struct perf_pmu *pmu = NULL;
|
||||
|
||||
while ((pmu = perf_pmu__scan(pmu))) {
|
||||
if (!is_pmu_core(pmu->name))
|
||||
continue;
|
||||
|
||||
/*
|
||||
* The cpumap should cover all CPUs. Otherwise, some CPUs may
|
||||
* not support some events or have different event IDs.
|
||||
*/
|
||||
if (pmu->cpus->nr != cpu__max_cpu())
|
||||
return NULL;
|
||||
|
||||
return perf_pmu__find_map(pmu);
|
||||
}
|
||||
|
||||
return NULL;
|
||||
}
|
|
@ -4,9 +4,9 @@
|
|||
#ifndef REMOTE_UNWIND_LIBUNWIND
|
||||
#include <libunwind.h>
|
||||
#include "perf_regs.h"
|
||||
#include "../../util/unwind.h"
|
||||
#include "../../../util/unwind.h"
|
||||
#endif
|
||||
#include "../../util/debug.h"
|
||||
#include "../../../util/debug.h"
|
||||
|
||||
int LIBUNWIND__ARCH_REG_ID(int regnum)
|
||||
{
|
||||
|
|
|
@ -0,0 +1,22 @@
|
|||
# SPDX-License-Identifier: GPL-2.0
|
||||
ifndef NO_DWARF
|
||||
PERF_HAVE_DWARF_REGS := 1
|
||||
endif
|
||||
|
||||
# Syscall table generation for perf
|
||||
out := $(OUTPUT)arch/mips/include/generated/asm
|
||||
header := $(out)/syscalls_n64.c
|
||||
sysprf := $(srctree)/tools/perf/arch/mips/entry/syscalls
|
||||
sysdef := $(sysprf)/syscall_n64.tbl
|
||||
systbl := $(sysprf)/mksyscalltbl
|
||||
|
||||
# Create output directory if not already present
|
||||
_dummy := $(shell [ -d '$(out)' ] || mkdir -p '$(out)')
|
||||
|
||||
$(header): $(sysdef) $(systbl)
|
||||
$(Q)$(SHELL) '$(systbl)' $(sysdef) > $@
|
||||
|
||||
clean::
|
||||
$(call QUIET_CLEAN, mips) $(RM) $(header)
|
||||
|
||||
archheaders: $(header)
|
|
@ -0,0 +1,32 @@
|
|||
#!/bin/sh
|
||||
# SPDX-License-Identifier: GPL-2.0
|
||||
#
|
||||
# Generate system call table for perf. Derived from
|
||||
# s390 script.
|
||||
#
|
||||
# Author(s): Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
|
||||
# Changed by: Tiezhu Yang <yangtiezhu@loongson.cn>
|
||||
|
||||
SYSCALL_TBL=$1
|
||||
|
||||
if ! test -r $SYSCALL_TBL; then
|
||||
echo "Could not read input file" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
create_table()
|
||||
{
|
||||
local max_nr nr abi sc discard
|
||||
|
||||
echo 'static const char *syscalltbl_mips_n64[] = {'
|
||||
while read nr abi sc discard; do
|
||||
printf '\t[%d] = "%s",\n' $nr $sc
|
||||
max_nr=$nr
|
||||
done
|
||||
echo '};'
|
||||
echo "#define SYSCALLTBL_MIPS_N64_MAX_ID $max_nr"
|
||||
}
|
||||
|
||||
grep -E "^[[:digit:]]+[[:space:]]+(n64)" $SYSCALL_TBL \
|
||||
|sort -k1 -n \
|
||||
|create_table
|
|
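As an illustration of what mksyscalltbl emits, the sketch below hand-expands the script's printf format for the first three rows of syscall_n64.tbl. The array name and the `[nr] = "name"` initializer shape come from the script itself; the truncated `EXAMPLE_MAX_ID` value and the `example_syscall_name()` lookup helper are hypothetical, added only to show how such a table might be consumed.

```c
#include <stddef.h>

/* Shape of the generated table, truncated to the first three n64 rows. */
static const char *syscalltbl_mips_n64[] = {
	[0] = "read",
	[1] = "write",
	[2] = "open",
};

/*
 * The script defines SYSCALLTBL_MIPS_N64_MAX_ID from the last row it reads.
 * This hypothetical macro mirrors that for the truncated excerpt above.
 */
#define EXAMPLE_MAX_ID 2

/* Hypothetical helper: map a syscall number to its name, or NULL. */
static const char *example_syscall_name(int nr)
{
	if (nr < 0 || nr > EXAMPLE_MAX_ID)
		return NULL;
	return syscalltbl_mips_n64[nr];
}
```

Because the table uses designated initializers, numbers that the .tbl file skips (such as 238) would simply be NULL entries in the full generated array.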
@ -0,0 +1,358 @@
|
|||
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
|
||||
#
|
||||
# system call numbers and entry vectors for mips
|
||||
#
|
||||
# The format is:
|
||||
# <number> <abi> <name> <entry point>
|
||||
#
|
||||
# The <abi> is always "n64" for this file.
|
||||
#
|
||||
0 n64 read sys_read
|
||||
1 n64 write sys_write
|
||||
2 n64 open sys_open
|
||||
3 n64 close sys_close
|
||||
4 n64 stat sys_newstat
|
||||
5 n64 fstat sys_newfstat
|
||||
6 n64 lstat sys_newlstat
|
||||
7 n64 poll sys_poll
|
||||
8 n64 lseek sys_lseek
|
||||
9 n64 mmap sys_mips_mmap
|
||||
10 n64 mprotect sys_mprotect
|
||||
11 n64 munmap sys_munmap
|
||||
12 n64 brk sys_brk
|
||||
13 n64 rt_sigaction sys_rt_sigaction
|
||||
14 n64 rt_sigprocmask sys_rt_sigprocmask
|
||||
15 n64 ioctl sys_ioctl
|
||||
16 n64 pread64 sys_pread64
|
||||
17 n64 pwrite64 sys_pwrite64
|
||||
18 n64 readv sys_readv
|
||||
19 n64 writev sys_writev
|
||||
20 n64 access sys_access
|
||||
21 n64 pipe sysm_pipe
|
||||
22 n64 _newselect sys_select
|
||||
23 n64 sched_yield sys_sched_yield
|
||||
24 n64 mremap sys_mremap
|
||||
25 n64 msync sys_msync
|
||||
26 n64 mincore sys_mincore
|
||||
27 n64 madvise sys_madvise
|
||||
28 n64 shmget sys_shmget
|
||||
29 n64 shmat sys_shmat
|
||||
30 n64 shmctl sys_old_shmctl
|
||||
31 n64 dup sys_dup
|
||||
32 n64 dup2 sys_dup2
|
||||
33 n64 pause sys_pause
|
||||
34 n64 nanosleep sys_nanosleep
|
||||
35 n64 getitimer sys_getitimer
|
||||
36 n64 setitimer sys_setitimer
|
||||
37 n64 alarm sys_alarm
|
||||
38 n64 getpid sys_getpid
|
||||
39 n64 sendfile sys_sendfile64
|
||||
40 n64 socket sys_socket
|
||||
41 n64 connect sys_connect
|
||||
42 n64 accept sys_accept
|
||||
43 n64 sendto sys_sendto
|
||||
44 n64 recvfrom sys_recvfrom
|
||||
45 n64 sendmsg sys_sendmsg
|
||||
46 n64 recvmsg sys_recvmsg
|
||||
47 n64 shutdown sys_shutdown
|
||||
48 n64 bind sys_bind
|
||||
49 n64 listen sys_listen
|
||||
50 n64 getsockname sys_getsockname
|
||||
51 n64 getpeername sys_getpeername
|
||||
52 n64 socketpair sys_socketpair
|
||||
53 n64 setsockopt sys_setsockopt
|
||||
54 n64 getsockopt sys_getsockopt
|
||||
55 n64 clone __sys_clone
|
||||
56 n64 fork __sys_fork
|
||||
57 n64 execve sys_execve
|
||||
58 n64 exit sys_exit
|
||||
59 n64 wait4 sys_wait4
|
||||
60 n64 kill sys_kill
|
||||
61 n64 uname sys_newuname
|
||||
62 n64 semget sys_semget
|
||||
63 n64 semop sys_semop
|
||||
64 n64 semctl sys_old_semctl
|
||||
65 n64 shmdt sys_shmdt
|
||||
66 n64 msgget sys_msgget
|
||||
67 n64 msgsnd sys_msgsnd
|
||||
68 n64 msgrcv sys_msgrcv
|
||||
69 n64 msgctl sys_old_msgctl
|
||||
70 n64 fcntl sys_fcntl
|
||||
71 n64 flock sys_flock
|
||||
72 n64 fsync sys_fsync
|
||||
73 n64 fdatasync sys_fdatasync
|
||||
74 n64 truncate sys_truncate
|
||||
75 n64 ftruncate sys_ftruncate
|
||||
76 n64 getdents sys_getdents
|
||||
77 n64 getcwd sys_getcwd
|
||||
78 n64 chdir sys_chdir
|
||||
79 n64 fchdir sys_fchdir
|
||||
80 n64 rename sys_rename
|
||||
81 n64 mkdir sys_mkdir
|
||||
82 n64 rmdir sys_rmdir
|
||||
83 n64 creat sys_creat
|
||||
84 n64 link sys_link
|
||||
85 n64 unlink sys_unlink
|
||||
86 n64 symlink sys_symlink
|
||||
87 n64 readlink sys_readlink
|
||||
88 n64 chmod sys_chmod
|
||||
89 n64 fchmod sys_fchmod
|
||||
90 n64 chown sys_chown
|
||||
91 n64 fchown sys_fchown
|
||||
92 n64 lchown sys_lchown
|
||||
93 n64 umask sys_umask
|
||||
94 n64 gettimeofday sys_gettimeofday
|
||||
95 n64 getrlimit sys_getrlimit
|
||||
96 n64 getrusage sys_getrusage
|
||||
97 n64 sysinfo sys_sysinfo
|
||||
98 n64 times sys_times
|
||||
99 n64 ptrace sys_ptrace
|
||||
100 n64 getuid sys_getuid
|
||||
101 n64 syslog sys_syslog
|
||||
102 n64 getgid sys_getgid
|
||||
103 n64 setuid sys_setuid
|
||||
104 n64 setgid sys_setgid
|
||||
105 n64 geteuid sys_geteuid
|
||||
106 n64 getegid sys_getegid
|
||||
107 n64 setpgid sys_setpgid
|
||||
108 n64 getppid sys_getppid
|
||||
109 n64 getpgrp sys_getpgrp
|
||||
110 n64 setsid sys_setsid
|
||||
111 n64 setreuid sys_setreuid
|
||||
112 n64 setregid sys_setregid
|
||||
113 n64 getgroups sys_getgroups
|
||||
114 n64 setgroups sys_setgroups
|
||||
115 n64 setresuid sys_setresuid
|
||||
116 n64 getresuid sys_getresuid
|
||||
117 n64 setresgid sys_setresgid
|
||||
118 n64 getresgid sys_getresgid
|
||||
119 n64 getpgid sys_getpgid
|
||||
120 n64 setfsuid sys_setfsuid
|
||||
121 n64 setfsgid sys_setfsgid
|
||||
122 n64 getsid sys_getsid
|
||||
123 n64 capget sys_capget
|
||||
124 n64 capset sys_capset
|
||||
125 n64 rt_sigpending sys_rt_sigpending
|
||||
126 n64 rt_sigtimedwait sys_rt_sigtimedwait
|
||||
127 n64 rt_sigqueueinfo sys_rt_sigqueueinfo
|
||||
128 n64 rt_sigsuspend sys_rt_sigsuspend
|
||||
129 n64 sigaltstack sys_sigaltstack
|
||||
130 n64 utime sys_utime
|
||||
131 n64 mknod sys_mknod
|
||||
132 n64 personality sys_personality
|
||||
133 n64 ustat sys_ustat
|
||||
134 n64 statfs sys_statfs
|
||||
135 n64 fstatfs sys_fstatfs
|
||||
136 n64 sysfs sys_sysfs
|
||||
137 n64 getpriority sys_getpriority
|
||||
138 n64 setpriority sys_setpriority
|
||||
139 n64 sched_setparam sys_sched_setparam
|
||||
140 n64 sched_getparam sys_sched_getparam
|
||||
141 n64 sched_setscheduler sys_sched_setscheduler
|
||||
142 n64 sched_getscheduler sys_sched_getscheduler
|
||||
143 n64 sched_get_priority_max sys_sched_get_priority_max
|
||||
144 n64 sched_get_priority_min sys_sched_get_priority_min
|
||||
145 n64 sched_rr_get_interval sys_sched_rr_get_interval
|
||||
146 n64 mlock sys_mlock
|
||||
147 n64 munlock sys_munlock
|
||||
148 n64 mlockall sys_mlockall
|
||||
149 n64 munlockall sys_munlockall
|
||||
150 n64 vhangup sys_vhangup
|
||||
151 n64 pivot_root sys_pivot_root
|
||||
152 n64 _sysctl sys_ni_syscall
|
||||
153 n64 prctl sys_prctl
|
||||
154 n64 adjtimex sys_adjtimex
|
||||
155 n64 setrlimit sys_setrlimit
|
||||
156 n64 chroot sys_chroot
|
||||
157 n64 sync sys_sync
|
||||
158 n64 acct sys_acct
|
||||
159 n64 settimeofday sys_settimeofday
|
||||
160 n64 mount sys_mount
|
||||
161 n64 umount2 sys_umount
|
||||
162 n64 swapon sys_swapon
|
||||
163 n64 swapoff sys_swapoff
|
||||
164 n64 reboot sys_reboot
|
||||
165 n64 sethostname sys_sethostname
|
||||
166 n64 setdomainname sys_setdomainname
|
||||
167 n64 create_module sys_ni_syscall
|
||||
168 n64 init_module sys_init_module
|
||||
169 n64 delete_module sys_delete_module
|
||||
170 n64 get_kernel_syms sys_ni_syscall
|
||||
171 n64 query_module sys_ni_syscall
|
||||
172 n64 quotactl sys_quotactl
|
||||
173 n64 nfsservctl sys_ni_syscall
|
||||
174 n64 getpmsg sys_ni_syscall
|
||||
175 n64 putpmsg sys_ni_syscall
|
||||
176 n64 afs_syscall sys_ni_syscall
|
||||
# 177 reserved for security
|
||||
177 n64 reserved177 sys_ni_syscall
|
||||
178 n64 gettid sys_gettid
|
||||
179 n64 readahead sys_readahead
|
||||
180 n64 setxattr sys_setxattr
|
||||
181 n64 lsetxattr sys_lsetxattr
|
||||
182 n64 fsetxattr sys_fsetxattr
|
||||
183 n64 getxattr sys_getxattr
|
||||
184 n64 lgetxattr sys_lgetxattr
|
||||
185 n64 fgetxattr sys_fgetxattr
|
||||
186 n64 listxattr sys_listxattr
|
||||
187 n64 llistxattr sys_llistxattr
|
||||
188 n64 flistxattr sys_flistxattr
|
||||
189 n64 removexattr sys_removexattr
|
||||
190 n64 lremovexattr sys_lremovexattr
|
||||
191 n64 fremovexattr sys_fremovexattr
|
||||
192 n64 tkill sys_tkill
|
||||
193 n64 reserved193 sys_ni_syscall
|
||||
194 n64 futex sys_futex
|
||||
195 n64 sched_setaffinity sys_sched_setaffinity
|
||||
196 n64 sched_getaffinity sys_sched_getaffinity
|
||||
197 n64 cacheflush sys_cacheflush
|
||||
198 n64 cachectl sys_cachectl
|
||||
199 n64 sysmips __sys_sysmips
|
||||
200 n64 io_setup sys_io_setup
|
||||
201 n64 io_destroy sys_io_destroy
|
||||
202 n64 io_getevents sys_io_getevents
|
||||
203 n64 io_submit sys_io_submit
|
||||
204 n64 io_cancel sys_io_cancel
|
||||
205 n64 exit_group sys_exit_group
|
||||
206 n64 lookup_dcookie sys_lookup_dcookie
|
||||
207 n64 epoll_create sys_epoll_create
|
||||
208 n64 epoll_ctl sys_epoll_ctl
|
||||
209 n64 epoll_wait sys_epoll_wait
|
||||
210 n64 remap_file_pages sys_remap_file_pages
|
||||
211 n64 rt_sigreturn sys_rt_sigreturn
|
||||
212 n64 set_tid_address sys_set_tid_address
|
||||
213 n64 restart_syscall sys_restart_syscall
|
||||
214 n64 semtimedop sys_semtimedop
|
||||
215 n64 fadvise64 sys_fadvise64_64
|
||||
216 n64 timer_create sys_timer_create
|
||||
217 n64 timer_settime sys_timer_settime
|
||||
218 n64 timer_gettime sys_timer_gettime
|
||||
219 n64 timer_getoverrun sys_timer_getoverrun
|
||||
220 n64 timer_delete sys_timer_delete
|
||||
221 n64 clock_settime sys_clock_settime
|
||||
222 n64 clock_gettime sys_clock_gettime
|
||||
223 n64 clock_getres sys_clock_getres
|
||||
224 n64 clock_nanosleep sys_clock_nanosleep
|
||||
225 n64 tgkill sys_tgkill
|
||||
226 n64 utimes sys_utimes
|
||||
227 n64 mbind sys_mbind
|
||||
228 n64 get_mempolicy sys_get_mempolicy
|
||||
229 n64 set_mempolicy sys_set_mempolicy
|
||||
230 n64 mq_open sys_mq_open
|
||||
231 n64 mq_unlink sys_mq_unlink
|
||||
232 n64 mq_timedsend sys_mq_timedsend
|
||||
233 n64 mq_timedreceive sys_mq_timedreceive
|
||||
234 n64 mq_notify sys_mq_notify
|
||||
235 n64 mq_getsetattr sys_mq_getsetattr
|
||||
236 n64 vserver sys_ni_syscall
|
||||
237 n64 waitid sys_waitid
|
||||
# 238 was sys_setaltroot
|
||||
239 n64 add_key sys_add_key
|
||||
240 n64 request_key sys_request_key
|
||||
241 n64 keyctl sys_keyctl
|
||||
242 n64 set_thread_area sys_set_thread_area
|
||||
243 n64 inotify_init sys_inotify_init
|
||||
244 n64 inotify_add_watch sys_inotify_add_watch
|
||||
245 n64 inotify_rm_watch sys_inotify_rm_watch
|
||||
246 n64 migrate_pages sys_migrate_pages
|
||||
247 n64 openat sys_openat
|
||||
248 n64 mkdirat sys_mkdirat
|
||||
249 n64 mknodat sys_mknodat
|
||||
250 n64 fchownat sys_fchownat
|
||||
251 n64 futimesat sys_futimesat
|
||||
252 n64 newfstatat sys_newfstatat
|
||||
253 n64 unlinkat sys_unlinkat
|
||||
254 n64 renameat sys_renameat
|
||||
255 n64 linkat sys_linkat
|
||||
256 n64 symlinkat sys_symlinkat
|
||||
257 n64 readlinkat sys_readlinkat
|
||||
258 n64 fchmodat sys_fchmodat
|
||||
259 n64 faccessat sys_faccessat
|
||||
260 n64 pselect6 sys_pselect6
|
||||
261 n64 ppoll sys_ppoll
|
||||
262 n64 unshare sys_unshare
|
||||
263 n64 splice sys_splice
|
||||
264 n64 sync_file_range sys_sync_file_range
|
||||
265 n64 tee sys_tee
|
||||
266 n64 vmsplice sys_vmsplice
|
||||
267 n64 move_pages sys_move_pages
|
||||
268 n64 set_robust_list sys_set_robust_list
|
||||
269 n64 get_robust_list sys_get_robust_list
|
||||
270 n64 kexec_load sys_kexec_load
|
||||
271 n64 getcpu sys_getcpu
|
||||
272 n64 epoll_pwait sys_epoll_pwait
|
||||
273 n64 ioprio_set sys_ioprio_set
|
||||
274 n64 ioprio_get sys_ioprio_get
|
||||
275 n64 utimensat sys_utimensat
|
||||
276 n64 signalfd sys_signalfd
|
||||
277 n64 timerfd sys_ni_syscall
|
||||
278 n64 eventfd sys_eventfd
|
||||
279 n64 fallocate sys_fallocate
|
||||
280 n64 timerfd_create sys_timerfd_create
|
||||
281 n64 timerfd_gettime sys_timerfd_gettime
|
||||
282 n64 timerfd_settime sys_timerfd_settime
|
||||
283 n64 signalfd4 sys_signalfd4
|
||||
284 n64 eventfd2 sys_eventfd2
|
||||
285 n64 epoll_create1 sys_epoll_create1
|
||||
286 n64 dup3 sys_dup3
|
||||
287 n64 pipe2 sys_pipe2
|
||||
288 n64 inotify_init1 sys_inotify_init1
|
||||
289 n64 preadv sys_preadv
|
||||
290 n64 pwritev sys_pwritev
|
||||
291 n64 rt_tgsigqueueinfo sys_rt_tgsigqueueinfo
|
||||
292 n64 perf_event_open sys_perf_event_open
|
||||
293 n64 accept4 sys_accept4
|
||||
294 n64 recvmmsg sys_recvmmsg
|
||||
295 n64 fanotify_init sys_fanotify_init
|
||||
296 n64 fanotify_mark sys_fanotify_mark
|
||||
297 n64 prlimit64 sys_prlimit64
|
||||
298 n64 name_to_handle_at sys_name_to_handle_at
|
||||
299 n64 open_by_handle_at sys_open_by_handle_at
|
||||
300 n64 clock_adjtime sys_clock_adjtime
|
||||
301 n64 syncfs sys_syncfs
|
||||
302 n64 sendmmsg sys_sendmmsg
|
||||
303 n64 setns sys_setns
|
||||
304 n64 process_vm_readv sys_process_vm_readv
|
||||
305 n64 process_vm_writev sys_process_vm_writev
|
||||
306 n64 kcmp sys_kcmp
|
||||
307 n64 finit_module sys_finit_module
|
||||
308 n64 getdents64 sys_getdents64
|
||||
309 n64 sched_setattr sys_sched_setattr
|
||||
310 n64 sched_getattr sys_sched_getattr
|
||||
311 n64 renameat2 sys_renameat2
|
||||
312 n64 seccomp sys_seccomp
|
||||
313 n64 getrandom sys_getrandom
|
||||
314 n64 memfd_create sys_memfd_create
|
||||
315 n64 bpf sys_bpf
|
||||
316 n64 execveat sys_execveat
|
||||
317 n64 userfaultfd sys_userfaultfd
|
||||
318 n64 membarrier sys_membarrier
|
||||
319 n64 mlock2 sys_mlock2
|
||||
320 n64 copy_file_range sys_copy_file_range
|
||||
321 n64 preadv2 sys_preadv2
|
||||
322 n64 pwritev2 sys_pwritev2
|
||||
323 n64 pkey_mprotect sys_pkey_mprotect
|
||||
324 n64 pkey_alloc sys_pkey_alloc
|
||||
325 n64 pkey_free sys_pkey_free
|
||||
326 n64 statx sys_statx
|
||||
327 n64 rseq sys_rseq
|
||||
328 n64 io_pgetevents sys_io_pgetevents
|
||||
# 329 through 423 are reserved to sync up with other architectures
|
||||
424 n64 pidfd_send_signal sys_pidfd_send_signal
|
||||
425 n64 io_uring_setup sys_io_uring_setup
|
||||
426 n64 io_uring_enter sys_io_uring_enter
|
||||
427 n64 io_uring_register sys_io_uring_register
|
||||
428 n64 open_tree sys_open_tree
|
||||
429 n64 move_mount sys_move_mount
|
||||
430 n64 fsopen sys_fsopen
|
||||
431 n64 fsconfig sys_fsconfig
|
||||
432 n64 fsmount sys_fsmount
|
||||
433 n64 fspick sys_fspick
|
||||
434 n64 pidfd_open sys_pidfd_open
|
||||
435 n64 clone3 __sys_clone3
|
||||
436 n64 close_range sys_close_range
|
||||
437 n64 openat2 sys_openat2
|
||||
438 n64 pidfd_getfd sys_pidfd_getfd
|
||||
439 n64 faccessat2 sys_faccessat2
|
||||
440 n64 process_madvise sys_process_madvise
|
||||
441 n64 epoll_pwait2 sys_epoll_pwait2
|
|
@ -0,0 +1,31 @@
|
|||
/* SPDX-License-Identifier: GPL-2.0 */
|
||||
/*
|
||||
* dwarf-regs-table.h : Mapping of DWARF debug register numbers into
|
||||
* register names.
|
||||
*
|
||||
* Copyright (C) 2013 Cavium, Inc.
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License as published by
|
||||
* the Free Software Foundation; either version 2 of the License, or
|
||||
* (at your option) any later version.
|
||||
*
|
||||
* This program is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
*/
|
||||
|
||||
#ifdef DEFINE_DWARF_REGSTR_TABLE
|
||||
#undef REG_DWARFNUM_NAME
|
||||
#define REG_DWARFNUM_NAME(reg, idx) [idx] = "$" #reg
|
||||
static const char * const mips_regstr_tbl[] = {
|
||||
"$0", "$1", "$2", "$3", "$4", "$5", "$6", "$7", "$8", "$9",
|
||||
"$10", "$11", "$12", "$13", "$14", "$15", "$16", "$17", "$18", "$19",
|
||||
"$20", "$21", "$22", "$23", "$24", "$25", "$26", "$27", "$28", "$29",
|
||||
"$30", "$31",
|
||||
REG_DWARFNUM_NAME(hi, 64),
|
||||
REG_DWARFNUM_NAME(lo, 65),
|
||||
};
|
||||
#endif
|
|
@ -0,0 +1,84 @@
|
|||
/* SPDX-License-Identifier: GPL-2.0 */
|
||||
#ifndef ARCH_PERF_REGS_H
|
||||
#define ARCH_PERF_REGS_H
|
||||
|
||||
#include <stdlib.h>
|
||||
#include <linux/types.h>
|
||||
#include <asm/perf_regs.h>
|
||||
|
||||
#define PERF_REGS_MAX PERF_REG_MIPS_MAX
|
||||
#define PERF_REG_IP PERF_REG_MIPS_PC
|
||||
#define PERF_REG_SP PERF_REG_MIPS_R29
|
||||
|
||||
#define PERF_REGS_MASK ((1ULL << PERF_REG_MIPS_MAX) - 1)
|
||||
|
||||
static inline const char *__perf_reg_name(int id)
|
||||
{
|
||||
switch (id) {
|
||||
case PERF_REG_MIPS_PC:
|
||||
return "PC";
|
||||
case PERF_REG_MIPS_R1:
|
||||
return "$1";
|
||||
case PERF_REG_MIPS_R2:
|
||||
return "$2";
|
||||
case PERF_REG_MIPS_R3:
|
||||
return "$3";
|
||||
case PERF_REG_MIPS_R4:
|
||||
return "$4";
|
||||
case PERF_REG_MIPS_R5:
|
||||
return "$5";
|
||||
case PERF_REG_MIPS_R6:
|
||||
return "$6";
|
||||
case PERF_REG_MIPS_R7:
|
||||
return "$7";
|
||||
case PERF_REG_MIPS_R8:
|
||||
return "$8";
|
||||
case PERF_REG_MIPS_R9:
|
||||
return "$9";
|
||||
case PERF_REG_MIPS_R10:
|
||||
return "$10";
|
||||
case PERF_REG_MIPS_R11:
|
||||
return "$11";
|
||||
case PERF_REG_MIPS_R12:
|
||||
return "$12";
|
||||
case PERF_REG_MIPS_R13:
|
||||
return "$13";
|
||||
case PERF_REG_MIPS_R14:
|
||||
return "$14";
|
||||
case PERF_REG_MIPS_R15:
|
||||
return "$15";
|
||||
case PERF_REG_MIPS_R16:
|
||||
return "$16";
|
||||
case PERF_REG_MIPS_R17:
|
||||
return "$17";
|
||||
case PERF_REG_MIPS_R18:
|
||||
return "$18";
|
||||
case PERF_REG_MIPS_R19:
|
||||
return "$19";
|
||||
case PERF_REG_MIPS_R20:
|
||||
return "$20";
|
||||
case PERF_REG_MIPS_R21:
|
||||
return "$21";
|
||||
case PERF_REG_MIPS_R22:
|
||||
return "$22";
|
||||
case PERF_REG_MIPS_R23:
|
||||
return "$23";
|
||||
case PERF_REG_MIPS_R24:
|
||||
return "$24";
|
||||
case PERF_REG_MIPS_R25:
|
||||
return "$25";
|
||||
case PERF_REG_MIPS_R28:
|
||||
return "$28";
|
||||
case PERF_REG_MIPS_R29:
|
||||
return "$29";
|
||||
case PERF_REG_MIPS_R30:
|
||||
return "$30";
|
||||
case PERF_REG_MIPS_R31:
|
||||
return "$31";
|
||||
default:
|
||||
break;
|
||||
}
|
||||
return NULL;
|
||||
}
|
||||
|
||||
#endif /* ARCH_PERF_REGS_H */
|
|
@ -0,0 +1,3 @@
|
|||
perf-y += perf_regs.o
|
||||
perf-$(CONFIG_DWARF) += dwarf-regs.o
|
||||
perf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
|
|
@ -0,0 +1,38 @@
|
|||
// SPDX-License-Identifier: GPL-2.0
|
||||
/*
|
||||
* dwarf-regs.c : Mapping of DWARF debug register numbers into register names.
|
||||
*
|
||||
* Copyright (C) 2013 Cavium, Inc.
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License as published by
|
||||
* the Free Software Foundation; either version 2 of the License, or
|
||||
* (at your option) any later version.
|
||||
*
|
||||
* This program is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
*/
|
||||
|
||||
#include <stdio.h>
|
||||
#include <dwarf-regs.h>
|
||||
|
||||
static const char *mips_gpr_names[32] = {
|
||||
"$0", "$1", "$2", "$3", "$4", "$5", "$6", "$7", "$8", "$9",
|
||||
"$10", "$11", "$12", "$13", "$14", "$15", "$16", "$17", "$18", "$19",
|
||||
"$20", "$21", "$22", "$23", "$24", "$25", "$26", "$27", "$28", "$29",
|
||||
"$30", "$31"
|
||||
};
|
||||
|
||||
const char *get_arch_regstr(unsigned int n)
|
||||
{
|
||||
if (n < 32)
|
||||
return mips_gpr_names[n];
|
||||
if (n == 64)
|
||||
return "hi";
|
||||
if (n == 65)
|
||||
return "lo";
|
||||
return NULL;
|
||||
}
|
|
@ -0,0 +1,6 @@
|
|||
// SPDX-License-Identifier: GPL-2.0
|
||||
#include "../../util/perf_regs.h"
|
||||
|
||||
const struct sample_reg sample_reg_masks[] = {
|
||||
SMPL_REG_END
|
||||
};
|
|
@ -0,0 +1,22 @@
|
|||
// SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
#include <errno.h>
|
||||
#include <libunwind.h>
|
||||
#include "perf_regs.h"
|
||||
#include "../../util/unwind.h"
|
||||
#include "util/debug.h"
|
||||
|
||||
int libunwind__arch_reg_id(int regnum)
|
||||
{
|
||||
switch (regnum) {
|
||||
case UNW_MIPS_R1 ... UNW_MIPS_R25:
|
||||
return regnum - UNW_MIPS_R1 + PERF_REG_MIPS_R1;
|
||||
case UNW_MIPS_R28 ... UNW_MIPS_R31:
|
||||
return regnum - UNW_MIPS_R28 + PERF_REG_MIPS_R28;
|
||||
case UNW_MIPS_PC:
|
||||
return PERF_REG_MIPS_PC;
|
||||
default:
|
||||
pr_err("unwind: invalid reg id %d\n", regnum);
|
||||
return -EINVAL;
|
||||
}
|
||||
}
|
|
@ -4,6 +4,8 @@ perf-y += kvm-stat.o
|
|||
perf-y += perf_regs.o
|
||||
perf-y += mem-events.o
|
||||
perf-y += sym-handling.o
|
||||
perf-y += evsel.o
|
||||
perf-y += event.o
|
||||
|
||||
perf-$(CONFIG_DWARF) += dwarf-regs.o
|
||||
perf-$(CONFIG_DWARF) += skip-callchain-idx.o
|
||||
|
|
|
@ -0,0 +1,53 @@
|
|||
// SPDX-License-Identifier: GPL-2.0
|
||||
#include <linux/types.h>
|
||||
#include <linux/string.h>
|
||||
#include <linux/zalloc.h>
|
||||
|
||||
#include "../../../util/event.h"
|
||||
#include "../../../util/synthetic-events.h"
|
||||
#include "../../../util/machine.h"
|
||||
#include "../../../util/tool.h"
|
||||
#include "../../../util/map.h"
|
||||
#include "../../../util/debug.h"
|
||||
|
||||
void arch_perf_parse_sample_weight(struct perf_sample *data,
|
||||
const __u64 *array, u64 type)
|
||||
{
|
||||
union perf_sample_weight weight;
|
||||
|
||||
weight.full = *array;
|
||||
if (type & PERF_SAMPLE_WEIGHT)
|
||||
data->weight = weight.full;
|
||||
else {
|
||||
data->weight = weight.var1_dw;
|
||||
data->ins_lat = weight.var2_w;
|
||||
data->p_stage_cyc = weight.var3_w;
|
||||
}
|
||||
}
|
||||
|
||||
void arch_perf_synthesize_sample_weight(const struct perf_sample *data,
|
||||
__u64 *array, u64 type)
|
||||
{
|
||||
*array = data->weight;
|
||||
|
||||
if (type & PERF_SAMPLE_WEIGHT_STRUCT) {
|
||||
*array &= 0xffffffff;
|
||||
*array |= ((u64)data->ins_lat << 32);
|
||||
}
|
||||
}
|
||||
|
||||
const char *arch_perf_header_entry(const char *se_header)
|
||||
{
|
||||
if (!strcmp(se_header, "Local INSTR Latency"))
|
||||
return "Finish Cyc";
|
||||
else if (!strcmp(se_header, "Pipeline Stage Cycle"))
|
||||
return "Dispatch Cyc";
|
||||
return se_header;
|
||||
}
|
||||
|
||||
int arch_support_sort_key(const char *sort_key)
|
||||
{
|
||||
if (!strcmp(sort_key, "p_stage_cyc"))
|
||||
return 1;
|
||||
return 0;
|
||||
}
|
|
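arch_perf_parse_sample_weight() and arch_perf_synthesize_sample_weight() in this file split and rebuild a 64-bit sample weight. A minimal sketch of that layout using explicit shifts follows. It assumes the little-endian field order of `union perf_sample_weight` (a 32-bit `var1_dw` in the low half, followed by the 16-bit `var2_w` and `var3_w`), and the `example_*` names are hypothetical, not part of perf.

```c
#include <stdint.h>

/* Hypothetical mirror of the fields read in arch_perf_parse_sample_weight(). */
struct example_weight {
	uint32_t weight;      /* var1_dw: bits 0..31  */
	uint16_t ins_lat;     /* var2_w:  bits 32..47 */
	uint16_t p_stage_cyc; /* var3_w:  bits 48..63 */
};

/* Split a raw 64-bit weight into its three assumed fields. */
static struct example_weight example_unpack(uint64_t full)
{
	struct example_weight w = {
		.weight      = (uint32_t)(full & 0xffffffffu),
		.ins_lat     = (uint16_t)((full >> 32) & 0xffff),
		.p_stage_cyc = (uint16_t)((full >> 48) & 0xffff),
	};
	return w;
}

/*
 * Same masking as arch_perf_synthesize_sample_weight(): keep the low
 * 32 bits of the weight and place ins_lat directly above them.
 */
static uint64_t example_pack(uint64_t weight, uint16_t ins_lat)
{
	return (weight & 0xffffffffu) | ((uint64_t)ins_lat << 32);
}
```

Note that, like the synthesize helper in the patch, the pack side of this sketch does not write the top 16 bits, so a round trip yields a zero `p_stage_cyc`.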
@ -0,0 +1,8 @@
|
|||
// SPDX-License-Identifier: GPL-2.0
|
||||
#include <stdio.h>
|
||||
#include "util/evsel.h"
|
||||
|
||||
void arch_evsel__set_sample_weight(struct evsel *evsel)
|
||||
{
|
||||
evsel__set_sample_bit(evsel, WEIGHT_STRUCT);
|
||||
}
|
|
@ -176,7 +176,7 @@ int cpu_isa_init(struct perf_kvm_stat *kvm, const char *cpuid __maybe_unused)
|
|||
}
|
||||
|
||||
/*
|
||||
* Incase of powerpc architecture, pmu registers are programmable
|
||||
* In case of powerpc architecture, pmu registers are programmable
|
||||
* by guest kernel. So monitoring guest via host may not provide
|
||||
* valid samples with default 'cycles' event. It is better to use
|
||||
* 'trace_imc/trace_cycles' event for guest profiling, since it
|
||||
|
|
|
@ -10,6 +10,6 @@
|
|||
|
||||
#define SPRN_PVR 0x11F /* Processor Version Register */
|
||||
#define PVR_VER(pvr) (((pvr) >> 16) & 0xFFFF) /* Version field */
|
||||
#define PVR_REV(pvr) (((pvr) >> 0) & 0xFFFF) /* Revison field */
|
||||
#define PVR_REV(pvr) (((pvr) >> 0) & 0xFFFF) /* Revision field */
|
||||
|
||||
#endif /* __PERF_UTIL_HEADER_H */
|
||||
|
|
|
@ -73,7 +73,7 @@ static int bp_modify1(void)
|
|||
/*
|
||||
* The parent does following steps:
|
||||
* - creates a new breakpoint (id 0) for bp_2 function
|
||||
* - changes that breakponit to bp_1 function
|
||||
* - changes that breakpoint to bp_1 function
|
||||
* - waits for the breakpoint to hit and checks
|
||||
* it has proper rip of bp_1 function
|
||||
* - detaches the child
|
||||
|
|
|
@ -9,6 +9,7 @@ perf-y += event.o
|
|||
perf-y += evlist.o
|
||||
perf-y += mem-events.o
|
||||
perf-y += evsel.o
|
||||
perf-y += iostat.o
|
||||
|
||||
perf-$(CONFIG_DWARF) += dwarf-regs.o
|
||||
perf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
|
||||
|
|
|
@ -0,0 +1,470 @@
|
|||
// SPDX-License-Identifier: GPL-2.0
|
||||
/*
|
||||
* perf iostat
|
||||
*
|
||||
* Copyright (C) 2020, Intel Corporation
|
||||
*
|
||||
* Authors: Alexander Antonov <alexander.antonov@linux.intel.com>
|
||||
*/
|
||||
|
||||
#include <api/fs/fs.h>
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/err.h>
|
||||
#include <limits.h>
|
||||
#include <stdio.h>
|
||||
#include <string.h>
|
||||
#include <errno.h>
|
||||
#include <sys/types.h>
|
||||
#include <sys/stat.h>
|
||||
#include <fcntl.h>
|
||||
#include <dirent.h>
|
||||
#include <unistd.h>
|
||||
#include <stdlib.h>
|
||||
#include <regex.h>
|
||||
#include "util/cpumap.h"
|
||||
#include "util/debug.h"
|
||||
#include "util/iostat.h"
|
||||
#include "util/counts.h"
|
||||
#include "path.h"
|
||||
|
||||
#ifndef MAX_PATH
|
||||
#define MAX_PATH 1024
|
||||
#endif
|
||||
|
||||
#define UNCORE_IIO_PMU_PATH "devices/uncore_iio_%d"
|
||||
#define SYSFS_UNCORE_PMU_PATH "%s/"UNCORE_IIO_PMU_PATH
|
||||
#define PLATFORM_MAPPING_PATH UNCORE_IIO_PMU_PATH"/die%d"
|
||||
|
||||
/*
|
||||
* Each metric requires one IIO event which increments at every 4B transfer
|
||||
* in corresponding direction. The formulas to compute metrics are generic:
|
||||
* #EventCount * 4B / (1024 * 1024)
|
||||
*/
|
||||
static const char * const iostat_metrics[] = {
|
||||
"Inbound Read(MB)",
|
||||
"Inbound Write(MB)",
|
||||
"Outbound Read(MB)",
|
||||
"Outbound Write(MB)",
|
||||
};
|
||||
|
||||
static inline int iostat_metrics_count(void)
|
||||
{
|
||||
return sizeof(iostat_metrics) / sizeof(char *);
|
||||
}
|
||||
|
||||
static const char *iostat_metric_by_idx(int idx)
|
||||
{
|
||||
return *(iostat_metrics + idx % iostat_metrics_count());
|
||||
}
|
||||
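A worked instance of the formula in the comment above, as a minimal sketch: each event increment stands for a 4-byte transfer, so the count is scaled by 4 and divided by 1024 * 1024. The name `example_iio_count_to_mb()` is hypothetical and not part of this patch.

```c
#include <stdint.h>

/* Bytes moved per IIO event increment, per the comment in iostat.c. */
#define EXAMPLE_BYTES_PER_EVENT 4

/* Hypothetical helper: #EventCount * 4B / (1024 * 1024). */
static double example_iio_count_to_mb(uint64_t event_count)
{
	return (double)event_count * EXAMPLE_BYTES_PER_EVENT / (1024.0 * 1024.0);
}
```

For instance, 262144 increments correspond to exactly 1 MB, since 262144 * 4 = 1048576 bytes.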
|
||||
struct iio_root_port {
|
||||
u32 domain;
|
||||
u8 bus;
|
||||
u8 die;
|
||||
u8 pmu_idx;
|
||||
int idx;
|
||||
};
|
||||
|
||||
struct iio_root_ports_list {
|
||||
struct iio_root_port **rps;
|
||||
int nr_entries;
|
||||
};
|
||||
|
||||
static struct iio_root_ports_list *root_ports;
|
||||
|
||||
static void iio_root_port_show(FILE *output,
|
||||
const struct iio_root_port * const rp)
|
||||
{
|
||||
if (output && rp)
|
||||
fprintf(output, "S%d-uncore_iio_%d<%04x:%02x>\n",
|
||||
rp->die, rp->pmu_idx, rp->domain, rp->bus);
|
||||
}
|
||||
|
||||
static struct iio_root_port *iio_root_port_new(u32 domain, u8 bus,
|
||||
u8 die, u8 pmu_idx)
|
||||
{
|
||||
struct iio_root_port *p = calloc(1, sizeof(*p));
|
||||
|
||||
if (p) {
|
||||
p->domain = domain;
|
||||
p->bus = bus;
|
||||
p->die = die;
|
||||
p->pmu_idx = pmu_idx;
|
||||
}
|
||||
return p;
|
||||
}
|
||||
|
||||
static void iio_root_ports_list_free(struct iio_root_ports_list *list)
|
||||
{
|
||||
int idx;
|
||||
|
||||
if (list) {
|
||||
for (idx = 0; idx < list->nr_entries; idx++)
|
||||
free(list->rps[idx]);
|
||||
free(list->rps);
|
||||
free(list);
|
||||
}
|
||||
}
|
||||
|
||||
static struct iio_root_port *iio_root_port_find_by_notation(
|
||||
const struct iio_root_ports_list * const list, u32 domain, u8 bus)
|
||||
{
|
||||
int idx;
|
||||
struct iio_root_port *rp;
|
||||
|
||||
if (list) {
|
||||
for (idx = 0; idx < list->nr_entries; idx++) {
|
||||
rp = list->rps[idx];
|
||||
if (rp && rp->domain == domain && rp->bus == bus)
|
||||
return rp;
|
||||
}
|
||||
}
|
||||
return NULL;
|
||||
}
|
||||
|
||||
static int iio_root_ports_list_insert(struct iio_root_ports_list *list,
|
||||
struct iio_root_port * const rp)
|
||||
{
|
||||
struct iio_root_port **tmp_buf;
|
||||
|
||||
if (list && rp) {
|
||||
rp->idx = list->nr_entries++;
|
||||
tmp_buf = realloc(list->rps,
|
||||
list->nr_entries * sizeof(*list->rps));
|
||||
if (!tmp_buf) {
|
||||
pr_err("Failed to realloc memory\n");
|
||||
return -ENOMEM;
|
||||
}
|
||||
tmp_buf[rp->idx] = rp;
|
||||
list->rps = tmp_buf;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int iio_mapping(u8 pmu_idx, struct iio_root_ports_list * const list)
|
||||
{
|
||||
char *buf;
|
||||
char path[MAX_PATH];
|
||||
u32 domain;
|
||||
u8 bus;
|
||||
struct iio_root_port *rp;
|
||||
size_t size;
|
||||
int ret;
|
||||
|
||||
for (int die = 0; die < cpu__max_node(); die++) {
|
||||
scnprintf(path, MAX_PATH, PLATFORM_MAPPING_PATH, pmu_idx, die);
|
||||
if (sysfs__read_str(path, &buf, &size) < 0) {
|
||||
if (pmu_idx)
|
||||
goto out;
|
||||
pr_err("Mode iostat is not supported\n");
|
||||
return -1;
|
||||
}
|
||||
ret = sscanf(buf, "%04x:%02hhx", &domain, &bus);
|
||||
free(buf);
|
||||
if (ret != 2) {
|
||||
pr_err("Invalid mapping data: iio_%d; die%d\n",
|
||||
pmu_idx, die);
|
||||
return -1;
|
||||
}
|
||||
rp = iio_root_port_new(domain, bus, die, pmu_idx);
|
||||
if (!rp || iio_root_ports_list_insert(list, rp)) {
|
||||
free(rp);
|
||||
return -ENOMEM;
|
||||
}
|
||||
}
|
||||
out:
|
||||
return 0;
|
||||
}
|
||||
|
||||
static u8 iio_pmu_count(void)
|
||||
{
|
||||
u8 pmu_idx = 0;
|
||||
char path[MAX_PATH];
|
||||
const char *sysfs = sysfs__mountpoint();
|
||||
|
||||
if (sysfs) {
|
||||
for (;; pmu_idx++) {
|
||||
snprintf(path, sizeof(path), SYSFS_UNCORE_PMU_PATH,
|
||||
sysfs, pmu_idx);
|
||||
if (access(path, F_OK) != 0)
|
||||
break;
|
||||
}
|
||||
}
|
||||
return pmu_idx;
|
||||
}
|
||||
|
||||
static int iio_root_ports_scan(struct iio_root_ports_list **list)
|
||||
{
|
||||
int ret = -ENOMEM;
|
||||
struct iio_root_ports_list *tmp_list;
|
||||
u8 pmu_count = iio_pmu_count();
|
||||
|
||||
if (!pmu_count) {
|
||||
pr_err("Unsupported uncore pmu configuration\n");
|
||||
return -1;
|
||||
}
|
||||
|
||||
tmp_list = calloc(1, sizeof(*tmp_list));
|
||||
if (!tmp_list)
|
||||
goto err;
|
||||
|
||||
for (u8 pmu_idx = 0; pmu_idx < pmu_count; pmu_idx++) {
|
||||
ret = iio_mapping(pmu_idx, tmp_list);
|
||||
if (ret)
|
||||
break;
|
||||
}
|
||||
err:
|
||||
if (!ret)
|
||||
*list = tmp_list;
|
||||
else
|
||||
iio_root_ports_list_free(tmp_list);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int iio_root_port_parse_str(u32 *domain, u8 *bus, char *str)
|
||||
{
|
||||
int ret;
|
||||
regex_t regex;
|
||||
/*
|
||||
* Expected format domain:bus:
|
||||
* Valid domain range [0:ffff]
|
||||
* Valid bus range [0:ff]
|
||||
* Example: 0000:af, 0:3d, 01:7
|
||||
*/
|
||||
regcomp(&regex, "^([a-f0-9A-F]{1,}):([a-f0-9A-F]{1,2})", REG_EXTENDED);
|
||||
ret = regexec(&regex, str, 0, NULL, 0);
|
||||
if (ret || sscanf(str, "%08x:%02hhx", domain, bus) != 2)
|
||||
pr_warning("Unrecognized root port format: %s\n"
|
||||
"Please use the following format:\n"
|
||||
"\t [domain]:[bus]\n"
|
||||
"\t for example: 0000:3d\n", str);
|
||||
|
||||
regfree(&regex);
|
||||
return ret;
|
||||
}
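The parser above validates the `domain:bus` notation with a regular expression and then extracts the fields with `sscanf()`. The sketch below is a stricter variant, not the patch's code: the function name `parse_root_port` is hypothetical, the pattern is anchored at both ends, and the domain is limited to four hex digits on the assumption that the documented range `[0:ffff]` is the intended one. It also checks the `regcomp()` result and reports failure when `sscanf()` does not convert both fields.

```c
#include <regex.h>
#include <stdio.h>

/*
 * Parse "domain:bus" (hex, e.g. "0000:3d") into its two fields.
 * Returns 0 on success, -1 on any failure.
 */
int parse_root_port(const char *str, unsigned int *domain, unsigned char *bus)
{
	regex_t re;
	int ret = -1;

	/* Anchored at both ends: 1-4 hex digits, ':', 1-2 hex digits. */
	if (regcomp(&re, "^[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,2}$",
		    REG_EXTENDED | REG_NOSUB))
		return -1;

	if (!regexec(&re, str, 0, NULL, 0) &&
	    sscanf(str, "%x:%hhx", domain, bus) == 2)
		ret = 0;

	regfree(&re);
	return ret;
}
```

With the trailing `$`, an input such as `0000:3d:00` is rejected instead of being silently truncated to `0000:3d`.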
|
||||
|
||||
static int iio_root_ports_list_filter(struct iio_root_ports_list **list,
|
||||
const char *filter)
|
||||
{
|
||||
char *tok, *tmp, *filter_copy = NULL;
|
||||
struct iio_root_port *rp;
|
||||
u32 domain;
|
||||
u8 bus;
|
||||
int ret = -ENOMEM;
|
||||
struct iio_root_ports_list *tmp_list = calloc(1, sizeof(*tmp_list));
|
||||
|
||||
if (!tmp_list)
|
||||
goto err;
|
||||
|
||||
filter_copy = strdup(filter);
|
||||
if (!filter_copy)
|
||||
goto err;
|
||||
|
||||
for (tok = strtok_r(filter_copy, ",", &tmp); tok;
|
||||
tok = strtok_r(NULL, ",", &tmp)) {
|
||||
if (!iio_root_port_parse_str(&domain, &bus, tok)) {
|
||||
rp = iio_root_port_find_by_notation(*list, domain, bus);
|
||||
if (rp) {
|
||||
(*list)->rps[rp->idx] = NULL;
|
||||
ret = iio_root_ports_list_insert(tmp_list, rp);
|
||||
if (ret) {
|
||||
free(rp);
|
||||
goto err;
|
||||
}
|
||||
} else if (!iio_root_port_find_by_notation(tmp_list,
|
||||
domain, bus))
|
||||
pr_warning("Root port %04x:%02x was not found\n",
|
||||
domain, bus);
|
||||
}
|
||||
}
|
||||
|
||||
if (tmp_list->nr_entries == 0) {
|
||||
pr_err("Requested root ports were not found\n");
|
||||
ret = -EINVAL;
|
||||
}
|
||||
err:
|
||||
iio_root_ports_list_free(*list);
|
||||
if (ret)
|
||||
iio_root_ports_list_free(tmp_list);
|
||||
else
|
||||
*list = tmp_list;
|
||||
|
||||
free(filter_copy);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int iostat_event_group(struct evlist *evl,
|
||||
struct iio_root_ports_list *list)
|
||||
{
|
||||
int ret;
|
||||
int idx;
|
||||
const char *iostat_cmd_template =
|
||||
"{uncore_iio_%x/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/,\
|
||||
uncore_iio_%x/event=0x83,umask=0x01,ch_mask=0xF,fc_mask=0x07/,\
|
||||
uncore_iio_%x/event=0xc0,umask=0x04,ch_mask=0xF,fc_mask=0x07/,\
|
||||
uncore_iio_%x/event=0xc0,umask=0x01,ch_mask=0xF,fc_mask=0x07/}";
|
||||
const int len_template = strlen(iostat_cmd_template) + 1;
|
||||
struct evsel *evsel = NULL;
|
||||
int metrics_count = iostat_metrics_count();
|
||||
char *iostat_cmd = calloc(len_template, 1);
|
||||
|
||||
if (!iostat_cmd)
|
||||
return -ENOMEM;
|
||||
|
||||
for (idx = 0; idx < list->nr_entries; idx++) {
|
||||
sprintf(iostat_cmd, iostat_cmd_template,
|
||||
list->rps[idx]->pmu_idx, list->rps[idx]->pmu_idx,
|
||||
list->rps[idx]->pmu_idx, list->rps[idx]->pmu_idx);
|
||||
ret = parse_events(evl, iostat_cmd, NULL);
|
||||
if (ret)
|
||||
goto err;
|
||||
}
|
||||
|
||||
evlist__for_each_entry(evl, evsel) {
|
||||
evsel->priv = list->rps[evsel->idx / metrics_count];
|
||||
}
|
||||
list->nr_entries = 0;
|
||||
err:
|
||||
iio_root_ports_list_free(list);
|
||||
free(iostat_cmd);
|
||||
return ret;
|
||||
}
|
||||
|
||||
int iostat_prepare(struct evlist *evlist, struct perf_stat_config *config)
|
||||
{
|
||||
if (evlist->core.nr_entries > 0) {
|
||||
pr_warning("The -e and -M options are not supported. "
|
||||
"All chosen events/metrics will be dropped\n");
|
||||
evlist__delete(evlist);
|
||||
evlist = evlist__new();
|
||||
if (!evlist)
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
config->metric_only = true;
|
||||
config->aggr_mode = AGGR_GLOBAL;
|
||||
|
||||
return iostat_event_group(evlist, root_ports);
|
||||
}
|
||||
|
||||
int iostat_parse(const struct option *opt, const char *str,
|
||||
int unset __maybe_unused)
|
||||
{
|
||||
int ret;
|
||||
struct perf_stat_config *config = (struct perf_stat_config *)opt->data;
|
||||
|
||||
ret = iio_root_ports_scan(&root_ports);
|
||||
if (!ret) {
|
||||
config->iostat_run = true;
|
||||
if (!str)
|
||||
iostat_mode = IOSTAT_RUN;
|
||||
else if (!strcmp(str, "list"))
|
||||
iostat_mode = IOSTAT_LIST;
|
||||
else {
|
||||
iostat_mode = IOSTAT_RUN;
|
||||
ret = iio_root_ports_list_filter(&root_ports, str);
|
||||
}
|
||||
}
|
||||
return ret;
|
||||
}
|
||||
|
||||
void iostat_list(struct evlist *evlist, struct perf_stat_config *config)
|
||||
{
|
||||
struct evsel *evsel;
|
||||
struct iio_root_port *rp = NULL;
|
||||
|
||||
evlist__for_each_entry(evlist, evsel) {
|
||||
if (rp != evsel->priv) {
|
||||
rp = evsel->priv;
|
||||
iio_root_port_show(config->output, rp);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
void iostat_release(struct evlist *evlist)
|
||||
{
|
||||
struct evsel *evsel;
|
||||
struct iio_root_port *rp = NULL;
|
||||
|
||||
evlist__for_each_entry(evlist, evsel) {
|
||||
if (rp != evsel->priv) {
|
||||
rp = evsel->priv;
|
||||
free(evsel->priv);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
void iostat_prefix(struct evlist *evlist,
|
||||
struct perf_stat_config *config,
|
||||
char *prefix, struct timespec *ts)
|
||||
{
|
||||
struct iio_root_port *rp = evlist->selected->priv;
|
||||
|
||||
if (rp) {
|
||||
if (ts)
|
||||
sprintf(prefix, "%6lu.%09lu%s%04x:%02x%s",
|
||||
ts->tv_sec, ts->tv_nsec,
|
||||
config->csv_sep, rp->domain, rp->bus,
|
||||
config->csv_sep);
|
||||
else
|
||||
sprintf(prefix, "%04x:%02x%s", rp->domain, rp->bus,
|
||||
config->csv_sep);
|
||||
}
|
||||
}
|
||||
|
||||
void iostat_print_header_prefix(struct perf_stat_config *config)
|
||||
{
|
||||
if (config->csv_output)
|
||||
fputs("port,", config->output);
|
||||
else if (config->interval)
|
||||
fprintf(config->output, "# time port ");
|
||||
else
|
||||
fprintf(config->output, " port ");
|
||||
}
|
||||
|
||||
void iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel,
|
||||
struct perf_stat_output_ctx *out)
|
||||
{
|
||||
double iostat_value = 0;
|
||||
u64 prev_count_val = 0;
|
||||
const char *iostat_metric = iostat_metric_by_idx(evsel->idx);
|
||||
u8 die = ((struct iio_root_port *)evsel->priv)->die;
|
||||
struct perf_counts_values *count = perf_counts(evsel->counts, die, 0);
|
||||
|
||||
if (count->run && count->ena) {
|
||||
if (evsel->prev_raw_counts && !out->force_header) {
|
||||
struct perf_counts_values *prev_count =
|
||||
perf_counts(evsel->prev_raw_counts, die, 0);
|
||||
|
||||
prev_count_val = prev_count->val;
|
||||
prev_count->val = count->val;
|
||||
}
|
||||
iostat_value = (count->val - prev_count_val) /
|
||||
((double) count->run / count->ena);
|
||||
}
|
||||
out->print_metric(config, out->ctx, NULL, "%8.0f", iostat_metric,
|
||||
iostat_value / (256 * 1024));
|
||||
}
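The metric above is the counter delta, scaled by the fraction of time the event was actually running, then divided by `256 * 1024`. A minimal sketch of that arithmetic follows. It assumes each counter increment represents 4 bytes, which would make `256 * 1024` increments equal to 1 MiB; that unit is an assumption, not something the patch states, and the name `iostat_mb` is hypothetical.

```c
#include <stdint.h>

/* Assumption: one counter increment corresponds to 4 bytes transferred. */
#define IIO_COUNTS_PER_MB (256.0 * 1024.0)

/*
 * Convert a raw counter delta into megabytes, compensating for the time
 * the event was enabled but not running.
 */
double iostat_mb(uint64_t val, uint64_t prev_val, uint64_t run, uint64_t ena)
{
	double scale;

	if (!run || !ena)
		return 0.0;

	/* Fraction of the enabled time during which the event was counting. */
	scale = (double)run / (double)ena;

	return (double)(val - prev_val) / scale / IIO_COUNTS_PER_MB;
}
```

For example, a delta of 262144 increments with `run == ena` yields 1.0, and the same delta with the event running only half of the enabled time is extrapolated to 2.0.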
|
||||
|
||||
void iostat_print_counters(struct evlist *evlist,
|
||||
struct perf_stat_config *config, struct timespec *ts,
|
||||
char *prefix, iostat_print_counter_t print_cnt_cb)
|
||||
{
|
||||
void *perf_device = NULL;
|
||||
struct evsel *counter = evlist__first(evlist);
|
||||
|
||||
evlist__set_selected(evlist, counter);
|
||||
iostat_prefix(evlist, config, prefix, ts);
|
||||
fprintf(config->output, "%s", prefix);
|
||||
evlist__for_each_entry(evlist, counter) {
|
||||
perf_device = evlist->selected->priv;
|
||||
if (perf_device && perf_device != counter->priv) {
|
||||
evlist__set_selected(evlist, counter);
|
||||
iostat_prefix(evlist, config, prefix, ts);
|
||||
fprintf(config->output, "\n%s", prefix);
|
||||
}
|
||||
print_cnt_cb(config, counter, prefix);
|
||||
}
|
||||
fputc('\n', config->output);
|
||||
}
|
|
@ -165,7 +165,7 @@ static int sdt_init_op_regex(void)
|
|||
/*
|
||||
* Max x86 register name length is 5(ex: %r15d). So, 6th char
|
||||
* should always contain NULL. This helps to find register name
|
||||
* length using strlen, insted of maintaing one more variable.
|
||||
* length using strlen, instead of maintaining one more variable.
|
||||
*/
|
||||
#define SDT_REG_NAME_SIZE 6
|
||||
|
||||
|
@ -207,7 +207,7 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
|
|||
* and displacement 0 (Both sign and displacement 0 are
|
||||
* optional so it may be empty). Use one more character
|
||||
* to hold last NULL so that strlen can be used to find
|
||||
* prefix length, instead of maintaing one more variable.
|
||||
* prefix length, instead of maintaining one more variable.
|
||||
*/
|
||||
char prefix[3] = {0};
|
||||
|
||||
|
|
|
@ -17,7 +17,7 @@
|
|||
* While the second model, enabled via --multiq option, uses multiple
|
||||
* queueing (which refers to one epoll instance per worker). For example,
|
||||
* short lived tcp connections in a high throughput httpd server will
|
||||
* ditribute the accept()'ing connections across CPUs. In this case each
|
||||
* distribute the accept()'ing connections across CPUs. In this case each
|
||||
* worker does a limited amount of processing.
|
||||
*
|
||||
* [queue A] ---> [worker]
|
||||
|
@ -198,7 +198,7 @@ static void *workerfn(void *arg)
|
|||
|
||||
do {
|
||||
/*
|
||||
* Block undefinitely waiting for the IN event.
|
||||
* Block indefinitely waiting for the IN event.
|
||||
* In order to stress the epoll_wait(2) syscall,
|
||||
* call it event per event, instead of a larger
|
||||
* batch (max)limit.
|
||||
|
|
|
@ -372,7 +372,7 @@ static int inject_build_id(struct bench_data *data, u64 *max_rss)
|
|||
len += synthesize_flush(data);
|
||||
}
|
||||
|
||||
/* tihs makes the child to finish */
|
||||
/* this makes the child finish */
|
||||
close(data->input_pipe[1]);
|
||||
|
||||
wait4(data->pid, &status, 0, &rusage);
|
||||
|
|
|
@ -42,7 +42,7 @@
|
|||
#endif
|
||||
|
||||
/*
|
||||
* Regular printout to the terminal, supressed if -q is specified:
|
||||
* Regular printout to the terminal, suppressed if -q is specified:
|
||||
*/
|
||||
#define tprintf(x...) do { if (g && g->p.show_details >= 0) printf(x); } while (0)
|
||||
|
||||
|
|
|
@ -239,7 +239,7 @@ static int evsel__add_sample(struct evsel *evsel, struct perf_sample *sample,
|
|||
}
|
||||
|
||||
/*
|
||||
* XXX filtered samples can still have branch entires pointing into our
|
||||
* XXX filtered samples can still have branch entries pointing into our
|
||||
* symbol and are missed.
|
||||
*/
|
||||
process_branch_stack(sample->branch_stack, al, sample);
|
||||
|
@ -374,13 +374,6 @@ static void hists__find_annotations(struct hists *hists,
|
|||
} else {
|
||||
hist_entry__tty_annotate(he, evsel, ann);
|
||||
nd = rb_next(nd);
|
||||
/*
|
||||
* Since we have a hist_entry per IP for the same
|
||||
* symbol, free he->ms.sym->src to signal we already
|
||||
* processed this symbol.
|
||||
*/
|
||||
zfree(&notes->src->cycles_hist);
|
||||
zfree(&notes->src);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -411,8 +404,8 @@ static int __cmd_annotate(struct perf_annotate *ann)
|
|||
goto out;
|
||||
|
||||
if (dump_trace) {
|
||||
perf_session__fprintf_nr_events(session, stdout);
|
||||
evlist__fprintf_nr_events(session->evlist, stdout);
|
||||
perf_session__fprintf_nr_events(session, stdout, false);
|
||||
evlist__fprintf_nr_events(session->evlist, stdout, false);
|
||||
goto out;
|
||||
}
|
||||
|
||||
|
@ -425,7 +418,7 @@ static int __cmd_annotate(struct perf_annotate *ann)
|
|||
total_nr_samples = 0;
|
||||
evlist__for_each_entry(session->evlist, pos) {
|
||||
struct hists *hists = evsel__hists(pos);
|
||||
u32 nr_samples = hists->stats.nr_events[PERF_RECORD_SAMPLE];
|
||||
u32 nr_samples = hists->stats.nr_samples;
|
||||
|
||||
if (nr_samples > 0) {
|
||||
total_nr_samples += nr_samples;
|
||||
|
@ -538,6 +531,10 @@ int cmd_annotate(int argc, const char **argv)
|
|||
"Strip first N entries of source file path name in programs (with --prefix)"),
|
||||
OPT_STRING(0, "objdump", &annotate.opts.objdump_path, "path",
|
||||
"objdump binary to use for disassembly and annotations"),
|
||||
OPT_BOOLEAN(0, "demangle", &symbol_conf.demangle,
|
||||
"Enable symbol demangling"),
|
||||
OPT_BOOLEAN(0, "demangle-kernel", &symbol_conf.demangle_kernel,
|
||||
"Enable kernel symbol demangling"),
|
||||
OPT_BOOLEAN(0, "group", &symbol_conf.event_group,
|
||||
"Show event group information together"),
|
||||
OPT_BOOLEAN(0, "show-total-period", &symbol_conf.show_total_period,
|
||||
|
@ -619,14 +616,22 @@ int cmd_annotate(int argc, const char **argv)
|
|||
|
||||
setup_browser(true);
|
||||
|
||||
if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack) {
|
||||
/*
|
||||
* Events of different processes may correspond to the same
|
||||
* symbol, we do not care about the processes in annotate,
|
||||
* set sort order to avoid repeated output.
|
||||
*/
|
||||
sort_order = "dso,symbol";
|
||||
|
||||
/*
|
||||
* Set SORT_MODE__BRANCH so that annotate displays IPC/Cycle
|
||||
* if branch info is in perf data in TUI mode.
|
||||
*/
|
||||
if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack)
|
||||
sort__mode = SORT_MODE__BRANCH;
|
||||
if (setup_sorting(annotate.session->evlist) < 0)
|
||||
usage_with_options(annotate_usage, options);
|
||||
} else {
|
||||
if (setup_sorting(NULL) < 0)
|
||||
usage_with_options(annotate_usage, options);
|
||||
}
|
||||
|
||||
if (setup_sorting(NULL) < 0)
|
||||
usage_with_options(annotate_usage, options);
|
||||
|
||||
ret = __cmd_annotate(&annotate);
|
||||
|
||||
|
|
|
@ -6,7 +6,6 @@
|
|||
#include <linux/zalloc.h>
|
||||
#include <linux/string.h>
|
||||
#include <linux/limits.h>
|
||||
#include <linux/string.h>
|
||||
#include <string.h>
|
||||
#include <sys/file.h>
|
||||
#include <signal.h>
|
||||
|
@ -24,8 +23,6 @@
|
|||
#include <sys/signalfd.h>
|
||||
#include <sys/wait.h>
|
||||
#include <poll.h>
|
||||
#include <sys/stat.h>
|
||||
#include <time.h>
|
||||
#include "builtin.h"
|
||||
#include "perf.h"
|
||||
#include "debug.h"
|
||||
|
|
|
@ -7,7 +7,6 @@
|
|||
#include "debug.h"
|
||||
#include <subcmd/parse-options.h>
|
||||
#include "data-convert.h"
|
||||
#include "data-convert-bt.h"
|
||||
|
||||
typedef int (*data_cmd_fn_t)(int argc, const char **argv);
|
||||
|
||||
|
@ -55,7 +54,8 @@ static const char * const data_convert_usage[] = {
|
|||
|
||||
static int cmd_data_convert(int argc, const char **argv)
|
||||
{
|
||||
const char *to_ctf = NULL;
|
||||
const char *to_json = NULL;
|
||||
const char *to_ctf = NULL;
|
||||
struct perf_data_convert_opts opts = {
|
||||
.force = false,
|
||||
.all = false,
|
||||
|
@ -63,6 +63,7 @@ static int cmd_data_convert(int argc, const char **argv)
|
|||
const struct option options[] = {
|
||||
OPT_INCR('v', "verbose", &verbose, "be more verbose"),
|
||||
OPT_STRING('i', "input", &input_name, "file", "input file name"),
|
||||
OPT_STRING(0, "to-json", &to_json, NULL, "Convert to JSON format"),
|
||||
#ifdef HAVE_LIBBABELTRACE_SUPPORT
|
||||
OPT_STRING(0, "to-ctf", &to_ctf, NULL, "Convert to CTF format"),
|
||||
OPT_BOOLEAN(0, "tod", &opts.tod, "Convert time to wall clock time"),
|
||||
|
@ -72,11 +73,6 @@ static int cmd_data_convert(int argc, const char **argv)
|
|||
OPT_END()
|
||||
};
|
||||
|
||||
#ifndef HAVE_LIBBABELTRACE_SUPPORT
|
||||
pr_err("No conversion support compiled in. perf should be compiled with environment variables LIBBABELTRACE=1 and LIBBABELTRACE_DIR=/path/to/libbabeltrace/\n");
|
||||
return -1;
|
||||
#endif
|
||||
|
||||
argc = parse_options(argc, argv, options,
|
||||
data_convert_usage, 0);
|
||||
if (argc) {
|
||||
|
@ -84,11 +80,25 @@ static int cmd_data_convert(int argc, const char **argv)
|
|||
return -1;
|
||||
}
|
||||
|
||||
if (to_json && to_ctf) {
|
||||
pr_err("You cannot specify both --to-ctf and --to-json.\n");
|
||||
return -1;
|
||||
}
|
||||
if (!to_json && !to_ctf) {
|
||||
pr_err("You must specify one of --to-ctf or --to-json.\n");
|
||||
return -1;
|
||||
}
|
||||
|
||||
if (to_json)
|
||||
return bt_convert__perf2json(input_name, to_json, &opts);
|
||||
|
||||
if (to_ctf) {
|
||||
#ifdef HAVE_LIBBABELTRACE_SUPPORT
|
||||
return bt_convert__perf2ctf(input_name, to_ctf, &opts);
|
||||
#else
|
||||
pr_err("The libbabeltrace support is not compiled in.\n");
|
||||
pr_err("The libbabeltrace support is not compiled in. perf should be "
|
||||
"compiled with environment variables LIBBABELTRACE=1 and "
|
||||
"LIBBABELTRACE_DIR=/path/to/libbabeltrace/\n");
|
||||
return -1;
|
||||
#endif
|
||||
}
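The hunk above requires exactly one of `--to-json` and `--to-ctf`. The two separate checks can be folded into one comparison, sketched below with a hypothetical enum and helper name that are not part of the patch.

```c
#include <stddef.h>

/* Hypothetical result type for the sketch. */
enum convert_target { CONVERT_INVALID, CONVERT_JSON, CONVERT_CTF };

/* Exactly one of the two output paths must be set. */
enum convert_target pick_convert_target(const char *to_json, const char *to_ctf)
{
	if ((to_json != NULL) == (to_ctf != NULL))
		return CONVERT_INVALID;	/* both or neither given */

	return to_json ? CONVERT_JSON : CONVERT_CTF;
}
```

The patch keeps the two cases apart so that each can print its own error message; the combined test only helps when a single message is enough.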
|
||||
|
|
|
@ -1796,7 +1796,7 @@ static int ui_init(void)
|
|||
data__for_each_file(i, d) {
|
||||
|
||||
/*
|
||||
* Baseline or compute realted columns:
|
||||
* Baseline or compute related columns:
|
||||
*
|
||||
* PERF_HPP_DIFF__BASELINE
|
||||
* PERF_HPP_DIFF__DELTA
|
||||
|
|
|
@ -49,7 +49,7 @@ struct lock_stat {
|
|||
|
||||
/*
|
||||
* FIXME: evsel__intval() returns u64,
|
||||
* so address of lockdep_map should be dealed as 64bit.
|
||||
* so address of lockdep_map should be treated as 64bit.
|
||||
* Is there a better solution?
|
||||
*/
|
||||
void *addr; /* address of lockdep_map, used as ID */
|
||||
|
|
|
@ -47,6 +47,8 @@
|
|||
#include "util/util.h"
|
||||
#include "util/pfm.h"
|
||||
#include "util/clockid.h"
|
||||
#include "util/pmu-hybrid.h"
|
||||
#include "util/evlist-hybrid.h"
|
||||
#include "asm/bug.h"
|
||||
#include "perf.h"
|
||||
|
||||
|
@ -1603,6 +1605,32 @@ static void hit_auxtrace_snapshot_trigger(struct record *rec)
|
|||
}
|
||||
}
|
||||
|
||||
static void record__uniquify_name(struct record *rec)
|
||||
{
|
||||
struct evsel *pos;
|
||||
struct evlist *evlist = rec->evlist;
|
||||
char *new_name;
|
||||
int ret;
|
||||
|
||||
if (!perf_pmu__has_hybrid())
|
||||
return;
|
||||
|
||||
evlist__for_each_entry(evlist, pos) {
|
||||
if (!evsel__is_hybrid(pos))
|
||||
continue;
|
||||
|
||||
if (strchr(pos->name, '/'))
|
||||
continue;
|
||||
|
||||
ret = asprintf(&new_name, "%s/%s/",
|
||||
pos->pmu_name, pos->name);
|
||||
if (ret >= 0) {
|
||||
free(pos->name);
|
||||
pos->name = new_name;
|
||||
}
|
||||
}
|
||||
}
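The function above rewrites a hybrid event name as `pmu/name/`. Note that `asprintf()` reports failure by returning -1, so its result needs a sign check rather than a plain truth test. The sketch below shows the same name construction in portable C using `snprintf()` and `malloc()`; the helper name `qualify_event_name` and the PMU names in the usage note are illustrative assumptions.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated "pmu/name/" string, or NULL on failure.
 * Names that already contain '/' are returned as a plain copy.
 */
char *qualify_event_name(const char *pmu, const char *name)
{
	char *out;
	int len;

	if (strchr(name, '/')) {
		out = malloc(strlen(name) + 1);
		if (out)
			strcpy(out, name);
		return out;
	}

	/* First pass computes the required length without writing. */
	len = snprintf(NULL, 0, "%s/%s/", pmu, name);
	if (len < 0)
		return NULL;

	out = malloc((size_t)len + 1);
	if (out)
		snprintf(out, (size_t)len + 1, "%s/%s/", pmu, name);
	return out;
}
```

Called as `qualify_event_name("cpu_core", "cycles")`, this builds the string `cpu_core/cycles/`; the caller owns the result and releases it with `free()`.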
|
||||
|
||||
static int __cmd_record(struct record *rec, int argc, const char **argv)
|
||||
{
|
||||
int err;
|
||||
|
@ -1707,6 +1735,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
|
|||
if (data->is_pipe && rec->evlist->core.nr_entries == 1)
|
||||
rec->opts.sample_id = true;
|
||||
|
||||
record__uniquify_name(rec);
|
||||
|
||||
if (record__open(rec) != 0) {
|
||||
err = -1;
|
||||
goto out_child;
|
||||
|
@ -1977,9 +2007,13 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
|
|||
record__auxtrace_snapshot_exit(rec);
|
||||
|
||||
if (forks && workload_exec_errno) {
|
||||
char msg[STRERR_BUFSIZE];
|
||||
char msg[STRERR_BUFSIZE], strevsels[2048];
|
||||
const char *emsg = str_error_r(workload_exec_errno, msg, sizeof(msg));
|
||||
pr_err("Workload failed: %s\n", emsg);
|
||||
|
||||
evlist__scnprintf_evsels(rec->evlist, sizeof(strevsels), strevsels);
|
||||
|
||||
pr_err("Failed to collect '%s' for the '%s' workload: %s\n",
|
||||
strevsels, argv[0], emsg);
|
||||
err = -1;
|
||||
goto out_child;
|
||||
}
|
||||
|
@ -2786,10 +2820,19 @@ int cmd_record(int argc, const char **argv)
|
|||
if (record.opts.overwrite)
|
||||
record.opts.tail_synthesize = true;
|
||||
|
||||
if (rec->evlist->core.nr_entries == 0 &&
|
||||
__evlist__add_default(rec->evlist, !record.opts.no_samples) < 0) {
|
||||
pr_err("Not enough memory for event selector list\n");
|
||||
goto out;
|
||||
if (rec->evlist->core.nr_entries == 0) {
|
||||
if (perf_pmu__has_hybrid()) {
|
||||
err = evlist__add_default_hybrid(rec->evlist,
|
||||
!record.opts.no_samples);
|
||||
} else {
|
||||
err = __evlist__add_default(rec->evlist,
|
||||
!record.opts.no_samples);
|
||||
}
|
||||
|
||||
if (err < 0) {
|
||||
pr_err("Not enough memory for event selector list\n");
|
||||
goto out;
|
||||
}
|
||||
}
|
||||
|
||||
if (rec->opts.target.tid && !rec->opts.no_inherit_set)
|
||||
|
|
|
@ -84,6 +84,8 @@ struct report {
|
|||
bool nonany_branch_mode;
|
||||
bool group_set;
|
||||
bool stitch_lbr;
|
||||
bool disable_order;
|
||||
bool skip_empty;
|
||||
int max_stack;
|
||||
struct perf_read_values show_threads_values;
|
||||
struct annotation_options annotation_opts;
|
||||
|
@ -134,6 +136,11 @@ static int report__config(const char *var, const char *value, void *cb)
|
|||
return 0;
|
||||
}
|
||||
|
||||
if (!strcmp(var, "report.skip-empty")) {
|
||||
rep->skip_empty = perf_config_bool(var, value);
|
||||
return 0;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
@ -435,7 +442,7 @@ static size_t hists__fprintf_nr_sample_events(struct hists *hists, struct report
|
|||
{
|
||||
size_t ret;
|
||||
char unit;
|
||||
unsigned long nr_samples = hists->stats.nr_events[PERF_RECORD_SAMPLE];
|
||||
unsigned long nr_samples = hists->stats.nr_samples;
|
||||
u64 nr_events = hists->stats.total_period;
|
||||
struct evsel *evsel = hists_to_evsel(hists);
|
||||
char buf[512];
|
||||
|
@ -463,7 +470,7 @@ static size_t hists__fprintf_nr_sample_events(struct hists *hists, struct report
|
|||
nr_samples += pos_hists->stats.nr_non_filtered_samples;
|
||||
nr_events += pos_hists->stats.total_non_filtered_period;
|
||||
} else {
|
||||
nr_samples += pos_hists->stats.nr_events[PERF_RECORD_SAMPLE];
|
||||
nr_samples += pos_hists->stats.nr_samples;
|
||||
nr_events += pos_hists->stats.total_period;
|
||||
}
|
||||
}
|
||||
|
@ -529,6 +536,9 @@ static int evlist__tty_browse_hists(struct evlist *evlist, struct report *rep, c
|
|||
if (symbol_conf.event_group && !evsel__is_group_leader(pos))
|
||||
continue;
|
||||
|
||||
if (rep->skip_empty && !hists->stats.nr_samples)
|
||||
continue;
|
||||
|
||||
hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
|
||||
|
||||
if (rep->total_cycles_mode) {
|
||||
|
@ -707,9 +717,22 @@ static void report__output_resort(struct report *rep)
|
|||
ui_progress__finish();
|
||||
}
|
||||
|
||||
static int count_sample_event(struct perf_tool *tool __maybe_unused,
|
||||
union perf_event *event __maybe_unused,
|
||||
struct perf_sample *sample __maybe_unused,
|
||||
struct evsel *evsel,
|
||||
struct machine *machine __maybe_unused)
|
||||
{
|
||||
struct hists *hists = evsel__hists(evsel);
|
||||
|
||||
hists__inc_nr_events(hists);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void stats_setup(struct report *rep)
|
||||
{
|
||||
memset(&rep->tool, 0, sizeof(rep->tool));
|
||||
rep->tool.sample = count_sample_event;
|
||||
rep->tool.no_warn = true;
|
||||
}
|
||||
|
||||
|
@ -717,7 +740,8 @@ static int stats_print(struct report *rep)
|
|||
{
|
||||
struct perf_session *session = rep->session;
|
||||
|
||||
perf_session__fprintf_nr_events(session, stdout);
|
||||
perf_session__fprintf_nr_events(session, stdout, rep->skip_empty);
|
||||
evlist__fprintf_nr_events(session->evlist, stdout, rep->skip_empty);
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
@ -929,8 +953,10 @@ static int __cmd_report(struct report *rep)
|
|||
perf_session__fprintf_dsos(session, stdout);
|
||||
|
||||
if (dump_trace) {
|
||||
perf_session__fprintf_nr_events(session, stdout);
|
||||
evlist__fprintf_nr_events(session->evlist, stdout);
|
||||
perf_session__fprintf_nr_events(session, stdout,
|
||||
rep->skip_empty);
|
||||
evlist__fprintf_nr_events(session->evlist, stdout,
|
||||
rep->skip_empty);
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
@ -1139,6 +1165,7 @@ int cmd_report(int argc, const char **argv)
|
|||
.pretty_printing_style = "normal",
|
||||
.socket_filter = -1,
|
||||
.annotation_opts = annotation__default_options,
|
||||
.skip_empty = true,
|
||||
};
|
||||
const struct option options[] = {
|
||||
OPT_STRING('i', "input", &input_name, "file",
|
||||
|
@ -1296,6 +1323,10 @@ int cmd_report(int argc, const char **argv)
|
|||
OPTS_EVSWITCH(&report.evswitch),
|
||||
OPT_BOOLEAN(0, "total-cycles", &report.total_cycles_mode,
|
||||
"Sort all blocks by 'Sampled Cycles%'"),
|
||||
OPT_BOOLEAN(0, "disable-order", &report.disable_order,
|
||||
"Disable raw trace ordering"),
|
||||
OPT_BOOLEAN(0, "skip-empty", &report.skip_empty,
|
||||
"Do not display empty (or dummy) events in the output"),
|
||||
OPT_END()
|
||||
};
|
||||
struct perf_data data = {
|
||||
|
@ -1329,7 +1360,7 @@ int cmd_report(int argc, const char **argv)
|
|||
if (report.mmaps_mode)
|
||||
report.tasks_mode = true;
|
||||
|
||||
if (dump_trace)
|
||||
if (dump_trace && report.disable_order)
|
||||
report.tool.ordered_events = false;
|
||||
|
||||
if (quiet)
|
||||
|
|
|
@ -1712,7 +1712,7 @@ static int perf_sched__process_fork_event(struct perf_tool *tool,
|
|||
{
|
||||
struct perf_sched *sched = container_of(tool, struct perf_sched, tool);
|
||||
|
||||
/* run the fork event through the perf machineruy */
|
||||
/* run the fork event through the perf machinery */
|
||||
perf_event__process_fork(tool, event, sample, machine);
|
||||
|
||||
/* and then run additional processing needed for this command */
|
||||
|
|
|
@ -314,8 +314,7 @@ static inline struct evsel_script *evsel_script(struct evsel *evsel)
|
|||
return (struct evsel_script *)evsel->priv;
|
||||
}
|
||||
|
||||
static struct evsel_script *perf_evsel_script__new(struct evsel *evsel,
|
||||
struct perf_data *data)
|
||||
static struct evsel_script *evsel_script__new(struct evsel *evsel, struct perf_data *data)
|
||||
{
|
||||
struct evsel_script *es = zalloc(sizeof(*es));
|
||||
|
||||
|
@ -335,7 +334,7 @@ static struct evsel_script *perf_evsel_script__new(struct evsel *evsel,
|
|||
return NULL;
|
||||
}
|
||||
|
||||
static void perf_evsel_script__delete(struct evsel_script *es)
|
||||
static void evsel_script__delete(struct evsel_script *es)
|
||||
{
|
||||
zfree(&es->filename);
|
||||
fclose(es->fp);
|
||||
|
@ -343,7 +342,7 @@ static void perf_evsel_script__delete(struct evsel_script *es)
|
|||
free(es);
|
||||
}
|
||||
|
||||
static int perf_evsel_script__fprintf(struct evsel_script *es, FILE *fp)
|
||||
static int evsel_script__fprintf(struct evsel_script *es, FILE *fp)
|
||||
{
|
||||
struct stat st;
|
||||
|
||||
|
@ -2219,8 +2218,7 @@ static int process_attr(struct perf_tool *tool, union perf_event *event,
|
|||
|
||||
if (!evsel->priv) {
|
||||
if (scr->per_event_dump) {
|
||||
evsel->priv = perf_evsel_script__new(evsel,
|
||||
scr->session->data);
|
||||
evsel->priv = evsel_script__new(evsel, scr->session->data);
|
||||
} else {
|
||||
es = zalloc(sizeof(*es));
|
||||
if (!es)
|
||||
|
@ -2475,7 +2473,7 @@ static void perf_script__fclose_per_event_dump(struct perf_script *script)
|
|||
evlist__for_each_entry(evlist, evsel) {
|
||||
if (!evsel->priv)
|
||||
break;
|
||||
perf_evsel_script__delete(evsel->priv);
|
||||
evsel_script__delete(evsel->priv);
|
||||
evsel->priv = NULL;
|
||||
}
|
||||
}
|
||||
|
@ -2488,14 +2486,14 @@ static int perf_script__fopen_per_event_dump(struct perf_script *script)
|
|||
/*
|
||||
* Already setup? I.e. we may be called twice in cases like
|
||||
* Intel PT, one for the intel_pt// and dummy events, then
|
||||
* for the evsels syntheized from the auxtrace info.
|
||||
* for the evsels synthesized from the auxtrace info.
|
||||
*
|
||||
* See perf_script__process_auxtrace_info.
|
||||
*/
|
||||
if (evsel->priv != NULL)
|
||||
continue;
|
||||
|
||||
evsel->priv = perf_evsel_script__new(evsel, script->session->data);
|
||||
evsel->priv = evsel_script__new(evsel, script->session->data);
|
||||
if (evsel->priv == NULL)
|
||||
goto out_err_fclose;
|
||||
}
|
||||
|
@ -2530,8 +2528,8 @@ static void perf_script__exit_per_event_dump_stats(struct perf_script *script)
|
|||
evlist__for_each_entry(script->session->evlist, evsel) {
|
||||
struct evsel_script *es = evsel->priv;
|
||||
|
||||
perf_evsel_script__fprintf(es, stdout);
|
||||
perf_evsel_script__delete(es);
|
||||
evsel_script__fprintf(es, stdout);
|
||||
evsel_script__delete(es);
|
||||
evsel->priv = NULL;
|
||||
}
|
||||
}
|
||||
|
@ -3085,7 +3083,7 @@ static int list_available_scripts(const struct option *opt __maybe_unused,
|
|||
*
|
||||
* Fixme: All existing "xxx-record" are all in good formats "-e event ",
|
||||
* which is covered well now. And new parsing code should be added to
|
||||
* cover the future complexing formats like event groups etc.
|
||||
* cover the future complex formats like event groups etc.
|
||||
*/
|
||||
static int check_ev_match(char *dir_name, char *scriptname,
|
||||
struct perf_session *session)
|
||||
|
|
|
@ -48,6 +48,7 @@
|
|||
#include "util/pmu.h"
|
||||
#include "util/event.h"
|
||||
#include "util/evlist.h"
|
||||
#include "util/evlist-hybrid.h"
|
||||
#include "util/evsel.h"
|
||||
#include "util/debug.h"
|
||||
#include "util/color.h"
|
||||
|
@ -68,6 +69,8 @@
|
|||
#include "util/affinity.h"
|
||||
#include "util/pfm.h"
|
||||
#include "util/bpf_counter.h"
|
||||
#include "util/iostat.h"
|
||||
#include "util/pmu-hybrid.h"
|
||||
#include "asm/bug.h"
|
||||
|
||||
#include <linux/time64.h>
|
||||
|
@ -160,6 +163,7 @@ static const char *smi_cost_attrs = {
|
|||
};
|
||||
|
||||
static struct evlist *evsel_list;
|
||||
static bool all_counters_use_bpf = true;
|
||||
|
||||
static struct target target = {
|
||||
.uid = UINT_MAX,
|
||||
|
@ -212,7 +216,8 @@ static struct perf_stat_config stat_config = {
|
|||
.walltime_nsecs_stats = &walltime_nsecs_stats,
|
||||
.big_num = true,
|
||||
.ctl_fd = -1,
|
||||
.ctl_fd_ack = -1
|
||||
.ctl_fd_ack = -1,
|
||||
.iostat_run = false,
|
||||
};
|
||||
|
||||
static bool cpus_map_matched(struct evsel *a, struct evsel *b)
|
||||
|
@ -239,6 +244,9 @@ static void evlist__check_cpu_maps(struct evlist *evlist)
|
|||
struct evsel *evsel, *pos, *leader;
|
||||
char buf[1024];
|
||||
|
||||
if (evlist__has_hybrid(evlist))
|
||||
evlist__warn_hybrid_group(evlist);
|
||||
|
||||
evlist__for_each_entry(evlist, evsel) {
|
||||
leader = evsel->leader;
|
||||
|
||||
|
@ -399,6 +407,9 @@ static int read_affinity_counters(struct timespec *rs)
|
|||
struct affinity affinity;
|
||||
int i, ncpus, cpu;
|
||||
|
||||
if (all_counters_use_bpf)
|
||||
return 0;
|
||||
|
||||
if (affinity__setup(&affinity) < 0)
|
||||
return -1;
|
||||
|
||||
|
@ -413,6 +424,8 @@ static int read_affinity_counters(struct timespec *rs)
|
|||
evlist__for_each_entry(evsel_list, counter) {
|
||||
if (evsel__cpu_iter_skip(counter, cpu))
|
||||
continue;
|
||||
if (evsel__is_bpf(counter))
|
||||
continue;
|
||||
if (!counter->err) {
|
||||
counter->err = read_counter_cpu(counter, rs,
|
||||
counter->cpu_iter - 1);
|
||||
|
@ -429,6 +442,9 @@ static int read_bpf_map_counters(void)
|
|||
int err;
|
||||
|
||||
evlist__for_each_entry(evsel_list, counter) {
|
||||
if (!evsel__is_bpf(counter))
|
||||
continue;
|
||||
|
||||
err = bpf_counter__read(counter);
|
||||
if (err)
|
||||
return err;
|
||||
|
@@ -439,14 +455,10 @@ static int read_bpf_map_counters(void)
 static void read_counters(struct timespec *rs)
 {
 	struct evsel *counter;
-	int err;

 	if (!stat_config.stop_read_counter) {
-		if (target__has_bpf(&target))
-			err = read_bpf_map_counters();
-		else
-			err = read_affinity_counters(rs);
-		if (err < 0)
+		if (read_bpf_map_counters() ||
+		    read_affinity_counters(rs))
 			return;
 	}

@@ -535,12 +547,13 @@ static int enable_counters(void)
 	struct evsel *evsel;
 	int err;

-	if (target__has_bpf(&target)) {
-		evlist__for_each_entry(evsel_list, evsel) {
-			err = bpf_counter__enable(evsel);
-			if (err)
-				return err;
-		}
+	evlist__for_each_entry(evsel_list, evsel) {
+		if (!evsel__is_bpf(evsel))
+			continue;
+
+		err = bpf_counter__enable(evsel);
+		if (err)
+			return err;
 	}

 	if (stat_config.initial_delay < 0) {
@@ -784,14 +797,20 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	if (affinity__setup(&affinity) < 0)
 		return -1;

-	if (target__has_bpf(&target)) {
-		evlist__for_each_entry(evsel_list, counter) {
-			if (bpf_counter__load(counter, &target))
-				return -1;
-		}
+	evlist__for_each_entry(evsel_list, counter) {
+		if (bpf_counter__load(counter, &target))
+			return -1;
+		if (!evsel__is_bpf(counter))
+			all_counters_use_bpf = false;
 	}

 	evlist__for_each_cpu (evsel_list, i, cpu) {
+		/*
+		 * bperf calls evsel__open_per_cpu() in bperf__load(), so
+		 * no need to call it again here.
+		 */
+		if (target.use_bpf)
+			break;
 		affinity__set(&affinity, cpu);

 		evlist__for_each_entry(evsel_list, counter) {
@@ -799,6 +818,8 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 				continue;
 			if (counter->reset_group || counter->errored)
 				continue;
+			if (evsel__is_bpf(counter))
+				continue;
 try_again:
 			if (create_perf_stat_counter(counter, &stat_config, &target,
 						     counter->cpu_iter - 1) < 0) {
@@ -925,15 +946,15 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	/*
 	 * Enable counters and exec the command:
 	 */
-	t0 = rdclock();
-	clock_gettime(CLOCK_MONOTONIC, &ref_time);
-
 	if (forks) {
 		evlist__start_workload(evsel_list);
 		err = enable_counters();
 		if (err)
 			return -1;

+		t0 = rdclock();
+		clock_gettime(CLOCK_MONOTONIC, &ref_time);
+
 		if (interval || timeout || evlist__ctlfd_initialized(evsel_list))
 			status = dispatch_events(forks, timeout, interval, &times);
 		if (child_pid != -1) {
@@ -954,6 +975,10 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 		err = enable_counters();
 		if (err)
 			return -1;
+
+		t0 = rdclock();
+		clock_gettime(CLOCK_MONOTONIC, &ref_time);
+
 		status = dispatch_events(forks, timeout, interval, &times);
 	}

@@ -1083,6 +1108,11 @@ void perf_stat__set_big_num(int set)
 	stat_config.big_num = (set != 0);
 }

+void perf_stat__set_no_csv_summary(int set)
+{
+	stat_config.no_csv_summary = (set != 0);
+}
+
 static int stat__set_big_num(const struct option *opt __maybe_unused,
 			     const char *s __maybe_unused, int unset)
 {
@@ -1146,6 +1176,10 @@ static struct option stat_options[] = {
 #ifdef HAVE_BPF_SKEL
 	OPT_STRING('b', "bpf-prog", &target.bpf_str, "bpf-prog-id",
 		   "stat events on existing bpf program id"),
+	OPT_BOOLEAN(0, "bpf-counters", &target.use_bpf,
+		    "use bpf program to count events"),
+	OPT_STRING(0, "bpf-attr-map", &target.attr_map, "attr-map-path",
+		   "path to perf_event_attr map"),
 #endif
 	OPT_BOOLEAN('a', "all-cpus", &target.system_wide,
 		    "system-wide collection from all CPUs"),
@@ -1235,6 +1269,8 @@ static struct option stat_options[] = {
 		    "threads of same physical core"),
 	OPT_BOOLEAN(0, "summary", &stat_config.summary,
 		    "print summary for interval mode"),
+	OPT_BOOLEAN(0, "no-csv-summary", &stat_config.no_csv_summary,
+		    "don't print 'summary' for CSV summary output"),
 	OPT_BOOLEAN(0, "quiet", &stat_config.quiet,
 		    "don't print output (useful with record)"),
 #ifdef HAVE_LIBPFM
@@ -1247,6 +1283,9 @@ static struct option stat_options[] = {
 		      "\t\t\t Optionally send control command completion ('ack\\n') to ack-fd descriptor.\n"
 		      "\t\t\t Alternatively, ctl-fifo / ack-fifo will be opened and used as ctl-fd / ack-fd.",
 		      parse_control_option),
+	OPT_CALLBACK_OPTARG(0, "iostat", &evsel_list, &stat_config, "default",
+			    "measure I/O performance metrics provided by arch/platform",
+			    iostat_parse),
 	OPT_END()
 };

@@ -1604,6 +1643,12 @@ static int add_default_attributes(void)
 	{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
 	{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },

 };
+	struct perf_event_attr default_sw_attrs[] = {
+	{ .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK },
+	{ .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES },
+	{ .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS },
+	{ .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS },
+};

 /*
@@ -1705,7 +1750,7 @@ static int add_default_attributes(void)
 	bzero(&errinfo, sizeof(errinfo));
 	if (transaction_run) {
 		/* Handle -T as -M transaction. Once platform specific metrics
-		 * support has been added to the json files, all archictures
+		 * support has been added to the json files, all architectures
 		 * will use this approach. To determine transaction support
 		 * on an architecture test for such a metric name.
 		 */
@@ -1841,6 +1886,28 @@ static int add_default_attributes(void)
 	}

 	if (!evsel_list->core.nr_entries) {
+		if (perf_pmu__has_hybrid()) {
+			const char *hybrid_str = "cycles,instructions,branches,branch-misses";
+
+			if (target__has_cpu(&target))
+				default_sw_attrs[0].config = PERF_COUNT_SW_CPU_CLOCK;
+
+			if (evlist__add_default_attrs(evsel_list,
+						      default_sw_attrs) < 0) {
+				return -1;
+			}
+
+			err = parse_events(evsel_list, hybrid_str, &errinfo);
+			if (err) {
+				fprintf(stderr,
+					"Cannot set up hybrid events %s: %d\n",
+					hybrid_str, err);
+				parse_events_print_error(&errinfo, hybrid_str);
+				return -1;
+			}
+			return err;
+		}
+
 		if (target__has_cpu(&target))
 			default_attrs0[0].config = PERF_COUNT_SW_CPU_CLOCK;

@@ -2320,6 +2387,17 @@ int cmd_stat(int argc, const char **argv)
 		goto out;
 	}

+	if (stat_config.iostat_run) {
+		status = iostat_prepare(evsel_list, &stat_config);
+		if (status)
+			goto out;
+		if (iostat_mode == IOSTAT_LIST) {
+			iostat_list(evsel_list, &stat_config);
+			goto out;
+		} else if (verbose)
+			iostat_list(evsel_list, &stat_config);
+	}
+
 	if (add_default_attributes())
 		goto out;

@@ -2357,6 +2435,9 @@ int cmd_stat(int argc, const char **argv)

 	evlist__check_cpu_maps(evsel_list);

+	if (perf_pmu__has_hybrid())
+		stat_config.no_merge = true;
+
 	/*
 	 * Initialize thread_map with comm names,
 	 * so we could print it out on output.
@@ -2459,7 +2540,7 @@ int cmd_stat(int argc, const char **argv)
 		/*
 		 * We synthesize the kernel mmap record just so that older tools
 		 * don't emit warnings about not being able to resolve symbols
-		 * due to /proc/sys/kernel/kptr_restrict settings and instear provide
+		 * due to /proc/sys/kernel/kptr_restrict settings and instead provide
 		 * a saner message about no samples being in the perf.data file.
 		 *
 		 * This also serves to suppress a warning about f_header.data.size == 0
@@ -2495,6 +2576,9 @@ int cmd_stat(int argc, const char **argv)
 	perf_stat__exit_aggr_mode();
 	evlist__free_stats(evsel_list);
 out:
+	if (stat_config.iostat_run)
+		iostat_release(evsel_list);
+
 	zfree(&stat_config.walltime_run);

 	if (smi_cost && smi_reset)
|
|
|
@@ -328,13 +328,13 @@ static void perf_top__print_sym_table(struct perf_top *top)
 	printf("%-*.*s\n", win_width, win_width, graph_dotted_line);

 	if (!top->record_opts.overwrite &&
-	    (hists->stats.nr_lost_warned !=
-	     hists->stats.nr_events[PERF_RECORD_LOST])) {
-		hists->stats.nr_lost_warned =
-			      hists->stats.nr_events[PERF_RECORD_LOST];
+	    (top->evlist->stats.nr_lost_warned !=
+	     top->evlist->stats.nr_events[PERF_RECORD_LOST])) {
+		top->evlist->stats.nr_lost_warned =
+			      top->evlist->stats.nr_events[PERF_RECORD_LOST];
 		color_fprintf(stdout, PERF_COLOR_RED,
 			      "WARNING: LOST %d chunks, Check IO/CPU overload",
-			      hists->stats.nr_lost_warned);
+			      top->evlist->stats.nr_lost_warned);
 		++printed;
 	}

@@ -852,11 +852,9 @@ static void
 perf_top__process_lost(struct perf_top *top, union perf_event *event,
 		       struct evsel *evsel)
 {
-	struct hists *hists = evsel__hists(evsel);
-
 	top->lost += event->lost.lost;
 	top->lost_total += event->lost.lost;
-	hists->stats.total_lost += event->lost.lost;
+	evsel->evlist->stats.total_lost += event->lost.lost;
 }

 static void
@@ -864,11 +862,9 @@ perf_top__process_lost_samples(struct perf_top *top,
 			       union perf_event *event,
 			       struct evsel *evsel)
 {
-	struct hists *hists = evsel__hists(evsel);
-
 	top->lost += event->lost_samples.lost;
 	top->lost_total += event->lost_samples.lost;
-	hists->stats.total_lost_samples += event->lost_samples.lost;
+	evsel->evlist->stats.total_lost_samples += event->lost_samples.lost;
 }

 static u64 last_timestamp;
@@ -1205,7 +1201,7 @@ static int deliver_event(struct ordered_events *qe,
 	} else if (event->header.type == PERF_RECORD_LOST_SAMPLES) {
 		perf_top__process_lost_samples(top, event, evsel);
 	} else if (event->header.type < PERF_RECORD_MAX) {
-		hists__inc_nr_events(evsel__hists(evsel), event->header.type);
+		events_stats__inc(&session->evlist->stats, event->header.type);
 		machine__process_event(machine, event, &sample);
 	} else
 		++session->evlist->stats.nr_unknown_events;
@@ -1607,7 +1603,7 @@ int cmd_top(int argc, const char **argv)
 	if (status) {
 		/*
 		 * Some arches do not provide a get_cpuid(), so just use pr_debug, otherwise
-		 * warn the user explicitely.
+		 * warn the user explicitly.
 		 */
 		eprintf(status == ENOSYS ? 1 : 0, verbose,
 			"Couldn't read the cpuid for this machine: %s\n",
|
|
|
@@ -153,6 +153,7 @@ check lib/ctype.c '-I "^EXPORT_SYMBOL" -I "^#include <linux/export.h>" -B
 check_2 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
 check_2 tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
 check_2 tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
+check_2 tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl

 for i in $BEAUTY_FILES; do
   beauty_check $i -B
|
|
|
@@ -14,6 +14,7 @@ perf-config mainporcelain common
 perf-evlist mainporcelain common
 perf-ftrace mainporcelain common
 perf-inject mainporcelain common
+perf-iostat mainporcelain common
 perf-kallsyms mainporcelain common
 perf-kmem mainporcelain common
 perf-kvm mainporcelain common
|
|
|
@@ -262,7 +262,7 @@ int sys_enter(struct syscall_enter_args *args)
 	/*
 	 * Jump to syscall specific augmenter, even if the default one,
 	 * "!raw_syscalls:unaugmented" that will just return 1 to return the
-	 * unagmented tracepoint payload.
+	 * unaugmented tracepoint payload.
 	 */
 	bpf_tail_call(args, &syscalls_sys_enter, augmented_args->args.syscall_nr);

@@ -282,7 +282,7 @@ int sys_exit(struct syscall_exit_args *args)
 	/*
 	 * Jump to syscall specific return augmenter, even if the default one,
 	 * "!raw_syscalls:unaugmented" that will just return 1 to return the
-	 * unagmented tracepoint payload.
+	 * unaugmented tracepoint payload.
 	 */
 	bpf_tail_call(args, &syscalls_sys_exit, exit_args.syscall_nr);
 	/*
|
|
|
@@ -390,7 +390,7 @@ jvmti_write_code(void *agent, char const *sym,
 	rec.p.total_size += size;

 	/*
-	 * If JVM is multi-threaded, nultiple concurrent calls to agent
+	 * If JVM is multi-threaded, multiple concurrent calls to agent
 	 * may be possible, so protect file writes
 	 */
 	flockfile(fp);
@@ -457,7 +457,7 @@ jvmti_write_debug_info(void *agent, uint64_t code,
 	rec.p.total_size = size;

 	/*
-	 * If JVM is multi-threaded, nultiple concurrent calls to agent
+	 * If JVM is multi-threaded, multiple concurrent calls to agent
 	 * may be possible, so protect file writes
 	 */
 	flockfile(fp);
|
|
|
@@ -0,0 +1,12 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# perf iostat
+# Alexander Antonov <alexander.antonov@linux.intel.com>
+
+if [[ "$1" == "list" ]] || [[ "$1" =~ ([a-f0-9A-F]{1,}):([a-f0-9A-F]{1,2})(,)? ]]; then
+        DELIMITER="="
+else
+        DELIMITER=" "
+fi
+
+perf stat --iostat$DELIMITER$*
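
Note on the wrapper above: its only job is to pick the delimiter placed between `--iostat` and the arguments. A minimal sketch of that decision follows, with the `perf` invocation replaced by a `printf` so it can be tried on any machine. The function names `iostat_delimiter` and `iostat_cmdline` are hypothetical and not part of the patch.

```shell
#!/bin/bash
# Hypothetical helper (not in the patch): print the delimiter the wrapper
# would insert between "--iostat" and its arguments.
iostat_delimiter() {
	# "list" or a "<hex>:<hex>" root-port spec is passed as an option
	# argument, so it is attached with "=". Anything else is taken to be
	# the workload command and is separated by a space.
	if [[ "$1" == "list" ]] || [[ "$1" =~ ([a-f0-9A-F]{1,}):([a-f0-9A-F]{1,2})(,)? ]]; then
		printf '%s' "="
	else
		printf '%s' " "
	fi
}

# Hypothetical helper: print the command line instead of running perf.
iostat_cmdline() {
	printf 'perf stat --iostat%s%s\n' "$(iostat_delimiter "$1")" "$*"
}
```

Under these assumptions, `iostat_cmdline list` prints `perf stat --iostat=list`, while `iostat_cmdline sleep 1` prints `perf stat --iostat sleep 1`. The regular expression is not anchored, so any first argument that contains a `hex:hex` pair is treated as a root-port list.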
|
|
@@ -209,12 +209,24 @@
         "EventName": "L2D_TLB_REFILL",
         "BriefDescription": "Attributable Level 2 data TLB refill"
     },
+    {
+        "PublicDescription": "Attributable Level 2 instruction TLB refill.",
+        "EventCode": "0x2E",
+        "EventName": "L2I_TLB_REFILL",
+        "BriefDescription": "Attributable Level 2 instruction TLB refill."
+    },
     {
         "PublicDescription": "Attributable Level 2 data or unified TLB access",
         "EventCode": "0x2F",
         "EventName": "L2D_TLB",
         "BriefDescription": "Attributable Level 2 data or unified TLB access"
     },
+    {
+        "PublicDescription": "Attributable Level 2 instruction TLB access.",
+        "EventCode": "0x30",
+        "EventName": "L2I_TLB",
+        "BriefDescription": "Attributable Level 2 instruction TLB access."
+    },
     {
         "PublicDescription": "Access to another socket in a multi-socket system",
         "EventCode": "0x31",
@@ -244,5 +256,221 @@
         "EventCode": "0x37",
         "EventName": "LL_CACHE_MISS_RD",
         "BriefDescription": "Last level cache miss, read"
+    },
+    {
+        "PublicDescription": "SIMD Instruction architecturally executed.",
+        "EventCode": "0x8000",
+        "EventName": "SIMD_INST_RETIRED",
+        "BriefDescription": "SIMD Instruction architecturally executed."
+    },
+    {
+        "PublicDescription": "Instruction architecturally executed, SVE.",
+        "EventCode": "0x8002",
+        "EventName": "SVE_INST_RETIRED",
+        "BriefDescription": "Instruction architecturally executed, SVE."
+    },
+    {
+        "PublicDescription": "Microarchitectural operation, Operations speculatively executed.",
+        "EventCode": "0x8008",
+        "EventName": "UOP_SPEC",
+        "BriefDescription": "Microarchitectural operation, Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "SVE Math accelerator Operations speculatively executed.",
+        "EventCode": "0x800E",
+        "EventName": "SVE_MATH_SPEC",
+        "BriefDescription": "SVE Math accelerator Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Floating-point Operations speculatively executed.",
+        "EventCode": "0x8010",
+        "EventName": "FP_SPEC",
+        "BriefDescription": "Floating-point Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Floating-point FMA Operations speculatively executed.",
+        "EventCode": "0x8028",
+        "EventName": "FP_FMA_SPEC",
+        "BriefDescription": "Floating-point FMA Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Floating-point reciprocal estimate Operations speculatively executed.",
+        "EventCode": "0x8034",
+        "EventName": "FP_RECPE_SPEC",
+        "BriefDescription": "Floating-point reciprocal estimate Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "floating-point convert Operations speculatively executed.",
+        "EventCode": "0x8038",
+        "EventName": "FP_CVT_SPEC",
+        "BriefDescription": "floating-point convert Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Advanced SIMD and SVE integer Operations speculatively executed.",
+        "EventCode": "0x8043",
+        "EventName": "ASE_SVE_INT_SPEC",
+        "BriefDescription": "Advanced SIMD and SVE integer Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "SVE predicated Operations speculatively executed.",
+        "EventCode": "0x8074",
+        "EventName": "SVE_PRED_SPEC",
+        "BriefDescription": "SVE predicated Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "SVE MOVPRFX Operations speculatively executed.",
+        "EventCode": "0x807C",
+        "EventName": "SVE_MOVPRFX_SPEC",
+        "BriefDescription": "SVE MOVPRFX Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "SVE MOVPRFX unfused Operations speculatively executed.",
+        "EventCode": "0x807F",
+        "EventName": "SVE_MOVPRFX_U_SPEC",
+        "BriefDescription": "SVE MOVPRFX unfused Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Advanced SIMD and SVE load Operations speculatively executed.",
+        "EventCode": "0x8085",
+        "EventName": "ASE_SVE_LD_SPEC",
+        "BriefDescription": "Advanced SIMD and SVE load Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Advanced SIMD and SVE store Operations speculatively executed.",
+        "EventCode": "0x8086",
+        "EventName": "ASE_SVE_ST_SPEC",
+        "BriefDescription": "Advanced SIMD and SVE store Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Prefetch Operations speculatively executed.",
+        "EventCode": "0x8087",
+        "EventName": "PRF_SPEC",
+        "BriefDescription": "Prefetch Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "General-purpose register load Operations speculatively executed.",
+        "EventCode": "0x8089",
+        "EventName": "BASE_LD_REG_SPEC",
+        "BriefDescription": "General-purpose register load Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "General-purpose register store Operations speculatively executed.",
+        "EventCode": "0x808A",
+        "EventName": "BASE_ST_REG_SPEC",
+        "BriefDescription": "General-purpose register store Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "SVE unpredicated load register Operations speculatively executed.",
+        "EventCode": "0x8091",
+        "EventName": "SVE_LDR_REG_SPEC",
+        "BriefDescription": "SVE unpredicated load register Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "SVE unpredicated store register Operations speculatively executed.",
+        "EventCode": "0x8092",
+        "EventName": "SVE_STR_REG_SPEC",
+        "BriefDescription": "SVE unpredicated store register Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "SVE load predicate register Operations speculatively executed.",
+        "EventCode": "0x8095",
+        "EventName": "SVE_LDR_PREG_SPEC",
+        "BriefDescription": "SVE load predicate register Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "SVE store predicate register Operations speculatively executed.",
+        "EventCode": "0x8096",
+        "EventName": "SVE_STR_PREG_SPEC",
+        "BriefDescription": "SVE store predicate register Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "SVE contiguous prefetch element Operations speculatively executed.",
+        "EventCode": "0x809F",
+        "EventName": "SVE_PRF_CONTIG_SPEC",
+        "BriefDescription": "SVE contiguous prefetch element Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Advanced SIMD and SVE contiguous load multiple vector Operations speculatively executed.",
+        "EventCode": "0x80A5",
+        "EventName": "ASE_SVE_LD_MULTI_SPEC",
+        "BriefDescription": "Advanced SIMD and SVE contiguous load multiple vector Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Advanced SIMD and SVE contiguous store multiple vector Operations speculatively executed.",
+        "EventCode": "0x80A6",
+        "EventName": "ASE_SVE_ST_MULTI_SPEC",
+        "BriefDescription": "Advanced SIMD and SVE contiguous store multiple vector Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "SVE gather-load Operations speculatively executed.",
+        "EventCode": "0x80AD",
+        "EventName": "SVE_LD_GATHER_SPEC",
+        "BriefDescription": "SVE gather-load Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "SVE scatter-store Operations speculatively executed.",
+        "EventCode": "0x80AE",
+        "EventName": "SVE_ST_SCATTER_SPEC",
+        "BriefDescription": "SVE scatter-store Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "SVE gather-prefetch Operations speculatively executed.",
+        "EventCode": "0x80AF",
+        "EventName": "SVE_PRF_GATHER_SPEC",
+        "BriefDescription": "SVE gather-prefetch Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "SVE First-fault load Operations speculatively executed.",
+        "EventCode": "0x80BC",
+        "EventName": "SVE_LDFF_SPEC",
+        "BriefDescription": "SVE First-fault load Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Scalable floating-point element Operations speculatively executed.",
+        "EventCode": "0x80C0",
+        "EventName": "FP_SCALE_OPS_SPEC",
+        "BriefDescription": "Scalable floating-point element Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Non-scalable floating-point element Operations speculatively executed.",
+        "EventCode": "0x80C1",
+        "EventName": "FP_FIXED_OPS_SPEC",
+        "BriefDescription": "Non-scalable floating-point element Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Scalable half-precision floating-point element Operations speculatively executed.",
+        "EventCode": "0x80C2",
+        "EventName": "FP_HP_SCALE_OPS_SPEC",
+        "BriefDescription": "Scalable half-precision floating-point element Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Non-scalable half-precision floating-point element Operations speculatively executed.",
+        "EventCode": "0x80C3",
+        "EventName": "FP_HP_FIXED_OPS_SPEC",
+        "BriefDescription": "Non-scalable half-precision floating-point element Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Scalable single-precision floating-point element Operations speculatively executed.",
+        "EventCode": "0x80C4",
+        "EventName": "FP_SP_SCALE_OPS_SPEC",
+        "BriefDescription": "Scalable single-precision floating-point element Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Non-scalable single-precision floating-point element Operations speculatively executed.",
+        "EventCode": "0x80C5",
+        "EventName": "FP_SP_FIXED_OPS_SPEC",
+        "BriefDescription": "Non-scalable single-precision floating-point element Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Scalable double-precision floating-point element Operations speculatively executed.",
+        "EventCode": "0x80C6",
+        "EventName": "FP_DP_SCALE_OPS_SPEC",
+        "BriefDescription": "Scalable double-precision floating-point element Operations speculatively executed."
+    },
+    {
+        "PublicDescription": "Non-scalable double-precision floating-point element Operations speculatively executed.",
+        "EventCode": "0x80C7",
+        "EventName": "FP_DP_FIXED_OPS_SPEC",
+        "BriefDescription": "Non-scalable double-precision floating-point element Operations speculatively executed."
     }
 ]
|
|
|
@@ -0,0 +1,8 @@
+[
+  {
+    "ArchStdEvent": "BR_MIS_PRED"
+  },
+  {
+    "ArchStdEvent": "BR_PRED"
+  }
+]
|
|
@@ -0,0 +1,62 @@
+[
+  {
+    "PublicDescription": "This event counts read transactions from tofu controller to measured CMG.",
+    "EventCode": "0x314",
+    "EventName": "BUS_READ_TOTAL_TOFU",
+    "BriefDescription": "This event counts read transactions from tofu controller to measured CMG."
+  },
+  {
+    "PublicDescription": "This event counts read transactions from PCI controller to measured CMG.",
+    "EventCode": "0x315",
+    "EventName": "BUS_READ_TOTAL_PCI",
+    "BriefDescription": "This event counts read transactions from PCI controller to measured CMG."
+  },
+  {
+    "PublicDescription": "This event counts read transactions from measured CMG local memory to measured CMG.",
+    "EventCode": "0x316",
+    "EventName": "BUS_READ_TOTAL_MEM",
+    "BriefDescription": "This event counts read transactions from measured CMG local memory to measured CMG."
+  },
+  {
+    "PublicDescription": "This event counts write transactions from measured CMG to CMG0, if measured CMG is not CMG0.",
+    "EventCode": "0x318",
+    "EventName": "BUS_WRITE_TOTAL_CMG0",
+    "BriefDescription": "This event counts write transactions from measured CMG to CMG0, if measured CMG is not CMG0."
+  },
+  {
+    "PublicDescription": "This event counts write transactions from measured CMG to CMG1, if measured CMG is not CMG1.",
+    "EventCode": "0x319",
+    "EventName": "BUS_WRITE_TOTAL_CMG1",
+    "BriefDescription": "This event counts write transactions from measured CMG to CMG1, if measured CMG is not CMG1."
+  },
+  {
+    "PublicDescription": "This event counts write transactions from measured CMG to CMG2, if measured CMG is not CMG2.",
+    "EventCode": "0x31A",
+    "EventName": "BUS_WRITE_TOTAL_CMG2",
+    "BriefDescription": "This event counts write transactions from measured CMG to CMG2, if measured CMG is not CMG2."
+  },
+  {
+    "PublicDescription": "This event counts write transactions from measured CMG to CMG3, if measured CMG is not CMG3.",
+    "EventCode": "0x31B",
+    "EventName": "BUS_WRITE_TOTAL_CMG3",
+    "BriefDescription": "This event counts write transactions from measured CMG to CMG3, if measured CMG is not CMG3."
+  },
+  {
+    "PublicDescription": "This event counts write transactions from measured CMG to tofu controller.",
+    "EventCode": "0x31C",
+    "EventName": "BUS_WRITE_TOTAL_TOFU",
+    "BriefDescription": "This event counts write transactions from measured CMG to tofu controller."
+  },
+  {
+    "PublicDescription": "This event counts write transactions from measured CMG to PCI controller.",
+    "EventCode": "0x31D",
+    "EventName": "BUS_WRITE_TOTAL_PCI",
+    "BriefDescription": "This event counts write transactions from measured CMG to PCI controller."
+  },
+  {
+    "PublicDescription": "This event counts write transactions from measured CMG to measured CMG local memory.",
+    "EventCode": "0x31E",
+    "EventName": "BUS_WRITE_TOTAL_MEM",
+    "BriefDescription": "This event counts write transactions from measured CMG to measured CMG local memory."
+  }
+]
|
|
@ -0,0 +1,128 @@
|
|||
[
|
||||
{
|
||||
"ArchStdEvent": "L1I_CACHE_REFILL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1I_TLB_REFILL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_CACHE_REFILL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_CACHE"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_TLB_REFILL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1I_CACHE"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_CACHE_WB"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_CACHE"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_CACHE_REFILL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_CACHE_WB"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_TLB_REFILL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2I_TLB_REFILL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_TLB"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2I_TLB"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "This event counts L1D_CACHE_REFILL caused by software or hardware prefetch.",
|
||||
"EventCode": "0x49",
|
||||
"EventName": "L1D_CACHE_REFILL_PRF",
|
||||
"BriefDescription": "This event counts L1D_CACHE_REFILL caused by software or hardware prefetch."
|
||||
},
|
||||
{
|
||||
"PublicDescription": "This event counts L2D_CACHE_REFILL caused by software or hardware prefetch.",
|
||||
"EventCode": "0x59",
|
||||
"EventName": "L2D_CACHE_REFILL_PRF",
|
||||
"BriefDescription": "This event counts L2D_CACHE_REFILL caused by software or hardware prefetch."
|
||||
},
|
||||
{
|
||||
"PublicDescription": "This event counts L1D_CACHE_REFILL caused by demand access.",
|
||||
"EventCode": "0x200",
|
||||
"EventName": "L1D_CACHE_REFILL_DM",
|
    "BriefDescription": "This event counts L1D_CACHE_REFILL caused by demand access."
  },
  {
    "PublicDescription": "This event counts L1D_CACHE_REFILL caused by hardware prefetch.",
    "EventCode": "0x202",
    "EventName": "L1D_CACHE_REFILL_HWPRF",
    "BriefDescription": "This event counts L1D_CACHE_REFILL caused by hardware prefetch."
  },
  {
    "PublicDescription": "This event counts outstanding L1D cache miss requests per cycle.",
    "EventCode": "0x208",
    "EventName": "L1_MISS_WAIT",
    "BriefDescription": "This event counts outstanding L1D cache miss requests per cycle."
  },
  {
    "PublicDescription": "This event counts outstanding L1I cache miss requests per cycle.",
    "EventCode": "0x209",
    "EventName": "L1I_MISS_WAIT",
    "BriefDescription": "This event counts outstanding L1I cache miss requests per cycle."
  },
  {
    "PublicDescription": "This event counts L2D_CACHE_REFILL caused by demand access.",
    "EventCode": "0x300",
    "EventName": "L2D_CACHE_REFILL_DM",
    "BriefDescription": "This event counts L2D_CACHE_REFILL caused by demand access."
  },
  {
    "PublicDescription": "This event counts L2D_CACHE_REFILL caused by hardware prefetch.",
    "EventCode": "0x302",
    "EventName": "L2D_CACHE_REFILL_HWPRF",
    "BriefDescription": "This event counts L2D_CACHE_REFILL caused by hardware prefetch."
  },
  {
    "PublicDescription": "This event counts outstanding L2 cache miss requests per cycle.",
    "EventCode": "0x308",
    "EventName": "L2_MISS_WAIT",
    "BriefDescription": "This event counts outstanding L2 cache miss requests per cycle."
  },
  {
    "PublicDescription": "This event counts the number of times of L2 cache miss.",
    "EventCode": "0x309",
    "EventName": "L2_MISS_COUNT",
    "BriefDescription": "This event counts the number of times of L2 cache miss."
  },
  {
    "PublicDescription": "This event counts operations where demand access hits an L2 cache refill buffer allocated by software or hardware prefetch.",
    "EventCode": "0x325",
    "EventName": "L2D_SWAP_DM",
    "BriefDescription": "This event counts operations where demand access hits an L2 cache refill buffer allocated by software or hardware prefetch."
  },
  {
    "PublicDescription": "This event counts operations where software or hardware prefetch hits an L2 cache refill buffer allocated by demand access.",
    "EventCode": "0x326",
    "EventName": "L2D_CACHE_MIBMCH_PRF",
    "BriefDescription": "This event counts operations where software or hardware prefetch hits an L2 cache refill buffer allocated by demand access."
  },
  {
    "PublicDescription": "This event counts operations where demand access hits an L2 cache refill buffer allocated by software or hardware prefetch.",
    "EventCode": "0x396",
    "EventName": "L2D_CACHE_SWAP_LOCAL",
    "BriefDescription": "This event counts operations where demand access hits an L2 cache refill buffer allocated by software or hardware prefetch."
  },
  {
    "PublicDescription": "This event counts energy consumption per cycle of L2 cache.",
    "EventCode": "0x3E0",
    "EventName": "EA_L2",
    "BriefDescription": "This event counts energy consumption per cycle of L2 cache."
  }
]
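Entries such as the ones in the hunk above pair an `EventName` with a hexadecimal `EventCode`. As a rough illustration only (it is not part of the patch), the Python sketch below loads a list in this shape and builds the `rNNN` raw-event strings that `perf stat -e` accepts. The helper name `raw_event_specs` and the inline sample data are assumptions made for the example; entries that carry only `ArchStdEvent` are skipped because their codes are defined elsewhere.

```python
import json

# Sample input: two entries copied from the hunk above plus one
# ArchStdEvent entry, which has no EventCode of its own.
SAMPLE = """
[
  {"EventCode": "0x202", "EventName": "L1D_CACHE_REFILL_HWPRF"},
  {"EventCode": "0x308", "EventName": "L2_MISS_WAIT"},
  {"ArchStdEvent": "CPU_CYCLES"}
]
"""


def raw_event_specs(events):
    """Map EventName -> perf raw event string (hypothetical helper)."""
    specs = {}
    for entry in events:
        code = entry.get("EventCode")
        name = entry.get("EventName")
        if code is None or name is None:
            continue  # ArchStdEvent entries are resolved elsewhere
        # perf's raw syntax is 'r' followed by the event number in hex.
        specs[name] = "r{:x}".format(int(code, 16))
    return specs


specs = raw_event_specs(json.loads(SAMPLE))
for name, spec in sorted(specs.items()):
    print(name, spec)
```

A string produced this way could then be passed to something like `perf stat -e r202 -- <workload>`; on a given machine the symbolic event name is normally the more convenient choice once these JSON files are built into perf.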
@@ -0,0 +1,5 @@
[
  {
    "ArchStdEvent": "CPU_CYCLES"
  }
]
@@ -0,0 +1,29 @@
[
  {
    "ArchStdEvent": "EXC_TAKEN"
  },
  {
    "ArchStdEvent": "EXC_UNDEF"
  },
  {
    "ArchStdEvent": "EXC_SVC"
  },
  {
    "ArchStdEvent": "EXC_PABORT"
  },
  {
    "ArchStdEvent": "EXC_DABORT"
  },
  {
    "ArchStdEvent": "EXC_IRQ"
  },
  {
    "ArchStdEvent": "EXC_FIQ"
  },
  {
    "ArchStdEvent": "EXC_SMC"
  },
  {
    "ArchStdEvent": "EXC_HVC"
  }
]
@@ -0,0 +1,131 @@
[
  {
    "ArchStdEvent": "SW_INCR"
  },
  {
    "ArchStdEvent": "INST_RETIRED"
  },
  {
    "ArchStdEvent": "EXC_RETURN"
  },
  {
    "ArchStdEvent": "CID_WRITE_RETIRED"
  },
  {
    "ArchStdEvent": "INST_SPEC"
  },
  {
    "ArchStdEvent": "LDREX_SPEC"
  },
  {
    "ArchStdEvent": "STREX_SPEC"
  },
  {
    "ArchStdEvent": "LD_SPEC"
  },
  {
    "ArchStdEvent": "ST_SPEC"
  },
  {
    "ArchStdEvent": "LDST_SPEC"
  },
  {
    "ArchStdEvent": "DP_SPEC"
  },
  {
    "ArchStdEvent": "ASE_SPEC"
  },
  {
    "ArchStdEvent": "VFP_SPEC"
  },
  {
    "ArchStdEvent": "PC_WRITE_SPEC"
  },
  {
    "ArchStdEvent": "CRYPTO_SPEC"
  },
  {
    "ArchStdEvent": "BR_IMMED_SPEC"
  },
  {
    "ArchStdEvent": "BR_RETURN_SPEC"
  },
  {
    "ArchStdEvent": "BR_INDIRECT_SPEC"
  },
  {
    "ArchStdEvent": "ISB_SPEC"
  },
  {
    "ArchStdEvent": "DSB_SPEC"
  },
  {
    "ArchStdEvent": "DMB_SPEC"
  },
  {
    "PublicDescription": "This event counts architecturally executed zero blocking operations due to the 'DC ZVA' instruction.",
    "EventCode": "0x9F",
    "EventName": "DCZVA_SPEC",
    "BriefDescription": "This event counts architecturally executed zero blocking operations due to the 'DC ZVA' instruction."
  },
  {
    "PublicDescription": "This event counts architecturally executed floating-point move operations.",
    "EventCode": "0x105",
    "EventName": "FP_MV_SPEC",
    "BriefDescription": "This event counts architecturally executed floating-point move operations."
  },
  {
    "PublicDescription": "This event counts architecturally executed operations that using predicate register.",
    "EventCode": "0x108",
    "EventName": "PRD_SPEC",
    "BriefDescription": "This event counts architecturally executed operations that using predicate register."
  },
  {
    "PublicDescription": "This event counts architecturally executed inter-element manipulation operations.",
    "EventCode": "0x109",
    "EventName": "IEL_SPEC",
    "BriefDescription": "This event counts architecturally executed inter-element manipulation operations."
  },
  {
    "PublicDescription": "This event counts architecturally executed inter-register manipulation operations.",
    "EventCode": "0x10A",
    "EventName": "IREG_SPEC",
    "BriefDescription": "This event counts architecturally executed inter-register manipulation operations."
  },
  {
    "PublicDescription": "This event counts architecturally executed NOSIMD load operations that using SIMD&FP registers.",
    "EventCode": "0x112",
    "EventName": "FP_LD_SPEC",
    "BriefDescription": "This event counts architecturally executed NOSIMD load operations that using SIMD&FP registers."
  },
  {
    "PublicDescription": "This event counts architecturally executed NOSIMD store operations that using SIMD&FP registers.",
    "EventCode": "0x113",
    "EventName": "FP_ST_SPEC",
    "BriefDescription": "This event counts architecturally executed NOSIMD store operations that using SIMD&FP registers."
  },
  {
    "PublicDescription": "This event counts architecturally executed SIMD broadcast floating-point load operations.",
    "EventCode": "0x11A",
    "EventName": "BC_LD_SPEC",
    "BriefDescription": "This event counts architecturally executed SIMD broadcast floating-point load operations."
  },
  {
    "PublicDescription": "This event counts architecturally executed instructions, excluding the MOVPRFX instruction.",
    "EventCode": "0x121",
    "EventName": "EFFECTIVE_INST_SPEC",
    "BriefDescription": "This event counts architecturally executed instructions, excluding the MOVPRFX instruction."
  },
  {
    "PublicDescription": "This event counts architecturally executed operations that uses 'pre-index' as its addressing mode.",
    "EventCode": "0x123",
    "EventName": "PRE_INDEX_SPEC",
    "BriefDescription": "This event counts architecturally executed operations that uses 'pre-index' as its addressing mode."
  },
  {
    "PublicDescription": "This event counts architecturally executed operations that uses 'post-index' as its addressing mode.",
    "EventCode": "0x124",
    "EventName": "POST_INDEX_SPEC",
    "BriefDescription": "This event counts architecturally executed operations that uses 'post-index' as its addressing mode."
  }
]
@@ -0,0 +1,8 @@
[
  {
    "PublicDescription": "This event counts energy consumption per cycle of CMG local memory.",
    "EventCode": "0x3E8",
    "EventName": "EA_MEMORY",
    "BriefDescription": "This event counts energy consumption per cycle of CMG local memory."
  }
]
@@ -0,0 +1,188 @@
[
  {
    "PublicDescription": "This event counts the occurrence count of the micro-operation split.",
    "EventCode": "0x139",
    "EventName": "UOP_SPLIT",
    "BriefDescription": "This event counts the occurrence count of the micro-operation split."
  },
  {
    "PublicDescription": "This event counts every cycle that no operation was committed because the oldest and uncommitted load/store/prefetch operation waits for memory access.",
    "EventCode": "0x180",
    "EventName": "LD_COMP_WAIT_L2_MISS",
    "BriefDescription": "This event counts every cycle that no operation was committed because the oldest and uncommitted load/store/prefetch operation waits for memory access."
  },
  {
    "PublicDescription": "This event counts every cycle that no instruction was committed because the oldest and uncommitted integer load operation waits for memory access.",
    "EventCode": "0x181",
    "EventName": "LD_COMP_WAIT_L2_MISS_EX",
    "BriefDescription": "This event counts every cycle that no instruction was committed because the oldest and uncommitted integer load operation waits for memory access."
  },
  {
    "PublicDescription": "This event counts every cycle that no instruction was committed because the oldest and uncommitted load/store/prefetch operation waits for L2 cache access.",
    "EventCode": "0x182",
    "EventName": "LD_COMP_WAIT_L1_MISS",
    "BriefDescription": "This event counts every cycle that no instruction was committed because the oldest and uncommitted load/store/prefetch operation waits for L2 cache access."
  },
  {
    "PublicDescription": "This event counts every cycle that no instruction was committed because the oldest and uncommitted integer load operation waits for L2 cache access.",
    "EventCode": "0x183",
    "EventName": "LD_COMP_WAIT_L1_MISS_EX",
    "BriefDescription": "This event counts every cycle that no instruction was committed because the oldest and uncommitted integer load operation waits for L2 cache access."
  },
  {
    "PublicDescription": "This event counts every cycle that no instruction was committed because the oldest and uncommitted load/store/prefetch operation waits for L1D cache, L2 cache and memory access.",
    "EventCode": "0x184",
    "EventName": "LD_COMP_WAIT",
    "BriefDescription": "This event counts every cycle that no instruction was committed because the oldest and uncommitted load/store/prefetch operation waits for L1D cache, L2 cache and memory access."
  },
  {
    "PublicDescription": "This event counts every cycle that no instruction was committed because the oldest and uncommitted integer load operation waits for L1D cache, L2 cache and memory access.",
    "EventCode": "0x185",
    "EventName": "LD_COMP_WAIT_EX",
    "BriefDescription": "This event counts every cycle that no instruction was committed because the oldest and uncommitted integer load operation waits for L1D cache, L2 cache and memory access."
  },
  {
    "PublicDescription": "This event counts every cycle that no instruction was committed due to the lack of an available prefetch port.",
    "EventCode": "0x186",
    "EventName": "LD_COMP_WAIT_PFP_BUSY",
    "BriefDescription": "This event counts every cycle that no instruction was committed due to the lack of an available prefetch port."
  },
  {
    "PublicDescription": "This event counts the LD_COMP_WAIT_PFP_BUSY caused by an integer load operation.",
    "EventCode": "0x187",
    "EventName": "LD_COMP_WAIT_PFP_BUSY_EX",
    "BriefDescription": "This event counts the LD_COMP_WAIT_PFP_BUSY caused by an integer load operation."
  },
  {
    "PublicDescription": "This event counts the LD_COMP_WAIT_PFP_BUSY caused by a software prefetch instruction.",
    "EventCode": "0x188",
    "EventName": "LD_COMP_WAIT_PFP_BUSY_SWPF",
    "BriefDescription": "This event counts the LD_COMP_WAIT_PFP_BUSY caused by a software prefetch instruction."
  },
  {
    "PublicDescription": "This event counts every cycle that no instruction was committed and the oldest and uncommitted instruction is an integer or floating-point/SIMD instruction.",
    "EventCode": "0x189",
    "EventName": "EU_COMP_WAIT",
    "BriefDescription": "This event counts every cycle that no instruction was committed and the oldest and uncommitted instruction is an integer or floating-point/SIMD instruction."
  },
  {
    "PublicDescription": "This event counts every cycle that no instruction was committed and the oldest and uncommitted instruction is a floating-point/SIMD instruction.",
    "EventCode": "0x18A",
    "EventName": "FL_COMP_WAIT",
    "BriefDescription": "This event counts every cycle that no instruction was committed and the oldest and uncommitted instruction is a floating-point/SIMD instruction."
  },
  {
    "PublicDescription": "This event counts every cycle that no instruction was committed and the oldest and uncommitted instruction is a branch instruction.",
    "EventCode": "0x18B",
    "EventName": "BR_COMP_WAIT",
    "BriefDescription": "This event counts every cycle that no instruction was committed and the oldest and uncommitted instruction is a branch instruction."
  },
  {
    "PublicDescription": "This event counts every cycle that no instruction was committed because the CSE is empty.",
    "EventCode": "0x18C",
    "EventName": "ROB_EMPTY",
    "BriefDescription": "This event counts every cycle that no instruction was committed because the CSE is empty."
  },
  {
    "PublicDescription": "This event counts every cycle that no instruction was committed because the CSE is empty and the store port (SP) is full.",
    "EventCode": "0x18D",
    "EventName": "ROB_EMPTY_STQ_BUSY",
    "BriefDescription": "This event counts every cycle that no instruction was committed because the CSE is empty and the store port (SP) is full."
  },
  {
    "PublicDescription": "This event counts every cycle that the instruction unit is halted by the WFE/WFI instruction.",
    "EventCode": "0x18E",
    "EventName": "WFE_WFI_CYCLE",
    "BriefDescription": "This event counts every cycle that the instruction unit is halted by the WFE/WFI instruction."
  },
  {
    "PublicDescription": "This event counts every cycle that no instruction was committed, but counts at the time when commits MOVPRFX only.",
    "EventCode": "0x190",
    "EventName": "_0INST_COMMIT",
    "BriefDescription": "This event counts every cycle that no instruction was committed, but counts at the time when commits MOVPRFX only."
  },
  {
    "PublicDescription": "This event counts every cycle that one instruction is committed.",
    "EventCode": "0x191",
    "EventName": "_1INST_COMMIT",
    "BriefDescription": "This event counts every cycle that one instruction is committed."
  },
  {
    "PublicDescription": "This event counts every cycle that two instructions are committed.",
    "EventCode": "0x192",
    "EventName": "_2INST_COMMIT",
    "BriefDescription": "This event counts every cycle that two instructions are committed."
  },
  {
    "PublicDescription": "This event counts every cycle that three instructions are committed.",
    "EventCode": "0x193",
    "EventName": "_3INST_COMMIT",
    "BriefDescription": "This event counts every cycle that three instructions are committed."
  },
  {
    "PublicDescription": "This event counts every cycle that four instructions are committed.",
    "EventCode": "0x194",
    "EventName": "_4INST_COMMIT",
    "BriefDescription": "This event counts every cycle that four instructions are committed."
  },
  {
    "PublicDescription": "This event counts every cycle that only any micro-operations are committed.",
    "EventCode": "0x198",
    "EventName": "UOP_ONLY_COMMIT",
    "BriefDescription": "This event counts every cycle that only any micro-operations are committed."
  },
  {
    "PublicDescription": "This event counts every cycle that only the MOVPRFX instruction is committed.",
    "EventCode": "0x199",
    "EventName": "SINGLE_MOVPRFX_COMMIT",
    "BriefDescription": "This event counts every cycle that only the MOVPRFX instruction is committed."
  },
  {
    "PublicDescription": "This event counts energy consumption per cycle of core.",
    "EventCode": "0x1E0",
    "EventName": "EA_CORE",
    "BriefDescription": "This event counts energy consumption per cycle of core."
  },
  {
    "PublicDescription": "This event counts streaming prefetch requests to L1D cache generated by hardware prefetcher.",
    "EventCode": "0x230",
    "EventName": "L1HWPF_STREAM_PF",
    "BriefDescription": "This event counts streaming prefetch requests to L1D cache generated by hardware prefetcher."
  },
  {
    "PublicDescription": "This event counts allocation type prefetch injection requests to L1D cache generated by hardware prefetcher.",
    "EventCode": "0x231",
    "EventName": "L1HWPF_INJ_ALLOC_PF",
    "BriefDescription": "This event counts allocation type prefetch injection requests to L1D cache generated by hardware prefetcher."
  },
  {
    "PublicDescription": "This event counts non-allocation type prefetch injection requests to L1D cache generated by hardware prefetcher.",
    "EventCode": "0x232",
    "EventName": "L1HWPF_INJ_NOALLOC_PF",
    "BriefDescription": "This event counts non-allocation type prefetch injection requests to L1D cache generated by hardware prefetcher."
  },
  {
    "PublicDescription": "This event counts streaming prefetch requests to L2 cache generated by hardware prefecher.",
    "EventCode": "0x233",
    "EventName": "L2HWPF_STREAM_PF",
    "BriefDescription": "This event counts streaming prefetch requests to L2 cache generated by hardware prefecher."
  },
  {
    "PublicDescription": "This event counts allocation type prefetch injection requests to L2 cache generated by hardware prefetcher.",
    "EventCode": "0x234",
    "EventName": "L2HWPF_INJ_ALLOC_PF",
    "BriefDescription": "This event counts allocation type prefetch injection requests to L2 cache generated by hardware prefetcher."
  },
  {
    "PublicDescription": "This event counts non-allocation type prefetch injection requests to L2 cache generated by hardware prefetcher.",
    "EventCode": "0x235",
    "EventName": "L2HWPF_INJ_NOALLOC_PF",
    "BriefDescription": "This event counts non-allocation type prefetch injection requests to L2 cache generated by hardware prefetcher."
  },
  {
    "PublicDescription": "This event counts prefetch requests to L2 cache generated by the other causes.",
    "EventCode": "0x236",
    "EventName": "L2HWPF_OTHER",
    "BriefDescription": "This event counts prefetch requests to L2 cache generated by the other causes."
  }
]
@@ -0,0 +1,194 @@
[
  {
    "ArchStdEvent": "STALL_FRONTEND"
  },
  {
    "ArchStdEvent": "STALL_BACKEND"
  },
  {
    "PublicDescription": "This event counts valid cycles of EAGA pipeline.",
    "EventCode": "0x1A0",
    "EventName": "EAGA_VAL",
    "BriefDescription": "This event counts valid cycles of EAGA pipeline."
  },
  {
    "PublicDescription": "This event counts valid cycles of EAGB pipeline.",
    "EventCode": "0x1A1",
    "EventName": "EAGB_VAL",
    "BriefDescription": "This event counts valid cycles of EAGB pipeline."
  },
  {
    "PublicDescription": "This event counts valid cycles of EXA pipeline.",
    "EventCode": "0x1A2",
    "EventName": "EXA_VAL",
    "BriefDescription": "This event counts valid cycles of EXA pipeline."
  },
  {
    "PublicDescription": "This event counts valid cycles of EXB pipeline.",
    "EventCode": "0x1A3",
    "EventName": "EXB_VAL",
    "BriefDescription": "This event counts valid cycles of EXB pipeline."
  },
  {
    "PublicDescription": "This event counts valid cycles of FLA pipeline.",
    "EventCode": "0x1A4",
    "EventName": "FLA_VAL",
    "BriefDescription": "This event counts valid cycles of FLA pipeline."
  },
  {
    "PublicDescription": "This event counts valid cycles of FLB pipeline.",
    "EventCode": "0x1A5",
    "EventName": "FLB_VAL",
    "BriefDescription": "This event counts valid cycles of FLB pipeline."
  },
  {
    "PublicDescription": "This event counts valid cycles of PRX pipeline.",
    "EventCode": "0x1A6",
    "EventName": "PRX_VAL",
    "BriefDescription": "This event counts valid cycles of PRX pipeline."
  },
  {
    "PublicDescription": "This event counts the number of 1's in the predicate bits of request in FLA pipeline, where it is corrected so that it becomes 16 when all bits are 1.",
    "EventCode": "0x1B4",
    "EventName": "FLA_VAL_PRD_CNT",
    "BriefDescription": "This event counts the number of 1's in the predicate bits of request in FLA pipeline, where it is corrected so that it becomes 16 when all bits are 1."
  },
  {
    "PublicDescription": "This event counts the number of 1's in the predicate bits of request in FLB pipeline, where it is corrected so that it becomes 16 when all bits are 1.",
    "EventCode": "0x1B5",
    "EventName": "FLB_VAL_PRD_CNT",
    "BriefDescription": "This event counts the number of 1's in the predicate bits of request in FLB pipeline, where it is corrected so that it becomes 16 when all bits are 1."
  },
  {
    "PublicDescription": "This event counts valid cycles of L1D cache pipeline#0.",
    "EventCode": "0x240",
    "EventName": "L1_PIPE0_VAL",
    "BriefDescription": "This event counts valid cycles of L1D cache pipeline#0."
  },
  {
    "PublicDescription": "This event counts valid cycles of L1D cache pipeline#1.",
    "EventCode": "0x241",
    "EventName": "L1_PIPE1_VAL",
    "BriefDescription": "This event counts valid cycles of L1D cache pipeline#1."
  },
  {
    "PublicDescription": "This event counts requests in L1D cache pipeline#0 that its sce bit of tagged address is 1.",
    "EventCode": "0x250",
    "EventName": "L1_PIPE0_VAL_IU_TAG_ADRS_SCE",
    "BriefDescription": "This event counts requests in L1D cache pipeline#0 that its sce bit of tagged address is 1."
  },
  {
    "PublicDescription": "This event counts requests in L1D cache pipeline#0 that its pfe bit of tagged address is 1.",
    "EventCode": "0x251",
    "EventName": "L1_PIPE0_VAL_IU_TAG_ADRS_PFE",
    "BriefDescription": "This event counts requests in L1D cache pipeline#0 that its pfe bit of tagged address is 1."
  },
  {
    "PublicDescription": "This event counts requests in L1D cache pipeline#1 that its sce bit of tagged address is 1.",
    "EventCode": "0x252",
    "EventName": "L1_PIPE1_VAL_IU_TAG_ADRS_SCE",
    "BriefDescription": "This event counts requests in L1D cache pipeline#1 that its sce bit of tagged address is 1."
  },
  {
    "PublicDescription": "This event counts requests in L1D cache pipeline#1 that its pfe bit of tagged address is 1.",
    "EventCode": "0x253",
    "EventName": "L1_PIPE1_VAL_IU_TAG_ADRS_PFE",
    "BriefDescription": "This event counts requests in L1D cache pipeline#1 that its pfe bit of tagged address is 1."
  },
  {
    "PublicDescription": "This event counts completed requests in L1D cache pipeline#0.",
    "EventCode": "0x260",
    "EventName": "L1_PIPE0_COMP",
    "BriefDescription": "This event counts completed requests in L1D cache pipeline#0."
  },
  {
    "PublicDescription": "This event counts completed requests in L1D cache pipeline#1.",
    "EventCode": "0x261",
    "EventName": "L1_PIPE1_COMP",
    "BriefDescription": "This event counts completed requests in L1D cache pipeline#1."
  },
  {
    "PublicDescription": "This event counts completed requests in L1I cache pipeline.",
    "EventCode": "0x268",
    "EventName": "L1I_PIPE_COMP",
    "BriefDescription": "This event counts completed requests in L1I cache pipeline."
  },
  {
    "PublicDescription": "This event counts valid cycles of L1I cache pipeline.",
    "EventCode": "0x269",
    "EventName": "L1I_PIPE_VAL",
    "BriefDescription": "This event counts valid cycles of L1I cache pipeline."
  },
  {
    "PublicDescription": "This event counts aborted requests in L1D pipelines that due to store-load interlock.",
    "EventCode": "0x274",
    "EventName": "L1_PIPE_ABORT_STLD_INTLK",
    "BriefDescription": "This event counts aborted requests in L1D pipelines that due to store-load interlock."
  },
  {
    "PublicDescription": "This event counts requests in L1D cache pipeline#0 that its sector cache ID is not 0.",
    "EventCode": "0x2A0",
    "EventName": "L1_PIPE0_VAL_IU_NOT_SEC0",
    "BriefDescription": "This event counts requests in L1D cache pipeline#0 that its sector cache ID is not 0."
  },
  {
    "PublicDescription": "This event counts requests in L1D cache pipeline#1 that its sector cache ID is not 0.",
    "EventCode": "0x2A1",
    "EventName": "L1_PIPE1_VAL_IU_NOT_SEC0",
    "BriefDescription": "This event counts requests in L1D cache pipeline#1 that its sector cache ID is not 0."
  },
  {
    "PublicDescription": "This event counts the number of times where 2 elements of the gather instructions became 2 flows because 2 elements could not be combined.",
    "EventCode": "0x2B0",
    "EventName": "L1_PIPE_COMP_GATHER_2FLOW",
    "BriefDescription": "This event counts the number of times where 2 elements of the gather instructions became 2 flows because 2 elements could not be combined."
  },
  {
    "PublicDescription": "This event counts the number of times where 2 elements of the gather instructions became 1 flow because 2 elements could be combined.",
    "EventCode": "0x2B1",
    "EventName": "L1_PIPE_COMP_GATHER_1FLOW",
    "BriefDescription": "This event counts the number of times where 2 elements of the gather instructions became 1 flow because 2 elements could be combined."
  },
  {
    "PublicDescription": "This event counts the number of times where 2 elements of the gather instructions became 0 flow because both predicate values are 0.",
    "EventCode": "0x2B2",
    "EventName": "L1_PIPE_COMP_GATHER_0FLOW",
    "BriefDescription": "This event counts the number of times where 2 elements of the gather instructions became 0 flow because both predicate values are 0."
  },
  {
    "PublicDescription": "This event counts the number of flows of the scatter instructions.",
    "EventCode": "0x2B3",
    "EventName": "L1_PIPE_COMP_SCATTER_1FLOW",
    "BriefDescription": "This event counts the number of flows of the scatter instructions."
  },
  {
    "PublicDescription": "This event counts the number of 1's in the predicate bits of request in L1D cache pipeline#0, where it is corrected so that it becomes 16 when all bits are 1.",
    "EventCode": "0x2B8",
    "EventName": "L1_PIPE0_COMP_PRD_CNT",
    "BriefDescription": "This event counts the number of 1's in the predicate bits of request in L1D cache pipeline#0, where it is corrected so that it becomes 16 when all bits are 1."
  },
  {
    "PublicDescription": "This event counts the number of 1's in the predicate bits of request in L1D cache pipeline#1, where it is corrected so that it becomes 16 when all bits are 1.",
    "EventCode": "0x2B9",
    "EventName": "L1_PIPE1_COMP_PRD_CNT",
    "BriefDescription": "This event counts the number of 1's in the predicate bits of request in L1D cache pipeline#1, where it is corrected so that it becomes 16 when all bits are 1."
  },
  {
    "PublicDescription": "This event counts valid cycles of L2 cache pipeline.",
    "EventCode": "0x330",
    "EventName": "L2_PIPE_VAL",
    "BriefDescription": "This event counts valid cycles of L2 cache pipeline."
  },
  {
    "PublicDescription": "This event counts completed requests in L2 cache pipeline.",
    "EventCode": "0x350",
    "EventName": "L2_PIPE_COMP_ALL",
    "BriefDescription": "This event counts completed requests in L2 cache pipeline."
  },
  {
    "PublicDescription": "This event counts operations where software or hardware prefetch hits an L2 cache refill buffer allocated by demand access.",
    "EventCode": "0x370",
    "EventName": "L2_PIPE_COMP_PF_L2MIB_MCH",
    "BriefDescription": "This event counts operations where software or hardware prefetch hits an L2 cache refill buffer allocated by demand access."
  }
]
@@ -0,0 +1,110 @@
[
  {
    "ArchStdEvent": "SIMD_INST_RETIRED"
  },
  {
    "ArchStdEvent": "SVE_INST_RETIRED"
  },
  {
    "ArchStdEvent": "UOP_SPEC"
  },
  {
    "ArchStdEvent": "SVE_MATH_SPEC"
  },
  {
    "ArchStdEvent": "FP_SPEC"
  },
  {
    "ArchStdEvent": "FP_FMA_SPEC"
  },
  {
    "ArchStdEvent": "FP_RECPE_SPEC"
  },
  {
    "ArchStdEvent": "FP_CVT_SPEC"
  },
  {
    "ArchStdEvent": "ASE_SVE_INT_SPEC"
  },
  {
    "ArchStdEvent": "SVE_PRED_SPEC"
  },
  {
    "ArchStdEvent": "SVE_MOVPRFX_SPEC"
  },
  {
    "ArchStdEvent": "SVE_MOVPRFX_U_SPEC"
  },
  {
    "ArchStdEvent": "ASE_SVE_LD_SPEC"
  },
  {
    "ArchStdEvent": "ASE_SVE_ST_SPEC"
  },
  {
    "ArchStdEvent": "PRF_SPEC"
  },
  {
    "ArchStdEvent": "BASE_LD_REG_SPEC"
  },
  {
    "ArchStdEvent": "BASE_ST_REG_SPEC"
  },
  {
    "ArchStdEvent": "SVE_LDR_REG_SPEC"
  },
  {
    "ArchStdEvent": "SVE_STR_REG_SPEC"
  },
  {
    "ArchStdEvent": "SVE_LDR_PREG_SPEC"
  },
  {
    "ArchStdEvent": "SVE_STR_PREG_SPEC"
  },
  {
    "ArchStdEvent": "SVE_PRF_CONTIG_SPEC"
  },
  {
    "ArchStdEvent": "ASE_SVE_LD_MULTI_SPEC"
  },
  {
    "ArchStdEvent": "ASE_SVE_ST_MULTI_SPEC"
  },
  {
    "ArchStdEvent": "SVE_LD_GATHER_SPEC"
  },
  {
    "ArchStdEvent": "SVE_ST_SCATTER_SPEC"
  },
  {
    "ArchStdEvent": "SVE_PRF_GATHER_SPEC"
  },
  {
    "ArchStdEvent": "SVE_LDFF_SPEC"
  },
  {
    "ArchStdEvent": "FP_SCALE_OPS_SPEC"
  },
  {
    "ArchStdEvent": "FP_FIXED_OPS_SPEC"
  },
  {
    "ArchStdEvent": "FP_HP_SCALE_OPS_SPEC"
  },
  {
    "ArchStdEvent": "FP_HP_FIXED_OPS_SPEC"
  },
  {
    "ArchStdEvent": "FP_SP_SCALE_OPS_SPEC"
  },
  {
    "ArchStdEvent": "FP_SP_FIXED_OPS_SPEC"
  },
  {
    "ArchStdEvent": "FP_DP_SCALE_OPS_SPEC"
  },
  {
    "ArchStdEvent": "FP_DP_FIXED_OPS_SPEC"
  }
]
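The metrics hunk that follows defines its level-1 top-down metrics as ratios over a budget of 4 slots per cycle, with `backend_bound` taken as whatever the other three leave over. As a worked illustration only, the Python sketch below evaluates those four `MetricExpr` strings by hand. The counter values are invented to keep the arithmetic easy to follow (they are not measurements), and the function name `topdown_l1` is hypothetical.

```python
def topdown_l1(fetch_bubble, inst_spec, inst_retired, cpu_cycles, width=4):
    """Evaluate the four TopDownL1 expressions for one set of counter values.

    `width` is the 4 that appears in the MetricExpr strings.
    """
    slots = width * cpu_cycles
    frontend_bound = fetch_bubble / slots
    bad_speculation = (inst_spec - inst_retired) / slots
    retiring = inst_retired / slots
    # backend_bound is defined as the remainder of the slot budget.
    backend_bound = 1 - (frontend_bound + bad_speculation + retiring)
    return {
        "frontend_bound": frontend_bound,
        "bad_speculation": bad_speculation,
        "retiring": retiring,
        "backend_bound": backend_bound,
    }


# Made-up counter values, not taken from any real run.
result = topdown_l1(fetch_bubble=400, inst_spec=1200,
                    inst_retired=1000, cpu_cycles=1000)
for name, value in result.items():
    print("{:16s} {:.2f}".format(name, value))
```

Because the fourth metric is a remainder, the four values always sum to 1, which makes a quick sanity check when reading `perf stat` output for the `TopDownL1` group.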
@@ -0,0 +1,233 @@
[
  {
    "MetricExpr": "FETCH_BUBBLE / (4 * CPU_CYCLES)",
    "PublicDescription": "Frontend bound L1 topdown metric",
    "BriefDescription": "Frontend bound L1 topdown metric",
    "MetricGroup": "TopDownL1",
    "MetricName": "frontend_bound"
  },
  {
    "MetricExpr": "(INST_SPEC - INST_RETIRED) / (4 * CPU_CYCLES)",
    "PublicDescription": "Bad Speculation L1 topdown metric",
    "BriefDescription": "Bad Speculation L1 topdown metric",
    "MetricGroup": "TopDownL1",
    "MetricName": "bad_speculation"
  },
  {
    "MetricExpr": "INST_RETIRED / (CPU_CYCLES * 4)",
    "PublicDescription": "Retiring L1 topdown metric",
    "BriefDescription": "Retiring L1 topdown metric",
    "MetricGroup": "TopDownL1",
    "MetricName": "retiring"
  },
  {
    "MetricExpr": "1 - (frontend_bound + bad_speculation + retiring)",
    "PublicDescription": "Backend Bound L1 topdown metric",
    "BriefDescription": "Backend Bound L1 topdown metric",
    "MetricGroup": "TopDownL1",
    "MetricName": "backend_bound"
  },
  {
    "MetricExpr": "armv8_pmuv3_0@event\\=0x201d@ / CPU_CYCLES",
    "PublicDescription": "Fetch latency bound L2 topdown metric",
    "BriefDescription": "Fetch latency bound L2 topdown metric",
    "MetricGroup": "TopDownL2",
    "MetricName": "fetch_latency_bound"
  },
  {
    "MetricExpr": "frontend_bound - fetch_latency_bound",
    "PublicDescription": "Fetch bandwidth bound L2 topdown metric",
    "BriefDescription": "Fetch bandwidth bound L2 topdown metric",
    "MetricGroup": "TopDownL2",
    "MetricName": "fetch_bandwidth_bound"
  },
  {
    "MetricExpr": "(bad_speculation * BR_MIS_PRED) / (BR_MIS_PRED + armv8_pmuv3_0@event\\=0x2013@)",
    "PublicDescription": "Branch mispredicts L2 topdown metric",
    "BriefDescription": "Branch mispredicts L2 topdown metric",
    "MetricGroup": "TopDownL2",
    "MetricName": "branch_mispredicts"
  },
  {
    "MetricExpr": "bad_speculation - branch_mispredicts",
    "PublicDescription": "Machine clears L2 topdown metric",
    "BriefDescription": "Machine clears L2 topdown metric",
    "MetricGroup": "TopDownL2",
    "MetricName": "machine_clears"
  },
  {
    "MetricExpr": "(EXE_STALL_CYCLE - (MEM_STALL_ANYLOAD + armv8_pmuv3_0@event\\=0x7005@)) / CPU_CYCLES",
    "PublicDescription": "Core bound L2 topdown metric",
    "BriefDescription": "Core bound L2 topdown metric",
    "MetricGroup": "TopDownL2",
    "MetricName": "core_bound"
  },
  {
    "MetricExpr": "(MEM_STALL_ANYLOAD + armv8_pmuv3_0@event\\=0x7005@) / CPU_CYCLES",
    "PublicDescription": "Memory bound L2 topdown metric",
    "BriefDescription": "Memory bound L2 topdown metric",
    "MetricGroup": "TopDownL2",
    "MetricName": "memory_bound"
  },
  {
    "MetricExpr": "(((L2I_TLB - L2I_TLB_REFILL) * 15) + (L2I_TLB_REFILL * 100)) / CPU_CYCLES",
    "PublicDescription": "Idle by itlb miss L3 topdown metric",
    "BriefDescription": "Idle by itlb miss L3 topdown metric",
    "MetricGroup": "TopDownL3",
    "MetricName": "idle_by_itlb_miss"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "(((L2I_CACHE - L2I_CACHE_REFILL) * 15) + (L2I_CACHE_REFILL * 100)) / CPU_CYCLES",
|
||||
"PublicDescription": "Idle by icache miss L3 topdown metric",
|
||||
"BriefDescription": "Idle by icache miss L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "idle_by_icache_miss"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "(BR_MIS_PRED * 5) / CPU_CYCLES",
|
||||
"PublicDescription": "BP misp flush L3 topdown metric",
|
||||
"BriefDescription": "BP misp flush L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "bp_misp_flush"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "(armv8_pmuv3_0@event\\=0x2013@ * 5) / CPU_CYCLES",
|
||||
"PublicDescription": "OOO flush L3 topdown metric",
|
||||
"BriefDescription": "OOO flush L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "ooo_flush"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "(armv8_pmuv3_0@event\\=0x1001@ * 5) / CPU_CYCLES",
|
||||
"PublicDescription": "Static predictor flush L3 topdown metric",
|
||||
"BriefDescription": "Static predictor flush L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "sp_flush"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "armv8_pmuv3_0@event\\=0x1010@ / BR_MIS_PRED",
|
||||
"PublicDescription": "Indirect branch L3 topdown metric",
|
||||
"BriefDescription": "Indirect branch L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "indirect_branch"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "(armv8_pmuv3_0@event\\=0x1014@ + armv8_pmuv3_0@event\\=0x1018@) / BR_MIS_PRED",
|
||||
"PublicDescription": "Push branch L3 topdown metric",
|
||||
"BriefDescription": "Push branch L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "push_branch"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "armv8_pmuv3_0@event\\=0x100c@ / BR_MIS_PRED",
|
||||
"PublicDescription": "Pop branch L3 topdown metric",
|
||||
"BriefDescription": "Pop branch L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "pop_branch"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "(BR_MIS_PRED - armv8_pmuv3_0@event\\=0x1010@ - armv8_pmuv3_0@event\\=0x1014@ - armv8_pmuv3_0@event\\=0x1018@ - armv8_pmuv3_0@event\\=0x100c@) / BR_MIS_PRED",
|
||||
"PublicDescription": "Other branch L3 topdown metric",
|
||||
"BriefDescription": "Other branch L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "other_branch"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "armv8_pmuv3_0@event\\=0x2012@ / armv8_pmuv3_0@event\\=0x2013@",
|
||||
"PublicDescription": "Nuke flush L3 topdown metric",
|
||||
"BriefDescription": "Nuke flush L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "nuke_flush"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "1 - nuke_flush",
|
||||
"PublicDescription": "Other flush L3 topdown metric",
|
||||
"BriefDescription": "Other flush L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "other_flush"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "armv8_pmuv3_0@event\\=0x2010@ / CPU_CYCLES",
|
||||
"PublicDescription": "Sync stall L3 topdown metric",
|
||||
"BriefDescription": "Sync stall L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "sync_stall"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "armv8_pmuv3_0@event\\=0x2004@ / CPU_CYCLES",
|
||||
"PublicDescription": "Rob stall L3 topdown metric",
|
||||
"BriefDescription": "Rob stall L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "rob_stall"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "(armv8_pmuv3_0@event\\=0x2006@ + armv8_pmuv3_0@event\\=0x2007@ + armv8_pmuv3_0@event\\=0x2008@) / CPU_CYCLES",
|
||||
"PublicDescription": "Ptag stall L3 topdown metric",
|
||||
"BriefDescription": "Ptag stall L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "ptag_stall"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "armv8_pmuv3_0@event\\=0x201e@ / CPU_CYCLES",
|
||||
"PublicDescription": "SaveOpQ stall L3 topdown metric",
|
||||
"BriefDescription": "SaveOpQ stall L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "saveopq_stall"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "armv8_pmuv3_0@event\\=0x2005@ / CPU_CYCLES",
|
||||
"PublicDescription": "PC buffer stall L3 topdown metric",
|
||||
"BriefDescription": "PC buffer stall L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "pc_buffer_stall"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "armv8_pmuv3_0@event\\=0x7002@ / CPU_CYCLES",
|
||||
"PublicDescription": "Divider L3 topdown metric",
|
||||
"BriefDescription": "Divider L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "divider"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "armv8_pmuv3_0@event\\=0x7003@ / CPU_CYCLES",
|
||||
"PublicDescription": "FSU stall L3 topdown metric",
|
||||
"BriefDescription": "FSU stall L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "fsu_stall"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "core_bound - divider - fsu_stall",
|
||||
"PublicDescription": "EXE ports util L3 topdown metric",
|
||||
"BriefDescription": "EXE ports util L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "exe_ports_util"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "(MEM_STALL_ANYLOAD - MEM_STALL_L1MISS) / CPU_CYCLES",
|
||||
"PublicDescription": "L1 bound L3 topdown metric",
|
||||
"BriefDescription": "L1 bound L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "l1_bound"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "(MEM_STALL_L1MISS - MEM_STALL_L2MISS) / CPU_CYCLES",
|
||||
"PublicDescription": "L2 bound L3 topdown metric",
|
||||
"BriefDescription": "L2 bound L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "l2_bound"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "MEM_STALL_L2MISS / CPU_CYCLES",
|
||||
"PublicDescription": "Mem bound L3 topdown metric",
|
||||
"BriefDescription": "Mem bound L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "mem_bound"
|
||||
},
|
||||
{
|
||||
"MetricExpr": "armv8_pmuv3_0@event\\=0x7005@ / CPU_CYCLES",
|
||||
"PublicDescription": "Store bound L3 topdown metric",
|
||||
"BriefDescription": "Store bound L3 topdown metric",
|
||||
"MetricGroup": "TopDownL3",
|
||||
"MetricName": "store_bound"
|
||||
},
|
||||
]
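The TopDownL1 expressions in the metric file above split pipeline slots into four shares that sum to 1, and the L3 memory entries split `memory_bound` in a similar way. The following is a minimal sketch of that arithmetic, not the way perf itself evaluates `MetricExpr`. The function names, the argument names, and the `store_stall` label for raw event `0x7005` are illustrative assumptions; only the formulas come from the hunk.

```python
def topdown_l1(fetch_bubble, inst_spec, inst_retired, cpu_cycles, width=4):
    """Evaluate the four TopDownL1 formulas from raw counter values.

    `width` is the constant 4 that multiplies CPU_CYCLES in the expressions.
    """
    if cpu_cycles <= 0:
        raise ValueError("cpu_cycles must be positive")
    slots = width * cpu_cycles
    frontend_bound = fetch_bubble / slots
    bad_speculation = (inst_spec - inst_retired) / slots
    retiring = inst_retired / slots
    # backend_bound is defined as the remainder, so the four shares sum to 1.
    backend_bound = 1 - (frontend_bound + bad_speculation + retiring)
    return {
        "frontend_bound": frontend_bound,
        "bad_speculation": bad_speculation,
        "retiring": retiring,
        "backend_bound": backend_bound,
    }


def memory_bound_breakdown(anyload, l1miss, l2miss, store_stall, cpu_cycles):
    """Evaluate memory_bound (L2) and its four L3 components.

    `store_stall` stands in for the raw event 0x7005 used in the hunk
    (hypothetical name, chosen here for readability).
    """
    if cpu_cycles <= 0:
        raise ValueError("cpu_cycles must be positive")
    return {
        "memory_bound": (anyload + store_stall) / cpu_cycles,
        "l1_bound": (anyload - l1miss) / cpu_cycles,
        "l2_bound": (l1miss - l2miss) / cpu_cycles,
        "mem_bound": l2miss / cpu_cycles,
        "store_bound": store_stall / cpu_cycles,
    }
```

Because the L3 terms telescope, `l1_bound + l2_bound + mem_bound + store_bound` equals `memory_bound` for any input, which makes a quick sanity check on collected counts.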
|
|
@ -20,5 +20,6 @@
|
|||
0x00000000410fd0c0,v1,arm/cortex-a76-n1,core
|
||||
0x00000000420f5160,v1,cavium/thunderx2,core
|
||||
0x00000000430f0af0,v1,cavium/thunderx2,core
|
||||
0x00000000460f0010,v1,fujitsu/a64fx,core
|
||||
0x00000000480fd010,v1,hisilicon/hip08,core
|
||||
0x00000000500f0000,v1,ampere/emag,core
|
||||
|
|
|
|
@ -15,3 +15,4 @@
|
|||
# Power8 entries
|
||||
004[bcd][[:xdigit:]]{4},1,power8,core
|
||||
004e[[:xdigit:]]{4},1,power9,core
|
||||
0080[[:xdigit:]]{4},1,power10,core
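The first column of each mapfile row above is a POSIX extended regular expression that is matched against the CPU identifier. The sketch below shows how such a lookup might behave. It assumes the identifier is an 8-digit hexadecimal string and that the whole string must match. Python's `re` module has no `[[:xdigit:]]` class, so the sketch rewrites it first. The function name and the sample identifiers are hypothetical.

```python
import re

# (pattern, event directory) pairs copied from the mapfile rows in the hunk.
MAPFILE_ROWS = [
    ("004[bcd][[:xdigit:]]{4}", "power8"),
    ("004e[[:xdigit:]]{4}", "power9"),
    ("0080[[:xdigit:]]{4}", "power10"),
]


def lookup_event_dir(cpuid):
    """Return the event directory of the first row whose pattern matches cpuid."""
    for pattern, event_dir in MAPFILE_ROWS:
        # Translate the POSIX character class into an equivalent Python one.
        py_pattern = pattern.replace("[[:xdigit:]]", "[0-9A-Fa-f]")
        if re.fullmatch(py_pattern, cpuid):
            return event_dir
    return None
```

With these rows, an identifier starting with `0080` selects the new `power10` directory, and an identifier that matches no row returns `None`.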
|
||||
|
|
|
|
@ -0,0 +1,47 @@
|
|||
[
|
||||
{
|
||||
"EventCode": "1003C",
|
||||
"EventName": "PM_EXEC_STALL_DMISS_L2L3",
|
||||
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from either the local L2 or local L3."
|
||||
},
|
||||
{
|
||||
"EventCode": "34056",
|
||||
"EventName": "PM_EXEC_STALL_LOAD_FINISH",
|
||||
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was finishing a load after its data was reloaded from a data source beyond the local L1; cycles in which the LSU was processing an L1-hit; cycles in which the NTF instruction merged with another load in the LMQ."
|
||||
},
|
||||
{
|
||||
"EventCode": "3006C",
|
||||
"EventName": "PM_RUN_CYC_SMT2_MODE",
|
||||
"BriefDescription": "Cycles when this thread's run latch is set and the core is in SMT2 mode."
|
||||
},
|
||||
{
|
||||
"EventCode": "300F4",
|
||||
"EventName": "PM_RUN_INST_CMPL_CONC",
|
||||
"BriefDescription": "PowerPC instructions completed by this thread when all threads in the core had the run-latch set."
|
||||
},
|
||||
{
|
||||
"EventCode": "4C016",
|
||||
"EventName": "PM_EXEC_STALL_DMISS_L2L3_CONFLICT",
|
||||
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from the local L2 or local L3, with a dispatch conflict."
|
||||
},
|
||||
{
|
||||
"EventCode": "4D014",
|
||||
"EventName": "PM_EXEC_STALL_LOAD",
|
||||
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was a load instruction executing in the Load Store Unit."
|
||||
},
|
||||
{
|
||||
"EventCode": "4D016",
|
||||
"EventName": "PM_EXEC_STALL_PTESYNC",
|
||||
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was a PTESYNC instruction executing in the Load Store Unit."
|
||||
},
|
||||
{
|
||||
"EventCode": "401EA",
|
||||
"EventName": "PM_THRESH_EXC_128",
|
||||
"BriefDescription": "Threshold counter exceeded a value of 128."
|
||||
},
|
||||
{
|
||||
"EventCode": "400F6",
|
||||
"EventName": "PM_BR_MPRED_CMPL",
|
||||
"BriefDescription": "A mispredicted branch completed. Includes direction and target."
|
||||
}
|
||||
]
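Each entry in these event files pairs an `EventName` with an `EventCode` written as a hexadecimal string without a `0x` prefix. As a rough illustration, the sketch below parses two entries copied from the hunk above and derives a raw `rNNN` event descriptor for each. The helper name is hypothetical, and passing the result to `perf stat -e` assumes the code is accepted as a raw event on the target machine.

```python
import json

# Two entries copied from the hunk above; a real file holds many more.
EVENTS_JSON = """
[
  {
    "EventCode": "401EA",
    "EventName": "PM_THRESH_EXC_128",
    "BriefDescription": "Threshold counter exceeded a value of 128."
  },
  {
    "EventCode": "400F6",
    "EventName": "PM_BR_MPRED_CMPL",
    "BriefDescription": "A mispredicted branch completed. Includes direction and target."
  }
]
"""


def raw_event_specs(text):
    """Map each EventName to a raw descriptor of the form 'r<hex code>'."""
    specs = {}
    for entry in json.loads(text):
        code = int(entry["EventCode"], 16)  # codes are hex without a 0x prefix
        specs[entry["EventName"]] = "r" + format(code, "x")
    return specs
```

For example, `raw_event_specs(EVENTS_JSON)["PM_BR_MPRED_CMPL"]` gives the string `r400f6`.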
|
|
@ -0,0 +1,7 @@
|
|||
[
|
||||
{
|
||||
"EventCode": "4016E",
|
||||
"EventName": "PM_THRESH_NOT_MET",
|
||||
"BriefDescription": "Threshold counter did not meet threshold."
|
||||
}
|
||||
]
|
|
@ -0,0 +1,217 @@
|
|||
[
|
||||
{
|
||||
"EventCode": "10004",
|
||||
"EventName": "PM_EXEC_STALL_TRANSLATION",
|
||||
"BriefDescription": "Cycles in which the oldest instruction in the pipeline suffered a TLB miss or ERAT miss and waited for it to resolve."
|
||||
},
|
||||
{
|
||||
"EventCode": "10010",
|
||||
"EventName": "PM_PMC4_OVERFLOW",
|
||||
"BriefDescription": "The event selected for PMC4 caused the event counter to overflow."
|
||||
},
|
||||
{
|
||||
"EventCode": "10020",
|
||||
"EventName": "PM_PMC4_REWIND",
|
||||
"BriefDescription": "The speculative event selected for PMC4 rewinds and the counter for PMC4 is not charged."
|
||||
},
|
||||
{
|
||||
"EventCode": "10038",
|
||||
"EventName": "PM_DISP_STALL_TRANSLATION",
|
||||
"BriefDescription": "Cycles when dispatch was stalled for this thread because the MMU was handling a translation miss."
|
||||
},
|
||||
{
|
||||
"EventCode": "1003A",
|
||||
"EventName": "PM_DISP_STALL_BR_MPRED_IC_L2",
|
||||
"BriefDescription": "Cycles when dispatch was stalled while the instruction was fetched from the local L2 after suffering a branch mispredict."
|
||||
},
|
||||
{
|
||||
"EventCode": "1E050",
|
||||
"EventName": "PM_DISP_STALL_HELD_STF_MAPPER_CYC",
|
||||
"BriefDescription": "Cycles in which the NTC instruction is held at dispatch because the STF mapper/SRB was full. Includes GPR (count, link, tar), VSR, VMR, FPR."
|
||||
},
|
||||
{
|
||||
"EventCode": "1F054",
|
||||
"EventName": "PM_DTLB_HIT",
|
||||
"BriefDescription": "The PTE required by the instruction was resident in the TLB (data TLB access). When MMCR1[16]=0 this event counts only demand hits. When MMCR1[16]=1 this event includes demand and prefetch. Applies to both HPT and RPT."
|
||||
},
|
||||
{
|
||||
"EventCode": "101E8",
|
||||
"EventName": "PM_THRESH_EXC_256",
|
||||
"BriefDescription": "Threshold counter exceeded a count of 256."
|
||||
},
|
||||
{
|
||||
"EventCode": "101EC",
|
||||
"EventName": "PM_THRESH_MET",
|
||||
"BriefDescription": "Threshold exceeded."
|
||||
},
|
||||
{
|
||||
"EventCode": "100F2",
|
||||
"EventName": "PM_1PLUS_PPC_CMPL",
|
||||
"BriefDescription": "Cycles in which at least one instruction is completed by this thread."
|
||||
},
|
||||
{
|
||||
"EventCode": "100F6",
|
||||
"EventName": "PM_IERAT_MISS",
|
||||
"BriefDescription": "IERAT Reloaded to satisfy an IERAT miss. All page sizes are counted by this event."
|
||||
},
|
||||
{
|
||||
"EventCode": "100F8",
|
||||
"EventName": "PM_DISP_STALL_CYC",
|
||||
"BriefDescription": "Cycles the ICT has no itags assigned to this thread (no instructions were dispatched during these cycles)."
|
||||
},
|
||||
{
|
||||
"EventCode": "20114",
|
||||
"EventName": "PM_MRK_L2_RC_DISP",
|
||||
"BriefDescription": "Marked instruction RC dispatched in L2."
|
||||
},
|
||||
{
|
||||
"EventCode": "2C010",
|
||||
"EventName": "PM_EXEC_STALL_LSU",
|
||||
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was executing in the Load Store Unit. This does not include simple fixed point instructions."
|
||||
},
|
||||
{
|
||||
"EventCode": "2C016",
|
||||
"EventName": "PM_DISP_STALL_IERAT_ONLY_MISS",
|
||||
"BriefDescription": "Cycles when dispatch was stalled while waiting to resolve an instruction ERAT miss."
|
||||
},
|
||||
{
|
||||
"EventCode": "2C01E",
|
||||
"EventName": "PM_DISP_STALL_BR_MPRED_IC_L3",
|
||||
"BriefDescription": "Cycles when dispatch was stalled while the instruction was fetched from the local L3 after suffering a branch mispredict."
|
||||
},
|
||||
{
|
||||
"EventCode": "2D01A",
|
||||
"EventName": "PM_DISP_STALL_IC_MISS",
|
||||
"BriefDescription": "Cycles when dispatch was stalled for this thread due to an Icache Miss."
|
||||
},
|
||||
{
|
||||
"EventCode": "2D01C",
|
||||
"EventName": "PM_CMPL_STALL_STCX",
|
||||
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was a stcx waiting for resolution from the nest before completing."
|
||||
},
|
||||
{
|
||||
"EventCode": "2E018",
|
||||
"EventName": "PM_DISP_STALL_FETCH",
|
||||
"BriefDescription": "Cycles when dispatch was stalled for this thread because Fetch was being held."
|
||||
},
|
||||
{
|
||||
"EventCode": "2E01A",
|
||||
"EventName": "PM_DISP_STALL_HELD_XVFC_MAPPER_CYC",
|
||||
"BriefDescription": "Cycles in which the NTC instruction is held at dispatch because the XVFC mapper/SRB was full."
|
||||
},
|
||||
{
|
||||
"EventCode": "2C142",
|
||||
"EventName": "PM_MRK_XFER_FROM_SRC_PMC2",
|
||||
"BriefDescription": "For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[15:27]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
|
||||
},
|
||||
{
|
||||
"EventCode": "24050",
|
||||
"EventName": "PM_IOPS_DISP",
|
||||
"BriefDescription": "Internal Operations dispatched. PM_IOPS_DISP / PM_INST_DISP will show the average number of internal operations per PowerPC instruction."
|
||||
},
|
||||
{
|
||||
"EventCode": "2405E",
|
||||
"EventName": "PM_ISSUE_CANCEL",
|
||||
"BriefDescription": "An instruction issued and the issue was later cancelled. Only one cancel per PowerPC instruction."
|
||||
},
|
||||
{
|
||||
"EventCode": "200FA",
|
||||
"EventName": "PM_BR_TAKEN_CMPL",
|
||||
"BriefDescription": "Branch Taken instruction completed."
|
||||
},
|
||||
{
|
||||
"EventCode": "30012",
|
||||
"EventName": "PM_FLUSH_COMPLETION",
|
||||
"BriefDescription": "The instruction that was next to complete (oldest in the pipeline) did not complete because it suffered a flush."
|
||||
},
|
||||
{
|
||||
"EventCode": "30014",
|
||||
"EventName": "PM_EXEC_STALL_STORE",
|
||||
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was a store instruction executing in the Load Store Unit."
|
||||
},
|
||||
{
|
||||
"EventCode": "30018",
|
||||
"EventName": "PM_DISP_STALL_HELD_SCOREBOARD_CYC",
|
||||
"BriefDescription": "Cycles in which the NTC instruction is held at dispatch while waiting on the Scoreboard. This event combines VSCR and FPSCR together."
|
||||
},
|
||||
{
|
||||
"EventCode": "30026",
|
||||
"EventName": "PM_EXEC_STALL_STORE_MISS",
|
||||
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was a store whose cache line was not resident in the L1 and was waiting for allocation of the missing line into the L1."
|
||||
},
|
||||
{
|
||||
"EventCode": "3012A",
|
||||
"EventName": "PM_MRK_L2_RC_DONE",
|
||||
"BriefDescription": "L2 RC machine completed the transaction for the marked instruction."
|
||||
},
|
||||
{
|
||||
"EventCode": "3F046",
|
||||
"EventName": "PM_ITLB_HIT_1G",
|
||||
"BriefDescription": "Instruction TLB hit (IERAT reload) page size 1G, which implies Radix Page Table translation is in use. When MMCR1[17]=0 this event counts only for demand misses. When MMCR1[17]=1 this event includes demand misses and prefetches."
|
||||
},
|
||||
{
|
||||
"EventCode": "34058",
|
||||
"EventName": "PM_DISP_STALL_BR_MPRED_ICMISS",
|
||||
"BriefDescription": "Cycles when dispatch was stalled after a mispredicted branch resulted in an instruction cache miss."
|
||||
},
|
||||
{
|
||||
"EventCode": "3D05C",
|
||||
"EventName": "PM_DISP_STALL_HELD_RENAME_CYC",
|
||||
"BriefDescription": "Cycles in which the NTC instruction is held at dispatch because the mapper/SRB was full. Includes GPR (count, link, tar), VSR, VMR, FPR and XVFC."
|
||||
},
|
||||
{
|
||||
"EventCode": "3E052",
|
||||
"EventName": "PM_DISP_STALL_IC_L3",
|
||||
"BriefDescription": "Cycles when dispatch was stalled while the instruction was fetched from the local L3."
|
||||
},
|
||||
{
|
||||
"EventCode": "3E054",
|
||||
"EventName": "PM_LD_MISS_L1",
|
||||
"BriefDescription": "Load Missed L1, counted at execution time (can be greater than loads finished). LMQ merges are not included in this count (i.e. if a load instruction misses on an address that is already allocated on the LMQ, this event will not increment for that load). Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load."
|
||||
},
|
||||
{
|
||||
"EventCode": "301EA",
|
||||
"EventName": "PM_THRESH_EXC_1024",
|
||||
"BriefDescription": "Threshold counter exceeded a value of 1024."
|
||||
},
|
||||
{
|
||||
"EventCode": "300FA",
|
||||
"EventName": "PM_INST_FROM_L3MISS",
|
||||
"BriefDescription": "The processor's instruction cache was reloaded from a source other than the local core's L1, L2, or L3 due to a demand miss."
|
||||
},
|
||||
{
|
||||
"EventCode": "40006",
|
||||
"EventName": "PM_ISSUE_KILL",
|
||||
"BriefDescription": "Cycles in which an instruction or group of instructions were cancelled after being issued. This event increments once per occurrence, regardless of how many instructions are included in the issue group."
|
||||
},
|
||||
{
|
||||
"EventCode": "40116",
|
||||
"EventName": "PM_MRK_LARX_FIN",
|
||||
"BriefDescription": "Marked load and reserve instruction (LARX) finished. LARX and STCX are instructions used to acquire a lock."
|
||||
},
|
||||
{
|
||||
"EventCode": "4C010",
|
||||
"EventName": "PM_DISP_STALL_BR_MPRED_IC_L3MISS",
|
||||
"BriefDescription": "Cycles when dispatch was stalled while the instruction was fetched from sources beyond the local L3 after suffering a mispredicted branch."
|
||||
},
|
||||
{
|
||||
"EventCode": "4D01E",
|
||||
"EventName": "PM_DISP_STALL_BR_MPRED",
|
||||
"BriefDescription": "Cycles when dispatch was stalled for this thread due to a mispredicted branch."
|
||||
},
|
||||
{
|
||||
"EventCode": "4E010",
|
||||
"EventName": "PM_DISP_STALL_IC_L3MISS",
|
||||
"BriefDescription": "Cycles when dispatch was stalled while the instruction was fetched from any source beyond the local L3."
|
||||
},
|
||||
{
|
||||
"EventCode": "4E01A",
|
||||
"EventName": "PM_DISP_STALL_HELD_CYC",
|
||||
"BriefDescription": "Cycles in which the NTC instruction is held at dispatch for any reason."
|
||||
},
|
||||
{
|
||||
"EventCode": "44056",
|
||||
"EventName": "PM_VECTOR_ST_CMPL",
|
||||
"BriefDescription": "Vector store instructions completed."
|
||||
}
|
||||
]
|
|
@ -0,0 +1,12 @@
|
|||
[
|
||||
{
|
||||
"EventCode": "1E058",
|
||||
"EventName": "PM_STCX_FAIL_FIN",
|
||||
"BriefDescription": "Conditional store instruction (STCX) failed. LARX and STCX are instructions used to acquire a lock."
|
||||
},
|
||||
{
|
||||
"EventCode": "4E050",
|
||||
"EventName": "PM_STCX_PASS_FIN",
|
||||
"BriefDescription": "Conditional store instruction (STCX) passed. LARX and STCX are instructions used to acquire a lock."
|
||||
}
|
||||
]
|
|
@ -0,0 +1,147 @@
|
|||
[
|
||||
{
|
||||
"EventCode": "1002C",
|
||||
"EventName": "PM_LD_PREFETCH_CACHE_LINE_MISS",
|
||||
"BriefDescription": "The L1 cache was reloaded with a line that fulfills a prefetch request."
|
||||
},
|
||||
{
|
||||
"EventCode": "10132",
|
||||
"EventName": "PM_MRK_INST_ISSUED",
|
||||
"BriefDescription": "Marked instruction issued. Note that stores always get issued twice, the address gets issued to the LSU and the data gets issued to the VSU. Also, issues can sometimes get killed/cancelled and cause multiple sequential issues for the same instruction."
|
||||
},
|
||||
{
|
||||
"EventCode": "101E0",
|
||||
"EventName": "PM_MRK_INST_DISP",
|
||||
"BriefDescription": "The thread has dispatched a randomly sampled marked instruction."
|
||||
},
|
||||
{
|
||||
"EventCode": "101E2",
|
||||
"EventName": "PM_MRK_BR_TAKEN_CMPL",
|
||||
"BriefDescription": "Marked Branch Taken instruction completed."
|
||||
},
|
||||
{
|
||||
"EventCode": "20112",
|
||||
"EventName": "PM_MRK_NTF_FIN",
|
||||
"BriefDescription": "The marked instruction became the oldest in the pipeline before it finished. It excludes instructions that finish at dispatch."
|
||||
},
|
||||
{
|
||||
"EventCode": "2C01C",
|
||||
"EventName": "PM_EXEC_STALL_DMISS_OFF_CHIP",
|
||||
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from a remote chip."
|
||||
},
|
||||
{
|
||||
"EventCode": "20138",
|
||||
"EventName": "PM_MRK_ST_NEST",
|
||||
"BriefDescription": "A store has been sampled/marked and is at the point of execution where it has completed in the core and can no longer be flushed. At this point the store is sent to the L2."
|
||||
},
|
||||
{
|
||||
"EventCode": "2013A",
|
||||
"EventName": "PM_MRK_BRU_FIN",
|
||||
"BriefDescription": "Marked Branch instruction finished."
|
||||
},
|
||||
{
|
||||
"EventCode": "2C144",
|
||||
"EventName": "PM_MRK_XFER_FROM_SRC_CYC_PMC2",
|
||||
"BriefDescription": "Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[15:27]."
|
||||
},
|
||||
{
|
||||
"EventCode": "24156",
|
||||
"EventName": "PM_MRK_STCX_FIN",
|
||||
"BriefDescription": "Marked conditional store instruction (STCX) finished. LARX and STCX are instructions used to acquire a lock."
|
||||
},
|
||||
{
|
||||
"EventCode": "24158",
|
||||
"EventName": "PM_MRK_INST",
|
||||
"BriefDescription": "An instruction was marked. Includes both Random Instruction Sampling (RIS) at decode time and Random Event Sampling (RES) at the time the configured event happens."
|
||||
},
|
||||
{
|
||||
"EventCode": "2415C",
|
||||
"EventName": "PM_MRK_BR_CMPL",
|
||||
"BriefDescription": "A marked branch completed. All branches are included."
|
||||
},
|
||||
{
|
||||
"EventCode": "200FD",
|
||||
"EventName": "PM_L1_ICACHE_MISS",
|
||||
"BriefDescription": "Demand iCache Miss."
|
||||
},
|
||||
{
|
||||
"EventCode": "30130",
|
||||
"EventName": "PM_MRK_INST_FIN",
|
||||
"BriefDescription": "Marked instruction finished. Excludes instructions that finish at dispatch. Note that stores always finish twice since the address gets issued to the LSU and the data gets issued to the VSU."
|
||||
},
|
||||
{
|
||||
"EventCode": "34146",
|
||||
"EventName": "PM_MRK_LD_CMPL",
|
||||
"BriefDescription": "Marked loads completed."
|
||||
},
|
||||
{
|
||||
"EventCode": "3E158",
|
||||
"EventName": "PM_MRK_STCX_FAIL",
|
||||
"BriefDescription": "Marked conditional store instruction (STCX) failed. LARX and STCX are instructions used to acquire a lock."
|
||||
},
|
||||
{
|
||||
"EventCode": "3E15A",
|
||||
"EventName": "PM_MRK_ST_FIN",
|
||||
"BriefDescription": "The marked instruction was a store of any kind."
|
||||
},
|
||||
{
|
||||
"EventCode": "30068",
|
||||
"EventName": "PM_L1_ICACHE_RELOADED_PREF",
|
||||
"BriefDescription": "Counts all Icache prefetch reloads (includes demand turned into prefetch)."
|
||||
},
|
||||
{
|
||||
"EventCode": "301E4",
|
||||
"EventName": "PM_MRK_BR_MPRED_CMPL",
|
||||
"BriefDescription": "Marked Branch Mispredicted. Includes direction and target."
|
||||
},
|
||||
{
|
||||
"EventCode": "300F6",
|
||||
"EventName": "PM_LD_DEMAND_MISS_L1",
|
||||
"BriefDescription": "The L1 cache was reloaded with a line that fulfills a demand miss request. Counted at reload time, before finish."
|
||||
},
|
||||
{
|
||||
"EventCode": "300FE",
|
||||
"EventName": "PM_DATA_FROM_L3MISS",
|
||||
"BriefDescription": "The processor's data cache was reloaded from a source other than the local core's L1, L2, or L3 due to a demand miss."
|
||||
},
|
||||
{
|
||||
"EventCode": "40012",
|
||||
"EventName": "PM_L1_ICACHE_RELOADED_ALL",
|
||||
"BriefDescription": "Counts all Icache reloads, including demand, prefetch, prefetch turned into demand and demand turned into prefetch."
|
||||
},
|
||||
{
|
||||
"EventCode": "40134",
|
||||
"EventName": "PM_MRK_INST_TIMEO",
|
||||
"BriefDescription": "Marked instruction finish timeout (instruction was lost)."
|
||||
},
|
||||
{
|
||||
"EventCode": "4003C",
|
||||
"EventName": "PM_DISP_STALL_HELD_SYNC_CYC",
|
||||
"BriefDescription": "Cycles in which the NTC instruction is held at dispatch because of a synchronizing instruction that requires the ICT to be empty before dispatch."
|
||||
},
|
||||
{
|
||||
"EventCode": "4505A",
|
||||
"EventName": "PM_SP_FLOP_CMPL",
|
||||
"BriefDescription": "Single Precision floating point instructions completed."
|
||||
},
|
||||
{
|
||||
"EventCode": "4D058",
|
||||
"EventName": "PM_VECTOR_FLOP_CMPL",
|
||||
"BriefDescription": "Vector floating point instructions completed."
|
||||
},
|
||||
{
|
||||
"EventCode": "4D05A",
|
||||
"EventName": "PM_NON_MATH_FLOP_CMPL",
|
||||
"BriefDescription": "Non Math instructions completed."
|
||||
},
|
||||
{
|
||||
"EventCode": "401E0",
|
||||
"EventName": "PM_MRK_INST_CMPL",
|
||||
"BriefDescription": "Marked instruction completed."
|
||||
},
|
||||
{
|
||||
"EventCode": "400FE",
|
||||
"EventName": "PM_DATA_FROM_MEMORY",
|
||||
"BriefDescription": "The processor's data cache was reloaded from local, remote, or distant memory due to a demand miss."
|
||||
}
|
||||
]
|
|
@ -0,0 +1,192 @@
|
|||
[
|
||||
{
|
||||
"EventCode": "1000A",
|
||||
"EventName": "PM_PMC3_REWIND",
|
||||
"BriefDescription": "The speculative event selected for PMC3 rewinds and the counter for PMC3 is not charged."
|
||||
},
|
||||
{
|
||||
"EventCode": "1C040",
|
||||
"EventName": "PM_XFER_FROM_SRC_PMC1",
|
||||
"BriefDescription": "The processor's L1 data cache was reloaded from the source specified in MMCR3[0:12]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
|
||||
},
|
||||
{
|
||||
"EventCode": "1C142",
|
||||
"EventName": "PM_MRK_XFER_FROM_SRC_PMC1",
|
||||
"BriefDescription": "For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[0:12]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
|
||||
},
|
||||
{
|
||||
"EventCode": "1C144",
|
||||
"EventName": "PM_MRK_XFER_FROM_SRC_CYC_PMC1",
|
||||
"BriefDescription": "Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[0:12]."
|
||||
},
|
||||
{
|
||||
"EventCode": "1C056",
|
||||
"EventName": "PM_DERAT_MISS_4K",
|
||||
"BriefDescription": "Data ERAT Miss (Data TLB Access) page size 4K. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
|
||||
},
|
||||
{
|
||||
"EventCode": "1C058",
|
||||
"EventName": "PM_DTLB_MISS_16G",
|
||||
"BriefDescription": "Data TLB reload (after a miss) page size 16G. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
|
||||
},
|
||||
{
|
||||
"EventCode": "1C05C",
|
||||
"EventName": "PM_DTLB_MISS_2M",
|
||||
"BriefDescription": "Data TLB reload (after a miss) page size 2M. Implies radix translation was used. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
|
||||
},
|
||||
{
|
||||
"EventCode": "1E056",
|
||||
"EventName": "PM_EXEC_STALL_STORE_PIPE",
|
||||
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was executing in the store unit. This does not include cycles spent handling store misses, PTESYNC instructions or TLBIE instructions."
|
||||
},
|
||||
{
|
||||
"EventCode": "1F150",
|
||||
"EventName": "PM_MRK_ST_L2_CYC",
|
||||
"BriefDescription": "Cycles from L2 RC dispatch to L2 RC completion."
|
||||
},
|
||||
{
|
||||
"EventCode": "10062",
|
||||
"EventName": "PM_LD_L3MISS_PEND_CYC",
|
||||
"BriefDescription": "Cycles L3 miss was pending for this thread."
|
||||
},
|
||||
{
|
||||
"EventCode": "20010",
|
||||
"EventName": "PM_PMC1_OVERFLOW",
|
||||
"BriefDescription": "The event selected for PMC1 caused the event counter to overflow."
|
||||
},
|
||||
{
|
||||
"EventCode": "2001A",
|
||||
"EventName": "PM_ITLB_HIT",
|
||||
"BriefDescription": "The PTE required to translate the instruction address was resident in the TLB (instruction TLB access/IERAT reload). Applies to both HPT and RPT. When MMCR1[17]=0 this event counts only for demand misses. When MMCR1[17]=1 this event includes demand misses and prefetches."
|
||||
},
|
||||
{
|
||||
"EventCode": "2003E",
|
||||
"EventName": "PM_PTESYNC_FIN",
|
||||
"BriefDescription": "Ptesync instruction finished in the store unit. Only one ptesync can finish at a time."
|
||||
},
|
||||
{
"EventCode": "2C040",
"EventName": "PM_XFER_FROM_SRC_PMC2",
"BriefDescription": "The processor's L1 data cache was reloaded from the source specified in MMCR3[15:27]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
},
{
"EventCode": "2C054",
"EventName": "PM_DERAT_MISS_64K",
"BriefDescription": "Data ERAT Miss (Data TLB Access) page size 64K. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
"EventCode": "2C056",
"EventName": "PM_DTLB_MISS_4K",
"BriefDescription": "Data TLB reload (after a miss) page size 4K. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
"EventCode": "2D154",
"EventName": "PM_MRK_DERAT_MISS_64K",
"BriefDescription": "Data ERAT Miss (Data TLB Access) page size 64K for a marked instruction. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
"EventCode": "200F6",
"EventName": "PM_DERAT_MISS",
"BriefDescription": "DERAT Reloaded to satisfy a DERAT miss. All page sizes are counted by this event. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
"EventCode": "3000A",
"EventName": "PM_DISP_STALL_ITLB_MISS",
"BriefDescription": "Cycles when dispatch was stalled while waiting to resolve an instruction TLB miss."
},
{
"EventCode": "30016",
"EventName": "PM_EXEC_STALL_DERAT_DTLB_MISS",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline suffered a TLB miss and waited for it to resolve."
},
{
"EventCode": "3C040",
"EventName": "PM_XFER_FROM_SRC_PMC3",
"BriefDescription": "The processor's L1 data cache was reloaded from the source specified in MMCR3[30:42]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
},
{
"EventCode": "3C142",
"EventName": "PM_MRK_XFER_FROM_SRC_PMC3",
"BriefDescription": "For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[30:42]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
},
{
"EventCode": "3C144",
"EventName": "PM_MRK_XFER_FROM_SRC_CYC_PMC3",
"BriefDescription": "Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[30:42]."
},
{
"EventCode": "3C054",
"EventName": "PM_DERAT_MISS_16M",
"BriefDescription": "Data ERAT Miss (Data TLB Access) page size 16M. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
"EventCode": "3C056",
"EventName": "PM_DTLB_MISS_64K",
"BriefDescription": "Data TLB reload (after a miss) page size 64K. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
"EventCode": "3C058",
"EventName": "PM_LARX_FIN",
"BriefDescription": "Load and reserve instruction (LARX) finished. LARX and STCX are instructions used to acquire a lock."
},
{
"EventCode": "301E2",
"EventName": "PM_MRK_ST_CMPL",
"BriefDescription": "Marked store completed and sent to nest. Note that this count excludes cache-inhibited stores."
},
{
"EventCode": "300FC",
"EventName": "PM_DTLB_MISS",
"BriefDescription": "The DPTEG required for the load/store instruction in execution was missing from the TLB. It includes pages of all sizes for demand and prefetch activity."
},
{
"EventCode": "4D02C",
"EventName": "PM_PMC1_REWIND",
"BriefDescription": "The speculative event selected for PMC1 rewinds and the counter for PMC1 is not charged."
},
{
"EventCode": "4003E",
"EventName": "PM_LD_CMPL",
"BriefDescription": "Loads completed."
},
{
"EventCode": "4C040",
"EventName": "PM_XFER_FROM_SRC_PMC4",
"BriefDescription": "The processor's L1 data cache was reloaded from the source specified in MMCR3[45:57]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
},
{
"EventCode": "4C142",
"EventName": "PM_MRK_XFER_FROM_SRC_PMC4",
"BriefDescription": "For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[45:57]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
},
{
"EventCode": "4C144",
"EventName": "PM_MRK_XFER_FROM_SRC_CYC_PMC4",
"BriefDescription": "Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[45:57]."
},
{
"EventCode": "4C056",
"EventName": "PM_DTLB_MISS_16M",
"BriefDescription": "Data TLB reload (after a miss) page size 16M. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
"EventCode": "4C05A",
"EventName": "PM_DTLB_MISS_1G",
"BriefDescription": "Data TLB reload (after a miss) page size 1G. Implies radix translation was used. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
"EventCode": "4C15E",
"EventName": "PM_MRK_DTLB_MISS_64K",
"BriefDescription": "Marked Data TLB reload (after a miss) page size 64K. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
"EventCode": "4D056",
"EventName": "PM_NON_FMA_FLOP_CMPL",
"BriefDescription": "Non FMA instruction completed."
},
{
"EventCode": "40164",
"EventName": "PM_MRK_DERAT_MISS_2M",
"BriefDescription": "Data ERAT Miss (Data TLB Access) page size 2M for a marked instruction. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
}
]
Some files were not shown because too many files have changed in this diff