mirror of https://gitee.com/openkylin/linux.git
411 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
Linus Torvalds | 36db171cc7 |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar: "Bigger kernel side changes: - Add backwards writing capability to the perf ring-buffer code, which is preparation for future advanced features like robust 'overwrite support' and snapshot mode. (Wang Nan) - Add pause and resume ioctls for the perf ringbuffer (Wang Nan) - x86 Intel cstate code cleanups and reorgnization (Thomas Gleixner) - x86 Intel uncore and CPU PMU driver updates (Kan Liang, Peter Zijlstra) - x86 AUX (Intel PT) related enhancements and updates (Alexander Shishkin) - x86 MSR PMU driver enhancements and updates (Huang Rui) - ... and lots of other changes spread out over 40+ commits. Biggest tooling side changes: - 'perf trace' features and enhancements. (Arnaldo Carvalho de Melo) - BPF tooling updates (Wang Nan) - 'perf sched' updates (Jiri Olsa) - 'perf probe' updates (Masami Hiramatsu) - ... plus 200+ other enhancements, fixes and cleanups to tools/ The merge commits, the shortlog and the changelogs contain a lot more details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (249 commits) perf/core: Disable the event on a truncated AUX record perf/x86/intel/pt: Generate PMI in the STOP region as well perf buildid-cache: Use lsdir() for looking up buildid caches perf symbols: Use lsdir() for the search in kcore cache directory perf tools: Use SBUILD_ID_SIZE where applicable perf tools: Fix lsdir to set errno correctly perf trace: Move seccomp args beautifiers to tools/perf/trace/beauty/ perf trace: Move flock op beautifier to tools/perf/trace/beauty/ perf build: Add build-test for debug-frame on arm/arm64 perf build: Add build-test for libunwind cross-platforms support perf script: Fix export of callchains with recursion in db-export perf script: Fix callchain addresses in db-export perf script: Fix symbol insertion behavior in db-export perf symbols: Add dso__insert_symbol function perf scripting python: Use Py_FatalError instead of die() perf tools: Remove xrealloc and ALLOC_GROW perf help: Do not use ALLOC_GROW in add_cmd_list perf pmu: Make pmu_formats_string to check return value of strbuf perf header: Make topology checkers to check return value of strbuf perf tools: Make alias handler to check return value of strbuf ... |
|
Steven Rostedt | 106b816cb4 |
tools lib traceevent: Do not reassign parg after collapse_tree()
At the end of process_filter(), collapse_tree() was changed to update
the parg parameter, but the reassignment after the call wasn't removed.
What happens is that the "current_op" gets modified and freed and parg
is assigned to the new allocated argument. But after the call to
collapse_tree(), parg is assigned again to the just freed "current_op",
and this causes the tool to crash.
The current_op variable must also be assigned to NULL in case of error,
otherwise it will cause it to be free()ed twice.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org # 3.14+
Fixes:
|
|
Arnaldo Carvalho de Melo | 4bd112df3e |
tools lib api fs: Add helper to read string from procfs file
To read things like /proc/self/comm. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ztpkbmseidt0hq2psr46o0h9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Sedat Dilek | a189c017de |
tools/lib/lockdep: Fix unsupported 'basename -s' in run_tests.sh
Here on Ubuntu/precise I have GNU/coreutils v8.13 installed where 'basename -s' is not supported. The result is that run_tests.sh is not done properly. How to reproduce: $ cd $BUILD_DIR $ LC_ALL=C make -C tools/ liblockdep $ cd tools/lib/lockdep/ $ LC_ALL=C ./run_tests.sh basename: invalid option -- 's' Try `basename --help' for more information. ... timeout: failed to run command `./tests/': Permission denied FAILED! rm: cannot remove `tests/': Is a directory Due to unsupported basename the tests programs are not generated and cannot be removed. Fix this by doing a compatible basename invocation and check for the existence of generated tests programs. For more details see this LKML thread: http://marc.info/?t=145906667300001&r=1&w=2 Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> (maintainer:LIBLOCKDEP) Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org> Link: http://lkml.kernel.org/r/1459326169-7009-1-git-send-email-sedat.dilek@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Linus Torvalds | 3fa2fe2ce0 |
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar: "This tree contains various perf fixes on the kernel side, plus three hw/event-enablement late additions: - Intel Memory Bandwidth Monitoring events and handling - the AMD Accumulated Power Mechanism reporting facility - more IOMMU events ... and a final round of perf tooling updates/fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits) perf llvm: Use strerror_r instead of the thread unsafe strerror one perf llvm: Use realpath to canonicalize paths perf tools: Unexport some methods unused outside strbuf.c perf probe: No need to use formatting strbuf method perf help: Use asprintf instead of adhoc equivalents perf tools: Remove unused perf_pathdup, xstrdup functions perf tools: Do not include stringify.h from the kernel sources tools include: Copy linux/stringify.h from the kernel tools lib traceevent: Remove redundant CPU output perf tools: Remove needless 'extern' from function prototypes perf tools: Simplify die() mechanism perf tools: Remove unused DIE_IF macro perf script: Remove lots of unused arguments perf thread: Rename perf_event__preprocess_sample_addr to thread__resolve perf machine: Rename perf_event__preprocess_sample to machine__resolve perf tools: Add cpumode to struct perf_sample perf tests: Forward the perf_sample in the dwarf unwind test perf tools: Remove misplaced __maybe_unused perf list: Fix documentation of :ppp perf bench numa: Fix assertion for nodes bitfield ... |
|
Ingo Molnar | 05f5ece76a |
perf/core improvements and fixes:
User visible: - Fix documentation of :ppp modifier in 'perf list' (Andi Kleen) - Fix silly nodes bitfield bits/bytes length assertion in 'perf bench numa' (Jakub Jelen) - Remove redundant CPU output in libtraceevent (Steven Rostedt) - Remove 'core_id' check in topology 'perf test' (Sukadev Bhattiprolu) Infrastructure: - Record text offset in dso to calculate objdump address, to use with modules in addition to vDSO symbol address calculations (Wang Nan) - Move utilities.mak from perf to tools/scripts/ (Arnaldo Carvalho de Melo) - Add cpumode to the perf_sample struct, this way we don't need to pass the union event to the machine and thread resolving routines, shortening function signatures and allowing the future introduction of a way to use tracepoint events instead of the unavailable HW cycles counter on powerpc guests in perf kvm by just hooking on perf_evsel__parse_sample, at the end (Arnaldo Carvalho de Melo) - Remove/unexport die() related infrastructure, that at some point will finally be removed (Arnaldo Carvalho de Melo) - Adopt linux/stringify.h from the kernel sources, not to touch this kernel header from tools/ (Arnaldo Carvalho de Melo) - Stop using strbuf for things we can instead trivially use libc's asprintf() (Arnaldo Carvalho de Melo) - Ditch tools/lib/util/abspath.c, its only exported function was used at just one place and can be replaced by libc's realpath() (Arnaldo Carvalho de Melo) - Use strerror_r() in the llvm infrastructure, tread safe, its what is used elsewhere in tools/perf/ (Arnaldo Carvalho de Melo) Cleanups: - Removed misplaced or needless __maybe_unused/export (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJW8wWeAAoJENZQFvNTUqpAF9YP/iXdYDnav5mr8yRt1Vu88F0j rmnO0E6zHzH5PcEoMuSPYEea1IPKZcV2KyVWFsfzkbfZrsC05AkGOrHAs7v0jCyz 93p0/sH92Vy7ZTvE2slQJGKm2fgo3veV3ExEZDqdT5NNlzrIHVGshNYTma0wuduM NMdCdZe99KUuJFjqaUP9YCTr9VWF30Vu/AAOTUY7Xg6j7ZuJ47HchNxbOvCOp430 LtirOKRBmVyY8BOoKSDDyaZOOPz18k58z8bKVO2eQNN9wf6ZGjWIN5DpjBs9pdJa W4zhFbHxU272IZXVv8/K2hV1qenov4Ghn5i0oApBjaRGta6dBFhGmYk0zIATcpGV wkuB6OrQTGYnBcNCI9z/kyxdt7Km0X2ffruvFLPviz4d7Sa2W0Gmj2+4JgXaLipp Q/0H/iWcT1kv+SXT9yjWe2+QxgwbJlUJ96+aGEvqDQH23Hifxjx4wezNR98kK/s7 8IGxJEIOMmRocXeDeEZ8sKwqqan34IDrXKs2jMQSOonVVhR+dmvZicqSNwMTvMTn Sz36bsaKkWQtM3d97byx6EHDJn6aPEQlbY/YR6DqKgKmeFTQgs1lDvUp7SVUu1qp HwSDN7FgWbXAJVfxeoe0qBSZY9px+dCCh+qZmNN+TtD84IBgzrJpaSf6hzQ6FIie yTfVHvvkyUDep1KquQz9 =f41L -----END PGP SIGNATURE----- Merge tag 'perf-core-for-mingo-20160323' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core improvements and fixes: User visible fixes: - Fix documentation of :ppp modifier in 'perf list' (Andi Kleen) - Fix silly nodes bitfield bits/bytes length assertion in 'perf bench numa' (Jakub Jelen) - Remove redundant CPU output in libtraceevent (Steven Rostedt) - Remove 'core_id' check in topology 'perf test' (Sukadev Bhattiprolu) Infrastructure changes/fixes: - Record text offset in dso to calculate objdump address, to use with modules in addition to vDSO symbol address calculations (Wang Nan) - Move utilities.mak from perf to tools/scripts/ (Arnaldo Carvalho de Melo) - Add cpumode to the perf_sample struct, this way we don't need to pass the union event to the machine and thread resolving routines, shortening function signatures and allowing the future introduction of a way to use tracepoint events instead of the unavailable HW cycles counter on powerpc guests in perf kvm by just hooking on perf_evsel__parse_sample, at the end (Arnaldo Carvalho de Melo) - Remove/unexport die() related infrastructure, that at some point will finally be removed (Arnaldo Carvalho de Melo) - Adopt linux/stringify.h from the kernel sources, not to touch this kernel header from tools/ (Arnaldo Carvalho de Melo) - Stop using strbuf for things we can instead trivially use libc's asprintf() (Arnaldo Carvalho de Melo) - Ditch tools/lib/util/abspath.c, its only exported function was used at just one place and can be replaced by libc's realpath() (Arnaldo Carvalho de Melo) - Use strerror_r() in the llvm infrastructure, tread safe, its what is used elsewhere in tools/perf/ (Arnaldo Carvalho de Melo) Cleanups: - Removed misplaced or needless __maybe_unused/export (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Steven Rostedt | 4f3c887688 |
tools lib traceevent: Remove redundant CPU output
Commit |
|
Linus Torvalds | 26660a4046 |
Merge branch 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull 'objtool' stack frame validation from Ingo Molnar: "This tree adds a new kernel build-time object file validation feature (ONFIG_STACK_VALIDATION=y): kernel stack frame correctness validation. It was written by and is maintained by Josh Poimboeuf. The motivation: there's a category of hard to find kernel bugs, most of them in assembly code (but also occasionally in C code), that degrades the quality of kernel stack dumps/backtraces. These bugs are hard to detect at the source code level. Such bugs result in incorrect/incomplete backtraces most of time - but can also in some rare cases result in crashes or other undefined behavior. The build time correctness checking is done via the new 'objtool' user-space utility that was written for this purpose and which is hosted in the kernel repository in tools/objtool/. The tool's (very simple) UI and source code design is shaped after Git and perf and shares quite a bit of infrastructure with tools/perf (which tooling infrastructure sharing effort got merged via perf and is already upstream). Objtool follows the well-known kernel coding style. Objtool does not try to check .c or .S files, it instead analyzes the resulting .o generated machine code from first principles: it decodes the instruction stream and interprets it. (Right now objtool supports the x86-64 architecture.) From tools/objtool/Documentation/stack-validation.txt: "The kernel CONFIG_STACK_VALIDATION option enables a host tool named objtool which runs at compile time. It has a "check" subcommand which analyzes every .o file and ensures the validity of its stack metadata. It enforces a set of rules on asm code and C inline assembly code so that stack traces can be reliable. Currently it only checks frame pointer usage, but there are plans to add CFI validation for C files and CFI generation for asm files. For each function, it recursively follows all possible code paths and validates the correct frame pointer state at each instruction. It also follows code paths involving special sections, like .altinstructions, __jump_table, and __ex_table, which can add alternative execution paths to a given instruction (or set of instructions). Similarly, it knows how to follow switch statements, for which gcc sometimes uses jump tables." When this new kernel option is enabled (it's disabled by default), the tool, if it finds any suspicious assembly code pattern, outputs warnings in compiler warning format: warning: objtool: rtlwifi_rate_mapping()+0x2e7: frame pointer state mismatch warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup warning: objtool:__schedule()+0x3c0: duplicate frame pointer save warning: objtool:__schedule()+0x3fd: sibling call from callable instruction with changed frame pointer ... so that scripts that pick up compiler warnings will notice them. All known warnings triggered by the tool are fixed by the tree, most of the commits in fact prepare the kernel to be warning-free. Most of them are bugfixes or cleanups that stand on their own, but there are also some annotations of 'special' stack frames for justified cases such entries to JIT-ed code (BPF) or really special boot time code. There are two other long-term motivations behind this tool as well: - To improve the quality and reliability of kernel stack frames, so that they can be used for optimized live patching. - To create independent infrastructure to check the correctness of CFI stack frames at build time. CFI debuginfo is notoriously unreliable and we cannot use it in the kernel as-is without extra checking done both on the kernel side and on the build side. The quality of kernel stack frames matters to debuggability as well, so IMO we can merge this without having to consider the live patching or CFI debuginfo angle" * 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) objtool: Only print one warning per function objtool: Add several performance improvements tools: Copy hashtable.h into tools directory objtool: Fix false positive warnings for functions with multiple switch statements objtool: Rename some variables and functions objtool: Remove superflous INIT_LIST_HEAD objtool: Add helper macros for traversing instructions objtool: Fix false positive warnings related to sibling calls objtool: Compile with debugging symbols objtool: Detect infinite recursion objtool: Prevent infinite recursion in noreturn detection objtool: Detect and warn if libelf is missing and don't break the build tools: Support relative directory path for 'O=' objtool: Support CROSS_COMPILE x86/asm/decoder: Use explicitly signed chars objtool: Enable stack metadata validation on 64-bit x86 objtool: Add CONFIG_STACK_VALIDATION option objtool: Add tool to perform compile-time stack metadata validation x86/kprobes: Mark kretprobe_trampoline() stack frame as non-standard sched: Always inline context_switch() ... |
|
Arnaldo Carvalho de Melo | ca70c24fb1 |
tools: Move utilities.mak from perf to tools/scripts/
As it is used by several other tools, better move it outside tools/perf. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-34s9kue3xq9w5mijdmfrfx8s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Linus Torvalds | e71c2c1eeb |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar: "Main kernel side changes: - Big reorganization of the x86 perf support code. The old code grew organically deep inside arch/x86/kernel/cpu/perf* and its naming became somewhat messy. The new location is under arch/x86/events/, using the following cleaner hierarchy of source code files: perf/x86: Move perf_event.c .................. => x86/events/core.c perf/x86: Move perf_event_amd.c .............. => x86/events/amd/core.c perf/x86: Move perf_event_amd_ibs.c .......... => x86/events/amd/ibs.c perf/x86: Move perf_event_amd_iommu.[ch] ..... => x86/events/amd/iommu.[ch] perf/x86: Move perf_event_amd_uncore.c ....... => x86/events/amd/uncore.c perf/x86: Move perf_event_intel_bts.c ........ => x86/events/intel/bts.c perf/x86: Move perf_event_intel.c ............ => x86/events/intel/core.c perf/x86: Move perf_event_intel_cqm.c ........ => x86/events/intel/cqm.c perf/x86: Move perf_event_intel_cstate.c ..... => x86/events/intel/cstate.c perf/x86: Move perf_event_intel_ds.c ......... => x86/events/intel/ds.c perf/x86: Move perf_event_intel_lbr.c ........ => x86/events/intel/lbr.c perf/x86: Move perf_event_intel_pt.[ch] ...... => x86/events/intel/pt.[ch] perf/x86: Move perf_event_intel_rapl.c ....... => x86/events/intel/rapl.c perf/x86: Move perf_event_intel_uncore.[ch] .. => x86/events/intel/uncore.[ch] perf/x86: Move perf_event_intel_uncore_nhmex.c => x86/events/intel/uncore_nmhex.c perf/x86: Move perf_event_intel_uncore_snb.c => x86/events/intel/uncore_snb.c perf/x86: Move perf_event_intel_uncore_snbep.c => x86/events/intel/uncore_snbep.c perf/x86: Move perf_event_knc.c .............. => x86/events/intel/knc.c perf/x86: Move perf_event_p4.c ............... => x86/events/intel/p4.c perf/x86: Move perf_event_p6.c ............... => x86/events/intel/p6.c perf/x86: Move perf_event_msr.c .............. => x86/events/msr.c (Borislav Petkov) - Update various x86 PMU constraint and hw support details (Stephane Eranian) - Optimize kprobes for BPF execution (Martin KaFai Lau) - Rewrite, refactor and fix the Intel uncore PMU driver code (Thomas Gleixner) - Rewrite, refactor and fix the Intel RAPL PMU code (Thomas Gleixner) - Various fixes and smaller cleanups. There are lots of perf tooling updates as well. A few highlights: perf report/top: - Hierarchy histogram mode for 'perf top' and 'perf report', showing multiple levels, one per --sort entry: (Namhyung Kim) On a mostly idle system: # perf top --hierarchy -s comm,dso Then expand some levels and use 'P' to take a snapshot: # cat perf.hist.0 - 92.32% perf 58.20% perf 22.29% libc-2.22.so 5.97% [kernel] 4.18% libelf-0.165.so 1.69% [unknown] - 4.71% qemu-system-x86 3.10% [kernel] 1.60% qemu-system-x86_64 (deleted) + 2.97% swapper # - Add 'L' hotkey to dynamicly set the percent threshold for histogram entries and callchains, i.e. dynamicly do what the --percent-limit command line option to 'top' and 'report' does. (Namhyung Kim) perf mem: - Allow specifying events via -e in 'perf mem record', also listing what events can be specified via 'perf mem record -e list' (Jiri Olsa) perf record: - Add 'perf record' --all-user/--all-kernel options, so that one can tell that all the events in the command line should be restricted to the user or kernel levels (Jiri Olsa), i.e.: perf record -e cycles:u,instructions:u is equivalent to: perf record --all-user -e cycles,instructions - Make 'perf record' collect CPU cache info in the perf.data file header: $ perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] $ perf report --header-only -I | tail -10 | head -8 # CPU cache info: # L1 Data 32K [0-1] # L1 Instruction 32K [0-1] # L1 Data 32K [2-3] # L1 Instruction 32K [2-3] # L2 Unified 256K [0-1] # L2 Unified 256K [2-3] # L3 Unified 4096K [0-3] Will be used in 'perf c2c' and eventually in 'perf diff' to allow, for instance running the same workload in multiple machines and then when using 'diff' show the hardware difference. (Jiri Olsa) - Improved support for Java, using the JVMTI agent library to do jitdumps that then will be inserted in synthesized PERF_RECORD_MMAP2 events via 'perf inject' pointed to synthesized ELF files stored in ~/.debug and keyed with build-ids, to allow symbol resolution and even annotation with source line info, see the changeset comments to see how to use it (Stephane Eranian) perf script/trace: - Decode data_src values (e.g. perf.data files generated by 'perf mem record') in 'perf script': (Jiri Olsa) # perf script perf 693 [1] 4.088652: 1 cpu/mem-loads,ldlat=30/P: ffff88007d0b0f40 68100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK No <SNIP> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - Improve support to 'data_src', 'weight' and 'addr' fields in 'perf script' (Jiri Olsa) - Handle empty print fmts in 'perf script -s' i.e. when running python or perl scripts (Taeung Song) perf stat: - 'perf stat' now shows shadow metrics (insn per cycle, etc) in interval mode too. E.g: # perf stat -I 1000 -e instructions,cycles sleep 1 # time counts unit events 1.000215928 519,620 instructions # 0.69 insn per cycle 1.000215928 752,003 cycles <SNIP> - Port 'perf kvm stat' to PowerPC (Hemant Kumar) - Implement CSV metrics output in 'perf stat' (Andi Kleen) perf BPF support: - Support converting data from bpf events in 'perf data' (Wang Nan) - Print bpf-output events in 'perf script': (Wang Nan). # perf record -e bpf-output/no-inherit,name=evt/ -e ./test_bpf_output_3.c/map:channel.event=evt/ usleep 1000 # perf script usleep 4882 21384.532523: evt: ffffffff810e97d1 sys_nanosleep ([kernel.kallsyms]) BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a 0008: 42 50 46 20 65 76 65 6e BPF even 0010: 74 21 00 00 t!.. BPF string: "Raise a BPF event!" # - Add API to set values of map entries in a BPF object, be it individual map slots or ranges (Wang Nan) - Introduce support for the 'bpf-output' event (Wang Nan) - Add glue to read perf events in a BPF program (Wang Nan) - Improve support for bpf-output events in 'perf trace' (Wang Nan) ... and tons of other changes as well - see the shortlog and git log for details!" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (342 commits) perf stat: Add --metric-only support for -A perf stat: Implement --metric-only mode perf stat: Document CSV format in manpage perf hists browser: Check sort keys before hot key actions perf hists browser: Allow thread filtering for comm sort key perf tools: Add sort__has_comm variable perf tools: Recalc total periods using top-level entries in hierarchy perf tools: Remove nr_sort_keys field perf hists browser: Cleanup hist_browser__fprintf_hierarchy_entry() perf tools: Remove hist_entry->fmt field perf tools: Fix command line filters in hierarchy mode perf tools: Add more sort entry check functions perf tools: Fix hist_entry__filter() for hierarchy perf jitdump: Build only on supported archs tools lib traceevent: Add '~' operation within arg_num_eval() perf tools: Omit unnecessary cast in perf_pmu__parse_scale perf tools: Pass perf_hpp_list all the way through setup_sort_list perf tools: Fix perf script python database export crash perf jitdump: DWARF is also needed perf bench mem: Prepare the x86-64 build for upstream memcpy_mcsafe() changes ... |
|
Steven Rostedt | 9eb42dee2b |
tools lib traceevent: Add '~' operation within arg_num_eval()
When evaluating values for print flags, if the value included a '~' operator, the parsing would fail. This broke kmalloc's parsing of: __print_flags(REC->gfp_flags, "|", {(unsigned long)((((((( gfp_t)(0x400000u|0x2000000u)) | (( gfp_t)0x40u) | (( gfp_t)0x80u) | (( gfp_t)0x20000u)) | (( gfp_t)0x02u)) | (( gfp_t)0x08u)) | (( gfp_t)0x4000u) | (( gfp_t)0x10000u) | (( gfp_t)0x1000u) | (( gfp_t)0x200u)) & ~(( gfp_t)0x2000000u)) ^ | here Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: David Ahern <dsahern@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20160226181328.22f47129@gandalf.local.home Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Josh Poimboeuf | c1d45c3abd |
objtool: Support CROSS_COMPILE
When building with CONFIG_STACK_VALIDATION on a ppc64le host with an x86 cross-compiler, Stephen Rothwell saw the following objtool build errors: DESCEND objtool CC /home/sfr/next/x86_64_allmodconfig/tools/objtool/builtin-check.o CC /home/sfr/next/x86_64_allmodconfig/tools/objtool/special.o CC /home/sfr/next/x86_64_allmodconfig/tools/objtool/elf.o CC /home/sfr/next/x86_64_allmodconfig/tools/objtool/objtool.o MKDIR /home/sfr/next/x86_64_allmodconfig/tools/objtool/arch/x86/insn/ CC /home/sfr/next/x86_64_allmodconfig/tools/objtool/libstring.o elf.c:22:23: fatal error: sys/types.h: No such file or directory compilation terminated. CC /home/sfr/next/x86_64_allmodconfig/tools/objtool/exec-cmd.o CC /home/sfr/next/x86_64_allmodconfig/tools/objtool/help.o builtin-check.c:28:20: fatal error: string.h: No such file or directory compilation terminated. objtool.c:28:19: fatal error: stdio.h: No such file or directory compilation terminated. It fails to build because it tries to compile objtool with the cross-compiler instead of the host compiler. Ensure that it always uses the host compiler by ignoring CROSS_COMPILE. In order to do that properly, the libsubcmd.a library needs to be built in tools/objtool/ rather than tools/lib/subcmd/. The latter directory contains the cross-compiled version which is needed for perf and possibly other tools. Note that cross-compiling for x86 on a _big_ endian system would result in a bunch of false positive objtool warnings during the kernel build because it isn't endian-aware. But that's generally a rare edge case and there haven't been any reports of anybody needing that. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/55b63eefc347f1bb28573f972d8d1adbf1f1c31d.1456962210.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Steven Rostedt (Red Hat) | a66673a07e |
tools lib traceevent: Fix output of %llu for 64 bit values read on 32 bit machines
When a long value is read on 32 bit machines for 64 bit output, the parsing needs to change "%lu" into "%llu", as the value is read natively. Unfortunately, if "%llu" is already there, the code will add another "l" to it and fail to parse it properly. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20160209204237.337024613@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Steven Rostedt (Red Hat) | 9ec72eafee |
tools lib traceevent: Set int_array fields to NULL if freeing from error
Had a bug where on error of parsing __print_array() where the fields are freed after they were allocated, but since they were not set to NULL, the freeing of the arg also tried to free the already freed fields causing a double free. Fix process_hex() while at it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20160209204237.188327674@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Chaos.Chen | 21a3010045 |
tools lib traceevent: Fix time stamp rounding issue
When rounding to microseconds, if the timestamp subsecond is between .999999500 and .999999999, it is rounded to .1000000, when it should instead increment the second counter due to the overflow. For example, if the timestamp is 1234.999999501 instead of seeing: 1235.000000 we see: 1234.1000000 Signed-off-by: Chaos.Chen <rainboy1215@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20160209204236.824426460@goodmis.org [ fixed incrementing "secs" instead of decrementing it ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Steven Rostedt | a674533078 |
tools lib traceevent: Split pevent_print_event() into specific functionality functions
Currently there's a single function that is used to display a record's data in human readable format. That's pevent_print_event(). Unfortunately, this gives little room for adding other output within the line without updating that function call. I've decided to split that function into 3 parts. pevent_print_event_task() which prints the task comm, pid and the CPU pevent_print_event_time() which outputs the record's timestamp pevent_print_event_data() which outputs the rest of the event data. pevent_print_event() now simply calls these three functions. To save time from doing the search for event from the record's type, I created a new helper function called pevent_find_event_by_record(), which returns the record's event, and this event has to be passed to the above functions. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20160229090128.43a56704@gandalf.local.home Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Ingo Molnar | 013e379a30 |
tools/lib/lockdep: Fix link creation warning
This warning triggers if the .so library has already been linked: triton:~/tip/tools/lib/lockdep> make CC common.o CC lockdep.o CC rbtree.o LD liblockdep-in.o LD liblockdep.a ln: failed to create symbolic link ‘liblockdep.so’: File exists LD liblockdep.so.4.5.0-rc6 Overwrite the link. Cc: Alfredo Alvarez Fernandez <alfredoalvarezfernandez@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Alfredo Alvarez Fernandez | 11a1ac206d |
tools/lib/lockdep: Add tests for AA and ABBA locking
Add test for AA and 2 threaded ABBA locking. Rename AA.c to ABA.c since it was implementing an ABA instead of a pure AA. Now both cases are covered. The expected output for AA.c is that the process blocks and lockdep reports a deadlock. ABBA_2threads.c differs from ABBA.c in that lockdep keeps separate chains of held locks per task. This can lead to different behaviour regarding lock detection. The expected output for this test is that the process blocks and lockdep reports a circular locking dependency. These tests found a lockdep bug - fixed by the next commit. Signed-off-by: Alfredo Alvarez Fernandez <alfredoalvarezfernandez@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1455864533-7536-3-git-send-email-alfredoalvarezernandez@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Alfredo Alvarez Fernandez | 9d5a23ac8e |
tools/lib/lockdep: Add userspace version of READ_ONCE()
This was added to the kernel code in <1658d35ead5d> ("list: Use READ_ONCE() when testing for empty lists"). There's nothing special we need to do about it in userspace. Signed-off-by: Alfredo Alvarez Fernandez <alfredoalvarezfernandez@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1455864533-7536-2-git-send-email-alfredoalvarezernandez@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Ingo Molnar | b2ed0998f6 |
tools/lib/lockdep: Fix the build on recent kernels
The following upstream commit:
|
|
Daniel Bristot de Oliveira | 0e47b38dcd |
tools lib traceevent: Implement '%' operation
The operation '%' is not implemented on event-parse.c, causing an error when parsing events with '%' the operation in its printk format. For example, # perf record -e sched:sched_deadline_yield ~/yield-test Warning: [sched:sched_deadline_yield] unknown op '%' .... # perf script Warning: [sched:sched_deadline_yield] unknown op '%' test 1641 [006] 3364.109319: sched:sched_deadline_yield: \ [FAILED TO PARSE] now=3364109314595 \ deadline=3364139295135 runtime=19975597 This patch implements the '%' operation. With this patch, we see the correct output: # perf record -e sched:sched_deadline_yield ~/yield-test No Warning # perf script yield-test 4005 [001] 4623.650978: sched:sched_deadline_yield: \ now=4623.650974050 \ deadline=4623.680957364 remaining_runtime=19979611 Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Juri Lelli <juri.lelli@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-rt-users <linux-rt-users@vger.kernel.org> Link: http://lkml.kernel.org/r/5c96a395c56cea6d3d13d949051bdece86cc26e0.1456157869.git.bristot@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Jiri Olsa | 51c0396c60 |
tools lib api fs: Add sysfs__read_str function
Adding sysfs__read_str function to ease up reading string files from sysfs. New interface is: int sysfs__read_str(const char *entry, char **buf, size_t *sizep); Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1455465826-8426-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Jiri Olsa | 607bfbd7ff |
tools lib api fs: Adopt filename__read_str from perf
We already moved similar functions in here, also it'll be useful for sysfs__read_str addition in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1455465826-8426-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Jiri Olsa | 975f14fa8f |
tools lib api: Add debug output support
Adding support for warning/info/debug output within libapi code. Adding following macros: pr_warning(fmt, ...) pr_info(fmt, ...) pr_debug(fmt, ...) Also adding libapi_set_print function to set above functions. This will be used in perf to set standard debug handlers for libapi. Adding 2 header files: debug.h - to be used outside libapi, contains libapi_set_print interface debug-internal.h - to be used within libapi, contains pr_warning/pr_info/pr_debug definitions Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1455465826-8426-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Andrey Ryabinin | 06bea3dbfe |
locking/lockdep: Eliminate lockdep_init()
Lockdep is initialized at compile time now. Get rid of lockdep_init(). Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Krinkin <krinkin.m.u@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Cc: mm-commits@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Ingo Molnar | 402e8db5c1 |
perf/core improvements and fixes:
User visible: - Rename the "colors.code" ~/.perfconfig variable to "colors.jump_arrows", as it controls just the that UI element in the annotate browser (Taeung Song) - Avoid trying to read ELF symtabs from device files, noticed while doing memory profiling work (Jiri Olsa) - Improve context detection when offering options in the hists browser, i.e. some options don't make sense when the browser is not working with a perf.data file ('perf top' mode), only in 'perf report' mode, like scripting (Namhyung Kim) Infrastructure: - Elliminate duplication in the hists browser filter functions, getting the common part into a function that receives callbacks for filtering by DSO, thread, etc (Namhyung Kim) - Fix misleadingly indented assignment, found using gcc6 -Wmisleading-indentation (Markus Trippelsdorf) - Handle LLVM relocation oddities in libbpf, introducing a 'perf test' that detects such problems and then fixing the problem, so that the test now passes (Wang Nan) - More improvements to the build infrastructure to allow reusing the feature detection facilities (Wang Nan) - Auto initialize the globals needed by cpu__max_{cpu,node}() routines (Arnaldo Carvalho de Melo) Documentation: - Document the perf sysctls in Documentation/sysctl/kernel.txt (Ben Hutchings) - Document a bunch more ~/.perfconfig knobs (Taeung Song) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWp8VgAAoJENZQFvNTUqpA8gAQAJY4pLDDeK6rZAqAY5fD//JJ aETuF0icXErkboef5usFk0MGVzK8HOWFOD5YnVAPXsGRqTHZ9ix3xw5sBFDaZKrP zagySidfHxrPDvtSW6doCjtg571dFaEHWUL48kT8ZpH9vwGDs42Gl/hjEY2P91zK uNktNoHvbHMUOxoMIp9zyCcV5WEWTog8RwCp53QrxxNrLYIT40wpADQIvuKNgqEP wIQyC2pgLv9ra27fXThauDes+a/TWLfURtxoeGgiDaIFmOi2t5VeN8D+DxXskKIB GtYF7Wxk5U+gELsAo5cZKS5Hyf13LqmwL4Jy/Th5jWaObyNXU2ZwnB3zXxZ3Dmvu keiOY8EmoOoKqOhjUVfsdvsVy0tNhObIJYhlqyOfQg+EqizR0PVlkDxWvODEKkkA T+dWXm183aXwCsHKM0EhAPgsVAJ/U9+lQjHro/lPq/i54oOogL/aBsVvUjNKo6Od m6q2ezgFZRHuPMLmOYhJaxtpvOirQkxORZZx2wgzgs5AsJly+ydoR3ETdhAD76Sg QGSKdTCziDA8KM0Vul6mjoqNlASpUM9cN6uLlv4c26pmf1krwleILMFqzaoYV3iE 3y/ebiRyj2luwKSXELNjcs/7GzCfN3h8sjP6AQ+q0fuWH3zU6+J9oKi+KevYBr8J fFEX6MNxdxOY92mXDPZa =cuZc -----END PGP SIGNATURE----- Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Rename the "colors.code" ~/.perfconfig variable to "colors.jump_arrows", as it controls just the that UI element in the annotate browser (Taeung Song) - Avoid trying to read ELF symtabs from device files, noticed while doing memory profiling work (Jiri Olsa) - Improve context detection when offering options in the hists browser, i.e. some options don't make sense when the browser is not working with a perf.data file ('perf top' mode), only in 'perf report' mode, like scripting (Namhyung Kim) Infrastructure changes: - Elliminate duplication in the hists browser filter functions, getting the common part into a function that receives callbacks for filtering by DSO, thread, etc. (Namhyung Kim) - Fix misleadingly indented assignment, found using gcc6 -Wmisleading-indentation (Markus Trippelsdorf) - Handle LLVM relocation oddities in libbpf, introducing a 'perf test' that detects such problems and then fixing the problem, so that the test now passes (Wang Nan) - More improvements to the build infrastructure to allow reusing the feature detection facilities (Wang Nan) - Auto initialize the globals needed by cpu__max_{cpu,node}() routines (Arnaldo Carvalho de Melo) Documentation changes: - Document the perf sysctls in Documentation/sysctl/kernel.txt (Ben Hutchings) - Document a bunch more ~/.perfconfig knobs (Taeung Song) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Wang Nan | 666810e86a |
perf bpf: Check relocation target section
Libbpf should check the target section before doing relocation to ensure the relocation is correct. If not, a bug in LLVM causes an error. See [1]. Also, if an incorrect BPF script uses both global variable and map, global variable whould be treated as map and be relocated without error. This patch saves the id of the map section into obj->efile and compare target section of a relocation symbol against it during relocation. Previous patch introduces a test case about this problem. After this patch: # ~/perf test BPF 37: Test BPF filter : 37.1: Test basic BPF filtering : Ok 37.2: Test BPF prologue generation : Ok 37.3: Test BPF relocation checker : Ok # perf test -v BPF ... 37.3: Test BPF relocation checker : ... libbpf: loading object '[bpf_relocation_test]' from buffer libbpf: section .strtab, size 126, link 0, flags 0, type=3 libbpf: section .text, size 0, link 0, flags 6, type=1 libbpf: section .data, size 0, link 0, flags 3, type=1 libbpf: section .bss, size 0, link 0, flags 3, type=8 libbpf: section func=sys_write, size 104, link 0, flags 6, type=1 libbpf: found program func=sys_write libbpf: section .relfunc=sys_write, size 16, link 10, flags 0, type=9 libbpf: section maps, size 16, link 0, flags 3, type=1 libbpf: maps in [bpf_relocation_test]: 16 bytes libbpf: section license, size 4, link 0, flags 3, type=1 libbpf: license of [bpf_relocation_test] is GPL libbpf: section version, size 4, link 0, flags 3, type=1 libbpf: kernel version of [bpf_relocation_test] is 40400 libbpf: section .symtab, size 144, link 1, flags 0, type=2 libbpf: map 0 is "my_table" libbpf: collecting relocating info for: 'func=sys_write' libbpf: Program 'func=sys_write' contains non-map related relo data pointing to section 65522 bpf: failed to load buffer Compile BPF program failed. test child finished with 0 ---- end ---- Test BPF filter subtest 2: Ok [1] https://llvm.org/bugs/show_bug.cgi?id=26243 Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1453715801-7732-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Linus Torvalds | 048ccca8c1 |
Initial roundup of 4.5 merge window patches
- Remove usage of ib_query_device and instead store attributes in ib_device struct - Move iopoll out of block and into lib, rename to irqpoll, and use in several places in the rdma stack as our new completion queue polling library mechanism. Update the other block drivers that already used iopoll to use the new mechanism too. - Replace the per-entry GID table locks with a single GID table lock - IPoIB multicast cleanup - Cleanups to the IB MR facility - Add support for 64bit extended IB counters - Fix for netlink oops while parsing RDMA nl messages - RoCEv2 support for the core IB code - mlx4 RoCEv2 support - mlx5 RoCEv2 support - Cross Channel support for mlx5 - Timestamp support for mlx5 - Atomic support for mlx5 - Raw QP support for mlx5 - MAINTAINERS update for mlx4/mlx5 - Misc ocrdma, qib, nes, usNIC, cxgb3, cxgb4, mlx4, mlx5 updates - Add support for remote invalidate to the iSER driver (pushed through the RDMA tree due to dependencies, acknowledged by nab) - Update to NFSoRDMA (pushed through the RDMA tree due to dependencies, acknowledged by Bruce) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWoSygAAoJELgmozMOVy/dDjsP/2vbTda2MvQfkfkGEZBQdJSg 095RN0gQgCJdg78lAl8yuaK8r4VN/7uefpDtFdudH1I/Pei7X0wxN9R1UzFNG4KR AD53lz92IVPs15328SbPR2kvNWISR9aBFQo3rlElq3Grqlp0EMn2Ou1vtu87rekF aMllxr8Nl0uZhP+eWusOsYpJUUtwirLgRnrAyfqo2UxZh/TMIroT0TCx1KXjVcAg dhDARiZAdu3OgSc6OsWqmH+DELEq6dFVA5F+DDBGAb8bFZqlJc7cuMHWInwNsNXT so4bnEQ835alTbsdYtqs5DUNS8heJTAJP4Uz0ehkTh/uNCcvnKeUTw1c2P/lXI1k 7s33gMM+0FXj0swMBw0kKwAF2d9Hhus9UAN7NwjBuOyHcjGRd5q7SAnfWkvKx000 s9jVW19slb2I38gB58nhjOh8s+vXUArgxnV1+kTia1+bJSR5swvVoWRicRXdF0vh TvLX/BjbSIU73g1TnnLNYoBTV3ybFKQ6bVdQW7fzSTDs54dsI1vvdHXi3bYZCpnL HVwQTZRfEzkvb0AdKbcvf8p/TlaAHem3ODqtO1eHvO4if1QJBSn+SptTEeJVYYdK n4B3l/dMoBH4JXJUmEHB9jwAvYOpv/YLAFIvdL7NFwbqGNsC3nfXFcmkVORB1W3B KEMcM2we4bz+uyKMjEAD =5oO7 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "Initial roundup of 4.5 merge window patches - Remove usage of ib_query_device and instead store attributes in ib_device struct - Move iopoll out of block and into lib, rename to irqpoll, and use in several places in the rdma stack as our new completion queue polling library mechanism. Update the other block drivers that already used iopoll to use the new mechanism too. - Replace the per-entry GID table locks with a single GID table lock - IPoIB multicast cleanup - Cleanups to the IB MR facility - Add support for 64bit extended IB counters - Fix for netlink oops while parsing RDMA nl messages - RoCEv2 support for the core IB code - mlx4 RoCEv2 support - mlx5 RoCEv2 support - Cross Channel support for mlx5 - Timestamp support for mlx5 - Atomic support for mlx5 - Raw QP support for mlx5 - MAINTAINERS update for mlx4/mlx5 - Misc ocrdma, qib, nes, usNIC, cxgb3, cxgb4, mlx4, mlx5 updates - Add support for remote invalidate to the iSER driver (pushed through the RDMA tree due to dependencies, acknowledged by nab) - Update to NFSoRDMA (pushed through the RDMA tree due to dependencies, acknowledged by Bruce)" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (169 commits) IB/mlx5: Unify CQ create flags check IB/mlx5: Expose Raw Packet QP to user space consumers {IB, net}/mlx5: Move the modify QP operation table to mlx5_ib IB/mlx5: Support setting Ethernet priority for Raw Packet QPs IB/mlx5: Add Raw Packet QP query functionality IB/mlx5: Add create and destroy functionality for Raw Packet QP IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types IB/mlx5: Allocate a Transport Domain for each ucontext net/mlx5_core: Warn on unsupported events of QP/RQ/SQ net/mlx5_core: Add RQ and SQ event handling net/mlx5_core: Export transport objects IB/mlx5: Expose CQE version to user-space IB/mlx5: Add CQE version 1 support to user QPs and SRQs IB/mlx5: Fix data validation in mlx5_ib_alloc_ucontext IB/sa: Fix netlink local service GFP crash IB/srpt: Remove redundant wc array IB/qib: Improve ipoib UD performance IB/mlx4: Advertise RoCE v2 support IB/mlx4: Create and use another QP1 for RoCEv2 IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers ... |
|
Josh Poimboeuf | 24b1e5d7f5 |
tools subcmd: Add missing NORETURN define for parse-options.h
parse-options.h uses the NORETURN macro without defining it. perf doesn't see a build error because it defines the macro in util.h before including parse-options.h. But any other tool including it will see an error. Define the macro in parse-options.h (if not already defined) so that other tools can include it. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris J Arges <chris.j.arges@canonical.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Cc: x86@kernel.org Link: http://lkml.kernel.org/r/6c16294ac6dbe5e2ca28fd935fe4389996588564.1450442274.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Naveen N. Rao | d5ef314035 |
perf bpf: Fix build breakage due to libbpf
perf build is currently (v4.4-rc5) broken on powerpc: bpf.c:28:4: error: #error __NR_bpf not defined. libbpf does not support your arch. # error __NR_bpf not defined. libbpf does not support your arch. ^ Fix this by including tools/scripts/Makefile.arch for the proper $ARCH macro. While at it, remove redundant LP64 macro definition. Also, since libbpf require $(srctree) now, detect the path of srctree like perf. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1452520124-2073-10-git-send-email-wangnan0@huawei.com [Use tools/scripts/Makefile.arch] Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Wang Nan | 8f9e05fb29 |
perf tools: Fix PowerPC native building
Checks BPF syscall number, turn off libbpf building on platform doesn't correctly support sys_bpf instead of blocking compiling. Reported-and-Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1452520124-2073-7-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Jiri Olsa | 24ee9b57b9 |
tools lockdep: Add *.cmd files clean up
Add *.cmd files to be removed within clean target. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1452509693-13452-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Jiri Olsa | 22992a3208 |
tools bpf: Add *.cmd files clean up
Add *.cmd files to be removed within clean target. Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1452509693-13452-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 915b0882c3 |
tools lib: Move bitmap.[ch] from tools/perf/ to tools/{lib,include}/
So that lib/find_bit.c doesn't requires anything inside tools/perf/ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: George Spelvin <linux@horizon.com Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Wang Nan <wangnan0@huawei.com> Cc: Yury Norov <yury.norov@gmail.com> Link: http://lkml.kernel.org/n/tip-7lxe7jgohaac5faodndhdmvk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 64af4e0da4 |
tools lib: Sync tools/lib/find_bit.c with the kernel
Need to move the bitmap.[ch] things from tools/perf/ to tools/lib, will be done in the next patches. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: George Spelvin <linux@horizon.com Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Wang Nan <wangnan0@huawei.com> Cc: Yury Norov <yury.norov@gmail.com> Link: http://lkml.kernel.org/n/tip-5fys65wkd7gu8j7a7xgukc5t@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Arnaldo Carvalho de Melo | 552eb975b8 |
tools lib: Move find_next_bit.c to tools/lib/
The commit that introduced it should've moved it to the same place, plus
the 'tools/' prefix, but instead moved it to a bogus tools/lib/util/
directory, being the only file there.
Move it to tools/lib/find_bit.c, picking the name for the file where
these routines live since:
|
|
Jiri Olsa | 58683600df |
perf build: Use FEATURE-DUMP in bpf subproject
Using FEATURE-DUMP in bpf subproject for features detection in case bpf is built via perf. Keeping the current features detection otherwise. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Wang Nan <wangnan0@huawei.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama <pi3orama@163.com> Link: http://lkml.kernel.org/r/1450893514-9158-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Namhyung Kim | be45d40efe |
tools lib traceevent: Factor out and export print_event_field[s]()
The print_event_field() and print_event_fields() functions print basic information of a given field or event without the print format. They'll be used by dynamic sort keys later. Committer note: Rename it to pevent_print_field[s]() to get proper namespacing, as discussed with Steven Rostedt. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450876121-22494-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Josh Poimboeuf | 1843b4e057 |
tools subcmd: Rename subcmd header include guards
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/d8081e7528b25ad91f4154b6a3fd063e93c108ec.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Josh Poimboeuf | 4b6ab94eab |
perf subcmd: Create subcmd library
Move the subcommand-related files from perf to a new library named libsubcmd.a. Since we're moving files anyway, go ahead and rename 'exec_cmd.*' to 'exec-cmd.*' to be consistent with the naming of all the other files. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/c0a838d4c878ab17fee50998811612b2281355c1.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Josh Poimboeuf | ce99091730 |
perf tools: Move strlcpy() from perf to tools/lib/string.c
strlcpy() will be needed by the subcmd library. Move it to the shared tools/lib/string.c file which can be used by other tools. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/71e2804b973bf39ad3d3b9be10f99f2ea630be46.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Christoph Hellwig | 511cbce2ff |
irq_poll: make blk-iopoll available outside the block layer
The new name is irq_poll as iopoll is already taken. Better suggestions welcome. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> |
|
Wang Nan | 77ba9a5b48 |
tools lib bpf: Fetch map names from correct strtab
Namhyung Kim pointed out a potential problem in original code that it fetches names of maps from section header string table, which is used to store section names. Original code doesn't cause error because of a LLVM behavior that, it combines shstrtab into strtab. For example: $ echo 'int func() {return 0;}' | x86_64-oe-linux-clang -x c -o temp.o -c - $ readelf -h ./temp.o ELF Header: Magic: 7f 45 4c 46 02 01 01 03 00 00 00 00 00 00 00 00 ... Section header string table index: 1 $ readelf -S ./temp.o There are 10 section headers, starting at offset 0x288: Section Headers: [Nr] Name Type Address Offset Size EntSize Flags Link Info Align [ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0 [ 1] .strtab STRTAB 0000000000000000 00000230 0000000000000051 0000000000000000 0 0 1 ... $ readelf -p .strtab ./temp.o String dump of section '.strtab': [ 1] .text [ 7] .comment [ 10] .bss [ 15] .note.GNU-stack [ 25] .rela.eh_frame [ 34] func [ 39] .strtab [ 41] .symtab [ 49] .data [ 4f] - $ readelf -p .shstrtab ./temp.o readelf: Warning: Section '.shstrtab' was not dumped because it does not exist! Where, 'section header string table index' points to '.strtab', and symbol names are also stored there. However, in case of gcc: $ echo 'int func() {return 0;}' | gcc -x c -o temp.o -c - $ readelf -p .shstrtab ./temp.o String dump of section '.shstrtab': [ 1] .symtab [ 9] .strtab [ 11] .shstrtab [ 1b] .text [ 21] .data [ 27] .bss [ 2c] .comment [ 35] .note.GNU-stack [ 45] .rela.eh_frame $ readelf -p .strtab ./temp.o String dump of section '.strtab': [ 1] func They are separated sections. Although original code doesn't cause error, we'd better use canonical method for fetching symbol names to avoid potential behavior changing. This patch learns from readelf's code, fetches string from sh_link of .symbol section. Signed-off-by: Wang Nan <wangnan0@huawei.com> Reported-and-Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1449541544-67621-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Wang Nan | 973170e667 |
tools lib bpf: Check return value of strdup when reading map names
Commit |
|
Wang Nan | 561bbccac7 |
tools lib bpf: Extract and collect map names from BPF object file
This patch collects name of maps in BPF object files and saves them into 'maps' field in 'struct bpf_object'. 'bpf_object__get_map_by_name' is introduced to retrive fd and definitions of a map through its name. Signed-off-by: He Kuang <hekuang@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1448614067-197576-3-git-send-email-wangnan0@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Wang Nan | 9d759a9b4a |
tools lib bpf: Collect map definition in bpf_object
This patch collects more information from maps sections in BPF object files into 'struct bpf_object', enables later patches access those information (such as the type and size of the map). In this patch, a new handler 'struct bpf_map' is extracted in parallel with bpf_object and bpf_program. Its iterator and accessor is also created. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1448614067-197576-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
He Kuang | 43798bf372 |
bpf tools: Add helper function for updating bpf maps elements
Add bpf_map_update_elem() helper function which calls the sys_bpf syscall to update elements in bpf maps. Upcoming patches will use it to adjust data in map through the perf command line. Signed-off-by: He Kuang <hekuang@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1448372181-151723-4-git-send-email-wangnan0@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Wang Nan | d8ad6a15cc |
tools lib bpf: Don't do a feature check when cleaning
Before this patch libbpf always do feature check even when cleaning. For example: $ cd kernel/tools/lib/bpf $ make Auto-detecting system features: ... libelf: [ on ] ... bpf: [ on ] CC libbpf.o CC bpf.o LD libbpf-in.o LINK libbpf.a LINK libbpf.so $ make clean CLEAN libbpf CLEAN core-gen $ make clean Auto-detecting system features: ... libelf: [ on ] ... bpf: [ on ] CLEAN libbpf CLEAN core-gen $ Although the first 'make clean' doesn't show feature check result, it still does the check. No output because check result is similar to FEATURE-DUMP.libbpf. This patch uses same method as perf to turn off feature checking when 'make clean'. Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1448372181-151723-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Steven Rostedt | 32abc2ede5 |
tools lib traceevent: Fix output of %llu for 64 bit values read on 32 bit machines
When a long value is read on 32 bit machines for 64 bit output, the parsing needs to change "%lu" into "%llu", as the value is read natively. Unfortunately, if "%llu" is already there, the code will add another "l" to it and fail to parse it properly. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20151116172516.4b79b109@gandalf.local.home Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Ingo Molnar | 8c2accc8ca |
perf/core improvements and fixes:
User visible: - Allows BPF scriptlets specify arguments to be fetched using DWARF info, using a prologue generated at compile/build time (He Kuang, Wang Nan) - Allow attaching BPF scriptlets to module symbols (Wang Nan) - Allow attaching BPF scriptlets to userspace code using uprobe (Wang Nan) - BPF programs now can specify 'perf probe' tunables via its section name, separating key=val values using semicolons (Wang Nan) Testing some of these new BPF features: Use case: get callchains when receiving SSL packets, filter then in the kernel, at arbitrary place. # cat ssl.bpf.c #define SEC(NAME) __attribute__((section(NAME), used)) struct pt_regs; SEC("func=__inet_lookup_established hnum") int func(struct pt_regs *ctx, int err, unsigned short port) { return err == 0 && port == 443; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; # # perf record -a -g -e ssl.bpf.c ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.787 MB perf.data (3 samples) ] # perf script | head -30 swapper 0 [000] 58783.268118: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb 8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux) 896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux) 8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux) 855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) 8572a8 process_backlog (/lib/modules/4.3.0+/build/vmlinux) 856b11 net_rx_action (/lib/modules/4.3.0+/build/vmlinux) 2a284b __do_softirq (/lib/modules/4.3.0+/build/vmlinux) 2a2ba3 irq_exit (/lib/modules/4.3.0+/build/vmlinux) 96b7a4 do_IRQ (/lib/modules/4.3.0+/build/vmlinux) 969807 ret_from_intr (/lib/modules/4.3.0+/build/vmlinux) 2dede5 cpu_startup_entry (/lib/modules/4.3.0+/build/vmlinux) 95d5bc rest_init (/lib/modules/4.3.0+/build/vmlinux) 1163ffa start_kernel ([kernel.vmlinux].init.text) 11634d7 x86_64_start_reservations ([kernel.vmlinux].init.text) 1163623 x86_64_start_kernel ([kernel.vmlinux].init.text) qemu-system-x86 9178 [003] 58785.792417: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb 8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux) 896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux) 8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux) 855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) 856660 netif_receive_skb_internal (/lib/modules/4.3.0+/build/vmlinux) 8566ec netif_receive_skb_sk (/lib/modules/4.3.0+/build/vmlinux) 430a br_handle_frame_finish ([bridge]) 48bc br_handle_frame ([bridge]) 855f44 __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) # Use 'perf probe' various options to list functions, see what variables can be collected at any given point, experiment first collecting without a filter, then filter, use it together with 'perf trace', 'perf top', with or without callchains, if it explodes, please tell us! - Introduce a new callchain mode: "folded", that will list per line representations of all callchains for a give histogram entry, facilitating 'perf report' output processing by other tools, such as Brendan Gregg's flamegraph tools (Namhyung Kim) E.g: # perf report | grep -v ^# | head 18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry | ---cpu_startup_entry | |--12.07%--start_secondary | --6.30%--rest_init start_kernel x86_64_start_reservations x86_64_start_kernel # Becomes, in "folded" mode: # perf report -g folded | grep -v ^# | head -5 18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry 12.07% cpu_startup_entry;start_secondary 6.30% cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 16.90% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle 11.23% call_cpuidle;cpu_startup_entry;start_secondary 5.67% call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 16.90% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter 11.23% cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary 5.67% cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 15.12% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state # The user can also select one of "count", "period" or "percent" as the first column. Infrastructure: - Fix multiple leaks found with valgrind and a refcount debugger (Masami Hiramatsu) - Add further 'perf test' entries for BPF and LLVM (Wang Nan) - Improve 'perf test' to suport subtests, so that the series of tests performed in the LLVM and BPF main tests appear in the default 'perf test' output (Wang Nan) - Move memdup() from tools/perf to tools/lib/string.c (Arnaldo Carvalho de Melo) - Adopt strtobool() from the kernel into tools/lib/ (Wang Nan) - Fix selftests_install tools/ Makefile rule (Kevin Hilman) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWTgrBAAoJENZQFvNTUqpA0wsP/RoB2X6PKjYX+R9C7uFyRVJP n5Du19VGfYDFrenn+98J+yjFNyGv9TtkoVF9mK51uPkfjgVGNgl3Uzkrg1Z1dv5a j3o/UBq43xkO1+ws5cavsN3s/WC4bJSEWT0NgE5vsIwwtyu4VnN22nryVawJ6wI9 tbTfPMtEaUW7RV9Bkgxiot+hxHngrAX2xvWoUpzd5no6YFT24dPZvzEu9n05+lCD 40XA6Ow2Q/Dsknt+3owp/5UlxeN+Ox94KvLDTkpn0rE0tggaT6EhPi05Y93eNb9v i1tm2oDQ4rJceSjn013kPdbCs3IZ9EIeNYkcgcik0BqsjNPRPnbLCn6tygliuwod f+6y/5dPSH+F/IN+X0lMOUGp6tYWnCCvSSKW03FYDkUSsr78t394637y0/rKk2Mr +UOIu2Vu22ktBYmC9GhEWtTDcjHdFdVtSA0EdMiDCH9Ey0JGbMAqvF1Mexc2gd+U 3Ez+yBSDYcxzKHKVxPr24CkTMMXOsbM+5T2+eD5QhFEWPGD/opqyQrZao21c1zoG MTglrGiu2CAy7RlOIz37onCKiBrZ9slmYu03G0pmHEgj+a4c/XPf2lkgvx0jXsfK x+uYstIozXNJgt4OSSYq7ASiTkJwGO8WExEuZN3g5LmXyqftWFTcFgyUENWy+Xal Ck6icpgY7M5hIHff8bRR =lH0L -----END PGP SIGNATURE----- Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Allows BPF scriptlets specify arguments to be fetched using DWARF info, using a prologue generated at compile/build time (He Kuang, Wang Nan) - Allow attaching BPF scriptlets to module symbols (Wang Nan) - Allow attaching BPF scriptlets to userspace code using uprobe (Wang Nan) - BPF programs now can specify 'perf probe' tunables via its section name, separating key=val values using semicolons (Wang Nan) Testing some of these new BPF features: Use case: get callchains when receiving SSL packets, filter then in the kernel, at arbitrary place. # cat ssl.bpf.c #define SEC(NAME) __attribute__((section(NAME), used)) struct pt_regs; SEC("func=__inet_lookup_established hnum") int func(struct pt_regs *ctx, int err, unsigned short port) { return err == 0 && port == 443; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; # # perf record -a -g -e ssl.bpf.c ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.787 MB perf.data (3 samples) ] # perf script | head -30 swapper 0 [000] 58783.268118: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb 8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux) 896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux) 8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux) 855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) 8572a8 process_backlog (/lib/modules/4.3.0+/build/vmlinux) 856b11 net_rx_action (/lib/modules/4.3.0+/build/vmlinux) 2a284b __do_softirq (/lib/modules/4.3.0+/build/vmlinux) 2a2ba3 irq_exit (/lib/modules/4.3.0+/build/vmlinux) 96b7a4 do_IRQ (/lib/modules/4.3.0+/build/vmlinux) 969807 ret_from_intr (/lib/modules/4.3.0+/build/vmlinux) 2dede5 cpu_startup_entry (/lib/modules/4.3.0+/build/vmlinux) 95d5bc rest_init (/lib/modules/4.3.0+/build/vmlinux) 1163ffa start_kernel ([kernel.vmlinux].init.text) 11634d7 x86_64_start_reservations ([kernel.vmlinux].init.text) 1163623 x86_64_start_kernel ([kernel.vmlinux].init.text) qemu-system-x86 9178 [003] 58785.792417: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb 8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux) 896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux) 8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux) 855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) 856660 netif_receive_skb_internal (/lib/modules/4.3.0+/build/vmlinux) 8566ec netif_receive_skb_sk (/lib/modules/4.3.0+/build/vmlinux) 430a br_handle_frame_finish ([bridge]) 48bc br_handle_frame ([bridge]) 855f44 __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) # Use 'perf probe' various options to list functions, see what variables can be collected at any given point, experiment first collecting without a filter, then filter, use it together with 'perf trace', 'perf top', with or without callchains, if it explodes, please tell us! - Introduce a new callchain mode: "folded", that will list per line representations of all callchains for a give histogram entry, facilitating 'perf report' output processing by other tools, such as Brendan Gregg's flamegraph tools (Namhyung Kim) E.g: # perf report | grep -v ^# | head 18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry | ---cpu_startup_entry | |--12.07%--start_secondary | --6.30%--rest_init start_kernel x86_64_start_reservations x86_64_start_kernel # Becomes, in "folded" mode: # perf report -g folded | grep -v ^# | head -5 18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry 12.07% cpu_startup_entry;start_secondary 6.30% cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 16.90% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle 11.23% call_cpuidle;cpu_startup_entry;start_secondary 5.67% call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 16.90% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter 11.23% cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary 5.67% cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 15.12% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state # The user can also select one of "count", "period" or "percent" as the first column. Infrastructure changes: - Fix multiple leaks found with Valgrind and a refcount debugger (Masami Hiramatsu) - Add further 'perf test' entries for BPF and LLVM (Wang Nan) - Improve 'perf test' to suport subtests, so that the series of tests performed in the LLVM and BPF main tests appear in the default 'perf test' output (Wang Nan) - Move memdup() from tools/perf to tools/lib/string.c (Arnaldo Carvalho de Melo) - Adopt strtobool() from the kernel into tools/lib/ (Wang Nan) - Fix selftests_install tools/ Makefile rule (Kevin Hilman) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> |