mirror of https://gitee.com/openkylin/linux.git
A few more minor updates:
- Show the tgid mappings for user space trace tools to use - Fix and optimize the comm and tgid cache recording - Sanitize derived kprobe names - Ftrace selftest updates - trace file header fix - Update of Documentation/trace/ftrace.txt - Compiler warning fixes - Fix possible uninitialized variable -----BEGIN PGP SIGNATURE----- iQExBAABCAAbBQJZZ2rbFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L V3MIAI3NZ3dr0dKJ7DMF1jsQc24YF/bMG2noWm2b9+H/sO+gbnJKsizqzrB2Cm8S lFCYGSydLKGGZgKob3wkAX15iO2fxcUvJOKzkKxmyDbwAteABRf9LSr/llthRIsT 8kSPI5bgJ5dah+lvhl9+1ekarsIZGr41svY97Knj9A2K18kQplnSNqgatkIuV2Kn hIoiPI0tG2y27In2JJoaTedAHj4NIwmI3nhTt6nks0GN7ICx3bMcvdE9l+zB+OLJ akAehsTk3kcNb66ttoj6ZTzGZ7kaes96Cl6/uamVpXzh3SXla36ux1r9Kp8bgONE EgrJwbRwU8BMDaattutDxT7/XmU= =TPGB -----END PGP SIGNATURE----- Merge tag 'trace-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull more tracing updates from Steven Rostedt: "A few more minor updates: - Show the tgid mappings for user space trace tools to use - Fix and optimize the comm and tgid cache recording - Sanitize derived kprobe names - Ftrace selftest updates - trace file header fix - Update of Documentation/trace/ftrace.txt - Compiler warning fixes - Fix possible uninitialized variable" * tag 'trace-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Fix uninitialized variable in match_records() ftrace: Remove an unneeded NULL check ftrace: Hide cached module code for !CONFIG_MODULES tracing: Do note expose stack_trace_filter without DYNAMIC_FTRACE tracing: Update Documentation/trace/ftrace.txt tracing: Fixup trace file header alignment selftests/ftrace: Add a testcase for kprobe event naming selftests/ftrace: Add a test to probe module functions selftests/ftrace: Update multiple kprobes test for powerpc trace/kprobes: Sanitize derived event names tracing: Attempt to record other information even if some fail tracing: Treat recording tgid for idle task as a success tracing: Treat recording comm for idle task as a success tracing: Add saved_tgids file to show cached pid to tgid mappings
This commit is contained in:
commit
bc0f51d359
|
@ -5,10 +5,11 @@ Copyright 2008 Red Hat Inc.
|
||||||
Author: Steven Rostedt <srostedt@redhat.com>
|
Author: Steven Rostedt <srostedt@redhat.com>
|
||||||
License: The GNU Free Documentation License, Version 1.2
|
License: The GNU Free Documentation License, Version 1.2
|
||||||
(dual licensed under the GPL v2)
|
(dual licensed under the GPL v2)
|
||||||
Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton,
|
Original Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton,
|
||||||
John Kacur, and David Teigland.
|
John Kacur, and David Teigland.
|
||||||
Written for: 2.6.28-rc2
|
Written for: 2.6.28-rc2
|
||||||
Updated for: 3.10
|
Updated for: 3.10
|
||||||
|
Updated for: 4.13 - Copyright 2017 VMware Inc. Steven Rostedt
|
||||||
|
|
||||||
Introduction
|
Introduction
|
||||||
------------
|
------------
|
||||||
|
@ -26,9 +27,11 @@ a task is woken to the task is actually scheduled in.
|
||||||
|
|
||||||
One of the most common uses of ftrace is the event tracing.
|
One of the most common uses of ftrace is the event tracing.
|
||||||
Through out the kernel is hundreds of static event points that
|
Through out the kernel is hundreds of static event points that
|
||||||
can be enabled via the debugfs file system to see what is
|
can be enabled via the tracefs file system to see what is
|
||||||
going on in certain parts of the kernel.
|
going on in certain parts of the kernel.
|
||||||
|
|
||||||
|
See events.txt for more information.
|
||||||
|
|
||||||
|
|
||||||
Implementation Details
|
Implementation Details
|
||||||
----------------------
|
----------------------
|
||||||
|
@ -39,34 +42,47 @@ See ftrace-design.txt for details for arch porters and such.
|
||||||
The File System
|
The File System
|
||||||
---------------
|
---------------
|
||||||
|
|
||||||
Ftrace uses the debugfs file system to hold the control files as
|
Ftrace uses the tracefs file system to hold the control files as
|
||||||
well as the files to display output.
|
well as the files to display output.
|
||||||
|
|
||||||
When debugfs is configured into the kernel (which selecting any ftrace
|
When tracefs is configured into the kernel (which selecting any ftrace
|
||||||
option will do) the directory /sys/kernel/debug will be created. To mount
|
option will do) the directory /sys/kernel/tracing will be created. To mount
|
||||||
this directory, you can add to your /etc/fstab file:
|
this directory, you can add to your /etc/fstab file:
|
||||||
|
|
||||||
debugfs /sys/kernel/debug debugfs defaults 0 0
|
tracefs /sys/kernel/tracing tracefs defaults 0 0
|
||||||
|
|
||||||
Or you can mount it at run time with:
|
Or you can mount it at run time with:
|
||||||
|
|
||||||
mount -t debugfs nodev /sys/kernel/debug
|
mount -t tracefs nodev /sys/kernel/tracing
|
||||||
|
|
||||||
For quicker access to that directory you may want to make a soft link to
|
For quicker access to that directory you may want to make a soft link to
|
||||||
it:
|
it:
|
||||||
|
|
||||||
ln -s /sys/kernel/debug /debug
|
ln -s /sys/kernel/tracing /tracing
|
||||||
|
|
||||||
Any selected ftrace option will also create a directory called tracing
|
*** NOTICE ***
|
||||||
within the debugfs. The rest of the document will assume that you are in
|
|
||||||
the ftrace directory (cd /sys/kernel/debug/tracing) and will only concentrate
|
Before 4.1, all ftrace tracing control files were within the debugfs
|
||||||
on the files within that directory and not distract from the content with
|
file system, which is typically located at /sys/kernel/debug/tracing.
|
||||||
the extended "/sys/kernel/debug/tracing" path name.
|
For backward compatibility, when mounting the debugfs file system,
|
||||||
|
the tracefs file system will be automatically mounted at:
|
||||||
|
|
||||||
|
/sys/kernel/debug/tracing
|
||||||
|
|
||||||
|
All files located in the tracefs file system will be located in that
|
||||||
|
debugfs file system directory as well.
|
||||||
|
|
||||||
|
*** NOTICE ***
|
||||||
|
|
||||||
|
Any selected ftrace option will also create the tracefs file system.
|
||||||
|
The rest of the document will assume that you are in the ftrace directory
|
||||||
|
(cd /sys/kernel/tracing) and will only concentrate on the files within that
|
||||||
|
directory and not distract from the content with the extended
|
||||||
|
"/sys/kernel/tracing" path name.
|
||||||
|
|
||||||
That's it! (assuming that you have ftrace configured into your kernel)
|
That's it! (assuming that you have ftrace configured into your kernel)
|
||||||
|
|
||||||
After mounting debugfs, you can see a directory called
|
After mounting tracefs you will have access to the control and output files
|
||||||
"tracing". This directory contains the control and output files
|
|
||||||
of ftrace. Here is a list of some of the key files:
|
of ftrace. Here is a list of some of the key files:
|
||||||
|
|
||||||
|
|
||||||
|
@ -92,10 +108,20 @@ of ftrace. Here is a list of some of the key files:
|
||||||
writing to the ring buffer, the tracing overhead may
|
writing to the ring buffer, the tracing overhead may
|
||||||
still be occurring.
|
still be occurring.
|
||||||
|
|
||||||
|
The kernel function tracing_off() can be used within the
|
||||||
|
kernel to disable writing to the ring buffer, which will
|
||||||
|
set this file to "0". User space can re-enable tracing by
|
||||||
|
echoing "1" into the file.
|
||||||
|
|
||||||
|
Note, the function and event trigger "traceoff" will also
|
||||||
|
set this file to zero and stop tracing. Which can also
|
||||||
|
be re-enabled by user space using this file.
|
||||||
|
|
||||||
trace:
|
trace:
|
||||||
|
|
||||||
This file holds the output of the trace in a human
|
This file holds the output of the trace in a human
|
||||||
readable format (described below).
|
readable format (described below). Note, tracing is temporarily
|
||||||
|
disabled while this file is being read (opened).
|
||||||
|
|
||||||
trace_pipe:
|
trace_pipe:
|
||||||
|
|
||||||
|
@ -109,7 +135,8 @@ of ftrace. Here is a list of some of the key files:
|
||||||
will not be read again with a sequential read. The
|
will not be read again with a sequential read. The
|
||||||
"trace" file is static, and if the tracer is not
|
"trace" file is static, and if the tracer is not
|
||||||
adding more data, it will display the same
|
adding more data, it will display the same
|
||||||
information every time it is read.
|
information every time it is read. This file will not
|
||||||
|
disable tracing while being read.
|
||||||
|
|
||||||
trace_options:
|
trace_options:
|
||||||
|
|
||||||
|
@ -128,12 +155,14 @@ of ftrace. Here is a list of some of the key files:
|
||||||
tracing_max_latency:
|
tracing_max_latency:
|
||||||
|
|
||||||
Some of the tracers record the max latency.
|
Some of the tracers record the max latency.
|
||||||
For example, the time interrupts are disabled.
|
For example, the maximum time that interrupts are disabled.
|
||||||
This time is saved in this file. The max trace
|
The maximum time is saved in this file. The max trace will also be
|
||||||
will also be stored, and displayed by "trace".
|
stored, and displayed by "trace". A new max trace will only be
|
||||||
A new max trace will only be recorded if the
|
recorded if the latency is greater than the value in this file
|
||||||
latency is greater than the value in this
|
(in microseconds).
|
||||||
file. (in microseconds)
|
|
||||||
|
By echoing in a time into this file, no latency will be recorded
|
||||||
|
unless it is greater than the time in this file.
|
||||||
|
|
||||||
tracing_thresh:
|
tracing_thresh:
|
||||||
|
|
||||||
|
@ -152,32 +181,34 @@ of ftrace. Here is a list of some of the key files:
|
||||||
that the kernel uses for allocation, usually 4 KB in size).
|
that the kernel uses for allocation, usually 4 KB in size).
|
||||||
If the last page allocated has room for more bytes
|
If the last page allocated has room for more bytes
|
||||||
than requested, the rest of the page will be used,
|
than requested, the rest of the page will be used,
|
||||||
making the actual allocation bigger than requested.
|
making the actual allocation bigger than requested or shown.
|
||||||
( Note, the size may not be a multiple of the page size
|
( Note, the size may not be a multiple of the page size
|
||||||
due to buffer management meta-data. )
|
due to buffer management meta-data. )
|
||||||
|
|
||||||
|
Buffer sizes for individual CPUs may vary
|
||||||
|
(see "per_cpu/cpu0/buffer_size_kb" below), and if they do
|
||||||
|
this file will show "X".
|
||||||
|
|
||||||
buffer_total_size_kb:
|
buffer_total_size_kb:
|
||||||
|
|
||||||
This displays the total combined size of all the trace buffers.
|
This displays the total combined size of all the trace buffers.
|
||||||
|
|
||||||
free_buffer:
|
free_buffer:
|
||||||
|
|
||||||
If a process is performing the tracing, and the ring buffer
|
If a process is performing tracing, and the ring buffer should be
|
||||||
should be shrunk "freed" when the process is finished, even
|
shrunk "freed" when the process is finished, even if it were to be
|
||||||
if it were to be killed by a signal, this file can be used
|
killed by a signal, this file can be used for that purpose. On close
|
||||||
for that purpose. On close of this file, the ring buffer will
|
of this file, the ring buffer will be resized to its minimum size.
|
||||||
be resized to its minimum size. Having a process that is tracing
|
Having a process that is tracing also open this file, when the process
|
||||||
also open this file, when the process exits its file descriptor
|
exits its file descriptor for this file will be closed, and in doing so,
|
||||||
for this file will be closed, and in doing so, the ring buffer
|
the ring buffer will be "freed".
|
||||||
will be "freed".
|
|
||||||
|
|
||||||
It may also stop tracing if disable_on_free option is set.
|
It may also stop tracing if disable_on_free option is set.
|
||||||
|
|
||||||
tracing_cpumask:
|
tracing_cpumask:
|
||||||
|
|
||||||
This is a mask that lets the user only trace
|
This is a mask that lets the user only trace on specified CPUs.
|
||||||
on specified CPUs. The format is a hex string
|
The format is a hex string representing the CPUs.
|
||||||
representing the CPUs.
|
|
||||||
|
|
||||||
set_ftrace_filter:
|
set_ftrace_filter:
|
||||||
|
|
||||||
|
@ -190,6 +221,9 @@ of ftrace. Here is a list of some of the key files:
|
||||||
to be traced. Echoing names of functions into this file
|
to be traced. Echoing names of functions into this file
|
||||||
will limit the trace to only those functions.
|
will limit the trace to only those functions.
|
||||||
|
|
||||||
|
The functions listed in "available_filter_functions" are what
|
||||||
|
can be written into this file.
|
||||||
|
|
||||||
This interface also allows for commands to be used. See the
|
This interface also allows for commands to be used. See the
|
||||||
"Filter commands" section for more details.
|
"Filter commands" section for more details.
|
||||||
|
|
||||||
|
@ -202,7 +236,14 @@ of ftrace. Here is a list of some of the key files:
|
||||||
|
|
||||||
set_ftrace_pid:
|
set_ftrace_pid:
|
||||||
|
|
||||||
Have the function tracer only trace a single thread.
|
Have the function tracer only trace the threads whose PID are
|
||||||
|
listed in this file.
|
||||||
|
|
||||||
|
If the "function-fork" option is set, then when a task whose
|
||||||
|
PID is listed in this file forks, the child's PID will
|
||||||
|
automatically be added to this file, and the child will be
|
||||||
|
traced by the function tracer as well. This option will also
|
||||||
|
cause PIDs of tasks that exit to be removed from the file.
|
||||||
|
|
||||||
set_event_pid:
|
set_event_pid:
|
||||||
|
|
||||||
|
@ -217,17 +258,28 @@ of ftrace. Here is a list of some of the key files:
|
||||||
|
|
||||||
set_graph_function:
|
set_graph_function:
|
||||||
|
|
||||||
Set a "trigger" function where tracing should start
|
Functions listed in this file will cause the function graph
|
||||||
with the function graph tracer (See the section
|
tracer to only trace these functions and the functions that
|
||||||
"dynamic ftrace" for more details).
|
they call. (See the section "dynamic ftrace" for more details).
|
||||||
|
|
||||||
|
set_graph_notrace:
|
||||||
|
|
||||||
|
Similar to set_graph_function, but will disable function graph
|
||||||
|
tracing when the function is hit until it exits the function.
|
||||||
|
This makes it possible to ignore tracing functions that are called
|
||||||
|
by a specific function.
|
||||||
|
|
||||||
available_filter_functions:
|
available_filter_functions:
|
||||||
|
|
||||||
This lists the functions that ftrace
|
This lists the functions that ftrace has processed and can trace.
|
||||||
has processed and can trace. These are the function
|
These are the function names that you can pass to
|
||||||
names that you can pass to "set_ftrace_filter" or
|
"set_ftrace_filter" or "set_ftrace_notrace".
|
||||||
"set_ftrace_notrace". (See the section "dynamic ftrace"
|
(See the section "dynamic ftrace" below for more details.)
|
||||||
below for more details.)
|
|
||||||
|
dyn_ftrace_total_info:
|
||||||
|
|
||||||
|
This file is for debugging purposes. The number of functions that
|
||||||
|
have been converted to nops and are available to be traced.
|
||||||
|
|
||||||
enabled_functions:
|
enabled_functions:
|
||||||
|
|
||||||
|
@ -250,12 +302,21 @@ of ftrace. Here is a list of some of the key files:
|
||||||
an 'I' will be displayed on the same line as the function that
|
an 'I' will be displayed on the same line as the function that
|
||||||
can be overridden.
|
can be overridden.
|
||||||
|
|
||||||
|
If the architecture supports it, it will also show what callback
|
||||||
|
is being directly called by the function. If the count is greater
|
||||||
|
than 1 it most likely will be ftrace_ops_list_func().
|
||||||
|
|
||||||
|
If the callback of the function jumps to a trampoline that is
|
||||||
|
specific to a the callback and not the standard trampoline,
|
||||||
|
its address will be printed as well as the function that the
|
||||||
|
trampoline calls.
|
||||||
|
|
||||||
function_profile_enabled:
|
function_profile_enabled:
|
||||||
|
|
||||||
When set it will enable all functions with either the function
|
When set it will enable all functions with either the function
|
||||||
tracer, or if enabled, the function graph tracer. It will
|
tracer, or if configured, the function graph tracer. It will
|
||||||
keep a histogram of the number of functions that were called
|
keep a histogram of the number of functions that were called
|
||||||
and if run with the function graph tracer, it will also keep
|
and if the function graph tracer was configured, it will also keep
|
||||||
track of the time spent in those functions. The histogram
|
track of the time spent in those functions. The histogram
|
||||||
content can be displayed in the files:
|
content can be displayed in the files:
|
||||||
|
|
||||||
|
@ -283,12 +344,11 @@ of ftrace. Here is a list of some of the key files:
|
||||||
printk_formats:
|
printk_formats:
|
||||||
|
|
||||||
This is for tools that read the raw format files. If an event in
|
This is for tools that read the raw format files. If an event in
|
||||||
the ring buffer references a string (currently only trace_printk()
|
the ring buffer references a string, only a pointer to the string
|
||||||
does this), only a pointer to the string is recorded into the buffer
|
is recorded into the buffer and not the string itself. This prevents
|
||||||
and not the string itself. This prevents tools from knowing what
|
tools from knowing what that string was. This file displays the string
|
||||||
that string was. This file displays the string and address for
|
and address for the string allowing tools to map the pointers to what
|
||||||
the string allowing tools to map the pointers to what the
|
the strings were.
|
||||||
strings were.
|
|
||||||
|
|
||||||
saved_cmdlines:
|
saved_cmdlines:
|
||||||
|
|
||||||
|
@ -298,6 +358,22 @@ of ftrace. Here is a list of some of the key files:
|
||||||
comms for events. If a pid for a comm is not listed, then
|
comms for events. If a pid for a comm is not listed, then
|
||||||
"<...>" is displayed in the output.
|
"<...>" is displayed in the output.
|
||||||
|
|
||||||
|
If the option "record-cmd" is set to "0", then comms of tasks
|
||||||
|
will not be saved during recording. By default, it is enabled.
|
||||||
|
|
||||||
|
saved_cmdlines_size:
|
||||||
|
|
||||||
|
By default, 128 comms are saved (see "saved_cmdlines" above). To
|
||||||
|
increase or decrease the amount of comms that are cached, echo
|
||||||
|
in a the number of comms to cache, into this file.
|
||||||
|
|
||||||
|
saved_tgids:
|
||||||
|
|
||||||
|
If the option "record-tgid" is set, on each scheduling context switch
|
||||||
|
the Task Group ID of a task is saved in a table mapping the PID of
|
||||||
|
the thread to its TGID. By default, the "record-tgid" option is
|
||||||
|
disabled.
|
||||||
|
|
||||||
snapshot:
|
snapshot:
|
||||||
|
|
||||||
This displays the "snapshot" buffer and also lets the user
|
This displays the "snapshot" buffer and also lets the user
|
||||||
|
@ -336,6 +412,9 @@ of ftrace. Here is a list of some of the key files:
|
||||||
# cat trace_clock
|
# cat trace_clock
|
||||||
[local] global counter x86-tsc
|
[local] global counter x86-tsc
|
||||||
|
|
||||||
|
The clock with the square brackets around it is the one
|
||||||
|
in effect.
|
||||||
|
|
||||||
local: Default clock, but may not be in sync across CPUs
|
local: Default clock, but may not be in sync across CPUs
|
||||||
|
|
||||||
global: This clock is in sync with all CPUs but may
|
global: This clock is in sync with all CPUs but may
|
||||||
|
@ -448,6 +527,23 @@ of ftrace. Here is a list of some of the key files:
|
||||||
|
|
||||||
See events.txt for more information.
|
See events.txt for more information.
|
||||||
|
|
||||||
|
set_event:
|
||||||
|
|
||||||
|
By echoing in the event into this file, will enable that event.
|
||||||
|
|
||||||
|
See events.txt for more information.
|
||||||
|
|
||||||
|
available_events:
|
||||||
|
|
||||||
|
A list of events that can be enabled in tracing.
|
||||||
|
|
||||||
|
See events.txt for more information.
|
||||||
|
|
||||||
|
hwlat_detector:
|
||||||
|
|
||||||
|
Directory for the Hardware Latency Detector.
|
||||||
|
See "Hardware Latency Detector" section below.
|
||||||
|
|
||||||
per_cpu:
|
per_cpu:
|
||||||
|
|
||||||
This is a directory that contains the trace per_cpu information.
|
This is a directory that contains the trace per_cpu information.
|
||||||
|
@ -539,13 +635,25 @@ Here is the list of current tracers that may be configured.
|
||||||
to draw a graph of function calls similar to C code
|
to draw a graph of function calls similar to C code
|
||||||
source.
|
source.
|
||||||
|
|
||||||
|
"blk"
|
||||||
|
|
||||||
|
The block tracer. The tracer used by the blktrace user
|
||||||
|
application.
|
||||||
|
|
||||||
|
"hwlat"
|
||||||
|
|
||||||
|
The Hardware Latency tracer is used to detect if the hardware
|
||||||
|
produces any latency. See "Hardware Latency Detector" section
|
||||||
|
below.
|
||||||
|
|
||||||
"irqsoff"
|
"irqsoff"
|
||||||
|
|
||||||
Traces the areas that disable interrupts and saves
|
Traces the areas that disable interrupts and saves
|
||||||
the trace with the longest max latency.
|
the trace with the longest max latency.
|
||||||
See tracing_max_latency. When a new max is recorded,
|
See tracing_max_latency. When a new max is recorded,
|
||||||
it replaces the old trace. It is best to view this
|
it replaces the old trace. It is best to view this
|
||||||
trace with the latency-format option enabled.
|
trace with the latency-format option enabled, which
|
||||||
|
happens automatically when the tracer is selected.
|
||||||
|
|
||||||
"preemptoff"
|
"preemptoff"
|
||||||
|
|
||||||
|
@ -571,6 +679,26 @@ Here is the list of current tracers that may be configured.
|
||||||
RT tasks (as the current "wakeup" does). This is useful
|
RT tasks (as the current "wakeup" does). This is useful
|
||||||
for those interested in wake up timings of RT tasks.
|
for those interested in wake up timings of RT tasks.
|
||||||
|
|
||||||
|
"wakeup_dl"
|
||||||
|
|
||||||
|
Traces and records the max latency that it takes for
|
||||||
|
a SCHED_DEADLINE task to be woken (as the "wakeup" and
|
||||||
|
"wakeup_rt" does).
|
||||||
|
|
||||||
|
"mmiotrace"
|
||||||
|
|
||||||
|
A special tracer that is used to trace binary module.
|
||||||
|
It will trace all the calls that a module makes to the
|
||||||
|
hardware. Everything it writes and reads from the I/O
|
||||||
|
as well.
|
||||||
|
|
||||||
|
"branch"
|
||||||
|
|
||||||
|
This tracer can be configured when tracing likely/unlikely
|
||||||
|
calls within the kernel. It will trace when a likely and
|
||||||
|
unlikely branch is hit and if it was correct in its prediction
|
||||||
|
of being correct.
|
||||||
|
|
||||||
"nop"
|
"nop"
|
||||||
|
|
||||||
This is the "trace nothing" tracer. To remove all
|
This is the "trace nothing" tracer. To remove all
|
||||||
|
@ -582,7 +710,7 @@ Examples of using the tracer
|
||||||
----------------------------
|
----------------------------
|
||||||
|
|
||||||
Here are typical examples of using the tracers when controlling
|
Here are typical examples of using the tracers when controlling
|
||||||
them only with the debugfs interface (without using any
|
them only with the tracefs interface (without using any
|
||||||
user-land utilities).
|
user-land utilities).
|
||||||
|
|
||||||
Output format:
|
Output format:
|
||||||
|
@ -674,7 +802,7 @@ why a latency happened. Here is a typical trace.
|
||||||
This shows that the current tracer is "irqsoff" tracing the time
|
This shows that the current tracer is "irqsoff" tracing the time
|
||||||
for which interrupts were disabled. It gives the trace version (which
|
for which interrupts were disabled. It gives the trace version (which
|
||||||
never changes) and the version of the kernel upon which this was executed on
|
never changes) and the version of the kernel upon which this was executed on
|
||||||
(3.10). Then it displays the max latency in microseconds (259 us). The number
|
(3.8). Then it displays the max latency in microseconds (259 us). The number
|
||||||
of trace entries displayed and the total number (both are four: #4/4).
|
of trace entries displayed and the total number (both are four: #4/4).
|
||||||
VP, KP, SP, and HP are always zero and are reserved for later use.
|
VP, KP, SP, and HP are always zero and are reserved for later use.
|
||||||
#P is the number of online CPUs (#P:4).
|
#P is the number of online CPUs (#P:4).
|
||||||
|
@ -709,6 +837,8 @@ explains which is which.
|
||||||
'.' otherwise.
|
'.' otherwise.
|
||||||
|
|
||||||
hardirq/softirq:
|
hardirq/softirq:
|
||||||
|
'Z' - NMI occurred inside a hardirq
|
||||||
|
'z' - NMI is running
|
||||||
'H' - hard irq occurred inside a softirq.
|
'H' - hard irq occurred inside a softirq.
|
||||||
'h' - hard irq is running
|
'h' - hard irq is running
|
||||||
's' - soft irq is running
|
's' - soft irq is running
|
||||||
|
@ -757,24 +887,24 @@ nohex
|
||||||
nobin
|
nobin
|
||||||
noblock
|
noblock
|
||||||
trace_printk
|
trace_printk
|
||||||
nobranch
|
|
||||||
annotate
|
annotate
|
||||||
nouserstacktrace
|
nouserstacktrace
|
||||||
nosym-userobj
|
nosym-userobj
|
||||||
noprintk-msg-only
|
noprintk-msg-only
|
||||||
context-info
|
context-info
|
||||||
nolatency-format
|
nolatency-format
|
||||||
sleep-time
|
|
||||||
graph-time
|
|
||||||
record-cmd
|
record-cmd
|
||||||
|
norecord-tgid
|
||||||
overwrite
|
overwrite
|
||||||
nodisable_on_free
|
nodisable_on_free
|
||||||
irq-info
|
irq-info
|
||||||
markers
|
markers
|
||||||
noevent-fork
|
noevent-fork
|
||||||
function-trace
|
function-trace
|
||||||
|
nofunction-fork
|
||||||
nodisplay-graph
|
nodisplay-graph
|
||||||
nostacktrace
|
nostacktrace
|
||||||
|
nobranch
|
||||||
|
|
||||||
To disable one of the options, echo in the option prepended with
|
To disable one of the options, echo in the option prepended with
|
||||||
"no".
|
"no".
|
||||||
|
@ -830,8 +960,6 @@ Here are the available options:
|
||||||
|
|
||||||
trace_printk - Can disable trace_printk() from writing into the buffer.
|
trace_printk - Can disable trace_printk() from writing into the buffer.
|
||||||
|
|
||||||
branch - Enable branch tracing with the tracer.
|
|
||||||
|
|
||||||
annotate - It is sometimes confusing when the CPU buffers are full
|
annotate - It is sometimes confusing when the CPU buffers are full
|
||||||
and one CPU buffer had a lot of events recently, thus
|
and one CPU buffer had a lot of events recently, thus
|
||||||
a shorter time frame, were another CPU may have only had
|
a shorter time frame, were another CPU may have only had
|
||||||
|
@ -850,7 +978,8 @@ Here are the available options:
|
||||||
<idle>-0 [001] .Ns3 21169.031485: sub_preempt_count <-_raw_spin_unlock
|
<idle>-0 [001] .Ns3 21169.031485: sub_preempt_count <-_raw_spin_unlock
|
||||||
|
|
||||||
userstacktrace - This option changes the trace. It records a
|
userstacktrace - This option changes the trace. It records a
|
||||||
stacktrace of the current userspace thread.
|
stacktrace of the current user space thread after
|
||||||
|
each trace event.
|
||||||
|
|
||||||
sym-userobj - when user stacktrace are enabled, look up which
|
sym-userobj - when user stacktrace are enabled, look up which
|
||||||
object the address belongs to, and print a
|
object the address belongs to, and print a
|
||||||
|
@ -873,29 +1002,21 @@ x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6]
|
||||||
context-info - Show only the event data. Hides the comm, PID,
|
context-info - Show only the event data. Hides the comm, PID,
|
||||||
timestamp, CPU, and other useful data.
|
timestamp, CPU, and other useful data.
|
||||||
|
|
||||||
latency-format - This option changes the trace. When
|
latency-format - This option changes the trace output. When it is enabled,
|
||||||
it is enabled, the trace displays
|
the trace displays additional information about the
|
||||||
additional information about the
|
latency, as described in "Latency trace format".
|
||||||
latencies, as described in "Latency
|
|
||||||
trace format".
|
|
||||||
|
|
||||||
sleep-time - When running function graph tracer, to include
|
|
||||||
the time a task schedules out in its function.
|
|
||||||
When enabled, it will account time the task has been
|
|
||||||
scheduled out as part of the function call.
|
|
||||||
|
|
||||||
graph-time - When running function profiler with function graph tracer,
|
|
||||||
to include the time to call nested functions. When this is
|
|
||||||
not set, the time reported for the function will only
|
|
||||||
include the time the function itself executed for, not the
|
|
||||||
time for functions that it called.
|
|
||||||
|
|
||||||
record-cmd - When any event or tracer is enabled, a hook is enabled
|
record-cmd - When any event or tracer is enabled, a hook is enabled
|
||||||
in the sched_switch trace point to fill comm cache
|
in the sched_switch trace point to fill comm cache
|
||||||
with mapped pids and comms. But this may cause some
|
with mapped pids and comms. But this may cause some
|
||||||
overhead, and if you only care about pids, and not the
|
overhead, and if you only care about pids, and not the
|
||||||
name of the task, disabling this option can lower the
|
name of the task, disabling this option can lower the
|
||||||
impact of tracing.
|
impact of tracing. See "saved_cmdlines".
|
||||||
|
|
||||||
|
record-tgid - When any event or tracer is enabled, a hook is enabled
|
||||||
|
in the sched_switch trace point to fill the cache of
|
||||||
|
mapped Thread Group IDs (TGID) mapping to pids. See
|
||||||
|
"saved_tgids".
|
||||||
|
|
||||||
overwrite - This controls what happens when the trace buffer is
|
overwrite - This controls what happens when the trace buffer is
|
||||||
full. If "1" (default), the oldest events are
|
full. If "1" (default), the oldest events are
|
||||||
|
@ -935,19 +1056,98 @@ x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6]
|
||||||
functions. This keeps the overhead of the tracer down
|
functions. This keeps the overhead of the tracer down
|
||||||
when performing latency tests.
|
when performing latency tests.
|
||||||
|
|
||||||
|
function-fork - When set, tasks with PIDs listed in set_ftrace_pid will
|
||||||
|
have the PIDs of their children added to set_ftrace_pid
|
||||||
|
when those tasks fork. Also, when tasks with PIDs in
|
||||||
|
set_ftrace_pid exit, their PIDs will be removed from the
|
||||||
|
file.
|
||||||
|
|
||||||
display-graph - When set, the latency tracers (irqsoff, wakeup, etc) will
|
display-graph - When set, the latency tracers (irqsoff, wakeup, etc) will
|
||||||
use function graph tracing instead of function tracing.
|
use function graph tracing instead of function tracing.
|
||||||
|
|
||||||
stacktrace - This is one of the options that changes the trace
|
stacktrace - When set, a stack trace is recorded after any trace event
|
||||||
itself. When a trace is recorded, so is the stack
|
is recorded.
|
||||||
of functions. This allows for back traces of
|
|
||||||
trace sites.
|
branch - Enable branch tracing with the tracer. This enables branch
|
||||||
|
tracer along with the currently set tracer. Enabling this
|
||||||
|
with the "nop" tracer is the same as just enabling the
|
||||||
|
"branch" tracer.
|
||||||
|
|
||||||
Note: Some tracers have their own options. They only appear in this
|
Note: Some tracers have their own options. They only appear in this
|
||||||
file when the tracer is active. They always appear in the
|
file when the tracer is active. They always appear in the
|
||||||
options directory.
|
options directory.
|
||||||
|
|
||||||
|
|
||||||
|
Here are the per tracer options:
|
||||||
|
|
||||||
|
Options for function tracer:
|
||||||
|
|
||||||
|
func_stack_trace - When set, a stack trace is recorded after every
|
||||||
|
function that is recorded. NOTE! Limit the functions
|
||||||
|
that are recorded before enabling this, with
|
||||||
|
"set_ftrace_filter" otherwise the system performance
|
||||||
|
will be critically degraded. Remember to disable
|
||||||
|
this option before clearing the function filter.
|
||||||
|
|
||||||
|
Options for function_graph tracer:
|
||||||
|
|
||||||
|
Since the function_graph tracer has a slightly different output
|
||||||
|
it has its own options to control what is displayed.
|
||||||
|
|
||||||
|
funcgraph-overrun - When set, the "overrun" of the graph stack is
|
||||||
|
displayed after each function traced. The
|
||||||
|
overrun, is when the stack depth of the calls
|
||||||
|
is greater than what is reserved for each task.
|
||||||
|
Each task has a fixed array of functions to
|
||||||
|
trace in the call graph. If the depth of the
|
||||||
|
calls exceeds that, the function is not traced.
|
||||||
|
The overrun is the number of functions missed
|
||||||
|
due to exceeding this array.
|
||||||
|
|
||||||
|
funcgraph-cpu - When set, the CPU number of the CPU where the trace
|
||||||
|
occurred is displayed.
|
||||||
|
|
||||||
|
funcgraph-overhead - When set, if the function takes longer than
|
||||||
|
A certain amount, then a delay marker is
|
||||||
|
displayed. See "delay" above, under the
|
||||||
|
header description.
|
||||||
|
|
||||||
|
funcgraph-proc - Unlike other tracers, the process' command line
|
||||||
|
is not displayed by default, but instead only
|
||||||
|
when a task is traced in and out during a context
|
||||||
|
switch. Enabling this options has the command
|
||||||
|
of each process displayed at every line.
|
||||||
|
|
||||||
|
funcgraph-duration - At the end of each function (the return)
|
||||||
|
the duration of the amount of time in the
|
||||||
|
function is displayed in microseconds.
|
||||||
|
|
||||||
|
funcgraph-abstime - When set, the timestamp is displayed at each
|
||||||
|
line.
|
||||||
|
|
||||||
|
funcgraph-irqs - When disabled, functions that happen inside an
|
||||||
|
interrupt will not be traced.
|
||||||
|
|
||||||
|
funcgraph-tail - When set, the return event will include the function
|
||||||
|
that it represents. By default this is off, and
|
||||||
|
only a closing curly bracket "}" is displayed for
|
||||||
|
the return of a function.
|
||||||
|
|
||||||
|
sleep-time - When running function graph tracer, to include
|
||||||
|
the time a task schedules out in its function.
|
||||||
|
When enabled, it will account time the task has been
|
||||||
|
scheduled out as part of the function call.
|
||||||
|
|
||||||
|
graph-time - When running function profiler with function graph tracer,
|
||||||
|
to include the time to call nested functions. When this is
|
||||||
|
not set, the time reported for the function will only
|
||||||
|
include the time the function itself executed for, not the
|
||||||
|
time for functions that it called.
|
||||||
|
|
||||||
|
Options for blk tracer:
|
||||||
|
|
||||||
|
blk_classic - Shows a more minimalistic output.
|
||||||
|
|
||||||
|
|
||||||
irqsoff
|
irqsoff
|
||||||
-------
|
-------
|
||||||
|
@ -1711,6 +1911,85 @@ events.
|
||||||
<idle>-0 2d..3 6us : 0:120:R ==> [002] 5882: 94:R sleep
|
<idle>-0 2d..3 6us : 0:120:R ==> [002] 5882: 94:R sleep
|
||||||
|
|
||||||
|
|
||||||
|
Hardware Latency Detector
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
The hardware latency detector is executed by enabling the "hwlat" tracer.
|
||||||
|
|
||||||
|
NOTE, this tracer will affect the performance of the system as it will
|
||||||
|
periodically make a CPU constantly busy with interrupts disabled.
|
||||||
|
|
||||||
|
# echo hwlat > current_tracer
|
||||||
|
# sleep 100
|
||||||
|
# cat trace
|
||||||
|
# tracer: hwlat
|
||||||
|
#
|
||||||
|
# _-----=> irqs-off
|
||||||
|
# / _----=> need-resched
|
||||||
|
# | / _---=> hardirq/softirq
|
||||||
|
# || / _--=> preempt-depth
|
||||||
|
# ||| / delay
|
||||||
|
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
|
||||||
|
# | | | |||| | |
|
||||||
|
<...>-3638 [001] d... 19452.055471: #1 inner/outer(us): 12/14 ts:1499801089.066141940
|
||||||
|
<...>-3638 [003] d... 19454.071354: #2 inner/outer(us): 11/9 ts:1499801091.082164365
|
||||||
|
<...>-3638 [002] dn.. 19461.126852: #3 inner/outer(us): 12/9 ts:1499801098.138150062
|
||||||
|
<...>-3638 [001] d... 19488.340960: #4 inner/outer(us): 8/12 ts:1499801125.354139633
|
||||||
|
<...>-3638 [003] d... 19494.388553: #5 inner/outer(us): 8/12 ts:1499801131.402150961
|
||||||
|
<...>-3638 [003] d... 19501.283419: #6 inner/outer(us): 0/12 ts:1499801138.297435289 nmi-total:4 nmi-count:1
|
||||||
|
|
||||||
|
|
||||||
|
The above output is somewhat the same in the header. All events will have
|
||||||
|
interrupts disabled 'd'. Under the FUNCTION title there is:
|
||||||
|
|
||||||
|
#1 - This is the count of events recorded that were greater than the
|
||||||
|
tracing_threshold (See below).
|
||||||
|
|
||||||
|
inner/outer(us): 12/14
|
||||||
|
|
||||||
|
This shows two numbers as "inner latency" and "outer latency". The test
|
||||||
|
runs in a loop checking a timestamp twice. The latency detected within
|
||||||
|
the two timestamps is the "inner latency" and the latency detected
|
||||||
|
after the previous timestamp and the next timestamp in the loop is
|
||||||
|
the "outer latency".
|
||||||
|
|
||||||
|
ts:1499801089.066141940
|
||||||
|
|
||||||
|
The absolute timestamp that the event happened.
|
||||||
|
|
||||||
|
nmi-total:4 nmi-count:1
|
||||||
|
|
||||||
|
On architectures that support it, if an NMI comes in during the
|
||||||
|
test, the time spent in NMI is reported in "nmi-total" (in
|
||||||
|
microseconds).
|
||||||
|
|
||||||
|
All architectures that have NMIs will show the "nmi-count" if an
|
||||||
|
NMI comes in during the test.
|
||||||
|
|
||||||
|
hwlat files:
|
||||||
|
|
||||||
|
tracing_threshold - This gets automatically set to "10" to represent 10
|
||||||
|
microseconds. This is the threshold of latency that
|
||||||
|
needs to be detected before the trace will be recorded.
|
||||||
|
|
||||||
|
Note, when hwlat tracer is finished (another tracer is
|
||||||
|
written into "current_tracer"), the original value for
|
||||||
|
tracing_threshold is placed back into this file.
|
||||||
|
|
||||||
|
hwlat_detector/width - The length of time the test runs with interrupts
|
||||||
|
disabled.
|
||||||
|
|
||||||
|
hwlat_detector/window - The length of time of the window which the test
|
||||||
|
runs. That is, the test will run for "width"
|
||||||
|
microseconds per "window" microseconds
|
||||||
|
|
||||||
|
tracing_cpumask - When the test is started. A kernel thread is created that
|
||||||
|
runs the test. This thread will alternate between CPUs
|
||||||
|
listed in the tracing_cpumask between each period
|
||||||
|
(one "window"). To limit the test to specific CPUs
|
||||||
|
set the mask in this file to only the CPUs that the test
|
||||||
|
should run on.
|
||||||
|
|
||||||
function
|
function
|
||||||
--------
|
--------
|
||||||
|
|
||||||
|
@ -1821,15 +2100,15 @@ something like this simple program:
|
||||||
#define STR(x) _STR(x)
|
#define STR(x) _STR(x)
|
||||||
#define MAX_PATH 256
|
#define MAX_PATH 256
|
||||||
|
|
||||||
const char *find_debugfs(void)
|
const char *find_tracefs(void)
|
||||||
{
|
{
|
||||||
static char debugfs[MAX_PATH+1];
|
static char tracefs[MAX_PATH+1];
|
||||||
static int debugfs_found;
|
static int tracefs_found;
|
||||||
char type[100];
|
char type[100];
|
||||||
FILE *fp;
|
FILE *fp;
|
||||||
|
|
||||||
if (debugfs_found)
|
if (tracefs_found)
|
||||||
return debugfs;
|
return tracefs;
|
||||||
|
|
||||||
if ((fp = fopen("/proc/mounts","r")) == NULL) {
|
if ((fp = fopen("/proc/mounts","r")) == NULL) {
|
||||||
perror("/proc/mounts");
|
perror("/proc/mounts");
|
||||||
|
@ -1839,27 +2118,27 @@ const char *find_debugfs(void)
|
||||||
while (fscanf(fp, "%*s %"
|
while (fscanf(fp, "%*s %"
|
||||||
STR(MAX_PATH)
|
STR(MAX_PATH)
|
||||||
"s %99s %*s %*d %*d\n",
|
"s %99s %*s %*d %*d\n",
|
||||||
debugfs, type) == 2) {
|
tracefs, type) == 2) {
|
||||||
if (strcmp(type, "debugfs") == 0)
|
if (strcmp(type, "tracefs") == 0)
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
fclose(fp);
|
fclose(fp);
|
||||||
|
|
||||||
if (strcmp(type, "debugfs") != 0) {
|
if (strcmp(type, "tracefs") != 0) {
|
||||||
fprintf(stderr, "debugfs not mounted");
|
fprintf(stderr, "tracefs not mounted");
|
||||||
return NULL;
|
return NULL;
|
||||||
}
|
}
|
||||||
|
|
||||||
strcat(debugfs, "/tracing/");
|
strcat(tracefs, "/tracing/");
|
||||||
debugfs_found = 1;
|
tracefs_found = 1;
|
||||||
|
|
||||||
return debugfs;
|
return tracefs;
|
||||||
}
|
}
|
||||||
|
|
||||||
const char *tracing_file(const char *file_name)
|
const char *tracing_file(const char *file_name)
|
||||||
{
|
{
|
||||||
static char trace_file[MAX_PATH+1];
|
static char trace_file[MAX_PATH+1];
|
||||||
snprintf(trace_file, MAX_PATH, "%s/%s", find_debugfs(), file_name);
|
snprintf(trace_file, MAX_PATH, "%s/%s", find_tracefs(), file_name);
|
||||||
return trace_file;
|
return trace_file;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -1898,12 +2177,12 @@ Or this simple script!
|
||||||
------
|
------
|
||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
|
|
||||||
debugfs=`sed -ne 's/^debugfs \(.*\) debugfs.*/\1/p' /proc/mounts`
|
tracefs=`sed -ne 's/^tracefs \(.*\) tracefs.*/\1/p' /proc/mounts`
|
||||||
echo nop > $debugfs/tracing/current_tracer
|
echo nop > $tracefs/tracing/current_tracer
|
||||||
echo 0 > $debugfs/tracing/tracing_on
|
echo 0 > $tracefs/tracing/tracing_on
|
||||||
echo $$ > $debugfs/tracing/set_ftrace_pid
|
echo $$ > $tracefs/tracing/set_ftrace_pid
|
||||||
echo function > $debugfs/tracing/current_tracer
|
echo function > $tracefs/tracing/current_tracer
|
||||||
echo 1 > $debugfs/tracing/tracing_on
|
echo 1 > $tracefs/tracing/tracing_on
|
||||||
exec "$@"
|
exec "$@"
|
||||||
------
|
------
|
||||||
|
|
||||||
|
@ -2145,13 +2424,18 @@ include the -pg switch in the compiling of the kernel.)
|
||||||
At compile time every C file object is run through the
|
At compile time every C file object is run through the
|
||||||
recordmcount program (located in the scripts directory). This
|
recordmcount program (located in the scripts directory). This
|
||||||
program will parse the ELF headers in the C object to find all
|
program will parse the ELF headers in the C object to find all
|
||||||
the locations in the .text section that call mcount. (Note, only
|
the locations in the .text section that call mcount. Starting
|
||||||
white listed .text sections are processed, since processing other
|
with gcc verson 4.6, the -mfentry has been added for x86, which
|
||||||
sections like .init.text may cause races due to those sections
|
calls "__fentry__" instead of "mcount". Which is called before
|
||||||
being freed unexpectedly).
|
the creation of the stack frame.
|
||||||
|
|
||||||
A new section called "__mcount_loc" is created that holds
|
Note, not all sections are traced. They may be prevented by either
|
||||||
references to all the mcount call sites in the .text section.
|
a notrace, or blocked another way and all inline functions are not
|
||||||
|
traced. Check the "available_filter_functions" file to see what functions
|
||||||
|
can be traced.
|
||||||
|
|
||||||
|
A section called "__mcount_loc" is created that holds
|
||||||
|
references to all the mcount/fentry call sites in the .text section.
|
||||||
The recordmcount program re-links this section back into the
|
The recordmcount program re-links this section back into the
|
||||||
original object. The final linking stage of the kernel will add all these
|
original object. The final linking stage of the kernel will add all these
|
||||||
references into a single table.
|
references into a single table.
|
||||||
|
@ -2679,7 +2963,7 @@ in time without stopping tracing. Ftrace swaps the current
|
||||||
buffer with a spare buffer, and tracing continues in the new
|
buffer with a spare buffer, and tracing continues in the new
|
||||||
current (=previous spare) buffer.
|
current (=previous spare) buffer.
|
||||||
|
|
||||||
The following debugfs files in "tracing" are related to this
|
The following tracefs files in "tracing" are related to this
|
||||||
feature:
|
feature:
|
||||||
|
|
||||||
snapshot:
|
snapshot:
|
||||||
|
@ -2752,7 +3036,7 @@ cat: snapshot: Device or resource busy
|
||||||
|
|
||||||
Instances
|
Instances
|
||||||
---------
|
---------
|
||||||
In the debugfs tracing directory is a directory called "instances".
|
In the tracefs tracing directory is a directory called "instances".
|
||||||
This directory can have new directories created inside of it using
|
This directory can have new directories created inside of it using
|
||||||
mkdir, and removing directories with rmdir. The directory created
|
mkdir, and removing directories with rmdir. The directory created
|
||||||
with mkdir in this directory will already contain files and other
|
with mkdir in this directory will already contain files and other
|
||||||
|
|
|
@ -3816,7 +3816,7 @@ match_records(struct ftrace_hash *hash, char *func, int len, char *mod)
|
||||||
int exclude_mod = 0;
|
int exclude_mod = 0;
|
||||||
int found = 0;
|
int found = 0;
|
||||||
int ret;
|
int ret;
|
||||||
int clear_filter;
|
int clear_filter = 0;
|
||||||
|
|
||||||
if (func) {
|
if (func) {
|
||||||
func_g.type = filter_parse_regex(func, len, &func_g.search,
|
func_g.type = filter_parse_regex(func, len, &func_g.search,
|
||||||
|
@ -3950,7 +3950,7 @@ static int cache_mod(struct trace_array *tr,
|
||||||
continue;
|
continue;
|
||||||
|
|
||||||
/* no func matches all */
|
/* no func matches all */
|
||||||
if (!func || strcmp(func, "*") == 0 ||
|
if (strcmp(func, "*") == 0 ||
|
||||||
(ftrace_mod->func &&
|
(ftrace_mod->func &&
|
||||||
strcmp(ftrace_mod->func, func) == 0)) {
|
strcmp(ftrace_mod->func, func) == 0)) {
|
||||||
ret = 0;
|
ret = 0;
|
||||||
|
@ -3978,6 +3978,7 @@ static int
|
||||||
ftrace_set_regex(struct ftrace_ops *ops, unsigned char *buf, int len,
|
ftrace_set_regex(struct ftrace_ops *ops, unsigned char *buf, int len,
|
||||||
int reset, int enable);
|
int reset, int enable);
|
||||||
|
|
||||||
|
#ifdef CONFIG_MODULES
|
||||||
static void process_mod_list(struct list_head *head, struct ftrace_ops *ops,
|
static void process_mod_list(struct list_head *head, struct ftrace_ops *ops,
|
||||||
char *mod, bool enable)
|
char *mod, bool enable)
|
||||||
{
|
{
|
||||||
|
@ -4068,6 +4069,7 @@ static void process_cached_mods(const char *mod_name)
|
||||||
|
|
||||||
kfree(mod);
|
kfree(mod);
|
||||||
}
|
}
|
||||||
|
#endif
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* We register the module command as a template to show others how
|
* We register the module command as a template to show others how
|
||||||
|
|
|
@ -1916,7 +1916,11 @@ static int trace_save_cmdline(struct task_struct *tsk)
|
||||||
{
|
{
|
||||||
unsigned pid, idx;
|
unsigned pid, idx;
|
||||||
|
|
||||||
if (!tsk->pid || unlikely(tsk->pid > PID_MAX_DEFAULT))
|
/* treat recording of idle task as a success */
|
||||||
|
if (!tsk->pid)
|
||||||
|
return 1;
|
||||||
|
|
||||||
|
if (unlikely(tsk->pid > PID_MAX_DEFAULT))
|
||||||
return 0;
|
return 0;
|
||||||
|
|
||||||
/*
|
/*
|
||||||
|
@ -2002,7 +2006,11 @@ int trace_find_tgid(int pid)
|
||||||
|
|
||||||
static int trace_save_tgid(struct task_struct *tsk)
|
static int trace_save_tgid(struct task_struct *tsk)
|
||||||
{
|
{
|
||||||
if (unlikely(!tgid_map || !tsk->pid || tsk->pid > PID_MAX_DEFAULT))
|
/* treat recording of idle task as a success */
|
||||||
|
if (!tsk->pid)
|
||||||
|
return 1;
|
||||||
|
|
||||||
|
if (unlikely(!tgid_map || tsk->pid > PID_MAX_DEFAULT))
|
||||||
return 0;
|
return 0;
|
||||||
|
|
||||||
tgid_map[tsk->pid] = tsk->tgid;
|
tgid_map[tsk->pid] = tsk->tgid;
|
||||||
|
@ -2029,11 +2037,20 @@ static bool tracing_record_taskinfo_skip(int flags)
|
||||||
*/
|
*/
|
||||||
void tracing_record_taskinfo(struct task_struct *task, int flags)
|
void tracing_record_taskinfo(struct task_struct *task, int flags)
|
||||||
{
|
{
|
||||||
|
bool done;
|
||||||
|
|
||||||
if (tracing_record_taskinfo_skip(flags))
|
if (tracing_record_taskinfo_skip(flags))
|
||||||
return;
|
return;
|
||||||
if ((flags & TRACE_RECORD_CMDLINE) && !trace_save_cmdline(task))
|
|
||||||
return;
|
/*
|
||||||
if ((flags & TRACE_RECORD_TGID) && !trace_save_tgid(task))
|
* Record as much task information as possible. If some fail, continue
|
||||||
|
* to try to record the others.
|
||||||
|
*/
|
||||||
|
done = !(flags & TRACE_RECORD_CMDLINE) || trace_save_cmdline(task);
|
||||||
|
done &= !(flags & TRACE_RECORD_TGID) || trace_save_tgid(task);
|
||||||
|
|
||||||
|
/* If recording any information failed, retry again soon. */
|
||||||
|
if (!done)
|
||||||
return;
|
return;
|
||||||
|
|
||||||
__this_cpu_write(trace_taskinfo_save, false);
|
__this_cpu_write(trace_taskinfo_save, false);
|
||||||
|
@ -2050,15 +2067,22 @@ void tracing_record_taskinfo(struct task_struct *task, int flags)
|
||||||
void tracing_record_taskinfo_sched_switch(struct task_struct *prev,
|
void tracing_record_taskinfo_sched_switch(struct task_struct *prev,
|
||||||
struct task_struct *next, int flags)
|
struct task_struct *next, int flags)
|
||||||
{
|
{
|
||||||
|
bool done;
|
||||||
|
|
||||||
if (tracing_record_taskinfo_skip(flags))
|
if (tracing_record_taskinfo_skip(flags))
|
||||||
return;
|
return;
|
||||||
|
|
||||||
if ((flags & TRACE_RECORD_CMDLINE) &&
|
/*
|
||||||
(!trace_save_cmdline(prev) || !trace_save_cmdline(next)))
|
* Record as much task information as possible. If some fail, continue
|
||||||
return;
|
* to try to record the others.
|
||||||
|
*/
|
||||||
|
done = !(flags & TRACE_RECORD_CMDLINE) || trace_save_cmdline(prev);
|
||||||
|
done &= !(flags & TRACE_RECORD_CMDLINE) || trace_save_cmdline(next);
|
||||||
|
done &= !(flags & TRACE_RECORD_TGID) || trace_save_tgid(prev);
|
||||||
|
done &= !(flags & TRACE_RECORD_TGID) || trace_save_tgid(next);
|
||||||
|
|
||||||
if ((flags & TRACE_RECORD_TGID) &&
|
/* If recording any information failed, retry again soon. */
|
||||||
(!trace_save_tgid(prev) || !trace_save_tgid(next)))
|
if (!done)
|
||||||
return;
|
return;
|
||||||
|
|
||||||
__this_cpu_write(trace_taskinfo_save, false);
|
__this_cpu_write(trace_taskinfo_save, false);
|
||||||
|
@ -3334,14 +3358,23 @@ static void print_func_help_header_irq(struct trace_buffer *buf, struct seq_file
|
||||||
unsigned int flags)
|
unsigned int flags)
|
||||||
{
|
{
|
||||||
bool tgid = flags & TRACE_ITER_RECORD_TGID;
|
bool tgid = flags & TRACE_ITER_RECORD_TGID;
|
||||||
|
const char tgid_space[] = " ";
|
||||||
|
const char space[] = " ";
|
||||||
|
|
||||||
seq_printf(m, "# %s _-----=> irqs-off\n", tgid ? " " : "");
|
seq_printf(m, "# %s _-----=> irqs-off\n",
|
||||||
seq_printf(m, "# %s / _----=> need-resched\n", tgid ? " " : "");
|
tgid ? tgid_space : space);
|
||||||
seq_printf(m, "# %s| / _---=> hardirq/softirq\n", tgid ? " " : "");
|
seq_printf(m, "# %s / _----=> need-resched\n",
|
||||||
seq_printf(m, "# %s|| / _--=> preempt-depth\n", tgid ? " " : "");
|
tgid ? tgid_space : space);
|
||||||
seq_printf(m, "# %s||| / delay\n", tgid ? " " : "");
|
seq_printf(m, "# %s| / _---=> hardirq/softirq\n",
|
||||||
seq_printf(m, "# TASK-PID CPU#%s|||| TIMESTAMP FUNCTION\n", tgid ? " TGID " : "");
|
tgid ? tgid_space : space);
|
||||||
seq_printf(m, "# | | | %s|||| | |\n", tgid ? " | " : "");
|
seq_printf(m, "# %s|| / _--=> preempt-depth\n",
|
||||||
|
tgid ? tgid_space : space);
|
||||||
|
seq_printf(m, "# %s||| / delay\n",
|
||||||
|
tgid ? tgid_space : space);
|
||||||
|
seq_printf(m, "# TASK-PID CPU#%s|||| TIMESTAMP FUNCTION\n",
|
||||||
|
tgid ? " TGID " : space);
|
||||||
|
seq_printf(m, "# | | | %s|||| | |\n",
|
||||||
|
tgid ? " | " : space);
|
||||||
}
|
}
|
||||||
|
|
||||||
void
|
void
|
||||||
|
@ -4689,6 +4722,76 @@ static const struct file_operations tracing_readme_fops = {
|
||||||
.llseek = generic_file_llseek,
|
.llseek = generic_file_llseek,
|
||||||
};
|
};
|
||||||
|
|
||||||
|
static void *saved_tgids_next(struct seq_file *m, void *v, loff_t *pos)
|
||||||
|
{
|
||||||
|
int *ptr = v;
|
||||||
|
|
||||||
|
if (*pos || m->count)
|
||||||
|
ptr++;
|
||||||
|
|
||||||
|
(*pos)++;
|
||||||
|
|
||||||
|
for (; ptr <= &tgid_map[PID_MAX_DEFAULT]; ptr++) {
|
||||||
|
if (trace_find_tgid(*ptr))
|
||||||
|
return ptr;
|
||||||
|
}
|
||||||
|
|
||||||
|
return NULL;
|
||||||
|
}
|
||||||
|
|
||||||
|
static void *saved_tgids_start(struct seq_file *m, loff_t *pos)
|
||||||
|
{
|
||||||
|
void *v;
|
||||||
|
loff_t l = 0;
|
||||||
|
|
||||||
|
if (!tgid_map)
|
||||||
|
return NULL;
|
||||||
|
|
||||||
|
v = &tgid_map[0];
|
||||||
|
while (l <= *pos) {
|
||||||
|
v = saved_tgids_next(m, v, &l);
|
||||||
|
if (!v)
|
||||||
|
return NULL;
|
||||||
|
}
|
||||||
|
|
||||||
|
return v;
|
||||||
|
}
|
||||||
|
|
||||||
|
static void saved_tgids_stop(struct seq_file *m, void *v)
|
||||||
|
{
|
||||||
|
}
|
||||||
|
|
||||||
|
static int saved_tgids_show(struct seq_file *m, void *v)
|
||||||
|
{
|
||||||
|
int pid = (int *)v - tgid_map;
|
||||||
|
|
||||||
|
seq_printf(m, "%d %d\n", pid, trace_find_tgid(pid));
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
static const struct seq_operations tracing_saved_tgids_seq_ops = {
|
||||||
|
.start = saved_tgids_start,
|
||||||
|
.stop = saved_tgids_stop,
|
||||||
|
.next = saved_tgids_next,
|
||||||
|
.show = saved_tgids_show,
|
||||||
|
};
|
||||||
|
|
||||||
|
static int tracing_saved_tgids_open(struct inode *inode, struct file *filp)
|
||||||
|
{
|
||||||
|
if (tracing_disabled)
|
||||||
|
return -ENODEV;
|
||||||
|
|
||||||
|
return seq_open(filp, &tracing_saved_tgids_seq_ops);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
static const struct file_operations tracing_saved_tgids_fops = {
|
||||||
|
.open = tracing_saved_tgids_open,
|
||||||
|
.read = seq_read,
|
||||||
|
.llseek = seq_lseek,
|
||||||
|
.release = seq_release,
|
||||||
|
};
|
||||||
|
|
||||||
static void *saved_cmdlines_next(struct seq_file *m, void *v, loff_t *pos)
|
static void *saved_cmdlines_next(struct seq_file *m, void *v, loff_t *pos)
|
||||||
{
|
{
|
||||||
unsigned int *ptr = v;
|
unsigned int *ptr = v;
|
||||||
|
@ -7921,6 +8024,9 @@ static __init int tracer_init_tracefs(void)
|
||||||
trace_create_file("saved_cmdlines_size", 0644, d_tracer,
|
trace_create_file("saved_cmdlines_size", 0644, d_tracer,
|
||||||
NULL, &tracing_saved_cmdlines_size_fops);
|
NULL, &tracing_saved_cmdlines_size_fops);
|
||||||
|
|
||||||
|
trace_create_file("saved_tgids", 0444, d_tracer,
|
||||||
|
NULL, &tracing_saved_tgids_fops);
|
||||||
|
|
||||||
trace_eval_init();
|
trace_eval_init();
|
||||||
|
|
||||||
trace_create_eval_file(d_tracer);
|
trace_create_eval_file(d_tracer);
|
||||||
|
|
|
@ -598,6 +598,14 @@ static struct notifier_block trace_kprobe_module_nb = {
|
||||||
.priority = 1 /* Invoked after kprobe module callback */
|
.priority = 1 /* Invoked after kprobe module callback */
|
||||||
};
|
};
|
||||||
|
|
||||||
|
/* Convert certain expected symbols into '_' when generating event names */
|
||||||
|
static inline void sanitize_event_name(char *name)
|
||||||
|
{
|
||||||
|
while (*name++ != '\0')
|
||||||
|
if (*name == ':' || *name == '.')
|
||||||
|
*name = '_';
|
||||||
|
}
|
||||||
|
|
||||||
static int create_trace_kprobe(int argc, char **argv)
|
static int create_trace_kprobe(int argc, char **argv)
|
||||||
{
|
{
|
||||||
/*
|
/*
|
||||||
|
@ -736,6 +744,7 @@ static int create_trace_kprobe(int argc, char **argv)
|
||||||
else
|
else
|
||||||
snprintf(buf, MAX_EVENT_NAME_LEN, "%c_0x%p",
|
snprintf(buf, MAX_EVENT_NAME_LEN, "%c_0x%p",
|
||||||
is_return ? 'r' : 'p', addr);
|
is_return ? 'r' : 'p', addr);
|
||||||
|
sanitize_event_name(buf);
|
||||||
event = buf;
|
event = buf;
|
||||||
}
|
}
|
||||||
tk = alloc_trace_kprobe(group, event, addr, symbol, offset, maxactive,
|
tk = alloc_trace_kprobe(group, event, addr, symbol, offset, maxactive,
|
||||||
|
|
|
@ -406,6 +406,8 @@ static const struct file_operations stack_trace_fops = {
|
||||||
.release = seq_release,
|
.release = seq_release,
|
||||||
};
|
};
|
||||||
|
|
||||||
|
#ifdef CONFIG_DYNAMIC_FTRACE
|
||||||
|
|
||||||
static int
|
static int
|
||||||
stack_trace_filter_open(struct inode *inode, struct file *file)
|
stack_trace_filter_open(struct inode *inode, struct file *file)
|
||||||
{
|
{
|
||||||
|
@ -423,6 +425,8 @@ static const struct file_operations stack_trace_filter_fops = {
|
||||||
.release = ftrace_regex_release,
|
.release = ftrace_regex_release,
|
||||||
};
|
};
|
||||||
|
|
||||||
|
#endif /* CONFIG_DYNAMIC_FTRACE */
|
||||||
|
|
||||||
int
|
int
|
||||||
stack_trace_sysctl(struct ctl_table *table, int write,
|
stack_trace_sysctl(struct ctl_table *table, int write,
|
||||||
void __user *buffer, size_t *lenp,
|
void __user *buffer, size_t *lenp,
|
||||||
|
@ -477,8 +481,10 @@ static __init int stack_trace_init(void)
|
||||||
trace_create_file("stack_trace", 0444, d_tracer,
|
trace_create_file("stack_trace", 0444, d_tracer,
|
||||||
NULL, &stack_trace_fops);
|
NULL, &stack_trace_fops);
|
||||||
|
|
||||||
|
#ifdef CONFIG_DYNAMIC_FTRACE
|
||||||
trace_create_file("stack_trace_filter", 0444, d_tracer,
|
trace_create_file("stack_trace_filter", 0444, d_tracer,
|
||||||
&trace_ops, &stack_trace_filter_fops);
|
&trace_ops, &stack_trace_filter_fops);
|
||||||
|
#endif
|
||||||
|
|
||||||
if (stack_trace_filter_buf[0])
|
if (stack_trace_filter_buf[0])
|
||||||
ftrace_set_early_filter(&trace_ops, stack_trace_filter_buf, 1);
|
ftrace_set_early_filter(&trace_ops, stack_trace_filter_buf, 1);
|
||||||
|
|
|
@ -0,0 +1,36 @@
|
||||||
|
#!/bin/sh
|
||||||
|
# description: Kprobe event auto/manual naming
|
||||||
|
|
||||||
|
[ -f kprobe_events ] || exit_unsupported # this is configurable
|
||||||
|
|
||||||
|
disable_events
|
||||||
|
echo > kprobe_events
|
||||||
|
|
||||||
|
:;: "Add an event on function without name" ;:
|
||||||
|
|
||||||
|
FUNC=`grep " [tT] .*vfs_read$" /proc/kallsyms | tail -n 1 | cut -f 3 -d " "`
|
||||||
|
[ "x" != "x$FUNC" ] || exit_unresolved
|
||||||
|
echo "p $FUNC" > kprobe_events
|
||||||
|
PROBE_NAME=`echo $FUNC | tr ".:" "_"`
|
||||||
|
test -d events/kprobes/p_${PROBE_NAME}_0 || exit_failure
|
||||||
|
|
||||||
|
:;: "Add an event on function with new name" ;:
|
||||||
|
|
||||||
|
echo "p:event1 $FUNC" > kprobe_events
|
||||||
|
test -d events/kprobes/event1 || exit_failure
|
||||||
|
|
||||||
|
:;: "Add an event on function with new name and group" ;:
|
||||||
|
|
||||||
|
echo "p:kprobes2/event2 $FUNC" > kprobe_events
|
||||||
|
test -d events/kprobes2/event2 || exit_failure
|
||||||
|
|
||||||
|
:;: "Add an event on dot function without name" ;:
|
||||||
|
|
||||||
|
FUNC=`grep -m 10 " [tT] .*\.isra\..*$" /proc/kallsyms | tail -n 1 | cut -f 3 -d " "`
|
||||||
|
[ "x" != "x$FUNC" ] || exit_unresolved
|
||||||
|
echo "p $FUNC" > kprobe_events
|
||||||
|
EVENT=`grep $FUNC kprobe_events | cut -f 1 -d " " | cut -f 2 -d:`
|
||||||
|
[ "x" != "x$EVENT" ] || exit_failure
|
||||||
|
test -d events/$EVENT || exit_failure
|
||||||
|
|
||||||
|
echo > kprobe_events
|
|
@ -0,0 +1,28 @@
|
||||||
|
#!/bin/sh
|
||||||
|
# description: Kprobe dynamic event - probing module
|
||||||
|
|
||||||
|
[ -f kprobe_events ] || exit_unsupported # this is configurable
|
||||||
|
|
||||||
|
disable_events
|
||||||
|
echo > kprobe_events
|
||||||
|
|
||||||
|
:;: "Add an event on a module function without specifying event name" ;:
|
||||||
|
|
||||||
|
MOD=`lsmod | head -n 2 | tail -n 1 | cut -f1 -d" "`
|
||||||
|
FUNC=`grep -m 1 ".* t .*\\[$MOD\\]" /proc/kallsyms | xargs | cut -f3 -d" "`
|
||||||
|
[ "x" != "x$MOD" -a "y" != "y$FUNC" ] || exit_unresolved
|
||||||
|
echo "p $MOD:$FUNC" > kprobe_events
|
||||||
|
PROBE_NAME=`echo $MOD:$FUNC | tr ".:" "_"`
|
||||||
|
test -d events/kprobes/p_${PROBE_NAME}_0 || exit_failure
|
||||||
|
|
||||||
|
:;: "Add an event on a module function with new event name" ;:
|
||||||
|
|
||||||
|
echo "p:event1 $MOD:$FUNC" > kprobe_events
|
||||||
|
test -d events/kprobes/event1 || exit_failure
|
||||||
|
|
||||||
|
:;: "Add an event on a module function with new event and group name" ;:
|
||||||
|
|
||||||
|
echo "p:kprobes1/event1 $MOD:$FUNC" > kprobe_events
|
||||||
|
test -d events/kprobes1/event1 || exit_failure
|
||||||
|
|
||||||
|
echo > kprobe_events
|
|
@ -2,10 +2,10 @@
|
||||||
# description: Register/unregister many kprobe events
|
# description: Register/unregister many kprobe events
|
||||||
|
|
||||||
# ftrace fentry skip size depends on the machine architecture.
|
# ftrace fentry skip size depends on the machine architecture.
|
||||||
# Currently HAVE_KPROBES_ON_FTRACE defined on x86 and powerpc
|
# Currently HAVE_KPROBES_ON_FTRACE defined on x86 and powerpc64le
|
||||||
case `uname -m` in
|
case `uname -m` in
|
||||||
x86_64|i[3456]86) OFFS=5;;
|
x86_64|i[3456]86) OFFS=5;;
|
||||||
ppc*) OFFS=4;;
|
ppc64le) OFFS=8;;
|
||||||
*) OFFS=0;;
|
*) OFFS=0;;
|
||||||
esac
|
esac
|
||||||
|
|
||||||
|
|
Loading…
Reference in New Issue