The biggest change for this release is in the histogram code.

- Add "onchange(var)" histogram handler that executes a action when $var
    changes.
 
  - Add new "snapshot()" action for histogram handlers, that causes a
    snapshot of the ring buffer when triggered.
    ie. onchange(var).snapshot() will trigger a snapshot if var changes.
 
  - Add alternative for "trace()" action.
    Currently, to trigger a synthetic event, the name of that event is used
    as the handler name, which is inconsistent with the other actions.
    onchange(var).synthetic(param) where it can now be
    onchange(var).trace(synthetic, param). The older method will still be
    allowed, as long as the synthetic events do not overlap with other
    handler names.
 
  - The histogram documentation at testcases were updated for the new
    changes.
 
 Added a quicker way to enable set_ftrace_filter files, that will make
 it much quicker to bisect tracing a function that shouldn't be traced and
 crashes the kernel. (You can echo in numbers to set_ftrace_filter, and it
 will select the corresponding function that is in
 available_filter_functions).
 
 Some better displaying of the tracing data (and more information was added).
 
 The rest are small fixes and more clean ups to the code.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXIXXjRQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qrSJAQCbGXAvZE+shfKRhbU1cu1C1nwRMHhH
 eeRecJs1RChGFgD/TwatD4FzARQPjfk7snQD5KWPpoRc9grUACC2cZcaWwQ=
 =LVBI
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "The biggest change for this release is in the histogram code:

   - Add "onchange(var)" histogram handler that executes a action when
     $var changes.

   - Add new "snapshot()" action for histogram handlers, that causes a
     snapshot of the ring buffer when triggered. ie.
     onchange(var).snapshot() will trigger a snapshot if var changes.

   - Add alternative for "trace()" action. Currently, to trigger a
     synthetic event, the name of that event is used as the handler
     name, which is inconsistent with the other actions.
     onchange(var).synthetic(param) where it can now be
     onchange(var).trace(synthetic, param). The older method will still
     be allowed, as long as the synthetic events do not overlap with
     other handler names.

   - The histogram documentation at testcases were updated for the new
     changes.

  Outside of the histogram code, we have:

   - Added a quicker way to enable set_ftrace_filter files, that will
     make it much quicker to bisect tracing a function that shouldn't be
     traced and crashes the kernel. (You can echo in numbers to
     set_ftrace_filter, and it will select the corresponding function
     that is in available_filter_functions).

   - Some better displaying of the tracing data (and more information
     was added).

  The rest are small fixes and more clean ups to the code"

* tag 'trace-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (37 commits)
  tracing: Use strncpy instead of memcpy when copying comm in trace.c
  tracing: Use strncpy instead of memcpy when copying comm for hist triggers
  tracing: Use strncpy instead of memcpy for string keys in hist triggers
  tracing: Use str_has_prefix() in synth_event_create()
  x86/ftrace: Fix warning and considate ftrace_jmp_replace() and ftrace_call_replace()
  tracing/perf: Use strndup_user() instead of buggy open-coded version
  doc: trace: Fix documentation for uprobe_profile
  tracing: Fix spelling mistake: "analagous" -> "analogous"
  tracing: Comment why cond_snapshot is checked outside of max_lock protection
  tracing: Add hist trigger action 'expected fail' test case
  tracing: Add alternative synthetic event trace action test case
  tracing: Add hist trigger onchange() handler test case
  tracing: Add hist trigger snapshot() action test case
  tracing: Add SPDX license GPL-2.0 license identifier to inter-event testcases
  tracing: Add alternative synthetic event trace action syntax
  tracing: Add hist trigger onchange() handler Documentation
  tracing: Add hist trigger onchange() handler
  tracing: Add hist trigger snapshot() action Documentation
  tracing: Add hist trigger snapshot() action
  tracing: Add conditional snapshot
  ...
This commit is contained in:
Linus Torvalds 2019-03-11 17:01:32 -07:00
commit 6cdfa54cd2
30 changed files with 1697 additions and 422 deletions

View File

@ -233,6 +233,12 @@ of ftrace. Here is a list of some of the key files:
This interface also allows for commands to be used. See the
"Filter commands" section for more details.
As a speed up, since processing strings can't be quite expensive
and requires a check of all functions registered to tracing, instead
an index can be written into this file. A number (starting with "1")
written will instead select the same corresponding at the line position
of the "available_filter_functions" file.
set_ftrace_notrace:
This has an effect opposite to that of
@ -1396,6 +1402,57 @@ enabling function tracing, we incur an added overhead. This
overhead may extend the latency times. But nevertheless, this
trace has provided some very helpful debugging information.
If we prefer function graph output instead of function, we can set
display-graph option::
with echo 1 > options/display-graph
# tracer: irqsoff
#
# irqsoff latency trace v1.1.5 on 4.20.0-rc6+
# --------------------------------------------------------------------
# latency: 3751 us, #274/274, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: bash-1507 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
# => started at: free_debug_processing
# => ended at: return_to_handler
#
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| /
# REL TIME CPU TASK/PID |||| DURATION FUNCTION CALLS
# | | | | |||| | | | | | |
0 us | 0) bash-1507 | d... | 0.000 us | _raw_spin_lock_irqsave();
0 us | 0) bash-1507 | d..1 | 0.378 us | do_raw_spin_trylock();
1 us | 0) bash-1507 | d..2 | | set_track() {
2 us | 0) bash-1507 | d..2 | | save_stack_trace() {
2 us | 0) bash-1507 | d..2 | | __save_stack_trace() {
3 us | 0) bash-1507 | d..2 | | __unwind_start() {
3 us | 0) bash-1507 | d..2 | | get_stack_info() {
3 us | 0) bash-1507 | d..2 | 0.351 us | in_task_stack();
4 us | 0) bash-1507 | d..2 | 1.107 us | }
[...]
3750 us | 0) bash-1507 | d..1 | 0.516 us | do_raw_spin_unlock();
3750 us | 0) bash-1507 | d..1 | 0.000 us | _raw_spin_unlock_irqrestore();
3764 us | 0) bash-1507 | d..1 | 0.000 us | tracer_hardirqs_on();
bash-1507 0d..1 3792us : <stack trace>
=> free_debug_processing
=> __slab_free
=> kmem_cache_free
=> vm_area_free
=> remove_vma
=> exit_mmap
=> mmput
=> flush_old_exec
=> load_elf_binary
=> search_binary_handler
=> __do_execve_file.isra.32
=> __x64_sys_execve
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe
preemptoff
----------
@ -2784,6 +2841,38 @@ Produces::
We can see that there's no more lock or preempt tracing.
Selecting function filters via index
------------------------------------
Because processing of strings is expensive (the address of the function
needs to be looked up before comparing to the string being passed in),
an index can be used as well to enable functions. This is useful in the
case of setting thousands of specific functions at a time. By passing
in a list of numbers, no string processing will occur. Instead, the function
at the specific location in the internal array (which corresponds to the
functions in the "available_filter_functions" file), is selected.
::
# echo 1 > set_ftrace_filter
Will select the first function listed in "available_filter_functions"
::
# head -1 available_filter_functions
trace_initcall_finish_cb
# cat set_ftrace_filter
trace_initcall_finish_cb
# head -50 available_filter_functions | tail -1
x86_pmu_commit_txn
# echo 1 50 > set_ftrace_filter
# cat set_ftrace_filter
trace_initcall_finish_cb
x86_pmu_commit_txn
Dynamic ftrace with the function graph tracer
---------------------------------------------

View File

@ -25,7 +25,7 @@ Documentation written by Tom Zanussi
hist:keys=<field1[,field2,...]>[:values=<field1[,field2,...]>]
[:sort=<field1[,field2,...]>][:size=#entries][:pause][:continue]
[:clear][:name=histname1] [if <filter>]
[:clear][:name=histname1][:<handler>.<action>] [if <filter>]
When a matching event is hit, an entry is added to a hash table
using the key(s) and value(s) named. Keys and values correspond to
@ -1831,41 +1831,87 @@ and looks and behaves just like any other event::
Like any other event, once a histogram is enabled for the event, the
output can be displayed by reading the event's 'hist' file.
2.2.3 Hist trigger 'actions'
----------------------------
2.2.3 Hist trigger 'handlers' and 'actions'
-------------------------------------------
A hist trigger 'action' is a function that's executed whenever a
histogram entry is added or updated.
A hist trigger 'action' is a function that's executed (in most cases
conditionally) whenever a histogram entry is added or updated.
The default 'action' if no special function is explicitly specified is
as it always has been, to simply update the set of values associated
with an entry. Some applications, however, may want to perform
additional actions at that point, such as generate another event, or
compare and save a maximum.
When a histogram entry is added or updated, a hist trigger 'handler'
is what decides whether the corresponding action is actually invoked
or not.
The following additional actions are available. To specify an action
for a given event, simply specify the action between colons in the
hist trigger specification.
Hist trigger handlers and actions are paired together in the general
form:
- onmatch(matching.event).<synthetic_event_name>(param list)
<handler>.<action>
The 'onmatch(matching.event).<synthetic_event_name>(params)' hist
trigger action is invoked whenever an event matches and the
histogram entry would be added or updated. It causes the named
synthetic event to be generated with the values given in the
To specify a handler.action pair for a given event, simply specify
that handler.action pair between colons in the hist trigger
specification.
In theory, any handler can be combined with any action, but in
practice, not every handler.action combination is currently supported;
if a given handler.action combination isn't supported, the hist
trigger will fail with -EINVAL;
The default 'handler.action' if none is explicity specified is as it
always has been, to simply update the set of values associated with an
entry. Some applications, however, may want to perform additional
actions at that point, such as generate another event, or compare and
save a maximum.
The supported handlers and actions are listed below, and each is
described in more detail in the following paragraphs, in the context
of descriptions of some common and useful handler.action combinations.
The available handlers are:
- onmatch(matching.event) - invoke action on any addition or update
- onmax(var) - invoke action if var exceeds current max
- onchange(var) - invoke action if var changes
The available actions are:
- trace(<synthetic_event_name>,param list) - generate synthetic event
- save(field,...) - save current event fields
- snapshot() - snapshot the trace buffer
The following commonly-used handler.action pairs are available:
- onmatch(matching.event).trace(<synthetic_event_name>,param list)
The 'onmatch(matching.event).trace(<synthetic_event_name>,param
list)' hist trigger action is invoked whenever an event matches
and the histogram entry would be added or updated. It causes the
named synthetic event to be generated with the values given in the
'param list'. The result is the generation of a synthetic event
that consists of the values contained in those variables at the
time the invoking event was hit.
time the invoking event was hit. For example, if the synthetic
event name is 'wakeup_latency', a wakeup_latency event is
generated using onmatch(event).trace(wakeup_latency,arg1,arg2).
The 'param list' consists of one or more parameters which may be
either variables or fields defined on either the 'matching.event'
or the target event. The variables or fields specified in the
param list may be either fully-qualified or unqualified. If a
variable is specified as unqualified, it must be unique between
the two events. A field name used as a param can be unqualified
if it refers to the target event, but must be fully qualified if
it refers to the matching event. A fully-qualified name is of the
form 'system.event_name.$var_name' or 'system.event_name.field'.
There is also an equivalent alternative form available for
generating synthetic events. In this form, the synthetic event
name is used as if it were a function name. For example, using
the 'wakeup_latency' synthetic event name again, the
wakeup_latency event would be generated by invoking it as if it
were a function call, with the event field values passed in as
arguments: onmatch(event).wakeup_latency(arg1,arg2). The syntax
for this form is:
onmatch(matching.event).<synthetic_event_name>(param list)
In either case, the 'param list' consists of one or more
parameters which may be either variables or fields defined on
either the 'matching.event' or the target event. The variables or
fields specified in the param list may be either fully-qualified
or unqualified. If a variable is specified as unqualified, it
must be unique between the two events. A field name used as a
param can be unqualified if it refers to the target event, but
must be fully qualified if it refers to the matching event. A
fully-qualified name is of the form 'system.event_name.$var_name'
or 'system.event_name.field'.
The 'matching.event' specification is simply the fully qualified
event name of the event that matches the target event for the
@ -1896,6 +1942,12 @@ hist trigger specification.
wakeup_new_test($testpid) if comm=="cyclictest"' >> \
/sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger
Or, equivalently, using the 'trace' keyword syntax:
# echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\
trace(wakeup_new_test,$testpid) if comm=="cyclictest"' >> \
/sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger
Creating and displaying a histogram based on those events is now
just a matter of using the fields and new synthetic event in the
tracing/events/synthetic directory, as usual::
@ -2000,6 +2052,212 @@ hist trigger specification.
Entries: 2
Dropped: 0
- onmax(var).snapshot()
The 'onmax(var).snapshot()' hist trigger action is invoked
whenever the value of 'var' associated with a histogram entry
exceeds the current maximum contained in that variable.
The end result is that a global snapshot of the trace buffer will
be saved in the tracing/snapshot file if 'var' exceeds the current
maximum for any hist trigger entry.
Note that in this case the maximum is a global maximum for the
current trace instance, which is the maximum across all buckets of
the histogram. The key of the specific trace event that caused
the global maximum and the global maximum itself are displayed,
along with a message stating that a snapshot has been taken and
where to find it. The user can use the key information displayed
to locate the corresponding bucket in the histogram for even more
detail.
As an example the below defines a couple of hist triggers, one for
sched_waking and another for sched_switch, keyed on pid. Whenever
a sched_waking event occurs, the timestamp is saved in the entry
corresponding to the current pid, and when the scheduler switches
back to that pid, the timestamp difference is calculated. If the
resulting latency, stored in wakeup_lat, exceeds the current
maximum latency, a snapshot is taken. As part of the setup, all
the scheduler events are also enabled, which are the events that
will show up in the snapshot when it is taken at some point:
# echo 1 > /sys/kernel/debug/tracing/events/sched/enable
# echo 'hist:keys=pid:ts0=common_timestamp.usecs \
if comm=="cyclictest"' >> \
/sys/kernel/debug/tracing/events/sched/sched_waking/trigger
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \
onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio, \
prev_comm):onmax($wakeup_lat).snapshot() \
if next_comm=="cyclictest"' >> \
/sys/kernel/debug/tracing/events/sched/sched_switch/trigger
When the histogram is displayed, for each bucket the max value
and the saved values corresponding to the max are displayed
following the rest of the fields.
If a snaphot was taken, there is also a message indicating that,
along with the value and event that triggered the global maximum:
# cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist
{ next_pid: 2101 } hitcount: 200
max: 52 next_prio: 120 next_comm: cyclictest \
prev_pid: 0 prev_prio: 120 prev_comm: swapper/6
{ next_pid: 2103 } hitcount: 1326
max: 572 next_prio: 19 next_comm: cyclictest \
prev_pid: 0 prev_prio: 120 prev_comm: swapper/1
{ next_pid: 2102 } hitcount: 1982 \
max: 74 next_prio: 19 next_comm: cyclictest \
prev_pid: 0 prev_prio: 120 prev_comm: swapper/5
Snapshot taken (see tracing/snapshot). Details:
triggering value { onmax($wakeup_lat) }: 572 \
triggered by event with key: { next_pid: 2103 }
Totals:
Hits: 3508
Entries: 3
Dropped: 0
In the above case, the event that triggered the global maximum has
the key with next_pid == 2103. If you look at the bucket that has
2103 as the key, you'll find the additional values save()'d along
with the local maximum for that bucket, which should be the same
as the global maximum (since that was the same value that
triggered the global snapshot).
And finally, looking at the snapshot data should show at or near
the end the event that triggered the snapshot (in this case you
can verify the timestamps between the sched_waking and
sched_switch events, which should match the time displayed in the
global maximum):
# cat /sys/kernel/debug/tracing/snapshot
<...>-2103 [005] d..3 309.873125: sched_switch: prev_comm=cyclictest prev_pid=2103 prev_prio=19 prev_state=D ==> next_comm=swapper/5 next_pid=0 next_prio=120
<idle>-0 [005] d.h3 309.873611: sched_waking: comm=cyclictest pid=2102 prio=19 target_cpu=005
<idle>-0 [005] dNh4 309.873613: sched_wakeup: comm=cyclictest pid=2102 prio=19 target_cpu=005
<idle>-0 [005] d..3 309.873616: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=cyclictest next_pid=2102 next_prio=19
<...>-2102 [005] d..3 309.873625: sched_switch: prev_comm=cyclictest prev_pid=2102 prev_prio=19 prev_state=D ==> next_comm=swapper/5 next_pid=0 next_prio=120
<idle>-0 [005] d.h3 309.874624: sched_waking: comm=cyclictest pid=2102 prio=19 target_cpu=005
<idle>-0 [005] dNh4 309.874626: sched_wakeup: comm=cyclictest pid=2102 prio=19 target_cpu=005
<idle>-0 [005] dNh3 309.874628: sched_waking: comm=cyclictest pid=2103 prio=19 target_cpu=005
<idle>-0 [005] dNh4 309.874630: sched_wakeup: comm=cyclictest pid=2103 prio=19 target_cpu=005
<idle>-0 [005] d..3 309.874633: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=cyclictest next_pid=2102 next_prio=19
<idle>-0 [004] d.h3 309.874757: sched_waking: comm=gnome-terminal- pid=1699 prio=120 target_cpu=004
<idle>-0 [004] dNh4 309.874762: sched_wakeup: comm=gnome-terminal- pid=1699 prio=120 target_cpu=004
<idle>-0 [004] d..3 309.874766: sched_switch: prev_comm=swapper/4 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=gnome-terminal- next_pid=1699 next_prio=120
gnome-terminal--1699 [004] d.h2 309.874941: sched_stat_runtime: comm=gnome-terminal- pid=1699 runtime=180706 [ns] vruntime=1126870572 [ns]
<idle>-0 [003] d.s4 309.874956: sched_waking: comm=rcu_sched pid=9 prio=120 target_cpu=007
<idle>-0 [003] d.s5 309.874960: sched_wake_idle_without_ipi: cpu=7
<idle>-0 [003] d.s5 309.874961: sched_wakeup: comm=rcu_sched pid=9 prio=120 target_cpu=007
<idle>-0 [007] d..3 309.874963: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=rcu_sched next_pid=9 next_prio=120
rcu_sched-9 [007] d..3 309.874973: sched_stat_runtime: comm=rcu_sched pid=9 runtime=13646 [ns] vruntime=22531430286 [ns]
rcu_sched-9 [007] d..3 309.874978: sched_switch: prev_comm=rcu_sched prev_pid=9 prev_prio=120 prev_state=R+ ==> next_comm=swapper/7 next_pid=0 next_prio=120
<...>-2102 [005] d..4 309.874994: sched_migrate_task: comm=cyclictest pid=2103 prio=19 orig_cpu=5 dest_cpu=1
<...>-2102 [005] d..4 309.875185: sched_wake_idle_without_ipi: cpu=1
<idle>-0 [001] d..3 309.875200: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=cyclictest next_pid=2103 next_prio=19
- onchange(var).save(field,.. .)
The 'onchange(var).save(field,...)' hist trigger action is invoked
whenever the value of 'var' associated with a histogram entry
changes.
The end result is that the trace event fields specified as the
onchange.save() params will be saved if 'var' changes for that
hist trigger entry. This allows context from the event that
changed the value to be saved for later reference. When the
histogram is displayed, additional fields displaying the saved
values will be printed.
- onchange(var).snapshot()
The 'onchange(var).snapshot()' hist trigger action is invoked
whenever the value of 'var' associated with a histogram entry
changes.
The end result is that a global snapshot of the trace buffer will
be saved in the tracing/snapshot file if 'var' changes for any
hist trigger entry.
Note that in this case the changed value is a global variable
associated withe current trace instance. The key of the specific
trace event that caused the value to change and the global value
itself are displayed, along with a message stating that a snapshot
has been taken and where to find it. The user can use the key
information displayed to locate the corresponding bucket in the
histogram for even more detail.
As an example the below defines a hist trigger on the tcp_probe
event, keyed on dport. Whenever a tcp_probe event occurs, the
cwnd field is checked against the current value stored in the
$cwnd variable. If the value has changed, a snapshot is taken.
As part of the setup, all the scheduler and tcp events are also
enabled, which are the events that will show up in the snapshot
when it is taken at some point:
# echo 1 > /sys/kernel/debug/tracing/events/sched/enable
# echo 1 > /sys/kernel/debug/tracing/events/tcp/enable
# echo 'hist:keys=dport:cwnd=snd_cwnd: \
onchange($cwnd).save(snd_wnd,srtt,rcv_wnd): \
onchange($cwnd).snapshot()' >> \
/sys/kernel/debug/tracing/events/tcp/tcp_probe/trigger
When the histogram is displayed, for each bucket the tracked value
and the saved values corresponding to that value are displayed
following the rest of the fields.
If a snaphot was taken, there is also a message indicating that,
along with the value and event that triggered the snapshot:
# cat /sys/kernel/debug/tracing/events/tcp/tcp_probe/hist
{ dport: 1521 } hitcount: 8
changed: 10 snd_wnd: 35456 srtt: 154262 rcv_wnd: 42112
{ dport: 80 } hitcount: 23
changed: 10 snd_wnd: 28960 srtt: 19604 rcv_wnd: 29312
{ dport: 9001 } hitcount: 172
changed: 10 snd_wnd: 48384 srtt: 260444 rcv_wnd: 55168
{ dport: 443 } hitcount: 211
changed: 10 snd_wnd: 26960 srtt: 17379 rcv_wnd: 28800
Snapshot taken (see tracing/snapshot). Details:
triggering value { onchange($cwnd) }: 10
triggered by event with key: { dport: 80 }
Totals:
Hits: 414
Entries: 4
Dropped: 0
In the above case, the event that triggered the snapshot has the
key with dport == 80. If you look at the bucket that has 80 as
the key, you'll find the additional values save()'d along with the
changed value for that bucket, which should be the same as the
global changed value (since that was the same value that triggered
the global snapshot).
And finally, looking at the snapshot data should show at or near
the end the event that triggered the snapshot:
# cat /sys/kernel/debug/tracing/snapshot
gnome-shell-1261 [006] dN.3 49.823113: sched_stat_runtime: comm=gnome-shell pid=1261 runtime=49347 [ns] vruntime=1835730389 [ns]
kworker/u16:4-773 [003] d..3 49.823114: sched_switch: prev_comm=kworker/u16:4 prev_pid=773 prev_prio=120 prev_state=R+ ==> next_comm=kworker/3:2 next_pid=135 next_prio=120
gnome-shell-1261 [006] d..3 49.823114: sched_switch: prev_comm=gnome-shell prev_pid=1261 prev_prio=120 prev_state=R+ ==> next_comm=kworker/6:2 next_pid=387 next_prio=120
kworker/3:2-135 [003] d..3 49.823118: sched_stat_runtime: comm=kworker/3:2 pid=135 runtime=5339 [ns] vruntime=17815800388 [ns]
kworker/6:2-387 [006] d..3 49.823120: sched_stat_runtime: comm=kworker/6:2 pid=387 runtime=9594 [ns] vruntime=14589605367 [ns]
kworker/6:2-387 [006] d..3 49.823122: sched_switch: prev_comm=kworker/6:2 prev_pid=387 prev_prio=120 prev_state=R+ ==> next_comm=gnome-shell next_pid=1261 next_prio=120
kworker/3:2-135 [003] d..3 49.823123: sched_switch: prev_comm=kworker/3:2 prev_pid=135 prev_prio=120 prev_state=T ==> next_comm=swapper/3 next_pid=0 next_prio=120
<idle>-0 [004] ..s7 49.823798: tcp_probe: src=10.0.0.10:54326 dest=23.215.104.193:80 mark=0x0 length=32 snd_nxt=0xe3ae2ff5 snd_una=0xe3ae2ecd snd_cwnd=10 ssthresh=2147483647 snd_wnd=28960 srtt=19604 rcv_wnd=29312
3. User space creating a trigger
--------------------------------

View File

@ -73,10 +73,9 @@ For $comm, the default type is "string"; any other type is invalid.
Event Profiling
---------------
You can check the total number of probe hits and probe miss-hits via
/sys/kernel/debug/tracing/uprobe_profile.
The first column is event name, the second is the number of probe hits,
the third is the number of probe miss-hits.
You can check the total number of probe hits per event via
/sys/kernel/debug/tracing/uprobe_profile. The first column is the filename,
the second is the event name, the third is the number of probe hits.
Usage examples
--------------

View File

@ -49,7 +49,7 @@ int ftrace_arch_code_modify_post_process(void)
union ftrace_code_union {
char code[MCOUNT_INSN_SIZE];
struct {
unsigned char e8;
unsigned char op;
int offset;
} __attribute__((packed));
};
@ -59,20 +59,23 @@ static int ftrace_calc_offset(long ip, long addr)
return (int)(addr - ip);
}
static unsigned char *ftrace_call_replace(unsigned long ip, unsigned long addr)
static unsigned char *
ftrace_text_replace(unsigned char op, unsigned long ip, unsigned long addr)
{
static union ftrace_code_union calc;
calc.e8 = 0xe8;
calc.op = op;
calc.offset = ftrace_calc_offset(ip + MCOUNT_INSN_SIZE, addr);
/*
* No locking needed, this must be called via kstop_machine
* which in essence is like running on a uniprocessor machine.
*/
return calc.code;
}
static unsigned char *
ftrace_call_replace(unsigned long ip, unsigned long addr)
{
return ftrace_text_replace(0xe8, ip, addr);
}
static inline int
within(unsigned long addr, unsigned long start, unsigned long end)
{
@ -665,22 +668,6 @@ int __init ftrace_dyn_arch_init(void)
return 0;
}
#if defined(CONFIG_X86_64) || defined(CONFIG_FUNCTION_GRAPH_TRACER)
static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
{
static union ftrace_code_union calc;
/* Jmp not a call (ignore the .e8) */
calc.e8 = 0xe9;
calc.offset = ftrace_calc_offset(ip + MCOUNT_INSN_SIZE, addr);
/*
* ftrace external locks synchronize the access to the static variable.
*/
return calc.code;
}
#endif
/* Currently only x86_64 supports dynamic trampolines */
#ifdef CONFIG_X86_64
@ -892,8 +879,8 @@ static void *addr_from_call(void *ptr)
return NULL;
/* Make sure this is a call */
if (WARN_ON_ONCE(calc.e8 != 0xe8)) {
pr_warn("Expected e8, got %x\n", calc.e8);
if (WARN_ON_ONCE(calc.op != 0xe8)) {
pr_warn("Expected e8, got %x\n", calc.op);
return NULL;
}
@ -964,6 +951,11 @@ void arch_ftrace_trampoline_free(struct ftrace_ops *ops)
#ifdef CONFIG_DYNAMIC_FTRACE
extern void ftrace_graph_call(void);
static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
{
return ftrace_text_replace(0xe9, ip, addr);
}
static int ftrace_mod_jmp(unsigned long ip, void *func)
{
unsigned char *new;

View File

@ -187,8 +187,6 @@ void ring_buffer_set_clock(struct ring_buffer *buffer,
void ring_buffer_set_time_stamp_abs(struct ring_buffer *buffer, bool abs);
bool ring_buffer_time_stamp_abs(struct ring_buffer *buffer);
size_t ring_buffer_page_len(void *page);
size_t ring_buffer_nr_pages(struct ring_buffer *buffer, int cpu);
size_t ring_buffer_nr_dirty_pages(struct ring_buffer *buffer, int cpu);

View File

@ -53,7 +53,7 @@ static struct percpu_rw_semaphore dup_mmap_sem;
struct uprobe {
struct rb_node rb_node; /* node in the rb tree */
atomic_t ref;
refcount_t ref;
struct rw_semaphore register_rwsem;
struct rw_semaphore consumer_rwsem;
struct list_head pending_list;
@ -547,13 +547,13 @@ set_orig_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long v
static struct uprobe *get_uprobe(struct uprobe *uprobe)
{
atomic_inc(&uprobe->ref);
refcount_inc(&uprobe->ref);
return uprobe;
}
static void put_uprobe(struct uprobe *uprobe)
{
if (atomic_dec_and_test(&uprobe->ref)) {
if (refcount_dec_and_test(&uprobe->ref)) {
/*
* If application munmap(exec_vma) before uprobe_unregister()
* gets called, we don't get a chance to remove uprobe from
@ -644,7 +644,7 @@ static struct uprobe *__insert_uprobe(struct uprobe *uprobe)
rb_link_node(&uprobe->rb_node, parent, p);
rb_insert_color(&uprobe->rb_node, &uprobes_tree);
/* get access + creation ref */
atomic_set(&uprobe->ref, 2);
refcount_set(&uprobe->ref, 2);
return u;
}

View File

@ -3701,6 +3701,31 @@ enter_record(struct ftrace_hash *hash, struct dyn_ftrace *rec, int clear_filter)
return ret;
}
static int
add_rec_by_index(struct ftrace_hash *hash, struct ftrace_glob *func_g,
int clear_filter)
{
long index = simple_strtoul(func_g->search, NULL, 0);
struct ftrace_page *pg;
struct dyn_ftrace *rec;
/* The index starts at 1 */
if (--index < 0)
return 0;
do_for_each_ftrace_rec(pg, rec) {
if (pg->index <= index) {
index -= pg->index;
/* this is a double loop, break goes to the next page */
break;
}
rec = &pg->records[index];
enter_record(hash, rec, clear_filter);
return 1;
} while_for_each_ftrace_rec();
return 0;
}
static int
ftrace_match_record(struct dyn_ftrace *rec, struct ftrace_glob *func_g,
struct ftrace_glob *mod_g, int exclude_mod)
@ -3769,6 +3794,11 @@ match_records(struct ftrace_hash *hash, char *func, int len, char *mod)
if (unlikely(ftrace_disabled))
goto out_unlock;
if (func_g.type == MATCH_INDEX) {
found = add_rec_by_index(hash, &func_g, clear_filter);
goto out_unlock;
}
do_for_each_ftrace_rec(pg, rec) {
if (rec->flags & FTRACE_FL_DISABLED)

View File

@ -353,20 +353,6 @@ static void rb_init_page(struct buffer_data_page *bpage)
local_set(&bpage->commit, 0);
}
/**
* ring_buffer_page_len - the size of data on the page.
* @page: The page to read
*
* Returns the amount of data on the page, including buffer page header.
*/
size_t ring_buffer_page_len(void *page)
{
struct buffer_data_page *bpage = page;
return (local_read(&bpage->commit) & ~RB_MISSED_FLAGS)
+ BUF_PAGE_HDR_SIZE;
}
/*
* Also stolen from mm/slob.c. Thanks to Mathieu Desnoyers for pointing
* this issue out.

View File

@ -894,7 +894,7 @@ int __trace_bputs(unsigned long ip, const char *str)
EXPORT_SYMBOL_GPL(__trace_bputs);
#ifdef CONFIG_TRACER_SNAPSHOT
void tracing_snapshot_instance(struct trace_array *tr)
void tracing_snapshot_instance_cond(struct trace_array *tr, void *cond_data)
{
struct tracer *tracer = tr->current_trace;
unsigned long flags;
@ -920,10 +920,15 @@ void tracing_snapshot_instance(struct trace_array *tr)
}
local_irq_save(flags);
update_max_tr(tr, current, smp_processor_id());
update_max_tr(tr, current, smp_processor_id(), cond_data);
local_irq_restore(flags);
}
void tracing_snapshot_instance(struct trace_array *tr)
{
tracing_snapshot_instance_cond(tr, NULL);
}
/**
* tracing_snapshot - take a snapshot of the current buffer.
*
@ -946,6 +951,54 @@ void tracing_snapshot(void)
}
EXPORT_SYMBOL_GPL(tracing_snapshot);
/**
* tracing_snapshot_cond - conditionally take a snapshot of the current buffer.
* @tr: The tracing instance to snapshot
* @cond_data: The data to be tested conditionally, and possibly saved
*
* This is the same as tracing_snapshot() except that the snapshot is
* conditional - the snapshot will only happen if the
* cond_snapshot.update() implementation receiving the cond_data
* returns true, which means that the trace array's cond_snapshot
* update() operation used the cond_data to determine whether the
* snapshot should be taken, and if it was, presumably saved it along
* with the snapshot.
*/
void tracing_snapshot_cond(struct trace_array *tr, void *cond_data)
{
tracing_snapshot_instance_cond(tr, cond_data);
}
EXPORT_SYMBOL_GPL(tracing_snapshot_cond);
/**
* tracing_snapshot_cond_data - get the user data associated with a snapshot
* @tr: The tracing instance
*
* When the user enables a conditional snapshot using
* tracing_snapshot_cond_enable(), the user-defined cond_data is saved
* with the snapshot. This accessor is used to retrieve it.
*
* Should not be called from cond_snapshot.update(), since it takes
* the tr->max_lock lock, which the code calling
* cond_snapshot.update() has already done.
*
* Returns the cond_data associated with the trace array's snapshot.
*/
void *tracing_cond_snapshot_data(struct trace_array *tr)
{
void *cond_data = NULL;
arch_spin_lock(&tr->max_lock);
if (tr->cond_snapshot)
cond_data = tr->cond_snapshot->cond_data;
arch_spin_unlock(&tr->max_lock);
return cond_data;
}
EXPORT_SYMBOL_GPL(tracing_cond_snapshot_data);
static int resize_buffer_duplicate_size(struct trace_buffer *trace_buf,
struct trace_buffer *size_buf, int cpu_id);
static void set_buffer_entries(struct trace_buffer *buf, unsigned long val);
@ -1025,12 +1078,111 @@ void tracing_snapshot_alloc(void)
tracing_snapshot();
}
EXPORT_SYMBOL_GPL(tracing_snapshot_alloc);
/**
* tracing_snapshot_cond_enable - enable conditional snapshot for an instance
* @tr: The tracing instance
* @cond_data: User data to associate with the snapshot
* @update: Implementation of the cond_snapshot update function
*
* Check whether the conditional snapshot for the given instance has
* already been enabled, or if the current tracer is already using a
* snapshot; if so, return -EBUSY, else create a cond_snapshot and
* save the cond_data and update function inside.
*
* Returns 0 if successful, error otherwise.
*/
int tracing_snapshot_cond_enable(struct trace_array *tr, void *cond_data,
cond_update_fn_t update)
{
struct cond_snapshot *cond_snapshot;
int ret = 0;
cond_snapshot = kzalloc(sizeof(*cond_snapshot), GFP_KERNEL);
if (!cond_snapshot)
return -ENOMEM;
cond_snapshot->cond_data = cond_data;
cond_snapshot->update = update;
mutex_lock(&trace_types_lock);
ret = tracing_alloc_snapshot_instance(tr);
if (ret)
goto fail_unlock;
if (tr->current_trace->use_max_tr) {
ret = -EBUSY;
goto fail_unlock;
}
/*
* The cond_snapshot can only change to NULL without the
* trace_types_lock. We don't care if we race with it going
* to NULL, but we want to make sure that it's not set to
* something other than NULL when we get here, which we can
* do safely with only holding the trace_types_lock and not
* having to take the max_lock.
*/
if (tr->cond_snapshot) {
ret = -EBUSY;
goto fail_unlock;
}
arch_spin_lock(&tr->max_lock);
tr->cond_snapshot = cond_snapshot;
arch_spin_unlock(&tr->max_lock);
mutex_unlock(&trace_types_lock);
return ret;
fail_unlock:
mutex_unlock(&trace_types_lock);
kfree(cond_snapshot);
return ret;
}
EXPORT_SYMBOL_GPL(tracing_snapshot_cond_enable);
/**
* tracing_snapshot_cond_disable - disable conditional snapshot for an instance
* @tr: The tracing instance
*
* Check whether the conditional snapshot for the given instance is
* enabled; if so, free the cond_snapshot associated with it,
* otherwise return -EINVAL.
*
* Returns 0 if successful, error otherwise.
*/
int tracing_snapshot_cond_disable(struct trace_array *tr)
{
int ret = 0;
arch_spin_lock(&tr->max_lock);
if (!tr->cond_snapshot)
ret = -EINVAL;
else {
kfree(tr->cond_snapshot);
tr->cond_snapshot = NULL;
}
arch_spin_unlock(&tr->max_lock);
return ret;
}
EXPORT_SYMBOL_GPL(tracing_snapshot_cond_disable);
#else
void tracing_snapshot(void)
{
WARN_ONCE(1, "Snapshot feature not enabled, but internal snapshot used");
}
EXPORT_SYMBOL_GPL(tracing_snapshot);
void tracing_snapshot_cond(struct trace_array *tr, void *cond_data)
{
WARN_ONCE(1, "Snapshot feature not enabled, but internal conditional snapshot used");
}
EXPORT_SYMBOL_GPL(tracing_snapshot_cond);
int tracing_alloc_snapshot(void)
{
WARN_ONCE(1, "Snapshot feature not enabled, but snapshot allocation used");
@ -1043,6 +1195,21 @@ void tracing_snapshot_alloc(void)
tracing_snapshot();
}
EXPORT_SYMBOL_GPL(tracing_snapshot_alloc);
void *tracing_cond_snapshot_data(struct trace_array *tr)
{
return NULL;
}
EXPORT_SYMBOL_GPL(tracing_cond_snapshot_data);
int tracing_snapshot_cond_enable(struct trace_array *tr, void *cond_data, cond_update_fn_t update)
{
return -ENODEV;
}
EXPORT_SYMBOL_GPL(tracing_snapshot_cond_enable);
int tracing_snapshot_cond_disable(struct trace_array *tr)
{
return false;
}
EXPORT_SYMBOL_GPL(tracing_snapshot_cond_disable);
#endif /* CONFIG_TRACER_SNAPSHOT */
void tracer_tracing_off(struct trace_array *tr)
@ -1330,7 +1497,7 @@ __update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
max_data->critical_start = data->critical_start;
max_data->critical_end = data->critical_end;
memcpy(max_data->comm, tsk->comm, TASK_COMM_LEN);
strncpy(max_data->comm, tsk->comm, TASK_COMM_LEN);
max_data->pid = tsk->pid;
/*
* If tsk == current, then use current_uid(), as that does not use
@ -1354,12 +1521,14 @@ __update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
* @tr: tracer
* @tsk: the task with the latency
* @cpu: The cpu that initiated the trace.
* @cond_data: User data associated with a conditional snapshot
*
* Flip the buffers between the @tr and the max_tr and record information
* about which task was the cause of this latency.
*/
void
update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
void *cond_data)
{
if (tr->stop_count)
return;
@ -1380,9 +1549,15 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
else
ring_buffer_record_off(tr->max_buffer.buffer);
#ifdef CONFIG_TRACER_SNAPSHOT
if (tr->cond_snapshot && !tr->cond_snapshot->update(tr, cond_data))
goto out_unlock;
#endif
swap(tr->trace_buffer.buffer, tr->max_buffer.buffer);
__update_max_tr(tr, tsk, cpu);
out_unlock:
arch_spin_unlock(&tr->max_lock);
}
@ -1748,7 +1923,7 @@ static inline char *get_saved_cmdlines(int idx)
static inline void set_cmdline(int idx, const char *cmdline)
{
memcpy(get_saved_cmdlines(idx), cmdline, TASK_COMM_LEN);
strncpy(get_saved_cmdlines(idx), cmdline, TASK_COMM_LEN);
}
static int allocate_cmdlines_buffer(unsigned int val,
@ -4702,6 +4877,7 @@ static const char readme_msg[] =
"\t [:size=#entries]\n"
"\t [:pause][:continue][:clear]\n"
"\t [:name=histname1]\n"
"\t [:<handler>.<action>]\n"
"\t [if <filter>]\n\n"
"\t When a matching event is hit, an entry is added to a hash\n"
"\t table using the key(s) and value(s) named, and the value of a\n"
@ -4742,8 +4918,21 @@ static const char readme_msg[] =
"\t unchanged.\n\n"
"\t The enable_hist and disable_hist triggers can be used to\n"
"\t have one event conditionally start and stop another event's\n"
"\t already-attached hist trigger. The syntax is analagous to\n"
"\t the enable_event and disable_event triggers.\n"
"\t already-attached hist trigger. The syntax is analogous to\n"
"\t the enable_event and disable_event triggers.\n\n"
"\t Hist trigger handlers and actions are executed whenever a\n"
"\t a histogram entry is added or updated. They take the form:\n\n"
"\t <handler>.<action>\n\n"
"\t The available handlers are:\n\n"
"\t onmatch(matching.event) - invoke on addition or update\n"
"\t onmax(var) - invoke if var exceeds current max\n"
"\t onchange(var) - invoke action if var changes\n\n"
"\t The available actions are:\n\n"
"\t trace(<synthetic_event>,param list) - generate synthetic event\n"
"\t save(field,...) - save current event fields\n"
#ifdef CONFIG_TRACER_SNAPSHOT
"\t snapshot() - snapshot the trace buffer\n"
#endif
#endif
;
@ -5388,6 +5577,16 @@ static int tracing_set_tracer(struct trace_array *tr, const char *buf)
if (t == tr->current_trace)
goto out;
#ifdef CONFIG_TRACER_SNAPSHOT
if (t->use_max_tr) {
arch_spin_lock(&tr->max_lock);
if (tr->cond_snapshot)
ret = -EBUSY;
arch_spin_unlock(&tr->max_lock);
if (ret)
goto out;
}
#endif
/* Some tracers won't work on kernel command line */
if (system_state < SYSTEM_RUNNING && t->noboot) {
pr_warn("Tracer '%s' is not allowed on command line, ignored\n",
@ -5626,7 +5825,6 @@ static int tracing_open_pipe(struct inode *inode, struct file *filp)
return ret;
fail:
kfree(iter->trace);
kfree(iter);
__trace_array_put(tr);
mutex_unlock(&trace_types_lock);
@ -6470,6 +6668,13 @@ tracing_snapshot_write(struct file *filp, const char __user *ubuf, size_t cnt,
goto out;
}
arch_spin_lock(&tr->max_lock);
if (tr->cond_snapshot)
ret = -EBUSY;
arch_spin_unlock(&tr->max_lock);
if (ret)
goto out;
switch (val) {
case 0:
if (iter->cpu_file != RING_BUFFER_ALL_CPUS) {
@ -6495,7 +6700,7 @@ tracing_snapshot_write(struct file *filp, const char __user *ubuf, size_t cnt,
local_irq_disable();
/* Now, we're going to swap */
if (iter->cpu_file == RING_BUFFER_ALL_CPUS)
update_max_tr(tr, current, smp_processor_id());
update_max_tr(tr, current, smp_processor_id(), NULL);
else
update_max_tr_single(tr, current, iter->cpu_file);
local_irq_enable();

View File

@ -194,6 +194,51 @@ struct trace_pid_list {
unsigned long *pids;
};
typedef bool (*cond_update_fn_t)(struct trace_array *tr, void *cond_data);
/**
* struct cond_snapshot - conditional snapshot data and callback
*
* The cond_snapshot structure encapsulates a callback function and
* data associated with the snapshot for a given tracing instance.
*
* When a snapshot is taken conditionally, by invoking
* tracing_snapshot_cond(tr, cond_data), the cond_data passed in is
* passed in turn to the cond_snapshot.update() function. That data
* can be compared by the update() implementation with the cond_data
* contained wihin the struct cond_snapshot instance associated with
* the trace_array. Because the tr->max_lock is held throughout the
* update() call, the update() function can directly retrieve the
* cond_snapshot and cond_data associated with the per-instance
* snapshot associated with the trace_array.
*
* The cond_snapshot.update() implementation can save data to be
* associated with the snapshot if it decides to, and returns 'true'
* in that case, or it returns 'false' if the conditional snapshot
* shouldn't be taken.
*
* The cond_snapshot instance is created and associated with the
* user-defined cond_data by tracing_cond_snapshot_enable().
* Likewise, the cond_snapshot instance is destroyed and is no longer
* associated with the trace instance by
* tracing_cond_snapshot_disable().
*
* The method below is required.
*
* @update: When a conditional snapshot is invoked, the update()
* callback function is invoked with the tr->max_lock held. The
* update() implementation signals whether or not to actually
* take the snapshot, by returning 'true' if so, 'false' if no
* snapshot should be taken. Because the max_lock is held for
* the duration of update(), the implementation is safe to
* directly retrieven and save any implementation data it needs
* to in association with the snapshot.
*/
struct cond_snapshot {
void *cond_data;
cond_update_fn_t update;
};
/*
* The trace array - an array of per-CPU trace arrays. This is the
* highest level data structure that individual tracers deal with.
@ -277,6 +322,9 @@ struct trace_array {
#endif
int time_stamp_abs_ref;
struct list_head hist_vars;
#ifdef CONFIG_TRACER_SNAPSHOT
struct cond_snapshot *cond_snapshot;
#endif
};
enum {
@ -727,7 +775,8 @@ int trace_pid_write(struct trace_pid_list *filtered_pids,
const char __user *ubuf, size_t cnt);
#ifdef CONFIG_TRACER_MAX_TRACE
void update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu);
void update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
void *cond_data);
void update_max_tr_single(struct trace_array *tr,
struct task_struct *tsk, int cpu);
#endif /* CONFIG_TRACER_MAX_TRACE */
@ -855,10 +904,11 @@ static __always_inline bool ftrace_hash_empty(struct ftrace_hash *hash)
#define TRACE_GRAPH_PRINT_PROC 0x8
#define TRACE_GRAPH_PRINT_DURATION 0x10
#define TRACE_GRAPH_PRINT_ABS_TIME 0x20
#define TRACE_GRAPH_PRINT_IRQS 0x40
#define TRACE_GRAPH_PRINT_TAIL 0x80
#define TRACE_GRAPH_SLEEP_TIME 0x100
#define TRACE_GRAPH_GRAPH_TIME 0x200
#define TRACE_GRAPH_PRINT_REL_TIME 0x40
#define TRACE_GRAPH_PRINT_IRQS 0x80
#define TRACE_GRAPH_PRINT_TAIL 0x100
#define TRACE_GRAPH_SLEEP_TIME 0x200
#define TRACE_GRAPH_GRAPH_TIME 0x400
#define TRACE_GRAPH_PRINT_FILL_SHIFT 28
#define TRACE_GRAPH_PRINT_FILL_MASK (0x3 << TRACE_GRAPH_PRINT_FILL_SHIFT)
@ -1458,6 +1508,7 @@ enum regex_type {
MATCH_MIDDLE_ONLY,
MATCH_END_ONLY,
MATCH_GLOB,
MATCH_INDEX,
};
struct regex {
@ -1808,6 +1859,11 @@ static inline bool event_command_needs_rec(struct event_command *cmd_ops)
extern int trace_event_enable_disable(struct trace_event_file *file,
int enable, int soft_disable);
extern int tracing_alloc_snapshot(void);
extern void tracing_snapshot_cond(struct trace_array *tr, void *cond_data);
extern int tracing_snapshot_cond_enable(struct trace_array *tr, void *cond_data, cond_update_fn_t update);
extern int tracing_snapshot_cond_disable(struct trace_array *tr);
extern void *tracing_cond_snapshot_data(struct trace_array *tr);
extern const char *__start___trace_bprintk_fmt[];
extern const char *__stop___trace_bprintk_fmt[];

View File

@ -65,7 +65,8 @@ FTRACE_ENTRY_REG(function, ftrace_entry,
__field( unsigned long, parent_ip )
),
F_printk(" %lx <-- %lx", __entry->ip, __entry->parent_ip),
F_printk(" %ps <-- %ps",
(void *)__entry->ip, (void *)__entry->parent_ip),
FILTER_TRACE_FN,
@ -83,7 +84,7 @@ FTRACE_ENTRY_PACKED(funcgraph_entry, ftrace_graph_ent_entry,
__field_desc( int, graph_ent, depth )
),
F_printk("--> %lx (%d)", __entry->func, __entry->depth),
F_printk("--> %ps (%d)", (void *)__entry->func, __entry->depth),
FILTER_OTHER
);
@ -102,8 +103,8 @@ FTRACE_ENTRY_PACKED(funcgraph_exit, ftrace_graph_ret_entry,
__field_desc( int, ret, depth )
),
F_printk("<-- %lx (%d) (start: %llx end: %llx) over: %d",
__entry->func, __entry->depth,
F_printk("<-- %ps (%d) (start: %llx end: %llx) over: %d",
(void *)__entry->func, __entry->depth,
__entry->calltime, __entry->rettime,
__entry->depth),
@ -167,12 +168,6 @@ FTRACE_ENTRY_DUP(wakeup, ctx_switch_entry,
#define FTRACE_STACK_ENTRIES 8
#ifndef CONFIG_64BIT
# define IP_FMT "%08lx"
#else
# define IP_FMT "%016lx"
#endif
FTRACE_ENTRY(kernel_stack, stack_entry,
TRACE_STACK,
@ -182,12 +177,13 @@ FTRACE_ENTRY(kernel_stack, stack_entry,
__dynamic_array(unsigned long, caller )
),
F_printk("\t=> (" IP_FMT ")\n\t=> (" IP_FMT ")\n\t=> (" IP_FMT ")\n"
"\t=> (" IP_FMT ")\n\t=> (" IP_FMT ")\n\t=> (" IP_FMT ")\n"
"\t=> (" IP_FMT ")\n\t=> (" IP_FMT ")\n",
__entry->caller[0], __entry->caller[1], __entry->caller[2],
__entry->caller[3], __entry->caller[4], __entry->caller[5],
__entry->caller[6], __entry->caller[7]),
F_printk("\t=> %ps\n\t=> %ps\n\t=> %ps\n"
"\t=> %ps\n\t=> %ps\n\t=> %ps\n"
"\t=> %ps\n\t=> %ps\n",
(void *)__entry->caller[0], (void *)__entry->caller[1],
(void *)__entry->caller[2], (void *)__entry->caller[3],
(void *)__entry->caller[4], (void *)__entry->caller[5],
(void *)__entry->caller[6], (void *)__entry->caller[7]),
FILTER_OTHER
);
@ -201,12 +197,13 @@ FTRACE_ENTRY(user_stack, userstack_entry,
__array( unsigned long, caller, FTRACE_STACK_ENTRIES )
),
F_printk("\t=> (" IP_FMT ")\n\t=> (" IP_FMT ")\n\t=> (" IP_FMT ")\n"
"\t=> (" IP_FMT ")\n\t=> (" IP_FMT ")\n\t=> (" IP_FMT ")\n"
"\t=> (" IP_FMT ")\n\t=> (" IP_FMT ")\n",
__entry->caller[0], __entry->caller[1], __entry->caller[2],
__entry->caller[3], __entry->caller[4], __entry->caller[5],
__entry->caller[6], __entry->caller[7]),
F_printk("\t=> %ps\n\t=> %ps\n\t=> %ps\n"
"\t=> %ps\n\t=> %ps\n\t=> %ps\n"
"\t=> %ps\n\t=> %ps\n",
(void *)__entry->caller[0], (void *)__entry->caller[1],
(void *)__entry->caller[2], (void *)__entry->caller[3],
(void *)__entry->caller[4], (void *)__entry->caller[5],
(void *)__entry->caller[6], (void *)__entry->caller[7]),
FILTER_OTHER
);

View File

@ -299,15 +299,13 @@ int perf_uprobe_init(struct perf_event *p_event,
if (!p_event->attr.uprobe_path)
return -EINVAL;
path = kzalloc(PATH_MAX, GFP_KERNEL);
if (!path)
return -ENOMEM;
ret = strncpy_from_user(
path, u64_to_user_ptr(p_event->attr.uprobe_path), PATH_MAX);
if (ret == PATH_MAX)
return -E2BIG;
if (ret < 0)
goto out;
path = strndup_user(u64_to_user_ptr(p_event->attr.uprobe_path),
PATH_MAX);
if (IS_ERR(path)) {
ret = PTR_ERR(path);
return (ret == -EINVAL) ? -E2BIG : ret;
}
if (path[0] == '\0') {
ret = -EINVAL;
goto out;

View File

@ -491,10 +491,12 @@ predicate_parse(const char *str, int nr_parens, int nr_preds,
break;
case '&':
case '|':
/* accepting only "&&" or "||" */
if (next[1] == next[0]) {
ptr++;
break;
}
/* fall through */
default:
parse_error(pe, FILT_ERR_TOO_MANY_PREDS,
next - str);
@ -823,6 +825,9 @@ enum regex_type filter_parse_regex(char *buff, int len, char **search, int *not)
*search = buff;
if (isdigit(buff[0]))
return MATCH_INDEX;
for (i = 0; i < len; i++) {
if (buff[i] == '*') {
if (!i) {
@ -860,6 +865,8 @@ static void filter_build_regex(struct filter_pred *pred)
}
switch (type) {
/* MATCH_INDEX should not happen, but if it does, match full */
case MATCH_INDEX:
case MATCH_FULL:
r->match = regex_match_full;
break;

File diff suppressed because it is too large Load Diff

View File

@ -380,6 +380,7 @@ static void print_graph_lat_fmt(struct trace_seq *s, struct trace_entry *entry)
{
trace_seq_putc(s, ' ');
trace_print_lat_fmt(s, entry);
trace_seq_puts(s, " | ");
}
/* If the pid changed since the last trace, output this event */
@ -500,6 +501,17 @@ static void print_graph_abs_time(u64 t, struct trace_seq *s)
(unsigned long)t, usecs_rem);
}
static void
print_graph_rel_time(struct trace_iterator *iter, struct trace_seq *s)
{
unsigned long long usecs;
usecs = iter->ts - iter->trace_buffer->time_start;
do_div(usecs, NSEC_PER_USEC);
trace_seq_printf(s, "%9llu us | ", usecs);
}
static void
print_graph_irq(struct trace_iterator *iter, unsigned long addr,
enum trace_type type, int cpu, pid_t pid, u32 flags)
@ -517,6 +529,10 @@ print_graph_irq(struct trace_iterator *iter, unsigned long addr,
if (flags & TRACE_GRAPH_PRINT_ABS_TIME)
print_graph_abs_time(iter->ts, s);
/* Relative time */
if (flags & TRACE_GRAPH_PRINT_REL_TIME)
print_graph_rel_time(iter, s);
/* Cpu */
if (flags & TRACE_GRAPH_PRINT_CPU)
print_graph_cpu(s, cpu);
@ -725,6 +741,10 @@ print_graph_prologue(struct trace_iterator *iter, struct trace_seq *s,
if (flags & TRACE_GRAPH_PRINT_ABS_TIME)
print_graph_abs_time(iter->ts, s);
/* Relative time */
if (flags & TRACE_GRAPH_PRINT_REL_TIME)
print_graph_rel_time(iter, s);
/* Cpu */
if (flags & TRACE_GRAPH_PRINT_CPU)
print_graph_cpu(s, cpu);
@ -1101,6 +1121,8 @@ static void print_lat_header(struct seq_file *s, u32 flags)
if (flags & TRACE_GRAPH_PRINT_ABS_TIME)
size += 16;
if (flags & TRACE_GRAPH_PRINT_REL_TIME)
size += 16;
if (flags & TRACE_GRAPH_PRINT_CPU)
size += 4;
if (flags & TRACE_GRAPH_PRINT_PROC)
@ -1125,12 +1147,14 @@ static void __print_graph_headers_flags(struct trace_array *tr,
seq_putc(s, '#');
if (flags & TRACE_GRAPH_PRINT_ABS_TIME)
seq_puts(s, " TIME ");
if (flags & TRACE_GRAPH_PRINT_REL_TIME)
seq_puts(s, " REL TIME ");
if (flags & TRACE_GRAPH_PRINT_CPU)
seq_puts(s, " CPU");
if (flags & TRACE_GRAPH_PRINT_PROC)
seq_puts(s, " TASK/PID ");
if (lat)
seq_puts(s, "||||");
seq_puts(s, "|||| ");
if (flags & TRACE_GRAPH_PRINT_DURATION)
seq_puts(s, " DURATION ");
seq_puts(s, " FUNCTION CALLS\n");
@ -1139,12 +1163,14 @@ static void __print_graph_headers_flags(struct trace_array *tr,
seq_putc(s, '#');
if (flags & TRACE_GRAPH_PRINT_ABS_TIME)
seq_puts(s, " | ");
if (flags & TRACE_GRAPH_PRINT_REL_TIME)
seq_puts(s, " | ");
if (flags & TRACE_GRAPH_PRINT_CPU)
seq_puts(s, " | ");
if (flags & TRACE_GRAPH_PRINT_PROC)
seq_puts(s, " | | ");
if (lat)
seq_puts(s, "||||");
seq_puts(s, "|||| ");
if (flags & TRACE_GRAPH_PRINT_DURATION)
seq_puts(s, " | | ");
seq_puts(s, " | | | |\n");

View File

@ -239,7 +239,7 @@ static void irqsoff_trace_close(struct trace_iterator *iter)
#define GRAPH_TRACER_FLAGS (TRACE_GRAPH_PRINT_CPU | \
TRACE_GRAPH_PRINT_PROC | \
TRACE_GRAPH_PRINT_ABS_TIME | \
TRACE_GRAPH_PRINT_REL_TIME | \
TRACE_GRAPH_PRINT_DURATION)
static enum print_line_t irqsoff_print_line(struct trace_iterator *iter)

View File

@ -300,6 +300,7 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
case '+': /* deref memory */
arg++; /* Skip '+', because kstrtol() rejects it. */
/* fall through */
case '-':
tmp = strchr(arg, '(');
if (!tmp)

View File

@ -180,8 +180,11 @@ static void wakeup_trace_close(struct trace_iterator *iter)
}
#define GRAPH_TRACER_FLAGS (TRACE_GRAPH_PRINT_PROC | \
TRACE_GRAPH_PRINT_ABS_TIME | \
TRACE_GRAPH_PRINT_DURATION)
TRACE_GRAPH_PRINT_CPU | \
TRACE_GRAPH_PRINT_REL_TIME | \
TRACE_GRAPH_PRINT_DURATION | \
TRACE_GRAPH_PRINT_OVERHEAD | \
TRACE_GRAPH_PRINT_IRQS)
static enum print_line_t wakeup_print_line(struct trace_iterator *iter)
{
@ -472,6 +475,7 @@ probe_wakeup_sched_switch(void *ignore, bool preempt,
__trace_function(wakeup_trace, CALLER_ADDR0, CALLER_ADDR1, flags, pc);
tracing_sched_switch_trace(wakeup_trace, prev, next, flags, pc);
__trace_stack(wakeup_trace, flags, 0, pc);
T0 = data->preempt_timestamp;
T1 = ftrace_now(cpu);
@ -482,7 +486,7 @@ probe_wakeup_sched_switch(void *ignore, bool preempt,
if (likely(!is_tracing_stopped())) {
wakeup_trace->max_latency = delta;
update_max_tr(wakeup_trace, wakeup_task, wakeup_cpu);
update_max_tr(wakeup_trace, wakeup_task, wakeup_cpu, NULL);
}
out_unlock:
@ -583,6 +587,7 @@ probe_wakeup(void *ignore, struct task_struct *p)
data = per_cpu_ptr(wakeup_trace->trace_buffer.data, wakeup_cpu);
data->preempt_timestamp = ftrace_now(cpu);
tracing_sched_wakeup_trace(wakeup_trace, p, current, flags, pc);
__trace_stack(wakeup_trace, flags, 0, pc);
/*
* We must be careful in using CALLER_ADDR2. But since wake_up

View File

@ -0,0 +1,30 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event trigger - test inter-event histogram trigger expected fail actions
fail() { #msg
echo $1
exit_fail
}
if [ ! -f set_event ]; then
echo "event tracing is not supported"
exit_unsupported
fi
if [ ! -f snapshot ]; then
echo "snapshot is not supported"
exit_unsupported
fi
grep -q "snapshot()" README || exit_unsupported # version issue
echo "Test expected snapshot action failure"
echo 'hist:keys=comm:onmatch(sched.sched_wakeup).snapshot()' >> /sys/kernel/debug/tracing/events/sched/sched_waking/trigger && exit_fail
echo "Test expected save action failure"
echo 'hist:keys=comm:onmatch(sched.sched_wakeup).save(comm,prio)' >> /sys/kernel/debug/tracing/events/sched/sched_waking/trigger && exit_fail
exit_xfail

View File

@ -1,4 +1,5 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event trigger - test extended error support

View File

@ -1,4 +1,5 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event trigger - test field variable support
fail() { #msg

View File

@ -1,4 +1,5 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event trigger - test inter-event combined histogram trigger
fail() { #msg

View File

@ -1,4 +1,5 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event trigger - test multiple actions on hist trigger
fail() { #msg

View File

@ -0,0 +1,28 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event trigger - test inter-event histogram trigger onchange action
fail() { #msg
echo $1
exit_fail
}
if [ ! -f set_event ]; then
echo "event tracing is not supported"
exit_unsupported
fi
grep -q "onchange(var)" README || exit_unsupported # version issue
echo "Test onchange action"
echo 'hist:keys=comm:newprio=prio:onchange($newprio).save(comm,prio) if comm=="ping"' >> /sys/kernel/debug/tracing/events/sched/sched_waking/trigger
ping $LOCALHOST -c 3
nice -n 1 ping $LOCALHOST -c 3
if ! grep -q "changed:" events/sched/sched_waking/hist; then
fail "Failed to create onchange action inter-event histogram"
fi
exit 0

View File

@ -1,4 +1,5 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event trigger - test inter-event histogram trigger onmatch action
fail() { #msg

View File

@ -1,4 +1,5 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event trigger - test inter-event histogram trigger onmatch-onmax action
fail() { #msg

View File

@ -1,4 +1,5 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event trigger - test inter-event histogram trigger onmax action
fail() { #msg

View File

@ -0,0 +1,43 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event trigger - test inter-event histogram trigger snapshot action
fail() { #msg
echo $1
exit_fail
}
if [ ! -f set_event ]; then
echo "event tracing is not supported"
exit_unsupported
fi
if [ ! -f snapshot ]; then
echo "snapshot is not supported"
exit_unsupported
fi
grep -q "onchange(var)" README || exit_unsupported # version issue
grep -q "snapshot()" README || exit_unsupported # version issue
echo "Test snapshot action"
echo 1 > /sys/kernel/debug/tracing/events/sched/enable
echo 'hist:keys=comm:newprio=prio:onchange($newprio).save(comm,prio):onchange($newprio).snapshot() if comm=="ping"' >> /sys/kernel/debug/tracing/events/sched/sched_waking/trigger
ping $LOCALHOST -c 3
nice -n 1 ping $LOCALHOST -c 3
echo 0 > tracing_on
if ! grep -q "changed:" events/sched/sched_waking/hist; then
fail "Failed to create onchange action inter-event histogram"
fi
if ! grep -q "comm=ping" snapshot; then
fail "Failed to create snapshot action inter-event histogram"
fi
exit 0

View File

@ -1,4 +1,5 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event trigger - test synthetic event create remove
fail() { #msg

View File

@ -0,0 +1,42 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event trigger - test inter-event histogram trigger trace action
fail() { #msg
echo $1
exit_fail
}
if [ ! -f set_event ]; then
echo "event tracing is not supported"
exit_unsupported
fi
if [ ! -f synthetic_events ]; then
echo "synthetic event is not supported"
exit_unsupported
fi
grep -q "trace(<synthetic_event>" README || exit_unsupported # version issue
echo "Test create synthetic event"
echo 'wakeup_latency u64 lat pid_t pid char comm[16]' > synthetic_events
if [ ! -d events/synthetic/wakeup_latency ]; then
fail "Failed to create wakeup_latency synthetic event"
fi
echo "Test create histogram for synthetic event using trace action"
echo "Test histogram variables,simple expression support and trace action"
echo 'hist:keys=pid:ts0=common_timestamp.usecs if comm=="ping"' > events/sched/sched_wakeup/trigger
echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_wakeup).trace(wakeup_latency,$wakeup_lat,next_pid,next_comm) if next_comm=="ping"' > events/sched/sched_switch/trigger
echo 'hist:keys=comm,pid,lat:wakeup_lat=lat:sort=lat' > events/synthetic/wakeup_latency/trigger
ping $LOCALHOST -c 5
if ! grep -q "ping" events/synthetic/wakeup_latency/hist; then
fail "Failed to create trace action inter-event histogram"
fi
exit 0