| .. | .. | 
|---|
| 24 | 24 | performance issues that take place outside of user-space. | 
|---|
| 25 | 25 |  | 
|---|
| 26 | 26 | Although ftrace is typically considered the function tracer, it | 
|---|
| 27 |  | -is really a frame work of several assorted tracing utilities. | 
|---|
|  | 27 | +is really a framework of several assorted tracing utilities. | 
|---|
| 28 | 28 | There's latency tracing to examine what occurs between interrupts | 
|---|
| 29 | 29 | disabled and enabled, as well as for preemption and from a time | 
|---|
| 30 | 30 | a task is woken to the task is actually scheduled in. | 
|---|
| 31 | 31 |  | 
|---|
| 32 | 32 | One of the most common uses of ftrace is the event tracing. | 
|---|
| 33 |  | -Through out the kernel is hundreds of static event points that | 
|---|
|  | 33 | +Throughout the kernel is hundreds of static event points that | 
|---|
| 34 | 34 | can be enabled via the tracefs file system to see what is | 
|---|
| 35 | 35 | going on in certain parts of the kernel. | 
|---|
| 36 | 36 |  | 
|---|
| 37 |  | -See events.txt for more information. | 
|---|
|  | 37 | +See events.rst for more information. | 
|---|
| 38 | 38 |  | 
|---|
| 39 | 39 |  | 
|---|
| 40 | 40 | Implementation Details | 
|---|
| .. | .. | 
|---|
| 76 | 76 | All files located in the tracefs file system will be located in that | 
|---|
| 77 | 77 | debugfs file system directory as well. | 
|---|
| 78 | 78 |  | 
|---|
|  | 79 | +  In order to not automount tracefs in the debugfs filesystem, enable the | 
|---|
|  | 80 | +  defconfig option CONFIG_TRACEFS_DISABLE_AUTOMOUNT. | 
|---|
|  | 81 | + | 
|---|
| 79 | 82 | .. attention:: | 
|---|
| 80 | 83 |  | 
|---|
| 81 | 84 | Any selected ftrace option will also create the tracefs file system. | 
|---|
| .. | .. | 
|---|
| 95 | 98 | current_tracer: | 
|---|
| 96 | 99 |  | 
|---|
| 97 | 100 | This is used to set or display the current tracer | 
|---|
| 98 |  | -	that is configured. | 
|---|
|  | 101 | +	that is configured. Changing the current tracer clears | 
|---|
|  | 102 | +	the ring buffer content as well as the "snapshot" buffer. | 
|---|
| 99 | 103 |  | 
|---|
| 100 | 104 | available_tracers: | 
|---|
| 101 | 105 |  | 
|---|
| .. | .. | 
|---|
| 124 | 128 | trace: | 
|---|
| 125 | 129 |  | 
|---|
| 126 | 130 | This file holds the output of the trace in a human | 
|---|
| 127 |  | -	readable format (described below). Note, tracing is temporarily | 
|---|
| 128 |  | -	disabled while this file is being read (opened). | 
|---|
|  | 131 | +	readable format (described below). Opening this file for | 
|---|
|  | 132 | +	writing with the O_TRUNC flag clears the ring buffer content. | 
|---|
|  | 133 | +        Note, this file is not a consumer. If tracing is off | 
|---|
|  | 134 | +        (no tracer running, or tracing_on is zero), it will produce | 
|---|
|  | 135 | +        the same output each time it is read. When tracing is on, | 
|---|
|  | 136 | +        it may produce inconsistent results as it tries to read | 
|---|
|  | 137 | +        the entire buffer without consuming it. | 
|---|
| 129 | 138 |  | 
|---|
| 130 | 139 | trace_pipe: | 
|---|
| 131 | 140 |  | 
|---|
| .. | .. | 
|---|
| 139 | 148 | will not be read again with a sequential read. The | 
|---|
| 140 | 149 | "trace" file is static, and if the tracer is not | 
|---|
| 141 | 150 | adding more data, it will display the same | 
|---|
| 142 |  | -	information every time it is read. This file will not | 
|---|
| 143 |  | -	disable tracing while being read. | 
|---|
|  | 151 | +	information every time it is read. | 
|---|
| 144 | 152 |  | 
|---|
| 145 | 153 | trace_options: | 
|---|
| 146 | 154 |  | 
|---|
| .. | .. | 
|---|
| 183 | 191 | CPU buffer and not total size of all buffers. The | 
|---|
| 184 | 192 | trace buffers are allocated in pages (blocks of memory | 
|---|
| 185 | 193 | that the kernel uses for allocation, usually 4 KB in size). | 
|---|
| 186 |  | -	If the last page allocated has room for more bytes | 
|---|
|  | 194 | +	A few extra pages may be allocated to accommodate buffer management | 
|---|
|  | 195 | +	meta-data. If the last page allocated has room for more bytes | 
|---|
| 187 | 196 | than requested, the rest of the page will be used, | 
|---|
| 188 | 197 | making the actual allocation bigger than requested or shown. | 
|---|
| 189 | 198 | ( Note, the size may not be a multiple of the page size | 
|---|
| .. | .. | 
|---|
| 233 | 242 | This interface also allows for commands to be used. See the | 
|---|
| 234 | 243 | "Filter commands" section for more details. | 
|---|
| 235 | 244 |  | 
|---|
|  | 245 | +	As a speed up, since processing strings can be quite expensive | 
|---|
|  | 246 | +	and requires a check of all functions registered to tracing, instead | 
|---|
|  | 247 | +	an index can be written into this file. A number (starting with "1") | 
|---|
|  | 248 | +	written will instead select the same corresponding at the line position | 
|---|
|  | 249 | +	of the "available_filter_functions" file. | 
|---|
|  | 250 | + | 
|---|
| 236 | 251 | set_ftrace_notrace: | 
|---|
| 237 | 252 |  | 
|---|
| 238 | 253 | This has an effect opposite to that of | 
|---|
| .. | .. | 
|---|
| 251 | 266 | traced by the function tracer as well. This option will also | 
|---|
| 252 | 267 | cause PIDs of tasks that exit to be removed from the file. | 
|---|
| 253 | 268 |  | 
|---|
|  | 269 | +  set_ftrace_notrace_pid: | 
|---|
|  | 270 | + | 
|---|
|  | 271 | +        Have the function tracer ignore threads whose PID are listed in | 
|---|
|  | 272 | +        this file. | 
|---|
|  | 273 | + | 
|---|
|  | 274 | +        If the "function-fork" option is set, then when a task whose | 
|---|
|  | 275 | +	PID is listed in this file forks, the child's PID will | 
|---|
|  | 276 | +	automatically be added to this file, and the child will not be | 
|---|
|  | 277 | +	traced by the function tracer as well. This option will also | 
|---|
|  | 278 | +	cause PIDs of tasks that exit to be removed from the file. | 
|---|
|  | 279 | + | 
|---|
|  | 280 | +        If a PID is in both this file and "set_ftrace_pid", then this | 
|---|
|  | 281 | +        file takes precedence, and the thread will not be traced. | 
|---|
|  | 282 | + | 
|---|
| 254 | 283 | set_event_pid: | 
|---|
| 255 | 284 |  | 
|---|
| 256 | 285 | Have the events only trace a task with a PID listed in this file. | 
|---|
| 257 | 286 | Note, sched_switch and sched_wake_up will also trace events | 
|---|
| 258 | 287 | listed in this file. | 
|---|
|  | 288 | + | 
|---|
|  | 289 | +	To have the PIDs of children of tasks with their PID in this file | 
|---|
|  | 290 | +	added on fork, enable the "event-fork" option. That option will also | 
|---|
|  | 291 | +	cause the PIDs of tasks to be removed from this file when the task | 
|---|
|  | 292 | +	exits. | 
|---|
|  | 293 | + | 
|---|
|  | 294 | +  set_event_notrace_pid: | 
|---|
|  | 295 | + | 
|---|
|  | 296 | +	Have the events not trace a task with a PID listed in this file. | 
|---|
|  | 297 | +	Note, sched_switch and sched_wakeup will trace threads not listed | 
|---|
|  | 298 | +	in this file, even if a thread's PID is in the file if the | 
|---|
|  | 299 | +        sched_switch or sched_wakeup events also trace a thread that should | 
|---|
|  | 300 | +        be traced. | 
|---|
| 259 | 301 |  | 
|---|
| 260 | 302 | To have the PIDs of children of tasks with their PID in this file | 
|---|
| 261 | 303 | added on fork, enable the "event-fork" option. That option will also | 
|---|
| .. | .. | 
|---|
| 337 | 379 |  | 
|---|
| 338 | 380 | kprobe_events: | 
|---|
| 339 | 381 |  | 
|---|
| 340 |  | -	Enable dynamic trace points. See kprobetrace.txt. | 
|---|
|  | 382 | +	Enable dynamic trace points. See kprobetrace.rst. | 
|---|
| 341 | 383 |  | 
|---|
| 342 | 384 | kprobe_profile: | 
|---|
| 343 | 385 |  | 
|---|
| 344 |  | -	Dynamic trace points stats. See kprobetrace.txt. | 
|---|
|  | 386 | +	Dynamic trace points stats. See kprobetrace.rst. | 
|---|
| 345 | 387 |  | 
|---|
| 346 | 388 | max_graph_depth: | 
|---|
| 347 | 389 |  | 
|---|
| .. | .. | 
|---|
| 374 | 416 |  | 
|---|
| 375 | 417 | By default, 128 comms are saved (see "saved_cmdlines" above). To | 
|---|
| 376 | 418 | increase or decrease the amount of comms that are cached, echo | 
|---|
| 377 |  | -	in a the number of comms to cache, into this file. | 
|---|
|  | 419 | +	the number of comms to cache into this file. | 
|---|
| 378 | 420 |  | 
|---|
| 379 | 421 | saved_tgids: | 
|---|
| 380 | 422 |  | 
|---|
| .. | .. | 
|---|
| 462 | 504 |  | 
|---|
| 463 | 505 | mono_raw: | 
|---|
| 464 | 506 | This is the raw monotonic clock (CLOCK_MONOTONIC_RAW) | 
|---|
| 465 |  | -		which is montonic but is not subject to any rate adjustments | 
|---|
|  | 507 | +		which is monotonic but is not subject to any rate adjustments | 
|---|
| 466 | 508 | and ticks at the same rate as the hardware clocksource. | 
|---|
| 467 | 509 |  | 
|---|
| 468 | 510 | boot: | 
|---|
| .. | .. | 
|---|
| 481 | 523 | To set a clock, simply echo the clock name into this file:: | 
|---|
| 482 | 524 |  | 
|---|
| 483 | 525 | # echo global > trace_clock | 
|---|
|  | 526 | + | 
|---|
|  | 527 | +	Setting a clock clears the ring buffer content as well as the | 
|---|
|  | 528 | +	"snapshot" buffer. | 
|---|
| 484 | 529 |  | 
|---|
| 485 | 530 | trace_marker: | 
|---|
| 486 | 531 |  | 
|---|
| .. | .. | 
|---|
| 519 | 564 |  | 
|---|
| 520 | 565 | trace_marker_raw: | 
|---|
| 521 | 566 |  | 
|---|
| 522 |  | -	This is similar to trace_marker above, but is meant for for binary data | 
|---|
|  | 567 | +	This is similar to trace_marker above, but is meant for binary data | 
|---|
| 523 | 568 | to be written to it, where a tool can be used to parse the data | 
|---|
| 524 | 569 | from trace_pipe_raw. | 
|---|
| 525 | 570 |  | 
|---|
| 526 | 571 | uprobe_events: | 
|---|
| 527 | 572 |  | 
|---|
| 528 | 573 | Add dynamic tracepoints in programs. | 
|---|
| 529 |  | -	See uprobetracer.txt | 
|---|
|  | 574 | +	See uprobetracer.rst | 
|---|
| 530 | 575 |  | 
|---|
| 531 | 576 | uprobe_profile: | 
|---|
| 532 | 577 |  | 
|---|
| .. | .. | 
|---|
| 547 | 592 | files at various levels that can enable the tracepoints | 
|---|
| 548 | 593 | when a "1" is written to them. | 
|---|
| 549 | 594 |  | 
|---|
| 550 |  | -	See events.txt for more information. | 
|---|
|  | 595 | +	See events.rst for more information. | 
|---|
| 551 | 596 |  | 
|---|
| 552 | 597 | set_event: | 
|---|
| 553 | 598 |  | 
|---|
| 554 | 599 | By echoing in the event into this file, will enable that event. | 
|---|
| 555 | 600 |  | 
|---|
| 556 |  | -	See events.txt for more information. | 
|---|
|  | 601 | +	See events.rst for more information. | 
|---|
| 557 | 602 |  | 
|---|
| 558 | 603 | available_events: | 
|---|
| 559 | 604 |  | 
|---|
| 560 | 605 | A list of events that can be enabled in tracing. | 
|---|
| 561 | 606 |  | 
|---|
| 562 |  | -	See events.txt for more information. | 
|---|
|  | 607 | +	See events.rst for more information. | 
|---|
| 563 | 608 |  | 
|---|
| 564 | 609 | timestamp_mode: | 
|---|
| 565 | 610 |  | 
|---|
| .. | .. | 
|---|
| 759 | 804 | tracers from tracing simply echo "nop" into | 
|---|
| 760 | 805 | current_tracer. | 
|---|
| 761 | 806 |  | 
|---|
|  | 807 | +Error conditions | 
|---|
|  | 808 | +---------------- | 
|---|
|  | 809 | + | 
|---|
|  | 810 | +  For most ftrace commands, failure modes are obvious and communicated | 
|---|
|  | 811 | +  using standard return codes. | 
|---|
|  | 812 | + | 
|---|
|  | 813 | +  For other more involved commands, extended error information may be | 
|---|
|  | 814 | +  available via the tracing/error_log file.  For the commands that | 
|---|
|  | 815 | +  support it, reading the tracing/error_log file after an error will | 
|---|
|  | 816 | +  display more detailed information about what went wrong, if | 
|---|
|  | 817 | +  information is available.  The tracing/error_log file is a circular | 
|---|
|  | 818 | +  error log displaying a small number (currently, 8) of ftrace errors | 
|---|
|  | 819 | +  for the last (8) failed commands. | 
|---|
|  | 820 | + | 
|---|
|  | 821 | +  The extended error information and usage takes the form shown in | 
|---|
|  | 822 | +  this example:: | 
|---|
|  | 823 | + | 
|---|
|  | 824 | +    # echo xxx > /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger | 
|---|
|  | 825 | +    echo: write error: Invalid argument | 
|---|
|  | 826 | + | 
|---|
|  | 827 | +    # cat /sys/kernel/debug/tracing/error_log | 
|---|
|  | 828 | +    [ 5348.887237] location: error: Couldn't yyy: zzz | 
|---|
|  | 829 | +      Command: xxx | 
|---|
|  | 830 | +               ^ | 
|---|
|  | 831 | +    [ 7517.023364] location: error: Bad rrr: sss | 
|---|
|  | 832 | +      Command: ppp qqq | 
|---|
|  | 833 | +                   ^ | 
|---|
|  | 834 | + | 
|---|
|  | 835 | +  To clear the error log, echo the empty string into it:: | 
|---|
|  | 836 | + | 
|---|
|  | 837 | +    # echo > /sys/kernel/debug/tracing/error_log | 
|---|
| 762 | 838 |  | 
|---|
| 763 | 839 | Examples of using the tracer | 
|---|
| 764 | 840 | ---------------------------- | 
|---|
| .. | .. | 
|---|
| 914 | 990 | current trace and the next trace. | 
|---|
| 915 | 991 |  | 
|---|
| 916 | 992 | - '$' - greater than 1 second | 
|---|
| 917 |  | -	  - '@' - greater than 100 milisecond | 
|---|
| 918 |  | -	  - '*' - greater than 10 milisecond | 
|---|
|  | 993 | +	  - '@' - greater than 100 millisecond | 
|---|
|  | 994 | +	  - '*' - greater than 10 millisecond | 
|---|
| 919 | 995 | - '#' - greater than 1000 microsecond | 
|---|
| 920 | 996 | - '!' - greater than 100 microsecond | 
|---|
| 921 | 997 | - '+' - greater than 10 microsecond | 
|---|
| .. | .. | 
|---|
| 1080 | 1156 | the trace displays additional information about the | 
|---|
| 1081 | 1157 | latency, as described in "Latency trace format". | 
|---|
| 1082 | 1158 |  | 
|---|
|  | 1159 | +  pause-on-trace | 
|---|
|  | 1160 | +	When set, opening the trace file for read, will pause | 
|---|
|  | 1161 | +	writing to the ring buffer (as if tracing_on was set to zero). | 
|---|
|  | 1162 | +	This simulates the original behavior of the trace file. | 
|---|
|  | 1163 | +	When the file is closed, tracing will be enabled again. | 
|---|
|  | 1164 | + | 
|---|
| 1083 | 1165 | record-cmd | 
|---|
| 1084 | 1166 | When any event or tracer is enabled, a hook is enabled | 
|---|
| 1085 | 1167 | in the sched_switch trace point to fill comm cache | 
|---|
| .. | .. | 
|---|
| 1131 | 1213 | tasks fork. Also, when tasks with PIDs in set_event_pid exit, | 
|---|
| 1132 | 1214 | their PIDs will be removed from the file. | 
|---|
| 1133 | 1215 |  | 
|---|
|  | 1216 | +        This affects PIDs listed in set_event_notrace_pid as well. | 
|---|
|  | 1217 | + | 
|---|
| 1134 | 1218 | function-trace | 
|---|
| 1135 | 1219 | The latency tracers will enable function tracing | 
|---|
| 1136 | 1220 | if this option is enabled (default it is). When | 
|---|
| .. | .. | 
|---|
| 1144 | 1228 | when those tasks fork. Also, when tasks with PIDs in | 
|---|
| 1145 | 1229 | set_ftrace_pid exit, their PIDs will be removed from the | 
|---|
| 1146 | 1230 | file. | 
|---|
|  | 1231 | + | 
|---|
|  | 1232 | +        This affects PIDs in set_ftrace_notrace_pid as well. | 
|---|
| 1147 | 1233 |  | 
|---|
| 1148 | 1234 | display-graph | 
|---|
| 1149 | 1235 | When set, the latency tracers (irqsoff, wakeup, etc) will | 
|---|
| .. | .. | 
|---|
| 1311 | 1397 | => x86_64_start_reservations | 
|---|
| 1312 | 1398 | => x86_64_start_kernel | 
|---|
| 1313 | 1399 |  | 
|---|
| 1314 |  | -Here we see that that we had a latency of 16 microseconds (which is | 
|---|
|  | 1400 | +Here we see that we had a latency of 16 microseconds (which is | 
|---|
| 1315 | 1401 | very good). The _raw_spin_lock_irq in run_timer_softirq disabled | 
|---|
| 1316 | 1402 | interrupts. The difference between the 16 and the displayed | 
|---|
| 1317 | 1403 | timestamp 25us occurred because the clock was incremented | 
|---|
| .. | .. | 
|---|
| 1370 | 1456 | => __blk_run_queue_uncond | 
|---|
| 1371 | 1457 | => __blk_run_queue | 
|---|
| 1372 | 1458 | => blk_queue_bio | 
|---|
| 1373 |  | -   => generic_make_request | 
|---|
|  | 1459 | +   => submit_bio_noacct | 
|---|
| 1374 | 1460 | => submit_bio | 
|---|
| 1375 | 1461 | => submit_bh | 
|---|
| 1376 | 1462 | => __ext3_get_inode_loc | 
|---|
| .. | .. | 
|---|
| 1396 | 1482 | overhead may extend the latency times. But nevertheless, this | 
|---|
| 1397 | 1483 | trace has provided some very helpful debugging information. | 
|---|
| 1398 | 1484 |  | 
|---|
|  | 1485 | +If we prefer function graph output instead of function, we can set | 
|---|
|  | 1486 | +display-graph option:: | 
|---|
|  | 1487 | + | 
|---|
|  | 1488 | + with echo 1 > options/display-graph | 
|---|
|  | 1489 | + | 
|---|
|  | 1490 | +  # tracer: irqsoff | 
|---|
|  | 1491 | +  # | 
|---|
|  | 1492 | +  # irqsoff latency trace v1.1.5 on 4.20.0-rc6+ | 
|---|
|  | 1493 | +  # -------------------------------------------------------------------- | 
|---|
|  | 1494 | +  # latency: 3751 us, #274/274, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4) | 
|---|
|  | 1495 | +  #    ----------------- | 
|---|
|  | 1496 | +  #    | task: bash-1507 (uid:0 nice:0 policy:0 rt_prio:0) | 
|---|
|  | 1497 | +  #    ----------------- | 
|---|
|  | 1498 | +  #  => started at: free_debug_processing | 
|---|
|  | 1499 | +  #  => ended at:   return_to_handler | 
|---|
|  | 1500 | +  # | 
|---|
|  | 1501 | +  # | 
|---|
|  | 1502 | +  #                                       _-----=> irqs-off | 
|---|
|  | 1503 | +  #                                      / _----=> need-resched | 
|---|
|  | 1504 | +  #                                     | / _---=> hardirq/softirq | 
|---|
|  | 1505 | +  #                                     || / _--=> preempt-depth | 
|---|
|  | 1506 | +  #                                     ||| / | 
|---|
|  | 1507 | +  #   REL TIME      CPU  TASK/PID       ||||     DURATION                  FUNCTION CALLS | 
|---|
|  | 1508 | +  #      |          |     |    |        ||||      |   |                     |   |   |   | | 
|---|
|  | 1509 | +          0 us |   0)   bash-1507    |  d... |   0.000 us    |  _raw_spin_lock_irqsave(); | 
|---|
|  | 1510 | +          0 us |   0)   bash-1507    |  d..1 |   0.378 us    |    do_raw_spin_trylock(); | 
|---|
|  | 1511 | +          1 us |   0)   bash-1507    |  d..2 |               |    set_track() { | 
|---|
|  | 1512 | +          2 us |   0)   bash-1507    |  d..2 |               |      save_stack_trace() { | 
|---|
|  | 1513 | +          2 us |   0)   bash-1507    |  d..2 |               |        __save_stack_trace() { | 
|---|
|  | 1514 | +          3 us |   0)   bash-1507    |  d..2 |               |          __unwind_start() { | 
|---|
|  | 1515 | +          3 us |   0)   bash-1507    |  d..2 |               |            get_stack_info() { | 
|---|
|  | 1516 | +          3 us |   0)   bash-1507    |  d..2 |   0.351 us    |              in_task_stack(); | 
|---|
|  | 1517 | +          4 us |   0)   bash-1507    |  d..2 |   1.107 us    |            } | 
|---|
|  | 1518 | +  [...] | 
|---|
|  | 1519 | +       3750 us |   0)   bash-1507    |  d..1 |   0.516 us    |      do_raw_spin_unlock(); | 
|---|
|  | 1520 | +       3750 us |   0)   bash-1507    |  d..1 |   0.000 us    |  _raw_spin_unlock_irqrestore(); | 
|---|
|  | 1521 | +       3764 us |   0)   bash-1507    |  d..1 |   0.000 us    |  tracer_hardirqs_on(); | 
|---|
|  | 1522 | +      bash-1507    0d..1 3792us : <stack trace> | 
|---|
|  | 1523 | +   => free_debug_processing | 
|---|
|  | 1524 | +   => __slab_free | 
|---|
|  | 1525 | +   => kmem_cache_free | 
|---|
|  | 1526 | +   => vm_area_free | 
|---|
|  | 1527 | +   => remove_vma | 
|---|
|  | 1528 | +   => exit_mmap | 
|---|
|  | 1529 | +   => mmput | 
|---|
|  | 1530 | +   => begin_new_exec | 
|---|
|  | 1531 | +   => load_elf_binary | 
|---|
|  | 1532 | +   => search_binary_handler | 
|---|
|  | 1533 | +   => __do_execve_file.isra.32 | 
|---|
|  | 1534 | +   => __x64_sys_execve | 
|---|
|  | 1535 | +   => do_syscall_64 | 
|---|
|  | 1536 | +   => entry_SYSCALL_64_after_hwframe | 
|---|
| 1399 | 1537 |  | 
|---|
| 1400 | 1538 | preemptoff | 
|---|
| 1401 | 1539 | ---------- | 
|---|
| .. | .. | 
|---|
| 1603 | 1741 | => __blk_run_queue_uncond | 
|---|
| 1604 | 1742 | => __blk_run_queue | 
|---|
| 1605 | 1743 | => blk_queue_bio | 
|---|
| 1606 |  | -   => generic_make_request | 
|---|
|  | 1744 | +   => submit_bio_noacct | 
|---|
| 1607 | 1745 | => submit_bio | 
|---|
| 1608 | 1746 | => submit_bh | 
|---|
| 1609 | 1747 | => ext3_bread | 
|---|
| .. | .. | 
|---|
| 2029 | 2167 | # cat trace | 
|---|
| 2030 | 2168 | # tracer: hwlat | 
|---|
| 2031 | 2169 | # | 
|---|
|  | 2170 | +  # entries-in-buffer/entries-written: 13/13   #P:8 | 
|---|
|  | 2171 | +  # | 
|---|
| 2032 | 2172 | #                              _-----=> irqs-off | 
|---|
| 2033 | 2173 | #                             / _----=> need-resched | 
|---|
| 2034 | 2174 | #                            | / _---=> hardirq/softirq | 
|---|
| .. | .. | 
|---|
| 2036 | 2176 | #                            ||| /     delay | 
|---|
| 2037 | 2177 | #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION | 
|---|
| 2038 | 2178 | #              | |       |   ||||       |         | | 
|---|
| 2039 |  | -             <...>-3638  [001] d... 19452.055471: #1     inner/outer(us):   12/14    ts:1499801089.066141940 | 
|---|
| 2040 |  | -             <...>-3638  [003] d... 19454.071354: #2     inner/outer(us):   11/9     ts:1499801091.082164365 | 
|---|
| 2041 |  | -             <...>-3638  [002] dn.. 19461.126852: #3     inner/outer(us):   12/9     ts:1499801098.138150062 | 
|---|
| 2042 |  | -             <...>-3638  [001] d... 19488.340960: #4     inner/outer(us):    8/12    ts:1499801125.354139633 | 
|---|
| 2043 |  | -             <...>-3638  [003] d... 19494.388553: #5     inner/outer(us):    8/12    ts:1499801131.402150961 | 
|---|
| 2044 |  | -             <...>-3638  [003] d... 19501.283419: #6     inner/outer(us):    0/12    ts:1499801138.297435289 nmi-total:4 nmi-count:1 | 
|---|
|  | 2179 | +             <...>-1729  [001] d...   678.473449: #1     inner/outer(us):   11/12    ts:1581527483.343962693 count:6 | 
|---|
|  | 2180 | +             <...>-1729  [004] d...   689.556542: #2     inner/outer(us):   16/9     ts:1581527494.889008092 count:1 | 
|---|
|  | 2181 | +             <...>-1729  [005] d...   714.756290: #3     inner/outer(us):   16/16    ts:1581527519.678961629 count:5 | 
|---|
|  | 2182 | +             <...>-1729  [001] d...   718.788247: #4     inner/outer(us):    9/17    ts:1581527523.889012713 count:1 | 
|---|
|  | 2183 | +             <...>-1729  [002] d...   719.796341: #5     inner/outer(us):   13/9     ts:1581527524.912872606 count:1 | 
|---|
|  | 2184 | +             <...>-1729  [006] d...   844.787091: #6     inner/outer(us):    9/12    ts:1581527649.889048502 count:2 | 
|---|
|  | 2185 | +             <...>-1729  [003] d...   849.827033: #7     inner/outer(us):   18/9     ts:1581527654.889013793 count:1 | 
|---|
|  | 2186 | +             <...>-1729  [007] d...   853.859002: #8     inner/outer(us):    9/12    ts:1581527658.889065736 count:1 | 
|---|
|  | 2187 | +             <...>-1729  [001] d...   855.874978: #9     inner/outer(us):    9/11    ts:1581527660.861991877 count:1 | 
|---|
|  | 2188 | +             <...>-1729  [001] d...   863.938932: #10    inner/outer(us):    9/11    ts:1581527668.970010500 count:1 nmi-total:7 nmi-count:1 | 
|---|
|  | 2189 | +             <...>-1729  [007] d...   878.050780: #11    inner/outer(us):    9/12    ts:1581527683.385002600 count:1 nmi-total:5 nmi-count:1 | 
|---|
|  | 2190 | +             <...>-1729  [007] d...   886.114702: #12    inner/outer(us):    9/12    ts:1581527691.385001600 count:1 | 
|---|
| 2045 | 2191 |  | 
|---|
| 2046 | 2192 |  | 
|---|
| 2047 | 2193 | The above output is somewhat the same in the header. All events will have | 
|---|
| .. | .. | 
|---|
| 2051 | 2197 | This is the count of events recorded that were greater than the | 
|---|
| 2052 | 2198 | tracing_threshold (See below). | 
|---|
| 2053 | 2199 |  | 
|---|
| 2054 |  | - inner/outer(us):   12/14 | 
|---|
|  | 2200 | + inner/outer(us):   11/11 | 
|---|
| 2055 | 2201 |  | 
|---|
| 2056 | 2202 | This shows two numbers as "inner latency" and "outer latency". The test | 
|---|
| 2057 | 2203 | runs in a loop checking a timestamp twice. The latency detected within | 
|---|
| .. | .. | 
|---|
| 2059 | 2205 | after the previous timestamp and the next timestamp in the loop is | 
|---|
| 2060 | 2206 | the "outer latency". | 
|---|
| 2061 | 2207 |  | 
|---|
| 2062 |  | - ts:1499801089.066141940 | 
|---|
|  | 2208 | + ts:1581527483.343962693 | 
|---|
| 2063 | 2209 |  | 
|---|
| 2064 |  | -      The absolute timestamp that the event happened. | 
|---|
|  | 2210 | +      The absolute timestamp that the first latency was recorded in the window. | 
|---|
| 2065 | 2211 |  | 
|---|
| 2066 |  | - nmi-total:4 nmi-count:1 | 
|---|
|  | 2212 | + count:6 | 
|---|
|  | 2213 | + | 
|---|
|  | 2214 | +      The number of times a latency was detected during the window. | 
|---|
|  | 2215 | + | 
|---|
|  | 2216 | + nmi-total:7 nmi-count:1 | 
|---|
| 2067 | 2217 |  | 
|---|
| 2068 | 2218 | On architectures that support it, if an NMI comes in during the | 
|---|
| 2069 | 2219 | test, the time spent in NMI is reported in "nmi-total" (in | 
|---|
| .. | .. | 
|---|
| 2541 | 2691 | recordmcount program (located in the scripts directory). This | 
|---|
| 2542 | 2692 | program will parse the ELF headers in the C object to find all | 
|---|
| 2543 | 2693 | the locations in the .text section that call mcount. Starting | 
|---|
| 2544 |  | -with gcc verson 4.6, the -mfentry has been added for x86, which | 
|---|
|  | 2694 | +with gcc version 4.6, the -mfentry has been added for x86, which | 
|---|
| 2545 | 2695 | calls "__fentry__" instead of "mcount". Which is called before | 
|---|
| 2546 | 2696 | the creation of the stack frame. | 
|---|
| 2547 | 2697 |  | 
|---|
| .. | .. | 
|---|
| 2776 | 2926 | bash-1994  [000] ....  4342.324898: ima_get_action <-process_measurement | 
|---|
| 2777 | 2927 | bash-1994  [000] ....  4342.324898: ima_match_policy <-ima_get_action | 
|---|
| 2778 | 2928 | bash-1994  [000] ....  4342.324899: do_truncate <-do_last | 
|---|
| 2779 |  | -              bash-1994  [000] ....  4342.324899: should_remove_suid <-do_truncate | 
|---|
|  | 2929 | +              bash-1994  [000] ....  4342.324899: setattr_should_drop_suidgid <-do_truncate | 
|---|
| 2780 | 2930 | bash-1994  [000] ....  4342.324899: notify_change <-do_truncate | 
|---|
| 2781 | 2931 | bash-1994  [000] ....  4342.324900: current_fs_time <-notify_change | 
|---|
| 2782 | 2932 | bash-1994  [000] ....  4342.324900: current_kernel_time <-current_fs_time | 
|---|
| .. | .. | 
|---|
| 2784 | 2934 |  | 
|---|
| 2785 | 2935 | We can see that there's no more lock or preempt tracing. | 
|---|
| 2786 | 2936 |  | 
|---|
|  | 2937 | +Selecting function filters via index | 
|---|
|  | 2938 | +------------------------------------ | 
|---|
|  | 2939 | + | 
|---|
|  | 2940 | +Because processing of strings is expensive (the address of the function | 
|---|
|  | 2941 | +needs to be looked up before comparing to the string being passed in), | 
|---|
|  | 2942 | +an index can be used as well to enable functions. This is useful in the | 
|---|
|  | 2943 | +case of setting thousands of specific functions at a time. By passing | 
|---|
|  | 2944 | +in a list of numbers, no string processing will occur. Instead, the function | 
|---|
|  | 2945 | +at the specific location in the internal array (which corresponds to the | 
|---|
|  | 2946 | +functions in the "available_filter_functions" file), is selected. | 
|---|
|  | 2947 | + | 
|---|
|  | 2948 | +:: | 
|---|
|  | 2949 | + | 
|---|
|  | 2950 | +  # echo 1 > set_ftrace_filter | 
|---|
|  | 2951 | + | 
|---|
|  | 2952 | +Will select the first function listed in "available_filter_functions" | 
|---|
|  | 2953 | + | 
|---|
|  | 2954 | +:: | 
|---|
|  | 2955 | + | 
|---|
|  | 2956 | +  # head -1 available_filter_functions | 
|---|
|  | 2957 | +  trace_initcall_finish_cb | 
|---|
|  | 2958 | + | 
|---|
|  | 2959 | +  # cat set_ftrace_filter | 
|---|
|  | 2960 | +  trace_initcall_finish_cb | 
|---|
|  | 2961 | + | 
|---|
|  | 2962 | +  # head -50 available_filter_functions | tail -1 | 
|---|
|  | 2963 | +  x86_pmu_commit_txn | 
|---|
|  | 2964 | + | 
|---|
|  | 2965 | +  # echo 1 50 > set_ftrace_filter | 
|---|
|  | 2966 | +  # cat set_ftrace_filter | 
|---|
|  | 2967 | +  trace_initcall_finish_cb | 
|---|
|  | 2968 | +  x86_pmu_commit_txn | 
|---|
| 2787 | 2969 |  | 
|---|
| 2788 | 2970 | Dynamic ftrace with the function graph tracer | 
|---|
| 2789 | 2971 | --------------------------------------------- | 
|---|
| .. | .. | 
|---|
| 2853 | 3035 | function tracer. By default it is enabled (when function tracing is | 
|---|
| 2854 | 3036 | enabled in the kernel). If it is disabled, all function tracing is | 
|---|
| 2855 | 3037 | disabled. This includes not only the function tracers for ftrace, but | 
|---|
| 2856 |  | -also for any other uses (perf, kprobes, stack tracing, profiling, etc). | 
|---|
|  | 3038 | +also for any other uses (perf, kprobes, stack tracing, profiling, etc). It | 
|---|
|  | 3039 | +cannot be disabled if there is a callback with FTRACE_OPS_FL_PERMANENT set | 
|---|
|  | 3040 | +registered. | 
|---|
| 2857 | 3041 |  | 
|---|
| 2858 | 3042 | Please disable this with care. | 
|---|
| 2859 | 3043 |  | 
|---|
| .. | .. | 
|---|
| 2978 | 3162 | When the function is hit, it will dump the contents of the ftrace | 
|---|
| 2979 | 3163 | ring buffer to the console. This is useful if you need to debug | 
|---|
| 2980 | 3164 | something, and want to dump the trace when a certain function | 
|---|
| 2981 |  | -  is hit. Perhaps its a function that is called before a tripple | 
|---|
|  | 3165 | +  is hit. Perhaps it's a function that is called before a triple | 
|---|
| 2982 | 3166 | fault happens and does not allow you to get a regular dump. | 
|---|
| 2983 | 3167 |  | 
|---|
| 2984 | 3168 | - cpudump: | 
|---|
| .. | .. | 
|---|
| 2986 | 3170 | ring buffer for the current CPU to the console. Unlike the "dump" | 
|---|
| 2987 | 3171 | command, it only prints out the contents of the ring buffer for the | 
|---|
| 2988 | 3172 | CPU that executed the function that triggered the dump. | 
|---|
|  | 3173 | + | 
|---|
|  | 3174 | +- stacktrace: | 
|---|
|  | 3175 | +  When the function is hit, a stack trace is recorded. | 
|---|
| 2989 | 3176 |  | 
|---|
| 2990 | 3177 | trace_pipe | 
|---|
| 2991 | 3178 | ---------- | 
|---|
| .. | .. | 
|---|
| 3029 | 3216 |  | 
|---|
| 3030 | 3217 |  | 
|---|
| 3031 | 3218 | Note, reading the trace_pipe file will block until more input is | 
|---|
| 3032 |  | -added. | 
|---|
|  | 3219 | +added. This is contrary to the trace file. If any process opened | 
|---|
|  | 3220 | +the trace file for reading, it will actually disable tracing and | 
|---|
|  | 3221 | +prevent new entries from being added. The trace_pipe file does | 
|---|
|  | 3222 | +not have this limitation. | 
|---|
| 3033 | 3223 |  | 
|---|
| 3034 | 3224 | trace entries | 
|---|
| 3035 | 3225 | ------------- | 
|---|
| .. | .. | 
|---|
| 3193 | 3383 |  | 
|---|
| 3194 | 3384 | As you can see, the new directory looks similar to the tracing directory | 
|---|
| 3195 | 3385 | itself. In fact, it is very similar, except that the buffer and | 
|---|
| 3196 |  | -events are agnostic from the main director, or from any other | 
|---|
|  | 3386 | +events are agnostic from the main directory, or from any other | 
|---|
| 3197 | 3387 | instances that are created. | 
|---|
| 3198 | 3388 |  | 
|---|
| 3199 | 3389 | The files in the new directory work just like the files with the | 
|---|