~hc/RK356X_SDK_RELEASE.git

..	..	@@ -89,12 +89,13 @@
89	89	- socket: processor socket number the task ran at the time of sample
90	90	- srcline: filename and line number executed at the time of sample. The
91	91	DWARF debugging info must be provided.
92		- - srcfile: file name of the source file of the same. Requires dwarf
	92	+ - srcfile: file name of the source file of the samples. Requires dwarf
93	93	information.
94	94	- weight: Event specific weight, e.g. memory latency or transaction
95	95	abort cost. This is the global weight.
96	96	- local_weight: Local weight version of the weight above.
97	97	- cgroup_id: ID derived from cgroup namespace device and inode numbers.
	98	+ - cgroup: cgroup pathname in the cgroupfs.
98	99	- transaction: Transaction abort flags.
99	100	- overhead: Overhead percentage of sample
100	101	- overhead_sys: Overhead percentage of sample running in system mode
..	..	@@ -105,6 +106,8 @@
105	106	guest machine
106	107	- sample: Number of samples
107	108	- period: Raw number of event count of sample
	109	+ - time: Separate the samples by time stamp with the resolution specified by
	110	+ --time-quantum (default 100ms). Specify with overhead and before it.
108	111
109	112	By default, comm, dso and symbol keys are used.
110	113	(i.e. --sort comm,dso,symbol)
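
The `time` sort key in the hunk above groups samples into buckets whose width is set by --time-quantum (default 100ms). A minimal sketch of that bucketing follows. It assumes each timestamp is rounded down to the start of its quantum; the helper names `quantize` and `bucket_samples` are hypothetical and not part of perf.

```python
from collections import Counter

NS_PER_MS = 1_000_000
DEFAULT_QUANTUM_NS = 100 * NS_PER_MS  # mirrors the documented 100ms default


def quantize(timestamp_ns: int, quantum_ns: int = DEFAULT_QUANTUM_NS) -> int:
    """Round a timestamp down to the start of its quantum (assumed rule)."""
    if quantum_ns <= 0:
        raise ValueError("quantum must be positive")
    return timestamp_ns - (timestamp_ns % quantum_ns)


def bucket_samples(timestamps_ns, quantum_ns: int = DEFAULT_QUANTUM_NS) -> Counter:
    """Count samples per bucket, keyed by bucket start time in nanoseconds."""
    return Counter(quantize(t, quantum_ns) for t in timestamps_ns)


if __name__ == "__main__":
    # Hypothetical sample timestamps, in nanoseconds.
    samples = [5 * NS_PER_MS, 99 * NS_PER_MS, 100 * NS_PER_MS, 250 * NS_PER_MS]
    for start, count in sorted(bucket_samples(samples).items()):
        print(f"bucket starting at {start // NS_PER_MS}ms: {count} sample(s)")
```

A smaller quantum gives finer time resolution but more histogram entries, which is presumably why the option is configurable.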
..	..	@@ -125,6 +128,14 @@
125	128
126	129	And default sort keys are changed to comm, dso_from, symbol_from, dso_to
127	130	and symbol_to, see '--branch-stack'.
	131	+
	132	+ When the sort key symbol is specified, columns "IPC" and "IPC Coverage"
	133	+ are enabled automatically. Column "IPC" reports the average IPC per function
	134	+ and column "IPC Coverage" reports the percentage of instructions with
	135	+ sampled IPC in this function. IPC means Instructions Per Cycle. If it's low,
	136	+ it indicates there may be a performance bottleneck when the function is
	137	+ executed, such as a memory access bottleneck. If a function has high overhead
	138	+ and low IPC, it's worth further analyzing it to optimize its performance.
128	139
129	140	If the --mem-mode option is used, the following sort keys are also available
130	141	(incompatible with --branch-stack):
..	..	@@ -244,7 +255,7 @@
244	255	Usually more convenient to use --branch-history for this.
245	256
246	257	value can be:
247		- - percent: diplay overhead percent (default)
	258	+ - percent: display overhead percent (default)
248	259	- period: display event period
249	260	- count: display event count
250	261
..	..	@@ -357,9 +368,20 @@
357	368	--objdump=<path>::
358	369	Path to objdump binary.
359	370
	371	+--prefix=PREFIX::
	372	+--prefix-strip=N::
	373	+ Remove first N entries from source file path names in executables
	374	+ and add PREFIX. This allows displaying source code compiled on systems
	375	+ with a different file system layout.
	376	+
360	377	--group::
361	378	Show event group information together. It forces group output also
362	379	if there are no groups defined in data file.
	380	+
	381	+--group-sort-idx::
	382	+ Sort the output by the event at the index n in group. If n is invalid,
	383	+ sort by the first event. It can support multiple groups with different
	384	+ numbers of events. WARNING: This should be used on grouped events.
363	385
364	386	--demangle::
365	387	Demangle symbol names to human readable form. It's enabled by default,
..	..	@@ -402,12 +424,13 @@
402	424
403	425	--time::
404	426	Only analyze samples within given time window: <start>,<stop>. Times
405		- have the format seconds.microseconds. If start is not given (i.e., time
	427	+ have the format seconds.nanoseconds. If start is not given (i.e. time
406	428	string is ',x.y') then analysis starts at the beginning of the file. If
407		- stop time is not given (i.e, time string is 'x.y,') then analysis goes
408		- to end of file.
	429	+ stop time is not given (i.e. time string is 'x.y,') then analysis goes
	430	+ to end of file. Multiple ranges can be separated by spaces, which
	431	+ requires the argument to be quoted e.g. --time "1234.567,1234.789 1235,"
409	432
410		- Also support time percent with multiple time range. Time string is
	433	+ Also support time percent with multiple time ranges. Time string is
411	434	'a%/n,b%/m,...' or 'a%-b%,c%-d%,...'.
412	435
413	436	For example:
..	..	@@ -426,6 +449,23 @@
426	449	Select from 0% to 10% and 30% to 40% slices:
427	450
428	451	perf report --time 0%-10%,30%-40%
	452	+
	453	+--switch-on EVENT_NAME::
	454	+ Only consider events after this event is found.
	455	+
	456	+ This is useful for measuring a workload only after some initialization
	457	+ phase is over, e.g. insert a perf probe at that point and then use this
	458	+ option with that probe.
	459	+
	460	+--switch-off EVENT_NAME::
	461	+ Stop considering events after this event is found.
	462	+
	463	+--show-on-off-events::
	464	+ Show the --switch-on/off events too. This has no effect in 'perf report' now
	465	+ but probably we'll make the default not to show the switch-on/off events
	466	+ on the --group mode and if there is only one event besides the off/on ones,
	467	+ go straight to the histogram browser, just like 'perf report' with no events
	468	+ explicitly specified does.
429	469
430	470	--itrace::
431	471	Options for decoding instruction tracing data. The options are:
..	..	@@ -448,8 +488,23 @@
448	488	This option extends the perf report to show reference callgraphs,
449	489	which are collected by a reference event, in events that have no callgraph.
450	490
	491	+--stitch-lbr::
	492	+ Show callgraph with stitched LBRs, which may give a more complete
	493	+ callgraph. The perf.data file must have been obtained using
	494	+ perf record --call-graph lbr.
	495	+ Disabled by default. In common cases with call stack overflows,
	496	+ it can recreate better call stacks than the default lbr call stack
	497	+ output. But this approach is not foolproof. There can be cases
	498	+ where it creates incorrect call stacks from incorrect matches.
	499	+ The known limitations include exception handling such as
	500	+ setjmp/longjmp, where calls and returns will not match.
	501	+
451	502	--socket-filter::
452	503	Only report the samples on the processor socket that match with this filter
	504	+
	505	+--samples=N::
	506	+ Save N individual samples for each histogram entry to show context in the
	507	+ perf report TUI browser.
453	508
454	509	--raw-trace::
455	510	When displaying traceevent output, do not use print fmt or plugins.
..	..	@@ -469,6 +524,9 @@
469	524	Please note that not all mmaps are stored; options affecting which ones
470	525	are stored include 'perf record --data', for instance.
471	526
	527	+--ns::
	528	+ Show time stamps in nanoseconds.
	529	+
472	530	--stats::
473	531	Display overall events statistics without any further processing.
474	532	(like the one at the end of the perf report -D command)
..	..	@@ -486,8 +544,24 @@
486	544	The period/hits keywords set the base the percentage is computed
487	545	on - the samples period or the number of samples (hits).
488	546
	547	+--time-quantum::
	548	+ Configure time quantum for time sort key. Default 100ms.
	549	+ Accepts s, us, ms, ns units.
	550	+
	551	+--total-cycles::
	552	+ When --total-cycles is specified, it supports sorting for all blocks by
	553	+ 'Sampled Cycles%'. This is useful to concentrate on the globally hottest
	554	+ blocks. In output, there are some new columns:
	555	+
	556	+ 'Sampled Cycles%' - block sampled cycles aggregation / total sampled cycles
	557	+ 'Sampled Cycles' - block sampled cycles aggregation
	558	+ 'Avg Cycles%' - block average sampled cycles / sum of total block average
	559	+ sampled cycles
	560	+ 'Avg Cycles' - block average sampled cycles
	561	+
489	562	include::callchain-overhead-calculation.txt[]
490	563
491	564	SEE ALSO
492	565	--------
493		-linkperf:perf-stat[1], linkperf:perf-annotate[1], linkperf:perf-record[1]
	566	+linkperf:perf-stat[1], linkperf:perf-annotate[1], linkperf:perf-record[1],
	567	+linkperf:perf-intel-pt[1]
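
The four --total-cycles columns described in the last hunk are simple ratios, which the following sketch works through. It assumes a block's "average sampled cycles" is the mean of its per-sample cycle counts; the function name, the input layout and the block names are hypothetical and not taken from perf.

```python
def total_cycles_columns(blocks):
    """Compute the --total-cycles style columns for each block.

    `blocks` maps a block name to its list of sampled cycle counts
    (hypothetical input layout). Blocks with no observations are skipped.
    Rows are returned sorted by 'Sampled Cycles%', highest first.
    """
    observed = {name: cycles for name, cycles in blocks.items() if cycles}
    sampled = {name: sum(cycles) for name, cycles in observed.items()}
    # Assumption: "average" means the arithmetic mean over observations.
    average = {name: sampled[name] / len(cycles) for name, cycles in observed.items()}
    total_sampled = sum(sampled.values())
    total_average = sum(average.values())

    rows = []
    for name in sorted(observed, key=lambda n: sampled[n], reverse=True):
        rows.append({
            "block": name,
            "sampled_cycles_pct": 100.0 * sampled[name] / total_sampled,
            "sampled_cycles": sampled[name],
            "avg_cycles_pct": 100.0 * average[name] / total_average,
            "avg_cycles": average[name],
        })
    return rows


if __name__ == "__main__":
    # Made-up cycle counts, for illustration only.
    example = {"loop": [10, 30, 20], "init": [20], "copy": [10, 10]}
    for row in total_cycles_columns(example):
        print(row)
```

Sorting by the sampled-cycles share rather than the average puts blocks that are both costly and frequently executed at the top, which matches the stated goal of concentrating on the globally hottest blocks.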