2024-10-12 a5969cabbb4660eab42b6ef0412cbbd1200cf14d
kernel/Documentation/admin-guide/pm/intel_pstate.rst
....@@ -1,10 +1,13 @@
1
+.. SPDX-License-Identifier: GPL-2.0
2
+.. include:: <isonum.txt>
3
+
14 ===============================================
25 ``intel_pstate`` CPU Performance Scaling Driver
36 ===============================================
47
5
-::
8
+:Copyright: |copy| 2017 Intel Corporation
69
7
- Copyright (c) 2017 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>
10
+:Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
811
912
1013 General Information
....@@ -20,11 +23,10 @@
2023
2124 For the processors supported by ``intel_pstate``, the P-state concept is broader
2225 than just an operating frequency or an operating performance point (see the
23
-`LinuxCon Europe 2015 presentation by Kristen Accardi <LCEU2015_>`_ for more
26
+LinuxCon Europe 2015 presentation by Kristen Accardi [1]_ for more
2427 information about that). For this reason, the representation of P-states used
2528 by ``intel_pstate`` internally follows the hardware specification (for details
26
-refer to `Intel® 64 and IA-32 Architectures Software Developer’s Manual
27
-Volume 3: System Programming Guide <SDM_>`_). However, the ``CPUFreq`` core
29
+refer to the Intel Software Developer’s Manual [2]_). However, the ``CPUFreq`` core
2830 uses frequencies for identifying operating performance points of CPUs and
2931 frequencies are involved in the user space interface exposed by it, so
3032 ``intel_pstate`` maps its internal representation of P-states to frequencies too
....@@ -52,17 +54,21 @@
5254 Operation Modes
5355 ===============
5456
55
-``intel_pstate`` can operate in three different modes: in the active mode with
56
-or without hardware-managed P-states support and in the passive mode. Which of
57
-them will be in effect depends on what kernel command line options are used and
58
-on the capabilities of the processor.
57
+``intel_pstate`` can operate in two different modes, active or passive. In the
58
+active mode, it uses its own internal performance scaling governor algorithm or
59
+allows the hardware to do performance scaling by itself, while in the passive
60
+mode it responds to requests made by a generic ``CPUFreq`` governor implementing
61
+a certain performance scaling algorithm. Which of them will be in effect
62
+depends on what kernel command line options are used and on the capabilities of
63
+the processor.
5964
6065 Active Mode
6166 -----------
6267
63
-This is the default operation mode of ``intel_pstate``. If it works in this
64
-mode, the ``scaling_driver`` policy attribute in ``sysfs`` for all ``CPUFreq``
65
-policies contains the string "intel_pstate".
68
+This is the default operation mode of ``intel_pstate`` for processors with
69
+hardware-managed P-states (HWP) support. If it works in this mode, the
70
+``scaling_driver`` policy attribute in ``sysfs`` for all ``CPUFreq`` policies
71
+contains the string "intel_pstate".
6672
6773 In this mode the driver bypasses the scaling governors layer of ``CPUFreq`` and
6874 provides its own scaling algorithms for P-state selection. Those algorithms
....@@ -117,7 +123,9 @@
117123 internal P-state selection logic is expected to focus entirely on performance.
118124
119125 This will override the EPP/EPB setting coming from the ``sysfs`` interface
120
-(see `Energy vs Performance Hints`_ below).
126
+(see `Energy vs Performance Hints`_ below). Moreover, any attempts to change
127
+the EPP/EPB to a value different from 0 ("performance") via ``sysfs`` in this
128
+configuration will be rejected.
121129
122130 Also, in this configuration the range of P-states available to the processor's
123131 internal P-state selection logic is always restricted to the upper boundary
....@@ -136,12 +144,13 @@
136144 Active Mode Without HWP
137145 ~~~~~~~~~~~~~~~~~~~~~~~
138146
139
-This is the default operation mode for processors that do not support the HWP
140
-feature. It also is used by default with the ``intel_pstate=no_hwp`` argument
141
-in the kernel command line. However, in this mode ``intel_pstate`` may refuse
142
-to work with the given processor if it does not recognize it. [Note that
143
-``intel_pstate`` will never refuse to work with any processor with the HWP
144
-feature enabled.]
147
+This operation mode is optional for processors that do not support the HWP
148
+feature or when the ``intel_pstate=no_hwp`` argument is passed to the kernel in
149
+the command line. The active mode is used in those cases if the
150
+``intel_pstate=active`` argument is passed to the kernel in the command line.
151
+In this mode ``intel_pstate`` may refuse to work with processors that are not
152
+recognized by it. [Note that ``intel_pstate`` will never refuse to work with
153
+any processor with the HWP feature enabled.]
145154
146155 In this mode ``intel_pstate`` registers utilization update callbacks with the
147156 CPU scheduler in order to run a P-state selection algorithm, either
....@@ -186,10 +195,15 @@
186195 Passive Mode
187196 ------------
188197
189
-This mode is used if the ``intel_pstate=passive`` argument is passed to the
190
-kernel in the command line (it implies the ``intel_pstate=no_hwp`` setting too).
191
-Like in the active mode without HWP support, in this mode ``intel_pstate`` may
192
-refuse to work with the given processor if it does not recognize it.
198
+This is the default operation mode of ``intel_pstate`` for processors without
199
+hardware-managed P-states (HWP) support. It is always used if the
200
+``intel_pstate=passive`` argument is passed to the kernel in the command line
201
+regardless of whether or not the given processor supports HWP. [Note that the
202
+``intel_pstate=no_hwp`` setting causes the driver to start in the passive mode
203
+if it is not combined with ``intel_pstate=active``.] Like in the active mode
204
+without HWP support, in this mode ``intel_pstate`` may refuse to work with
205
+processors that are not recognized by it if HWP is prevented from being enabled
206
+through the kernel command line.
193207
194208 If the driver works in this mode, the ``scaling_driver`` policy attribute in
195209 ``sysfs`` for all ``CPUFreq`` policies contains the string "intel_cpufreq".
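
The mode selection rules described above can be summarized in a short Python
sketch. It is only an illustration under stated assumptions: the function name
``initial_mode()`` and the ``SCALING_DRIVER`` table are hypothetical, the option
names are given without the ``intel_pstate=`` prefix, and checking ``passive``
before ``active`` is an assumption made for the case in which both are passed.

.. code-block:: python

   # Illustrative model only; this is not the driver's actual logic.

   # String reported by the scaling_driver policy attribute in each mode.
   SCALING_DRIVER = {"active": "intel_pstate", "passive": "intel_cpufreq"}


   def initial_mode(options, hwp_supported):
       """Return "active" or "passive" for the given intel_pstate= options."""
       opts = set(options)
       # The passive mode is always used if it is requested explicitly
       # (assumption: it also wins if "active" is passed at the same time).
       if "passive" in opts:
           return "passive"
       if "active" in opts:
           return "active"
       # Otherwise the default depends on whether HWP can be enabled.
       hwp_enabled = hwp_supported and "no_hwp" not in opts
       return "active" if hwp_enabled else "passive"

For example, ``initial_mode(["no_hwp"], True)`` yields ``"passive"``, whereas
``initial_mode(["no_hwp", "active"], True)`` yields ``"active"``.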
....@@ -310,10 +324,9 @@
310324
311325 For this reason, there is a list of supported processors in ``intel_pstate`` and
312326 the driver initialization will fail if the detected processor is not in that
313
-list, unless it supports the `HWP feature <Active Mode_>`_. [The interface to
314
-obtain all of the information listed above is the same for all of the processors
315
-supporting the HWP feature, which is why they all are supported by
316
-``intel_pstate``.]
327
+list, unless it supports the HWP feature. [The interface to obtain all of the
328
+information listed above is the same for all of the processors supporting the
329
+HWP feature, which is why ``intel_pstate`` works with all of them.]
317330
318331
319332 User Space Interface in ``sysfs``
....@@ -417,11 +430,16 @@
417430 as well as the per-policy ones) are then reset to their default
418431 values, possibly depending on the target operation mode.]
419432
420
- That only is supported in some configurations, though (for example, if
421
- the `HWP feature is enabled in the processor <Active Mode With HWP_>`_,
422
- the operation mode of the driver cannot be changed), and if it is not
423
- supported in the current configuration, writes to this attribute will
424
- fail with an appropriate error.
433
+``energy_efficiency``
434
+ This attribute is only present on platforms with CPUs matching the Kaby
435
+ Lake or Coffee Lake desktop CPU model. By default, energy-efficiency
436
+ optimizations are disabled on these CPU models if HWP is enabled.
437
+ Enabling energy-efficiency optimizations may limit maximum operating
438
+ frequency with or without the HWP feature. With HWP enabled, the
439
+ optimizations are done only in the turbo frequency range. Without it,
440
+ they are done in the entire available frequency range. Setting this
441
+ attribute to "1" enables the energy-efficiency optimizations and setting
442
+ it to "0" disables them.
425443
426444 Interpretation of Policy Attributes
427445 -----------------------------------
....@@ -465,6 +483,13 @@
465483 policy for the time interval between the last two invocations of the
466484 driver's utilization update callback by the CPU scheduler for that CPU.
467485
486
+One more policy attribute is present if the HWP feature is enabled in the
487
+processor:
488
+
489
+``base_frequency``
490
+ Shows the base frequency of the CPU. Any frequency above this will be
491
+ in the turbo frequency range.
492
+
468493 The meaning of these attributes in the `passive mode <Passive Mode_>`_ is the
469494 same as for other scaling drivers.
470495
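
The relationship between ``base_frequency`` and the turbo frequency range can
be illustrated with a minimal Python sketch. The helper names are hypothetical.
It is also assumed here that both values are read from the same ``CPUFreq``
policy directory (for example, ``/sys/devices/system/cpu/cpufreq/policy0/``)
and are expressed in the same unit (kHz).

.. code-block:: python

   from pathlib import Path


   def read_khz(attr_path):
       """Read a frequency attribute (assumed to hold an integer in kHz)."""
       return int(Path(attr_path).read_text().strip())


   def in_turbo_range(freq_khz, base_khz):
       """Any frequency above the base frequency is in the turbo range."""
       return freq_khz > base_khz

With ``policy`` pointing to a policy directory, a check could then look like
``in_turbo_range(read_khz(policy / "scaling_cur_freq"),
read_khz(policy / "base_frequency"))``.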
....@@ -488,15 +513,23 @@
488513
489514 2. Each individual CPU is affected by its own per-policy limits (that is, it
490515 cannot be requested to run faster than its own per-policy maximum and it
491
- cannot be requested to run slower than its own per-policy minimum).
516
+ cannot be requested to run slower than its own per-policy minimum). The
517
+ effective performance depends on whether the platform supports per core
518
+ P-states, on whether hyper-threading is enabled, and on current performance
519
+ requests from other CPUs. When the platform does not support per core P-states, the
520
+ effective performance can be more than the policy limits set on a CPU, if
521
+ other CPUs are requesting higher performance at that moment. Even with per
522
+ core P-states support, when hyper-threading is enabled, if the sibling CPU
523
+ is requesting higher performance, the other siblings will get higher
524
+ performance than their policy limits.
492525
493526 3. The global and per-policy limits can be set independently.
494527
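
The effect of shared P-states on the effective performance can be pictured with
a toy Python model. It is only a sketch: the function name is hypothetical and
taking the maximum request within each group of CPUs sharing a P-state is a
simplifying assumption, as the exact coordination is done by the hardware.

.. code-block:: python

   def effective_performance(requests, domains):
       """Toy model: every CPU in a domain runs at the highest request in it.

       requests -- dict mapping a CPU number to its requested performance
       domains  -- iterable of sets of CPUs that share one P-state
       """
       effective = {}
       for domain in domains:
           level = max(requests[cpu] for cpu in domain)
           for cpu in domain:
               effective[cpu] = level
       return effective

Without per core P-states, all CPUs belong to a single domain. With per core
P-states and hyper-threading enabled, each domain consists of the sibling
threads of one core.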
495
-If the `HWP feature is enabled in the processor <Active Mode With HWP_>`_, the
496
-resulting effective values are written into its registers whenever the limits
497
-change in order to request its internal P-state selection logic to always set
498
-P-states within these limits. Otherwise, the limits are taken into account by
499
-scaling governors (in the `passive mode <Passive Mode_>`_) and by the driver
528
+In the `active mode with the HWP feature enabled <Active Mode With HWP_>`_, the
529
+resulting effective values are written into hardware registers whenever the
530
+limits change in order to request its internal P-state selection logic to always
531
+set P-states within these limits. Otherwise, the limits are taken into account
532
+by scaling governors (in the `passive mode <Passive Mode_>`_) and by the driver
500533 every time before setting a new P-state for a CPU.
501534
502535 Additionally, if the ``intel_pstate=per_cpu_perf_limits`` command line argument
....@@ -507,12 +540,11 @@
507540 Energy vs Performance Hints
508541 ---------------------------
509542
510
-If ``intel_pstate`` works in the `active mode with the HWP feature enabled
511
-<Active Mode With HWP_>`_ in the processor, additional attributes are present
512
-in every ``CPUFreq`` policy directory in ``sysfs``. They are intended to allow
513
-user space to help ``intel_pstate`` to adjust the processor's internal P-state
514
-selection logic by focusing it on performance or on energy-efficiency, or
515
-somewhere between the two extremes:
543
+If the hardware-managed P-states (HWP) feature is enabled in the processor, additional
544
+attributes, intended to allow user space to help ``intel_pstate`` to adjust the
545
+processor's internal P-state selection logic by focusing it on performance or on
546
+energy-efficiency, or somewhere between the two extremes, are present in every
547
+``CPUFreq`` policy directory in ``sysfs``. They are:
516548
517549 ``energy_performance_preference``
518550 Current value of the energy vs performance hint for the given policy
....@@ -531,7 +563,11 @@
531563 Strings written to the ``energy_performance_preference`` attribute are
532564 internally translated to integer values written to the processor's
533565 Energy-Performance Preference (EPP) knob (if supported) or its
534
-Energy-Performance Bias (EPB) knob.
566
+Energy-Performance Bias (EPB) knob. It is also possible to write a positive
567
+integer value between 0 and 255, if the EPP feature is present. If the EPP
568
+feature is not present, writing an integer value to this attribute is not
569
+supported. In this case, the user can use the
570
+"/sys/devices/system/cpu/cpu*/power/energy_perf_bias" interface.
535571
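
The rules for values written to ``energy_performance_preference`` can be
sketched in Python as follows. The function name is hypothetical, and the list
of symbolic strings is an assumption that should be checked against the
``energy_performance_available_preferences`` attribute on the given system. The
0 to 255 range is treated as inclusive here.

.. code-block:: python

   # Assumed set of symbolic preference strings (verify on the target system).
   EPP_STRINGS = ("default", "performance", "balance_performance",
                  "balance_power", "power")


   def format_epp_write(value, epp_supported):
       """Return the text to write to energy_performance_preference."""
       if isinstance(value, str):
           if value not in EPP_STRINGS:
               raise ValueError(f"unknown preference string: {value!r}")
           return value
       if isinstance(value, bool) or not isinstance(value, int):
           raise TypeError("value must be a preference string or an integer")
       # Raw integers are only accepted if the EPP feature is present.
       if not epp_supported:
           raise ValueError("raw integers require the EPP feature")
       if not 0 <= value <= 255:
           raise ValueError("raw EPP value must be in the range 0..255")
       return str(value)

For instance, ``format_epp_write(128, True)`` returns ``"128"``, while the same
call with ``epp_supported=False`` raises ``ValueError``.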
536572 [Note that tasks may be migrated from one CPU to another by the scheduler's
537573 load-balancing algorithm and if different energy vs performance hints are
....@@ -546,9 +582,9 @@
546582
547583 On the majority of systems supported by ``intel_pstate``, the ACPI tables
548584 provided by the platform firmware contain ``_PSS`` objects returning information
549
-that can be used for CPU performance scaling (refer to the `ACPI specification`_
550
-for details on the ``_PSS`` objects and the format of the information returned
551
-by them).
585
+that can be used for CPU performance scaling (refer to the ACPI specification
586
+[3]_ for details on the ``_PSS`` objects and the format of the information
587
+returned by them).
552588
553589 The information returned by the ACPI ``_PSS`` objects is used by the
554590 ``acpi-cpufreq`` scaling driver. On systems supported by ``intel_pstate``
....@@ -612,11 +648,13 @@
612648 Do not register ``intel_pstate`` as the scaling driver even if the
613649 processor is supported by it.
614650
651
+``active``
652
+ Register ``intel_pstate`` in the `active mode <Active Mode_>`_ to start
653
+ with.
654
+
615655 ``passive``
616656 Register ``intel_pstate`` in the `passive mode <Passive Mode_>`_ to
617657 start with.
618
-
619
- This option implies the ``no_hwp`` one described below.
620658
621659 ``force``
622660 Register ``intel_pstate`` as the scaling driver instead of
....@@ -632,13 +670,12 @@
632670 driver is used instead of ``acpi-cpufreq``.
633671
634672 ``no_hwp``
635
- Do not enable the `hardware-managed P-states (HWP) feature
636
- <Active Mode With HWP_>`_ even if it is supported by the processor.
673
+ Do not enable the hardware-managed P-states (HWP) feature even if it is
674
+ supported by the processor.
637675
638676 ``hwp_only``
639677 Register ``intel_pstate`` as the scaling driver only if the
640
- `hardware-managed P-states (HWP) feature <Active Mode With HWP_>`_ is
641
- supported by the processor.
678
+ hardware-managed P-states (HWP) feature is supported by the processor.
642679
643680 ``support_acpi_ppc``
644681 Take ACPI ``_PPC`` performance limits into account.
....@@ -685,7 +722,7 @@
685722
686723 The ``ftrace`` interface can be used for low-level diagnostics of
687724 ``intel_pstate``. For example, to check how often the function to set a
688
-P-state is called, the ``ftrace`` filter can be set to to
725
+P-state is called, the ``ftrace`` filter can be set to
689726 :c:func:`intel_pstate_set_pstate`::
690727
691728 # cd /sys/kernel/debug/tracing/
....@@ -713,6 +750,14 @@
713750 <idle>-0 [000] ..s. 2537.654843: intel_pstate_set_pstate <-intel_pstate_timer_func
714751
715752
716
-.. _LCEU2015: http://events.linuxfoundation.org/sites/events/files/slides/LinuxConEurope_2015.pdf
717
-.. _SDM: http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html
718
-.. _ACPI specification: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf
753
+References
754
+==========
755
+
756
+.. [1] Kristen Accardi, *Balancing Power and Performance in the Linux Kernel*,
757
+ https://events.static.linuxfound.org/sites/events/files/slides/LinuxConEurope_2015.pdf
758
+
759
+.. [2] *Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3: System Programming Guide*,
760
+ https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html
761
+
762
+.. [3] *Advanced Configuration and Power Interface Specification*,
763
+ https://uefi.org/sites/default/files/resources/ACPI_6_3_final_Jan30.pdf