Merge branches 'pm-avs', 'pm-docs' and 'pm-tools'
* pm-avs:
  ARM: OMAP2+: SmartReflex: add omap_sr_pdata definition
  power: avs: smartreflex: Remove superfluous cast in debugfs_create_file() call

* pm-docs:
  PM: Wrap documentation to fit in 80 columns

* pm-tools:
  cpupower: ToDo: Update ToDo with ideas for per_cpu_schedule handling
  cpupower: mperf_monitor: Update cpupower to use the RDPRU instruction
  cpupower: mperf_monitor: Introduce per_cpu_schedule flag
  cpupower: Move needs_root variable into a sub-struct
  cpupower: Handle set and info subcommands correctly
  pm-graph info added to MAINTAINERS
  tools/power/cpupower: Fix initializer override in hsw_ext_cstates
commit e350b60f4e
@@ -39,9 +39,10 @@ c) Compile the driver directly into the kernel and try the test modes of
 d) Attempt to hibernate with the driver compiled directly into the kernel
    in the "reboot", "shutdown" and "platform" modes.

-e) Try the test modes of suspend (see: Documentation/power/basic-pm-debugging.rst,
-   2). [As far as the STR tests are concerned, it should not matter whether or
-   not the driver is built as a module.]
+e) Try the test modes of suspend (see:
+   Documentation/power/basic-pm-debugging.rst, 2). [As far as the STR tests are
+   concerned, it should not matter whether or not the driver is built as a
+   module.]

 f) Attempt to suspend to RAM using the s2ram tool with the driver loaded
    (see: Documentation/power/basic-pm-debugging.rst, 2).
@@ -215,30 +215,31 @@ VI. Are there any precautions to be taken to prevent freezing failures?

 Yes, there are.

-First of all, grabbing the 'system_transition_mutex' lock to mutually exclude a piece of code
-from system-wide sleep such as suspend/hibernation is not encouraged.
-If possible, that piece of code must instead hook onto the suspend/hibernation
-notifiers to achieve mutual exclusion. Look at the CPU-Hotplug code
-(kernel/cpu.c) for an example.
+First of all, grabbing the 'system_transition_mutex' lock to mutually exclude a
+piece of code from system-wide sleep such as suspend/hibernation is not
+encouraged. If possible, that piece of code must instead hook onto the
+suspend/hibernation notifiers to achieve mutual exclusion. Look at the
+CPU-Hotplug code (kernel/cpu.c) for an example.

-However, if that is not feasible, and grabbing 'system_transition_mutex' is deemed necessary,
-it is strongly discouraged to directly call mutex_[un]lock(&system_transition_mutex) since
-that could lead to freezing failures, because if the suspend/hibernate code
-successfully acquired the 'system_transition_mutex' lock, and hence that other entity failed
-to acquire the lock, then that task would get blocked in TASK_UNINTERRUPTIBLE
-state. As a consequence, the freezer would not be able to freeze that task,
-leading to freezing failure.
+However, if that is not feasible, and grabbing 'system_transition_mutex' is
+deemed necessary, it is strongly discouraged to directly call
+mutex_[un]lock(&system_transition_mutex) since that could lead to freezing
+failures, because if the suspend/hibernate code successfully acquired the
+'system_transition_mutex' lock, and hence that other entity failed to acquire
+the lock, then that task would get blocked in TASK_UNINTERRUPTIBLE state. As a
+consequence, the freezer would not be able to freeze that task, leading to
+freezing failure.

 However, the [un]lock_system_sleep() APIs are safe to use in this scenario,
 since they ask the freezer to skip freezing this task, since it is anyway
-"frozen enough" as it is blocked on 'system_transition_mutex', which will be released
-only after the entire suspend/hibernation sequence is complete.
-So, to summarize, use [un]lock_system_sleep() instead of directly using
+"frozen enough" as it is blocked on 'system_transition_mutex', which will be
+released only after the entire suspend/hibernation sequence is complete. So, to
+summarize, use [un]lock_system_sleep() instead of directly using
 mutex_[un]lock(&system_transition_mutex). That would prevent freezing failures.

 V. Miscellaneous
 ================

 /sys/power/pm_freeze_timeout controls how long it will cost at most to freeze
-all user space processes or all freezable kernel threads, in unit of millisecond.
-The default value is 20000, with range of unsigned integer.
+all user space processes or all freezable kernel threads, in unit of
+millisecond. The default value is 20000, with range of unsigned integer.
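
For illustration, a minimal sketch of the safe pattern recommended above
(hypothetical driver code, not part of this merge; it assumes only the
kernel-internal lock_system_sleep()/unlock_system_sleep() helpers):

    #include <linux/suspend.h>

    /* Hypothetical driver routine that must not race with system-wide
     * suspend/hibernation.
     */
    static void my_driver_sync_op(void)
    {
            /*
             * Safe: lock_system_sleep() marks the caller as skippable for
             * the freezer, so blocking on system_transition_mutex here
             * cannot cause a freezing failure.
             */
            lock_system_sleep();

            /* ... work that must be mutually exclusive with suspend ... */

            unlock_system_sleep();

            /*
             * Discouraged alternative (see above):
             * mutex_lock(&system_transition_mutex); ...
             * mutex_unlock(&system_transition_mutex);
             */
    }
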
@@ -73,19 +73,21 @@ factors. Example usage: Thermal management or other exceptional situations where
 SoC framework might choose to disable a higher frequency OPP to safely continue
 operations until that OPP could be re-enabled if possible.

-OPP library facilitates this concept in it's implementation. The following
+OPP library facilitates this concept in its implementation. The following
 operational functions operate only on available opps:
-opp_find_freq_{ceil, floor}, dev_pm_opp_get_voltage, dev_pm_opp_get_freq, dev_pm_opp_get_opp_count
+opp_find_freq_{ceil, floor}, dev_pm_opp_get_voltage, dev_pm_opp_get_freq,
+dev_pm_opp_get_opp_count

-dev_pm_opp_find_freq_exact is meant to be used to find the opp pointer which can then
-be used for dev_pm_opp_enable/disable functions to make an opp available as required.
+dev_pm_opp_find_freq_exact is meant to be used to find the opp pointer
+which can then be used for dev_pm_opp_enable/disable functions to make an
+opp available as required.

 WARNING: Users of OPP library should refresh their availability count using
-get_opp_count if dev_pm_opp_enable/disable functions are invoked for a device, the
-exact mechanism to trigger these or the notification mechanism to other
-dependent subsystems such as cpufreq are left to the discretion of the SoC
-specific framework which uses the OPP library. Similar care needs to be taken
-care to refresh the cpufreq table in cases of these operations.
+get_opp_count if dev_pm_opp_enable/disable functions are invoked for a
+device, the exact mechanism to trigger these or the notification mechanism
+to other dependent subsystems such as cpufreq are left to the discretion of
+the SoC specific framework which uses the OPP library. Similar care needs
+to be taken care to refresh the cpufreq table in cases of these operations.

 2. Initial OPP List Registration
 ================================
@@ -99,11 +101,11 @@ OPPs dynamically using the dev_pm_opp_enable / disable functions.
 dev_pm_opp_add
 	Add a new OPP for a specific domain represented by the device pointer.
 	The OPP is defined using the frequency and voltage. Once added, the OPP
-	is assumed to be available and control of it's availability can be done
-	with the dev_pm_opp_enable/disable functions. OPP library internally stores
-	and manages this information in the opp struct. This function may be
-	used by SoC framework to define a optimal list as per the demands of
-	SoC usage environment.
+	is assumed to be available and control of its availability can be done
+	with the dev_pm_opp_enable/disable functions. OPP library
+	internally stores and manages this information in the opp struct.
+	This function may be used by SoC framework to define a optimal list
+	as per the demands of SoC usage environment.

 WARNING:
 	Do not use this function in interrupt context.
@@ -354,7 +356,7 @@ struct dev_pm_opp

 struct device
 	This is used to identify a domain to the OPP layer. The
-	nature of the device and it's implementation is left to the user of
+	nature of the device and its implementation is left to the user of
 	OPP library such as the SoC framework.

 Overall, in a simplistic view, the data structure operations is represented as
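
For illustration, a sketch of the registration flow these functions describe
(hypothetical device and values, not part of this merge):

    #include <linux/pm_opp.h>

    /* Register an initial OPP list for @dev; frequencies are in Hz and
     * voltages in uV. The values below are purely illustrative.
     */
    static int soc_register_opps(struct device *dev)
    {
            int ret;

            ret = dev_pm_opp_add(dev, 600000000, 1000000);  /* 600 MHz @ 1.00 V */
            if (ret)
                    return ret;

            ret = dev_pm_opp_add(dev, 800000000, 1100000);  /* 800 MHz @ 1.10 V */
            if (ret)
                    return ret;

            /* Added OPPs start out available; a thermal framework could
             * later call dev_pm_opp_disable() on the top one.
             */
            return dev_pm_opp_add(dev, 1000000000, 1200000); /* 1 GHz @ 1.20 V */
    }
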
@@ -426,12 +426,12 @@ pm->runtime_idle() callback.
 2.4. System-Wide Power Transitions
 ----------------------------------
 There are a few different types of system-wide power transitions, described in
-Documentation/driver-api/pm/devices.rst. Each of them requires devices to be handled
-in a specific way and the PM core executes subsystem-level power management
-callbacks for this purpose. They are executed in phases such that each phase
-involves executing the same subsystem-level callback for every device belonging
-to the given subsystem before the next phase begins. These phases always run
-after tasks have been frozen.
+Documentation/driver-api/pm/devices.rst. Each of them requires devices to be
+handled in a specific way and the PM core executes subsystem-level power
+management callbacks for this purpose. They are executed in phases such that
+each phase involves executing the same subsystem-level callback for every device
+belonging to the given subsystem before the next phase begins. These phases
+always run after tasks have been frozen.

 2.4.1. System Suspend
 ^^^^^^^^^^^^^^^^^^^^^
@@ -636,12 +636,12 @@ System restore requires a hibernation image to be loaded into memory and the
 pre-hibernation memory contents to be restored before the pre-hibernation system
 activity can be resumed.

-As described in Documentation/driver-api/pm/devices.rst, the hibernation image is loaded
-into memory by a fresh instance of the kernel, called the boot kernel, which in
-turn is loaded and run by a boot loader in the usual way. After the boot kernel
-has loaded the image, it needs to replace its own code and data with the code
-and data of the "hibernated" kernel stored within the image, called the image
-kernel. For this purpose all devices are frozen just like before creating
+As described in Documentation/driver-api/pm/devices.rst, the hibernation image
+is loaded into memory by a fresh instance of the kernel, called the boot kernel,
+which in turn is loaded and run by a boot loader in the usual way. After the
+boot kernel has loaded the image, it needs to replace its own code and data with
+the code and data of the "hibernated" kernel stored within the image, called the
+image kernel. For this purpose all devices are frozen just like before creating
 the image during hibernation, in the

 prepare, freeze, freeze_noirq
@@ -691,8 +691,8 @@ controlling the runtime power management of their devices.

 At the time of this writing there are two ways to define power management
 callbacks for a PCI device driver, the recommended one, based on using a
-dev_pm_ops structure described in Documentation/driver-api/pm/devices.rst, and the
-"legacy" one, in which the .suspend(), .suspend_late(), .resume_early(), and
+dev_pm_ops structure described in Documentation/driver-api/pm/devices.rst, and
+the "legacy" one, in which the .suspend(), .suspend_late(), .resume_early(), and
 .resume() callbacks from struct pci_driver are used. The legacy approach,
 however, doesn't allow one to define runtime power management callbacks and is
 not really suitable for any new drivers. Therefore it is not covered by this
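
For illustration, the recommended dev_pm_ops arrangement mentioned above, as a
sketch (hypothetical driver and callback names, not part of this merge):

    #include <linux/pci.h>
    #include <linux/pm.h>

    /* Hypothetical system sleep callbacks for a PCI driver. */
    static int my_pci_suspend(struct device *dev) { return 0; }
    static int my_pci_resume(struct device *dev)  { return 0; }

    static const struct dev_pm_ops my_pci_pm_ops = {
            SET_SYSTEM_SLEEP_PM_OPS(my_pci_suspend, my_pci_resume)
    };

    static struct pci_driver my_pci_driver = {
            .name = "my_pci",
            /* .id_table, .probe and .remove omitted for brevity */
            .driver.pm = &my_pci_pm_ops,  /* recommended: dev_pm_ops */
            /* the legacy .suspend()/.resume() fields are left unused */
    };
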
@@ -8,8 +8,8 @@ one of the parameters.

 Two different PM QoS frameworks are available:
 1. PM QoS classes for cpu_dma_latency
-2. the per-device PM QoS framework provides the API to manage the per-device latency
-constraints and PM QoS flags.
+2. The per-device PM QoS framework provides the API to manage the
+   per-device latency constraints and PM QoS flags.

 Each parameters have defined units:
@@ -47,14 +47,14 @@ void pm_qos_add_request(handle, param_class, target_value):
 pm_qos API functions.

 void pm_qos_update_request(handle, new_target_value):
-Will update the list element pointed to by the handle with the new target value
-and recompute the new aggregated target, calling the notification tree if the
-target is changed.
+Will update the list element pointed to by the handle with the new target
+value and recompute the new aggregated target, calling the notification tree
+if the target is changed.

 void pm_qos_remove_request(handle):
-Will remove the element. After removal it will update the aggregate target and
-call the notification tree if the target was changed as a result of removing
-the request.
+Will remove the element. After removal it will update the aggregate target
+and call the notification tree if the target was changed as a result of
+removing the request.

 int pm_qos_request(param_class):
 Returns the aggregated value for a given PM QoS class.
@@ -167,9 +167,9 @@ int dev_pm_qos_expose_flags(device, value)
 change the value of the PM_QOS_FLAG_NO_POWER_OFF flag.

 void dev_pm_qos_hide_flags(device)
-Drop the request added by dev_pm_qos_expose_flags() from the device's PM QoS list
-of flags and remove sysfs attribute pm_qos_no_power_off from the device's power
-directory.
+Drop the request added by dev_pm_qos_expose_flags() from the device's PM QoS
+list of flags and remove sysfs attribute pm_qos_no_power_off from the device's
+power directory.

 Notification mechanisms:
@@ -179,8 +179,8 @@ int dev_pm_qos_add_notifier(device, notifier, type):
 Adds a notification callback function for the device for a particular request
 type.

-The callback is called when the aggregated value of the device constraints list
-is changed.
+The callback is called when the aggregated value of the device constraints
+list is changed.

 int dev_pm_qos_remove_notifier(device, notifier, type):
 Removes the notification callback function for the device.
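
For illustration, the request lifecycle these functions implement, as a sketch
(hypothetical driver fragment, not part of this merge; the handle is a
struct pm_qos_request in this era of the API):

    #include <linux/pm_qos.h>

    static struct pm_qos_request my_latency_req;

    static void my_device_start(void)
    {
            /* Ask that CPU/DMA latency stay at or below 20 usec. */
            pm_qos_add_request(&my_latency_req, PM_QOS_CPU_DMA_LATENCY, 20);
    }

    static void my_device_relax(void)
    {
            /* Raise the bound to 100 usec; notifiers run if the
             * aggregated target changes.
             */
            pm_qos_update_request(&my_latency_req, 100);
    }

    static void my_device_stop(void)
    {
            /* Drop the request and recompute the aggregate target. */
            pm_qos_remove_request(&my_latency_req);
    }
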
@@ -268,8 +268,8 @@ defined in include/linux/pm.h:
   `unsigned int runtime_auto;`
     - if set, indicates that the user space has allowed the device driver to
       power manage the device at run time via the /sys/devices/.../power/control
-      `interface;` it may only be modified with the help of the pm_runtime_allow()
-      and pm_runtime_forbid() helper functions
+      `interface;` it may only be modified with the help of the
+      pm_runtime_allow() and pm_runtime_forbid() helper functions

   `unsigned int no_callbacks;`
     - indicates that the device does not use the runtime PM callbacks (see
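
For illustration, how the two helpers named above are typically used
(hypothetical probe fragment, not part of this merge):

    #include <linux/pm_runtime.h>

    /* Hypothetical probe fragment: opt the device into runtime power
     * management by default, setting the runtime_auto flag just as
     * writing "auto" to /sys/devices/.../power/control would.
     */
    static void my_probe_pm_setup(struct device *dev)
    {
            pm_runtime_allow(dev);
            /* pm_runtime_forbid(dev) would revert to "on". */
    }
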
@@ -106,8 +106,8 @@ execution during resume):
 * Release system_transition_mutex lock.


-It is to be noted here that the system_transition_mutex lock is acquired at the very
-beginning, when we are just starting out to suspend, and then released only
+It is to be noted here that the system_transition_mutex lock is acquired at the
+very beginning, when we are just starting out to suspend, and then released only
 after the entire cycle is complete (i.e., suspend + resume).

 ::
@@ -165,7 +165,8 @@ Important files and functions/entry points:

 - kernel/power/process.c : freeze_processes(), thaw_processes()
 - kernel/power/suspend.c : suspend_prepare(), suspend_enter(), suspend_finish()
-- kernel/cpu.c: cpu_[up|down](), _cpu_[up|down](), [disable|enable]_nonboot_cpus()
+- kernel/cpu.c: cpu_[up|down](), _cpu_[up|down](),
+  [disable|enable]_nonboot_cpus()
@@ -118,7 +118,8 @@ In a really perfect world::

     echo 1 > /proc/acpi/sleep       # for standby
     echo 2 > /proc/acpi/sleep       # for suspend to ram
-    echo 3 > /proc/acpi/sleep       # for suspend to ram, but with more power conservative
+    echo 3 > /proc/acpi/sleep       # for suspend to ram, but with more power
+                                    # conservative
     echo 4 > /proc/acpi/sleep       # for suspend to disk
     echo 5 > /proc/acpi/sleep       # for shutdown unfriendly the system
@@ -192,8 +193,8 @@ Q:

 A:
     The freezing of tasks is a mechanism by which user space processes and some
-    kernel threads are controlled during hibernation or system-wide suspend (on some
-    architectures). See freezing-of-tasks.txt for details.
+    kernel threads are controlled during hibernation or system-wide suspend (on
+    some architectures). See freezing-of-tasks.txt for details.

 Q:
     What is the difference between "platform" and "shutdown"?
@@ -282,7 +283,8 @@ A:
     suspend(PMSG_FREEZE): devices are frozen so that they don't interfere
                       with state snapshot

-    state snapshot: copy of whole used memory is taken with interrupts disabled
+    state snapshot: copy of whole used memory is taken with interrupts
+                    disabled

     resume(): devices are woken up so that we can write image to swap
@@ -353,8 +355,8 @@ Q:

 A:
     Generally, yes, you can. However, it requires you to use the "resume=" and
-    "resume_offset=" kernel command line parameters, so the resume from a swap file
-    cannot be initiated from an initrd or initramfs image. See
+    "resume_offset=" kernel command line parameters, so the resume from a swap
+    file cannot be initiated from an initrd or initramfs image. See
     swsusp-and-swap-files.txt for details.

 Q:
@@ -13000,6 +13000,15 @@ L:	linux-scsi@vger.kernel.org
 S:	Supported
 F:	drivers/scsi/pm8001/

+PM-GRAPH UTILITY
+M:	"Todd E Brandt" <todd.e.brandt@linux.intel.com>
+L:	linux-pm@vger.kernel.org
+W:	https://01.org/pm-graph
+B:	https://bugzilla.kernel.org/buglist.cgi?component=pm-graph&product=Tools
+T:	git git://github.com/intel/pm-graph
+S:	Supported
+F:	tools/power/pm-graph
+
 PNP SUPPORT
 M:	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
 S:	Maintained
@@ -905,7 +905,7 @@ static int omap_sr_probe(struct platform_device *pdev)
 	sr_info->dbg_dir = debugfs_create_dir(sr_info->name, sr_dbg_dir);

 	debugfs_create_file("autocomp", S_IRUGO | S_IWUSR, sr_info->dbg_dir,
-			    (void *)sr_info, &pm_sr_fops);
+			    sr_info, &pm_sr_fops);
 	debugfs_create_x32("errweight", S_IRUGO, sr_info->dbg_dir,
 			   &sr_info->err_weight);
 	debugfs_create_x32("errmaxlimit", S_IRUGO, sr_info->dbg_dir,
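
For context on why the cast was superfluous: in C, any object pointer converts
implicitly to void *, so passing sr_info directly is equivalent. A standalone
sketch:

    #include <stdio.h>

    static void takes_void_ptr(void *p)
    {
            (void)p;
    }

    int main(void)
    {
            int x = 0;

            takes_void_ptr(&x);          /* fine: implicit conversion */
            takes_void_ptr((void *)&x);  /* legal, but the cast is redundant */
            return 0;
    }
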
@@ -293,6 +293,9 @@ struct omap_sr_data {
 	struct voltagedomain		*voltdm;
 };

+
+extern struct omap_sr_data omap_sr_pdata[OMAP_SR_NR];
+
 #ifdef CONFIG_POWER_AVS_OMAP

 /* Smartreflex module enable/disable interface */
@@ -8,3 +8,17 @@ ToDos sorted by priority:
 - Add another c1e debug idle monitor
   -> Is by design racy with BIOS, but could be added
      with a --force option and some "be careful" messages
+- Add cpu_start()/cpu_stop() callbacks for monitor
+  -> This is to move the per_cpu logic from inside the
+     monitor to outside it. This can be given higher
+     priority in fork_it.
+- Fork as many processes as there are CPUs in case the
+  per_cpu_schedule flag is set.
+  -> Bind forked process to each cpu.
+  -> Execute start measures via the forked processes on
+     each cpu.
+  -> Run test executable in a forked process.
+  -> Execute stop measures via the forked processes on
+     each cpu.
+  This would be ideal as it will not introduce noise in the
+  tested executable.
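
For illustration, a rough user-space sketch of the fork-and-bind scheme
proposed above (illustrative only, not cpupower code):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Fork one worker per CPU and pin each worker to its CPU. */
    static void fork_per_cpu(int cpu_count)
    {
            for (int cpu = 0; cpu < cpu_count; cpu++) {
                    if (fork() == 0) {
                            cpu_set_t set;

                            CPU_ZERO(&set);
                            CPU_SET(cpu, &set);
                            sched_setaffinity(0, sizeof(set), &set);
                            /* ... execute start/stop measures here ... */
                            _exit(0);
                    }
            }
            while (wait(NULL) > 0)
                    ;       /* reap all workers */
    }
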
@@ -10,6 +10,7 @@
 #include <errno.h>
 #include <string.h>
 #include <getopt.h>
+#include <sys/utsname.h>

 #include "helpers/helpers.h"
 #include "helpers/sysfs.h"
@@ -30,6 +31,7 @@ int cmd_info(int argc, char **argv)
 	extern char *optarg;
 	extern int optind, opterr, optopt;
 	unsigned int cpu;
+	struct utsname uts;

 	union {
 		struct {
@@ -39,6 +41,13 @@ int cmd_info(int argc, char **argv)
 	} params = {};
 	int ret = 0;

+	ret = uname(&uts);
+	if (!ret && (!strcmp(uts.machine, "ppc64le") ||
+		     !strcmp(uts.machine, "ppc64"))) {
+		fprintf(stderr, _("Subcommand not supported on POWER.\n"));
+		return ret;
+	}
+
 	setlocale(LC_ALL, "");
 	textdomain(PACKAGE);
@@ -10,6 +10,7 @@
 #include <errno.h>
 #include <string.h>
 #include <getopt.h>
+#include <sys/utsname.h>

 #include "helpers/helpers.h"
 #include "helpers/sysfs.h"
@@ -31,6 +32,7 @@ int cmd_set(int argc, char **argv)
 	extern char *optarg;
 	extern int optind, opterr, optopt;
 	unsigned int cpu;
+	struct utsname uts;

 	union {
 		struct {
@@ -41,6 +43,13 @@ int cmd_set(int argc, char **argv)
 	int perf_bias = 0;
 	int ret = 0;

+	ret = uname(&uts);
+	if (!ret && (!strcmp(uts.machine, "ppc64le") ||
+		     !strcmp(uts.machine, "ppc64"))) {
+		fprintf(stderr, _("Subcommand not supported on POWER.\n"));
+		return ret;
+	}
+
 	setlocale(LC_ALL, "");
 	textdomain(PACKAGE);
@@ -131,6 +131,10 @@ int get_cpu_info(struct cpupower_cpu_info *cpu_info)
 		if (ext_cpuid_level >= 0x80000007 &&
 		    (cpuid_edx(0x80000007) & (1 << 9)))
 			cpu_info->caps |= CPUPOWER_CAP_AMD_CBP;
+
+		if (ext_cpuid_level >= 0x80000008 &&
+		    cpuid_ebx(0x80000008) & (1 << 4))
+			cpu_info->caps |= CPUPOWER_CAP_AMD_RDPRU;
 	}

 	if (cpu_info->vendor == X86_VENDOR_INTEL) {
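
The detection above keys off CPUID leaf 0x80000008, EBX bit 4. A standalone
check, as a sketch using GCC's cpuid.h (not part of this merge):

    #include <cpuid.h>
    #include <stdio.h>

    /* Report whether the CPU advertises RDPRU (CPUID 0x80000008, EBX[4]). */
    int main(void)
    {
            unsigned int eax, ebx, ecx, edx;

            if (__get_cpuid(0x80000000, &eax, &ebx, &ecx, &edx) &&
                eax >= 0x80000008 &&
                __get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx))
                    printf("RDPRU %ssupported\n",
                           (ebx & (1 << 4)) ? "" : "not ");
            return 0;
    }
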
@@ -69,6 +69,7 @@ enum cpupower_cpu_vendor {X86_VENDOR_UNKNOWN = 0, X86_VENDOR_INTEL,
 #define CPUPOWER_CAP_HAS_TURBO_RATIO	0x00000010
 #define CPUPOWER_CAP_IS_SNB		0x00000020
 #define CPUPOWER_CAP_INTEL_IDA		0x00000040
+#define CPUPOWER_CAP_AMD_RDPRU		0x00000080

 #define CPUPOWER_AMD_CPBDIS		0x02000000
@@ -328,7 +328,7 @@ struct cpuidle_monitor amd_fam14h_monitor = {
 	.stop = amd_fam14h_stop,
 	.do_register = amd_fam14h_register,
 	.unregister = amd_fam14h_unregister,
-	.needs_root = 1,
+	.flags.needs_root = 1,
 	.overflow_s = OVERFLOW_MS / 1000,
 };
 #endif /* #if defined(__i386__) || defined(__x86_64__) */
@@ -207,6 +207,6 @@ struct cpuidle_monitor cpuidle_sysfs_monitor = {
 	.stop = cpuidle_stop,
 	.do_register = cpuidle_register,
 	.unregister = cpuidle_unregister,
-	.needs_root = 0,
+	.flags.needs_root = 0,
 	.overflow_s = UINT_MAX,
 };
@@ -408,7 +408,7 @@ int cmd_monitor(int argc, char **argv)
 		dprint("Try to register: %s\n", all_monitors[num]->name);
 		test_mon = all_monitors[num]->do_register();
 		if (test_mon) {
-			if (test_mon->needs_root && !run_as_root) {
+			if (test_mon->flags.needs_root && !run_as_root) {
 				fprintf(stderr, _("Available monitor %s needs "
 					"root access\n"), test_mon->name);
 				continue;
@@ -60,7 +60,10 @@ struct cpuidle_monitor {
 	struct cpuidle_monitor* (*do_register) (void);
 	void (*unregister)(void);
 	unsigned int overflow_s;
-	int needs_root;
+	struct {
+		unsigned int needs_root:1;
+		unsigned int per_cpu_schedule:1;
+	} flags;
 };

 extern long long timespec_diff_us(struct timespec start, struct timespec end);
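
The .flags.needs_root = 1 initializers used throughout this merge rely on C99
nested designated initialization; a standalone sketch (hypothetical struct,
not cpupower code):

    #include <stdio.h>

    /* Mirrors the flags sub-struct pattern introduced above. */
    struct monitor {
            unsigned int overflow_s;
            struct {
                    unsigned int needs_root:1;
                    unsigned int per_cpu_schedule:1;
            } flags;
    };

    static struct monitor example_monitor = {
            .overflow_s = 3600,
            .flags.needs_root = 1,  /* nested designated initializer */
    };

    int main(void)
    {
            printf("needs_root=%u per_cpu_schedule=%u\n",
                   example_monitor.flags.needs_root,
                   example_monitor.flags.per_cpu_schedule);
            return 0;
    }
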
@@ -39,7 +39,6 @@ static cstate_t hsw_ext_cstates[HSW_EXT_CSTATE_COUNT] = {
 	{
 		.name			= "PC9",
 		.desc			= N_("Processor Package C9"),
-		.desc			= N_("Processor Package C2"),
 		.id			= PC9,
 		.range			= RANGE_PACKAGE,
 		.get_count_percent	= hsw_ext_get_count_percent,
@@ -188,7 +187,7 @@ struct cpuidle_monitor intel_hsw_ext_monitor = {
 	.stop = hsw_ext_stop,
 	.do_register = hsw_ext_register,
 	.unregister = hsw_ext_unregister,
-	.needs_root = 1,
+	.flags.needs_root = 1,
 	.overflow_s = 922000000 /* 922337203 seconds TSC overflow
				   at 20GHz */
 };
@@ -19,6 +19,10 @@
 #define MSR_APERF	0xE8
 #define MSR_MPERF	0xE7

+#define RDPRU ".byte 0x0f, 0x01, 0xfd"
+#define RDPRU_ECX_MPERF	0
+#define RDPRU_ECX_APERF	1
+
 #define MSR_TSC	0x10

 #define MSR_AMD_HWCR 0xc0010015
@@ -86,15 +90,51 @@ static int mperf_get_tsc(unsigned long long *tsc)
 	return ret;
 }

-static int mperf_init_stats(unsigned int cpu)
+static int get_aperf_mperf(int cpu, unsigned long long *aval,
+			   unsigned long long *mval)
 {
-	unsigned long long val;
+	unsigned long low_a, high_a;
+	unsigned long low_m, high_m;
 	int ret;

-	ret = read_msr(cpu, MSR_APERF, &val);
-	aperf_previous_count[cpu] = val;
-	ret |= read_msr(cpu, MSR_MPERF, &val);
-	mperf_previous_count[cpu] = val;
+	/*
+	 * Running on the cpu from which we read the registers will
+	 * prevent APERF/MPERF from going out of sync because of IPI
+	 * latency introduced by read_msr()s.
+	 */
+	if (mperf_monitor.flags.per_cpu_schedule) {
+		if (bind_cpu(cpu))
+			return 1;
+	}
+
+	if (cpupower_cpu_info.caps & CPUPOWER_CAP_AMD_RDPRU) {
+		asm volatile(RDPRU
+			     : "=a" (low_a), "=d" (high_a)
+			     : "c" (RDPRU_ECX_APERF));
+		asm volatile(RDPRU
+			     : "=a" (low_m), "=d" (high_m)
+			     : "c" (RDPRU_ECX_MPERF));
+
+		*aval = ((low_a) | (high_a) << 32);
+		*mval = ((low_m) | (high_m) << 32);
+
+		return 0;
+	}
+
+	ret = read_msr(cpu, MSR_APERF, aval);
+	ret |= read_msr(cpu, MSR_MPERF, mval);
+
+	return ret;
+}
+
+static int mperf_init_stats(unsigned int cpu)
+{
+	unsigned long long aval, mval;
+	int ret;
+
+	ret = get_aperf_mperf(cpu, &aval, &mval);
+	aperf_previous_count[cpu] = aval;
+	mperf_previous_count[cpu] = mval;
 	is_valid[cpu] = !ret;

 	return 0;
@@ -102,13 +142,12 @@ static int mperf_init_stats(unsigned int cpu)

 static int mperf_measure_stats(unsigned int cpu)
 {
-	unsigned long long val;
+	unsigned long long aval, mval;
 	int ret;

-	ret = read_msr(cpu, MSR_APERF, &val);
-	aperf_current_count[cpu] = val;
-	ret |= read_msr(cpu, MSR_MPERF, &val);
-	mperf_current_count[cpu] = val;
+	ret = get_aperf_mperf(cpu, &aval, &mval);
+	aperf_current_count[cpu] = aval;
+	mperf_current_count[cpu] = mval;
 	is_valid[cpu] = !ret;

 	return 0;
@@ -305,6 +344,9 @@ struct cpuidle_monitor *mperf_register(void)
 	if (init_maxfreq_mode())
 		return NULL;

+	if (cpupower_cpu_info.vendor == X86_VENDOR_AMD)
+		mperf_monitor.flags.per_cpu_schedule = 1;
+
 	/* Free this at program termination */
 	is_valid = calloc(cpu_count, sizeof(int));
 	mperf_previous_count = calloc(cpu_count, sizeof(unsigned long long));
@@ -333,7 +375,7 @@ struct cpuidle_monitor mperf_monitor = {
 	.stop = mperf_stop,
 	.do_register = mperf_register,
 	.unregister = mperf_unregister,
-	.needs_root = 1,
+	.flags.needs_root = 1,
 	.overflow_s = 922000000 /* 922337203 seconds TSC overflow
				   at 20GHz */
 };
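
The bind_cpu() helper called in get_aperf_mperf() is not shown in this merge;
presumably it pins the calling thread to the target CPU. A user-space sketch
of that idea (an assumption, using sched_setaffinity):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    /* Sketch of a bind_cpu()-style helper: pin the calling thread to @cpu
     * so that subsequent APERF/MPERF reads execute locally, avoiding the
     * IPI latency the comment above describes. Returns 0 on success.
     */
    static int bind_cpu_sketch(int cpu)
    {
            cpu_set_t set;

            CPU_ZERO(&set);
            CPU_SET(cpu, &set);
            if (sched_setaffinity(0, sizeof(set), &set)) {
                    perror("sched_setaffinity");
                    return 1;
            }
            return 0;
    }
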
@@ -208,7 +208,7 @@ struct cpuidle_monitor intel_nhm_monitor = {
 	.stop = nhm_stop,
 	.do_register = intel_nhm_register,
 	.unregister = intel_nhm_unregister,
-	.needs_root = 1,
+	.flags.needs_root = 1,
 	.overflow_s = 922000000 /* 922337203 seconds TSC overflow
				   at 20GHz */
 };
@@ -192,7 +192,7 @@ struct cpuidle_monitor intel_snb_monitor = {
 	.stop = snb_stop,
 	.do_register = snb_register,
 	.unregister = snb_unregister,
-	.needs_root = 1,
+	.flags.needs_root = 1,
 	.overflow_s = 922000000 /* 922337203 seconds TSC overflow
				   at 20GHz */
 };