This patch addresses Bug 60727
(https://bugzilla.kernel.org/show_bug.cgi?id=60727)
which was due to the truncation of intermediate values in the
calculations, which causes the code to consistently underestimate the
current cpu frequency, specifically 100% cpu utilization was truncated
down to the setpoint of 97%. This patch fixes the problem by keeping
the results of all intermediate calculations as fixed point numbers
rather scaling them back and forth between integers and fixed point.
References: https://bugzilla.kernel.org/show_bug.cgi?id=60727
Signed-off-by: Brennan Shacklett <bpshacklett@gmail.com>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The expression in line 398 of intel_pstate.c causes the following
warning to be emitted:
drivers/cpufreq/intel_pstate.c:398:3: warning: left shift count >= width of type
which happens because unsigned long is 32-bit on some architectures.
Fix that by using a helper u64 variable and simplify the code
slightly.
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If the system is suspended while max_perf_pct is less than 100 percent
or no_turbo set policy->{min,max} will be set incorrectly with scaled
values which turn the scaled values into hard limits.
References: https://bugzilla.kernel.org/show_bug.cgi?id=61241
Reported-by: Patrick Bartels <petzicus@googlemail.com>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Cc: 3.9+ <stable@vger.kernel.org> # 3.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Most of the users of cpufreq_verify_within_limits() calls it for
limiting with min/max from policy->cpuinfo. We can make that code
simple by introducing another routine which will do this for them
automatically.
This patch adds another routine cpufreq_verify_within_cpu_limits()
and updates others to use it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When sysfs for no_turbo is set, then also some p states in turbo regions
are observed. This patch will set IDA Engage bit when no_turbo is set to
explicitly disengage turbo.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Enable the intel_pstate driver for Haswell CPUs. One missing Ivy Bridge
model (0x3E) is also included. Models referenced from
tools/power/x86/turbostat/turbostat.c:has_nehalem_turbo_ratio_limit
Signed-off-by: Nell Hardcastle <nell@spicious.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We don't need to set .owner = THIS_MODULE any more in cpufreq drivers
as this field isn't used any more by the cpufreq core.
This patch removes it and updates all dependent drivers accordingly.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Change to using max P-state instead of max turbo P-state. This
change resolves two issues.
On a quiet system intel_pstate can fail to respond to a load change.
On CPU SKUs that have a limited number of P-states and no turbo range
intel_pstate fails to select the highest available P-state.
This change is suitable for stable v3.9+
References: https://bugzilla.kernel.org/show_bug.cgi?id=59481
Reported-and-tested-by: Arjan van de Ven <arjan@linux.intel.com>
Reported-and-tested-by: dsmythies@telus.net
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.
After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.
This removes all the drivers/cpufreq uses of the __cpuinit macros
from all C files.
[1] https://lkml.org/lkml/2013/5/20/589
[v2: leave 2nd lines of args misaligned as requested by Viresh]
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: cpufreq@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Add CPU ID for Ivybrigde processor.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Use vzalloc() instead of vmalloc() and memset(0).
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The driver can no longer be built as a module remove the compile fence
around cpufreq tracing call.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove dead code from the driver.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The ffmpeg benchmark in the phoronix test suite has threads on
multiple cores that rely on the progress on of threads on other cores
and ping pong back and forth fast enough to make the core appear less
busy than it "should" be. If the core has been at minimum p-state for
a while bump the pstate up to kick the core to see if it is in this
ping pong state. If the core is truly idle the p-state will be
reduced at the next sample time. If the core makes more progress it
will send more work to the thread bringing both threads out of the
ping pong scenario and the p-state will be selected normally.
This fixes a performance regression of approximately 30%
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are two ways that the maximum p-state can be clamped, via a
policy change and via the sysfs file.
The acpi-thermal driver adjusts the p-state policy in response to
thermal events. These changes override the users settings at the
moment.
Use the lowest of the two requested values this ensures that we will
not exceed the requested pstate from either mechanism.
Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Idle time is taken into account in the APERF/MPERF ratio calculation
there is no reason for the driver to track it seperately. This
reduces the work in the driver and makes the code more readable.
Removal of the tracking of sample duration removes the possibility of
the divide by zero exception when the duration is sub 1us
References: https://bugzilla.kernel.org/show_bug.cgi?id=56691
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pm-cpufreq: (57 commits)
cpufreq: MAINTAINERS: Add co-maintainer
cpufreq: pxa2xx: initialize variables
ARM: S5pv210: compiling issue, ARM_S5PV210_CPUFREQ needs CONFIG_CPU_FREQ_TABLE=y
cpufreq: cpu0: Put cpu parent node after using it
cpufreq: ARM big LITTLE: Adapt to latest cpufreq updates
cpufreq: ARM big LITTLE: put DT nodes after using them
cpufreq: Don't call __cpufreq_governor() for drivers without target()
cpufreq: exynos5440: Protect OPP search calls with RCU lock
cpufreq: dbx500: Round to closest available freq
cpufreq: Call __cpufreq_governor() with correct policy->cpus mask
cpufreq / intel_pstate: Optimize intel_pstate_set_policy
cpufreq: OMAP: instantiate omap-cpufreq as a platform_driver
arm: exynos: Enable OPP library support for exynos5440
cpufreq: exynos: Remove error return even if no soc is found
cpufreq: exynos: Add cpufreq driver for exynos5440
cpufreq: AMD "frequency sensitivity feedback" powersave bias for ondemand governor
cpufreq: ondemand: allow custom powersave_bias_target handler to be registered
cpufreq: convert cpufreq_driver to using RCU
cpufreq: powerpc/platforms/cell: move cpufreq driver to drivers/cpufreq
cpufreq: sparc: move cpufreq driver to drivers/cpufreq
...
Conflicts:
MAINTAINERS (with commit a8e39c3 from pm-cpuidle)
drivers/cpufreq/cpufreq_governor.h (with commit beb0ff3)
This function is called quite often from other subsystems.
Removed unused call to intel_pstate_get_min_max().
Also when "policy->policy == CPUFREQ_POLICY_PERFORMANCE", then
no need to do calculations as the limits will be forced anyway.
Also corrected filename in the header.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The current calculation of the delay time is wrong and a cut and
paste error from a previous experimental driver. This can result in
the timeout being set to jiffies + 1 which setup the driver to race
with itself if the APIC timer interrupt happens at just the right
time.
References: https://bugzilla.redhat.com/show_bug.cgi?id=920289
Reported-by: Adam Williamson <awilliam@redhat.com>
Reported-and-tested-by: Parag Warudkar <parag.lkml@gmail.com>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
They are defined in coreboot (MSR_PLATFORM) and the other
one is already defined in msr-index.h.
Let's use those.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Use the correct pstate value to calculate the effective frequency.
References: https://bugzilla.redhat.com/show_bug.cgi?id=923942
Reported-by: Satish Balay <balay@fastmail.fm>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Some VMs seem to try to implement some MSRs but not all the registers
the driver needs. Check to make sure all the MSR that we need are
available. If any of the required MSRs are not available refuse to
load.
References: https://bugzilla.redhat.com/show_bug.cgi?id=922923
Reported-by: Josh Stone <jistone@redhat.com>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It seems some VMs support the P state MSRs but return zeros. Fail
gracefully if we are running in this environment.
References: https://bugzilla.redhat.com/show_bug.cgi?id=916833
Reported-by: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If cpufreq_register_driver() fails just free memory that has been
allocated and return. intel_pstate_exit() function is removed since we
are built-in only now there is no reason for a module exit procedure.
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When intel_pstate is configured into the kernel it will become the
preferred scaling driver for processors that it supports. Allow the
user to override this by adding:
intel_pstate=disable
on the kernel command line.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fixes 32 bit build.
on i386:
drivers/built-in.o: In function `intel_pstate_timer_func':
intel_pstate.c:(.text+0x4ce97e): undefined reference to `__udivdi3'
drivers/built-in.o: In function `intel_pstate_cpu_init':
intel_pstate.c:(.cpuinit.text+0x974): undefined reference to `__udivdi3'
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a P-state driver for the Intel Sandy bridge processor. In cpufreq
terminology this driver implements a scaling driver with an internal
governor.
When built into the the kernel this driver will be the preferred
scaling driver for Sandy bridge processors.
In addition to the interfaces provided by the cpufreq subsystem for
controlling scaling drivers. The user may control the behavior of the
driver via three sysfs files located in
"/sys/devices/system/cpu/intel_pstate".
max_perf_pct: limits the maximum P state that will be requested by
the driver stated as a percentage of the avail performance.
min_perf_pct: limits the minimum P state that will be requested by
the driver stated as a percentage of the avail performance.
no_turbo: limits the driver to selecting P states below the turbo
frequency range.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>