We are returning -EINVAL instead of the error returned from kobject_move() when
it fails. Propagate the actual error number.
Also add a meaningful print when sysfs_create_link() fails.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
While hot-unplugging policy->cpu, we call cpufreq_nominate_new_policy_cpu() to
nominate next owner of policy, i.e. policy->cpu. If we fail to move policy
kobject under the new policy->cpu, we try to update policy->cpus with the old
policy->cpu.
This would have been required in case old-CPU is removed from policy->cpus in
the first place. But its not done before calling
cpufreq_nominate_new_policy_cpu(), but during the POST_DEAD notification which
happens quite late in the hot-unplugging path.
So, this is just some useless code hanging around, get rid of it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There exists 350MHz K6-2E+ CPU, so add it to the usual frequency table.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently, ondemand calculates the target frequency proportional to load
using the formula:
Target frequency = C * load
where C = policy->cpuinfo.max_freq / 100
Though, in many cases, the minimum available frequency is pretty high and
the above calculation introduces a dead band from load 0 to
100 * policy->cpuinfo.min_freq / policy->cpuinfo.max_freq where the target
frequency is always calculated to less than policy->cpuinfo.min_freq and
the minimum frequency is selected.
For example: on Intel i7-3770 @ 3.4GHz the policy->cpuinfo.min_freq = 1600000
and the policy->cpuinfo.max_freq = 3400000 (without turbo). Thus, the CPU
starts to scale up at a load above 47.
On quad core 1500MHz Krait the policy->cpuinfo.min_freq = 384000
and the policy->cpuinfo.max_freq = 1512000. Thus, the CPU starts to scale
at load above 25.
Change the calculation of target frequency to eliminate the above effect using
the formula:
Target frequency = A + B * load
where A = policy->cpuinfo.min_freq and
B = (policy->cpuinfo.max_freq - policy->cpuinfo->min_freq) / 100
This will map load values 0 to 100 linearly to cpuinfo.min_freq to
cpuinfo.max_freq.
Also, use the CPUFREQ_RELATION_C in __cpufreq_driver_target to select the
closest frequency in frequency_table. This is necessary to avoid selection
of minimum frequency only when load equals to 0. It will also help for selection
of frequencies using a more 'fair' criterion.
Tables below show the difference in selected frequency for specific values
of load without and with this patch. On Intel i7-3770 @ 3.40GHz:
Without With
Load Target Selected Target Selected
0 0 1600000 1600000 1600000
5 170050 1600000 1690050 1700000
10 340100 1600000 1780100 1700000
15 510150 1600000 1870150 1900000
20 680200 1600000 1960200 2000000
25 850250 1600000 2050250 2100000
30 1020300 1600000 2140300 2100000
35 1190350 1600000 2230350 2200000
40 1360400 1600000 2320400 2400000
45 1530450 1600000 2410450 2400000
50 1700500 1900000 2500500 2500000
55 1870550 1900000 2590550 2600000
60 2040600 2100000 2680600 2600000
65 2210650 2400000 2770650 2800000
70 2380700 2400000 2860700 2800000
75 2550750 2600000 2950750 3000000
80 2720800 2800000 3040800 3000000
85 2890850 2900000 3130850 3100000
90 3060900 3100000 3220900 3300000
95 3230950 3300000 3310950 3300000
100 3401000 3401000 3401000 3401000
On ARM quad core 1500MHz Krait:
Without With
Load Target Selected Target Selected
0 0 384000 384000 384000
5 75600 384000 440400 486000
10 151200 384000 496800 486000
15 226800 384000 553200 594000
20 302400 384000 609600 594000
25 378000 384000 666000 702000
30 453600 486000 722400 702000
35 529200 594000 778800 810000
40 604800 702000 835200 810000
45 680400 702000 891600 918000
50 756000 810000 948000 918000
55 831600 918000 1004400 1026000
60 907200 918000 1060800 1026000
65 982800 1026000 1117200 1134000
70 1058400 1134000 1173600 1134000
75 1134000 1134000 1230000 1242000
80 1209600 1242000 1286400 1242000
85 1285200 1350000 1342800 1350000
90 1360800 1458000 1399200 1350000
95 1436400 1458000 1455600 1458000
100 1512000 1512000 1512000 1512000
Tested on Intel i7-3770 CPU @ 3.40GHz and on ARM quad core 1500MHz Krait
(Android smartphone).
Benchmarks on Intel i7 shows a performance improvement on low and medium
work loads with lower power consumption. Specifics:
Phoronix Linux Kernel Compilation 3.1:
Time: -0.40%, energy: -0.07%
Phoronix Apache:
Time: -4.98%, energy: -2.35%
Phoronix FFMPEG:
Time: -6.29%, energy: -4.02%
Also, running mp3 decoding (very low load) shows no differences with and
without this patch.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Introduce CPUFREQ_RELATION_C for frequency selection.
It selects the frequency with the minimum euclidean distance to target.
In case of equal distance between 2 frequencies, it will select the
greater frequency.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
PU regulator is not a necessary regulator for cpufreq, not all
i.MX6 SoCs have PU regulator, only if SOC has PU regulator, then its
voltage must be equal to SOC regulator, so remove the dependency
to support i.MX6SX which has no PU regulator.
Signed-off-by: Anson Huang <b20788@freescale.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Shawn Guo <shawn.guo@freescale.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The specific rounding adds conditionally only 1/256 to fractional
part of core_pct.
We can safely remove it without any noticeable impact in
calculations.
Use div64_u64 instead of div_u64 to avoid possible overflow of
sample->mperf as divisor
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Simplify the code by removing the inline functions pstate_increase and
pstate_decrease and use directly the intel_pstate_set_pstate.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently we shift right aperf and mperf variables by FRAC_BITS
to prevent overflow when we convert them to fix point numbers
(shift left by FRAC_BITS).
But this is not necessary, because we actually use delta aperf and mperf
which are much less than APERF and MPERF values.
So, use the unmodified APERF and MPERF values in calculation.
This also adds 8 bits in precision, although the gain is insignificant.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
According to Intel 64 and IA-32 Architectures SDM, Volume 3,
Chapter 14.2, "Software needs to exercise care to avoid delays
between the two RDMSRs (for example interrupts)".
So, disable interrupts during reading MSRs IA32_APERF and IA32_MPERF.
This should increase the accuracy of the calculations.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Suppress checkpatch.pl --strict warnings:
CHECK: Alignment should match open parenthesis
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the unnecessary intermediate assignment and use directly the
pid_params.sample_rate_ms variable.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove unnecessary parentheses.
Also, add parentheses in one case for better readability.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We can fit these lines in a single one, under the 80 characters
limit.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
div_s64() accepts the divisor parameter as s32. Helper div_fp()
also accepts divisor as int32_t.
So, remove the unnecessary int64_t type casting.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Since we never remove sysfs entry and debugfs files, we can make
the intel_pstate_kobject and debugfs_parent locals.
Also, annotate with __init intel_pstate_sysfs_expose_params()
and intel_pstate_debug_expose_params() in order to be freed
after bootstrap.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There is no use for the .resume_clocks() callback now and in fact all
the provided functions are empty, so this patch just removes it in
preparation for further patches.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
This is only relevant to implementations with multiple clusters, where clusters
have separate clock lines but all CPUs within a cluster share it.
Consider a dual cluster platform with 2 cores per cluster. During suspend we
start hot unplugging CPUs in order 1 to 3. When CPU2 is removed, policy->kobj
would be moved to CPU3 and when CPU3 goes down we wouldn't free policy or its
kobj as we want to retain permissions/values/etc.
Now on resume, we will get CPU2 before CPU3 and will call __cpufreq_add_dev().
We will recover the old policy and update policy->cpu from 3 to 2 from
update_policy_cpu().
But the kobj is still tied to CPU3 and isn't moved to CPU2. We wouldn't create a
link for CPU2, but would try that for CPU3 while bringing it online. Which will
report errors as CPU3 already has kobj assigned to it.
This bug got introduced with commit 42f921a, which overlooked this scenario.
To fix this, lets move kobj to the new policy->cpu while bringing first CPU of a
cluster back. Also do a WARN_ON() if kobject_move failed, as we would reach here
only for the first CPU of a non-boot cluster. And we can't recover from this
situation, if kobject_move() fails.
Fixes: 42f921a6f1 (cpufreq: remove sysfs files for CPUs which failed to come back after resume)
Cc: 3.13+ <stable@vger.kernel.org> # 3.13+
Reported-and-tested-by: Bu Yitian <ybu@qti.qualcomm.com>
Reported-by: Saravana Kannan <skannan@codeaurora.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa@mit.edu>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
OPPs can be populated statically, via DT, or added at run time with
dev_pm_opp_add().
While this driver handles the first case correctly, it would fail to populate
OPPs added at runtime. Because call to of_init_opp_table() would fail as there
are no OPPs in DT and probe will return early.
To fix this, remove error checking and call dev_pm_opp_init_cpufreq_table()
unconditionally.
Update bindings as well.
Suggested-by: Stephen Boyd <sboyd@codeaurora.org>
Tested-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit ff1f0018cf ("drivers: Enable
building of Kirkwood drivers for mach-mvebu") added Kirkwood into
mach-mvebu, adding MACH_KIRKWOOD to ARCH_KIRKWOOD in the KConfig files.
The change for ARM_KIRKWOOD_CPUFREQ replaced ARCH_KIRKWOOD with
MACH_KIRKWOOD, whereas all the other changes were ARCH_KIRKWOOD ||
MACH_KIRKWOOD.
As a consequence of this change, the cpufreq driver is no longer enabled
for ARCH_KIRKWOOD. This patch reinstates ARM_KIRKWOOD_CPUFREQ for
ARCH_KIRKWOOD.
Signed-off-by: Quentin Armitage <quentin@armitage.org.uk>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
PM_OPP is a library used by several of the existing cpufreq drivers.
ARM IMX6Q cpufreq driver uses this library for its functionality.
Thus, it should be selected in Kconfig.
Reported-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Nicolas Del Piano <ndel314@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The Compaq iPAQ h3600 also has the K4S281632b-1H memory type.
Verified by prying apart a broken board.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Since commtit 8a7b1227e3 (cpufreq: davinci: move cpufreq driver to
drivers/cpufreq) this added dependancy only for CONFIG_ARCH_DAVINCI_DA850
where as davinci_cpufreq_init() call is used by all davinci platform.
This patch fixes following build error:
arch/arm/mach-davinci/built-in.o: In function `davinci_init_late':
:(.init.text+0x928): undefined reference to `davinci_cpufreq_init'
make: *** [vmlinux] Error 1
Fixes: 8a7b1227e3 (cpufreq: davinci: move cpufreq driver to drivers/cpufreq)
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 3.10+ <stable@vger.kernel.org> # 3.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Ensure that cpu->cpu is set before writing MSR_IA32_PERF_CTL during CPU
initialization. Otherwise only cpu0 has its P-state set and all other
cores are left with their values unchanged.
In most cases, this is not too serious because the P-states will be set
correctly when the timer function is run. But when the default governor
is set to performance, the per-CPU current_pstate stays the same forever
and no attempts are made to write the MSRs again.
Signed-off-by: Vincent Minet <vincent@vincent-minet.net>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If turbo is disabled in the BIOS bit 38 should be set in
MSR_IA32_MISC_ENABLE register per section 14.3.2.1 of the SDM Vol 3
document 325384-050US Feb 2014. If this bit is set do *not* attempt
to disable trubo via the MSR_IA32_PERF_CTL register. On some systems
trying to disable turbo via MSR_IA32_PERF_CTL will cause subsequent
writes to MSR_IA32_PERF_CTL not take affect, in fact reading
MSR_IA32_PERF_CTL will not show the IDA/Turbo DISENGAGE bit(32) as
set. A write of bit 32 to zero returns to normal operation.
Also deal with the case where the processor does not support
turbo and the BIOS does not report the fact in MSR_IA32_MISC_ENABLE
but does report the max and turbo P states as the same value.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=64251
Cc: 3.13+ <stable@vger.kernel.org> # 3.13+
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 21855ff5 (intel_pstate: Set turbo VID for BayTrail) introduced
setting the turbo VID which is required to prevent a machine check on
some Baytrail SKUs under heavy graphics based workloads. The
docmumentation update that brought the requirement to light also
changed the bit mask used for enumerating P state and VID values from
0x7f to 0x3f.
This change returns the mask value to 0x7f.
Tested with the Intel NUC DN2820FYK,
BIOS version FYBYT10H.86A.0034.2014.0513.1413 with v3.16-rc1 and
v3.14.8 kernel versions.
Fixes: 21855ff5 (intel_pstate: Set turbo VID for BayTrail)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77951
Reported-and-tested-by: Rune Reterson <rune@megahurts.dk>
Reported-and-tested-by: Eric Eickmeyer <erich@ericheickmeyer.com>
Cc: 3.13+ <stable@vger.kernel.org> # 3.13+
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are a bunch of users open coding the for_each_node_by_name() by
calling of_find_node_by_name() directly instead of using the macro. This
is getting in the way of some cleanups, and the possibility of removing
of_find_node_by_name() entirely. Clean it up so that all the users are
consistent.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Takashi Iwai <tiwai@suse.de>
Commit bd0fa9bb45 introduced a failure path to cpufreq_update_policy() if
cpufreq_driver->get(cpu) returns NULL. However, it jumps to the 'no_policy'
label, which exits without unlocking any of the locks the function acquired
earlier. This causes later calls into cpufreq to hang.
Fix this by creating a new 'unlock' label and jumping to that instead.
Fixes: bd0fa9bb45 ("cpufreq: Return error if ->get() failed in cpufreq_update_policy()")
Link: https://devtalk.nvidia.com/default/topic/751903/kernel-3-15-and-nv-drivers-337-340-failed-to-initialize-the-nvidia-kernel-module-gtx-550-ti-/
Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Cc: 3.15+ <stable@vger.kernel.org> # 3.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There was a mistake in the actual rounding portion this previous patch:
f0fe3cd7e1 (intel_pstate: Correct rounding in busy calculation) such that
the rounding was asymetric and incorrect.
Severity: Not very serious, but can increase target pstate by one extra value.
For real world work flows the issue should self correct (but I have no proof).
It is the equivalent of different PID gains for positive and negative numbers.
Examples:
-3.000000 used to round to -4, rounds to -3 with this patch.
-3.503906 used to round to -5, rounds to -4 with this patch.
Fixes: f0fe3cd7e1 (intel_pstate: Correct rounding in busy calculation)
Signed-off-by: Doug Smythies <dsmythies@telus.net>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
5fbfbcd3e8 ("cpufreq: cpufreq-cpu0: remove dependency on THERMAL and
REGULATOR") was a little too quick in completely removing the dependency
on the THERMAL driver.
The problem is that while there are inline wrappers to turn the thermal
API calls into empty functions, those do not help if the cpu-thermal
driver is a loadable module and cpufreq-cpu0 is builtin.
Since CONFIG_CPU_THERMAL is a bool option that decides whether the cpu
code is built into the thermal module or not, we have to use a dependency
on the thermal driver itself. However, if CPU_THERMAL is disabled, we
don't need the dependency, hence the strange '!CPU_THERMAL || THERMAL'
construct.
Fixes: 5fbfbcd3e8 ("cpufreq: cpufreq-cpu0: remove dependency on THERMAL and REGULATOR")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pm-cpufreq:
cpufreq: cpufreq-cpu0: remove dependency on THERMAL and REGULATOR
cpufreq: tegra: update comment for clarity
cpufreq: intel_pstate: Remove duplicate CPU ID check
cpufreq: Mark CPU0 driver with CPUFREQ_NEED_INITIAL_FREQ_CHECK flag
cpufreq: governor: remove copy_prev_load from 'struct cpu_dbs_common_info'
cpufreq: governor: Be friendly towards latency-sensitive bursty workloads
cpufreq: ppc-corenet-cpu-freq: do_div use quotient
Revert "cpufreq: Enable big.LITTLE cpufreq driver on arm64"
cpufreq: Tegra: implement intermediate frequency callbacks
cpufreq: add support for intermediate (stable) frequencies
cpufreq-cpu0 uses thermal framework to register a cooling device, but doesn't
depend on it as there are dummy calls provided by thermal layer when
CONFIG_THERMAL=n. And when these calls fail, the driver is still usable.
Similar explanation is valid for regulators as well. We do have dummy calls
available for regulator APIs and the driver can work even when those calls
fail.
So, we don't really need to mention thermal and regulators as a dependency for
cpufreq-cpu0 in Kconfig as platforms without support for thermal/regulator can
also use this driver. Remove this dependency.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tegra's driver got updated a bit (00917dd cpufreq: Tegra: implement intermediate
frequency callbacks) and implements new 'intermediate freq' infrastructure of
core. Above commit updated comments about when to call
clk_prepare_enable(pll_x_clk) and Doug wasn't satisfied with those comments and
said this:
> The "Though when target-freq is intermediate freq, we don't need to
> take this reference." makes me think that this function is actually
> called when target-freq is intermediate freq. I don't think it is,
> right?
For better clarity just make that comment more explicit about when we call
tegra_target_intermediate().
Reviewed-by: Stephen Warren <swarren@nvidia.com>
Reported-and-reviewed-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We check the CPU ID during driver init. There is no need
to do it again per logical CPU initialization.
So, remove the duplicate check.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sometimes boot loaders set CPU frequency to a value outside of frequency table
present with cpufreq core. In such cases CPU might be unstable if it has to run
on that frequency for long duration of time and so its better to set it to a
frequency which is specified in frequency table.
Sachin recently found this problem with cpufreq-cpu0 driver when he was testing
it for Exynos.
Set this flag for cpufreq-cpu0 driver.
Reported-and-tested-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
'copy_prev_load' was recently added by commit: 18b46ab (cpufreq: governor: Be
friendly towards latency-sensitive bursty workloads).
It actually is a bit redundant as we also have 'prev_load' which can store any
integer value and can be used instead of 'copy_prev_load' by setting it zero.
True load can also turn out to be zero during long idle intervals (and hence the
actual value of 'prev_load' and the overloaded value can clash). However this is
not a problem because, if the true load was really zero in the previous
interval, it makes sense to evaluate the load afresh for the current interval
rather than copying the previous load.
So, drop 'copy_prev_load' and use 'prev_load' instead.
Update comments as well to make it more clear.
There is another change here which was probably missed by Srivatsa during the
last version of updates he made. The unlikely in the 'if' statement was covering
only half of the condition and the whole line should actually come under it.
Also checkpatch is made more silent as it was reporting this (--strict option):
CHECK: Alignment should match open parenthesis
+ if (unlikely(wall_time > (2 * sampling_rate) &&
+ j_cdbs->prev_load)) {
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cpufreq governors like the ondemand governor calculate the load on the CPU
periodically by employing deferrable timers. A deferrable timer won't fire
if the CPU is completely idle (and there are no other timers to be run), in
order to avoid unnecessary wakeups and thus save CPU power.
However, the load calculation logic is agnostic to all this, and this can
lead to the problem described below.
Time (ms) CPU 1
100 Task-A running
110 Governor's timer fires, finds load as 100% in the last
10ms interval and increases the CPU frequency.
110.5 Task-A running
120 Governor's timer fires, finds load as 100% in the last
10ms interval and increases the CPU frequency.
125 Task-A went to sleep. With nothing else to do, CPU 1
went completely idle.
200 Task-A woke up and started running again.
200.5 Governor's deferred timer (which was originally programmed
to fire at time 130) fires now. It calculates load for the
time period 120 to 200.5, and finds the load is almost zero.
Hence it decreases the CPU frequency to the minimum.
210 Governor's timer fires, finds load as 100% in the last
10ms interval and increases the CPU frequency.
So, after the workload woke up and started running, the frequency was suddenly
dropped to absolute minimum, and after that, there was an unnecessary delay of
10ms (sampling period) to increase the CPU frequency back to a reasonable value.
And this pattern repeats for every wake-up-from-cpu-idle for that workload.
This can be quite undesirable for latency- or response-time sensitive bursty
workloads. So we need to fix the governor's logic to detect such wake-up-from-
cpu-idle scenarios and start the workload at a reasonably high CPU frequency.
One extreme solution would be to fake a load of 100% in such scenarios. But
that might lead to undesirable side-effects such as frequency spikes (which
might also need voltage changes) especially if the previous frequency happened
to be very low.
We just want to avoid the stupidity of dropping down the frequency to a minimum
and then enduring a needless (and long) delay before ramping it up back again.
So, let us simply carry forward the previous load - that is, let us just pretend
that the 'load' for the current time-window is the same as the load for the
previous window. That way, the frequency and voltage will continue to be set
to whatever values they were set at previously. This means that bursty workloads
will get a chance to influence the CPU frequency at which they wake up from
cpu-idle, based on their past execution history. Thus, they might be able to
avoid suffering from slow wakeups and long response-times.
However, we should take care not to over-do this. For example, such a "copy
previous load" logic will benefit cases like this: (where # represents busy
and . represents idle)
##########.........#########.........###########...........##########........
but it will be detrimental in cases like the one shown below, because it will
retain the high frequency (copied from the previous interval) even in a mostly
idle system:
##########.........#.................#.....................#...............
(i.e., the workload finished and the remaining tasks are such that their busy
periods are smaller than the sampling interval, which causes the timer to
always get deferred. So, this will make the copy-previous-load logic copy
the initial high load to subsequent idle periods over and over again, thus
keeping the frequency high unnecessarily).
So, we modify this copy-previous-load logic such that it is used only once
upon every wakeup-from-idle. Thus if we have 2 consecutive idle periods, the
previous load won't get blindly copied over; cpufreq will freshly evaluate the
load in the second idle interval, thus ensuring that the system comes back to
its normal state.
[ The right way to solve this whole problem is to teach the CPU frequency
governors to also track load on a per-task basis, not just a per-CPU basis,
and then use both the data sources intelligently to set the appropriate
frequency on the CPUs. But that involves redesigning the cpufreq subsystem,
so this patch should make the situation bearable until then. ]
Experimental results:
+-------------------+
I ran a modified version of ebizzy (called 'sleeping-ebizzy') that sleeps in
between its execution such that its total utilization can be a user-defined
value, say 10% or 20% (higher the utilization specified, lesser the amount of
sleeps injected). This ebizzy was run with a single-thread, tied to CPU 8.
Behavior observed with tracing (sample taken from 40% utilization runs):
------------------------------------------------------------------------
Without patch:
~~~~~~~~~~~~~~
kworker/8:2-12137 416.335742: cpu_frequency: state=2061000 cpu_id=8
kworker/8:2-12137 416.335744: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40753 416.345741: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-12137 416.345744: cpu_frequency: state=4123000 cpu_id=8
kworker/8:2-12137 416.345746: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40753 416.355738: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
<snip> --------------------------------------------------------------------- <snip>
<...>-40753 416.402202: sched_switch: prev_comm=ebizzy ==> next_comm=swapper/8
<idle>-0 416.502130: sched_switch: prev_comm=swapper/8 ==> next_comm=ebizzy
<...>-40753 416.505738: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-12137 416.505739: cpu_frequency: state=2061000 cpu_id=8
kworker/8:2-12137 416.505741: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40753 416.515739: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-12137 416.515742: cpu_frequency: state=4123000 cpu_id=8
kworker/8:2-12137 416.515744: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
Observation: Ebizzy went idle at 416.402202, and started running again at
416.502130. But cpufreq noticed the long idle period, and dropped the frequency
at 416.505739, only to increase it back again at 416.515742, realizing that the
workload is in-fact CPU bound. Thus ebizzy needlessly ran at the lowest frequency
for almost 13 milliseconds (almost 1 full sample period), and this pattern
repeats on every sleep-wakeup. This could hurt latency-sensitive workloads quite
a lot.
With patch:
~~~~~~~~~~~
kworker/8:2-29802 464.832535: cpu_frequency: state=2061000 cpu_id=8
<snip> --------------------------------------------------------------------- <snip>
kworker/8:2-29802 464.962538: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 464.972533: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-29802 464.972536: cpu_frequency: state=4123000 cpu_id=8
kworker/8:2-29802 464.972538: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 464.982531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
<snip> --------------------------------------------------------------------- <snip>
kworker/8:2-29802 465.022533: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 465.032531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-29802 465.032532: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 465.035797: sched_switch: prev_comm=ebizzy ==> next_comm=swapper/8
<idle>-0 465.240178: sched_switch: prev_comm=swapper/8 ==> next_comm=ebizzy
<...>-40738 465.242533: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-29802 465.242535: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 465.252531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
Observation: Ebizzy went idle at 465.035797, and started running again at
465.240178. Since ebizzy was the only real workload running on this CPU,
cpufreq retained the frequency at 4.1Ghz throughout the run of ebizzy, no
matter how many times ebizzy slept and woke-up in-between. Thus, ebizzy
got the 10ms worth of 4.1 Ghz benefit during every sleep-wakeup (as compared
to the run without the patch) and this boost gave a modest improvement in total
throughput, as shown below.
Sleeping-ebizzy records-per-second:
-----------------------------------
Utilization Without patch With patch Difference (Absolute and % values)
10% 274767 277046 + 2279 (+0.829%)
20% 543429 553484 + 10055 (+1.850%)
40% 1090744 1107959 + 17215 (+1.578%)
60% 1634908 1662018 + 27110 (+1.658%)
A rudimentary and somewhat approximately latency-sensitive workload such as
sleeping-ebizzy itself showed a consistent, noticeable performance improvement
with this patch. Hence, workloads that are truly latency-sensitive will benefit
quite a bit from this change. Moreover, this is an overall win-win since this
patch does not hurt power-savings at all (because, this patch does not reduce
the idle time or idle residency; and the high frequency of the CPU when it goes
to cpu-idle does not affect/hurt the power-savings of deep idle states).
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 6712d29319 (cpufreq: ppc-corenet-cpufreq: Fix __udivdi3 modpost
error) used the remainder from do_div instead of the quotient. Fix that
and add one to ensure minimum is met.
Fixes: 6712d29319 (cpufreq: ppc-corenet-cpufreq: Fix __udivdi3 modpost error)
Signed-off-by: Ed Swarthout <Ed.Swarthout@freescale.com>
Cc: 3.15+ <stable@vger.kernel.org> # 3.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This reverts commit 4920ab8497 (cpufreq: Enable big.LITTLE cpufreq
driver on arm64) that breaks build on arm64.
Fixes: 4920ab8497 (cpufreq: Enable big.LITTLE cpufreq driver on arm64)
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tegra has been switching to intermediate frequency (pll_p_clk) forever.
CPUFreq core has better support for handling notifications for these
frequencies and so we can adapt Tegra's driver to it.
Also do a WARN() if clk_set_parent() fails while moving back to pll_x
as we should have atleast restored to earlier frequency on error.
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Douglas Anderson, recently pointed out an interesting problem due to which
udelay() was expiring earlier than it should.
While transitioning between frequencies few platforms may temporarily switch to
a stable frequency, waiting for the main PLL to stabilize.
For example: When we transition between very low frequencies on exynos, like
between 200MHz and 300MHz, we may temporarily switch to a PLL running at 800MHz.
No CPUFREQ notification is sent for that. That means there's a period of time
when we're running at 800MHz but loops_per_jiffy is calibrated at between 200MHz
and 300MHz. And so udelay behaves badly.
To get this fixed in a generic way, introduce another set of callbacks
get_intermediate() and target_intermediate(), only for drivers with
target_index() and CPUFREQ_ASYNC_NOTIFICATION unset.
get_intermediate() should return a stable intermediate frequency platform wants
to switch to, and target_intermediate() should set CPU to that frequency,
before jumping to the frequency corresponding to 'index'. Core will take care of
sending notifications and driver doesn't have to handle them in
target_intermediate() or target_index().
NOTE: ->target_index() should restore to policy->restore_freq in case of
failures as core would send notifications for that.
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- ACPICA update to upstream version 20140424. That includes a
number of fixes and improvements related to things like GPE
handling, table loading, headers, memory mapping and unmapping,
DSDT/SSDT overriding, and the Unload() operator. The acpidump
utility from upstream ACPICA is included too. From Bob Moore,
Lv Zheng, David Box, David Binderman, and Colin Ian King.
- Fixes and cleanups related to ACPI video and backlight interfaces
from Hans de Goede. That includes blacklist entries for some new
machines and using native backlight by default.
- ACPI device enumeration changes to create platform devices
rather than PNP devices for ACPI device objects with _HID by
default. PNP devices will still be created for the ACPI device
object with device IDs corresponding to real PNP devices, so
that change should not break things left and right, and we're
expecting to see more and more ACPI-enumerated platform devices
in the future. From Zhang Rui and Rafael J Wysocki.
- Updates for the ACPI LPSS (Low-Power Subsystem) driver allowing
it to handle system suspend/resume on Asus T100 correctly.
From Heikki Krogerus and Rafael J Wysocki.
- PM core update introducing a mechanism to allow runtime-suspended
devices to stay suspended over system suspend/resume transitions
if certain additional conditions related to coordination within
device hierarchy are met. Related PM documentation update and
ACPI PM domain support for the new feature. From Rafael J Wysocki.
- Fixes and improvements related to the "freeze" sleep state. They
affect several places including cpuidle, PM core, ACPI core, and
the ACPI battery driver. From Rafael J Wysocki and Zhang Rui.
- Miscellaneous fixes and updates of the ACPI core from Aaron Lu,
Bjørn Mork, Hanjun Guo, Lan Tianyu, and Rafael J Wysocki.
- Fixes and cleanups for the ACPI processor and ACPI PAD (Processor
Aggregator Device) drivers from Baoquan He, Manuel Schölling,
Tony Camuso, and Toshi Kani.
- System suspend/resume optimization in the ACPI battery driver from
Lan Tianyu.
- OPP (Operating Performance Points) subsystem updates from
Chander Kashyap, Mark Brown, and Nishanth Menon.
- cpufreq core fixes, updates and cleanups from Srivatsa S Bhat,
Stratos Karafotis, and Viresh Kumar.
- Updates, fixes and cleanups for the Tegra, powernow-k8, imx6q,
s5pv210, nforce2, and powernv cpufreq drivers from Brian Norris,
Jingoo Han, Paul Bolle, Philipp Zabel, Stratos Karafotis, and
Viresh Kumar.
- intel_pstate driver fixes and cleanups from Dirk Brandewie,
Doug Smythies, and Stratos Karafotis.
- Enabling the big.LITTLE cpufreq driver on arm64 from Mark Brown.
- Fix for the cpuidle menu governor from Chander Kashyap.
- New ARM clps711x cpuidle driver from Alexander Shiyan.
- Hibernate core fixes and cleanups from Chen Gang, Dan Carpenter,
Fabian Frederick, Pali Rohár, and Sebastian Capella.
- Intel RAPL (Running Average Power Limit) driver updates from
Jacob Pan.
- PNP subsystem updates from Bjorn Helgaas and Fabian Frederick.
- devfreq core updates from Chanwoo Choi and Paul Bolle.
- devfreq updates for exynos4 and exynos5 from Chanwoo Choi and
Bartlomiej Zolnierkiewicz.
- turbostat tool fix from Jean Delvare.
- cpupower tool updates from Prarit Bhargava, Ramkumar Ramachandra
and Thomas Renninger.
- New ACPI ec_access.c tool for poking at the EC in a safe way
from Thomas Renninger.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJTjl16AAoJEILEb/54YlRxeKgP/RRQSV7lFtf582Dw/5M/iWOg
qYeNtuYFLArEmJ7SpxHdKsU1ZRm3CahAS1j7grvQMQasUxTzoavMcSBNZefeaoNK
d01LVNqcyKCZs3+izRezk5N1IY+AjdrOcqCdIk8rfgFnc6kOttYUrVcIzKuIKAvJ
MsJ5s/uqP8G69FsAA3Ttdtr0HKiQhN4skSt424wntQRDeJNZPBs74mPKBGh8bxlO
Zr/VCDibKQ2Z8jS7x+TzwZrOxgE1/9x0Cub6GAdTvAfS8A+utPwSkneUyopNqpQ+
tJ5rz5R+HpmPMerizBuU+5s+tvjDPtH4/OZvOPSpYraQSFLOwx3hAm+a5k7fOGmc
XWjXnXWT0i0V3iQkwrspTNjX1RgywbsHbmXrcWn192HResvMQ9zk2gH2ch6m8JhN
yTV5V51dOZicpPuaTCvIkJpsV33p6vRz+EdPBiXoEdua5KKtOg8EnQ470dNaMR92
3ZtWmIvSgGlyPyHlSHLfGXbPUwTYvDNV3aheIoXp9E6WY3WJN9J3WXm4EHKBNVaI
H83kwuk1s92cgqh22H5Pcb0CmDcrbkUdP6hhsPS/aL80/EJMljRP2AYW1Y+l1LAf
pzMLmekHFqQEDjFQltwGvFV/EjFeMHnqOgQONx9ygMaayCGGTYSDx3FbRDesf8t9
qhoFcTPSxoo0XjrGrR6b
=tpdF
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm into next
Pull ACPI and power management updates from Rafael Wysocki:
"ACPICA is the leader this time (63 commits), followed by cpufreq (28
commits), devfreq (15 commits), system suspend/hibernation (12
commits), ACPI video and ACPI device enumeration (10 commits each).
We have no major new features this time, but there are a few
significant changes of how things work. The most visible one will
probably be that we are now going to create platform devices rather
than PNP devices by default for ACPI device objects with _HID. That
was long overdue and will be really necessary to be able to use the
same drivers for the same hardware blocks on ACPI and DT-based systems
going forward. We're not expecting fallout from this one (as usual),
but it's something to watch nevertheless.
The second change having a chance to be visible is that ACPI video
will now default to using native backlight rather than the ACPI
backlight interface which should generally help systems with broken
Win8 BIOSes. We're hoping that all problems with the native backlight
handling that we had previously have been addressed and we are in a
good enough shape to flip the default, but this change should be easy
enough to revert if need be.
In addition to that, the system suspend core has a new mechanism to
allow runtime-suspended devices to stay suspended throughout system
suspend/resume transitions if some extra conditions are met
(generally, they are related to coordination within device hierarchy).
However, enabling this feature requires cooperation from the bus type
layer and for now it has only been implemented for the ACPI PM domain
(used by ACPI-enumerated platform devices mostly today).
Also, the acpidump utility that was previously shipped as a separate
tool will now be provided by the upstream ACPICA along with the rest
of ACPICA code, which will allow it to be more up to date and better
supported, and we have one new cpuidle driver (ARM clps711x).
The rest is improvements related to certain specific use cases,
cleanups and fixes all over the place.
Specifics:
- ACPICA update to upstream version 20140424. That includes a number
of fixes and improvements related to things like GPE handling,
table loading, headers, memory mapping and unmapping, DSDT/SSDT
overriding, and the Unload() operator. The acpidump utility from
upstream ACPICA is included too. From Bob Moore, Lv Zheng, David
Box, David Binderman, and Colin Ian King.
- Fixes and cleanups related to ACPI video and backlight interfaces
from Hans de Goede. That includes blacklist entries for some new
machines and using native backlight by default.
- ACPI device enumeration changes to create platform devices rather
than PNP devices for ACPI device objects with _HID by default. PNP
devices will still be created for the ACPI device object with
device IDs corresponding to real PNP devices, so that change should
not break things left and right, and we're expecting to see more
and more ACPI-enumerated platform devices in the future. From
Zhang Rui and Rafael J Wysocki.
- Updates for the ACPI LPSS (Low-Power Subsystem) driver allowing it
to handle system suspend/resume on Asus T100 correctly. From
Heikki Krogerus and Rafael J Wysocki.
- PM core update introducing a mechanism to allow runtime-suspended
devices to stay suspended over system suspend/resume transitions if
certain additional conditions related to coordination within device
hierarchy are met. Related PM documentation update and ACPI PM
domain support for the new feature. From Rafael J Wysocki.
- Fixes and improvements related to the "freeze" sleep state. They
affect several places including cpuidle, PM core, ACPI core, and
the ACPI battery driver. From Rafael J Wysocki and Zhang Rui.
- Miscellaneous fixes and updates of the ACPI core from Aaron Lu,
Bjørn Mork, Hanjun Guo, Lan Tianyu, and Rafael J Wysocki.
- Fixes and cleanups for the ACPI processor and ACPI PAD (Processor
Aggregator Device) drivers from Baoquan He, Manuel Schölling, Tony
Camuso, and Toshi Kani.
- System suspend/resume optimization in the ACPI battery driver from
Lan Tianyu.
- OPP (Operating Performance Points) subsystem updates from Chander
Kashyap, Mark Brown, and Nishanth Menon.
- cpufreq core fixes, updates and cleanups from Srivatsa S Bhat,
Stratos Karafotis, and Viresh Kumar.
- Updates, fixes and cleanups for the Tegra, powernow-k8, imx6q,
s5pv210, nforce2, and powernv cpufreq drivers from Brian Norris,
Jingoo Han, Paul Bolle, Philipp Zabel, Stratos Karafotis, and
Viresh Kumar.
- intel_pstate driver fixes and cleanups from Dirk Brandewie, Doug
Smythies, and Stratos Karafotis.
- Enabling the big.LITTLE cpufreq driver on arm64 from Mark Brown.
- Fix for the cpuidle menu governor from Chander Kashyap.
- New ARM clps711x cpuidle driver from Alexander Shiyan.
- Hibernate core fixes and cleanups from Chen Gang, Dan Carpenter,
Fabian Frederick, Pali Rohár, and Sebastian Capella.
- Intel RAPL (Running Average Power Limit) driver updates from Jacob
Pan.
- PNP subsystem updates from Bjorn Helgaas and Fabian Frederick.
- devfreq core updates from Chanwoo Choi and Paul Bolle.
- devfreq updates for exynos4 and exynos5 from Chanwoo Choi and
Bartlomiej Zolnierkiewicz.
- turbostat tool fix from Jean Delvare.
- cpupower tool updates from Prarit Bhargava, Ramkumar Ramachandra
and Thomas Renninger.
- New ACPI ec_access.c tool for poking at the EC in a safe way from
Thomas Renninger"
* tag 'pm+acpi-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (187 commits)
ACPICA: Namespace: Remove _PRP method support.
intel_pstate: Improve initial busy calculation
intel_pstate: add sample time scaling
intel_pstate: Correct rounding in busy calculation
intel_pstate: Remove C0 tracking
PM / hibernate: fixed typo in comment
ACPI: Fix x86 regression related to early mapping size limitation
ACPICA: Tables: Add mechanism to control early table checksum verification.
ACPI / scan: use platform bus type by default for _HID enumeration
ACPI / scan: always register ACPI LPSS scan handler
ACPI / scan: always register memory hotplug scan handler
ACPI / scan: always register container scan handler
ACPI / scan: Change the meaning of missing .attach() in scan handlers
ACPI / scan: introduce platform_id device PNP type flag
ACPI / scan: drop unsupported serial IDs from PNP ACPI scan handler ID list
ACPI / scan: drop IDs that do not comply with the ACPI PNP ID rule
ACPI / PNP: use device ID list for PNPACPI device enumeration
ACPI / scan: .match() callback for ACPI scan handlers
ACPI / battery: wakeup the system only when necessary
power_supply: allow power supply devices registered w/o wakeup source
...
SoC-near driver changes that we're merging through our tree. Mostly
because they depend on other changes we have staged, but in some cases
because the driver maintainers preferred that we did it this way.
This contains a largeish cleanup series of the omap_l3_noc bus driver,
cpuidle rework for Exynos, some reset driver conversions and a long
branch of TI EDMA fixes and cleanups, with more to come next release.
The TI EDMA cleanups is a shared branch with the dmaengine tree, with
a handful of Davinci-specific fixes on top.
After discussion at last year's KS (and some more on the mailing lists),
we are here adding a drivers/soc directory. The purpose of this is
to keep per-vendor shared code that's needed by different drivers but
that doesn't fit into the MFD (nor drivers/platform) model. We expect
to keep merging contents for this hierarchy through arm-soc so we can
keep an eye on what the vendors keep adding here and not making it a
free-for-all to shove in crazy stuff.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJTjOFiAAoJEIwa5zzehBx30RYP/0UE+R8ccdsodunmIDrmQ7QP
qFWe1YTWlyXtGBDaPCNfdcU09UYatPKuCv5dJ2ToQCyyFI26PIIhFtnCNXmMuYz+
XPCuqAlJ9hZWx7+j2hXRlyhoZMAaJ5EVVxaK5tnVYXDIfy1Y3xG7i069HD/qGrQp
xrV+XofFmpU2VAds6S+SpecFFfYD7n/pJ1bTSgzPfaUsEUyV882dJ3skgs1VpTzQ
PnL/0Z2t4ePoP3+6p+F7EnJxemLF5IXrlL0c7hODxQKuMqlzoUluywh6SwOHfCQL
u2cc5SFUbbKhExwlGOVibdQMiC0HUOXyRvyYFOIdbv+xNH+Zc/tcoQQ22PWm4Yy1
08qOm3Fr6yw5nH5IT+1wCIFCzJEC/ZHM5B2t+RISFybAMk6Bg1TDYJLmd570zkEL
aTLtS5hdmy4h8Ad5FBtwKNyL//6FJJxhbHUu/m0qaE0phq94+78B2M6vbx6757xC
kCFlpJsHoN0Tn5c9Q1hpTqI/BHxb4UR7Nf+b8Ox8Veuc9JrS35lzi/rWnGxB5WB0
+1KCA8eih9KXTtksxAte1TmSbMciqW559RUR7dNAPXAMPksY2mJV1I+rg0cRsY3i
F90Lnc6LWUM5PYpc4VwiC0sUCLKzTFnpZUELqMOiws3PUblbb0StXuoNo6owbtsK
mp1Juxi1n7VhoN9AFVpL
=SC+e
-----END PGP SIGNATURE-----
Merge tag 'drivers-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc into next
Pull ARM SoC driver changes from Olof Johansson:
"SoC-near driver changes that we're merging through our tree. Mostly
because they depend on other changes we have staged, but in some cases
because the driver maintainers preferred that we did it this way.
This contains a largeish cleanup series of the omap_l3_noc bus driver,
cpuidle rework for Exynos, some reset driver conversions and a long
branch of TI EDMA fixes and cleanups, with more to come next release.
The TI EDMA cleanups is a shared branch with the dmaengine tree, with
a handful of Davinci-specific fixes on top.
After discussion at last year's KS (and some more on the mailing
lists), we are here adding a drivers/soc directory. The purpose of
this is to keep per-vendor shared code that's needed by different
drivers but that doesn't fit into the MFD (nor drivers/platform)
model. We expect to keep merging contents for this hierarchy through
arm-soc so we can keep an eye on what the vendors keep adding here and
not making it a free-for-all to shove in crazy stuff"
* tag 'drivers-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (101 commits)
cpufreq: exynos: Fix driver compilation with ARCH_MULTIPLATFORM
tty: serial: msm: Remove direct access to GSBI
power: reset: keystone-reset: introduce keystone reset driver
Documentation: dt: add bindings for keystone pll control controller
Documentation: dt: add bindings for keystone reset driver
soc: qcom: fix of_device_id table
ARM: EXYNOS: Fix kernel panic when unplugging CPU1 on exynos
ARM: EXYNOS: Move the driver to drivers/cpuidle directory
ARM: EXYNOS: Cleanup all unneeded headers from cpuidle.c
ARM: EXYNOS: Pass the AFTR callback to the platform_data
ARM: EXYNOS: Move S5P_CHECK_SLEEP into pm.c
ARM: EXYNOS: Move the power sequence call in the cpu_pm notifier
ARM: EXYNOS: Move the AFTR state function into pm.c
ARM: EXYNOS: Encapsulate the AFTR code into a function
ARM: EXYNOS: Disable cpuidle for exynos5440
ARM: EXYNOS: Encapsulate boot vector code into a function for cpuidle
ARM: EXYNOS: Pass wakeup mask parameter to function for cpuidle
ARM: EXYNOS: Remove ifdef for scu_enable in pm
ARM: EXYNOS: Move scu_enable in the cpu_pm notifier
ARM: EXYNOS: Use the cpu_pm notifier for pm
...
Cleanups for 3.16. Among these are:
- A bunch of misc cleanups for Broadcom platforms, mostly housekeeping
- Enabling Common Clock Framework on the older s3c24xx Samsung chipsets
- Cleanup of the Versatile Express system controller code, moving it to syscon
- Power management cleanups for OMAP platforms
+ a handful of other cleanups across the place
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJTjMwHAAoJEIwa5zzehBx3MjMP/iELgDsqbNE2wxF9Fb5EEnoe
S11q1QIvVrMVdMcKFN5HfW7f+xNso6+4SwXW0cRrJokGvaqRE758WZWuZq0QBUeS
RYMhfpqmI6pTTJUyy6i6OyXhuRqu8rQ1NPEAatYrKzmtwFX1H4t25f1YtZWhBcK8
ONi45FHeH1OKGGpjpT63uhWEzLk+LZI2MtgxmWoFcemf7guX6vEPJVuVRi8eqLoS
9vl1cAkweYgGhjvQFcSXENaguV50dZlLc9C41dJk9KVvJfRt7o+/cRbG5YpGvnp5
Liu+OWM72w0BkgNk6wDN4kaPX5UGLF8QX11JlvDRCJ2FcPtM4NBG/C9TqLMfkKDR
Ze+ITiXh6NjefdTZWJaM4vzsd6vFws8EYAP24IWFlZ451bNLVN1lzlgqluPNoKmj
CAsFPZhY/x5X9a8VLZ72ohx3N17T/iMsOlbiWtnlfqDcL6N0IoLG1YkFFeQIKEAH
mpobWus8Myq1miWqSaeXh5wOqUVQmYR0I8jNoTfte1nBYSaIGhtMixoQhM6Zw50C
dgSh4p7qhrZUOnYmkPqFXr7NCJ9n3RD10Xu8d/3IIp0u9RJ5Kx6NCEg9adq22jZQ
XGrr/vH0sM8MzpKmfTMi5t2Cx5kP2G+O3enq0hQi4x3Cb4o8vwWQlMgydTd+xBjj
aLo3WTTw0h6nTuKkZL2p
=wuX4
-----END PGP SIGNATURE-----
Merge tag 'cleanup-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc into next
Pull ARM SoC cleanups from Olof Johansson:
"Cleanups for 3.16. Among these are:
- a bunch of misc cleanups for Broadcom platforms, mostly
housekeeping
- enabling Common Clock Framework on the older s3c24xx Samsung
chipsets
- cleanup of the Versatile Express system controller code, moving it
to syscon
- power management cleanups for OMAP platforms
plus a handful of other cleanups across the place"
* tag 'cleanup-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (87 commits)
ARM: kconfig: allow PCI support to be selected with ARCH_MULTIPLATFORM
clk: samsung: fix build error
ARM: vexpress: refine dependencies for new code
clk: samsung: clk-s3c2410-dlck: do not use PNAME macro as it declares __initdata
cpufreq: exynos: Fix the compile error
ARM: S3C24XX: move debug-macro.S into the common space
ARM: S3C24XX: use generic DEBUG_UART_PHY/_VIRT in debug macro
ARM: S3C24XX: trim down debug uart handling
ARM: compressed/head.S: remove s3c24xx special case
ARM: EXYNOS: Remove unnecessary inclusion of cpu.h
ARM: EXYNOS: Migrate Exynos specific macros from plat to mach
ARM: EXYNOS: Remove exynos_subsys registration
ARM: EXYNOS: Remove duplicate lines in Makefile
ARM: EXYNOS: use v7_exit_coherency_flush macro for cache disabling
ARM: OMAP4: PRCM: remove references to cm-regbits-44xx.h from PRCM core files
ARM: OMAP3/4: PRM: add support of late_init call to prm_ll_ops
ARM: OMAP3/OMAP4: PRM: add prm_features flags and add IO wakeup under it
ARM: OMAP3/4: PRM: provide io chain reconfig function through irq setup
ARM: OMAP2+: PRM: remove unnecessary cpu_is_XXX calls from prm_init / exit
ARM: OMAP2+: PRCM: cleanup some header includes
...
This change makes the busy calculation using 64 bit math which prevents
overflow for large values of aperf/mperf.
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The PID assumes that samples are of equal time, which for a deferable
timers this is not true when the system goes idle. This causes the
PID to take a long time to converge to the min P state and depending
on the pattern of the idle load can make the P state appear stuck.
The hold-off value of three sample times before using the scaling is
to give a grace period for applications that have high performance
requirements and spend a lot of time idle, The poster child for this
behavior is the ffmpeg benchmark in the Phoronix test suite.
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Changing to fixed point math throughout the busy calculation in
commit e66c1768 (Change busy calculation to use fixed point
math.) Introduced some inaccuracies by rounding the busy value at two
points in the calculation. This change removes roundings and moves
the rounding to the output of the PID where the calculations are
complete and the value returned as an integer.
Fixes: e66c176837 (intel_pstate: Change busy calculation to use fixed point math.)
Reported-by: Doug Smythies <dsmythies@telus.net>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit fcb6a15c (intel_pstate: Take core C0 time into account for core
busy calculation) introduced a regression referenced below. The issue
with "lockup" after suspend that this commit was addressing is now dealt
with in the suspend path.
Fixes: fcb6a15c2e (intel_pstate: Take core C0 time into account for core busy calculation)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=66581
Link: https://bugzilla.kernel.org/show_bug.cgi?id=75121
Reported-by: Doug Smythies <dsmythies@telus.net>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently Exynos cpufreq drivers rely on globally mapped
clock controller registers to configure frequency of CPU
cores. This is obviously wrong and will be removed in near
future, but to enable support for multi-platform builds
without introducing a regression it needs to be worked
around.
This patch hacks the code to look for clock controller node
in device tree and map its registers using of_iomap(),
instead of relying on global mapping, so dependencies on
platform headers are removed and the driver can compile
again with multiplatform support.
Signed-off-by: Tomasz Figa <t.figa@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Handling calls to ->target_index() has got complex over time and might become
more complex. So, its better to take target_index() bits out in another routine
__target_index() for better code readability. Shouldn't have any functional
impact.
Tested-by: Stephen Warren <swarren@nvidia.com>
Reviewed-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A pr_err() was added in v3.1. It was guarded by a check for
CONFIG_PM_VERBOSE. The Kconfig symbol PM_VERBOSE was removed in v3.0. So
this pr_err() has never been used. Drop that check and clean up the
message a bit.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Although, a value is assigned to member name of struct cpudata,
it is never used.
We can safely remove it.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 7da83a80 ("ARM: EXYNOS: Migrate Exynos specific macros from
plat to mach") which lands in samsung tree causes build breakage
for cpufreq-exynos like following:
drivers/cpufreq/exynos-cpufreq.c: In function 'exynos_cpufreq_probe':
drivers/cpufreq/exynos-cpufreq.c:166:2: error: implicit declaration of function 'soc_is_exynos4210'
[-Werror=implicit-function-declaration]
drivers/cpufreq/exynos-cpufreq.c:168:2: error: implicit declaration of function 'soc_is_exynos4212'
[-Werror=implicit-function-declaration]
drivers/cpufreq/exynos-cpufreq.c:168:2: error: implicit declaration of function 'soc_is_exynos4412'
[-Werror=implicit-function-declaration]
drivers/cpufreq/exynos-cpufreq.c:170:2: error: implicit declaration of function 'soc_is_exynos5250'
[-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors
make[2]: *** [drivers/cpufreq/exynos-cpufreq.o] Error 1
make[2]: *** Waiting for unfinished jobs....
drivers/cpufreq/exynos4x12-cpufreq.c: In function 'exynos4x12_set_clkdiv':
drivers/cpufreq/exynos4x12-cpufreq.c:118:2: error: implicit declaration of function 'soc_is_exynos4212'
[-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors
make[2]: *** [drivers/cpufreq/exynos4x12-cpufreq.o] Error 1
make[1]: *** [drivers/cpufreq] Error 2
This fixes above error with getting SoC information via
of_machine_is_compatible() instead of soc_is_exynosXXXX().
Suggested-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Jonghwan Choi <jhbird.choi@samsung.com>
[kgene.kim@samsung.com: fixed typo and modified as per Viresh's suggestion]
[kgene.kim@samsung.com: Rafael agreed]
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
- ACPICA fix for a stale pointer access introduced by a recent
commit in the XSDT validation code from Lv Zheng.
- ACPICA fix for the default value of the command line switch
to favor 32-bit FADT addresses (in case there's a conflict
between a 64-bit and a 32-bit address). The previous default
was that the 32-bit version would take precedence and we tried
to change it to the other way around and it didn't work.
From Lv Zheng.
- A TPM commit related to ACPI _DSM in 3.14 caused the driver to
refuse to load if a specific _DSM was missing and that broke
resume from system suspend on Chromebooks that require the TPM
hardware to be restored to a working state during resume by the
OS. Restore the old behavior to load the driver if the _DSM
in question is not present, but prevent it from using the
feature the _DSM is for.
- ACPI AC driver conversion in 3.13 broke thermal management on
at least one machine and has to be reverted. From Guenter Roeck.
- Two reverts of 3.13 commits that attempted to remove the old ACPI
battery interface in /proc, but turned out to break some utilities
still using that interface. From Lan Tianyu.
- ACPI processor driver fix to prevent acpi_processor_add() from
modifying the CPU device's .offline field which leads to breakage
if the initial online of the CPU fails. From Igor Mammedov.
- Two intel_pstate fixes, one to take a BayTrail documentation update
into account and one to avoid forcing the maximum P-state on init
which causes CPU PM trouble on systems with P-states coordination
when one of the CPU cores is initialized after an offline/online
cycle triggered by user space. Both stable candidates, from
Dirk Brandewie.
- Fix for the ACPI video DMI blacklist entry for Dell Inspiron 7520
from Aaron Lu.
- Two new ACPI video blacklist entries for machines shipping with
Win8 that need to use native backlight so that it can be controlled
in a usual way (which doesn't work otherwise due bugs in the ACPI
tables) from Hans de Goede.
- Two ACPI _OSI quirks for systems that need them to work correctly
with Linux from Edward Lin and Hans de Goede.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJTdow6AAoJEILEb/54YlRxUWcP/0nczFCxZ7C1c7l7Ya8r9iRZ
HXT+AAbakanPs6Ms4VRxao65v1AcdlruWHPhJ6JiEoiO60yKxIIzy7f3mO5gVesr
tcaxaFaCTNdUDFdRDhyN6y+RzO/ohYSKdOJng2tcz2IvcsRD93hXk+095BlVzfJV
EFycqXPb3nmP6oZo1KjPebk4cmlC8Sw9aWcBxK0O1aRoIrAdObf3+rCXfc2/FvC0
vAquOI2OaJ0bwNl7QhGHMLMnvoDvq+/y2mDQ+BvxPERbtDBDS66tkhjsxEx89kpi
ow6WKX1vgfsWYGa5tCxFDZvYIYP5x4+YWPwvYfOFmCO520PUIojT81qT+P6hLHMy
jf2G7QWvL/3qn89qKsR26YNE/fadNDZq0IHh3KD8kOaKtXBV30fIh2rLG3XZfB8q
lhWAcx5ot2ZoQy5ppAuKNG+zA6MniWbN/a5acUIS6zsvVRFkGeZE5ORyEUgkWMjk
QiiL3kcx/Fe528A1cMVXR2fb4kzKBpnVXxWQlzptKCX/3JAOxe6ElZqHMDff983b
LHJjMfVUX9m1yGUZHqbH6CiK9kuQv2fQXSESLrkUDItVs9VFskYQb6AZE5Ow+k+A
QKnDg7n81YlYiau997g0+yA7EqwQmQZx+EwfVtOIfppOlp4LFbCKC6Ytu9pcgbZB
GuYsd/bPvEVswylJMu6U
=v+oA
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"Still fixing regressions (partly by reverting commits that broke
things for people), fixing other stable-candidate bugs and adding some
blacklist entries for ACPI video and _OSI.
Two ACPICA regression fixes (one recent and one for a 3.14 commit), a
fix for an ACPI-related regression in TPM (introduced in 3.14), a
revert of the ACPI AC driver conversion in 3.13 that went wrong for an
unknown reason, two reverts of commits that attempted to remove an old
user space interface in /proc and broke some utilities, in 3.13 too, a
fix for a CPU hotplug bug in the ACPI processor driver (stable
material), two (stable candidate) fixes for intel_pstate and a few new
blacklist entries, mostly for systems that shipped with Windows 8.
Specifics:
- ACPICA fix for a stale pointer access introduced by a recent commit
in the XSDT validation code from Lv Zheng.
- ACPICA fix for the default value of the command line switch to
favor 32-bit FADT addresses (in case there's a conflict between a
64-bit and a 32-bit address). The previous default was that the
32-bit version would take precedence and we tried to change it to
the other way around and it didn't work. From Lv Zheng.
- A TPM commit related to ACPI _DSM in 3.14 caused the driver to
refuse to load if a specific _DSM was missing and that broke resume
from system suspend on Chromebooks that require the TPM hardware to
be restored to a working state during resume by the OS. Restore
the old behavior to load the driver if the _DSM in question is not
present, but prevent it from using the feature the _DSM is for.
- ACPI AC driver conversion in 3.13 broke thermal management on at
least one machine and has to be reverted. From Guenter Roeck.
- Two reverts of 3.13 commits that attempted to remove the old ACPI
battery interface in /proc, but turned out to break some utilities
still using that interface. From Lan Tianyu.
- ACPI processor driver fix to prevent acpi_processor_add() from
modifying the CPU device's .offline field which leads to breakage
if the initial online of the CPU fails. From Igor Mammedov.
- Two intel_pstate fixes, one to take a BayTrail documentation update
into account and one to avoid forcing the maximum P-state on init
which causes CPU PM trouble on systems with P-states coordination
when one of the CPU cores is initialized after an offline/online
cycle triggered by user space. Both stable candidates, from Dirk
Brandewie.
- Fix for the ACPI video DMI blacklist entry for Dell Inspiron 7520
from Aaron Lu.
- Two new ACPI video blacklist entries for machines shipping with
Win8 that need to use native backlight so that it can be controlled
in a usual way (which doesn't work otherwise due bugs in the ACPI
tables) from Hans de Goede.
- Two ACPI _OSI quirks for systems that need them to work correctly
with Linux from Edward Lin and Hans de Goede"
* tag 'pm+acpi-3.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / video: Revert native brightness quirk for ThinkPad T530
intel_pstate: remove setting P state to MAX on init
ACPICA: Tables: Restore old behavor to favor 32-bit FADT addresses.
ACPI / video: correct DMI tag for Dell Inspiron 7520
intel_pstate: Set turbo VID for BayTrail
ACPI / TPM: Fix resume regression on Chromebooks
ACPI / proc: Do not say when /proc interfaces will be deleted in Kconfig
ACPI / processor: do not mark present at boot but not onlined CPU as onlined
ACPI: Revert "ACPI / AC: convert ACPI ac driver to platform bus"
ACPI / blacklist: Add dmi_enable_osi_linux quirk for Asus EEE PC 1015PX
ACPI: blacklist win8 OSI for Dell Inspiron 7737
ACPI / video: Add use_native_backlight quirks for more systems
ACPI: Revert "ACPI / Battery: Remove battery's proc directory"
ACPI: Revert "ACPI: Remove CONFIG_ACPI_PROCFS_POWER and cm_sbsc.c"
ACPICA: Tables: Fix invalid pointer accesses in acpi_tb_parse_root_table().
Many drivers keep frequencies in frequency table in ascending
or descending order. When governor tries to change to policy->min
or policy->max respectively then the cpufreq_frequency_table_target
could return on first iteration. This will save some iteration cycles.
So, break out early when a frequency in cpufreq_frequency_table
equals to target one.
Testing this during kernel compilation using ondemand governor
with a frequency table in ascending order, the
cpufreq_frequency_table_target returned early on the first
iteration at about 30% of times called.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This driver is using devres managed calls incorrectly, giving the cpu0
device as first parameter instead of the cpufreq platform device.
This results in resources not being freed if the cpufreq platform device
is unbound, for example if probing has to be deferred for a missing
regulator.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 3.9+ <stable@vger.kernel.org> # 3.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tegra has implemented an unnecessary wrapper over tegra_update_cpu_speed(), i.e.
tegra_target(), which wasn't doing anything apart of calling
tegra_update_cpu_speed(). Get rid of that and use tegra_target() directly.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There is no need to include delay.h.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Shawn Guo <shawn.guo@freescale.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This driver is using devres managed calls incorrectly, giving the cpu0
device as first parameter instead of the cpufreq platform device.
This results in resources not being freed if the cpufreq platform device
is unbound, for example if probing has to be deferred for a missing
regulator.
Supporting probe deferral properly is a prerequisite to enabling the
internal LDO bypass on i.MX6 and regulating the CPU voltage with an
external regulator.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Shawn Guo <shawn.guo@freescale.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Suppress the following checkpatch.pl warnings:
- WARNING: Prefer pr_err(... to printk(KERN_ERR ...
- WARNING: Prefer pr_info(... to printk(KERN_INFO ...
- WARNING: Prefer pr_warn(... to printk(KERN_WARNING ...
- WARNING: quoted string split across lines
- WARNING: please, no spaces at the start of a line
Also, define the pr_fmt macro instead of PFX for the module name.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
powernv_cpufreq_get() is only referenced in this file.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org> on V2.
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are arm64 big.LITTLE systems so enable the big.LITTLE cpufreq driver.
While IKS is not available for these systems the driver is still useful
since it manages clusters with shared frequencies which is the common case
for these systems.
Long term combining the cpufreq-cpu0 and big.LITTLE drivers may be a
more sensible option but that is substantially more complex especially
in the case of IKS.
Signed-off-by: Mark Brown <broonie@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Don't use DEFINE_PCI_DEVICE_TABLE macro, because this macro
is deprecated.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Setting the P state of the core to max at init time is a hold over
from early implementation of intel_pstate where intel_pstate disabled
cpufreq and loaded VERY early in the boot sequence. This was to
ensure that intel_pstate did not affect boot time. This in not needed
now that intel_pstate is a cpufreq driver.
Removing this covers the case where a CPU has gone through a manual
CPU offline/online cycle and the P state is set to MAX on init and the
CPU immediately goes idle. Due to HW coordination the P state request
on the idle CPU will drag all cores to MAX P state until the load is
reevaluated when to core goes non-idle.
Reported-by: Patrick Marlier <patrick.marlier@gmail.com>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add support for Broadwell processors.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Loongson2 has been using (incorrectly) kHz for cpu_clk rate. This has
been unnoticed, as loongson2_cpufreq was the only place where the rate
was set/get. After commit 652ed95d5f
(cpufreq: introduce cpufreq_generic_get() routine) things however broke,
and now loops_per_jiffy adjustments are incorrect (1000 times too long).
The patch fixes this by changing cpu_clk rate to Hz.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: stable@vger.kernel.org
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: cpufreq@vger.kernel.org
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Patchwork: https://patchwork.linux-mips.org/patch/6678/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
A documentation update exposed that there is a separate set of VID
values that must be used in the turbo/boost P state range. Add
enumerating and setting the correct VID for P states in the turbo
range.
Cc: v3.13+ <stable@vger.kernel.org> # v3.13+
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The s3c24xx cpufreq driver needs to change the mpll speed and was doing
this by writing raw values from a translation table into the MPLLCON
register.
Change this to use a regular clk_set_rate call when using the common
clock framework and only write the raw value in the samsung_clock case.
The s3c cpufreq driver does already aquire the mpll, so simply add a reference
to struct s3c_cpufreq_config to let set_fvco access it.
While struct clk is opaque the differenciation between samsung clock and
common clock is kept, as the samsung-clock mpll clk does not implement a
real set_rate.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
On platforms that use cpufreq_for_each_* macros, build fails if
CONFIG_CPU_FREQ=n, e.g. ARM/shmobile/koelsch/non-multiplatform:
drivers/built-in.o: In function `clk_round_parent':
clkdev.c:(.text+0xcf168): undefined reference to `cpufreq_next_valid'
drivers/built-in.o: In function `clk_rate_table_find':
clkdev.c:(.text+0xcf820): undefined reference to `cpufreq_next_valid'
make[3]: *** [vmlinux] Error 1
Fix this making cpufreq_next_valid function inline and move it to
cpufreq.h.
Fixes: 27e289dce2 (cpufreq: Introduce macros for cpufreq_frequency_table iteration)
Reported-and-tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CPUFreq specific helper functions for OPP (Operating Performance Points)
now use generic OPP functions that allow CPUFreq to be be moved back
into CPUFreq framework. This allows for independent modifications
or future enhancements as needed isolated to just CPUFreq framework
alone.
Here, we just move relevant code and documentation to make this part of
CPUFreq infrastructure.
Cc: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Some cpufreq drivers were redundantly invoking the _begin() and _end()
APIs around frequency transitions, and this double invocation (one from
the cpufreq core and the other from the cpufreq driver) used to result
in a self-deadlock, leading to system hangs during boot. (The _begin()
API makes contending callers wait until the previous invocation is
complete. Hence, the cpufreq driver would end up waiting on itself!).
Now all such drivers have been fixed, but debugging this issue was not
very straight-forward (even lockdep didn't catch this). So let us add a
debug infrastructure to the cpufreq core to catch such issues more easily
in the future.
We add a new field called 'transition_task' to the policy structure, to keep
track of the task which is performing the frequency transition. Using this
field, we make note of this task during _begin() and print a warning if we
find a case where the same task is calling _begin() again, before completing
the previous frequency transition using the corresponding _end().
We have left out ASYNC_NOTIFICATION drivers from this debug infrastructure
for 2 reasons:
1. At the moment, we have no way to avoid a particular scenario where this
debug infrastructure can emit false-positive warnings for such drivers.
The scenario is depicted below:
Task A Task B
/* 1st freq transition */
Invoke _begin() {
...
...
}
Change the frequency
/* 2nd freq transition */
Invoke _begin() {
... //waiting for B to
... //finish _end() for
... //the 1st transition
... | Got interrupt for successful
... | change of frequency (1st one).
... |
... | /* 1st freq transition */
... | Invoke _end() {
... | ...
... V }
...
...
}
This scenario is actually deadlock-free because, once Task A changes the
frequency, it is Task B's responsibility to invoke the corresponding
_end() for the 1st frequency transition. Hence it is perfectly legal for
Task A to go ahead and attempt another frequency transition in the meantime.
(Of course it won't be able to proceed until Task B finishes the 1st _end(),
but this doesn't cause a deadlock or a hang).
The debug infrastructure cannot handle this scenario and will treat it as
a deadlock and print a warning. To avoid this, we exclude such drivers
from the purview of this code.
2. Luckily, we don't _need_ this infrastructure for ASYNC_NOTIFICATION drivers
at all! The cpufreq core does not automatically invoke the _begin() and
_end() APIs during frequency transitions in such drivers. Thus, the driver
alone is responsible for invoking _begin()/_end() and hence there shouldn't
be any conflicts which lead to double invocations. So, we can skip these
drivers, since the probability that such drivers will hit this problem is
extremely low, as outlined above.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Since commit d37e2b7644 ("intel_pstate: remove unneeded sample buffers")
we use only one sample. So, there is no need to pass the sample
pointer to intel_pstate_calc_busy. Instead, get the pointer from
cpudata. Also, remove the unused SAMPLE_COUNT macro.
While at it, reformat the first line in this function.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fix 4 spelling errors in help sections.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There has been confusion all the time about which mailing list to follow
for cpufreq activities, linux-pm@vger.kernel.org or cpufreq@vger.kernel.org.
Since patches sent to cpufreq@vger.kernel.org don't go to Patchwork
which is a maintenance workflow problem, make linux-pm@vger.kernel.org
the official mailing list for cpufreq stuff and remove all references
of cpufreq@vger.kernel.org from kernel source.
Later, we can request that the list be dropped entirely.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[rjw: Changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch uses dev_err/info function to show accurate log message
with device name instead of pr_err/info function.
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The cpufreq core now supports the cpufreq_for_each_entry and
cpufreq_for_each_valid_entry macros helpers for iteration over the
cpufreq_frequency_table, so use them.
It should have no functional changes.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Many cpufreq drivers need to iterate over the cpufreq_frequency_table
for various tasks.
This patch introduces two macros which can be used for iteration over
cpufreq_frequency_table keeping a common coding style across drivers:
- cpufreq_for_each_entry: iterate over each entry of the table
- cpufreq_for_each_valid_entry: iterate over each entry that contains
a valid frequency.
It should have no functional changes.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
During frequency transitions, the cpufreq core takes the responsibility of
invoking cpufreq_freq_transition_begin() and cpufreq_freq_transition_end()
for those cpufreq drivers that define the ->target_index callback but don't
set the ASYNC_NOTIFICATION flag.
The powernow-k7 cpufreq driver falls under this category, but this driver was
invoking the _begin() and _end() APIs itself around frequency transitions,
which led to double invocation of the _begin() API. The _begin API makes
contending callers wait until the previous invocation is complete. Hence,
the powernow-k7 driver ended up waiting on itself, leading to system hangs
during boot.
Fix this by removing the calls to the _begin() and _end() APIs from the
powernow-k7 driver, since they rightly belong to the cpufreq core.
Fixes: 12478cf0c5 (cpufreq: Make sure frequency transitions are serialized)
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
During frequency transitions, the cpufreq core takes the responsibility of
invoking cpufreq_freq_transition_begin() and cpufreq_freq_transition_end()
for those cpufreq drivers that define the ->target_index callback but don't
set the ASYNC_NOTIFICATION flag.
The powernow-k6 cpufreq driver falls under this category, but this driver was
invoking the _begin() and _end() APIs itself around frequency transitions,
which led to double invocation of the _begin() API. The _begin API makes
contending callers wait until the previous invocation is complete. Hence,
the powernow-k6 driver ended up waiting on itself, leading to system hangs
during boot.
Fix this by removing the calls to the _begin() and _end() APIs from the
powernow-k6 driver, since they rightly belong to the cpufreq core.
(Note that during ->exit(), the powernow-k6 driver sets the frequency
without any help from the cpufreq core. So add explicit calls to the
_begin() and _end() APIs around that frequency transition alone, to take
care of that special case. Also, add a missing 'break' statement there.)
Fixes: 12478cf0c5 (cpufreq: Make sure frequency transitions are serialized)
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The value of 'max_multiplier' is meant to be used for comparison with
clock_ratio[index].driver_data, not the index itself! Fix the code in
powernow_k6_cpu_exit() that has this bug.
Also, while at it, make the for-loop condition look for CPUFREQ_TABLE_END,
instead of hard-coding the loop count to 8.
Reported-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
During frequency transitions, the cpufreq core takes the responsibility of
invoking cpufreq_freq_transition_begin() and cpufreq_freq_transition_end()
for those cpufreq drivers that define the ->target_index callback but don't
set the ASYNC_NOTIFICATION flag.
The longhaul cpufreq driver falls under this category, but this driver was
invoking the _begin() and _end() APIs itself around frequency transitions,
which led to double invocation of the _begin() API. The _begin API makes
contending callers wait until the previous invocation is complete. Hence,
the longhaul driver ended up waiting on itself, leading to system hangs
during boot.
Fix this by removing the calls to the _begin() and _end() APIs from the
longhaul driver, since they rightly belong to the cpufreq core.
(Note that during module_exit(), the longhaul driver sets the frequency
without any help from the cpufreq core. So add explicit calls to the
_begin() and _end() APIs around that frequency transition alone, to take
care of that special case.)
Fixes: 12478cf0c5 (cpufreq: Make sure frequency transitions are serialized)
Reported-and-tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When make ARCH=arm multi_v7_defconfig, we get the following warnings:
warning: (ARM_HIGHBANK_CPUFREQ) selects GENERIC_CPUFREQ_CPU0 which has
unmet direct dependencies (ARCH_HAS_CPUFREQ && CPU_FREQ && HAVE_CLK
&& REGULATOR && OF && THERMAL && CPU_THERMAL)
To fix this, make ARM_HIGHBANK_CPUFREQ depend on ARCH_HAS_CPUFREQ and
REGULATOR instead of selecting them, PM_OPP will be selected by ARCH_HAS_CPUFREQ.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
On 32-bit, "12 * NSEC_PER_SEC" doesn't fit in "unsigned long"
(NSEC_PER_SEC is a "long" constant), causing an integer overflow:
drivers/cpufreq/ppc-corenet-cpufreq.c: In function 'corenet_cpufreq_cpu_init':
drivers/cpufreq/ppc-corenet-cpufreq.c:211:9: warning: integer overflow in expression [-Woverflow]
Force the intermediate to be 64-bit by adding an "ULL" suffix to the
constant multiplier to fix this.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Paul Gortmaker reported the following build failure of the powernv cpufreq
driver on UP configs:
drivers/cpufreq/powernv-cpufreq.c:241:2: error: implicit declaration of
function 'cpu_sibling_mask' [-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors
make[3]: *** [drivers/cpufreq/powernv-cpufreq.o] Error 1
make[2]: *** [drivers/cpufreq] Error 2
make[1]: *** [drivers] Error 2
make: *** [sub-make] Error 2
The trouble here is that cpu_sibling_mask is defined only in <asm/smp.h>,
and <linux/smp.h> includes <asm/smp.h> only in SMP builds.
So fix this build failure by explicitly including <asm/smp.h> in the driver,
so that we get the definition of cpu_sibling_mask even in UP configurations.
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch fixes coccinelle error regarding usage of IS_ERR and
PTR_ERR instead of PTR_ERR_OR_ZERO.
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pm-cpufreq:
cpufreq: ppc: Remove duplicate inclusion of fsl_soc.h
cpufreq: create another field .flags in cpufreq_frequency_table
cpufreq: use kzalloc() to allocate memory for cpufreq_frequency_table
cpufreq: don't print value of .driver_data from core
cpufreq: ia64: don't set .driver_data to index
cpufreq: powernv: Select CPUFreq related Kconfig options for powernv
cpufreq: powernv: Use cpufreq_frequency_table.driver_data to store pstate ids
cpufreq: powernv: cpufreq driver for powernv platform
cpufreq: at32ap: don't declare local variable as static
cpufreq: loongson2_cpufreq: don't declare local variable as static
cpufreq: unicore32: fix typo issue for 'clk'
cpufreq: exynos: Disable on multiplatform build
fsl_soc.h was included twice.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The purpose of this single series of commits from Srivatsa S Bhat (with
a small piece from Gautham R Shenoy) touching multiple subsystems that use
CPU hotplug notifiers is to provide a way to register them that will not
lead to deadlocks with CPU online/offline operations as described in the
changelog of commit 93ae4f978c (CPU hotplug: Provide lockless versions
of callback registration functions).
The first three commits in the series introduce the API and document it
and the rest simply goes through the users of CPU hotplug notifiers and
converts them to using the new method.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJTQow2AAoJEILEb/54YlRxW4QQAJlYRDUzwFJzJzYhltQYuVR+
4D74XMtvXgoJfg3cwdSWvMKKpJZnA9BVN0f7Hcx9wYmgdexYUuHeZJmMNyc3S2+g
KjKBIsugvgmZhHbbLd6TJ6GBbhGT5JLt9VmSfL9zIkveInU1YHFUUqL/mxdHm4J0
BSGKjk2rN3waRJgmY+xfliFLtQjDKFwJpMuvrgtoUyfas3f4sIV43UNbqdvA/weJ
rzedxXOlKH/id4b56lj/4iIzcoL3mwvJJ7r6n0CEMsKv87z09kqR0O+69Tsq/cgs
j17CsvoJOmZGk3QTeKVMQWBsvk6aPoDu3zK83gLbQMt+qjOpSTbJLz/3HZw4/TrW
ss4nuZne1DLMGS+6hoxYbTP+6Ni//Kn+l/LrHc5jb7m1X3lMO4W2aV3IROtIE1rv
lEP1IG01NU4u9YwkVj1dyhrkSp8tLPul4SrUK8W+oNweOC5crjJV7vJbIPJgmYiM
IZN55wln0yVRtR4TX+rmvN0PixsInE8MeaVCmReApyF9pdzul/StxlBze5BKLSJD
cqo1kNPpsmdxoDucqUpQ/gSvy+IOl2qnlisB5PpV93sk7De6TFDYrGHxjYIW7jMf
StXwdCDDQhzd2Q8Kfpp895A1dbIl8rKtwA6bTU2eX+BfMVFzuMdT44cvosx1+UdQ
sWl//rg76nb13dFjvF+q
=SW7Q
-----END PGP SIGNATURE-----
Merge tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull CPU hotplug notifiers registration fixes from Rafael Wysocki:
"The purpose of this single series of commits from Srivatsa S Bhat
(with a small piece from Gautham R Shenoy) touching multiple
subsystems that use CPU hotplug notifiers is to provide a way to
register them that will not lead to deadlocks with CPU online/offline
operations as described in the changelog of commit 93ae4f978c ("CPU
hotplug: Provide lockless versions of callback registration
functions").
The first three commits in the series introduce the API and document
it and the rest simply goes through the users of CPU hotplug notifiers
and converts them to using the new method"
* tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits)
net/iucv/iucv.c: Fix CPU hotplug callback registration
net/core/flow.c: Fix CPU hotplug callback registration
mm, zswap: Fix CPU hotplug callback registration
mm, vmstat: Fix CPU hotplug callback registration
profile: Fix CPU hotplug callback registration
trace, ring-buffer: Fix CPU hotplug callback registration
xen, balloon: Fix CPU hotplug callback registration
hwmon, via-cputemp: Fix CPU hotplug callback registration
hwmon, coretemp: Fix CPU hotplug callback registration
thermal, x86-pkg-temp: Fix CPU hotplug callback registration
octeon, watchdog: Fix CPU hotplug callback registration
oprofile, nmi-timer: Fix CPU hotplug callback registration
intel-idle: Fix CPU hotplug callback registration
clocksource, dummy-timer: Fix CPU hotplug callback registration
drivers/base/topology.c: Fix CPU hotplug callback registration
acpi-cpufreq: Fix CPU hotplug callback registration
zsmalloc: Fix CPU hotplug callback registration
scsi, fcoe: Fix CPU hotplug callback registration
scsi, bnx2fc: Fix CPU hotplug callback registration
scsi, bnx2i: Fix CPU hotplug callback registration
...
Currently cpufreq frequency table has two fields: frequency and driver_data.
driver_data is only for drivers' internal use and cpufreq core shouldn't use
it at all. But with the introduction of BOOST frequencies, this assumption
was broken and we started using it as a flag instead.
There are two problems due to this:
- It is against the description of this field, as driver's data is used by
the core now.
- if drivers fill it with -3 for any frequency, then those frequencies are
never considered by cpufreq core as it is exactly same as value of
CPUFREQ_BOOST_FREQ, i.e. ~2.
The best way to get this fixed is by creating another field flags which
will be used for such flags. This patch does that. Along with that various
drivers need modifications due to the change of struct cpufreq_frequency_table.
Reviewed-by: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Few drivers are using kmalloc() to allocate memory for frequency
tables and since we will have an additional field '.flags' in
'struct cpufreq_frequency_table', these might become unstable.
Better get these fixed by replacing kmalloc() by kzalloc() instead.
Along with that we also remove use of .driver_data from SPEAr driver
as it doesn't use it at all. Also, writing zero to .driver_data is not
required for powernow-k8 as it is already zero.
Reported-and-reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CPUFreq core doesn't control value of .driver_data and this field is
completely driver specific. This can contain any value and not only
indexes. For most of the drivers, which aren't using this field, its
value is zero. So, printing this from core doesn't make any sense.
Don't print it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
.driver_data field is only required to be filled if drivers want to
preserve some data in there which they can use according to the value
of .frequency. But this driver isn't using this field at all, but just
setting it equal to the index value. Which isn't required. Fix it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>