When HWP is active turbo active ratio is not used, so we should allow
policy max frequency above turbo activation ratio to be set. When HWP is
not active, then any policy max frequency above turbo activation ratio
can result upto max one-core turbo frequency.
This fix helps better thermal control in turbo region when other methods
like "Running Average Power Limit" is not available to use.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Dynamic boosting of HWP performance on IO wake showed significant
improvement to IO workloads. This series was intended for Skylake Xeon
platforms only and feature was enabled by default based on CPU model
number.
But some Xeon platforms reused the Skylake desktop CPU model number. This
caused some undesirable side effects to some graphics workloads. Since
they are heavily IO bound, the increase in CPU performance decreased the
power available for GPU to do its computing and hence decrease in graphics
benchmark performance.
For example on a Skylake desktop, GpuTest benchmark showed average FPS
reduction from 529 to 506.
This change makes sure that HWP boost feature is only enabled for Skylake
server platforms by using ACPI FADT preferred PM Profile. If some desktop
users wants to get benefit of boost, they can still enable boost from
intel_pstate sysfs attribute "hwp_dynamic_boost".
Fixes: 41ab43c9c8 (cpufreq: intel_pstate: enable boost for Skylake Xeon)
Link: https://bugs.freedesktop.org/show_bug.cgi?id=107410
Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
With lockdep turned on, the following circular lock dependency problem
was reported:
[ 57.470040] ======================================================
[ 57.502900] WARNING: possible circular locking dependency detected
[ 57.535208] 4.18.0-0.rc3.1.el8+7.x86_64+debug #1 Tainted: G
[ 57.577761] ------------------------------------------------------
[ 57.609714] tuned/1505 is trying to acquire lock:
[ 57.633808] 00000000559deec5 (cpu_hotplug_lock.rw_sem){++++}, at: store+0x27/0x120
[ 57.672880]
[ 57.672880] but task is already holding lock:
[ 57.702184] 000000002136ca64 (kn->count#118){++++}, at: kernfs_fop_write+0x1d0/0x410
[ 57.742176]
[ 57.742176] which lock already depends on the new lock.
[ 57.742176]
[ 57.785220]
[ 57.785220] the existing dependency chain (in reverse order) is:
:
[ 58.932512] other info that might help us debug this:
[ 58.932512]
[ 58.973344] Chain exists of:
[ 58.973344] cpu_hotplug_lock.rw_sem --> subsys mutex#5 --> kn->count#118
[ 58.973344]
[ 59.030795] Possible unsafe locking scenario:
[ 59.030795]
[ 59.061248] CPU0 CPU1
[ 59.085377] ---- ----
[ 59.108160] lock(kn->count#118);
[ 59.124935] lock(subsys mutex#5);
[ 59.156330] lock(kn->count#118);
[ 59.186088] lock(cpu_hotplug_lock.rw_sem);
[ 59.208541]
[ 59.208541] *** DEADLOCK ***
In the cpufreq_register_driver() function, the lock sequence is:
cpus_read_lock --> kn->count
For the cpufreq sysfs store method, the lock sequence is:
kn->count --> cpus_read_lock
These sequences are actually safe as they are taking a share lock on
cpu_hotplug_lock. However, the current lockdep code doesn't check for
share locking when detecting circular lock dependency. Fixing that
could be a substantial effort.
Instead, we can work around this problem by using cpus_read_trylock()
in the store method which is much simpler. The chance of not getting
the read lock is very small. If that happens, the userspace application
that writes the sysfs file will get an error.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
systrace used for tracing for Android systems has carried a patch for
many years in the Android tree that traces when the cpufreq limits
change. With the help of this information, systrace can know when the
policy limits change and can visually display the data. Lets add
upstream support for the same.
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Make sure of_device_id tables are NULL terminated.
Found by coccinelle spatch "misc/of_table.cocci"
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Ilia Lin <ilia.lin@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
On HWP platforms with Turbo 3.0, the HWP capability max ratio shows the
maximum ratio of that core, which can be different than other cores. If
we show the correct maximum frequency in cpufreq sysfs via
cpuinfo_max_freq and scaling_max_freq then, user can know which cores
can run faster for pinning some high priority tasks.
Currently the max turbo frequency is shown as max frequency, which is
the max of all cores, even if some cores can't reach that frequency
even for single threaded workload.
But it is possible that max ratio in HWP capabilities is set as 0xFF or
some high invalid value (E.g. One KBL NUC). Since the actual performance
can never exceed 1 core turbo frequency from MSR TURBO_RATIO_LIMIT, we
use this as a bound check.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently, intel_pstate doesn't register if _PSS is not present on
HP Proliant systems, because it expects the firmware to take over
CPU performance scaling in that case. However, if ACPI PCCH is
present, the firmware expects the kernel to use it for CPU
performance scaling and the pcc-cpufreq driver is loaded for that.
Unfortunately, the firmware interface used by that driver is not
scalable for fundamental reasons, so pcc-cpufreq is way suboptimal
on systems with more than just a few CPUs. In fact, it is better to
avoid using it at all.
For this reason, modify intel_pstate to look for ACPI PCCH if _PSS
is not present and register if it is there. Also prevent the
pcc-cpufreq driver from trying to initialize itself if intel_pstate
has been registered already.
Fixes: fbbcdc0744 (intel_pstate: skip the driver if ACPI has power mgmt option)
Reported-by: Andreas Herrmann <aherrmann@suse.com>
Reviewed-by: Andreas Herrmann <aherrmann@suse.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Andreas Herrmann <aherrmann@suse.com>
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The firmware interface used by the pcc-cpufreq driver is
fundamentally not scalable and using it for dynamic CPU performance
scaling on systems with many CPUs leads to degraded performance.
For this reason, disable dynamic CPU performance scaling on systems
with pcc-cpufreq where the number of CPUs present at the driver init
time is greater than 4. Also make the driver print corresponding
complaints to the kernel log.
Reported-by: Andreas Herrmann <aherrmann@suse.com>
Tested-by: Andreas Herrmann <aherrmann@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If of_nvmem_cell_get() fails due to probe deferal, we shouldn't print an
error message. Just be silent in this case.
Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Per Section 8.4.7.1.3 of ACPI 6.2, the platform provides performance
feedback via set of performance counters. To determine the actual
performance level delivered over time, OSPM may read a set of
performance counters from the Reference Performance Counter Register
and the Delivered Performance Counter Register.
OSPM calculates the delivered performance over a given time period by
taking a beginning and ending snapshot of both the reference and
delivered performance counters, and calculating:
delivered_perf = reference_perf X (delta of delivered_perf counter / delta of reference_perf counter).
Implement the above and hook this up to the cpufreq->get method.
Signed-off-by: George Cherian <george.cherian@cavium.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Armada 37xx supports Adaptive Voltage Scaling and thanks to this patch a
voltage is associated to each load level.
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The cooling device should be part of the i.MX cpufreq driver, but it
cannot be removed for the sake of DT stability. So turn the cooling
device registration into a separate function and perform the
registration only if the CPU OF node does not have the #cooling-cells
property.
Use of_cpufreq_power_cooling_register in imx_thermal code to link the
cooling device to the device tree node provided.
This makes it possible to bind the cpufreq cooling device to a custom
thermal zone via a cooling-maps entry like:
cooling-maps {
map0 {
trip = <&board_alert>;
cooling-device = <&cpu0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
};
};
Assuming a cpu node exists with label "cpu0" and #cooling-cells
property.
Signed-off-by: Bastian Stender <bst@pengutronix.de>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
match_string() returns the index of an array for a matching string,
which can be used instead of open coded variant.
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We should return if get_cpu_device() fails or it leads to a NULL
dereference. Also dev_pm_opp_of_get_opp_desc_node() returns NULL on
error, it never returns error pointers.
Fixes: 46e2856b8e (cpufreq: Add Kryo CPU scaling driver)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When scaling max/min settings are changed, internally they are converted
to a ratio using the max turbo 1 core turbo frequency. This works fine
when 1 core max is same irrespective of the core. But under Turbo 3.0,
this will not be the case. For example:
Core 0: max turbo pstate: 43 (4.3GHz)
Core 1: max turbo pstate: 45 (4.5GHz)
In this case 1 core turbo ratio will be maximum of all, so it will be
45 (4.5GHz). Suppose scaling max is set to 4GHz (ratio 40) for all cores
,then on core one it will be
= max_state * policy->max / max_freq;
= 43 * (4000000/4500000) = 38 (3.8GHz)
= 38
which is 200MHz less than the desired.
On core2, it will be correctly set to ratio 40 (4GHz). Same holds true
for scaling min frequency limit. So this requires usage of correct turbo
max frequency for core one, which in this case is 4.3GHz. So we need to
adjust per CPU cpu->pstate.turbo_freq using the maximum HWP ratio of that
core.
This change uses the HWP capability of a core to adjust max turbo
frequency. But since Broadwell HWP doesn't use ratios in the HWP
capabilities, we have to use legacy max 1 core turbo ratio. This is not
a problem as the HWP capabilities don't differ among cores in Broadwell.
We need to check for non Broadwell CPU model for applying this change,
though.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 4.6+ <stable@vger.kernel.org> # 4.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add device remove and module exit code to make the driver
functioning as a loadable module.
Fixes: ac28927659 (cpufreq: kryo: allow building as a loadable module)
Signed-off-by: Ilia Lin <ilia.lin@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In event of error returned by the nvmem_cell_read() non-pointer value
may be dereferenced. Fix this with error handling.
Additionally free the allocated speedbin buffer, as per the API.
Fixes: 9ce36edd1a52 (cpufreq: Add Kryo CPU scaling driver)
Signed-off-by: Ilia Lin <ilia.lin@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Revert a recent PM core change that attempted to fix an issue
related to device links, but introduced a regression (Rafael
Wysocki).
- Fix build when the recently added cpufreq driver for Kryo
processors is selected by making it possible to build that
driver as a module (Arnd Bergmann).
- Fix the long idle detection mechanism in the out-of-band
(ondemand and conservative) cpufreq governors (Chen Yu).
- Add support for devices in multiple power domains to the
generic power domains (genpd) framework (Ulf Hansson).
- Add support for iowait boosting on systems with hardware-managed
P-states (HWP) enabled to the intel_pstate driver and make it use
that feature on systems with Skylake Xeon processors as it is
reported to improve performance significantly on those systems
(Srinivas Pandruvada).
- Fix and update the acpi_cpufreq, ti-cpufreq and imx6q cpufreq
drivers (Colin Ian King, Suman Anna, Sébastien Szymanski).
- Change the behavior of the wakeup_count device attribute in
sysfs to expose the number of events when the device might have
aborted system suspend in progress (Ravi Chandra Sadineni).
- Fix two minor issues in the cpupower utility (Abhishek Goel,
Colin Ian King).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJbIOf+AAoJEILEb/54YlRxk5EQAIyLpvR0zdp2gMaMl3rbWqtM
W6XpJbLzL4be9zHKDj4bycO6nbevPOr5oXgm3DQUaUvkLo86cUl2NJlNAv789UZR
NQ8L51WiY4hG4WDrBQntEBw7TDBUDuo6TEa2/0WJQQhj6WQP821oehmF4G+N9A9h
z9YhwbWNgivulyNy09nAcVgJ39cxUVWb9EmTXthp0KnyJzn8de+V3MxlEwJTAmHc
jma9PEil9Key2rS8LRr+djvwa6tYKydOCjkA+o6m7Fo1IVaaVydDgciG4tjnsHNV
wtEfbOZnisnkYrNEbViqQhhnsvSLkTtfAku58Ove5Kz2GPSPjyIoRrK7FUfDetr+
ZQLWq6TPzR9u2m3kQfhHB6C463bGxd4s2BntPH2RLHbs82FENEtGkHdxQOv5B1tW
Gvl9gF9ZDov6gL3jftNdhIz4rQVGaXQlY5/q+alV1I3jhyg7zddht4oh+nNt41XR
ysszEg9K62w/QAuqZeUsHaR7pPoZZDQzr3TRkKX0uvl88jq4HUPj+aKqNYxq0IrZ
uYd92gqvD7HH1UKRPqjvZ65Uj5WTbn7picAYJhTlQR4b73X0j66xDSZp/IZVpbEc
ierDftBxdwklnfxrpy19yJKgIDB89zLP0IX+3BacEC+BWguI//MOb5X0EEpcf/WK
eyG13J1wTF1qLzKDdur9
=VROk
-----END PGP SIGNATURE-----
Merge tag 'pm-4.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These revert a recent PM core change that introduced a regression, fix
the build when the recently added Kryo cpufreq driver is selected, add
support for devices attached to multiple power domains to the generic
power domains (genpd) framework, add support for iowait boosting on
systens with hardware-managed P-states (HWP) enabled to the
intel_pstate driver, modify the behavior of the wakeup_count device
attribute in sysfs, fix a few issues and clean up some ugliness,
mostly in cpufreq (core and drivers) and in the cpupower utility.
Specifics:
- Revert a recent PM core change that attempted to fix an issue
related to device links, but introduced a regression (Rafael
Wysocki)
- Fix build when the recently added cpufreq driver for Kryo
processors is selected by making it possible to build that driver
as a module (Arnd Bergmann)
- Fix the long idle detection mechanism in the out-of-band (ondemand
and conservative) cpufreq governors (Chen Yu)
- Add support for devices in multiple power domains to the generic
power domains (genpd) framework (Ulf Hansson)
- Add support for iowait boosting on systems with hardware-managed
P-states (HWP) enabled to the intel_pstate driver and make it use
that feature on systems with Skylake Xeon processors as it is
reported to improve performance significantly on those systems
(Srinivas Pandruvada)
- Fix and update the acpi_cpufreq, ti-cpufreq and imx6q cpufreq
drivers (Colin Ian King, Suman Anna, Sébastien Szymanski)
- Change the behavior of the wakeup_count device attribute in sysfs
to expose the number of events when the device might have aborted
system suspend in progress (Ravi Chandra Sadineni)
- Fix two minor issues in the cpupower utility (Abhishek Goel, Colin
Ian King)"
* tag 'pm-4.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "PM / runtime: Fixup reference counting of device link suppliers at probe"
cpufreq: imx6q: check speed grades for i.MX6ULL
cpufreq: governors: Fix long idle detection logic in load calculation
cpufreq: intel_pstate: enable boost for Skylake Xeon
PM / wakeup: Export wakeup_count instead of event_count via sysfs
PM / Domains: Add dev_pm_domain_attach_by_id() to manage multi PM domains
PM / Domains: Add support for multi PM domains per device to genpd
PM / Domains: Split genpd_dev_pm_attach()
PM / Domains: Don't attach devices in genpd with multi PM domains
PM / Domains: dt: Allow power-domain property to be a list of specifiers
cpufreq: intel_pstate: New sysfs entry to control HWP boost
cpufreq: intel_pstate: HWP boost performance on IO wakeup
cpufreq: intel_pstate: Add HWP boost utility and sched util hooks
cpufreq: ti-cpufreq: Use devres managed API in probe()
cpufreq: ti-cpufreq: Fix an incorrect error return value
cpufreq: ACPI: make function acpi_cpufreq_fast_switch() static
cpufreq: kryo: allow building as a loadable module
cpupower : Fix header name to read idle state name
cpupower: fix spelling mistake: "logilename" -> "logfilename"
- Additional struct_size() conversions (Matthew, Kees)
- Explicitly reported overflow fixes (Silvio, Kees)
- Add missing kvcalloc() function (Kees)
- Treewide conversions of allocators to use either 2-factor argument
variant when available, or array_size() and array3_size() as needed (Kees)
-----BEGIN PGP SIGNATURE-----
Comment: Kees Cook <kees@outflux.net>
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlsgVtMWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJhsJEACLYe2EbwLFJz7emOT1KUGK5R1b
oVxJog0893WyMqgk9XBlA2lvTBRBYzR3tzsadfYo87L3VOBzazUv0YZaweJb65sF
bAvxW3nY06brhKKwTRed1PrMa1iG9R63WISnNAuZAq7+79mN6YgW4G6YSAEF9lW7
oPJoPw93YxcI8JcG+dA8BC9w7pJFKooZH4gvLUSUNl5XKr8Ru5YnWcV8F+8M4vZI
EJtXFmdlmxAledUPxTSCIojO8m/tNOjYTreBJt9K1DXKY6UcgAdhk75TRLEsp38P
fPvMigYQpBDnYz2pi9ourTgvZLkffK1OBZ46PPt8BgUZVf70D6CBg10vK47KO6N2
zreloxkMTrz5XohyjfNjYFRkyyuwV2sSVrRJqF4dpyJ4NJQRjvyywxIP4Myifwlb
ONipCM1EjvQjaEUbdcqKgvlooMdhcyxfshqJWjHzXB6BL22uPzq5jHXXugz8/ol8
tOSM2FuJ2sBLQso+szhisxtMd11PihzIZK9BfxEG3du+/hlI+2XgN7hnmlXuA2k3
BUW6BSDhab41HNd6pp50bDJnL0uKPWyFC6hqSNZw+GOIb46jfFcQqnCB3VZGCwj3
LH53Be1XlUrttc/NrtkvVhm4bdxtfsp4F7nsPFNDuHvYNkalAVoC3An0BzOibtkh
AtfvEeaPHaOyD8/h2Q==
=zUUp
-----END PGP SIGNATURE-----
Merge tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull more overflow updates from Kees Cook:
"The rest of the overflow changes for v4.18-rc1.
This includes the explicit overflow fixes from Silvio, further
struct_size() conversions from Matthew, and a bug fix from Dan.
But the bulk of it is the treewide conversions to use either the
2-factor argument allocators (e.g. kmalloc(a * b, ...) into
kmalloc_array(a, b, ...) or the array_size() macros (e.g. vmalloc(a *
b) into vmalloc(array_size(a, b)).
Coccinelle was fighting me on several fronts, so I've done a bunch of
manual whitespace updates in the patches as well.
Summary:
- Error path bug fix for overflow tests (Dan)
- Additional struct_size() conversions (Matthew, Kees)
- Explicitly reported overflow fixes (Silvio, Kees)
- Add missing kvcalloc() function (Kees)
- Treewide conversions of allocators to use either 2-factor argument
variant when available, or array_size() and array3_size() as needed
(Kees)"
* tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (26 commits)
treewide: Use array_size in f2fs_kvzalloc()
treewide: Use array_size() in f2fs_kzalloc()
treewide: Use array_size() in f2fs_kmalloc()
treewide: Use array_size() in sock_kmalloc()
treewide: Use array_size() in kvzalloc_node()
treewide: Use array_size() in vzalloc_node()
treewide: Use array_size() in vzalloc()
treewide: Use array_size() in vmalloc()
treewide: devm_kzalloc() -> devm_kcalloc()
treewide: devm_kmalloc() -> devm_kmalloc_array()
treewide: kvzalloc() -> kvcalloc()
treewide: kvmalloc() -> kvmalloc_array()
treewide: kzalloc_node() -> kcalloc_node()
treewide: kzalloc() -> kcalloc()
treewide: kmalloc() -> kmalloc_array()
mm: Introduce kvcalloc()
video: uvesafb: Fix integer overflow in allocation
UBIFS: Fix potential integer overflow in allocation
leds: Use struct_size() in allocation
Convert intel uncore to struct_size
...
This branch contains platform-related driver updates for ARM and ARM64.
Highlights:
- ARM SCMI (System Control & Management Interface) driver cleanups
- Hisilicon support for LPC bus w/ ACPI
- Reset driver updates for several platforms: Uniphier,
- Rockchip power domain bindings and hardware descriptions for several SoCs.
- Tegra memory controller reset improvements
-----BEGIN PGP SIGNATURE-----
iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAlsfB94PHG9sb2ZAbGl4
b20ubmV0AAoJEIwa5zzehBx3k2IP/i9T71QoanZ3k6o/d+YUqmTuUiA+EJWFANry
8KSjBKmYDON/GLgRCiNZR8P0NZ3d1LgFk5gZDdhMrOtoGtd8k8q0KyqLxjKAWHt6
opSrGucmE1gy9FvJdUkK+y148vM+Ea4SXRVOZxbLV5qm3inPwnopJjgKAfnhIn4X
QmkSca90CyEc3kPdBdfMeAKL+7SRb4mbFHAXXVE7QiWvjrEjUkvtNVTazf5Nroc4
PbI97zSFrmSFO4ZK0jZHCd4R2xhsJwzDQ/UKHC9C9/IdFMLfnJ7dxIf97QYn41Kl
H46FneMZZZ1FibN+Mj5hC/tByE8FrMtWh636z031s6kkamSqLiBAZFlGpHABxQJs
3tN1vBP40R7hzm76yQAC4Uopr5xOtmLr6KBMBBRr+Axf9jHMS4m/WP1chwZFpFjI
Awxc0VCjBUm+haHvK85J4eHrzbWPjG+8aV5Ar5DHVo8et3MzCdX0ycoDeUT787qc
qzEcCjGPbXHBR1aXUX8stRW5x8zoGH/4IUYMo5IGadiFuXSna6ERG9IHq3fAU5Fp
ZzNNKedtodn9NoMr3NJJk1ndyrUr0lpXwlVqFeksRTa+INk2FHKd0cQfxwV33kS9
wHXw+v323uxa3Tz2TXKS7PavY5yr6fZ0dLC2+xEDqHq6bsLxo1DnBEnaola+Jg+u
9hKEuSff
=xs+f
-----END PGP SIGNATURE-----
Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver updates from Olof Johansson:
"This contains platform-related driver updates for ARM and ARM64.
Highlights:
- ARM SCMI (System Control & Management Interface) driver cleanups
- Hisilicon support for LPC bus w/ ACPI
- Reset driver updates for several platforms: Uniphier,
- Rockchip power domain bindings and hardware descriptions for
several SoCs.
- Tegra memory controller reset improvements"
* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (59 commits)
ARM: tegra: fix compile-testing PCI host driver
soc: rockchip: power-domain: add power domain support for px30
dt-bindings: power: add binding for px30 power domains
dt-bindings: power: add PX30 SoCs header for power-domain
soc: rockchip: power-domain: add power domain support for rk3228
dt-bindings: power: add binding for rk3228 power domains
dt-bindings: power: add RK3228 SoCs header for power-domain
soc: rockchip: power-domain: add power domain support for rk3128
dt-bindings: power: add binding for rk3128 power domains
dt-bindings: power: add RK3128 SoCs header for power-domain
soc: rockchip: power-domain: add power domain support for rk3036
dt-bindings: power: add binding for rk3036 power domains
dt-bindings: power: add RK3036 SoCs header for power-domain
dt-bindings: memory: tegra: Remove Tegra114 SATA and AFI reset definitions
memory: tegra: Remove Tegra114 SATA and AFI reset definitions
memory: tegra: Register SMMU after MC driver became ready
soc: mediatek: remove unneeded semicolon
soc: mediatek: add a fixed wait for SRAM stable
soc: mediatek: introduce a CAPS flag for scp_domain_data
soc: mediatek: reuse regmap_read_poll_timeout helpers
...
Check the max speed supported from the fuses for i.MX6ULL and update the
operating points table accordingly.
Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Stefan Agner <stefan@agner.ch>
Reviewed-by: Stefan Agner <stefan@agner.ch>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
According to current code implementation, detecting the long
idle period is done by checking if the interval between two
adjacent utilization update handlers is long enough. Although
this mechanism can detect if the idle period is long enough
(no utilization hooks invoked during idle period), it might
not cover a corner case: if the task has occupied the CPU
for too long which causes no context switches during that
period, then no utilization handler will be launched until this
high prio task is scheduled out. As a result, the idle_periods
field might be calculated incorrectly because it regards the
100% load as 0% and makes the conservative governor who uses
this field confusing.
Change the detection to compare the idle_time with sampling_rate
directly.
Reported-by: Artem S. Tashkinov <t.artem@mailcity.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Enable HWP boost on Skylake server and workstations.
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A new attribute is added to intel_pstate sysfs to enable/disable
HWP dynamic performance boost.
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This change uses SCHED_CPUFREQ_IOWAIT flag to boost HWP performance.
Since SCHED_CPUFREQ_IOWAIT flag is set frequently, we don't start
boosting steps unless we see two consecutive flags in two ticks. This
avoids boosting due to IO because of regular system activities.
To avoid synchronization issues, the actual processing of the flag is
done on the local CPU callback.
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Added two utility functions to HWP boost up gradually and boost down to
the default cached HWP request values.
Boost up:
Boost up updates HWP request minimum value in steps. This minimum value
can reach upto at HWP request maximum values depends on how frequently,
this boost up function is called. At max, boost up will take three steps
to reach the maximum, depending on the current HWP request levels and HWP
capabilities. For example, if the current settings are:
If P0 (Turbo max) = P1 (Guaranteed max) = min
No boost at all.
If P0 (Turbo max) > P1 (Guaranteed max) = min
Should result in one level boost only for P0.
If P0 (Turbo max) = P1 (Guaranteed max) > min
Should result in two level boost:
(min + p1)/2 and P1.
If P0 (Turbo max) > P1 (Guaranteed max) > min
Should result in three level boost:
(min + p1)/2, P1 and P0.
We don't set any level between P0 and P1 as there is no guarantee that
they will be honored.
Boost down:
After the system is idle for hold time of 3ms, the HWP request is reset
to the default value from HWP init or user modified one via sysfs.
Caching of HWP Request and Capabilities
Store the HWP request value last set using MSR_HWP_REQUEST and read
MSR_HWP_CAPABILITIES. This avoid reading of MSRs in the boost utility
functions.
These boost utility functions calculated limits are based on the latest
HWP request value, which can be modified by setpolicy() callback. So if
user space modifies the minimum perf value, that will be accounted for
every time the boost up is called. There will be case when there can be
contention with the user modified minimum perf, in that case user value
will gain precedence. For example just before HWP_REQUEST MSR is updated
from setpolicy() callback, the boost up function is called via scheduler
tick callback. Here the cached MSR value is already the latest and limits
are updated based on the latest user limits, but on return the MSR write
callback called from setpolicy() callback will update the HWP_REQUEST
value. This will be used till next time the boost up function is called.
In addition add a variable to control HWP dynamic boosting. When HWP
dynamic boost is active then set the HWP specific update util hook. The
contents in the utility hooks will be filled in the subsequent patches.
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The ti_cpufreq_probe() function uses regular kzalloc to allocate
the ti_cpufreq_data structure and kfree for freeing this memory
on failures. Simplify this code by using the devres managed
API.
Signed-off-by: Suman Anna <s-anna@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 05829d9431 (cpufreq: ti-cpufreq: kfree opp_data when
failure) has fixed a memory leak in the failure path, however
the patch returned a positive value on get_cpu_device() failure
instead of the previous negative value. Fix this incorrect error
return value properly.
Fixes: 05829d9431 (cpufreq: ti-cpufreq: kfree opp_data when failure)
Cc: 4.14+ <stable@vger.kernel.org> # v4.14+
Signed-off-by: Suman Anna <s-anna@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The acpi_cpufreq_fast_switch() function is local to the source and
does not need to be in global scope, so make it static.
Cleans up sparse warning:
drivers/cpufreq/acpi-cpufreq.c:468:14: warning: symbol
'acpi_cpufreq_fast_switch' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Building the kryo cpufreq driver while QCOM_SMEM is a loadable module
results in a link error:
drivers/cpufreq/qcom-cpufreq-kryo.o: In function `qcom_cpufreq_kryo_probe':
qcom-cpufreq-kryo.c:(.text+0xbc): undefined reference to `qcom_smem_get'
The problem is that Kconfig ignores interprets the dependency as met
when the dependent symbol is a 'bool' one. By making it 'tristate',
it will be forced to be a module here, which builds successfully.
Fixes: 46e2856b8e (cpufreq: Add Kryo CPU scaling driver)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
These update the ACPICA code in the kernel to the 20180508 upstream
revision and make it support the RT patch, add CPPC v3 support to the
ACPI CPPC library, add a WDAT-based watchdog quirk to prevent clashes
with the RTC, add quirks to the ACPI AC and battery drivers, and
update the ACPI SoC drivers.
Specifics:
- Update the ACPICA code in the kernel to the 20180508 upstream
revision including:
* iASL -tc option enhancement (Bob Moore).
* Debugger improvements (Bob Moore).
* Support for tables larger than 1 MB in acpidump/acpixtract
(Bob Moore).
* Minor fixes and cleanups (Colin Ian King, Toomas Soome).
- Make the ACPICA code in the kernel support the RT patch (Sebastian
Andrzej Siewior, Steven Rostedt).
- Add a kmemleak annotation to the ACPICA code (Larry Finger).
- Add CPPC v3 support to the ACPI CPPC library and fix two issues
related to CPPC (Prashanth Prakash, Al Stone).
- Add an ACPI WDAT-based watchdog quirk to prefer iTCO_wdt on
systems where WDAT clashes with the RTC SRAM (Mika Westerberg).
- Add some quirks to the ACPI AC and battery drivers (Carlo Caione,
Hans de Goede).
- Update the ACPI SoC drivers for Intel (LPSS) and AMD (APD)
platforms (Akshu Agrawal, Hans de Goede).
- Fix up some assorted minor issues (Al Stone, Laszlo Toth,
Mathieu Malaterre).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJbFR2IAAoJEILEb/54YlRxsEcP/0vxyGvTXUnHH13h1ayhbcEL
qKA//1Zvw55mFb86nlnIVWUfpVx5pN4sOd2MZL2bDmNP8+ZX1QFbwtkpP8MtnRyX
0pmPb2QpmHz/ZHutVg3rGxFOV4Sk0IPevQW7d4eu/Gc6K/UkfREMj10PWu9EoNWm
6UCKuONxeoUAtE/OTfvU33ghpYgBH506Kg6/EwLLRX0tA10NBa11L9ijUEkVfa0s
RDn5Z9ndjGGed4XtlSNkfabEJpc6uX9JvbAw77DKZgyqEZgQyNa4CZ2C1ZtH66lZ
HCS62eJ6ePGUvzW/KTn825+MOGni5YisGlfonHuF9xHpPNSp7Doxmyny3lMYHL7S
l0XdI2G7UIOFO5eaRM/zYQPQyF05lna28MVvTWJWBA3bcUz4rav9fzPKYp5l+To0
xK1Ol84sowlkOKoi/RIcOx5nsh05SufGqNu9RuXjRZ4iK0bJPCD3YNt+tH95b94M
0YOuoDkIylieVET4xgHB5vBOj8EgMOLtSaGSpEbh816i+BdZMx7YnS4xSmQ+JFjd
rVCJMnXcLZ82j3pkvPpR0W3VLVbEJPjLftENjSVJ+vA9lR567byBFrvzTEmsEQoi
KlNWh5qDmoR96dPheLdhD1HtsL070xLhyqYOUDvtqTiVH8uNxoWfQzcSfvXPvLqc
XGCcG/oPP9FFpNZfLzAB
=FAIF
-----END PGP SIGNATURE-----
Merge tag 'acpi-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These update the ACPICA code in the kernel to the 20180508 upstream
revision and make it support the RT patch, add CPPC v3 support to the
ACPI CPPC library, add a WDAT-based watchdog quirk to prevent clashes
with the RTC, add quirks to the ACPI AC and battery drivers, and
update the ACPI SoC drivers.
Specifics:
- Update the ACPICA code in the kernel to the 20180508 upstream
revision including:
* iASL -tc option enhancement (Bob Moore).
* Debugger improvements (Bob Moore).
* Support for tables larger than 1 MB in acpidump/acpixtract (Bob
Moore).
* Minor fixes and cleanups (Colin Ian King, Toomas Soome).
- Make the ACPICA code in the kernel support the RT patch (Sebastian
Andrzej Siewior, Steven Rostedt).
- Add a kmemleak annotation to the ACPICA code (Larry Finger).
- Add CPPC v3 support to the ACPI CPPC library and fix two issues
related to CPPC (Prashanth Prakash, Al Stone).
- Add an ACPI WDAT-based watchdog quirk to prefer iTCO_wdt on systems
where WDAT clashes with the RTC SRAM (Mika Westerberg).
- Add some quirks to the ACPI AC and battery drivers (Carlo Caione,
Hans de Goede).
- Update the ACPI SoC drivers for Intel (LPSS) and AMD (APD)
platforms (Akshu Agrawal, Hans de Goede).
- Fix up some assorted minor issues (Al Stone, Laszlo Toth, Mathieu
Malaterre)"
* tag 'acpi-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
ACPICA: Mark acpi_ut_create_internal_object_dbg() memory allocations as non-leaks
ACPI / watchdog: Prefer iTCO_wdt always when WDAT table uses RTC SRAM
mailbox: PCC: erroneous error message when parsing ACPI PCCT
ACPICA: Update version to 20180508
ACPICA: acpidump/acpixtract: Support for tables larger than 1MB
ACPI: APD: Add AMD misc clock handler support
clk: x86: Add ST oscout platform clock
ACPICA: Update version to 20180427
ACPICA: Debugger: Removed direct support for EC address space in "Test Objects"
ACPICA: Debugger: Add Package support for "test objects" command
ACPICA: Improve error messages for the namespace root node
ACPICA: Fix potential infinite loop in acpi_rs_dump_byte_list
ACPICA: vsnprintf: this statement may fall through
ACPICA: Tables: Fix spelling mistake in comment
ACPICA: iASL: Enhance the -tc option (create AML hex file in C)
ACPI: Add missing prototype_for arch_post_acpi_subsys_init()
ACPI / tables: improve comments regarding acpi_parse_entries_array()
ACPICA: Convert acpi_gbl_hardware lock back to an acpi_raw_spinlock
ACPICA: provide abstraction for raw_spinlock_t
ACPI / CPPC: Fix invalid PCC channel status errors
...
* acpi-cppc:
mailbox: PCC: erroneous error message when parsing ACPI PCCT
ACPI / CPPC: Fix invalid PCC channel status errors
ACPI / CPPC: Document CPPC sysfs interface
cpufreq / CPPC: Support for CPPC v3
ACPI / CPPC: Check for valid PCC subspace only if PCC is used
ACPI / CPPC: Add support for CPPC v3
* acpi-misc:
ACPI: Add missing prototype_for arch_post_acpi_subsys_init()
ACPI: add missing newline to printk
* acpi-battery:
ACPI / battery: Add quirk to avoid checking for PMIC with native driver
ACPI / battery: Ignore AC state in handle_discharging on systems where it is broken
ACPI / battery: Add handling for devices which wrongly report discharging state
ACPI / battery: Remove initializer for unused ident dmi_system_id
ACPI / AC: Remove initializer for unused ident dmi_system_id
* acpi-ac:
ACPI / AC: Add quirk to avoid checking for PMIC with native driver
In Certain QCOM SoCs like apq8096 and msm8996 that have KRYO processors,
the CPU frequency subset and voltage value of each OPP varies
based on the silicon variant in use. Qualcomm Process Voltage Scaling Tables
defines the voltage and frequency value based on the msm-id in SMEM
and speedbin blown in the efuse combination.
The qcom-cpufreq-kryo driver reads the msm-id and efuse value from the SoC
to provide the OPP framework with required information.
This is used to determine the voltage and frequency value for each OPP of
operating-points-v2 table when it is parsed by the OPP framework.
Signed-off-by: Ilia Lin <ilialin@codeaurora.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Tested-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Use the static SRCU initializer for `cpufreq_transition_notifier_list'.
This avoids the init_cpufreq_transition_notifier_list() initcall. Its
only purpose is to initialize the SRCU notifier once during boot and set
another variable which is used as an indicator whether the init was
perfromed before cpufreq_register_notifier() was used.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If the policy limits are updated via cpufreq_update_policy() and
subsequently via sysfs, the limits stored in user_policy may be
set incorrectly.
For example, if both min and max are set via sysfs to the maximum
available frequency, user_policy.min and user_policy.max will also
be the maximum. If a policy notifier triggered by
cpufreq_update_policy() lowers both the min and the max at this
point, that change is not reflected by the user_policy limits, so
if the max is updated again via sysfs to the same lower value,
then user_policy.max will be lower than user_policy.min which
shouldn't happen. In particular, if one of the policy CPUs is
then taken offline and back online, cpufreq_set_policy() will
fail for it due to a failing limits check.
To prevent that from happening, initialize the min and max fields
of the new_policy object to the ones stored in user_policy that
were previously set via sysfs.
Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Subject & changelog ]
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This routine checks if the CPU running this code belongs to the policy
of the target CPU or if not, can it do remote DVFS for it remotely. But
the current name of it implies as if it is only about doing remote
updates.
Rename it to make it more relevant.
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently tegra20-cpufreq kernel module isn't getting autoloaded because
there is no device associated with the module, this is one of two patches
that resolves the module autoloading. This patch adds a module alias that
will associate the tegra20-cpufreq kernel module with the platform device,
other patch will instantiate the actual platform device. And now it makes
sense to wrap cpufreq driver into a platform driver for consistency.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Nothing prevents Tegra20 CPUFreq module to be unloaded, hence allow it to
be built as a non-builtin kernel module.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Don't even try to request the clocks during of module initialization on
non-Tegra20 machines (this is the case for a multi-platform kernel) for
consistency.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove checking of the CPU number for consistency as it won't ever fail
unless there is a severe bug in the cpufreq core.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>