linux

Commit Graph

Author	SHA1	Message	Date
Ulf Hansson	7fbee48ea0	cpuidle: psci: Split psci_dt_cpu_init_idle() To make the code a bit more readable, let's move the OSI specific initialization out of the psci_dt_cpu_init_idle() and into a separate function. Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-03-14 12:03:05 +01:00
Maciej S. Szmigiero	dd52551fb7	cpuidle: haltpoll: allow force loading on hosts without the REALTIME hint Before commit `1328edca4a` ("cpuidle-haltpoll: Enable kvm guest polling when dedicated physical CPUs are available") the cpuidle-haltpoll driver could also be used in scenarios when the host does not advertise the KVM_HINTS_REALTIME hint. While the behavior introduced by the aforementioned commit makes sense as the default there are cases where the old behavior is desired, for example, when other kernel changes triggered by presence by this hint are unwanted, for some workloads where the latency benefit from polling overweights the loss from idle CPU capacity that otherwise would be available, or just when running under older Qemu versions that lack this hint. Let's provide a typical "force" module parameter that allows restoring the old behavior. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-03-14 11:39:07 +01:00
Dmitry Osipenko	382ac8e22b	cpuidle: tegra: Disable CC6 state if LP2 unavailable LP2 suspending could be unavailable, for example if it is disabled in a device-tree. CC6 cpuidle state won't work in that case. Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2020-03-13 11:32:01 +01:00
Dmitry Osipenko	14e086baca	cpuidle: tegra: Squash Tegra114 driver into the common driver Tegra20/30/114/124 SoCs have common idling states, thus there is no much point in having separate drivers for a similar hardware. This patch moves Tegra114/124 arch/ drivers into the common driver without any functional changes. The CC6 state is kept disabled on Tegra114/124 because the core Tegra PM code needs some more work in order to support that state. Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Thierry Reding <treding@nvidia.com>	2020-03-13 11:32:01 +01:00
Dmitry Osipenko	19461a499c	cpuidle: tegra: Squash Tegra30 driver into the common driver Tegra20 and Terga30 SoCs have common C1 and CC6 idling states and thus share the same code paths, there is no point in having separate drivers for a similar hardware. This patch merely moves functionality of the old driver into the new, although the CC6 state is kept disabled for now since old driver had a rudimentary support for this state (allowing to enter into CC6 only when secondary CPUs are put offline), while new driver can provide a full-featured support. The new feature will be enabled by another patch. Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com> Tested-by: Peter Geis <pgwipeout@gmail.com> Tested-by: Jasper Korten <jja2000@gmail.com> Tested-by: David Heidelberg <david@ixit.cz> Tested-by: Nicolas Chauvet <kwizart@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2020-03-13 11:32:01 +01:00
Dmitry Osipenko	860fbde438	cpuidle: Refactor and move out NVIDIA Tegra20 driver into drivers/cpuidle The driver's code is refactored in a way that will make it easy to support Tegra30/114/124 SoCs by this unified driver later on. The current functionality is equal to the old Tegra20 driver, only the code's structure changed a tad. This is also a proper platform driver now. Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Thierry Reding <treding@nvidia.com>	2020-03-13 11:31:58 +01:00
Rafael J. Wysocki	f60ccc3558	cpuidle: Call cpu_latency_qos_limit() instead of pm_qos_request() Call cpu_latency_qos_limit() instead of pm_qos_request(), because the latter is going to be dropped. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Tested-by: Amit Kucheria <amit.kucheria@linaro.org>	2020-02-13 11:26:57 +01:00
Rafael J. Wysocki	3a4a004222	PM: QoS: Drop PM_QOS_CPU_DMA_LATENCY notifier chain Notice that pm_qos_remove_notifier() is not used at all and the only caller of pm_qos_add_notifier() is the cpuidle core, which only needs the PM_QOS_CPU_DMA_LATENCY notifier to invoke wake_up_all_idle_cpus() upon changes of the PM_QOS_CPU_DMA_LATENCY target value. First, to ensure that wake_up_all_idle_cpus() will be called whenever the PM_QOS_CPU_DMA_LATENCY target value changes, modify the pm_qos_add/update/remove_request() family of functions to check if the effective constraint for the PM_QOS_CPU_DMA_LATENCY has changed and call wake_up_all_idle_cpus() directly in that case. Next, drop the PM_QOS_CPU_DMA_LATENCY notifier from cpuidle as it is not necessary any more. Finally, drop both pm_qos_add_notifier() and pm_qos_remove_notifier(), as they have no callers now, along with cpu_dma_lat_notifier which is only used by them. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Tested-by: Amit Kucheria <amit.kucheria@linaro.org>	2020-02-13 11:26:27 +01:00
Linus Torvalds	eab3540562	ARM: SoC-related driver updates Various driver updates for platforms: - Nvidia: Fuse support for Tegra194, continued memory controller pieces for Tegra30 - NXP/FSL: Refactorings of QuickEngine drivers to support ARM/ARM64/PPC - NXP/FSL: i.MX8MP SoC driver pieces - TI Keystone: ring accelerator driver - Qualcomm: SCM driver cleanup/refactoring + support for new SoCs. - Xilinx ZynqMP: feature checking interface for firmware. Mailbox communication for power management - Overall support patch set for cpuidle on more complex hierarchies (PSCI-based) + Misc cleanups, refactorings of Marvell, TI, other platforms. -----BEGIN PGP SIGNATURE----- iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAl4+lTYPHG9sb2ZAbGl4 b20ubmV0AAoJEIwa5zzehBx3nQcQAJm91+6hZbmMjlBySGS7ISjYvOcrI/hMgiOl uhhEP0Dcylvf9A9x3wcIbLwixe+2pvie9DQh2u5F80ShYimidtFi/2xCfuTb9fKu sxxKjrXWyVKhkpW0z+tedY08ftVhkwwcyD4m2C7uVl6AwTP7c367vFeU7XjF2APn drfgmgbjm8U3XbSyAqv+k6z6tyqaCnFM7vbPupSKHgHJ3mfByxOa+XyBN2RdgBbs 0KrVfbXGv80zFIFrMPwaWG7G52bu7K68nVdgy44MpKdRZ6QTjhnR+kerFxHsYgV4 bM55Fya52nTCSTGdKaQakDtKwbAUdCDTSkxgOHGcQoyFi0R/VaEUJtcysnvLbI6c +n/yFIzGyEdXcvIzfv2SoDYhogw19I6RR/M9K5Ni29eazkDVYx2z3rI+2QYeqCiF u7cq52gW6JLP0SI/9kuUrRFiR8v19Ixap7qokAxgqQwYB3NzT8a7WsYPkzdpDZGQ ETSDFMyBWT6UvBe/HWkQluBabbet53rG8BF0OHFrQuMK0u/ieKgSGuTB9XN2djEW PHMOMz2vhi+8XTfpkskhF2tTxlA/k4R6QwCdIMpIkMRVnVQCh1XdPr3Fi2NrgB+S kIXHD4vV6zLYh04zHyKewSPHAXWgraFpg2qKnvL5+KWMTnW6QH+RNjOt9xKDNXOd +iDXpOad =ONtb -----END PGP SIGNATURE----- Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC-related driver updates from Olof Johansson: "Various driver updates for platforms: - Nvidia: Fuse support for Tegra194, continued memory controller pieces for Tegra30 - NXP/FSL: Refactorings of QuickEngine drivers to support ARM/ARM64/PPC - NXP/FSL: i.MX8MP SoC driver pieces - TI Keystone: ring accelerator driver - Qualcomm: SCM driver cleanup/refactoring + support for new SoCs. - Xilinx ZynqMP: feature checking interface for firmware. Mailbox communication for power management - Overall support patch set for cpuidle on more complex hierarchies (PSCI-based) and misc cleanups, refactorings of Marvell, TI, other platforms" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (166 commits) drivers: soc: xilinx: Use mailbox IPI callback dt-bindings: power: reset: xilinx: Add bindings for ipi mailbox drivers: soc: ti: knav_qmss_queue: Pass lockdep expression to RCU lists MAINTAINERS: Add brcmstb PCIe controller entry soc/tegra: fuse: Unmap registers once they are not needed anymore soc/tegra: fuse: Correct straps' address for older Tegra124 device trees soc/tegra: fuse: Warn if straps are not ready soc/tegra: fuse: Cache values of straps and Chip ID registers memory: tegra30-emc: Correct error message for timed out auto calibration memory: tegra30-emc: Firm up hardware programming sequence memory: tegra30-emc: Firm up suspend/resume sequence soc/tegra: regulators: Do nothing if voltage is unchanged memory: tegra: Correct reset value of xusb_hostr soc/tegra: fuse: Add APB DMA dependency for Tegra20 bus: tegra-aconnect: Remove PM_CLK dependency dt-bindings: mediatek: add MT6765 power dt-bindings soc: mediatek: cmdq: delete not used define memory: tegra: Add support for the Tegra194 memory controller memory: tegra: Only include support for enabled SoCs memory: tegra: Support DVFS on Tegra186 and later ...	2020-02-08 14:04:19 -08:00
Rafael J. Wysocki	e6cf623ba3	Merge branch 'intel_idle+acpi' Merge changes updating the ACPI processor driver in order to export acpi_processor_evaluate_cst() to the code outside of it and adding ACPI support to the intel_idle driver based on that. * intel_idle+acpi: Documentation: admin-guide: PM: Add intel_idle document intel_idle: Use ACPI _CST on server systems intel_idle: Add module parameter to prevent ACPI _CST from being used intel_idle: Allow ACPI _CST to be used for selected known processors cpuidle: Allow idle states to be disabled by default intel_idle: Use ACPI _CST for processor models without C-state tables intel_idle: Refactor intel_idle_cpuidle_driver_init() ACPI: processor: Export acpi_processor_evaluate_cst() ACPI: processor: Make ACPI_PROCESSOR_CSTATE depend on ACPI_PROCESSOR ACPI: processor: Clean up acpi_processor_evaluate_cst() ACPI: processor: Introduce acpi_processor_evaluate_cst() ACPI: processor: Export function to claim _CST control	2020-01-23 00:35:50 +01:00
Benjamin Gaignard	cefb9409ff	cpuidle: fix cpuidle_find_deepest_state() kerneldoc warnings Fix cpuidle_find_deepest_state() kernel documentation to avoid warnings when compiling with W=1. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-01-23 00:33:31 +01:00
Benjamin Gaignard	a09da3fbc1	cpuidle: sysfs: fix warnings when compiling with W=1 Fix kernel documentation comments to remove warnings when compiling with W=1. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-01-23 00:31:31 +01:00
Benjamin Gaignard	32014c86d4	cpuidle: coupled: fix warnings when compiling with W=1 Fix warnings that show up when compiling with W=1 Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-01-23 00:30:28 +01:00
Rafael J. Wysocki	f7d50a1534	Merge back cpuidle material for v5.6.	2020-01-17 00:18:31 +01:00
Krzysztof Kozlowski	53eb82b097	cpuidle: arm: Enable compile testing for some of drivers Some of cpuidle drivers for ARMv7 can be compile tested on this architecture because they do not depend on mach-specific bits. Enable compile testing for big.LITTLE, Kirkwood, Zynq, AT91, Exynos and mvebu cpuidle drivers. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-01-14 23:48:46 +01:00
Ikjoon Jang	57388a2ccb	cpuidle: teo: Fix intervals[] array indexing bug Fix a simple bug in rotating array index. Fixes: `b26bf6ab71` ("cpuidle: New timer events oriented governor for tickless systems") Signed-off-by: Ikjoon Jang <ikjn@chromium.org> Cc: 5.1+ <stable@vger.kernel.org> # 5.1+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-01-13 11:14:58 +01:00
Rafael J. Wysocki	577a2f41f4	cpuidle: Drop unused cpuidle_driver_ref/unref() functions The cpuidle_driver_ref() and cpuidle_driver_unref() functions are not used and the refcnt field in struct cpuidle_driver operated by them is not updated anywhere else (so it is permanently equal to 0), so drop both of them along with refcnt. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>	2020-01-09 16:47:22 +01:00
Ulf Hansson	a65a397f24	cpuidle: psci: Add support for PM domains by using genpd When the hierarchical CPU topology layout is used in DT and the PSCI OSI mode is supported by the PSCI FW, let's initialize a corresponding PM domain topology by using genpd. This enables a CPU and a group of CPUs, when attached to the topology, to be power-managed accordingly. To trigger the attempt to initialize the genpd data structures let's use a subsys_initcall, which should be early enough to allow CPUs, but also other devices to be attached. The initialization consists of parsing the PSCI OF node for the topology and the "domain idle states" DT bindings. In case the idle states are compatible with "domain-idle-state", the initialized genpd becomes responsible of selecting an idle state for the PM domain, via assigning it a genpd governor. Note that, a successful initialization of the genpd data structures, is followed by a call to psci_set_osi_mode(), as to try to enable the OSI mode in the PSCI FW. In case this fails, we fall back into a degraded mode rather than bailing out and returning error codes. Co-developed-by: Lina Iyer <lina.iyer@linaro.org> Signed-off-by: Lina Iyer <lina.iyer@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:52:57 +01:00
Ulf Hansson	9c6ceecb65	cpuidle: psci: Support CPU hotplug for the hierarchical model When the hierarchical CPU topology is used and when a CPU is put offline, that CPU prevents its PM domain from being powered off, which is because genpd observes the corresponding attached device as being active from a runtime PM point of view. Furthermore, any potential master PM domains are also prevented from being powered off. To address this limitation, let's add add a new CPU hotplug state (CPUHP_AP_CPU_PM_STARTING) and register up/down callbacks for it, which allows us to deal with runtime PM accordingly. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:52:18 +01:00
Ulf Hansson	ce85aef570	cpuidle: psci: Manage runtime PM in the idle path In case we have succeeded to attach a CPU to its PM domain, let's deploy runtime PM support for the corresponding attached device, to allow the CPU to be powered-managed accordingly. The triggering point for when runtime PM reference counting should be done, has been selected to the deepest idle state for the CPU. However, from the hierarchical point view, there may be good reasons to do runtime PM reference counting even on shallower idle states, but at this point this isn't supported, mainly due to limitations set by the generic PM domain. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:52:06 +01:00
Ulf Hansson	a0cf319460	cpuidle: psci: Prepare to use OS initiated suspend mode via PM domains The per CPU variable psci_power_state, contains an array of fixed values, which reflects the corresponding arm,psci-suspend-param parsed from DT, for each of the available CPU idle states. This isn't sufficient when using the hierarchical CPU topology in DT, in combination with having PSCI OS initiated (OSI) mode enabled. More precisely, in OSI mode, Linux is responsible of telling the PSCI FW what idle state the cluster (a group of CPUs) should enter, while in PSCI Platform Coordinated (PC) mode, each CPU independently votes for an idle state of the cluster. For this reason, introduce a per CPU variable called domain_state and implement two helper functions to read/write its value. Then let the domain_state take precedence over the regular selected state, when entering and idle state. To avoid executing the above OSI specific code in the ->enter() callback, while operating in the default PSCI Platform Coordinated mode, let's also add a new enter-function and use it for OSI. Co-developed-by: Lina Iyer <lina.iyer@linaro.org> Signed-off-by: Lina Iyer <lina.iyer@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:50:34 +01:00
Ulf Hansson	8554951a4d	cpuidle: psci: Attach CPU devices to their PM domains In order to enable a CPU to be power managed through its PM domain, let's try to attach it by calling psci_dt_attach_cpu() during the cpuidle initialization. psci_dt_attach_cpu() returns a pointer to the attached struct device, which later should be used for runtime PM, hence we need to store it somewhere. Rather than adding yet another per CPU variable, let's create a per CPU struct to collect the relevant per CPU variables. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:50:28 +01:00
Ulf Hansson	a5e0454cf3	cpuidle: psci: Add a helper to attach a CPU to its PM domain Introduce a PSCI DT helper function, psci_dt_attach_cpu(), which takes a CPU number as an in-parameter and tries to attach the CPU's struct device to its corresponding PM domain. Let's makes use of dev_pm_domain_attach_by_name(), as it allows us to specify "psci" as the "name" of the PM domain to attach to. Additionally, let's also prepare the attached device to be power managed via runtime PM. Note that, the implementation of the new helper function is in a new separate c-file, which may seems a bit too much at this point. However, subsequent changes that implements the remaining part of the PM domain support for cpuidle-psci, helps to justify this split. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:50:24 +01:00
Ulf Hansson	f08cfbfa4f	cpuidle: psci: Support hierarchical CPU idle states Currently CPU's idle states are represented using the flattened model. Let's add support for the hierarchical layout, via converting to use of_get_cpu_state_node(). Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:50:19 +01:00
Ulf Hansson	1595e4b09b	cpuidle: psci: Simplify OF parsing of CPU idle state nodes Iterating through the idle state nodes in DT, to find out the number of states that needs to be allocated is unnecessary, as it has already been done from dt_init_idle_driver(). Therefore, drop the iteration and use the number we already have at hand. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:50:15 +01:00
Lina Iyer	778f173eb4	cpuidle: dt: Support hierarchical CPU idle states Currently CPU's idle states are represented using the flattened model. Let's add support for the hierarchical layout, via converting to use of_get_cpu_state_node(). Suggested-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Lina Iyer <lina.iyer@linaro.org> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Co-developed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:50:08 +01:00
Sudeep Holla	4386aa866d	cpuidle: psci: Align psci_power_state count with idle state count Instead of allocating 'n-1' states in psci_power_state to manage 'n' idle states which include "ARM WFI" state, it would be simpler to have 1:1 mapping between psci_power_state and cpuidle driver states. ARM WFI state(i.e. idx == 0) is handled specially in the generic macro CPU_PM_CPU_IDLE_ENTER_PARAM and hence state[-1] is not possible. However for sake of code readability, it is better to have 1:1 mapping and not use [idx - 1] to access psci_power_state corresponding to driver cpuidle state for idx. psci_power_state[0] is default initialised to 0 and is never accessed while entering WFI state. Reported-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:42:13 +01:00
Rafael J. Wysocki	75a8026741	cpuidle: Allow idle states to be disabled by default In certain situations it may be useful to prevent some idle states from being used by default while allowing user space to enable them later on. For this purpose, introduce a new state flag, CPUIDLE_FLAG_OFF, to mark idle states that should be disabled by default, make the core set CPUIDLE_STATE_DISABLED_BY_USER for those states at the initialization time and add a new state attribute in sysfs, "default_status", to inform user space of the initial status of the given idle state ("disabled" if CPUIDLE_FLAG_OFF is set for it, "enabled" otherwise). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-12-27 11:02:08 +01:00
Yangtao Li	85c3ebd4a0	cpuidle: kirkwood: convert to devm_platform_ioremap_resource() Use devm_platform_ioremap_resource() to simplify code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-12-20 10:05:44 +01:00
Yangtao Li	22c48a439d	cpuidle: clps711x: convert to devm_platform_ioremap_resource() Use devm_platform_ioremap_resource() to simplify code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-12-20 10:04:27 +01:00
Rafael J. Wysocki	d4d8140176	cpuidle: Drop unnecessary type cast in cpuidle_poll_time() The data type of the target_residency_ns field in struct cpuidle_state is u64, so it does not need to be cast into u64. Get rid of the unnecessary type cast. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-12-12 17:56:08 +01:00
Rafael J. Wysocki	b0142d66f4	cpuidle: Fix cpuidle_driver_state_disabled() It turns out that cpuidle_driver_state_disabled() can be called before registering the cpufreq driver on some platforms, which was not expected when it was introduced and which leads to a NULL pointer dereference when trying to walk the CPUs associated with the given cpuidle driver. Fix the problem by making cpuidle_driver_state_disabled() check if the driver's mask of CPUs associated with it is present and to set CPUIDLE_FLAG_UNUSABLE for the given idle state in the driver's states list if that is not the case to cause __cpuidle_register_device() to set CPUIDLE_STATE_DISABLED_BY_DRIVER for that state for all cpuidle devices registered by it later. Fixes: `cbda56d5fe` ("cpuidle: Introduce cpuidle_driver_state_disabled() for driver quirks") Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org> Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-12-10 23:41:20 +01:00
Marcelo Tosatti	36fcb42924	cpuidle: use first valid target residency as poll time Commit `259231a045` ("cpuidle: add poll_limit_ns to cpuidle_device structure") changed, by mistake, the target residency from the first available sleep state to the last available sleep state (which should be longer). This might cause excessive polling. Fixes: `259231a045` ("cpuidle: add poll_limit_ns to cpuidle_device structure") Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Cc: 5.4+ <stable@vger.kernel.org> # 5.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-12-09 10:37:16 +01:00
Randy Dunlap	4d30d4a044	cpuidle: minor Kconfig help text fixes End sentences in help text with a period (aka full stop). Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-29 11:49:32 +01:00
Rafael J. Wysocki	ba1e78a1dc	cpuidle: Drop disabled field from struct cpuidle_state After recent cpuidle updates the "disabled" field in struct cpuidle_state is only used by two drivers (intel_idle and shmobile cpuidle) for marking unusable idle states, but that may as well be achieved with the help of a state flag, so define an "unusable" idle state flag, CPUIDLE_FLAG_UNUSABLE, make the drivers in question use it instead of the "disabled" field and make the core set CPUIDLE_STATE_DISABLED_BY_DRIVER for the idle states with that flag set. After the above changes, the "disabled" field in struct cpuidle_state is not used any more, so drop it. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-29 11:48:39 +01:00
Krzysztof Kozlowski	656b4e6398	cpuidle: Fix Kconfig indentation Adjust indentation from spaces to tab (+optional two spaces) as in coding style with command like: $ sed -e 's/^ /\t/' -i */Kconfig Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-29 11:47:35 +01:00
Daniel Lezcano	5aa9ba6312	cpuidle: Pass exit latency limit to cpuidle_use_deepest_state() Modify cpuidle_use_deepest_state() to take an additional exit latency limit argument to be passed to find_deepest_idle_state() and make cpuidle_idle_call() pass dev->forced_idle_latency_limit_ns to it for forced idle. Suggested-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> [ rjw: Rebase and rearrange code, subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-20 11:46:18 +01:00
Daniel Lezcano	c55b51a06b	cpuidle: Allow idle injection to apply exit latency limit In some cases it may be useful to specify an exit latency limit for the idle state to be used during CPU idle time injection. Instead of duplicating the information in struct cpuidle_device or propagating the latency limit in the call stack, replace the use_deepest_state field with forced_latency_limit_ns to represent that limit, so that the deepest idle state with exit latency within that limit is forced (i.e. no governors) when it is set. A zero exit latency limit for forced idle means to use governors in the usual way (analogous to use_deepest_state equal to "false" before this change). Additionally, add play_idle_precise() taking two arguments, the duration of forced idle and the idle state exit latency limit, both in nanoseconds, and redefine play_idle() as a wrapper around that new function. This change is preparatory, no functional impact is expected. Suggested-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> [ rjw: Subject, changelog, cpuidle_use_deepest_state() kerneldoc, whitespace ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-20 11:32:55 +01:00
Rafael J. Wysocki	cbda56d5fe	cpuidle: Introduce cpuidle_driver_state_disabled() for driver quirks Commit `99e98d3fb1` ("cpuidle: Consolidate disabled state checks") overlooked the fact that the imx6q and tegra20 cpuidle drivers use the "disabled" field in struct cpuidle_state for quirks which trigger after the initialization of cpuidle, so reading the initial value of that field is not sufficient for those drivers. In order to allow them to implement the quirks without using the "disabled" field in struct cpuidle_state, introduce a new helper function and modify them to use it. Fixes: `99e98d3fb1` ("cpuidle: Consolidate disabled state checks") Reported-by: Len Brown <lenb@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-19 10:35:13 +01:00
Rafael J. Wysocki	85f6a17f24	cpuidle: teo: Avoid code duplication in conditionals There are three places in teo_select() where a given amount of time is compared with TICK_NSEC if tick_nohz_tick_stopped() returns true, which is a bit of duplicated code. Avoid that code duplication by defining a helper function to do the check and using it in all of the places in question. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-15 00:54:33 +01:00
Rafael J. Wysocki	63f202e5ed	cpuidle: teo: Avoid using "early hits" incorrectly If the current state with the maximum "early hits" metric in teo_select() is also the one "matching" the expected idle duration, it will be used as the candidate one for selection even if its "misses" metric is greater than its "hits" metric, which is not correct. In that case, the candidate state should be shallower than the current one and its "early hits" metric should be the maximum among the idle states shallower than the current one. To make that happen, modify teo_select() to save the index of the state whose "early hits" metric is the maximum for the range of states below the current one and go back to that state if it turns out that the current one should be rejected. Fixes: `159e48560f` ("cpuidle: teo: Fix "early hits" handling for disabled idle states") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-13 22:31:17 +01:00
Rafael J. Wysocki	b6495b7f00	cpuidle: teo: Exclude cpuidle overhead from computations One purpose of the computations in teo_update() is to determine whether or not the (saved) time till the next timer event and the measured idle duration fall into the same "bin", so avoid using values that include the cpuidle overhead to obtain the latter. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-13 22:31:17 +01:00
Rafael J. Wysocki	c1d51f684c	cpuidle: Use nanoseconds as the unit of time Currently, the cpuidle subsystem uses microseconds as the unit of time which (among other things) causes the idle loop to incur some integer division overhead for no clear benefit. In order to allow cpuidle to measure time in nanoseconds, add two new fields, exit_latency_ns and target_residency_ns, to represent the exit latency and target residency of an idle state in nanoseconds, respectively, to struct cpuidle_state and initialize them with the help of the corresponding values in microseconds provided by drivers. Additionally, change cpuidle_governor_latency_req() to return the idle state exit latency constraint in nanoseconds. Also meeasure idle state residency (last_residency_ns in struct cpuidle_device and time_ns in struct cpuidle_driver) in nanoseconds and update the cpuidle core and governors accordingly. However, the menu governor still computes typical intervals in microseconds to avoid integer overflows. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Doug Smythies <dsmythies@telus.net> Tested-by: Doug Smythies <dsmythies@telus.net>	2019-11-11 21:56:07 +01:00
Rafael J. Wysocki	99e98d3fb1	cpuidle: Consolidate disabled state checks There are two reasons why CPU idle states may be disabled: either because the driver has disabled them or because they have been disabled by user space via sysfs. In the former case, the state's "disabled" flag is set once during the initialization of the driver and it is never cleared later (it is read-only effectively). In the latter case, the "disable" field of the given state's cpuidle_state_usage struct is set and it may be changed via sysfs. Thus checking whether or not an idle state has been disabled involves reading these two flags every time. In order to avoid the additional check of the state's "disabled" flag (which is effectively read-only anyway), use the value of it at the init time to set a (new) flag in the "disable" field of that state's cpuidle_state_usage structure and use the sysfs interface to manipulate another (new) flag in it. This way the state is disabled whenever the "disable" field of its cpuidle_state_usage structure is nonzero, whatever the reason, and it is the only place to look into to check whether or not the state has been disabled. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2019-11-06 13:19:56 +01:00
Zhenzhong Duan	918c1fe9fb	cpuidle: Do not unset the driver if it is there already Fix __cpuidle_set_driver() to check if any of the CPUs in the mask has a driver different from drv already and, if so, return -EBUSY before updating any cpuidle_drivers per-CPU pointers. Fixes: `82467a5a88` ("cpuidle: simplify multiple driver support") Cc: 3.11+ <stable@vger.kernel.org> # 3.11+ Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-10-24 23:22:33 +02:00
Rafael J. Wysocki	2c2a83d329	Merge back earlier cpuidle material for v5.5.	2019-10-24 23:12:55 +02:00
Zhenzhong Duan	31d851407f	cpuidle: haltpoll: Take 'idle=' override into account Currenly haltpoll isn't aware of the 'idle=' override, the priority is 'idle=poll' > haltpoll > 'idle=halt'. When 'idle=poll' is used, cpuidle driver is bypassed but current_driver in sys still shows 'haltpoll'. When 'idle=halt' is used, haltpoll takes precedence and makes 'idle=halt' have no effect. Add a check to prevent the haltpoll driver from loading if 'idle=' is present. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Co-developed-by: Joao Martins <joao.m.martins@oracle.com> [ rjw: Subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-10-22 11:43:17 +02:00
Rafael J. Wysocki	159e48560f	cpuidle: teo: Fix "early hits" handling for disabled idle states The TEO governor uses idle duration "bins" defined in accordance with the CPU idle states table provided by the driver, so that each "bin" covers the idle duration range between the target residency of the idle state corresponding to it and the target residency of the closest deeper idle state. The governor collects statistics for each bin regardless of whether or not the idle state corresponding to it is currently enabled. In particular, the "early hits" metric measures the likelihood of a situation in which the idle duration measured after wakeup falls into to given bin, but the time till the next timer (sleep length) falls into a bin corresponding to one of the deeper idle states. It is used when the "hits" and "misses" metrics indicate that the state "matching" the sleep length should not be selected, so that the state with the maximum "early hits" value is selected instead of it. If the idle state corresponding to the given bin is disabled, it cannot be selected and if it turns out to be the one that should be selected, a shallower idle state needs to be used instead of it. Nevertheless, the metrics collected for the bin corresponding to it are still valid and need to be taken into account as though that state had not been disabled. As far as the "early hits" metric is concerned, teo_select() tries to take disabled states into account, but the state index corresponding to the maximum "early hits" value computed by it may be incorrect. Namely, it always uses the index of the previous maximum "early hits" state then, but there may be enabled idle states closer to the disabled one in question. In particular, if the current candidate state (whose index is the idx value) is closer to the disabled one and the "early hits" value of the disabled state is greater than the current maximum, the index of the current candidate state (idx) should replace the "maximum early hits state" index. Modify the code to handle that case correctly. Fixes: `b26bf6ab71` ("cpuidle: New timer events oriented governor for tickless systems") Reported-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 5.1+ <stable@vger.kernel.org> # 5.1+	2019-10-14 10:40:33 +02:00
Rafael J. Wysocki	e43dcf2021	cpuidle: teo: Consider hits and misses metrics of disabled states The TEO governor uses idle duration "bins" defined in accordance with the CPU idle states table provided by the driver, so that each "bin" covers the idle duration range between the target residency of the idle state corresponding to it and the target residency of the closest deeper idle state. The governor collects statistics for each bin regardless of whether or not the idle state corresponding to it is currently enabled. In particular, the "hits" and "misses" metrics measure the likelihood of a situation in which both the time till the next timer (sleep length) and the idle duration measured after wakeup fall into the given bin. Namely, if the "hits" value is greater than the "misses" one, that situation is more likely than the one in which the sleep length falls into the given bin, but the idle duration measured after wakeup falls into a bin corresponding to one of the shallower idle states. If the idle state corresponding to the given bin is disabled, it cannot be selected and if it turns out to be the one that should be selected, a shallower idle state needs to be used instead of it. Nevertheless, the metrics collected for the bin corresponding to it are still valid and need to be taken into account as though that state had not been disabled. For this reason, make teo_select() always use the "hits" and "misses" values of the idle duration range that the sleep length falls into even if the specific idle state corresponding to it is disabled and if the "hits" values is greater than the "misses" one, select the closest enabled shallower idle state in that case. Fixes: `b26bf6ab71` ("cpuidle: New timer events oriented governor for tickless systems") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 5.1+ <stable@vger.kernel.org> # 5.1+	2019-10-14 10:40:33 +02:00
Rafael J. Wysocki	4f690bb8ce	cpuidle: teo: Rename local variable in teo_select() Rename a local variable in teo_select() in preparation for subsequent code modifications, no intentional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 5.1+ <stable@vger.kernel.org> # 5.1+	2019-10-14 10:40:33 +02:00
Rafael J. Wysocki	069ce2ef1a	cpuidle: teo: Ignore disabled idle states that are too deep Prevent disabled CPU idle state with target residencies beyond the anticipated idle duration from being taken into account by the TEO governor. Fixes: `b26bf6ab71` ("cpuidle: New timer events oriented governor for tickless systems") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 5.1+ <stable@vger.kernel.org> # 5.1+	2019-10-14 10:40:32 +02:00
Linus Torvalds	77dcfe2b9e	Power management updates for 5.4-rc1 - Rework the main suspend-to-idle control flow to avoid repeating "noirq" device resume and suspend operations in case of spurious wakeups from the ACPI EC and decouple the ACPI EC wakeups support from the LPS0 _DSM support (Rafael Wysocki). - Extend the wakeup sources framework to expose wakeup sources as device objects in sysfs (Tri Vo, Stephen Boyd). - Expose system suspend statistics in sysfs (Kalesh Singh). - Introduce a new haltpoll cpuidle driver and a new matching governor for virtualized guests wanting to do guest-side polling in the idle loop (Marcelo Tosatti, Joao Martins, Wanpeng Li, Stephen Rothwell). - Fix the menu and teo cpuidle governors to allow the scheduler tick to be stopped if PM QoS is used to limit the CPU idle state exit latency in some cases (Rafael Wysocki). - Increase the resolution of the play_idle() argument to microseconds for more fine-grained injection of CPU idle cycles (Daniel Lezcano). - Switch over some users of cpuidle notifiers to the new QoS-based frequency limits and drop the CPUFREQ_ADJUST and CPUFREQ_NOTIFY policy notifier events (Viresh Kumar). - Add new cpufreq driver based on nvmem for sun50i (Yangtao Li). - Add support for MT8183 and MT8516 to the mediatek cpufreq driver (Andrew-sh.Cheng, Fabien Parent). - Add i.MX8MN support to the imx-cpufreq-dt cpufreq driver (Anson Huang). - Add qcs404 to cpufreq-dt-platdev blacklist (Jorge Ramirez-Ortiz). - Update the qcom cpufreq driver (among other things, to make it easier to extend and to use kryo cpufreq for other nvmem-based SoCs) and add qcs404 support to it (Niklas Cassel, Douglas RAILLARD, Sibi Sankar, Sricharan R). - Fix assorted issues and make assorted minor improvements in the cpufreq code (Colin Ian King, Douglas RAILLARD, Florian Fainelli, Gustavo Silva, Hariprasad Kelam). - Add new devfreq driver for NVidia Tegra20 (Dmitry Osipenko, Arnd Bergmann). - Add new Exynos PPMU events to devfreq events and extend that mechanism (Lukasz Luba). - Fix and clean up the exynos-bus devfreq driver (Kamil Konieczny). - Improve devfreq documentation and governor code, fix spelling typos in devfreq (Ezequiel Garcia, Krzysztof Kozlowski, Leonard Crestez, MyungJoo Ham, Gaël PORTAY). - Add regulators enable and disable to the OPP (operating performance points) framework (Kamil Konieczny). - Update the OPP framework to support multiple opp-suspend properties (Anson Huang). - Fix assorted issues and make assorted minor improvements in the OPP code (Niklas Cassel, Viresh Kumar, Yue Hu). - Clean up the generic power domains (genpd) framework (Ulf Hansson). - Clean up assorted pieces of power management code and documentation (Akinobu Mita, Amit Kucheria, Chuhong Yuan). - Update the pm-graph tool to version 5.5 including multiple fixes and improvements (Todd Brandt). - Update the cpupower utility (Benjamin Weis, Geert Uytterhoeven, Sébastien Szymanski). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl2ArZ4SHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRxgfYQAK80hs43vWQDmp7XKrN4pQe8+qYULAGO fBfrFl+NG9y/cnuqnt3NtA8MoyNsMMkMLkpkEDMfSbYqqH5ehEzX5+uGJWiWx8+Y oH5KU8MH7Tj/utYaalGzDt0AHfHZDIGC0NCUNQJVtE/4mOANFabwsCwscp4MrD5Q WjFN8U4BrsmWgJdZ/U9QIWcDZ0I+1etCF+rZG2yxSv31FMq2Zk/Qm4YyobqCvQFl TR9rxl08wqUmIYIz5cDjt/3AKH7NLLDqOTstbCL7cmufM5XPFc1yox69xc89UrIa 4AMgmDp7SMwFG/gdUPof0WQNmx7qxmiRAPleAOYBOZW/8jPNZk2y+RhM5NeF72m7 AFqYiuxqatkSb4IsT8fLzH9IUZOdYr8uSmoMQECw+MHdApaKFjFV8Lb/qx5+AwkD y7pwys8dZSamAjAf62eUzJDWcEwkNrujIisGrIXrVHb7ISbweskMOmdAYn9p4KgP dfRzpJBJ45IaMIdbaVXNpg3rP7Apfs7X1X+/ZhG6f+zHH3zYwr8Y81WPqX8WaZJ4 qoVCyxiVWzMYjY2/1lzjaAdqWojPWHQ3or3eBaK52DouyG3jY6hCDTLwU7iuqcCX jzAtrnqrNIKufvaObEmqcmYlIIOFT7QaJCtGUSRFQLfSon8fsVSR7LLeXoAMUJKT JWQenuNaJngK =TBDQ -----END PGP SIGNATURE----- Merge tag 'pm-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These include a rework of the main suspend-to-idle code flow (related to the handling of spurious wakeups), a switch over of several users of cpufreq notifiers to QoS-based limits, a new devfreq driver for Tegra20, a new cpuidle driver and governor for virtualized guests, an extension of the wakeup sources framework to expose wakeup sources as device objects in sysfs, and more. Specifics: - Rework the main suspend-to-idle control flow to avoid repeating "noirq" device resume and suspend operations in case of spurious wakeups from the ACPI EC and decouple the ACPI EC wakeups support from the LPS0 _DSM support (Rafael Wysocki). - Extend the wakeup sources framework to expose wakeup sources as device objects in sysfs (Tri Vo, Stephen Boyd). - Expose system suspend statistics in sysfs (Kalesh Singh). - Introduce a new haltpoll cpuidle driver and a new matching governor for virtualized guests wanting to do guest-side polling in the idle loop (Marcelo Tosatti, Joao Martins, Wanpeng Li, Stephen Rothwell). - Fix the menu and teo cpuidle governors to allow the scheduler tick to be stopped if PM QoS is used to limit the CPU idle state exit latency in some cases (Rafael Wysocki). - Increase the resolution of the play_idle() argument to microseconds for more fine-grained injection of CPU idle cycles (Daniel Lezcano). - Switch over some users of cpuidle notifiers to the new QoS-based frequency limits and drop the CPUFREQ_ADJUST and CPUFREQ_NOTIFY policy notifier events (Viresh Kumar). - Add new cpufreq driver based on nvmem for sun50i (Yangtao Li). - Add support for MT8183 and MT8516 to the mediatek cpufreq driver (Andrew-sh.Cheng, Fabien Parent). - Add i.MX8MN support to the imx-cpufreq-dt cpufreq driver (Anson Huang). - Add qcs404 to cpufreq-dt-platdev blacklist (Jorge Ramirez-Ortiz). - Update the qcom cpufreq driver (among other things, to make it easier to extend and to use kryo cpufreq for other nvmem-based SoCs) and add qcs404 support to it (Niklas Cassel, Douglas RAILLARD, Sibi Sankar, Sricharan R). - Fix assorted issues and make assorted minor improvements in the cpufreq code (Colin Ian King, Douglas RAILLARD, Florian Fainelli, Gustavo Silva, Hariprasad Kelam). - Add new devfreq driver for NVidia Tegra20 (Dmitry Osipenko, Arnd Bergmann). - Add new Exynos PPMU events to devfreq events and extend that mechanism (Lukasz Luba). - Fix and clean up the exynos-bus devfreq driver (Kamil Konieczny). - Improve devfreq documentation and governor code, fix spelling typos in devfreq (Ezequiel Garcia, Krzysztof Kozlowski, Leonard Crestez, MyungJoo Ham, Gaël PORTAY). - Add regulators enable and disable to the OPP (operating performance points) framework (Kamil Konieczny). - Update the OPP framework to support multiple opp-suspend properties (Anson Huang). - Fix assorted issues and make assorted minor improvements in the OPP code (Niklas Cassel, Viresh Kumar, Yue Hu). - Clean up the generic power domains (genpd) framework (Ulf Hansson). - Clean up assorted pieces of power management code and documentation (Akinobu Mita, Amit Kucheria, Chuhong Yuan). - Update the pm-graph tool to version 5.5 including multiple fixes and improvements (Todd Brandt). - Update the cpupower utility (Benjamin Weis, Geert Uytterhoeven, Sébastien Szymanski)" * tag 'pm-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (126 commits) cpuidle-haltpoll: Enable kvm guest polling when dedicated physical CPUs are available cpuidle-haltpoll: do not set an owner to allow modunload cpuidle-haltpoll: return -ENODEV on modinit failure cpuidle-haltpoll: set haltpoll as preferred governor cpuidle: allow governor switch on cpuidle_register_driver() PM: runtime: Documentation: add runtime_status ABI document pm-graph: make setVal unbuffered again for python2 and python3 powercap: idle_inject: Use higher resolution for idle injection cpuidle: play_idle: Increase the resolution to usec cpuidle-haltpoll: vcpu hotplug support cpufreq: Add qcs404 to cpufreq-dt-platdev blacklist cpufreq: qcom: Add support for qcs404 on nvmem driver cpufreq: qcom: Refactor the driver to make it easier to extend cpufreq: qcom: Re-organise kryo cpufreq to use it for other nvmem based qcom socs dt-bindings: opp: Add qcom-opp bindings with properties needed for CPR dt-bindings: opp: qcom-nvmem: Support pstates provided by a power domain Documentation: cpufreq: Update policy notifier documentation cpufreq: Remove CPUFREQ_ADJUST and CPUFREQ_NOTIFY policy notifier events PM / Domains: Verify PM domain type in dev_pm_genpd_set_performance_state() PM / Domains: Simplify genpd_lookup_dev() ...	2019-09-17 19:15:14 -07:00
Wanpeng Li	1328edca4a	cpuidle-haltpoll: Enable kvm guest polling when dedicated physical CPUs are available The downside of guest side polling is that polling is performed even with other runnable tasks in the host. However, even if poll in kvm can aware whether or not other runnable tasks in the same pCPU, it can still incur extra overhead in over-subscribe scenario. Now we can just enable guest polling when dedicated pCPUs are available. Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-09-11 17:46:15 +02:00
Joao Martins	472f263660	cpuidle-haltpoll: do not set an owner to allow modunload cpuidle-haltpoll can be built as a module to allow optional late load. Given we are setting @owner to THIS_MODULE, cpuidle will attempt to grab a module reference every time a cpuidle_device is registered -- so essentially all online cpus get a reference. This prevents for the module to be unloaded later, which makes the module_exit callback entirely unused. Thus remove the @owner and allow module to be unloaded. Fixes: `fa86ee90eb` ("add cpuidle-haltpoll driver") Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-09-11 17:36:30 +02:00
Joao Martins	5cc59f597c	cpuidle-haltpoll: return -ENODEV on modinit failure When a user loads cpuidle-haltpoll on a non KVM guest the module will successfully load, even though idle driver registration didn't take place. We should instead return -ENODEV signaling the user that the driver can't be loaded, like other error paths in haltpoll_init(). An example of such error paths is when we return -EBUSY when attempting to register an idle driver when it had one already (e.g. intel_idle loads at boot and then we attempt to insert module cpuidle-haltpoll). Fixes: `fa86ee90eb` ("add cpuidle-haltpoll driver") Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-09-11 17:36:30 +02:00
Joao Martins	7321440829	cpuidle-haltpoll: set haltpoll as preferred governor Right now, guest current governors have the following ratings: * ladder -> 10 * teo -> 19 * menu -> 20 * haltpoll -> 21 * ladder + nohz=off -> 25 haltpoll governor got introduced and it is now the default governor given its highest rating -- with ladder+nohz being the exception -- regardless of idle driver in the guest. An example of an undesirable case is x86 KVM guests with MWAIT which have intel_idle registered first, and consequently will have haltpoll be used as governor which would get limited to a poll state and state 1 and the other states wouldn't get used. To keep the previous defaults we decrease rating of governor to 9 (below current lowest rating) and thus rely on @governor switch on cpuidle_register_driver() to tie in haltpoll idle driver and governor together. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-09-11 17:36:30 +02:00
Joao Martins	cb5d8c45ab	cpuidle: allow governor switch on cpuidle_register_driver() The recently introduced haltpoll driver is largely only useful with haltpoll governor. To allow drivers to associate with a particular idle behaviour, add a @governor property to 'struct cpuidle_driver' and thus allow a cpuidle driver to switch to a preferred governor on idle driver registration. We save the previous governor, and when an idle driver is unregistered we switch back to that. The @governor can be overridden by cpuidle.governor= boot param or alternatively be ignored if the governor doesn't exist. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-09-11 17:36:30 +02:00
Joao Martins	97d3eb9da8	cpuidle-haltpoll: vcpu hotplug support When cpus != maxcpus cpuidle-haltpoll will fail to register all vcpus past the online ones and thus fail to register the idle driver. This is because cpuidle_add_sysfs() will return with -ENODEV as a consequence from get_cpu_device() return no device for a non-existing CPU. Instead switch to cpuidle_register_driver() and manually register each of the present cpus through cpuhp_setup_state() callbacks and future ones that get onlined or offlined. This mimmics similar logic that intel_idle does. Fixes: `fa86ee90eb` ("add cpuidle-haltpoll driver") Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-09-03 09:36:36 +02:00
Rafael J. Wysocki	b7e7fffd3e	cpuidle: teo: Get rid of redundant check in teo_update() Notice that setting measured_us to UINT_MAX in teo_update() earlier doesn't change the behavior of the following code, so do that and eliminate a redundant check used for setting measured_us to UINT_MAX. This change is not expected to alter functionality. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-08-10 14:34:28 +02:00
Lorenzo Pieralisi	9ffeb6d08c	PSCI: cpuidle: Refactor CPU suspend power_state parameter handling Current PSCI code handles idle state entry through the psci_cpu_suspend_enter() API, that takes an idle state index as a parameter and convert the index into a previously initialized power_state parameter before calling the PSCI.CPU_SUSPEND() with it. This is unwieldly, since it forces the PSCI firmware layer to keep track of power_state parameter for every idle state so that the index->power_state conversion can be made in the PSCI firmware layer instead of the CPUidle driver implementations. Move the power_state handling out of drivers/firmware/psci into the respective ACPI/DT PSCI CPUidle backends and convert the psci_cpu_suspend_enter() API to get the power_state parameter as input, which makes it closer to its firmware interface PSCI.CPU_SUSPEND() API. A notable side effect is that the PSCI ACPI/DT CPUidle backends now can directly handle (and if needed update) power_state parameters before handing them over to the PSCI firmware interface to trigger PSCI.CPU_SUSPEND() calls. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will@kernel.org>	2019-08-09 17:51:39 +01:00
Lorenzo Pieralisi	788961462f	ARM: psci: cpuidle: Enable PSCI CPUidle driver Allow selection of the PSCI CPUidle in the kernel by updating the respective Kconfig entry. Remove PSCI callbacks from ARM/ARM64 generic CPU ops to prevent the PSCI idle driver from clashing with the generic ARM CPUidle driver initialization, that relies on CPU ops to initialize and enter idle states. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Cc: Will Deacon <will@kernel.org> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will@kernel.org>	2019-08-09 17:51:39 +01:00
Lorenzo Pieralisi	81d549e0c8	ARM: psci: cpuidle: Introduce PSCI CPUidle driver PSCI firmware is the standard power management control for all ARM64 based platforms and it is also deployed on some ARM 32 bit platforms to date. Idle state entry in PSCI is currently achieved by calling arm_cpuidle_init() and arm_cpuidle_suspend() in a generic idle driver, which in turn relies on ARM/ARM64 CPUidle back-end to relay the call into PSCI firmware if PSCI is the boot method. Given that PSCI is the standard idle entry method on ARM64 systems (which means that no other CPUidle driver are expected on ARM64 platforms - so PSCI is already a generic idle driver), in order to simplify idle entry and code maintenance, it makes sense to have a PSCI specific idle driver so that idle code that it is currently living in drivers/firmware directory can be hoisted out of it and moved where it belongs, into a full-fledged PSCI driver, leaving PSCI code in drivers/firmware as a pure firmware interface, as it should be. Implement a PSCI CPUidle driver. By default it is a silent Kconfig entry which is left unselected, since it selection would clash with the generic ARM CPUidle driver that provides a PSCI based idle driver through the arm/arm64 arches back-ends CPU operations. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will@kernel.org>	2019-08-09 17:51:39 +01:00
Lorenzo Pieralisi	6460d7ba48	ARM: cpuidle: Remove overzealous error logging CPUidle back-end operations are not implemented in some platforms but this should not be considered an error serious enough to be logged. Check the arm_cpuidle_init() return value to detect whether the failure must be reported or not in the kernel log and do not log it if the platform does not support CPUidle operations. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will@kernel.org>	2019-08-09 17:51:39 +01:00
Lorenzo Pieralisi	63e3ee6154	ARM: cpuidle: Remove useless header include The generic ARM CPUidle driver includes <linux/topology.h> by mistake. Remove the topology header include. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will@kernel.org>	2019-08-09 17:51:39 +01:00
Rafael J. Wysocki	cab09f3d2d	cpuidle: teo: Allow tick to be stopped if PM QoS is used The TEO goveror prevents the scheduler tick from being stopped (unless stopped already) if there is a PM QoS latency constraint for the given CPU and the target residency of the deepest idle state matching that constraint is below the tick boundary. However, that is problematic if CPUs with PM QoS latency constraints are idle for long times, because it effectively causes the tick to run on them all the time which is wasteful. [It is also confusing and questionable if they are full dynticks CPUs.] To address that issue, modify the TEO governor to carry out the entire search for the most suitable idle state (from the target residency perspective) even if a latency constraint is present, to allow it to determine the expected idle duration in all cases. Also, when using the last several measured idle duration values to refine the idle state selection, make it compare those values with the current expected idle duration value (instead of comparing them with the target residency of the idle state selected so far) which should prevent the tick from being retained when it makes sense to stop it sometimes (especially in the presence of PM QoS latency constraints). Fixes: `b26bf6ab71` ("cpuidle: New timer events oriented governor for tickless systems") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-08-05 11:02:44 +02:00
Rafael J. Wysocki	32b91ca153	cpuidle: menu: Allow tick to be stopped if PM QoS is used After commit `554c8aa8ec` ("sched: idle: Select idle state before stopping the tick") the menu governor prevents the scheduler tick from being stopped (unless stopped already) if there is a PM QoS latency constraint for the given CPU and the target residency of the deepest idle state matching that constraint is below the tick boundary. However, that is problematic if CPUs with PM QoS latency constraints are idle for long times, because it effectively causes the tick to run on them all the time which is wasteful. [It is also confusing and questionable if they are full dynticks CPUs.] To address that issue, make the menu governor allow the tick to be stopped only if the idle duration predicted by it is beyond the tick boundary, except when the shallowest idle state is selected upfront and it is not a "polling" one. Fixes: `554c8aa8ec` ("sched: idle: Select idle state before stopping the tick") Link: https://lore.kernel.org/lkml/79b247b3-e056-610e-9a07-e685dfdaa6c9@gmail.com/ Reported-by: Thomas Lindroth <thomas.lindroth@gmail.com> Tested-by: Thomas Lindroth <thomas.lindroth@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-08-05 11:02:44 +02:00
Marcelo Tosatti	a1c4423b02	cpuidle-haltpoll: disable host side polling when kvm virtualized When performing guest side polling, it is not necessary to also perform host side polling. So disable host side polling, via the new MSR interface, when loading cpuidle-haltpoll driver. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-30 17:27:37 +02:00
Marcelo Tosatti	2cffe9f6b9	cpuidle: add haltpoll governor The cpuidle_haltpoll governor, in conjunction with the haltpoll cpuidle driver, allows guest vcpus to poll for a specified amount of time before halting. This provides the following benefits to host side polling: 1) The POLL flag is set while polling is performed, which allows a remote vCPU to avoid sending an IPI (and the associated cost of handling the IPI) when performing a wakeup. 2) The VM-exit cost can be avoided. The downside of guest side polling is that polling is performed even with other runnable tasks in the host. Results comparing halt_poll_ns and server/client application where a small packet is ping-ponged: host --> 31.33 halt_poll_ns=300000 / no guest busy spin --> 33.40 (93.8%) halt_poll_ns=0 / guest_halt_poll_ns=300000 --> 32.73 (95.7%) For the SAP HANA benchmarks (where idle_spin is a parameter of the previous version of the patch, results should be the same): hpns == halt_poll_ns idle_spin=0/ idle_spin=800/ idle_spin=0/ hpns=200000 hpns=0 hpns=800000 DeleteC06T03 (100 thread) 1.76 1.71 (-3%) 1.78 (+1%) InsertC16T02 (100 thread) 2.14 2.07 (-3%) 2.18 (+1.8%) DeleteC00T01 (1 thread) 1.34 1.28 (-4.5%) 1.29 (-3.7%) UpdateC00T03 (1 thread) 4.72 4.18 (-12%) 4.53 (-5%) Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-30 17:27:37 +02:00
Marcelo Tosatti	7d4daeedd5	governors: unify last_state_idx Since this field is shared by all governors, move it to cpuidle device structure. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-30 17:27:37 +02:00
Marcelo Tosatti	259231a045	cpuidle: add poll_limit_ns to cpuidle_device structure Add a poll_limit_ns variable to cpuidle_device structure. Calculate and configure it in the new cpuidle_poll_time function, in case its zero. Individual governors are allowed to override this value. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-30 17:27:37 +02:00
Marcelo Tosatti	fa86ee90eb	add cpuidle-haltpoll driver Add a cpuidle driver that calls the architecture default_idle routine. To be used in conjunction with the haltpoll governor. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-30 17:27:37 +02:00
Rafael J. Wysocki	918e162e6a	Merge branch 'pm-cpufreq' * pm-cpufreq: cpufreq: Make cpufreq_generic_init() return void cpufreq: imx-cpufreq-dt: Add i.MX8MN support cpufreq: Add QoS requests for userspace constraints cpufreq: intel_pstate: Reuse refresh_frequency_limits() cpufreq: Register notifiers with the PM QoS framework PM / QoS: Add support for MIN/MAX frequency constraints PM / QOS: Pass request type to dev_pm_qos_read_value() PM / QOS: Rename __dev_pm_qos_read_value() and dev_pm_qos_raw_read_value() PM / QOS: Pass request type to dev_pm_qos_{add\|remove}_notifier()	2019-07-18 09:49:30 +02:00
Viresh Kumar	8262331eaa	PM / QOS: Rename __dev_pm_qos_read_value() and dev_pm_qos_raw_read_value() dev_pm_qos_read_value() will soon need to support more constraint types (min/max frequency) and will have another argument to it, i.e. type of the constraint. While that is fine for the existing users of dev_pm_qos_read_value(), but not that optimal for the callers of __dev_pm_qos_read_value() and dev_pm_qos_raw_read_value() as all the callers of these two routines are only looking for resume latency constraint. Lets make these two routines care only about the resume latency constraint and rename them to __dev_pm_qos_resume_latency() and dev_pm_qos_raw_resume_latency(). Suggested-by: Rafael J. Wysocki <rjw@rjwysocki.net> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-04 10:40:54 +02:00
Thomas Gleixner	d2912cb15b	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-19 17:09:55 +02:00
Thomas Gleixner	55716d2643	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 Based on 1 normalized pattern(s): this file is released under the gplv2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 68 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Armijn Hemel <armijn@tjaldur.nl> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531190114.292346262@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:37:16 +02:00
Thomas Gleixner	7925f8f78f	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 215 Based on 1 normalized pattern(s): this code is licenced under the gpl version 2 as described in the copying file that acompanies the linux kernel extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Steve Winslow <swinslow@gmail.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190528171439.466585205@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-30 11:29:54 -07:00
Thomas Gleixner	9952f6918d	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms and conditions of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 228 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190528171438.107155473@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-30 11:29:52 -07:00
Thomas Gleixner	c942fddf87	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 Based on 3 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version [author] [kishon] [vijay] [abraham] [i] [kishon]@[ti] [com] this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version [author] [graeme] [gregory] [gg]@[slimlogic] [co] [uk] [author] [kishon] [vijay] [abraham] [i] [kishon]@[ti] [com] [based] [on] [twl6030]_[usb] [c] [author] [hema] [hk] [hemahk]@[ti] [com] this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1105 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070033.202006027@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-30 11:26:37 -07:00
Thomas Gleixner	2874c5fd28	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-30 11:26:32 -07:00
Thomas Gleixner	ec8f24b7fa	treewide: Add SPDX license identifier - Makefile/Kconfig Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-21 10:50:46 +02:00
Ulf Hansson	6f9b83ac87	cpuidle: Export the next timer expiration for CPUs To be able to predict the sleep duration for a CPU entering idle, it is essential to know the expiration time of the next timer. Both the teo and the menu cpuidle governors already use this information for CPU idle state selection. Moving forward, a similar prediction needs to be made for a group of idle CPUs rather than for a single one and the following changes implement a new genpd governor for that purpose. In order to support that feature, add a new function called tick_nohz_get_next_hrtimer() that will return the next hrtimer expiration time of a given CPU to be invoked after deciding whether or not to stop the scheduler tick on that CPU. Make the cpuidle core call tick_nohz_get_next_hrtimer() right before invoking the ->enter() callback provided by the cpuidle driver for the given state and store its return value in the per-CPU struct cpuidle_device, so as to make it available to code outside of cpuidle. Note that at the point when cpuidle calls tick_nohz_get_next_hrtimer(), the governor's ->select() callback has already returned and indicated whether or not the tick should be stopped, so in fact the value returned by tick_nohz_get_next_hrtimer() always is the next hrtimer expiration time for the given CPU, possibly including the tick (if it hasn't been stopped). Co-developed-by: Lina Iyer <lina.iyer@linaro.org> Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-04-10 00:32:34 +02:00
Marek Szyprowski	c324f43aed	cpuidle: exynos: Unify target residency for AFTR and coupled AFTR states Since commit `45f1ff59e2` ("cpuidle: Return nohz hint from cpuidle_select()") Exynos CPUidle driver stopped entering C1 (AFTR) mode on Exynos4412-based Trats2 board. Further analysis revealed that the CPUidle framework changed the way it handles predicted timer ticks and reported target residency for the given idle states. As a result, the C1 (AFTR) state was not chosen anymore on completely idle device. The main issue was to high target residency value. The similar C1 (AFTR) state for 'coupled' CPUidle version used 10 times lower value for the target residency, despite the fact that it is the same state from the hardware perspective. The 100000us value for standard C1 (AFTR) mode is there from the begining of the support for this idle state, added by the commit `67173ca492` ("ARM: EXYNOS: Add support AFTR mode on EXYNOS4210"). That commit doesn't give any reason for it, instead it looks like it was blindly copied from the WFI/IDLE state of the same driver that time. That time, that value was probably not really used by the framework for any critical decision, so it didn't matter that much. Now it turned out to be an issue, so unify the target residency with the 'coupled' version, as it seems to better match the real use case values and restores the operation of the Exynos CPUidle driver on the idle device. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-04-01 23:49:51 +02:00
Rafael J. Wysocki	22782b3f9b	cpuidle: governor: Add new governors to cpuidle_governors again After commit `61cb5758d3` ("cpuidle: Add cpuidle.governor= command line parameter") new cpuidle governors are not added to the list of available governors, so governor selection via sysfs doesn't work as expected (even though it is rarely used anyway). Fix that by making cpuidle_register_governor() add new governors to cpuidle_governors again. Fixes: `61cb5758d3` ("cpuidle: Add cpuidle.governor= command line parameter") Reported-by: Kees Cook <keescook@chromium.org> Cc: 5.0+ <stable@vger.kernel.org> # 5.0+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-03-12 23:46:55 +01:00
Rafael J. Wysocki	814b8797f9	cpuidle: menu: Avoid overflows when computing variance The variance computation in get_typical_interval() may overflow if the square of the value of diff exceeds the maximum for the int64_t data type value which basically is the case when it is of the order of UINT_MAX. However, data points so far in the future don't matter for idle state selection anyway, so change the initial threshold value in get_typical_interval() to INT_MAX which will cause more "outlying" data points to be discarded without affecting the selection result. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-03-07 10:54:22 +01:00
Joseph Lo	db10945cf4	cpuidle: dt: bail out if the idle-state DT node is not compatible Currently, the DT of the idle states will be parsed first whether it's compatible or not. This could cause a warning message that comes from if the CPU doesn't support identical idle states. E.g. Tegra186 can run with 2 Cortex-A57 and 2 Denver cores with different idle states on different types of these cores. So fix it by checking the match node earlier, then it can make sure it only goes through the idle states that the CPU supported. Signed-off-by: Joseph Lo <josephl@nvidia.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-02-01 12:58:58 +01:00
Rafael J. Wysocki	8a56bdeb09	Merge back earlier cpuidle material for v5.1.	2019-02-01 11:57:46 +01:00
Doug Smythies	1617971c66	cpuidle: poll_state: Fix default time limit The default time is declared in units of microsecnds, but is used as nanoseconds, resulting in significant accounting errors for idle state 0 time when all idle states deeper than 0 are disabled. Under these unusual conditions, we don't really care about the poll time limit anyhow. Fixes: `800fb34a99` ("cpuidle: poll_state: Disregard disable idle states") Signed-off-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-01-30 22:57:42 +01:00
Rafael J. Wysocki	b26bf6ab71	cpuidle: New timer events oriented governor for tickless systems The venerable menu governor does some things that are quite questionable in my view. First, it includes timer wakeups in the pattern detection data and mixes them up with wakeups from other sources which in some cases causes it to expect what essentially would be a timer wakeup in a time frame in which no timer wakeups are possible (because it knows the time until the next timer event and that is later than the expected wakeup time). Second, it uses the extra exit latency limit based on the predicted idle duration and depending on the number of tasks waiting on I/O, even though those tasks may run on a different CPU when they are woken up. Moreover, the time ranges used by it for the sleep length correction factors depend on whether or not there are tasks waiting on I/O, which again doesn't imply anything in particular, and they are not correlated to the list of available idle states in any way whatever. Also, the pattern detection code in menu may end up considering values that are too large to matter at all, in which cases running it is a waste of time. A major rework of the menu governor would be required to address these issues and the performance of at least some workloads (tuned specifically to the current behavior of the menu governor) is likely to suffer from that. It is thus better to introduce an entirely new governor without them and let everybody use the governor that works better with their actual workloads. The new governor introduced here, the timer events oriented (TEO) governor, uses the same basic strategy as menu: it always tries to find the deepest idle state that can be used in the given conditions. However, it applies a different approach to that problem. First, it doesn't use "correction factors" for the time till the closest timer, but instead it tries to correlate the measured idle duration values with the available idle states and use that information to pick up the idle state that is most likely to "match" the upcoming CPU idle interval. Second, it doesn't take the number of "I/O waiters" into account at all and the pattern detection code in it avoids taking timer wakeups into account. It also only uses idle duration values less than the current time till the closest timer (with the tick excluded) for that purpose. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>	2019-01-16 23:07:30 +01:00
Linus Torvalds	8d6973327e	powerpc updates for 4.21 Notable changes: - Mitigations for Spectre v2 on some Freescale (NXP) CPUs. - A large series adding support for pass-through of Nvidia V100 GPUs to guests on Power9. - Another large series to enable hardware assistance for TLB table walk on MPC8xx CPUs. - Some preparatory changes to our DMA code, to make way for further cleanups from Christoph. - Several fixes for our Transactional Memory handling discovered by fuzzing the signal return path. - Support for generating our system call table(s) from a text file like other architectures. - A fix to our page fault handler so that instead of generating a WARN_ON_ONCE, user accesses of kernel addresses instead print a ratelimited and appropriately scary warning. - A cosmetic change to make our unhandled page fault messages more similar to other arches and also more compact and informative. - Freescale updates from Scott: "Highlights include elimination of legacy clock bindings use from dts files, an 83xx watchdog handler, fixes to old dts interrupt errors, and some minor cleanup." And many clean-ups, reworks and minor fixes etc. Thanks to: Alexandre Belloni, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Benjamin Herrenschmidt, Breno Leitao, Christian Lamparter, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Darren Stevens, David Gibson, Diana Craciun, Dmitry V. Levin, Firoz Khan, Geert Uytterhoeven, Greg Kurz, Gustavo Romero, Hari Bathini, Joel Stanley, Kees Cook, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Mathieu Malaterre, Michal Suchánek, Naveen N. Rao, Nick Desaulniers, Oliver O'Halloran, Paul Mackerras, Ram Pai, Ravi Bangoria, Rob Herring, Russell Currey, Sabyasachi Gupta, Sam Bobroff, Satheesh Rajendran, Scott Wood, Segher Boessenkool, Stephen Rothwell, Tang Yuantian, Thiago Jung Bauermann, Yangtao Li, Yuantian Tang, Yue Haibing. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJcJLwZAAoJEFHr6jzI4aWAAv4P/jMvP52lA90i2E8G72LOVSF1 33DbE/Okib3VfmmMcXZpgpEfwIcEmJcIj86WWcLWzBfXLunehkgwh+AOfBLwqWch D08+RR9EZb7ppvGe91hvSgn4/28CWVKAxuDviSuoE1OK8lOTncu889r2+AxVFZiY f6Al9UPlB3FTJonNx8iO4r/GwrPigukjbzp1vkmJJg59LvNUrMQ1Fgf9D3cdlslH z4Ff9zS26RJy7cwZYQZI4sZXJZmeQ1DxOZ+6z6FL/nZp/O4WLgpw6C6o1+vxo1kE 9ZnO/3+zIRhoWiXd6OcOQXBv3NNCjJZlXh9HHAiL8m5ZqbmxrStQWGyKW/jjEZuK wVHxfUT19x9Qy1p+BH3XcUNMlxchYgcCbEi5yPX2p9ZDXD6ogNG7sT1+NO+FBTww ueCT5PCCB/xWOccQlBErFTMkFXFLtyPDNFK7BkV7uxbH0PQ+9guCvjWfBZti6wjD /6NK4mk7FpmCiK13Y1xjwC5OqabxLUYwtVuHYOMr5TOPh8URUPS4+0pIOdoYDM6z Ensrq1CC843h59MWADgFHSlZ78FRtZlG37JAXunjLbqGupLOvL7phC9lnwkylHga 2hWUWFeOV8HFQBP4gidZkLk64pkT9LzqHgdgIB4wUwrhc8r2mMZGdQTq5H7kOn3Q n9I48PWANvEC0PBCJ/KL =cr6s -----END PGP SIGNATURE----- Merge tag 'powerpc-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Notable changes: - Mitigations for Spectre v2 on some Freescale (NXP) CPUs. - A large series adding support for pass-through of Nvidia V100 GPUs to guests on Power9. - Another large series to enable hardware assistance for TLB table walk on MPC8xx CPUs. - Some preparatory changes to our DMA code, to make way for further cleanups from Christoph. - Several fixes for our Transactional Memory handling discovered by fuzzing the signal return path. - Support for generating our system call table(s) from a text file like other architectures. - A fix to our page fault handler so that instead of generating a WARN_ON_ONCE, user accesses of kernel addresses instead print a ratelimited and appropriately scary warning. - A cosmetic change to make our unhandled page fault messages more similar to other arches and also more compact and informative. - Freescale updates from Scott: "Highlights include elimination of legacy clock bindings use from dts files, an 83xx watchdog handler, fixes to old dts interrupt errors, and some minor cleanup." And many clean-ups, reworks and minor fixes etc. Thanks to: Alexandre Belloni, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Benjamin Herrenschmidt, Breno Leitao, Christian Lamparter, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Darren Stevens, David Gibson, Diana Craciun, Dmitry V. Levin, Firoz Khan, Geert Uytterhoeven, Greg Kurz, Gustavo Romero, Hari Bathini, Joel Stanley, Kees Cook, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Mathieu Malaterre, Michal Suchánek, Naveen N. Rao, Nick Desaulniers, Oliver O'Halloran, Paul Mackerras, Ram Pai, Ravi Bangoria, Rob Herring, Russell Currey, Sabyasachi Gupta, Sam Bobroff, Satheesh Rajendran, Scott Wood, Segher Boessenkool, Stephen Rothwell, Tang Yuantian, Thiago Jung Bauermann, Yangtao Li, Yuantian Tang, Yue Haibing" * tag 'powerpc-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (201 commits) Revert "powerpc/fsl_pci: simplify fsl_pci_dma_set_mask" powerpc/zImage: Also check for stdout-path powerpc: Fix HMIs on big-endian with CONFIG_RELOCATABLE=y macintosh: Use of_node_name_{eq, prefix} for node name comparisons ide: Use of_node_name_eq for node name comparisons powerpc: Use of_node_name_eq for node name comparisons powerpc/pseries/pmem: Convert to %pOFn instead of device_node.name powerpc/mm: Remove very old comment in hash-4k.h powerpc/pseries: Fix node leak in update_lmb_associativity_index() powerpc/configs/85xx: Enable CONFIG_DEBUG_KERNEL powerpc/dts/fsl: Fix dtc-flagged interrupt errors clk: qoriq: add more compatibles strings powerpc/fsl: Use new clockgen binding powerpc/83xx: handle machine check caused by watchdog timer powerpc/fsl-rio: fix spelling mistake "reserverd" -> "reserved" powerpc/fsl_pci: simplify fsl_pci_dma_set_mask arch/powerpc/fsl_rmu: Use dma_zalloc_coherent vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver vfio_pci: Allow regions to add own capabilities vfio_pci: Allow mapping extra regions ...	2018-12-27 10:43:24 -08:00
Rafael J. Wysocki	04dab58a39	cpuidle: Add 'above' and 'below' idle state metrics Add two new metrics for CPU idle states, "above" and "below", to count the number of times the given state had been asked for (or entered from the kernel's perspective), but the observed idle duration turned out to be too short or too long for it (respectively). These metrics help to estimate the quality of the CPU idle governor in use. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-12-12 23:22:18 +01:00
Yangtao Li	9456823c84	cpuidle: big.LITTLE: fix refcount leak of_find_node_by_path() acquires a reference to the node returned by it and that reference needs to be dropped by its caller. bl_idle_init() doesn't do that, so fix it. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-12-11 12:09:48 +01:00
Rafael J. Wysocki	61cb5758d3	cpuidle: Add cpuidle.governor= command line parameter Add cpuidle.governor= command line parameter to allow the default cpuidle governor to be replaced. That is useful, for example, if someone running a tickful kernel wants to use the menu governor on it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-12-11 12:08:44 +01:00
Rafael J. Wysocki	800fb34a99	cpuidle: poll_state: Disregard disable idle states When computing the limit of time to spend in the loop in poll_idle(), use the target residency of the first enabled idle state deeper than state 0 instead of always using the target residency of state 1. This helps when state 1 is disabled for diagnostics, for instance. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-12-11 12:07:07 +01:00
Breno Leitao	2b038cbc5f	powerpc/pseries/cpuidle: Fix preempt warning When booting a pseries kernel with PREEMPT enabled, it dumps the following warning: BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1 caller is pseries_processor_idle_init+0x5c/0x22c CPU: 13 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc3-00090-g12201a0128bc-dirty #828 Call Trace: [c000000429437ab0] [c0000000009c8878] dump_stack+0xec/0x164 (unreliable) [c000000429437b00] [c0000000005f2f24] check_preemption_disabled+0x154/0x160 [c000000429437b90] [c000000000cab8e8] pseries_processor_idle_init+0x5c/0x22c [c000000429437c10] [c000000000010ed4] do_one_initcall+0x64/0x300 [c000000429437ce0] [c000000000c54500] kernel_init_freeable+0x3f0/0x500 [c000000429437db0] [c0000000000112dc] kernel_init+0x2c/0x160 [c000000429437e20] [c00000000000c1d0] ret_from_kernel_thread+0x5c/0x6c This happens because the code calls get_lppaca() which calls get_paca() and it checks if preemption is disabled through check_preemption_disabled(). Preemption should be disabled because the per CPU variable may make no sense if there is a preemption (and a CPU switch) after it reads the per CPU data and when it is used. In this device driver specifically, it is not a problem, because this code just needs to have access to one lppaca struct, and it does not matter if it is the current per CPU lppaca struct or not (i.e. when there is a preemption and a CPU migration). That said, the most appropriate fix seems to be related to avoiding the debug_smp_processor_id() call at get_paca(), instead of calling preempt_disable() before get_paca(). Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-04 19:45:01 +11:00
Ulf Hansson	3e452e636d	ARM: cpuidle: Convert to use cpuidle_register\|unregister() The only reason that remains, to why the ARM cpuidle driver calls cpuidle_register_driver(), is to avoid printing an error message in case another driver already have been registered for the CPU. This seems a bit silly, but more importantly, if that is a common scenario, perhaps we should change cpuidle_register() accordingly instead. In either case, let's consolidate the code, by converting to use cpuidle_register\|unregister(), which also avoids the unnecessary allocation of the struct cpuidle_device. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-11-08 18:53:00 +01:00
Ulf Hansson	763f191af5	ARM: cpuidle: Don't register the driver when back-end init returns -ENXIO There's no point to register the cpuidle driver for the current CPU, when the initialization of the arch specific back-end data fails by returning -ENXIO. Instead, let's re-order the sequence to its original flow, by first trying to initialize the back-end part and then act accordingly on the returned error code. Additionally, let's print the error message, no matter of what error code that was returned. Fixes: `a0d46a3dfd` (ARM: cpuidle: Register per cpuidle device) Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: 4.19+ <stable@vger.kernel.org> # v4.19+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-11-08 18:52:02 +01:00
Linus Torvalds	6ef746769e	More power management updates for 4.20-rc1 - Fix build regression in the intel_pstate driver that doesn't build without CONFIG_ACPI after recent changes (Dominik Brodowski). - One of the heuristics in the menu cpuidle governor is based on a function returning 0 most of the time, so drop it and clean up the scheduler code related to it (Daniel Lezcano). - Prevent the arm_big_little cpufreq driver from being used on ARM64 which is not suitable for it and drop the arm_big_little_dt driver that is not used any more (Sudeep Holla). - Prevent the hung task watchdog from triggering during resume from system-wide sleep states by disabling it before freezing tasks and enabling it again after they have been thawed (Vitaly Kuznetsov). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJb2BJ7AAoJEILEb/54YlRx/kwP/iD7tUUZ6mT84OI0FTbEj8A/ fM+uHrwy25PmqyWGGtbHpaWU9OxVxUReSicsBCt+2LZmX3sFYpbSb243mv3pmxqb A0kLflG4lWCKJNIfa/a3OMDTUw26mxSTCidE3jJXkd8HkWrzeAWvMair+UCuzMf3 A4Omu0IkNL8C0MKtUOb3PlUk3dnLYMxuairNhozBPhi+P+0tLW9/9XvgPJBVhnbZ CKn/aFsDoc08tAfxC8N32cgKwE7nbeIgTJTBFyu2lQmInsd4TTuoM50vSC5i+x88 AmBOoH9IX0fhXJ6hgm+VMW8+x9S+H7jAVy/3C2xoUBeCclzlxX6eUCtjV5YNZqqn 1nXQfGeAwgzX6Tyu6HjM7vjbfObk59ZwpmDRPJEUEhLDEBMS+iDStlp9zmKTedNm G4iSTzS6qJCNPtx4y5wkLp/FvzTofIuWqVFJSJC4+EoVKkbbw9xwaY+JKXUt1Uwx j+U6EtRhzL/kVX0nq+iQXXeANxCFNzI56Ov5O7mxjF1m/hDE/Gb2QEeIb6nRZC2A H3I2so2J3h1yTgadpGFFvJWaqfHkgcBTsm06tSgHVb86quiTANJIQ9mqfFyOzDDJ KaZ82MROt7UuCMI6X9n+oIBDZWLHmADge6RdHCD1wB+zrUmusCtNEHUZACXd0mPf s8MUK4bWVhViVXGS5bMP =/bnR -----END PGP SIGNATURE----- Merge tag 'pm-4.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These remove a questionable heuristic from the menu cpuidle governor, fix a recent build regression in the intel_pstate driver, clean up ARM big-Little support in cpufreq and fix up hung task watchdog's interaction with system-wide power management transitions. Specifics: - Fix build regression in the intel_pstate driver that doesn't build without CONFIG_ACPI after recent changes (Dominik Brodowski). - One of the heuristics in the menu cpuidle governor is based on a function returning 0 most of the time, so drop it and clean up the scheduler code related to it (Daniel Lezcano). - Prevent the arm_big_little cpufreq driver from being used on ARM64 which is not suitable for it and drop the arm_big_little_dt driver that is not used any more (Sudeep Holla). - Prevent the hung task watchdog from triggering during resume from system-wide sleep states by disabling it before freezing tasks and enabling it again after they have been thawed (Vitaly Kuznetsov)" * tag 'pm-4.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: kernel: hung_task.c: disable on suspend cpufreq: remove unused arm_big_little_dt driver cpufreq: drop ARM_BIG_LITTLE_CPUFREQ support for ARM64 cpufreq: intel_pstate: Fix compilation for !CONFIG_ACPI cpuidle: menu: Remove get_loadavg() from the performance multiplier sched: Factor out nr_iowait and nr_iowait_cpu	2018-10-30 09:08:07 -07:00
Johannes Weiner	8508cf3ffa	sched: loadavg: consolidate LOAD_INT, LOAD_FRAC, CALC_LOAD There are several definitions of those functions/macros in places that mess with fixed-point load averages. Provide an official version. [akpm@linux-foundation.org: fix missed conversion in block/blk-iolatency.c] Link: http://lkml.kernel.org/r/20180828172258.3185-5-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Suren Baghdasaryan <surenb@google.com> Tested-by: Daniel Drake <drake@endlessm.com> Cc: Christopher Lameter <cl@linux.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <jweiner@fb.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Enderborg <peter.enderborg@sony.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-10-26 16:26:32 -07:00
Daniel Lezcano	a7fe5190c0	cpuidle: menu: Remove get_loadavg() from the performance multiplier The function get_loadavg() returns almost always zero. To be more precise, statistically speaking for a total of 1023379 times passing in the function, the load is equal to zero 1020728 times, greater than 100, 610 times, the remaining is between 0 and 5. In 2011, the get_loadavg() was removed from the Android tree because of the above [1]. At this time, the load was: unsigned long this_cpu_load(void) { struct rq this = this_rq(); return this->cpu_load[0]; } In 2014, the code was changed by commit `372ba8cb46` (cpuidle: menu: Lookup CPU runqueues less) and the load is: void get_iowait_load(unsigned long nr_waiters, unsigned long load) { struct rq rq = this_rq(); nr_waiters = atomic_read(&rq->nr_iowait); load = rq->load.weight; } with the same result. Both measurements show using the load in this code path does no matter anymore. Removing it. [1] `4dedd9f124` Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-10-25 16:49:27 +02:00
Rafael J. Wysocki	f1c8e410cd	cpuidle: menu: Avoid computations when result will be discarded If the minimum interval taken into account in the average computation loop in get_typical_interval() is less than the expected idle duration determined so far, the resultant average cannot be greater than that value as well and the entire return result of the function is going to be discarded anyway going forward. In that case, it is a waste of time to carry out the remaining computations in get_typical_interval(), so avoid that by returning early if the minimum interval is not below the expected idle duration. No intentional changes of behavior. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-10-18 09:34:13 +02:00

1 2 3 4 5 ...

681 Commits