linux

Commit Graph

Author	SHA1	Message	Date
Viresh Kumar	44139ed494	cpufreq: Allow drivers to enable boost support after registering driver In some cases it wouldn't be known at time of driver registration, if the driver needs to support boost frequencies. For example, while getting boost information from DT with opp-v2 bindings, we need to parse the bindings for all the CPUs to know if turbo/boost OPPs are supported or not. One way out to do that efficiently is to delay supporting boost mode (i.e. creating /sys/devices/system/cpu/cpufreq/boost file), until the time OPP bindings are parsed. At that point, the driver can enable boost support. This can be done at ->init(), where the frequency table is created. To do that, the driver requires few APIs from cpufreq core that let him do this. This patch provides these APIs. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-08-07 03:25:23 +02:00
Viresh Kumar	71db87ba57	bus: subsys: update return type of ->remove_dev() to void Its return value is not used by the subsys core and nothing meaningful can be done with it, even if we want to use it. The subsys device is anyway getting removed. Update prototype of ->remove_dev() to make its return type as void. Fix all usage sites as well. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-05 17:08:14 -07:00
Pan Xinhui	fba9573b33	cpufreq: Correct a freq check in cpufreq_set_policy() This check was originally added by commit `9c9a43ed27` ("[CPUFREQ] return error when failing to set minfreq").It attempt to return an error on obviously incorrect limits when we echo xxx >.../scaling_max,min_freq Actually we just need check if new_policy->min > new_policy->max. Because at least one of max/min is copied from cpufreq_get_policy(). For example, when we echo xxx > .../scaling_min_freq, new_policy is copied from policy in cpufreq_get_policy. new_policy->max is same with policy->max. new_policy->min is set to a new value. Let me explain it in deduction method, first statement in if (): new_policy->min > policy->max policy->max == new_policy->max ==> new_policy->min > new_policy->max second statement in if(): new_policy->max < policy->min policy->max < policy->min ==>new_policy->min > new_policy->max (induction method) So we have proved that we only need check if new_policy->min > new_policy->max. After apply this patch, we can also modify ->min and ->max at same time if new freq range is very much different from current freq range. For example, if current freq range is 480000-960000, then we want to set this range to 1120000-2240000, we would fail in the past because new_policy->min > policy->max. As long as the cpufreq range is valid, we has no reason to reject the user. So correct the check to avoid such case. Signed-off-by: Pan Xinhui <xinhuix.pan@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-07-31 23:22:16 +02:00
Rafael J. Wysocki	fdd320da84	cpufreq: Lock CPU online/offline in cpufreq_register_driver() To protect against races with concurrent CPU online/offline, call get_online_cpus() before registering a cpufreq driver. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2015-07-31 22:01:19 +02:00
Rafael J. Wysocki	194d99c7e3	cpufreq: Replace recover_policy with new_policy in cpufreq_online() The recover_policy is unsed in cpufreq_online() to indicate whether a new policy object is created or an existing one is reinitialized. The "recover" part of the name is slightly confusing (it should be "reinitialization" rather than "recovery") and the logical not (!) operator is applied to it in almost all of the checks it is used in, so replace that variable with a new one called "new_policy" that will be true in the case of a new policy creation. While at it, drop one of the labels that is jumped to from only one spot. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2015-07-31 22:00:31 +02:00
Rafael J. Wysocki	0b27535287	cpufreq: Separate CPU device registration from CPU online To separate the CPU online interface from the CPU device registration, split cpufreq_online() out of cpufreq_add_dev() and make cpufreq_cpu_callback() call the former, while cpufreq_add_dev() itself will only be used as the CPU device addition subsystem interface callback. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Suggested-by: Russell King <linux@arm.linux.org.uk> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2015-07-31 21:59:37 +02:00
Rafael J. Wysocki	a34e63b144	cpufreq: Pass CPU number to cpufreq_policy_alloc() Change cpufreq_policy_alloc() to take a CPU number instead of a CPU device pointer as its argument, as it is the only function called by cpufreq_add_dev() taking a device pointer argument at this point. That will allow us to split the CPU online part from cpufreq_add_dev() more cleanly going forward. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2015-07-28 17:24:12 +02:00
Rafael J. Wysocki	4d1f3a5bcb	cpufreq: Do not update related_cpus on every policy activation The related_cpus mask includes CPUs whose cpufreq_cpu_data per-CPU pointers have been set the the given policy. Since those pointers are only set at the policy creation time and unset when the policy is deleted, the related_cpus should not be updated between those two operations. For this reason, avoid updating it whenever the first of the "related" CPUs goes online. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2015-07-28 17:24:12 +02:00
Rafael J. Wysocki	d9612a495b	cpufreq: Drop unused dev argument from two functions The dev argument of cpufreq_add_policy_cpu() and cpufreq_add_dev_interface() is not used by any of them, so drop it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2015-07-28 17:24:11 +02:00
Rafael J. Wysocki	d4d854d6c7	cpufreq: Drop unnecessary label from cpufreq_add_dev() The leftover out_release_rwsem label in cpufreq_add_dev() is not necessary any more and confusing, so drop it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2015-07-28 17:24:11 +02:00
Rafael J. Wysocki	11ce707e6c	cpufreq: Drop cpufreq_policy_restore() Notice that when cpufreq_policy_restore() is called, its per-CPU cpufreq_cpu_data variable has been already dereferenced and if that variable is not NULL, the policy local pointer in cpufreq_add_dev() contains its value. Therefore it is not necessary to dereference it again and the policy pointer can be used directly. Moreover, if that pointer is not NULL, the policy is inactive (or the previous check would have made us return from cpufreq_add_dev()) so the restoration code from cpufreq_policy_restore() can be moved to that point in cpufreq_add_dev(). Do that and drop cpufreq_policy_restore(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2015-07-28 17:24:10 +02:00
Rafael J. Wysocki	15c0b4d222	cpufreq: Rework two functions related to CPU offline Since __cpufreq_remove_dev_prepare() and __cpufreq_remove_dev_finish() are about CPU offline rather than about CPU removal, rename them to cpufreq_offline_prepare() and cpufreq_offline_finish(), respectively. Also change their argument from a struct device pointer to a CPU number, because they use the CPU number only internally anyway and make them void as their return values are ignored. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2015-07-28 17:24:10 +02:00
Rafael J. Wysocki	c6e53c69ef	Merge back earlier cpufreq material for v4.3.	2015-07-28 17:21:32 +02:00
Rafael J. Wysocki	559ed40752	cpufreq: Avoid attempts to create duplicate symbolic links After commit `87549141d5` (cpufreq: Stop migrating sysfs files on hotplug) there is a problem with CPUs that share cpufreq policy objects with other CPUs and are initially offline. Say CPU1 shares a policy with CPU0 which is online and is registered first. As part of the registration process, cpufreq_add_dev() is called for it. It creates the policy object and a symbolic link to it from the CPU1's sysfs directory. If CPU1 is registered subsequently and it is offline at that time, cpufreq_add_dev() will attempt to create a symbolic link to the policy object for it, but that link is present already, so a warning about that will be triggered. To avoid that warning, make cpufreq use an additional CPU mask containing related CPUs that are actually present for each policy object. That mask is initialized when the policy object is populated after its creation (for the first online CPU using it) and it includes CPUs from the "policy CPUs" mask returned by the cpufreq driver's ->init() callback that are physically present at that time. Symbolic links to the policy are created only for the CPUs in that mask. If cpufreq_add_dev() is invoked for an offline CPU, it checks the new mask and only creates the symlink if the CPU was not in it (the CPU is added to the mask at the same time). In turn, cpufreq_remove_dev() drops the given CPU from the new mask, removes its symlink to the policy object and returns, unless it is the CPU owning the policy object. In that case, the policy object is moved to a new CPU's sysfs directory or deleted if the CPU being removed was the last user of the policy. While at it, notice that cpufreq_remove_dev() can't fail, because its return value is ignored, so make it ignore return values from __cpufreq_remove_dev_prepare() and __cpufreq_remove_dev_finish() and prevent these functions from aborting on errors returned by __cpufreq_governor(). Also drop the now unused sif argument from them. Fixes: `87549141d5` (cpufreq: Stop migrating sysfs files on hotplug) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reported-and-tested-by: Russell King <linux@arm.linux.org.uk> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2015-07-28 17:19:26 +02:00
Sebastian Andrzej Siewior	454d3a2500	cpufreq: Remove cpufreq_rwsem cpufreq_rwsem was introduced in commit `6eed9404ab` ("cpufreq: Use rwsem for protecting critical sections) in order to replace try_module_get() on the cpu-freq driver. That try_module_get() worked well until the refcount was so heavily used that module removal became more or less impossible. Though when looking at the various (undocumented) protection mechanisms in that code, the randomly sprinkeled around cpufreq_rwsem locking sites are superfluous. The policy, which is acquired in cpufreq_cpu_get() and released in cpufreq_cpu_put() is sufficiently protected already. cpufreq_cpu_get(cpu) /* Protects against concurrent driver removal */ read_lock_irqsave(&cpufreq_driver_lock, flags); policy = per_cpu(cpufreq_cpu_data, cpu); kobject_get(&policy->kobj); read_unlock_irqrestore(&cpufreq_driver_lock, flags); The reference on the policy serializes versus module unload already: cpufreq_unregister_driver() subsys_interface_unregister() __cpufreq_remove_dev_finish() per_cpu(cpufreq_cpu_data) = NULL; cpufreq_policy_put_kobj() If there is a reference held on the policy, i.e. obtained prior to the unregister call, then cpufreq_policy_put_kobj() will wait until that reference is dropped. So once subsys_interface_unregister() returns there is no policy pointer in flight and no new reference can be obtained. So that rwsem protection is useless. The other usage of cpufreq_rwsem in show()/store() of the sysfs interface is redundant as well because sysfs already does the proper kobject_get()/put() pairs. That leaves CPU hotplug versus module removal. The current down_write() around the write_lock() in cpufreq_unregister_driver() is silly at best as it protects actually nothing. The trivial solution to this is to prevent hotplug across cpufreq_unregister_driver completely. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-07-25 01:49:01 +02:00
Viresh Kumar	4bc384ae62	cpufreq: propagate errors returned from __cpufreq_governor() Return codes aren't honored properly in cpufreq_set_policy(). This can lead to two problems: - wrong errors propagated to sysfs - we try to do next state-change even if the previous one failed cpufreq_governor_dbs() now returns proper errors on all invalid state-transition requests and this code should honor that. Reviewed-and-tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-07-21 01:12:02 +02:00
Viresh Kumar	7f0fa40f5a	cpufreq: Properly handle errors from cpufreq_init_policy() cpufreq_init_policy() can fail, and we don't do anything except a call to ->exit() on that. The policy should be freed if this happens. Do it properly. Reported-and-tested-by: "Jon Medhurst (Tixy)" <tixy@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-07-16 23:51:26 +02:00
Viresh Kumar	8101f99703	cpufreq: cpufreq_add_dev: name goto labels based on what they do These labels are are named in two ways normally: - Based on what caused to jump to such labels - Based on what we do under such labels We follow the first naming convention today and that leads to multiple labels for doing the same work. Fix it by switching to the second way of naming them. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-07-16 23:51:25 +02:00
Viresh Kumar	5a31d594a9	cpufreq: Allow freq_table to be obtained for offline CPUs Users of freq table may want to access it for any CPU from policy->related_cpus mask. One such user is cpu-cooling layer. It gets a list of 'clip_cpus' (equivalent to policy->related_cpus) during registration and tries to get freq_table for the first CPU of this mask. If the CPU, for which it tries to fetch freq_table, is offline, cpufreq_frequency_get_table() fails. This happens because it relies on cpufreq_cpu_get_raw() for its functioning which returns policy only for online CPUs. The fix is to access the policy data structure for the given CPU directly (which also returns a valid policy for offline CPUs), but the policy itself has to be active (meaning that at least one CPU using it is online) for the frequency table to be returned. Because we will be using 'cpufreq_cpu_data' now, which is internal to the cpufreq core, move cpufreq_frequency_get_table() to cpufreq.c. Reported-and-tested-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-07-10 01:43:27 +02:00
Viresh Kumar	35afd02e30	cpufreq: Initialize the governor again while restoring policy When all CPUs of a policy are hot-unplugged, we EXIT the governor but don't mark policy->governor as NULL. This was done in order to keep last used governor's information intact in sysfs, while the CPUs are offline. But we also need to clear policy->governor when restoring the policy. Because policy->governor still points to the last governor while policy is restored, following sequence of event happens: - cpufreq_init_policy() called while restoring policy - find_governor() matches last_governor string for present governors and returns last used governor's pointer, say ondemand. policy->governor already has the same address, unless the governor was removed in between. - cpufreq_set_policy() is called with both old/new policies governor set as ondemand. - Because governors matched, we skip governor initialization and return after calling __cpufreq_governor(CPUFREQ_GOV_LIMITS). Because the governor wasn't initialized for this policy, it returned -EBUSY. - cpufreq_init_policy() exits the policy on this error, but doesn't destroy it properly (should be fixed separately). - And so we enter a scenario where the policy isn't completely initialized but used. Fix this by setting policy->governor to NULL while restoring the policy. Reported-and-tested-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org> Reported-and-tested-by: "Jon Medhurst (Tixy)" <tixy@linaro.org> Reported-and-tested-by: Steven Rostedt <rostedt@goodmis.org> Fixes: `18bf3a124e` (cpufreq: Mark policy->governor = NULL for inactive policies) Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-07-10 01:36:27 +02:00
Viresh Kumar	3782902983	cpufreq: Remove cpufreq_update_policy() cpufreq_update_policy() was kept as a separate routine earlier as it was handling migration of sysfs directories, which isn't the case anymore. It is only updating policy->cpu now and is called by a single caller. The WARN_ON() isn't really required anymore, as we are just updating the cpu now, not moving the sysfs directories. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-06-11 01:03:04 +02:00
Viresh Kumar	9591becbf2	cpufreq: Restart governor as soon as possible __cpufreq_remove_dev_finish() is doing two things today: - Restarts the governor if some CPUs from concerned policy are still online. - Frees the policy if all CPUs are offline. The first task of restarting the governor can be moved to __cpufreq_remove_dev_prepare() to restart the governor early. There is no race between _prepare() and _finish() as they would be handling completely different cases. _finish() will only be required if we are going to free the policy and that has nothing to do with restarting the governor. Original-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-06-11 01:02:45 +02:00
Viresh Kumar	3654c5cc81	cpufreq: Call cpufreq_policy_put_kobj() from cpufreq_policy_free() cpufreq_policy_put_kobj() is actually part of freeing the policy and can be called from cpufreq_policy_free() directly instead of a separate call. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-06-11 01:02:40 +02:00
Viresh Kumar	2fc3384dc7	cpufreq: Initialize policy->kobj while allocating policy policy->kobj is required to be initialized once in the lifetime of a policy. Currently we are initializing it from __cpufreq_add_dev() and that doesn't look to be the best place for doing so as we have to do this on special cases (like: !recover_policy). We can initialize it from a more obvious place cpufreq_policy_alloc() and that will make code look cleaner, specially the error handling part. The error handling part of __cpufreq_add_dev() was doing almost the same thing while recover_policy is true or false. Fix that as well by always calling cpufreq_policy_put_kobj() with an additional parameter to skip notification part of it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-06-11 01:01:54 +02:00
Viresh Kumar	87549141d5	cpufreq: Stop migrating sysfs files on hotplug When we hot-unplug a cpu, we remove its sysfs cpufreq directory and if the outgoing cpu was the owner of policy->kobj earlier then we migrate the sysfs directory to under another online cpu. There are few disadvantages this brings: - Code Complexity - Slower hotplug/suspend/resume - sysfs file permissions are reset after all policy->cpus are offlined - CPUFreq stats history lost after all policy->cpus are offlined - Special management of sysfs stuff during suspend/resume To overcome these, this patch modifies the way sysfs directories are managed: - Select sysfs kobjects owner while initializing policy and don't change it during hotplugs. Track it with kobj_cpu created earlier. - Create symlinks for all related CPUs (can be offline) instead of affected CPUs on policy initialization and remove them only when the policy is freed. - Free policy structure only on the removal of cpufreq-driver and not during hotplug/suspend/resume, detected by checking 'struct subsys_interface *' (Valid only when called from subsys_interface_unregister() while unregistering driver). Apart from this, special care is taken to handle physical hoplug of CPUs as we wouldn't remove sysfs links or remove policies on logical hotplugs. Physical hotplug happens in the following sequence. Hot removal: - CPU is offlined first, ~ 'echo 0 > /sys/devices/system/cpu/cpuX/online' - Then its device is removed along with all sysfs files, cpufreq core notified with cpufreq_remove_dev() callback from subsys-interface.. Hot addition: - First the device along with its sysfs files is added, cpufreq core notified with cpufreq_add_dev() callback from subsys-interface.. - CPU is onlined, ~ 'echo 1 > /sys/devices/system/cpu/cpuX/online' We call the same routines with both hotplug and subsys callbacks, and we sense physical hotplug with cpu_offline() check in subsys callback. We can handle most of the stuff with regular hotplug callback paths and add/remove cpufreq sysfs links or free policy from subsys callbacks. Original-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-06-11 01:00:42 +02:00
Viresh Kumar	11e584cfb8	cpufreq: Don't allow updating inactive policies from sysfs Later commits would change the way policies are managed today. Policies wouldn't be freed on cpu hotplug (currently they aren't freed only for suspend), and while the CPU is offline, the sysfs cpufreq files would still be present. User may accidentally try to update the sysfs files in following directory: '/sys/devices/system/cpu/cpuX/cpufreq/'. And that would result in undefined behavior as policy wouldn't be active then. Apart from updating the store() routine, we also update __cpufreq_get() which can call cpufreq_out_of_sync(). The later routine tries to update policy->cur and starts notifying kernel about it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-06-10 02:11:45 +02:00
Saravana Kannan	9d16f20711	cpufreq: Track cpu managing sysfs kobjects separately In order to prepare for the next few commits, that will stop migrating sysfs files on cpu hotplug, this patch starts managing sysfs-cpu separately. The behavior is still the same as we are still migrating sysfs files on hotplug, later commits would change that. Signed-off-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-23 00:49:04 +02:00
Shailendra Verma	58405af632	cpufreq: Fix for typos in two comments Signed-off-by: Shailendra Verma <shailendra.capricorn@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-22 23:59:44 +02:00
Viresh Kumar	18bf3a124e	cpufreq: Mark policy->governor = NULL for inactive policies Later commits would change the way policies are managed today. Policies wouldn't be freed on cpu hotplug (currently they aren't freed on suspend), and while the CPU is offline, the sysfs cpufreq files would still be present. Because we don't mark policy->governor as NULL, it still contains pointer of the last used governor. And if the governor is removed, while all the CPUs of a policy are hotplugged out, this pointer wouldn't be valid anymore. And if we try to read the 'scaling_governor', etc. from sysfs, it will result in kernel OOPs. To prevent this, mark policy->governor as NULL for all inactive policies while the governor is removed from kernel. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-15 02:46:45 +02:00
Viresh Kumar	4573237b01	cpufreq: Manage governor usage history with 'policy->last_governor' History of which governor was used last is common to all CPUs within a policy and maintaining it per-cpu isn't the best approach for sure. Apart from wasting memory, this also increases the complexity of managing this data structure as it has to be updated for all CPUs. To make that somewhat simpler, lets store this information in a new field 'last_governor' in struct cpufreq_policy and update it on removal of last cpu of a policy. As a side-effect it also solves an old problem, consider a system with two clusters 0 & 1. And there is one policy per cluster. Cluster 0: CPU0 and 1. Cluster 1: CPU2 and 3. - CPU2 is first brought online, and governor is set to performance (default as cpufreq_cpu_governor wasn't set). - Governor is changed to ondemand. - CPU2 is taken offline and cpufreq_cpu_governor is updated for CPU2. - CPU3 is brought online. - Because cpufreq_cpu_governor wasn't set for CPU3, the default governor performance is picked for CPU3. This patch fixes the bug as we now have a single variable to update for policy. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-15 02:44:17 +02:00
Viresh Kumar	9104bb26c7	cpufreq: Don't traverse all active policies to find policy for a cpu We reach here while adding policy for a CPU and enter into the 'if' block only if a policy already exists for the CPU. As cpufreq_cpu_data is set for all policy->related_cpus now, when the policy is first added, we can use that to find the CPU's policy instead of traversing the list of all active policies. Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-15 02:38:18 +02:00
Viresh Kumar	3914d37910	cpufreq: Get rid of cpufreq_cpu_data_fallback We can extract the same information from cpufreq_cpu_data as it is also available for inactive policies now. And so don't need cpufreq_cpu_data_fallback anymore. Also add a WARN_ON() for the case where we try to restore from an active policy. Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-15 02:35:57 +02:00
Viresh Kumar	988bed09d3	cpufreq: Don't clear cpufreq_cpu_data and policy list for inactive policies Now that we can check policy->cpus to find if policy is active or not, we don't need to clean cpufreq_cpu_data and delete policy from the list on light weight tear down of policies (like in suspend). To make it consistent and clean, set cpufreq_cpu_data for all related CPUs when the policy is first created and clean it only while it is freed. Also update cpufreq_cpu_get_raw() to check if cpu is part of policy->cpus mask, so that we don't end up getting policies for offline CPUs. In order to make sure that no users of 'policy' are using an inactive policy, use cpufreq_cpu_get_raw() instead of directly accessing cpufreq_cpu_data. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-15 02:32:46 +02:00
Viresh Kumar	f963735a3c	cpufreq: Create for_each_{in}active_policy() policy->cpus is cleared unconditionally now on hotplug-out of a CPU and it can be checked to know if a policy is active or not. Create helper routines to iterate over all active/inactive policies, based on policy->cpus field. Replace all instances of for_each_policy() with for_each_active_policy() to make them iterate only for active policies. (We haven't made changes yet to keep inactive policies in the same list, but that will be followed in a later patch). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-15 02:26:07 +02:00
Viresh Kumar	303ae72307	cpufreq: Clear policy->cpus even for the last CPU We clear policy->cpus mask while CPUs are hotplugged out. We do it for all CPUs except the last CPU of the policy. I don't remember what the rationale behind that was, but I couldn't think of anything that will break if we remove this conditional clearing and always clear policy->cpus. The benefit we get out of it is, we can know if a policy is active or not by checking if this field is empty or not. That will be used by later commits. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-07 23:38:35 +02:00
Viresh Kumar	bb29ae152e	cpufreq: Keep a single path for adding managed CPUs There are two cases when we may try to add CPUs we're already handling: - On boot, the first cpu has marked all policy->cpus managed and so we will find policy for all other policy->cpus later on. - When a managed cpu is hotplugged out and later brought back in. Currently, separate paths and checks take care of the two. While the first one is detected by testing cpu against 'policy->cpus', the other one is detected by testing cpu against 'policy->related_cpus'. We can handle them both via a single path and there is no need to do special checking for the first one. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> [ rjw: Changelog, comments ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-07 23:36:41 +02:00
Viresh Kumar	1b947c904c	cpufreq: Throw warning when we try to get policy for an invalid CPU Simply returning here with an error is not enough. It shouldn't be allowed at all to try calling cpufreq_cpu_get() for an invalid CPU. Add a WARN here to make it clear that it wouldn't be acceptable at all. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-07 23:29:57 +02:00
Viresh Kumar	23faf0b743	cpufreq: Merge __cpufreq_add_dev() and cpufreq_add_dev() cpufreq_add_dev() is an unnecessary wrapper over __cpufreq_add_dev(). Merge them. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-07 23:28:28 +02:00
Viresh Kumar	50e9c85213	cpufreq: Add doc style comment about cpufreq_cpu_{get\|put}() This clearly states what the code inside these routines is doing and how these must be used. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-07 23:27:20 +02:00
Viresh Kumar	c75de0ac07	cpufreq: Schedule work for the first-online CPU on resume All CPUs leaving the first-online CPU are hotplugged out on suspend and and cpufreq core stops managing them. On resume, we need to call cpufreq_update_policy() for this CPU's policy to make sure its frequency is in sync with cpufreq's cached value, as it might have got updated by hardware during suspend/resume. The policies are always added to the top of the policy-list. So, in normal circumstances, CPU 0's policy will be the last one in the list. And so the code checks for the last policy. But there are cases where it will fail. Consider quad-core system, with policy-per core. If CPU0 is hotplugged out and added back again, the last policy will be on CPU1 :( To fix this in a proper way, always look for the policy of the first online CPU. That way we will be sure that we are calling cpufreq_update_policy() for the only CPU that wasn't hotplugged out. Cc: 3.15+ <stable@vger.kernel.org> # 3.15+ Fixes: `2f0aea9363` ("cpufreq: suspend governors on system suspend/hibernate") Reported-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-04-03 12:59:47 +02:00
Viresh Kumar	f7b2706117	cpufreq: Create for_each_governor() To make code more readable and less error prone, lets create a helper macro for iterating over all available governors. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-02-03 23:28:01 +01:00
Viresh Kumar	b4f0676fe2	cpufreq: Create for_each_policy() To make code more readable and less error prone, lets create a helper macro for iterating over all active policies. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-02-03 23:27:45 +01:00
Viresh Kumar	1e63eaf0c4	cpufreq: Drop cpufreq_disabled() check from cpufreq_cpu_{get\|put}() When cpufreq is disabled, the per-cpu variable would have been set to NULL. Remove this unnecessary check. [ Changelog from Saravana Kannan. ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-02-03 23:26:02 +01:00
Viresh Kumar	6ffae8c06f	cpufreq: Set cpufreq_cpu_data to NULL before putting kobject In __cpufreq_remove_dev_finish(), per-cpu 'cpufreq_cpu_data' needs to be cleared before calling kobject_put(&policy->kobj) and under cpufreq_driver_lock. Otherwise, if someone else calls cpufreq_cpu_get() in parallel with it, they can obtain a non-NULL policy from that after kobject_put(&policy->kobj) was executed. Consider this case: Thread A Thread B cpufreq_cpu_get() acquire cpufreq_driver_lock read-per-cpu cpufreq_cpu_data kobject_put(&policy->kobj); kobject_get(&policy->kobj); ... per_cpu(&cpufreq_cpu_data, cpu) = NULL And this will result in a warning like this one: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 4 at include/linux/kref.h:47 kobject_get+0x41/0x50() Modules linked in: acpi_cpufreq(+) nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c sd_mod ixgbe igb mdio ahci hwmon ... Call Trace: [<ffffffff81661b14>] dump_stack+0x46/0x58 [<ffffffff81072b61>] warn_slowpath_common+0x81/0xa0 [<ffffffff81072c7a>] warn_slowpath_null+0x1a/0x20 [<ffffffff812e16d1>] kobject_get+0x41/0x50 [<ffffffff815262a5>] cpufreq_cpu_get+0x75/0xc0 [<ffffffff81527c3e>] cpufreq_update_policy+0x2e/0x1f0 [<ffffffff810b8cb2>] ? up+0x32/0x50 [<ffffffff81381aa9>] ? acpi_ns_get_node+0xcb/0xf2 [<ffffffff81381efd>] ? acpi_evaluate_object+0x22c/0x252 [<ffffffff813824f6>] ? acpi_get_handle+0x95/0xc0 [<ffffffff81360967>] ? acpi_has_method+0x25/0x40 [<ffffffff81391e08>] acpi_processor_ppc_has_changed+0x77/0x82 [<ffffffff81089566>] ? move_linked_works+0x66/0x90 [<ffffffff8138e8ed>] acpi_processor_notify+0x58/0xe7 [<ffffffff8137410c>] acpi_ev_notify_dispatch+0x44/0x5c [<ffffffff8135f293>] acpi_os_execute_deferred+0x15/0x22 [<ffffffff8108c910>] process_one_work+0x160/0x410 [<ffffffff8108d05b>] worker_thread+0x11b/0x520 [<ffffffff8108cf40>] ? rescuer_thread+0x380/0x380 [<ffffffff81092421>] kthread+0xe1/0x100 [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0 [<ffffffff81669ebc>] ret_from_fork+0x7c/0xb0 [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0 ---[ end trace 89e66eb9795efdf7 ]--- The actual code flow is as follows: Thread A: Workqueue: kacpi_notify acpi_processor_notify() acpi_processor_ppc_has_changed() cpufreq_update_policy() cpufreq_cpu_get() kobject_get() Thread B: xenbus_thread() xenbus_thread() msg->u.watch.handle->callback() handle_vcpu_hotplug_event() vcpu_hotplug() cpu_down() __cpu_notify(CPU_POST_DEAD..) cpufreq_cpu_callback() __cpufreq_remove_dev_finish() cpufreq_policy_put_kobj() kobject_put() cpufreq_cpu_get() gets the policy from per-cpu variable cpufreq_cpu_data under cpufreq_driver_lock, and once it gets a valid policy it expects it to not be freed until cpufreq_cpu_put() is called. But the race happens when another thread puts the kobject first and updates cpufreq_cpu_data before or later. And so the first thread gets a valid policy structure and before it does kobject_get() on it, the second one has already done kobject_put(). Fix this by setting cpufreq_cpu_data to NULL before putting the kobject and that too under locks. Reported-by: Ethan Zhao <ethan.zhao@oracle.com> Reported-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 3.12+ <stable@vger.kernel.org> # 3.12+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-02-03 00:59:29 +01:00
Viresh Kumar	d9f354460d	cpufreq: remove CPUFREQ_UPDATE_POLICY_CPU notifications CPUFREQ_UPDATE_POLICY_CPU notifications were used only from cpufreq-stats which doesn't use it anymore. Remove them. This also decrements values of other notification macros defined after CPUFREQ_UPDATE_POLICY_CPU by 1 to remove gaps. Hopefully all users are using macro's instead of direct numbers and so they wouldn't break as macro values are changed now. Reviewed-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 23:06:44 +01:00
Viresh Kumar	7c418ff099	cpufreq: Remove (now) unused 'last_cpu' from struct cpufreq_policy 'last_cpu' was used only from cpufreq-stats and isn't used anymore. Get rid of it. Reviewed-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 23:06:44 +01:00
Viresh Kumar	818c57126e	cpufreq: move some initialization stuff to cpufreq_policy_alloc() We need to initialize completion and work only on policy allocation and not really on the policy restore side and so we better move this piece of code to cpufreq_policy_alloc(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:35 +01:00
Viresh Kumar	ce1bcfe94d	cpufreq: check cpufreq_policy_list instead of scanning policies for all CPUs CPUFREQ_STICKY flag is set by drivers which don't want to get unregistered even if cpufreq-core isn't able to initialize policy for any CPU. When this flag isn't set, we try to unregister the driver. To find out which CPUs are registered and which are not, we try to check per_cpu cpufreq_cpu_data for all CPUs. Because we have a list of valid policies available now, we better check if the list is empty or not instead of the 'for' loop. That will be much more efficient. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:35 +01:00
Viresh Kumar	39c132eebd	cpufreq: limit the scope of l_p_j variables These variables are just used within adjust_jiffies() and so must be local to it. Also there is no need of a dummy routine for CONFIG_SMP case as we can take care of all that with help of macros in the same routine. It doesn't look that ugly. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	d7a9771c1a	cpufreq: use light-weight cpufreq_cpu_get_raw() in __cpufreq_add_dev() We just need to check if a 'policy' is already present for the cpu we are adding. We don't need to take all the locks and do kobject usage updates. Use the light-weight cpufreq_cpu_get_raw() routine instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	7f0c020ab6	cpufreq: get rid of 'tpolicy' from __cpufreq_add_dev() There is no need of this separate variable, use 'policy' instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	22a7cfb014	cpufreq: get rid of CONFIG_{HOTPLUG_CPU\|SMP} mess These are messing up more than the benefit they provide. It isn't a lot of code anyway, that we will compile without them. Kill them. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	bc68b7dfda	cpufreq: update driver_data->flags only if we are registering driver We should first check if a cpufreq driver is already registered or not before updating driver_data->flags. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	d92d50a462	cpufreq: pass policy to __cpufreq_get() There is no point finding out the 'policy' again within __cpufreq_get() when all the callers already have it. Just make them pass policy instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	a1e1dc41c4	cpufreq: pass policy to cpufreq_out_of_sync There is no point finding out the 'policy' again within cpufreq_out_of_sync() when all the callers already have it. Just make them pass policy instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	2e1cc3a5d7	cpufreq: No need to check for has_target() Either we can be setpolicy or target type, nothing else. And so the else part of setpolicy will automatically be of has_target() type. And so we don't need to check it again. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	42f91fa116	cpufreq: s/__find_governor/find_governor Remove unnecessary from find_governor's name. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:33 +01:00
Viresh Kumar	db5f299574	cpufreq: merge 'if' blocks in __cpufreq_remove_dev_prepare() There are two 'if' blocks here, checking for !cpufreq_driver->setpolicy and has_target(). Both are actually doing the same thing, merge them. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:33 +01:00
Viresh Kumar	09347b2905	cpufreq: don't need line break in show_scaling_cur_freq() No need of an unnecessary line break. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:33 +01:00
Viresh Kumar	f13f1184a1	cpufreq: remove extra parenthesis We can live without it and so we should. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:33 +01:00
Viresh Kumar	e673f1639b	cpufreq: remove dangling comment It doesn't make any sense at all and is a leftover of some earlier commit. Remove it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:33 +01:00
Doug Anderson	90de2a4aa9	cpufreq: suspend cpufreq governors on shutdown We should stop cpufreq governors when we shut down the system. If we don't do this, we can end up with this deadlock: 1. cpufreq governor may be running on a CPU other than CPU0. 2. In machine_restart() we call smp_send_stop() which stops CPUs. If one of these CPUs was actively running a cpufreq governor then it may have the mutex / spinlock needed to access the main PMIC in the system (perhaps over I2C) 3. If a machine needs access to the main PMIC in order to shutdown then it will never get it since the mutex was lost when the other CPU stopped. 4. We'll hang (possibly eventually hitting the hard lockup detector). Let's avoid the problem by stopping the cpufreq governor at shutdown, which is a sensible thing to do anyway. Signed-off-by: Doug Anderson <dianders@chromium.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:20:30 +01:00
Ethan Zhao	cb57720bf7	cpufreq: fix a NULL pointer dereference in __cpufreq_governor() If ACPI _PPC changed notification happens before governor was initiated while kernel is booting, a NULL pointer dereference will be triggered: BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 IP: [<ffffffff81470453>] __cpufreq_governor+0x23/0x1e0 PGD 0 Oops: 0000 [#1] SMP ... ... RIP: 0010:[<ffffffff81470453>] [<ffffffff81470453>] __cpufreq_governor+0x23/0x1e0 RSP: 0018:ffff881fcfbcfbb8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff881fd11b3980 RCX: ffff88407fc20000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff881fd11b3980 RBP: ffff881fcfbcfbd8 R08: 0000000000000000 R09: 000000000000000f R10: ffffffff818068d0 R11: 0000000000000043 R12: 0000000000000004 R13: 0000000000000000 R14: ffffffff8196cae0 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff881fffc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000030 CR3: 00000000018ae000 CR4: 00000000000407f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kworker/0:3 (pid: 750, threadinfo ffff881fcfbce000, task ffff881fcf556400) Stack: ffff881fffc17d00 ffff881fcfbcfc18 ffff881fd11b3980 0000000000000000 ffff881fcfbcfc08 ffffffff81470d08 ffff881fd11b3980 0000000000000007 ffff881fcfbcfc18 ffff881fffc17d00 ffff881fcfbcfd28 ffffffff81472e9a Call Trace: [<ffffffff81470d08>] __cpufreq_set_policy+0x1b8/0x2e0 [<ffffffff81472e9a>] cpufreq_update_policy+0xca/0x150 [<ffffffff81472f20>] ? cpufreq_update_policy+0x150/0x150 [<ffffffff81324a96>] acpi_processor_ppc_has_changed+0x71/0x7b [<ffffffff81320bcd>] acpi_processor_notify+0x55/0x115 [<ffffffff812f9c29>] acpi_device_notify+0x19/0x1b [<ffffffff813084ca>] acpi_ev_notify_dispatch+0x41/0x5f [<ffffffff812f64a4>] acpi_os_execute_deferred+0x27/0x34 The root cause is a race conditon -- cpufreq core and acpi-cpufreq driver were initiated, but cpufreq_governor wasn't and _PPC changed notification happened, __cpufreq_governor() was called within acpi_os_execute_deferred kernel thread context. To fix this panic issue, add pointer checking code in __cpufreq_governor() before pointer policy->governor is to be dereferenced. Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-12-19 22:49:07 +01:00
Viresh Kumar	7c45cf31b3	cpufreq: Introduce ->ready() callback for cpufreq drivers Currently there is no callback for cpufreq drivers which is called once the policy is ready to be used. There are some requirements where such a callback is required. One of them is registering a cooling device with the help of of_cpufreq_cooling_register(). This routine tries to get 'struct cpufreq_policy' for CPUs which isn't yet initialed at the time ->init() is called and so we face issues while registering the cooling device. Because we can't register cooling device from ->init(), we need a callback that is called after the policy is ready to be used and hence we introduce ->ready() callback. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Eduardo Valentin <edubezval@gmail.com> Tested-by: Eduardo Valentin <edubezval@gmail.com> Reviewed-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-11-29 23:38:38 +01:00
Tomeu Vizoso	6d4e81ed89	cpufreq: Ref the policy object sooner Do it before it's assigned to cpufreq_cpu_data, otherwise when a driver tries to get the cpu frequency during initialization the policy kobj is referenced and we get this warning: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 64 at include/linux/kref.h:47 kobject_get+0x64/0x70() Modules linked in: CPU: 1 PID: 64 Comm: irq/77-tegra-ac Not tainted 3.18.0-rc4-next-20141114ccu-00050-g3eff942 #326 [<c0016fac>] (unwind_backtrace) from [<c001272c>] (show_stack+0x10/0x14) [<c001272c>] (show_stack) from [<c06085d8>] (dump_stack+0x98/0xd8) [<c06085d8>] (dump_stack) from [<c002892c>] (warn_slowpath_common+0x84/0xb4) [<c002892c>] (warn_slowpath_common) from [<c00289f8>] (warn_slowpath_null+0x1c/0x24) [<c00289f8>] (warn_slowpath_null) from [<c0220290>] (kobject_get+0x64/0x70) [<c0220290>] (kobject_get) from [<c03e944c>] (cpufreq_cpu_get+0x88/0xc8) [<c03e944c>] (cpufreq_cpu_get) from [<c03e9500>] (cpufreq_get+0xc/0x64) [<c03e9500>] (cpufreq_get) from [<c0285288>] (actmon_thread_isr+0x134/0x198) [<c0285288>] (actmon_thread_isr) from [<c0069008>] (irq_thread_fn+0x1c/0x40) [<c0069008>] (irq_thread_fn) from [<c0069324>] (irq_thread+0x134/0x174) [<c0069324>] (irq_thread) from [<c0040290>] (kthread+0xdc/0xf4) [<c0040290>] (kthread) from [<c000f4b8>] (ret_from_fork+0x14/0x3c) ---[ end trace b7bd64a81b340c59 ]--- Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-11-25 22:44:17 +01:00
Rafael J. Wysocki	bd2a0f6754	Merge back cpufreq material for 3.19-rc1.	2014-11-18 01:22:29 +01:00
Vince Hsu	619c144c84	cpufreq: respect the min/max settings from user space When the user space tries to set scaling_(max\|min)_freq through sysfs, the cpufreq_set_policy() asks other driver's opinions for the max/min frequencies. Some device drivers, like Tegra CPU EDP which is not upstreamed yet though, may constrain the CPU maximum frequency dynamically because of board design. So if the user space access happens and some driver is capping the cpu frequency at the same time, the user_policy->(max\|min) is overridden by the capped value, and that's not expected by the user space. And if the user space is not invoked again, the CPU will always be capped by the user_policy->(max\|min) even no drivers limit the CPU frequency any more. This patch preserves the user specified min/max settings, so that every time the cpufreq policy is updated, the new max/min can be re-evaluated correctly based on the user's expection and the present device drivers' status. Signed-off-by: Vince Hsu <vinceh@nvidia.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-11-11 23:04:28 +01:00
Geert Uytterhoeven	09712f557b	cpufreq: Avoid crash in resume on SMP without OPP When resuming from s2ram on an SMP system without cpufreq operating points (e.g. there's no "operating-points" property for the CPU node in DT, or the platform doesn't use DT yet), the kernel crashes when bringing CPU 1 online: Enabling non-boot CPUs ... CPU1: Booted secondary processor Unable to handle kernel NULL pointer dereference at virtual address 0000003c pgd = ee5e6b00 [0000003c] pgd=6e579003, pmd=6e588003, *pte=00000000 Internal error: Oops: a07 [#1] SMP ARM Modules linked in: CPU: 0 PID: 1246 Comm: s2ram Tainted: G W 3.18.0-rc3-koelsch-01614-g0377af242bb175c8-dirty #589 task: eeec5240 ti: ee704000 task.ti: ee704000 PC is at __cpufreq_add_dev.isra.24+0x24c/0x77c LR is at __cpufreq_add_dev.isra.24+0x244/0x77c pc : [<c0298efc>] lr : [<c0298ef4>] psr: 60000153 sp : ee705d48 ip : ee705d48 fp : ee705d84 r10: c04e0450 r9 : 00000000 r8 : 00000001 r7 : c05426a8 r6 : 00000001 r5 : 00000001 r4 : 00000000 r3 : 00000000 r2 : 00000000 r1 : 20000153 r0 : c0542734 Verify that policy is not NULL before dereferencing it to fix this. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Fixes: `8414809c6a` (cpufreq: Preserve policy structure across suspend/resume) Cc: 3.12+ <stable@vger.kernel.org> # 3.12+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-11-08 02:10:04 +01:00
Dirk Brandewie	c034b02e21	cpufreq: expose scaling_cur_freq sysfs file for set_policy() drivers Currently the core does not expose scaling_cur_freq for set_policy() drivers this breaks some userspace monitoring tools. Change the core to expose this file for all drivers and if the set_policy() driver supports the get() callback use it to retrieve the current frequency. Link: https://bugzilla.kernel.org/show_bug.cgi?id=73741 Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-10-23 22:59:59 +02:00
Thomas Petazzoni	51315cdfa0	cpufreq: allow driver-specific data This commit extends the cpufreq_driver structure with an additional 'void *driver_data' field that can be filled by the ->probe() function of a cpufreq driver to pass additional custom information to the driver itself. A new function called cpufreq_get_driver_data() is added to allow a cpufreq driver to retrieve those driver data, since they are typically needed from a cpufreq_policy->init() callback, which does not have access to the cpufreq_driver structure. This function call is similar to the existing cpufreq_get_current_driver() function call. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-10-21 00:51:01 +02:00
Rafael J. Wysocki	6f1293ff74	Merge back cpufreq material for v3.18.	2014-10-03 15:41:16 +02:00
Viresh Kumar	b1b12babe3	cpufreq: update 'cpufreq_suspended' after stopping governors Commit `8e30444e15` ("cpufreq: fix cpufreq suspend/resume for intel_pstate") introduced a bug where the governors wouldn't be stopped anymore for ->target{_index}() drivers during suspend. This happens because 'cpufreq_suspended' is updated before stopping the governors during suspend and due to this __cpufreq_governor() would return early due to this check: /* Don't start any governor operations if we are entering suspend */ if (cpufreq_suspended) return 0; Fixes: `8e30444e15` ("cpufreq: fix cpufreq suspend/resume for intel_pstate") Cc: 3.15+ <stable@vger.kernel.org> # 3.15+: `8e30444e15` "cpufreq: fix cpufreq suspend/resume for intel_pstate" Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-09-30 21:02:34 +02:00
Rasmus Villemoes	7c4f453970	cpufreq: Replace strnicmp with strncasecmp The kernel used to contain two functions for length-delimited, case-insensitive string comparison, strnicmp with correct semantics and a slightly buggy strncasecmp. The latter is the POSIX name, so strnicmp was renamed to strncasecmp, and strnicmp made into a wrapper for the new strncasecmp to avoid breaking existing users. To allow the compat wrapper strnicmp to be removed at some point in the future, and to avoid the extra indirection cost, do s/strnicmp/strncasecmp/g. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-09-29 15:53:04 +02:00
Preeti U Murthy	789ca24374	cpufreq: Allow stop CPU callback to be used by all cpufreq drivers Commit `367dc4aa93` ("cpufreq: Add stop CPU callback to cpufreq_driver interface") introduced the stop CPU callback for intel_pstate drivers. During the CPU_DOWN_PREPARE stage, this callback is invoked so that drivers can take some action on the pstate of the cpu before it is taken offline. This callback was assumed to be useful only for those drivers which have implemented the set_policy CPU callback because they have no other way to take action about the cpufreq of a CPU which is being hotplugged out except in the exit callback which is called very late in the offline process. The drivers which implement the target/target_index callbacks were expected to take care of requirements like the ones that commit `367dc4aa` addresses in the GOV_STOP notification event. But there are disadvantages to restricting the usage of stop CPU callback to cpufreq drivers that implement the set_policy callbacks and who want to take explicit action on the setting the cpufreq during a hotplug operation. 1.GOV_STOP gets called for every CPU offline and drivers would usually want to take action when the last cpu in the policy->cpus mask is taken offline. As long as there is more than one cpu in the policy->cpus mask, cpufreq core itself makes sure that the freq for the other cpus in this mask is set according to the maximum load. This is sensible and drivers which implement the target_index callback would mostly not want to modify that. However the cpufreq core leaves a loose end when the cpu in the policy->cpus mask is the last one to go offline; it does nothing explicit to the frequency of the core. Drivers may need a way to take some action here and stop CPU callback mechanism is the best way to do it today. 2. We cannot implement driver specific actions in the GOV_STOP mechanism. So we will need another driver callback which is invoked from here which is unnecessary. Therefore this patch extends the usage of stop CPU callback to be used by all cpufreq drivers as long as they have this callback implemented and irrespective of whether they are set_policy/target_index drivers. The assumption is if the drivers find the GOV_STOP path to be a suitable way of implementing what they want to do with the freq of the cpu going offine,they will not implement the stop CPU callback at all. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-09-29 15:47:12 +02:00
Prarit Bhargava	7106e02bae	cpufreq: release policy->rwsem on error While debugging a cpufreq-related hardware failure on a system I saw the following lockdep warning: ========================= [ BUG: held lock freed! ] 3.17.0-rc4+ #1 Tainted: G E ------------------------- insmod/2247 is freeing memory ffff88006e1b1400-ffff88006e1b17ff, with a lock still held there! (&policy->rwsem){+.+...}, at: [<ffffffff8156d37d>] __cpufreq_add_dev.isra.21+0x47d/0xb80 3 locks held by insmod/2247: #0: (subsys mutex#5){+.+.+.}, at: [<ffffffff81485579>] subsys_interface_register+0x69/0x120 #1: (cpufreq_rwsem){.+.+.+}, at: [<ffffffff8156cf73>] __cpufreq_add_dev.isra.21+0x73/0xb80 #2: (&policy->rwsem){+.+...}, at: [<ffffffff8156d37d>] __cpufreq_add_dev.isra.21+0x47d/0xb80 stack backtrace: CPU: 0 PID: 2247 Comm: insmod Tainted: G E 3.17.0-rc4+ #1 Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 08/24/2013 0000000000000000 000000008f3063c4 ffff88006f87bb30 ffffffff8171b358 ffff88006bcf3750 ffff88006f87bb68 ffffffff810e09e1 ffff88006e1b1400 ffffea0001b86c00 ffffffff8156d327 ffff880073003500 0000000000000246 Call Trace: [<ffffffff8171b358>] dump_stack+0x4d/0x66 [<ffffffff810e09e1>] debug_check_no_locks_freed+0x171/0x180 [<ffffffff8156d327>] ? __cpufreq_add_dev.isra.21+0x427/0xb80 [<ffffffff8121412b>] kfree+0xab/0x2b0 [<ffffffff8156d327>] __cpufreq_add_dev.isra.21+0x427/0xb80 [<ffffffff81724cf7>] ? _raw_spin_unlock+0x27/0x40 [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffff8156da8e>] cpufreq_add_dev+0xe/0x10 [<ffffffff814855d1>] subsys_interface_register+0xc1/0x120 [<ffffffff8156bcf2>] cpufreq_register_driver+0x112/0x340 [<ffffffff8121415a>] ? kfree+0xda/0x2b0 [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffffa003562e>] pcc_cpufreq_init+0x4af/0xe81 [pcc_cpufreq] [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffff81002144>] do_one_initcall+0xd4/0x210 [<ffffffff811f7472>] ? __vunmap+0xd2/0x120 [<ffffffff81127155>] load_module+0x1315/0x1b70 [<ffffffff811222a0>] ? store_uevent+0x70/0x70 [<ffffffff811229d9>] ? copy_module_from_fd.isra.44+0x129/0x180 [<ffffffff81127b86>] SyS_finit_module+0xa6/0xd0 [<ffffffff81725b69>] system_call_fastpath+0x16/0x1b cpufreq: __cpufreq_add_dev: ->get() failed insmod: ERROR: could not insert module pcc-cpufreq.ko: No such device The warning occurs in the __cpufreq_add_dev() code which does down_write(&policy->rwsem); ... if (cpufreq_driver->get && !cpufreq_driver->setpolicy) { policy->cur = cpufreq_driver->get(policy->cpu); if (!policy->cur) { pr_err("%s: ->get() failed\n", __func__); goto err_get_freq; } If cpufreq_driver->get(policy->cpu) returns an error we execute the code at err_get_freq, which does not up the policy->rwsem. This causes the lockdep warning. Trivial patch to up the policy->rwsem in the error path. After the patch has been applied, and an error occurs in the cpufreq_driver->get(policy->cpu) call we will now see cpufreq: __cpufreq_add_dev: ->get() failed cpufreq: __cpufreq_add_dev: ->get() failed modprobe: ERROR: could not insert 'pcc_cpufreq': No such device Fixes: `4e97b631f2` (cpufreq: Initialize governor for a new policy under policy->rwsem) Signed-off-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 3.14+ <stable@vger.kernel.org> # 3.14+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-09-22 14:32:43 +02:00
Lan Tianyu	8e30444e15	cpufreq: fix cpufreq suspend/resume for intel_pstate Cpufreq core introduces cpufreq_suspended flag to let cpufreq sysfs nodes across S2RAM/S2DISK. But the flag is only set in the cpufreq_suspend() for cpufreq drivers which have target or target_index callback. This skips intel_pstate driver. This patch is to set the flag before checking target or target_index callback. Fixes: `2f0aea9363` (cpufreq: suspend governors on system suspend/hibernate) Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Cc: 3.15+ <stable@vger.kernel.org> # 3.15+ [rjw: Subject] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-09-22 14:23:18 +02:00
Viresh Kumar	1bfb425b3b	cpufreq: move policy kobj to update_policy_cpu() We are calling kobject_move() from two separate places currently and both these places share another routine update_policy_cpu() which is handling everything around updating policy->cpu. Moving ownership of policy->kobj also lies under the role of update_policy_cpu() routine and must be handled from there. So, Lets move kobject_move() to update_policy_cpu() and get rid of cpufreq_nominate_new_policy_cpu() as it doesn't have anything significant left. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-07-21 13:43:20 +02:00
Viresh Kumar	41dfd908fc	cpufreq: propagate error returned by kobject_move() We are returning -EINVAL instead of the error returned from kobject_move() when it fails. Propagate the actual error number. Also add a meaningful print when sysfs_create_link() fails. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-07-21 13:43:20 +02:00
Viresh Kumar	1461dc7d1c	cpufreq: don't restore policy->cpus on failure to move kobj While hot-unplugging policy->cpu, we call cpufreq_nominate_new_policy_cpu() to nominate next owner of policy, i.e. policy->cpu. If we fail to move policy kobject under the new policy->cpu, we try to update policy->cpus with the old policy->cpu. This would have been required in case old-CPU is removed from policy->cpus in the first place. But its not done before calling cpufreq_nominate_new_policy_cpu(), but during the POST_DEAD notification which happens quite late in the hot-unplugging path. So, this is just some useless code hanging around, get rid of it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-07-21 13:43:20 +02:00
Viresh Kumar	92c14bd947	cpufreq: move policy kobj to policy->cpu at resume This is only relevant to implementations with multiple clusters, where clusters have separate clock lines but all CPUs within a cluster share it. Consider a dual cluster platform with 2 cores per cluster. During suspend we start hot unplugging CPUs in order 1 to 3. When CPU2 is removed, policy->kobj would be moved to CPU3 and when CPU3 goes down we wouldn't free policy or its kobj as we want to retain permissions/values/etc. Now on resume, we will get CPU2 before CPU3 and will call __cpufreq_add_dev(). We will recover the old policy and update policy->cpu from 3 to 2 from update_policy_cpu(). But the kobj is still tied to CPU3 and isn't moved to CPU2. We wouldn't create a link for CPU2, but would try that for CPU3 while bringing it online. Which will report errors as CPU3 already has kobj assigned to it. This bug got introduced with commit `42f921a`, which overlooked this scenario. To fix this, lets move kobj to the new policy->cpu while bringing first CPU of a cluster back. Also do a WARN_ON() if kobject_move failed, as we would reach here only for the first CPU of a non-boot cluster. And we can't recover from this situation, if kobject_move() fails. Fixes: `42f921a6f1` (cpufreq: remove sysfs files for CPUs which failed to come back after resume) Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Reported-and-tested-by: Bu Yitian <ybu@qti.qualcomm.com> Reported-by: Saravana Kannan <skannan@codeaurora.org> Reviewed-by: Srivatsa S. Bhat <srivatsa@mit.edu> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-07-17 14:23:22 +02:00
Aaron Plattner	fefa8ff810	cpufreq: unlock when failing cpufreq_update_policy() Commit `bd0fa9bb45` introduced a failure path to cpufreq_update_policy() if cpufreq_driver->get(cpu) returns NULL. However, it jumps to the 'no_policy' label, which exits without unlocking any of the locks the function acquired earlier. This causes later calls into cpufreq to hang. Fix this by creating a new 'unlock' label and jumping to that instead. Fixes: `bd0fa9bb45` ("cpufreq: Return error if ->get() failed in cpufreq_update_policy()") Link: https://devtalk.nvidia.com/default/topic/751903/kernel-3-15-and-nv-drivers-337-340-failed-to-initialize-the-nvidia-kernel-module-gtx-550-ti-/ Signed-off-by: Aaron Plattner <aplattner@nvidia.com> Cc: 3.15+ <stable@vger.kernel.org> # 3.15+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-06-18 21:52:20 +02:00
Viresh Kumar	1c03a2d04d	cpufreq: add support for intermediate (stable) frequencies Douglas Anderson, recently pointed out an interesting problem due to which udelay() was expiring earlier than it should. While transitioning between frequencies few platforms may temporarily switch to a stable frequency, waiting for the main PLL to stabilize. For example: When we transition between very low frequencies on exynos, like between 200MHz and 300MHz, we may temporarily switch to a PLL running at 800MHz. No CPUFREQ notification is sent for that. That means there's a period of time when we're running at 800MHz but loops_per_jiffy is calibrated at between 200MHz and 300MHz. And so udelay behaves badly. To get this fixed in a generic way, introduce another set of callbacks get_intermediate() and target_intermediate(), only for drivers with target_index() and CPUFREQ_ASYNC_NOTIFICATION unset. get_intermediate() should return a stable intermediate frequency platform wants to switch to, and target_intermediate() should set CPU to that frequency, before jumping to the frequency corresponding to 'index'. Core will take care of sending notifications and driver doesn't have to handle them in target_intermediate() or target_index(). NOTE: ->target_index() should restore to policy->restore_freq in case of failures as core would send notifications for that. Tested-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-06-05 23:32:29 +02:00
Viresh Kumar	8d65775d17	cpufreq: handle calls to ->target_index() in separate routine Handling calls to ->target_index() has got complex over time and might become more complex. So, its better to take target_index() bits out in another routine __target_index() for better code readability. Shouldn't have any functional impact. Tested-by: Stephen Warren <swarren@nvidia.com> Reviewed-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-05-29 01:27:38 +02:00
Stratos Karafotis	5eeaf1f189	cpufreq: Fix build error on some platforms that use cpufreq_for_each_* On platforms that use cpufreq_for_each_* macros, build fails if CONFIG_CPU_FREQ=n, e.g. ARM/shmobile/koelsch/non-multiplatform: drivers/built-in.o: In function `clk_round_parent': clkdev.c:(.text+0xcf168): undefined reference to `cpufreq_next_valid' drivers/built-in.o: In function `clk_rate_table_find': clkdev.c:(.text+0xcf820): undefined reference to `cpufreq_next_valid' make[3]: *** [vmlinux] Error 1 Fix this making cpufreq_next_valid function inline and move it to cpufreq.h. Fixes: `27e289dce2` (cpufreq: Introduce macros for cpufreq_frequency_table iteration) Reported-and-tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-05-08 13:10:56 +02:00
Srivatsa S. Bhat	ca654dc3a9	cpufreq: Catch double invocations of cpufreq_freq_transition_begin/end Some cpufreq drivers were redundantly invoking the _begin() and _end() APIs around frequency transitions, and this double invocation (one from the cpufreq core and the other from the cpufreq driver) used to result in a self-deadlock, leading to system hangs during boot. (The _begin() API makes contending callers wait until the previous invocation is complete. Hence, the cpufreq driver would end up waiting on itself!). Now all such drivers have been fixed, but debugging this issue was not very straight-forward (even lockdep didn't catch this). So let us add a debug infrastructure to the cpufreq core to catch such issues more easily in the future. We add a new field called 'transition_task' to the policy structure, to keep track of the task which is performing the frequency transition. Using this field, we make note of this task during _begin() and print a warning if we find a case where the same task is calling _begin() again, before completing the previous frequency transition using the corresponding _end(). We have left out ASYNC_NOTIFICATION drivers from this debug infrastructure for 2 reasons: 1. At the moment, we have no way to avoid a particular scenario where this debug infrastructure can emit false-positive warnings for such drivers. The scenario is depicted below: Task A Task B /* 1st freq transition / Invoke _begin() { ... ... } Change the frequency / 2nd freq transition / Invoke _begin() { ... //waiting for B to ... //finish _end() for ... //the 1st transition ... \| Got interrupt for successful ... \| change of frequency (1st one). ... \| ... \| / 1st freq transition */ ... \| Invoke _end() { ... \| ... ... V } ... ... } This scenario is actually deadlock-free because, once Task A changes the frequency, it is Task B's responsibility to invoke the corresponding _end() for the 1st frequency transition. Hence it is perfectly legal for Task A to go ahead and attempt another frequency transition in the meantime. (Of course it won't be able to proceed until Task B finishes the 1st _end(), but this doesn't cause a deadlock or a hang). The debug infrastructure cannot handle this scenario and will treat it as a deadlock and print a warning. To avoid this, we exclude such drivers from the purview of this code. 2. Luckily, we don't _need_ this infrastructure for ASYNC_NOTIFICATION drivers at all! The cpufreq core does not automatically invoke the _begin() and _end() APIs during frequency transitions in such drivers. Thus, the driver alone is responsible for invoking _begin()/_end() and hence there shouldn't be any conflicts which lead to double invocations. So, we can skip these drivers, since the probability that such drivers will hit this problem is extremely low, as outlined above. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-05-07 00:32:39 +02:00
Stratos Karafotis	27e289dce2	cpufreq: Introduce macros for cpufreq_frequency_table iteration Many cpufreq drivers need to iterate over the cpufreq_frequency_table for various tasks. This patch introduces two macros which can be used for iteration over cpufreq_frequency_table keeping a common coding style across drivers: - cpufreq_for_each_entry: iterate over each entry of the table - cpufreq_for_each_valid_entry: iterate over each entry that contains a valid frequency. It should have no functional changes. Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-04-30 00:05:31 +02:00
Viresh Kumar	236a980052	cpufreq: Make cpufreq_notify_transition & cpufreq_notify_post_transition static cpufreq_notify_transition() and cpufreq_notify_post_transition() shouldn't be called directly by cpufreq drivers anymore and so these should be marked static. Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-26 16:41:41 +01:00
Viresh Kumar	8fec051eea	cpufreq: Convert existing drivers to use cpufreq_freq_transition_{begin\|end} CPUFreq core has new infrastructure that would guarantee serialized calls to target() or target_index() callbacks. These are called cpufreq_freq_transition_begin() and cpufreq_freq_transition_end(). This patch converts existing drivers to use these new set of routines. Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-26 16:41:41 +01:00
Srivatsa S. Bhat	12478cf0c5	cpufreq: Make sure frequency transitions are serialized Whenever we change the frequency of a CPU, we call the PRECHANGE and POSTCHANGE notifiers. They must be serialized, i.e. PRECHANGE and POSTCHANGE notifiers should strictly alternate, thereby preventing two different sets of PRECHANGE or POSTCHANGE notifiers from interleaving arbitrarily. The following examples illustrate why this is important: Scenario 1: ----------- A thread reading the value of cpuinfo_cur_freq, will call __cpufreq_cpu_get()->cpufreq_out_of_sync()->cpufreq_notify_transition() The ondemand governor can decide to change the frequency of the CPU at the same time and hence it can end up sending the notifications via ->target(). If the notifiers are not serialized, the following sequence can occur: - PRECHANGE Notification for freq A (from cpuinfo_cur_freq) - PRECHANGE Notification for freq B (from target()) - Freq changed by target() to B - POSTCHANGE Notification for freq B - POSTCHANGE Notification for freq A We can see from the above that the last POSTCHANGE Notification happens for freq A but the hardware is set to run at freq B. Where would we break then?: adjust_jiffies() in cpufreq.c & cpufreq_callback() in arch/arm/kernel/smp.c (which also adjusts the jiffies). All the loops_per_jiffy calculations will get messed up. Scenario 2: ----------- The governor calls __cpufreq_driver_target() to change the frequency. At the same time, if we change scaling_{min\|max}_freq from sysfs, it will end up calling the governor's CPUFREQ_GOV_LIMITS notification, which will also call __cpufreq_driver_target(). And hence we end up issuing concurrent calls to ->target(). Typically, platforms have the following logic in their ->target() routines: (Eg: cpufreq-cpu0, omap, exynos, etc) A. If new freq is more than old: Increase voltage B. Change freq C. If new freq is less than old: decrease voltage Now, if the two concurrent calls to ->target() are X and Y, where X is trying to increase the freq and Y is trying to decrease it, we get the following race condition: X.A: voltage gets increased for larger freq Y.A: nothing happens Y.B: freq gets decreased Y.C: voltage gets decreased X.B: freq gets increased X.C: nothing happens Thus we can end up setting a freq which is not supported by the voltage we have set. That will probably make the clock to the CPU unstable and the system might not work properly anymore. This patch introduces a set of synchronization primitives to serialize frequency transitions, which are to be used as shown below: cpufreq_freq_transition_begin(); //Perform the frequency change cpufreq_freq_transition_end(); The _begin() call sends the PRECHANGE notification whereas the _end() call sends the POSTCHANGE notification. Also, all the necessary synchronization is handled within these calls. In particular, even drivers which set the ASYNC_NOTIFICATION flag can also use these APIs for performing frequency transitions (ie., you can call _begin() from one task, and call the corresponding _end() from a different task). The actual synchronization underneath is not that complicated: The key challenge is to allow drivers to begin the transition from one thread and end it in a completely different thread (this is to enable drivers that do asynchronous POSTCHANGE notification from bottom-halves, to also use the same interface). To achieve this, a 'transition_ongoing' flag, a 'transition_lock' spinlock and a wait-queue are added per-policy. The flag and the wait-queue are used in conjunction to create an "uninterrupted flow" from _begin() to _end(). The spinlock is used to ensure that only one such "flow" is in flight at any given time. Put together, this provides us all the necessary synchronization. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-26 16:41:40 +01:00
Viresh Kumar	0c5aa405a9	cpufreq: resume drivers before enabling governors During suspend, we first stop governors and then suspend cpufreq drivers and resume must be exactly opposite of that. i.e. resume drivers first and then start governors. But the current code in resume enables governors first and then resume drivers. Fix it be changing code sequence there. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-26 16:37:18 +01:00
Dirk Brandewie	367dc4aa93	cpufreq: Add stop CPU callback to cpufreq_driver interface This callback allows the driver to do clean up before the CPU is completely down and its state cannot be modified. This is used by the intel_pstate driver to reduce the requested P state prior to the core going away. This is required because the requested P state of the offline core is used to select the package P state. This effectively sets the floor package P state to the requested P state on the offline core. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> [rjw: Minor modifications] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 03:50:12 +01:00
Stratos Karafotis	bda9f552f9	cpufreq: Remove unnecessary braces Remove unnecessary braces from a single statement. Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 03:40:48 +01:00
Stratos Karafotis	e5c87b7628	cpufreq: Fix checkpatch errors and warnings Fix 2 checkpatch errors about using assignment in if condition, 1 checkpatch error about a required space after comma and 3 warnings about line over 80 characters. Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 03:39:28 +01:00
Viresh Kumar	0b443ead71	cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE\|RESUMECHANGE} Two cpufreq notifiers CPUFREQ_RESUMECHANGE and CPUFREQ_SUSPENDCHANGE have not been used for some time, so remove them to clean up code a bit. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-19 14:10:24 +01:00
Rafael J. Wysocki	9832235f3f	cpufreq: Do not allow ->setpolicy drivers to provide ->target cpufreq drivers that provide the ->setpolicy() callback are supposed to have integrated governors, so they don't need to set ->target() or ->target_index() and may confuse the core if any of these callbacks is present. For this reason, add a check preventing ->setpolicy cpufreq drivers from registering if they have non-NULL ->target or ->target_index. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2014-03-19 12:48:30 +01:00
Rafael J. Wysocki	15afee3aea	Merge back earlier 'pm-cpufreq' material.	2014-03-17 13:51:39 +01:00
Rafael J. Wysocki	2ed99e39cb	cpufreq: Skip current frequency initialization for ->setpolicy drivers After commit `da60ce9f2f` (cpufreq: call cpufreq_driver->get() after calling ->init()) __cpufreq_add_dev() sometimes fails for CPUs handled by intel_pstate, because that driver may return 0 from its ->get() callback if it has not run long enough to collect enough samples on the given CPU. That didn't happen before commit `da60ce9f2f` which added policy->cur initialization to __cpufreq_add_dev() to help reduce code duplication in other cpufreq drivers. However, the code added by commit `da60ce9f2f` need not be executed for cpufreq drivers having the ->setpolicy callback defined, because the subsequent invocation of cpufreq_set_policy() will use that callback to initialize the policy anyway and it doesn't need policy->cur to be initialized upfront. The analogous code in cpufreq_update_policy() is also unnecessary for cpufreq drivers having ->setpolicy set and may be skipped for them as well. Since intel_pstate provides ->setpolicy, skipping the upfront policy->cur initialization for cpufreq drivers with that callback set will cover intel_pstate and the problem it's been having after commit `da60ce9f2f` will be addressed. Fixes: `da60ce9f2f` (cpufreq: call cpufreq_driver->get() after calling ->init()) References: https://bugzilla.kernel.org/show_bug.cgi?id=71931 Reported-and-tested-by: Patrik Lundquist <patrik.lundquist@gmail.com> Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-13 00:37:16 +01:00
Viresh Kumar	96bbbe4a2a	cpufreq: Remove unnecessary variable/parameter 'frozen' We have used 'frozen' variable/function parameter at many places to distinguish between CPU offline/online on suspend/resume vs sysfs removals. We now have another variable cpufreq_suspended which can be used in these cases, so we can get rid of all those variables or function parameters. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-12 01:06:01 +01:00
Viresh Kumar	e0b3165ba5	cpufreq: add 'freq_table' in struct cpufreq_policy freq table is not per CPU but per policy, so it makes more sense to keep it within struct cpufreq_policy instead of a per-cpu variable. This patch does it. Over that, there is no need to set policy->freq_table to NULL in ->exit(), as policy structure is going to be freed soon. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-12 01:06:00 +01:00
Joe Perches	e837f9b58b	cpufreq: Reformat printk() statements - Add missing newlines - Coalesce format fragments - Convert printks to pr_<level> - Align arguments Based-on-patch-by: Sören Brinkmann <soren.brinkmann@xilinx.com> Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-12 00:49:22 +01:00

1 2 3 4 5 ...

444 Commits