linux

Commit Graph

Author	SHA1	Message	Date
Shailendra Verma	58405af632	cpufreq: Fix for typos in two comments Signed-off-by: Shailendra Verma <shailendra.capricorn@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-22 23:59:44 +02:00
Viresh Kumar	18bf3a124e	cpufreq: Mark policy->governor = NULL for inactive policies Later commits would change the way policies are managed today. Policies wouldn't be freed on cpu hotplug (currently they aren't freed on suspend), and while the CPU is offline, the sysfs cpufreq files would still be present. Because we don't mark policy->governor as NULL, it still contains pointer of the last used governor. And if the governor is removed, while all the CPUs of a policy are hotplugged out, this pointer wouldn't be valid anymore. And if we try to read the 'scaling_governor', etc. from sysfs, it will result in kernel OOPs. To prevent this, mark policy->governor as NULL for all inactive policies while the governor is removed from kernel. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-15 02:46:45 +02:00
Viresh Kumar	4573237b01	cpufreq: Manage governor usage history with 'policy->last_governor' History of which governor was used last is common to all CPUs within a policy and maintaining it per-cpu isn't the best approach for sure. Apart from wasting memory, this also increases the complexity of managing this data structure as it has to be updated for all CPUs. To make that somewhat simpler, lets store this information in a new field 'last_governor' in struct cpufreq_policy and update it on removal of last cpu of a policy. As a side-effect it also solves an old problem, consider a system with two clusters 0 & 1. And there is one policy per cluster. Cluster 0: CPU0 and 1. Cluster 1: CPU2 and 3. - CPU2 is first brought online, and governor is set to performance (default as cpufreq_cpu_governor wasn't set). - Governor is changed to ondemand. - CPU2 is taken offline and cpufreq_cpu_governor is updated for CPU2. - CPU3 is brought online. - Because cpufreq_cpu_governor wasn't set for CPU3, the default governor performance is picked for CPU3. This patch fixes the bug as we now have a single variable to update for policy. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-15 02:44:17 +02:00
Viresh Kumar	9104bb26c7	cpufreq: Don't traverse all active policies to find policy for a cpu We reach here while adding policy for a CPU and enter into the 'if' block only if a policy already exists for the CPU. As cpufreq_cpu_data is set for all policy->related_cpus now, when the policy is first added, we can use that to find the CPU's policy instead of traversing the list of all active policies. Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-15 02:38:18 +02:00
Viresh Kumar	3914d37910	cpufreq: Get rid of cpufreq_cpu_data_fallback We can extract the same information from cpufreq_cpu_data as it is also available for inactive policies now. And so don't need cpufreq_cpu_data_fallback anymore. Also add a WARN_ON() for the case where we try to restore from an active policy. Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-15 02:35:57 +02:00
Viresh Kumar	988bed09d3	cpufreq: Don't clear cpufreq_cpu_data and policy list for inactive policies Now that we can check policy->cpus to find if policy is active or not, we don't need to clean cpufreq_cpu_data and delete policy from the list on light weight tear down of policies (like in suspend). To make it consistent and clean, set cpufreq_cpu_data for all related CPUs when the policy is first created and clean it only while it is freed. Also update cpufreq_cpu_get_raw() to check if cpu is part of policy->cpus mask, so that we don't end up getting policies for offline CPUs. In order to make sure that no users of 'policy' are using an inactive policy, use cpufreq_cpu_get_raw() instead of directly accessing cpufreq_cpu_data. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-15 02:32:46 +02:00
Viresh Kumar	f963735a3c	cpufreq: Create for_each_{in}active_policy() policy->cpus is cleared unconditionally now on hotplug-out of a CPU and it can be checked to know if a policy is active or not. Create helper routines to iterate over all active/inactive policies, based on policy->cpus field. Replace all instances of for_each_policy() with for_each_active_policy() to make them iterate only for active policies. (We haven't made changes yet to keep inactive policies in the same list, but that will be followed in a later patch). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-15 02:26:07 +02:00
Viresh Kumar	303ae72307	cpufreq: Clear policy->cpus even for the last CPU We clear policy->cpus mask while CPUs are hotplugged out. We do it for all CPUs except the last CPU of the policy. I don't remember what the rationale behind that was, but I couldn't think of anything that will break if we remove this conditional clearing and always clear policy->cpus. The benefit we get out of it is, we can know if a policy is active or not by checking if this field is empty or not. That will be used by later commits. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-07 23:38:35 +02:00
Viresh Kumar	bb29ae152e	cpufreq: Keep a single path for adding managed CPUs There are two cases when we may try to add CPUs we're already handling: - On boot, the first cpu has marked all policy->cpus managed and so we will find policy for all other policy->cpus later on. - When a managed cpu is hotplugged out and later brought back in. Currently, separate paths and checks take care of the two. While the first one is detected by testing cpu against 'policy->cpus', the other one is detected by testing cpu against 'policy->related_cpus'. We can handle them both via a single path and there is no need to do special checking for the first one. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> [ rjw: Changelog, comments ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-07 23:36:41 +02:00
Viresh Kumar	1b947c904c	cpufreq: Throw warning when we try to get policy for an invalid CPU Simply returning here with an error is not enough. It shouldn't be allowed at all to try calling cpufreq_cpu_get() for an invalid CPU. Add a WARN here to make it clear that it wouldn't be acceptable at all. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-07 23:29:57 +02:00
Viresh Kumar	23faf0b743	cpufreq: Merge __cpufreq_add_dev() and cpufreq_add_dev() cpufreq_add_dev() is an unnecessary wrapper over __cpufreq_add_dev(). Merge them. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-07 23:28:28 +02:00
Viresh Kumar	50e9c85213	cpufreq: Add doc style comment about cpufreq_cpu_{get\|put}() This clearly states what the code inside these routines is doing and how these must be used. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-05-07 23:27:20 +02:00
Viresh Kumar	c75de0ac07	cpufreq: Schedule work for the first-online CPU on resume All CPUs leaving the first-online CPU are hotplugged out on suspend and and cpufreq core stops managing them. On resume, we need to call cpufreq_update_policy() for this CPU's policy to make sure its frequency is in sync with cpufreq's cached value, as it might have got updated by hardware during suspend/resume. The policies are always added to the top of the policy-list. So, in normal circumstances, CPU 0's policy will be the last one in the list. And so the code checks for the last policy. But there are cases where it will fail. Consider quad-core system, with policy-per core. If CPU0 is hotplugged out and added back again, the last policy will be on CPU1 :( To fix this in a proper way, always look for the policy of the first online CPU. That way we will be sure that we are calling cpufreq_update_policy() for the only CPU that wasn't hotplugged out. Cc: 3.15+ <stable@vger.kernel.org> # 3.15+ Fixes: `2f0aea9363` ("cpufreq: suspend governors on system suspend/hibernate") Reported-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-04-03 12:59:47 +02:00
Viresh Kumar	f7b2706117	cpufreq: Create for_each_governor() To make code more readable and less error prone, lets create a helper macro for iterating over all available governors. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-02-03 23:28:01 +01:00
Viresh Kumar	b4f0676fe2	cpufreq: Create for_each_policy() To make code more readable and less error prone, lets create a helper macro for iterating over all active policies. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-02-03 23:27:45 +01:00
Viresh Kumar	1e63eaf0c4	cpufreq: Drop cpufreq_disabled() check from cpufreq_cpu_{get\|put}() When cpufreq is disabled, the per-cpu variable would have been set to NULL. Remove this unnecessary check. [ Changelog from Saravana Kannan. ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-02-03 23:26:02 +01:00
Viresh Kumar	6ffae8c06f	cpufreq: Set cpufreq_cpu_data to NULL before putting kobject In __cpufreq_remove_dev_finish(), per-cpu 'cpufreq_cpu_data' needs to be cleared before calling kobject_put(&policy->kobj) and under cpufreq_driver_lock. Otherwise, if someone else calls cpufreq_cpu_get() in parallel with it, they can obtain a non-NULL policy from that after kobject_put(&policy->kobj) was executed. Consider this case: Thread A Thread B cpufreq_cpu_get() acquire cpufreq_driver_lock read-per-cpu cpufreq_cpu_data kobject_put(&policy->kobj); kobject_get(&policy->kobj); ... per_cpu(&cpufreq_cpu_data, cpu) = NULL And this will result in a warning like this one: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 4 at include/linux/kref.h:47 kobject_get+0x41/0x50() Modules linked in: acpi_cpufreq(+) nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c sd_mod ixgbe igb mdio ahci hwmon ... Call Trace: [<ffffffff81661b14>] dump_stack+0x46/0x58 [<ffffffff81072b61>] warn_slowpath_common+0x81/0xa0 [<ffffffff81072c7a>] warn_slowpath_null+0x1a/0x20 [<ffffffff812e16d1>] kobject_get+0x41/0x50 [<ffffffff815262a5>] cpufreq_cpu_get+0x75/0xc0 [<ffffffff81527c3e>] cpufreq_update_policy+0x2e/0x1f0 [<ffffffff810b8cb2>] ? up+0x32/0x50 [<ffffffff81381aa9>] ? acpi_ns_get_node+0xcb/0xf2 [<ffffffff81381efd>] ? acpi_evaluate_object+0x22c/0x252 [<ffffffff813824f6>] ? acpi_get_handle+0x95/0xc0 [<ffffffff81360967>] ? acpi_has_method+0x25/0x40 [<ffffffff81391e08>] acpi_processor_ppc_has_changed+0x77/0x82 [<ffffffff81089566>] ? move_linked_works+0x66/0x90 [<ffffffff8138e8ed>] acpi_processor_notify+0x58/0xe7 [<ffffffff8137410c>] acpi_ev_notify_dispatch+0x44/0x5c [<ffffffff8135f293>] acpi_os_execute_deferred+0x15/0x22 [<ffffffff8108c910>] process_one_work+0x160/0x410 [<ffffffff8108d05b>] worker_thread+0x11b/0x520 [<ffffffff8108cf40>] ? rescuer_thread+0x380/0x380 [<ffffffff81092421>] kthread+0xe1/0x100 [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0 [<ffffffff81669ebc>] ret_from_fork+0x7c/0xb0 [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0 ---[ end trace 89e66eb9795efdf7 ]--- The actual code flow is as follows: Thread A: Workqueue: kacpi_notify acpi_processor_notify() acpi_processor_ppc_has_changed() cpufreq_update_policy() cpufreq_cpu_get() kobject_get() Thread B: xenbus_thread() xenbus_thread() msg->u.watch.handle->callback() handle_vcpu_hotplug_event() vcpu_hotplug() cpu_down() __cpu_notify(CPU_POST_DEAD..) cpufreq_cpu_callback() __cpufreq_remove_dev_finish() cpufreq_policy_put_kobj() kobject_put() cpufreq_cpu_get() gets the policy from per-cpu variable cpufreq_cpu_data under cpufreq_driver_lock, and once it gets a valid policy it expects it to not be freed until cpufreq_cpu_put() is called. But the race happens when another thread puts the kobject first and updates cpufreq_cpu_data before or later. And so the first thread gets a valid policy structure and before it does kobject_get() on it, the second one has already done kobject_put(). Fix this by setting cpufreq_cpu_data to NULL before putting the kobject and that too under locks. Reported-by: Ethan Zhao <ethan.zhao@oracle.com> Reported-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 3.12+ <stable@vger.kernel.org> # 3.12+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-02-03 00:59:29 +01:00
Viresh Kumar	d9f354460d	cpufreq: remove CPUFREQ_UPDATE_POLICY_CPU notifications CPUFREQ_UPDATE_POLICY_CPU notifications were used only from cpufreq-stats which doesn't use it anymore. Remove them. This also decrements values of other notification macros defined after CPUFREQ_UPDATE_POLICY_CPU by 1 to remove gaps. Hopefully all users are using macro's instead of direct numbers and so they wouldn't break as macro values are changed now. Reviewed-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 23:06:44 +01:00
Viresh Kumar	7c418ff099	cpufreq: Remove (now) unused 'last_cpu' from struct cpufreq_policy 'last_cpu' was used only from cpufreq-stats and isn't used anymore. Get rid of it. Reviewed-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 23:06:44 +01:00
Viresh Kumar	818c57126e	cpufreq: move some initialization stuff to cpufreq_policy_alloc() We need to initialize completion and work only on policy allocation and not really on the policy restore side and so we better move this piece of code to cpufreq_policy_alloc(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:35 +01:00
Viresh Kumar	ce1bcfe94d	cpufreq: check cpufreq_policy_list instead of scanning policies for all CPUs CPUFREQ_STICKY flag is set by drivers which don't want to get unregistered even if cpufreq-core isn't able to initialize policy for any CPU. When this flag isn't set, we try to unregister the driver. To find out which CPUs are registered and which are not, we try to check per_cpu cpufreq_cpu_data for all CPUs. Because we have a list of valid policies available now, we better check if the list is empty or not instead of the 'for' loop. That will be much more efficient. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:35 +01:00
Viresh Kumar	39c132eebd	cpufreq: limit the scope of l_p_j variables These variables are just used within adjust_jiffies() and so must be local to it. Also there is no need of a dummy routine for CONFIG_SMP case as we can take care of all that with help of macros in the same routine. It doesn't look that ugly. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	d7a9771c1a	cpufreq: use light-weight cpufreq_cpu_get_raw() in __cpufreq_add_dev() We just need to check if a 'policy' is already present for the cpu we are adding. We don't need to take all the locks and do kobject usage updates. Use the light-weight cpufreq_cpu_get_raw() routine instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	7f0c020ab6	cpufreq: get rid of 'tpolicy' from __cpufreq_add_dev() There is no need of this separate variable, use 'policy' instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	22a7cfb014	cpufreq: get rid of CONFIG_{HOTPLUG_CPU\|SMP} mess These are messing up more than the benefit they provide. It isn't a lot of code anyway, that we will compile without them. Kill them. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	bc68b7dfda	cpufreq: update driver_data->flags only if we are registering driver We should first check if a cpufreq driver is already registered or not before updating driver_data->flags. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	d92d50a462	cpufreq: pass policy to __cpufreq_get() There is no point finding out the 'policy' again within __cpufreq_get() when all the callers already have it. Just make them pass policy instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	a1e1dc41c4	cpufreq: pass policy to cpufreq_out_of_sync There is no point finding out the 'policy' again within cpufreq_out_of_sync() when all the callers already have it. Just make them pass policy instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	2e1cc3a5d7	cpufreq: No need to check for has_target() Either we can be setpolicy or target type, nothing else. And so the else part of setpolicy will automatically be of has_target() type. And so we don't need to check it again. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:34 +01:00
Viresh Kumar	42f91fa116	cpufreq: s/__find_governor/find_governor Remove unnecessary from find_governor's name. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:33 +01:00
Viresh Kumar	db5f299574	cpufreq: merge 'if' blocks in __cpufreq_remove_dev_prepare() There are two 'if' blocks here, checking for !cpufreq_driver->setpolicy and has_target(). Both are actually doing the same thing, merge them. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:33 +01:00
Viresh Kumar	09347b2905	cpufreq: don't need line break in show_scaling_cur_freq() No need of an unnecessary line break. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:33 +01:00
Viresh Kumar	f13f1184a1	cpufreq: remove extra parenthesis We can live without it and so we should. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:33 +01:00
Viresh Kumar	e673f1639b	cpufreq: remove dangling comment It doesn't make any sense at all and is a leftover of some earlier commit. Remove it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:49:33 +01:00
Doug Anderson	90de2a4aa9	cpufreq: suspend cpufreq governors on shutdown We should stop cpufreq governors when we shut down the system. If we don't do this, we can end up with this deadlock: 1. cpufreq governor may be running on a CPU other than CPU0. 2. In machine_restart() we call smp_send_stop() which stops CPUs. If one of these CPUs was actively running a cpufreq governor then it may have the mutex / spinlock needed to access the main PMIC in the system (perhaps over I2C) 3. If a machine needs access to the main PMIC in order to shutdown then it will never get it since the mutex was lost when the other CPU stopped. 4. We'll hang (possibly eventually hitting the hard lockup detector). Let's avoid the problem by stopping the cpufreq governor at shutdown, which is a sensible thing to do anyway. Signed-off-by: Doug Anderson <dianders@chromium.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-23 22:20:30 +01:00
Ethan Zhao	cb57720bf7	cpufreq: fix a NULL pointer dereference in __cpufreq_governor() If ACPI _PPC changed notification happens before governor was initiated while kernel is booting, a NULL pointer dereference will be triggered: BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 IP: [<ffffffff81470453>] __cpufreq_governor+0x23/0x1e0 PGD 0 Oops: 0000 [#1] SMP ... ... RIP: 0010:[<ffffffff81470453>] [<ffffffff81470453>] __cpufreq_governor+0x23/0x1e0 RSP: 0018:ffff881fcfbcfbb8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff881fd11b3980 RCX: ffff88407fc20000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff881fd11b3980 RBP: ffff881fcfbcfbd8 R08: 0000000000000000 R09: 000000000000000f R10: ffffffff818068d0 R11: 0000000000000043 R12: 0000000000000004 R13: 0000000000000000 R14: ffffffff8196cae0 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff881fffc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000030 CR3: 00000000018ae000 CR4: 00000000000407f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kworker/0:3 (pid: 750, threadinfo ffff881fcfbce000, task ffff881fcf556400) Stack: ffff881fffc17d00 ffff881fcfbcfc18 ffff881fd11b3980 0000000000000000 ffff881fcfbcfc08 ffffffff81470d08 ffff881fd11b3980 0000000000000007 ffff881fcfbcfc18 ffff881fffc17d00 ffff881fcfbcfd28 ffffffff81472e9a Call Trace: [<ffffffff81470d08>] __cpufreq_set_policy+0x1b8/0x2e0 [<ffffffff81472e9a>] cpufreq_update_policy+0xca/0x150 [<ffffffff81472f20>] ? cpufreq_update_policy+0x150/0x150 [<ffffffff81324a96>] acpi_processor_ppc_has_changed+0x71/0x7b [<ffffffff81320bcd>] acpi_processor_notify+0x55/0x115 [<ffffffff812f9c29>] acpi_device_notify+0x19/0x1b [<ffffffff813084ca>] acpi_ev_notify_dispatch+0x41/0x5f [<ffffffff812f64a4>] acpi_os_execute_deferred+0x27/0x34 The root cause is a race conditon -- cpufreq core and acpi-cpufreq driver were initiated, but cpufreq_governor wasn't and _PPC changed notification happened, __cpufreq_governor() was called within acpi_os_execute_deferred kernel thread context. To fix this panic issue, add pointer checking code in __cpufreq_governor() before pointer policy->governor is to be dereferenced. Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-12-19 22:49:07 +01:00
Viresh Kumar	7c45cf31b3	cpufreq: Introduce ->ready() callback for cpufreq drivers Currently there is no callback for cpufreq drivers which is called once the policy is ready to be used. There are some requirements where such a callback is required. One of them is registering a cooling device with the help of of_cpufreq_cooling_register(). This routine tries to get 'struct cpufreq_policy' for CPUs which isn't yet initialed at the time ->init() is called and so we face issues while registering the cooling device. Because we can't register cooling device from ->init(), we need a callback that is called after the policy is ready to be used and hence we introduce ->ready() callback. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Eduardo Valentin <edubezval@gmail.com> Tested-by: Eduardo Valentin <edubezval@gmail.com> Reviewed-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-11-29 23:38:38 +01:00
Tomeu Vizoso	6d4e81ed89	cpufreq: Ref the policy object sooner Do it before it's assigned to cpufreq_cpu_data, otherwise when a driver tries to get the cpu frequency during initialization the policy kobj is referenced and we get this warning: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 64 at include/linux/kref.h:47 kobject_get+0x64/0x70() Modules linked in: CPU: 1 PID: 64 Comm: irq/77-tegra-ac Not tainted 3.18.0-rc4-next-20141114ccu-00050-g3eff942 #326 [<c0016fac>] (unwind_backtrace) from [<c001272c>] (show_stack+0x10/0x14) [<c001272c>] (show_stack) from [<c06085d8>] (dump_stack+0x98/0xd8) [<c06085d8>] (dump_stack) from [<c002892c>] (warn_slowpath_common+0x84/0xb4) [<c002892c>] (warn_slowpath_common) from [<c00289f8>] (warn_slowpath_null+0x1c/0x24) [<c00289f8>] (warn_slowpath_null) from [<c0220290>] (kobject_get+0x64/0x70) [<c0220290>] (kobject_get) from [<c03e944c>] (cpufreq_cpu_get+0x88/0xc8) [<c03e944c>] (cpufreq_cpu_get) from [<c03e9500>] (cpufreq_get+0xc/0x64) [<c03e9500>] (cpufreq_get) from [<c0285288>] (actmon_thread_isr+0x134/0x198) [<c0285288>] (actmon_thread_isr) from [<c0069008>] (irq_thread_fn+0x1c/0x40) [<c0069008>] (irq_thread_fn) from [<c0069324>] (irq_thread+0x134/0x174) [<c0069324>] (irq_thread) from [<c0040290>] (kthread+0xdc/0xf4) [<c0040290>] (kthread) from [<c000f4b8>] (ret_from_fork+0x14/0x3c) ---[ end trace b7bd64a81b340c59 ]--- Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-11-25 22:44:17 +01:00
Rafael J. Wysocki	bd2a0f6754	Merge back cpufreq material for 3.19-rc1.	2014-11-18 01:22:29 +01:00
Vince Hsu	619c144c84	cpufreq: respect the min/max settings from user space When the user space tries to set scaling_(max\|min)_freq through sysfs, the cpufreq_set_policy() asks other driver's opinions for the max/min frequencies. Some device drivers, like Tegra CPU EDP which is not upstreamed yet though, may constrain the CPU maximum frequency dynamically because of board design. So if the user space access happens and some driver is capping the cpu frequency at the same time, the user_policy->(max\|min) is overridden by the capped value, and that's not expected by the user space. And if the user space is not invoked again, the CPU will always be capped by the user_policy->(max\|min) even no drivers limit the CPU frequency any more. This patch preserves the user specified min/max settings, so that every time the cpufreq policy is updated, the new max/min can be re-evaluated correctly based on the user's expection and the present device drivers' status. Signed-off-by: Vince Hsu <vinceh@nvidia.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-11-11 23:04:28 +01:00
Geert Uytterhoeven	09712f557b	cpufreq: Avoid crash in resume on SMP without OPP When resuming from s2ram on an SMP system without cpufreq operating points (e.g. there's no "operating-points" property for the CPU node in DT, or the platform doesn't use DT yet), the kernel crashes when bringing CPU 1 online: Enabling non-boot CPUs ... CPU1: Booted secondary processor Unable to handle kernel NULL pointer dereference at virtual address 0000003c pgd = ee5e6b00 [0000003c] pgd=6e579003, pmd=6e588003, *pte=00000000 Internal error: Oops: a07 [#1] SMP ARM Modules linked in: CPU: 0 PID: 1246 Comm: s2ram Tainted: G W 3.18.0-rc3-koelsch-01614-g0377af242bb175c8-dirty #589 task: eeec5240 ti: ee704000 task.ti: ee704000 PC is at __cpufreq_add_dev.isra.24+0x24c/0x77c LR is at __cpufreq_add_dev.isra.24+0x244/0x77c pc : [<c0298efc>] lr : [<c0298ef4>] psr: 60000153 sp : ee705d48 ip : ee705d48 fp : ee705d84 r10: c04e0450 r9 : 00000000 r8 : 00000001 r7 : c05426a8 r6 : 00000001 r5 : 00000001 r4 : 00000000 r3 : 00000000 r2 : 00000000 r1 : 20000153 r0 : c0542734 Verify that policy is not NULL before dereferencing it to fix this. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Fixes: `8414809c6a` (cpufreq: Preserve policy structure across suspend/resume) Cc: 3.12+ <stable@vger.kernel.org> # 3.12+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-11-08 02:10:04 +01:00
Dirk Brandewie	c034b02e21	cpufreq: expose scaling_cur_freq sysfs file for set_policy() drivers Currently the core does not expose scaling_cur_freq for set_policy() drivers this breaks some userspace monitoring tools. Change the core to expose this file for all drivers and if the set_policy() driver supports the get() callback use it to retrieve the current frequency. Link: https://bugzilla.kernel.org/show_bug.cgi?id=73741 Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-10-23 22:59:59 +02:00
Thomas Petazzoni	51315cdfa0	cpufreq: allow driver-specific data This commit extends the cpufreq_driver structure with an additional 'void *driver_data' field that can be filled by the ->probe() function of a cpufreq driver to pass additional custom information to the driver itself. A new function called cpufreq_get_driver_data() is added to allow a cpufreq driver to retrieve those driver data, since they are typically needed from a cpufreq_policy->init() callback, which does not have access to the cpufreq_driver structure. This function call is similar to the existing cpufreq_get_current_driver() function call. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-10-21 00:51:01 +02:00
Rafael J. Wysocki	6f1293ff74	Merge back cpufreq material for v3.18.	2014-10-03 15:41:16 +02:00
Viresh Kumar	b1b12babe3	cpufreq: update 'cpufreq_suspended' after stopping governors Commit `8e30444e15` ("cpufreq: fix cpufreq suspend/resume for intel_pstate") introduced a bug where the governors wouldn't be stopped anymore for ->target{_index}() drivers during suspend. This happens because 'cpufreq_suspended' is updated before stopping the governors during suspend and due to this __cpufreq_governor() would return early due to this check: /* Don't start any governor operations if we are entering suspend */ if (cpufreq_suspended) return 0; Fixes: `8e30444e15` ("cpufreq: fix cpufreq suspend/resume for intel_pstate") Cc: 3.15+ <stable@vger.kernel.org> # 3.15+: `8e30444e15` "cpufreq: fix cpufreq suspend/resume for intel_pstate" Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-09-30 21:02:34 +02:00
Rasmus Villemoes	7c4f453970	cpufreq: Replace strnicmp with strncasecmp The kernel used to contain two functions for length-delimited, case-insensitive string comparison, strnicmp with correct semantics and a slightly buggy strncasecmp. The latter is the POSIX name, so strnicmp was renamed to strncasecmp, and strnicmp made into a wrapper for the new strncasecmp to avoid breaking existing users. To allow the compat wrapper strnicmp to be removed at some point in the future, and to avoid the extra indirection cost, do s/strnicmp/strncasecmp/g. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-09-29 15:53:04 +02:00
Preeti U Murthy	789ca24374	cpufreq: Allow stop CPU callback to be used by all cpufreq drivers Commit `367dc4aa93` ("cpufreq: Add stop CPU callback to cpufreq_driver interface") introduced the stop CPU callback for intel_pstate drivers. During the CPU_DOWN_PREPARE stage, this callback is invoked so that drivers can take some action on the pstate of the cpu before it is taken offline. This callback was assumed to be useful only for those drivers which have implemented the set_policy CPU callback because they have no other way to take action about the cpufreq of a CPU which is being hotplugged out except in the exit callback which is called very late in the offline process. The drivers which implement the target/target_index callbacks were expected to take care of requirements like the ones that commit `367dc4aa` addresses in the GOV_STOP notification event. But there are disadvantages to restricting the usage of stop CPU callback to cpufreq drivers that implement the set_policy callbacks and who want to take explicit action on the setting the cpufreq during a hotplug operation. 1.GOV_STOP gets called for every CPU offline and drivers would usually want to take action when the last cpu in the policy->cpus mask is taken offline. As long as there is more than one cpu in the policy->cpus mask, cpufreq core itself makes sure that the freq for the other cpus in this mask is set according to the maximum load. This is sensible and drivers which implement the target_index callback would mostly not want to modify that. However the cpufreq core leaves a loose end when the cpu in the policy->cpus mask is the last one to go offline; it does nothing explicit to the frequency of the core. Drivers may need a way to take some action here and stop CPU callback mechanism is the best way to do it today. 2. We cannot implement driver specific actions in the GOV_STOP mechanism. So we will need another driver callback which is invoked from here which is unnecessary. Therefore this patch extends the usage of stop CPU callback to be used by all cpufreq drivers as long as they have this callback implemented and irrespective of whether they are set_policy/target_index drivers. The assumption is if the drivers find the GOV_STOP path to be a suitable way of implementing what they want to do with the freq of the cpu going offine,they will not implement the stop CPU callback at all. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-09-29 15:47:12 +02:00
Prarit Bhargava	7106e02bae	cpufreq: release policy->rwsem on error While debugging a cpufreq-related hardware failure on a system I saw the following lockdep warning: ========================= [ BUG: held lock freed! ] 3.17.0-rc4+ #1 Tainted: G E ------------------------- insmod/2247 is freeing memory ffff88006e1b1400-ffff88006e1b17ff, with a lock still held there! (&policy->rwsem){+.+...}, at: [<ffffffff8156d37d>] __cpufreq_add_dev.isra.21+0x47d/0xb80 3 locks held by insmod/2247: #0: (subsys mutex#5){+.+.+.}, at: [<ffffffff81485579>] subsys_interface_register+0x69/0x120 #1: (cpufreq_rwsem){.+.+.+}, at: [<ffffffff8156cf73>] __cpufreq_add_dev.isra.21+0x73/0xb80 #2: (&policy->rwsem){+.+...}, at: [<ffffffff8156d37d>] __cpufreq_add_dev.isra.21+0x47d/0xb80 stack backtrace: CPU: 0 PID: 2247 Comm: insmod Tainted: G E 3.17.0-rc4+ #1 Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 08/24/2013 0000000000000000 000000008f3063c4 ffff88006f87bb30 ffffffff8171b358 ffff88006bcf3750 ffff88006f87bb68 ffffffff810e09e1 ffff88006e1b1400 ffffea0001b86c00 ffffffff8156d327 ffff880073003500 0000000000000246 Call Trace: [<ffffffff8171b358>] dump_stack+0x4d/0x66 [<ffffffff810e09e1>] debug_check_no_locks_freed+0x171/0x180 [<ffffffff8156d327>] ? __cpufreq_add_dev.isra.21+0x427/0xb80 [<ffffffff8121412b>] kfree+0xab/0x2b0 [<ffffffff8156d327>] __cpufreq_add_dev.isra.21+0x427/0xb80 [<ffffffff81724cf7>] ? _raw_spin_unlock+0x27/0x40 [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffff8156da8e>] cpufreq_add_dev+0xe/0x10 [<ffffffff814855d1>] subsys_interface_register+0xc1/0x120 [<ffffffff8156bcf2>] cpufreq_register_driver+0x112/0x340 [<ffffffff8121415a>] ? kfree+0xda/0x2b0 [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffffa003562e>] pcc_cpufreq_init+0x4af/0xe81 [pcc_cpufreq] [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffff81002144>] do_one_initcall+0xd4/0x210 [<ffffffff811f7472>] ? __vunmap+0xd2/0x120 [<ffffffff81127155>] load_module+0x1315/0x1b70 [<ffffffff811222a0>] ? store_uevent+0x70/0x70 [<ffffffff811229d9>] ? copy_module_from_fd.isra.44+0x129/0x180 [<ffffffff81127b86>] SyS_finit_module+0xa6/0xd0 [<ffffffff81725b69>] system_call_fastpath+0x16/0x1b cpufreq: __cpufreq_add_dev: ->get() failed insmod: ERROR: could not insert module pcc-cpufreq.ko: No such device The warning occurs in the __cpufreq_add_dev() code which does down_write(&policy->rwsem); ... if (cpufreq_driver->get && !cpufreq_driver->setpolicy) { policy->cur = cpufreq_driver->get(policy->cpu); if (!policy->cur) { pr_err("%s: ->get() failed\n", __func__); goto err_get_freq; } If cpufreq_driver->get(policy->cpu) returns an error we execute the code at err_get_freq, which does not up the policy->rwsem. This causes the lockdep warning. Trivial patch to up the policy->rwsem in the error path. After the patch has been applied, and an error occurs in the cpufreq_driver->get(policy->cpu) call we will now see cpufreq: __cpufreq_add_dev: ->get() failed cpufreq: __cpufreq_add_dev: ->get() failed modprobe: ERROR: could not insert 'pcc_cpufreq': No such device Fixes: `4e97b631f2` (cpufreq: Initialize governor for a new policy under policy->rwsem) Signed-off-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 3.14+ <stable@vger.kernel.org> # 3.14+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-09-22 14:32:43 +02:00
Lan Tianyu	8e30444e15	cpufreq: fix cpufreq suspend/resume for intel_pstate Cpufreq core introduces cpufreq_suspended flag to let cpufreq sysfs nodes across S2RAM/S2DISK. But the flag is only set in the cpufreq_suspend() for cpufreq drivers which have target or target_index callback. This skips intel_pstate driver. This patch is to set the flag before checking target or target_index callback. Fixes: `2f0aea9363` (cpufreq: suspend governors on system suspend/hibernate) Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Cc: 3.15+ <stable@vger.kernel.org> # 3.15+ [rjw: Subject] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-09-22 14:23:18 +02:00
Viresh Kumar	1bfb425b3b	cpufreq: move policy kobj to update_policy_cpu() We are calling kobject_move() from two separate places currently and both these places share another routine update_policy_cpu() which is handling everything around updating policy->cpu. Moving ownership of policy->kobj also lies under the role of update_policy_cpu() routine and must be handled from there. So, Lets move kobject_move() to update_policy_cpu() and get rid of cpufreq_nominate_new_policy_cpu() as it doesn't have anything significant left. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-07-21 13:43:20 +02:00
Viresh Kumar	41dfd908fc	cpufreq: propagate error returned by kobject_move() We are returning -EINVAL instead of the error returned from kobject_move() when it fails. Propagate the actual error number. Also add a meaningful print when sysfs_create_link() fails. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-07-21 13:43:20 +02:00
Viresh Kumar	1461dc7d1c	cpufreq: don't restore policy->cpus on failure to move kobj While hot-unplugging policy->cpu, we call cpufreq_nominate_new_policy_cpu() to nominate next owner of policy, i.e. policy->cpu. If we fail to move policy kobject under the new policy->cpu, we try to update policy->cpus with the old policy->cpu. This would have been required in case old-CPU is removed from policy->cpus in the first place. But its not done before calling cpufreq_nominate_new_policy_cpu(), but during the POST_DEAD notification which happens quite late in the hot-unplugging path. So, this is just some useless code hanging around, get rid of it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-07-21 13:43:20 +02:00
Viresh Kumar	92c14bd947	cpufreq: move policy kobj to policy->cpu at resume This is only relevant to implementations with multiple clusters, where clusters have separate clock lines but all CPUs within a cluster share it. Consider a dual cluster platform with 2 cores per cluster. During suspend we start hot unplugging CPUs in order 1 to 3. When CPU2 is removed, policy->kobj would be moved to CPU3 and when CPU3 goes down we wouldn't free policy or its kobj as we want to retain permissions/values/etc. Now on resume, we will get CPU2 before CPU3 and will call __cpufreq_add_dev(). We will recover the old policy and update policy->cpu from 3 to 2 from update_policy_cpu(). But the kobj is still tied to CPU3 and isn't moved to CPU2. We wouldn't create a link for CPU2, but would try that for CPU3 while bringing it online. Which will report errors as CPU3 already has kobj assigned to it. This bug got introduced with commit `42f921a`, which overlooked this scenario. To fix this, lets move kobj to the new policy->cpu while bringing first CPU of a cluster back. Also do a WARN_ON() if kobject_move failed, as we would reach here only for the first CPU of a non-boot cluster. And we can't recover from this situation, if kobject_move() fails. Fixes: `42f921a6f1` (cpufreq: remove sysfs files for CPUs which failed to come back after resume) Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Reported-and-tested-by: Bu Yitian <ybu@qti.qualcomm.com> Reported-by: Saravana Kannan <skannan@codeaurora.org> Reviewed-by: Srivatsa S. Bhat <srivatsa@mit.edu> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-07-17 14:23:22 +02:00
Aaron Plattner	fefa8ff810	cpufreq: unlock when failing cpufreq_update_policy() Commit `bd0fa9bb45` introduced a failure path to cpufreq_update_policy() if cpufreq_driver->get(cpu) returns NULL. However, it jumps to the 'no_policy' label, which exits without unlocking any of the locks the function acquired earlier. This causes later calls into cpufreq to hang. Fix this by creating a new 'unlock' label and jumping to that instead. Fixes: `bd0fa9bb45` ("cpufreq: Return error if ->get() failed in cpufreq_update_policy()") Link: https://devtalk.nvidia.com/default/topic/751903/kernel-3-15-and-nv-drivers-337-340-failed-to-initialize-the-nvidia-kernel-module-gtx-550-ti-/ Signed-off-by: Aaron Plattner <aplattner@nvidia.com> Cc: 3.15+ <stable@vger.kernel.org> # 3.15+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-06-18 21:52:20 +02:00
Viresh Kumar	1c03a2d04d	cpufreq: add support for intermediate (stable) frequencies Douglas Anderson, recently pointed out an interesting problem due to which udelay() was expiring earlier than it should. While transitioning between frequencies few platforms may temporarily switch to a stable frequency, waiting for the main PLL to stabilize. For example: When we transition between very low frequencies on exynos, like between 200MHz and 300MHz, we may temporarily switch to a PLL running at 800MHz. No CPUFREQ notification is sent for that. That means there's a period of time when we're running at 800MHz but loops_per_jiffy is calibrated at between 200MHz and 300MHz. And so udelay behaves badly. To get this fixed in a generic way, introduce another set of callbacks get_intermediate() and target_intermediate(), only for drivers with target_index() and CPUFREQ_ASYNC_NOTIFICATION unset. get_intermediate() should return a stable intermediate frequency platform wants to switch to, and target_intermediate() should set CPU to that frequency, before jumping to the frequency corresponding to 'index'. Core will take care of sending notifications and driver doesn't have to handle them in target_intermediate() or target_index(). NOTE: ->target_index() should restore to policy->restore_freq in case of failures as core would send notifications for that. Tested-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-06-05 23:32:29 +02:00
Viresh Kumar	8d65775d17	cpufreq: handle calls to ->target_index() in separate routine Handling calls to ->target_index() has got complex over time and might become more complex. So, its better to take target_index() bits out in another routine __target_index() for better code readability. Shouldn't have any functional impact. Tested-by: Stephen Warren <swarren@nvidia.com> Reviewed-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-05-29 01:27:38 +02:00
Stratos Karafotis	5eeaf1f189	cpufreq: Fix build error on some platforms that use cpufreq_for_each_* On platforms that use cpufreq_for_each_* macros, build fails if CONFIG_CPU_FREQ=n, e.g. ARM/shmobile/koelsch/non-multiplatform: drivers/built-in.o: In function `clk_round_parent': clkdev.c:(.text+0xcf168): undefined reference to `cpufreq_next_valid' drivers/built-in.o: In function `clk_rate_table_find': clkdev.c:(.text+0xcf820): undefined reference to `cpufreq_next_valid' make[3]: *** [vmlinux] Error 1 Fix this making cpufreq_next_valid function inline and move it to cpufreq.h. Fixes: `27e289dce2` (cpufreq: Introduce macros for cpufreq_frequency_table iteration) Reported-and-tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-05-08 13:10:56 +02:00
Srivatsa S. Bhat	ca654dc3a9	cpufreq: Catch double invocations of cpufreq_freq_transition_begin/end Some cpufreq drivers were redundantly invoking the _begin() and _end() APIs around frequency transitions, and this double invocation (one from the cpufreq core and the other from the cpufreq driver) used to result in a self-deadlock, leading to system hangs during boot. (The _begin() API makes contending callers wait until the previous invocation is complete. Hence, the cpufreq driver would end up waiting on itself!). Now all such drivers have been fixed, but debugging this issue was not very straight-forward (even lockdep didn't catch this). So let us add a debug infrastructure to the cpufreq core to catch such issues more easily in the future. We add a new field called 'transition_task' to the policy structure, to keep track of the task which is performing the frequency transition. Using this field, we make note of this task during _begin() and print a warning if we find a case where the same task is calling _begin() again, before completing the previous frequency transition using the corresponding _end(). We have left out ASYNC_NOTIFICATION drivers from this debug infrastructure for 2 reasons: 1. At the moment, we have no way to avoid a particular scenario where this debug infrastructure can emit false-positive warnings for such drivers. The scenario is depicted below: Task A Task B /* 1st freq transition / Invoke _begin() { ... ... } Change the frequency / 2nd freq transition / Invoke _begin() { ... //waiting for B to ... //finish _end() for ... //the 1st transition ... \| Got interrupt for successful ... \| change of frequency (1st one). ... \| ... \| / 1st freq transition */ ... \| Invoke _end() { ... \| ... ... V } ... ... } This scenario is actually deadlock-free because, once Task A changes the frequency, it is Task B's responsibility to invoke the corresponding _end() for the 1st frequency transition. Hence it is perfectly legal for Task A to go ahead and attempt another frequency transition in the meantime. (Of course it won't be able to proceed until Task B finishes the 1st _end(), but this doesn't cause a deadlock or a hang). The debug infrastructure cannot handle this scenario and will treat it as a deadlock and print a warning. To avoid this, we exclude such drivers from the purview of this code. 2. Luckily, we don't _need_ this infrastructure for ASYNC_NOTIFICATION drivers at all! The cpufreq core does not automatically invoke the _begin() and _end() APIs during frequency transitions in such drivers. Thus, the driver alone is responsible for invoking _begin()/_end() and hence there shouldn't be any conflicts which lead to double invocations. So, we can skip these drivers, since the probability that such drivers will hit this problem is extremely low, as outlined above. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-05-07 00:32:39 +02:00
Stratos Karafotis	27e289dce2	cpufreq: Introduce macros for cpufreq_frequency_table iteration Many cpufreq drivers need to iterate over the cpufreq_frequency_table for various tasks. This patch introduces two macros which can be used for iteration over cpufreq_frequency_table keeping a common coding style across drivers: - cpufreq_for_each_entry: iterate over each entry of the table - cpufreq_for_each_valid_entry: iterate over each entry that contains a valid frequency. It should have no functional changes. Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-04-30 00:05:31 +02:00
Viresh Kumar	236a980052	cpufreq: Make cpufreq_notify_transition & cpufreq_notify_post_transition static cpufreq_notify_transition() and cpufreq_notify_post_transition() shouldn't be called directly by cpufreq drivers anymore and so these should be marked static. Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-26 16:41:41 +01:00
Viresh Kumar	8fec051eea	cpufreq: Convert existing drivers to use cpufreq_freq_transition_{begin\|end} CPUFreq core has new infrastructure that would guarantee serialized calls to target() or target_index() callbacks. These are called cpufreq_freq_transition_begin() and cpufreq_freq_transition_end(). This patch converts existing drivers to use these new set of routines. Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-26 16:41:41 +01:00
Srivatsa S. Bhat	12478cf0c5	cpufreq: Make sure frequency transitions are serialized Whenever we change the frequency of a CPU, we call the PRECHANGE and POSTCHANGE notifiers. They must be serialized, i.e. PRECHANGE and POSTCHANGE notifiers should strictly alternate, thereby preventing two different sets of PRECHANGE or POSTCHANGE notifiers from interleaving arbitrarily. The following examples illustrate why this is important: Scenario 1: ----------- A thread reading the value of cpuinfo_cur_freq, will call __cpufreq_cpu_get()->cpufreq_out_of_sync()->cpufreq_notify_transition() The ondemand governor can decide to change the frequency of the CPU at the same time and hence it can end up sending the notifications via ->target(). If the notifiers are not serialized, the following sequence can occur: - PRECHANGE Notification for freq A (from cpuinfo_cur_freq) - PRECHANGE Notification for freq B (from target()) - Freq changed by target() to B - POSTCHANGE Notification for freq B - POSTCHANGE Notification for freq A We can see from the above that the last POSTCHANGE Notification happens for freq A but the hardware is set to run at freq B. Where would we break then?: adjust_jiffies() in cpufreq.c & cpufreq_callback() in arch/arm/kernel/smp.c (which also adjusts the jiffies). All the loops_per_jiffy calculations will get messed up. Scenario 2: ----------- The governor calls __cpufreq_driver_target() to change the frequency. At the same time, if we change scaling_{min\|max}_freq from sysfs, it will end up calling the governor's CPUFREQ_GOV_LIMITS notification, which will also call __cpufreq_driver_target(). And hence we end up issuing concurrent calls to ->target(). Typically, platforms have the following logic in their ->target() routines: (Eg: cpufreq-cpu0, omap, exynos, etc) A. If new freq is more than old: Increase voltage B. Change freq C. If new freq is less than old: decrease voltage Now, if the two concurrent calls to ->target() are X and Y, where X is trying to increase the freq and Y is trying to decrease it, we get the following race condition: X.A: voltage gets increased for larger freq Y.A: nothing happens Y.B: freq gets decreased Y.C: voltage gets decreased X.B: freq gets increased X.C: nothing happens Thus we can end up setting a freq which is not supported by the voltage we have set. That will probably make the clock to the CPU unstable and the system might not work properly anymore. This patch introduces a set of synchronization primitives to serialize frequency transitions, which are to be used as shown below: cpufreq_freq_transition_begin(); //Perform the frequency change cpufreq_freq_transition_end(); The _begin() call sends the PRECHANGE notification whereas the _end() call sends the POSTCHANGE notification. Also, all the necessary synchronization is handled within these calls. In particular, even drivers which set the ASYNC_NOTIFICATION flag can also use these APIs for performing frequency transitions (ie., you can call _begin() from one task, and call the corresponding _end() from a different task). The actual synchronization underneath is not that complicated: The key challenge is to allow drivers to begin the transition from one thread and end it in a completely different thread (this is to enable drivers that do asynchronous POSTCHANGE notification from bottom-halves, to also use the same interface). To achieve this, a 'transition_ongoing' flag, a 'transition_lock' spinlock and a wait-queue are added per-policy. The flag and the wait-queue are used in conjunction to create an "uninterrupted flow" from _begin() to _end(). The spinlock is used to ensure that only one such "flow" is in flight at any given time. Put together, this provides us all the necessary synchronization. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-26 16:41:40 +01:00
Viresh Kumar	0c5aa405a9	cpufreq: resume drivers before enabling governors During suspend, we first stop governors and then suspend cpufreq drivers and resume must be exactly opposite of that. i.e. resume drivers first and then start governors. But the current code in resume enables governors first and then resume drivers. Fix it be changing code sequence there. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-26 16:37:18 +01:00
Dirk Brandewie	367dc4aa93	cpufreq: Add stop CPU callback to cpufreq_driver interface This callback allows the driver to do clean up before the CPU is completely down and its state cannot be modified. This is used by the intel_pstate driver to reduce the requested P state prior to the core going away. This is required because the requested P state of the offline core is used to select the package P state. This effectively sets the floor package P state to the requested P state on the offline core. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> [rjw: Minor modifications] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 03:50:12 +01:00
Stratos Karafotis	bda9f552f9	cpufreq: Remove unnecessary braces Remove unnecessary braces from a single statement. Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 03:40:48 +01:00
Stratos Karafotis	e5c87b7628	cpufreq: Fix checkpatch errors and warnings Fix 2 checkpatch errors about using assignment in if condition, 1 checkpatch error about a required space after comma and 3 warnings about line over 80 characters. Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 03:39:28 +01:00
Viresh Kumar	0b443ead71	cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE\|RESUMECHANGE} Two cpufreq notifiers CPUFREQ_RESUMECHANGE and CPUFREQ_SUSPENDCHANGE have not been used for some time, so remove them to clean up code a bit. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-19 14:10:24 +01:00
Rafael J. Wysocki	9832235f3f	cpufreq: Do not allow ->setpolicy drivers to provide ->target cpufreq drivers that provide the ->setpolicy() callback are supposed to have integrated governors, so they don't need to set ->target() or ->target_index() and may confuse the core if any of these callbacks is present. For this reason, add a check preventing ->setpolicy cpufreq drivers from registering if they have non-NULL ->target or ->target_index. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2014-03-19 12:48:30 +01:00
Rafael J. Wysocki	15afee3aea	Merge back earlier 'pm-cpufreq' material.	2014-03-17 13:51:39 +01:00
Rafael J. Wysocki	2ed99e39cb	cpufreq: Skip current frequency initialization for ->setpolicy drivers After commit `da60ce9f2f` (cpufreq: call cpufreq_driver->get() after calling ->init()) __cpufreq_add_dev() sometimes fails for CPUs handled by intel_pstate, because that driver may return 0 from its ->get() callback if it has not run long enough to collect enough samples on the given CPU. That didn't happen before commit `da60ce9f2f` which added policy->cur initialization to __cpufreq_add_dev() to help reduce code duplication in other cpufreq drivers. However, the code added by commit `da60ce9f2f` need not be executed for cpufreq drivers having the ->setpolicy callback defined, because the subsequent invocation of cpufreq_set_policy() will use that callback to initialize the policy anyway and it doesn't need policy->cur to be initialized upfront. The analogous code in cpufreq_update_policy() is also unnecessary for cpufreq drivers having ->setpolicy set and may be skipped for them as well. Since intel_pstate provides ->setpolicy, skipping the upfront policy->cur initialization for cpufreq drivers with that callback set will cover intel_pstate and the problem it's been having after commit `da60ce9f2f` will be addressed. Fixes: `da60ce9f2f` (cpufreq: call cpufreq_driver->get() after calling ->init()) References: https://bugzilla.kernel.org/show_bug.cgi?id=71931 Reported-and-tested-by: Patrik Lundquist <patrik.lundquist@gmail.com> Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-13 00:37:16 +01:00
Viresh Kumar	96bbbe4a2a	cpufreq: Remove unnecessary variable/parameter 'frozen' We have used 'frozen' variable/function parameter at many places to distinguish between CPU offline/online on suspend/resume vs sysfs removals. We now have another variable cpufreq_suspended which can be used in these cases, so we can get rid of all those variables or function parameters. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-12 01:06:01 +01:00
Viresh Kumar	e0b3165ba5	cpufreq: add 'freq_table' in struct cpufreq_policy freq table is not per CPU but per policy, so it makes more sense to keep it within struct cpufreq_policy instead of a per-cpu variable. This patch does it. Over that, there is no need to set policy->freq_table to NULL in ->exit(), as policy structure is going to be freed soon. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-12 01:06:00 +01:00
Joe Perches	e837f9b58b	cpufreq: Reformat printk() statements - Add missing newlines - Coalesce format fragments - Convert printks to pr_<level> - Align arguments Based-on-patch-by: Sören Brinkmann <soren.brinkmann@xilinx.com> Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-12 00:49:22 +01:00
Viresh Kumar	e28867eab7	cpufreq: Implement cpufreq_generic_suspend() Multiple platforms need to set CPUs to a particular frequency before suspending the system, so provide a common infrastructure for them. Those platforms only need to point their ->suspend callback pointers to the generic routine. Tested-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-06 15:04:12 +01:00
Viresh Kumar	2f0aea9363	cpufreq: suspend governors on system suspend/hibernate This patch adds cpufreq suspend/resume calls to dpm_{suspend\|resume}() for handling suspend/resume of cpufreq governors. Lan Tianyu (Intel) & Jinhyuk Choi (Broadcom) found an issue where the tunables configuration for clusters/sockets with non-boot CPUs was lost after system suspend/resume, as we were notifying governors with CPUFREQ_GOV_POLICY_EXIT on removal of the last CPU for that policy which caused the tunables memory to be freed. This is fixed by preventing any governor operations from being carried out between the device suspend and device resume stages of system suspend and resume, respectively. We could have added these callbacks at dpm_{suspend\|resume}_noirq() level, but there is an additional problem that the majority of I/O devices is already suspended at that point and if cpufreq drivers want to change the frequency before suspending, then that not be possible on some platforms (which depend on peripherals like i2c, regulators, etc). Reported-and-tested-by: Lan Tianyu <tianyu.lan@intel.com> Reported-by: Jinhyuk Choi <jinchoi@broadcom.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-06 15:04:12 +01:00
viresh kumar	6e2c89d16d	cpufreq: move call to __find_governor() to cpufreq_init_policy() We call __find_governor() during the addition of the first CPU of each policy from __cpufreq_add_dev() to find the last governor used for this CPU before it was hot-removed. After that we call cpufreq_parse_governor() in cpufreq_init_policy(), either with this governor, or with the default governor. Right after that policy->governor is set to NULL. While that code is not functionally problematic, the structure of it is suboptimal, because some of the code required in cpufreq_init_policy() is being executed by its caller, __cpufreq_add_dev(). So, it would make more sense to get all of it together in a single place to make code more readable. Accordingly, move the code needed for policy initialization to cpufreq_init_policy() and initialize policy->governor to NULL at the beginning. In order to clean up the code a bit more, some of the #ifdefs for CONFIG_HOTPLUG_CPU are dropped too. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-06 14:38:44 +01:00
Rafael J. Wysocki	3b4aff0472	Merge back earlier 'pm-cpufreq' material.	2014-03-06 13:25:59 +01:00
Viresh Kumar	4e97b631f2	cpufreq: Initialize governor for a new policy under policy->rwsem policy->rwsem is used to lock access to all parts of code modifying struct cpufreq_policy, but it's not used on a new policy created by __cpufreq_add_dev(). Because of that, if cpufreq_update_policy() is called in a tight loop on one CPU in parallel with offline/online of another CPU, then the following crash can be triggered: Unable to handle kernel NULL pointer dereference at virtual address 00000020 pgd = c0003000 [00000020] pgd=80000000004003, pmd=00000000 Internal error: Oops: 206 [#1] PREEMPT SMP ARM PC is at __cpufreq_governor+0x10/0x1ac LR is at cpufreq_update_policy+0x114/0x150 ---[ end trace f23a8defea6cd706 ]--- Kernel panic - not syncing: Fatal exception CPU0: stopping CPU: 0 PID: 7136 Comm: mpdecision Tainted: G D W 3.10.0-gd727407-00074-g979ede8 #396 [<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58) [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58) from [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c) [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c) from [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8) [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8) from [<c0803e7c>] (cpufreq_init_policy+0x30/0x98) [<c0803e7c>] (cpufreq_init_policy+0x30/0x98) from [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4) [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4) from [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84) [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84) from [<c0afe180>] (notifier_call_chain+0x40/0x68) [<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02812dc>] (__cpu_notify+0x28/0x44) [<c02812dc>] (__cpu_notify+0x28/0x44) from [<c0aeed90>] (_cpu_up+0xf4/0x1dc) [<c0aeed90>] (_cpu_up+0xf4/0x1dc) from [<c0aeeed4>] (cpu_up+0x5c/0x78) [<c0aeeed4>] (cpu_up+0x5c/0x78) from [<c0aec808>] (store_online+0x44/0x74) [<c0aec808>] (store_online+0x44/0x74) from [<c03a40f4>] (sysfs_write_file+0x108/0x14c) [<c03a40f4>] (sysfs_write_file+0x108/0x14c) from [<c03517d4>] (vfs_write+0xd0/0x180) [<c03517d4>] (vfs_write+0xd0/0x180) from [<c0351ca8>] (SyS_write+0x38/0x68) [<c0351ca8>] (SyS_write+0x38/0x68) from [<c0205de0>] (ret_fast_syscall+0x0/0x30) Fix that by taking locks at appropriate places in __cpufreq_add_dev() as well. Reported-by: Saravana Kannan <skannan@codeaurora.org> Suggested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-06 13:25:30 +01:00
Viresh Kumar	5a7e56a5d2	cpufreq: Initialize policy before making it available for others to use Policy must be fully initialized before it is being made available for use by others. Otherwise cpufreq_cpu_get() would be able to grab a half initialized policy structure that might not have affected_cpus (for example) populated. Then, anybody accessing those fields will get a wrong value and that will lead to unpredictable results. In order to fix this, do all the necessary initialization before we make the policy structure available via cpufreq_cpu_get(). That will guarantee that any code accessing fields of the policy will get correct data from them. Reported-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-06 13:25:29 +01:00
Aaron Plattner	999976e0f6	cpufreq: use cpufreq_cpu_get() to avoid cpufreq_get() race conditions If a module calls cpufreq_get while cpufreq is initializing, it's possible for it to be called after cpufreq_driver is set but before cpufreq_cpu_data is written during subsys_interface_register. This happens because cpufreq_get doesn't take the cpufreq_driver_lock around its use of cpufreq_cpu_data. Fix this by using cpufreq_cpu_get(cpu) to look up the policy rather than reading it out of cpufreq_cpu_data directly. cpufreq_cpu_get() takes the appropriate locks to prevent this race from happening. Since it's possible for policy to be NULL if the caller passes in an invalid CPU number or calls the function before cpufreq is initialized, delete the BUG_ON(!policy) and simply return 0. Don't try to return -ENOENT because that's negative and the function returns an unsigned integer. References: https://bbs.archlinux.org/viewtopic.php?id=177934 Signed-off-by: Aaron Plattner <aplattner@nvidia.com> Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-06 13:25:16 +01:00
Viresh Kumar	bd0fa9bb45	cpufreq: Return error if ->get() failed in cpufreq_update_policy() cpufreq_update_policy() calls cpufreq_driver->get() to get current frequency of a CPU and it is not supposed to fail or return zero. Return error in case that happens. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-01 01:39:30 +01:00
Rashika Kheria	8a5c74a175	cpufreq: Mark function as static in cpufreq.c Mark function as static in cpufreq.c because it is not used outside this file. This eliminates the following warning in cpufreq.c: drivers/cpufreq/cpufreq.c:355:9: warning: no previous prototype for ‘show_boost’ [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-02-27 00:49:36 +01:00
Viresh Kumar	1c0ca90207	cpufreq: don't call cpufreq_update_policy() on CPU addition cpufreq_update_policy() is called from two places currently. From a workqueue handled queued from cpufreq_bp_resume() for boot CPU and from cpufreq_cpu_callback() whenever a CPU is added. The first one makes sure that boot CPU is running on the frequency present in policy->cpu. But we don't really need a call from cpufreq_cpu_callback(), because we always call cpufreq_driver->init() (which will set policy->cur correctly) whenever first CPU of any policy is added back. And so every policy structure is guaranteed to have the right frequency in policy->cur. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-02-24 13:37:43 +01:00
Rafael J. Wysocki	d9a789c7a0	cpufreq: Refactor cpufreq_set_policy() Reduce the rampant usage of goto and the indentation level in cpufreq_set_policy() to improve the readability of that code. No functional changes should result from that. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>	2014-02-24 13:37:43 +01:00
viresh kumar	6964d91db2	cpufreq: remove sysfs link when a cpu != policy->cpu, is removed Commit `42f921a` (cpufreq: remove sysfs files for CPUs which failed to come back after resume) tried to do this but missed this piece of code to fix. Currently we are getting this on suspend/resume: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 877 at fs/sysfs/dir.c:52 sysfs_warn_dup+0x68/0x84() sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq' Modules linked in: brcmfmac brcmutil CPU: 0 PID: 877 Comm: test-rtc-resume Not tainted 3.14.0-rc2-00259-g9398a10cd964 #12 [<c0015bac>] (unwind_backtrace) from [<c0011850>] (show_stack+0x10/0x14) [<c0011850>] (show_stack) from [<c056e018>] (dump_stack+0x80/0xcc) [<c056e018>] (dump_stack) from [<c0025e44>] (warn_slowpath_common+0x64/0x88) [<c0025e44>] (warn_slowpath_common) from [<c0025efc>] (warn_slowpath_fmt+0x30/0x40) [<c0025efc>] (warn_slowpath_fmt) from [<c012776c>] (sysfs_warn_dup+0x68/0x84) [<c012776c>] (sysfs_warn_dup) from [<c0127a54>] (sysfs_do_create_link_sd+0xb0/0xb8) [<c0127a54>] (sysfs_do_create_link_sd) from [<c038ef64>] (__cpufreq_add_dev.isra.27+0x2a8/0x814) [<c038ef64>] (__cpufreq_add_dev.isra.27) from [<c038f548>] (cpufreq_cpu_callback+0x70/0x8c) [<c038f548>] (cpufreq_cpu_callback) from [<c0043864>] (notifier_call_chain+0x44/0x84) [<c0043864>] (notifier_call_chain) from [<c0025f60>] (__cpu_notify+0x28/0x44) [<c0025f60>] (__cpu_notify) from [<c00261e8>] (_cpu_up+0xf0/0x140) [<c00261e8>] (_cpu_up) from [<c0569eb8>] (enable_nonboot_cpus+0x68/0xb0) [<c0569eb8>] (enable_nonboot_cpus) from [<c006339c>] (suspend_devices_and_enter+0x198/0x2dc) [<c006339c>] (suspend_devices_and_enter) from [<c0063654>] (pm_suspend+0x174/0x1e8) [<c0063654>] (pm_suspend) from [<c00624e0>] (state_store+0x6c/0xbc) [<c00624e0>] (state_store) from [<c01fc200>] (kobj_attr_store+0x14/0x20) [<c01fc200>] (kobj_attr_store) from [<c0126e50>] (sysfs_kf_write+0x44/0x48) [<c0126e50>] (sysfs_kf_write) from [<c012a274>] (kernfs_fop_write+0xb4/0x14c) [<c012a274>] (kernfs_fop_write) from [<c00d4818>] (vfs_write+0xa8/0x180) [<c00d4818>] (vfs_write) from [<c00d4bb8>] (SyS_write+0x3c/0x70) [<c00d4bb8>] (SyS_write) from [<c000e620>] (ret_fast_syscall+0x0/0x30) ---[ end trace 76969904b614c18f ]--- Fix this by removing sysfs link for cpufreq directory when cpu removed isn't policy->cpu. Revamps: `42f921a` (cpufreq: remove sysfs files for CPUs which failed to come back after resume) Reported-and-tested-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-02-19 01:04:40 +01:00
Lukasz Majewski	6f19efc0a1	cpufreq: Add boost frequency support in core This commit adds boost frequency support in cpufreq core (Hardware & Software). Some SoCs (like Exynos4 - e.g. 4x12) allow setting frequency above its normal operation limits. Such mode shall be only used for a short time. Overclocking (boost) support is essentially provided by platform dependent cpufreq driver. This commit unifies support for SW and HW (Intel) overclocking solutions in the core cpufreq driver. Previously the "boost" sysfs attribute was defined in the ACPI processor driver code. By default boost is disabled. One global attribute is available at: /sys/devices/system/cpu/cpufreq/boost. It only shows up when cpufreq driver supports overclocking. Under the hood frequencies dedicated for boosting are marked with a special flag (CPUFREQ_BOOST_FREQ) at driver's frequency table. It is the user's concern to enable/disable overclocking with a proper call to sysfs. The cpufreq_boost_trigger_state() function is defined non static on purpose. It is used later with thermal subsystem to provide automatic enable/disable of the BOOST feature. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-01-17 02:00:44 +01:00
Viresh Kumar	652ed95d5f	cpufreq: introduce cpufreq_generic_get() routine CPUFreq drivers that use clock frameworks interface,i.e. clk_get_rate(), to get CPUs clk rate, have similar sort of code used in most of them. This patch adds a generic ->get() which will do the same thing for them. All those drivers are required to now is to set .get to cpufreq_generic_get() and set their clk pointer in policy->clk during ->init(). Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no> Acked-by: Shawn Guo <shawn.guo@linaro.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Shawn Guo <shawn.guo@linaro.org> Acked-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-01-17 02:00:44 +01:00
Viresh Kumar	fcd7af917a	cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly There are several problems with cpufreq stats in the way it handles cpufreq_unregister_driver() and suspend/resume.. - We must not lose data collected so far when suspend/resume happens and so stats directories must not be removed/allocated during these operations, which is done currently. - cpufreq_stat has registered notifiers with both cpufreq and hotplug. It adds sysfs stats directory with a cpufreq notifier: CPUFREQ_NOTIFY and removes this directory with a notifier from hotplug core. In case cpufreq_unregister_driver() is called (on rmmod cpufreq driver), stats directories per cpu aren't removed as CPUs are still online. The only call cpufreq_stats gets is cpufreq_stats_update_policy_cpu() for all CPUs except the last of each policy. And pointer to stat information is stored in the entry for last CPU in the per-cpu cpufreq_stats_table. But policy structure would be freed inside cpufreq core and so that will result in memory leak inside cpufreq stats (as we are never freeing memory for stats). Now if we again insert the module cpufreq_register_driver() will be called and we will again allocate stats data and put it on for first CPU of every policy. In case we only have a single CPU per policy, we will return with a error from cpufreq_stats_create_table() due to this code: if (per_cpu(cpufreq_stats_table, cpu)) return -EBUSY; And so probably cpufreq stats directory would not show up anymore (as it was added inside last policies->kobj which doesn't exist anymore). I haven't tested it, though. Also the values in stats files wouldn't be refreshed as we are using the earlier stats structure. - CPUFREQ_NOTIFY is called from cpufreq_set_policy() which is called for scenarios where we don't really want cpufreq_stat_notifier_policy() to get called. For example whenever we are changing anything related to a policy: min/max/current freq, etc. cpufreq_set_policy() is called and so cpufreq stats is notified. Where we don't do any useful stuff other than simply returning with -EBUSY from cpufreq_stats_create_table(). And so this isn't the right notifier that cpufreq stats.. Due to all above reasons this patch does following changes: - Add new notifiers CPUFREQ_CREATE_POLICY and CPUFREQ_REMOVE_POLICY, which are only called when policy is created/destroyed. They aren't called for suspend/resume paths.. - Use these notifiers in cpufreq_stat_notifier_policy() to create/destory stats sysfs entries. And so cpufreq_unregister_driver() or suspend/resume shouldn't be a problem for cpufreq_stats. - Return early from cpufreq_stat_cpu_callback() for suspend/resume sequence, so that we don't free stats structure. Acked-by: Nicolas Pitre <nico@linaro.org> Tested-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-01-17 02:00:43 +01:00
Viresh Kumar	d3916691c9	cpufreq: Make sure CPU is running on a freq from freq-table Sometimes boot loaders set CPU frequency to a value outside of frequency table present with cpufreq core. In such cases CPU might be unstable if it has to run on that frequency for long duration of time and so its better to set it to a frequency which is specified in freq-table. This also makes cpufreq stats inconsistent as cpufreq-stats would fail to register because current frequency of CPU isn't found in freq-table. Because we don't want this change to affect boot process badly, we go for the next freq which is >= policy->cur ('cur' must be set by now, otherwise we will end up setting freq to lowest of the table as 'cur' is initialized to zero). In case current frequency doesn't match any frequency from freq-table, we throw warnings to user, so that user can get this fixed in their bootloaders or freq-tables. Reported-by: Carlos Hernandez <ceh@ti.com> Reported-and-tested-by: Nishanth Menon <nm@ti.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-01-06 14:17:25 +01:00
Viresh Kumar	ab1b1c4e82	cpufreq: send new set of notification for transition failures In the current code, if we fail during a frequency transition, we simply send the POSTCHANGE notification with the old frequency. This isn't enough. One of the core users of these notifications is the code responsible for keeping loops_per_jiffy aligned with frequency changes. And mostly it is written as: if ((val == CPUFREQ_PRECHANGE && freq->old < freq->new) \|\| (val == CPUFREQ_POSTCHANGE && freq->old > freq->new)) { update-loops-per-jiffy... } So, suppose we are changing to a higher frequency and failed during transition, then following will happen: - CPUFREQ_PRECHANGE notification with freq-new > freq-old - CPUFREQ_POSTCHANGE notification with freq-new == freq-old The first one will update loops_per_jiffy and second one will do nothing. Even if we send the 2nd notification by exchanging values of freq-new and old, some users of these notifications might get unstable. This can be fixed by simply calling cpufreq_notify_post_transition() with error code and this routine will take care of sending notifications in the correct order. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Folded 3 patches into one, rebased unicore2 changes] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-01-06 01:43:44 +01:00
Viresh Kumar	f7ba3b41e2	cpufreq: Introduce cpufreq_notify_post_transition() This introduces a new routine cpufreq_notify_post_transition() which can be used to send POSTCHANGE notification for new freq with or without both {PRE\|POST}CHANGE notifications for last freq. This is useful at multiple places, especially for sending transition failure notifications. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-01-06 01:43:44 +01:00
Jane Li	6f1e4efd88	cpufreq: Fix timer/workqueue corruption by protecting reading governor_enabled When a CPU is hot removed we'll cancel all the delayed work items via gov_cancel_work(). Sometimes the delayed work function determines that it should adjust the delay for all other CPUs that the policy is managing. If this scenario occurs, the canceling CPU will cancel its own work but queue up the other CPUs works to run. Commit 3617f2 (cpufreq: Fix timer/workqueue corruption due to double queueing) has tried to fix this, but reading governor_enabled is not protected by cpufreq_governor_lock. Even though od_dbs_timer() checks governor_enabled before gov_queue_work(), this scenario may occur. For example: CPU0 CPU1 ---- ---- cpu_down() ... <work runs> __cpufreq_remove_dev() od_dbs_timer() __cpufreq_governor() policy->governor_enabled policy->governor_enabled = false; cpufreq_governor_dbs() case CPUFREQ_GOV_STOP: gov_cancel_work(dbs_data, policy); cpu0 work is canceled timer is canceled cpu1 work is canceled <waits for cpu1> gov_queue_work(, , true); cpu0 work queued cpu1 work queued cpu2 work queued ... cpu1 work is canceled cpu2 work is canceled ... At the end of the GOV_STOP case cpu0 still has a work queued to run although the code is expecting all of the works to be canceled. __cpufreq_remove_dev() will then proceed to re-initialize all the other CPUs works except for the CPU that is going down. The CPUFREQ_GOV_START case in cpufreq_governor_dbs() will trample over the queued work and debugobjects will spit out a warning: WARNING: at lib/debugobjects.c:260 debug_print_object+0x94/0xbc() ODEBUG: init active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x14 Modules linked in: CPU: 1 PID: 1205 Comm: sh Tainted: G W 3.10.0 #200 [<c01144f0>] (unwind_backtrace+0x0/0xf8) from [<c0111d98>] (show_stack+0x10/0x14) [<c0111d98>] (show_stack+0x10/0x14) from [<c01272cc>] (warn_slowpath_common+0x4c/0x68) [<c01272cc>] (warn_slowpath_common+0x4c/0x68) from [<c012737c>] (warn_slowpath_fmt+0x30/0x40) [<c012737c>] (warn_slowpath_fmt+0x30/0x40) from [<c034c640>] (debug_print_object+0x94/0xbc) [<c034c640>] (debug_print_object+0x94/0xbc) from [<c034c7f8>] (__debug_object_init+0xc8/0x3c0) [<c034c7f8>] (__debug_object_init+0xc8/0x3c0) from [<c01360e0>] (init_timer_key+0x20/0x104) [<c01360e0>] (init_timer_key+0x20/0x104) from [<c04872ac>] (cpufreq_governor_dbs+0x1dc/0x68c) [<c04872ac>] (cpufreq_governor_dbs+0x1dc/0x68c) from [<c04833a8>] (__cpufreq_governor+0x80/0x1b0) [<c04833a8>] (__cpufreq_governor+0x80/0x1b0) from [<c0483704>] (__cpufreq_remove_dev.isra.12+0x22c/0x380) [<c0483704>] (__cpufreq_remove_dev.isra.12+0x22c/0x380) from [<c0692f38>] (cpufreq_cpu_callback+0x48/0x5c) [<c0692f38>] (cpufreq_cpu_callback+0x48/0x5c) from [<c014fb40>] (notifier_call_chain+0x44/0x84) [<c014fb40>] (notifier_call_chain+0x44/0x84) from [<c012ae44>] (__cpu_notify+0x2c/0x48) [<c012ae44>] (__cpu_notify+0x2c/0x48) from [<c068dd40>] (_cpu_down+0x80/0x258) [<c068dd40>] (_cpu_down+0x80/0x258) from [<c068df40>] (cpu_down+0x28/0x3c) [<c068df40>] (cpu_down+0x28/0x3c) from [<c068e4c0>] (store_online+0x30/0x74) [<c068e4c0>] (store_online+0x30/0x74) from [<c03a7308>] (dev_attr_store+0x18/0x24) [<c03a7308>] (dev_attr_store+0x18/0x24) from [<c0256fe0>] (sysfs_write_file+0x100/0x180) [<c0256fe0>] (sysfs_write_file+0x100/0x180) from [<c01fec9c>] (vfs_write+0xbc/0x184) [<c01fec9c>] (vfs_write+0xbc/0x184) from [<c01ff034>] (SyS_write+0x40/0x68) [<c01ff034>] (SyS_write+0x40/0x68) from [<c010e200>] (ret_fast_syscall+0x0/0x48) In gov_queue_work(), lock cpufreq_governor_lock before gov_queue_work, and unlock it after __gov_queue_work(). In this way, governor_enabled is guaranteed not changed in gov_queue_work(). Signed-off-by: Jane Li <jiel@marvell.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-01-06 01:22:02 +01:00
Viresh Kumar	08fd8c1cf0	cpufreq: preserve user_policy across suspend/resume Prevent __cpufreq_add_dev() from overwriting the existing values of user_policy.{min\|max\|policy\|governor} with defaults during resume from system suspend. Fixes: `5302c3fb2e` ("cpufreq: Perform light-weight init/teardown during suspend/resume") Reported-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 3.12+ <stable@vger.kernel.org> # 3.12+ [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-12-29 15:31:21 +01:00
Rafael J. Wysocki	72368d122c	cpufreq: Clean up after a failing light-weight initialization If cpufreq_policy_restore() returns NULL during system resume, __cpufreq_add_dev() should just fall back to the full initialization instead of returning an error, because that may actually make things work. Moreover, it should not leave stale fallback data behind after it has failed to restore a previously existing policy. This change is based on Viresh Kumar's work. Fixes: `5302c3fb2e` ("cpufreq: Perform light-weight init/teardown during suspend/resume") Reported-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 3.12+ <stable@vger.kernel.org> # 3.12+	2013-12-29 15:30:36 +01:00
Jason Baron	a27a9ab706	cpufreq: Use CONFIG_CPU_FREQ_DEFAULT_* to set initial policy for setpolicy drivers When configuring a default governor (via CONFIG_CPU_FREQ_DEFAULT_*) with the intel_pstate driver, the desired default policy is not properly set. For example, setting 'CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE' ends up with the 'powersave' policy being set. Fix by configuring the correct default policy, if either 'powersave' or 'performance' are requested. Otherwise, fallback to what the driver originally set via its 'init' routine. Signed-off-by: Jason Baron <jbaron@akamai.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-12-22 00:51:52 +01:00
Viresh Kumar	42f921a6f1	cpufreq: remove sysfs files for CPUs which failed to come back after resume There are cases where cpufreq_add_dev() may fail for some CPUs during system resume. With the current code we will still have sysfs cpufreq files for those CPUs and struct cpufreq_policy would be already freed for them. Hence any operation on those sysfs files would result in kernel warnings. Example of problems resulting from resume errors (from Bjørn Mork): WARNING: CPU: 0 PID: 6055 at fs/sysfs/file.c:343 sysfs_open_file+0x77/0x212() missing sysfs attribute operations for kobject: (null) Modules linked in: [stripped as irrelevant] CPU: 0 PID: 6055 Comm: grep Tainted: G D 3.13.0-rc2 #153 Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011 0000000000000009 ffff8802327ebb78 ffffffff81380b0e 0000000000000006 ffff8802327ebbc8 ffff8802327ebbb8 ffffffff81038635 0000000000000000 ffffffff811823c7 ffff88021a19e688 ffff88021a19e688 ffff8802302f9310 Call Trace: [<ffffffff81380b0e>] dump_stack+0x55/0x76 [<ffffffff81038635>] warn_slowpath_common+0x7c/0x96 [<ffffffff811823c7>] ? sysfs_open_file+0x77/0x212 [<ffffffff810386e3>] warn_slowpath_fmt+0x41/0x43 [<ffffffff81182dec>] ? sysfs_get_active+0x6b/0x82 [<ffffffff81182382>] ? sysfs_open_file+0x32/0x212 [<ffffffff811823c7>] sysfs_open_file+0x77/0x212 [<ffffffff81182350>] ? sysfs_schedule_callback+0x1ac/0x1ac [<ffffffff81122562>] do_dentry_open+0x17c/0x257 [<ffffffff8112267e>] finish_open+0x41/0x4f [<ffffffff81130225>] do_last+0x80c/0x9ba [<ffffffff8112dbbd>] ? inode_permission+0x40/0x42 [<ffffffff81130606>] path_openat+0x233/0x4a1 [<ffffffff81130b7e>] do_filp_open+0x35/0x85 [<ffffffff8113b787>] ? __alloc_fd+0x172/0x184 [<ffffffff811232ea>] do_sys_open+0x6b/0xfa [<ffffffff811233a7>] SyS_openat+0xf/0x11 [<ffffffff8138c812>] system_call_fastpath+0x16/0x1b To fix this, remove those sysfs files or put the associated kobject in case of such errors. Also, to make it simple, remove the cpufreq sysfs links from all the CPUs (except for the policy->cpu) during suspend, as that operation won't result in a loss of sysfs file permissions and we can create those links during resume just fine. Fixes: `5302c3fb2e` ("cpufreq: Perform light-weight init/teardown during suspend/resume") Reported-and-tested-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 3.12+ <stable@vger.kernel.org> # 3.12+ [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-12-22 00:47:46 +01:00
Rafael J. Wysocki	d4faadd5d5	Revert "cpufreq: fix garbage kobjects on errors during suspend/resume" Commit `2167e2399d` (cpufreq: fix garbage kobjects on errors during suspend/resume) breaks suspend/resume on Martin Ziegler's system (hard lockup during resume), so revert it. Fixes: `2167e2399d` (cpufreq: fix garbage kobjects on errors during suspend/resume) References: https://bugzilla.kernel.org/show_bug.cgi?id=66751 Reported-by: Martin Ziegler <ziegler@uni-freiburg.de> Cc: 3.12+ <stable@vger.kernel.org> # 3.12+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-12-08 01:32:41 +01:00
Rafael J. Wysocki	12205a4b79	Revert "cpufreq: suspend governors on system suspend/hibernate" Commit `5a87182aa2` (cpufreq: suspend governors on system suspend/hibernate) causes hibernation problems to happen on Bjørn Mork's and Paul Bolle's systems, so revert it. Fixes: `5a87182aa2` (cpufreq: suspend governors on system suspend/hibernate) Reported-by: Bjørn Mork <bjorn@mork.no> Reported-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-12-08 01:04:17 +01:00
Bjørn Mork	2167e2399d	cpufreq: fix garbage kobjects on errors during suspend/resume This is effectively a revert of commit `5302c3fb2e` ("cpufreq: Perform light-weight init/teardown during suspend/resume"), which enabled suspend/resume optimizations leaving the sysfs files in place. Errors during suspend/resume are not handled properly, leaving dead sysfs attributes in case of failures. There are are number of functions with special code for the "frozen" case, and all these need to also have special error handling. The problem is easy to demonstrate by making cpufreq_driver->init() or cpufreq_driver->get() fail during resume. The code is too complex for a simple fix, with split code paths in multiple blocks within a number of functions. It is therefore best to revert the patch enabling this code until the error handling is in place. Examples of problems resulting from resume errors: WARNING: CPU: 0 PID: 6055 at fs/sysfs/file.c:343 sysfs_open_file+0x77/0x212() missing sysfs attribute operations for kobject: (null) Modules linked in: [stripped as irrelevant] CPU: 0 PID: 6055 Comm: grep Tainted: G D 3.13.0-rc2 #153 Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011 0000000000000009 ffff8802327ebb78 ffffffff81380b0e 0000000000000006 ffff8802327ebbc8 ffff8802327ebbb8 ffffffff81038635 0000000000000000 ffffffff811823c7 ffff88021a19e688 ffff88021a19e688 ffff8802302f9310 Call Trace: [<ffffffff81380b0e>] dump_stack+0x55/0x76 [<ffffffff81038635>] warn_slowpath_common+0x7c/0x96 [<ffffffff811823c7>] ? sysfs_open_file+0x77/0x212 [<ffffffff810386e3>] warn_slowpath_fmt+0x41/0x43 [<ffffffff81182dec>] ? sysfs_get_active+0x6b/0x82 [<ffffffff81182382>] ? sysfs_open_file+0x32/0x212 [<ffffffff811823c7>] sysfs_open_file+0x77/0x212 [<ffffffff81182350>] ? sysfs_schedule_callback+0x1ac/0x1ac [<ffffffff81122562>] do_dentry_open+0x17c/0x257 [<ffffffff8112267e>] finish_open+0x41/0x4f [<ffffffff81130225>] do_last+0x80c/0x9ba [<ffffffff8112dbbd>] ? inode_permission+0x40/0x42 [<ffffffff81130606>] path_openat+0x233/0x4a1 [<ffffffff81130b7e>] do_filp_open+0x35/0x85 [<ffffffff8113b787>] ? __alloc_fd+0x172/0x184 [<ffffffff811232ea>] do_sys_open+0x6b/0xfa [<ffffffff811233a7>] SyS_openat+0xf/0x11 [<ffffffff8138c812>] system_call_fastpath+0x16/0x1b The failure to restore cpufreq devices on cancelled hibernation is not a new bug. It is caused by the ACPI _PPC call failing unless the hibernate is completed. This makes the acpi_cpufreq driver fail its init. Previously, the cpufreq device could be restored by offlining the cpu temporarily. And as a complete hibernation cycle would do this, it would be automatically restored most of the time. But after commit `5302c3fb2e` the leftover sysfs attributes will block any device add action. Therefore offlining and onlining CPU 1 will no longer restore the cpufreq object, and a complete suspend/resume cycle will replace it with garbage. Fixes: `5302c3fb2e` ("cpufreq: Perform light-weight init/teardown during suspend/resume") Cc: 3.12+ <stable@vger.kernel.org> # 3.12+ Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-12-03 15:25:52 +01:00
Viresh Kumar	5a87182aa2	cpufreq: suspend governors on system suspend/hibernate This patch adds cpufreq suspend/resume calls to dpm_{suspend\|resume}_noirq() for handling suspend/resume of cpufreq governors. Lan Tianyu (Intel) & Jinhyuk Choi (Broadcom) found anr issue where tunables configuration for clusters/sockets with non-boot CPUs was getting lost after suspend/resume, as we were notifying governors with CPUFREQ_GOV_POLICY_EXIT on removal of the last cpu for that policy and so deallocating memory for tunables. This is fixed by this patch as we don't allow any operation on governors after device suspend and before device resume now. Reported-and-tested-by: Lan Tianyu <tianyu.lan@intel.com> Reported-by: Jinhyuk Choi <jinchoi@broadcom.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog, minor cleanups] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-11-28 14:47:31 +01:00

1 2 3 4 5 ...

417 Commits