linux_old1

Commit Graph

Author	SHA1	Message	Date
Arjan van de Ven	6b8fcd9029	ondemand: Solve a big performance issue by counting IOWAIT time as busy The ondemand cpufreq governor uses CPU busy time (e.g. not-idle time) as a measure for scaling the CPU frequency up or down. If the CPU is busy, the CPU frequency scales up, if it's idle, the CPU frequency scales down. Effectively, it uses the CPU busy time as proxy variable for the more nebulous "how critical is performance right now" question. This algorithm falls flat on its face in the light of workloads where you're alternatingly disk and CPU bound, such as the ever popular "git grep", but also things like startup of programs and maildir using email clients... much to the chagarin of Andrew Morton. This patch changes the ondemand algorithm to count iowait time as busy, not idle, time. As shown in the breakdown cases above, iowait is performance critical often, and by counting iowait, the proxy variable becomes a more accurate representation of the "how critical is performance" question. The problem and fix are both verified with the "perf timechar" tool. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Jones <davej@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100509082606.3d9f00d0@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-05-09 19:35:27 +02:00
Borislav Petkov	6dad2a2964	cpufreq: Unify sysfs attribute definition macros Multiple modules used to define those which are with identical functionality and were needlessly replicated among the different cpufreq drivers. Push them into the header and remove duplication. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1270065406-1814-7-git-send-email-bp@amd64.org> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-04-09 14:07:56 -07:00
Nagananda.Chumbalkar@hp.com	1dbf58881f	[CPUFREQ] Fix ondemand to not request targets outside policy limits Dominik said: target_freq cannot be below policy->min or above policy->max. If it were, the whole cpufreq subsystem is broken. But (answer): I think the "ondemand" governor can ask for a target frequency that is below policy->min. ... A patch such as below may be needed to sanitize the target frequency requested by "ondemand". The "conservative" governor already has this check: Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com> # diff -bur x/drivers/cpufreq/cpufreq_ondemand.c.orig y/drivers/cpufreq/cpufreq_ondemand.c	2010-01-13 10:55:16 -05:00
Pallipadi, Venkatesh	54c9a35d9f	[CPUFREQ] Resolve time unit thinko in ondemand/conservative govs ondemand and conservative governors are messing up time units in the code path where NO_HZ is not enabled and ignore_nice is set. The walltime idletime stored is in jiffies and nice time calculation is happening in microseconds. The problem was reported and diagnosed by Alexander here. http://marc.info/?l=linux-kernel&m=125752550404513&w=2 The patch below fixes this thinko. Reported-by: Alexander Miller <Miller@fmi.uni-stuttgart.de> Tested-by: Alexander Miller <Miller@fmi.uni-stuttgart.de> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2009-11-17 23:15:04 -05:00
Linus Torvalds	714af06938	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Fix NULL ptr regression in powernow-k8 [CPUFREQ] Create a blacklist for processors that should not load the acpi-cpufreq module. [CPUFREQ] Powernow-k8: Enable more than 2 low P-states [CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call (second call site) [CPUFREQ] ondemand - Use global sysfs dir for tuning settings [CPUFREQ] Introduce global, not per core: /sys/devices/system/cpu/cpufreq [CPUFREQ] Bail out of cpufreq_add_dev if the link for a managed CPU got created [CPUFREQ] Factor out policy setting from cpufreq_add_dev [CPUFREQ] Factor out interface creation from cpufreq_add_dev [CPUFREQ] Factor out symlink creation from cpufreq_add_dev [CPUFREQ] cleanup up -ENOMEM handling in cpufreq_add_dev [CPUFREQ] Reduce scope of cpu_sys_dev in cpufreq_add_dev [CPUFREQ] update Doc for cpuinfo_cur_freq and scaling_cur_freq	2009-09-18 09:16:57 -07:00
Thomas Renninger	0e625ac153	[CPUFREQ] ondemand - Use global sysfs dir for tuning settings Ondemand has only global variables for userspace tunings via sysfs. But they were exposed per CPU which wrongly implies to the user that his settings are applied per cpu. Also locking sysfs against concurrent access won't be necessary anymore after deprecation time. This means the ondemand config dir is moved: /sys/devices/system/cpu/cpu*/cpufreq/ondemand -> /sys/devices/system/cpu/cpufreq/ondemand The old files will still exist, but reading or writing to them will result in one (printk_once) deprecation msg to syslog per file. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>	2009-09-01 12:45:18 -04:00
Tejun Heo	384be2b18a	Merge branch 'percpu-for-linus' into percpu-for-next Conflicts: arch/sparc/kernel/smp_64.c arch/x86/kernel/cpu/perf_counter.c arch/x86/kernel/setup_percpu.c drivers/cpufreq/cpufreq_ondemand.c mm/percpu.c Conflicts in core and arch percpu codes are mostly from commit ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all the first chunk allocators into mm/percpu.c, the changes are moved from arch code to mm/percpu.c. Signed-off-by: Tejun Heo <tj@kernel.org>	2009-08-14 14:45:31 +09:00
venkatesh.pallipadi@intel.com	5a75c82828	[CPUFREQ] Cleanup locking in ondemand governor Redesign the locking inside ondemand driver. Make dbs_mutex handle all the global state changes inside the driver and invent a new percpu mutex to serialize percpu timer and frequency limit change. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2009-07-06 21:38:28 -04:00
venkatesh.pallipadi@intel.com	7d26e2d5e2	[CPUFREQ] Eliminate the recent lockdep warnings in cpufreq Commit `b14893a62c` although it was very much needed to properly cleanup ondemand timer, opened-up a can of worms related to locking dependencies in cpufreq. Patch here defines the need for dbs_mutex and cleans up its usage in ondemand governor. This also resolves the lockdep warnings reported here http://lkml.indiana.edu/hypermail/linux/kernel/0906.1/01925.html http://lkml.indiana.edu/hypermail/linux/kernel/0907.0/00820.html and few others.. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2009-07-06 21:38:27 -04:00
Tejun Heo	245b2e70ea	percpu: clean up percpu variable definitions Percpu variable definition is about to be updated such that all percpu symbols including the static ones must be unique. Update percpu variable definitions accordingly. * as,cfq: rename ioc_count uniquely * cpufreq: rename cpu_dbs_info uniquely * xen: move nesting_count out of xen_evtchn_do_upcall() and rename it * mm: move ratelimits out of balance_dirty_pages_ratelimited_nr() and rename it * ipv4,6: rename cookie_scratch uniquely * x86 perf_counter: rename prev_left to pmc_prev_left, irq_entry to pmc_irq_entry and nmi_entry to pmc_nmi_entry * perf_counter: rename disable_count to perf_disable_count * ftrace: rename test_event_disable to ftrace_test_event_disable * kmemleak: rename test_pointer to kmemleak_test_pointer * mce: rename next_interval to mce_next_interval [ Impact: percpu usage cleanups, no duplicate static percpu var names ] Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Dave Jones <davej@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: linux-mm <linux-mm@kvack.org> Cc: David S. Miller <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Andi Kleen <andi@firstfloor.org>	2009-06-24 15:13:48 +09:00
Thomas Renninger	4f4d1ad6ee	[CPUFREQ] Only set sampling_rate_max deprecated, sampling_rate_min is useful Update the documentation accordingly. Cleanup and use printk_once. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>	2009-06-15 11:49:41 -04:00
Thomas Renninger	cef9615a85	[CPUFREQ] ondemand: Uncouple minimal sampling rate from HZ in NO_HZ case With this patch you have following minimal sampling rate restrictions: Kernel restrictions: If CONFIG_NO_HZ is set, the limit is 10ms fixed. If CONFIG_NO_HZ is not set or no_hz=off boot parameter is used, the limits depend on the CONFIG_HZ option: HZ=1000: min=20000us (20ms) HZ=250: min=80000us (80ms) HZ=100: min=200000us (200ms) HW restrictions: Do not sample/poll more often than HW latency * 100 exported by the low level cpufreq HW driver The higher value of above restrictions is the minimal sampling rate that can be set (and can be seen via ondemand/sampling_rate_min sysfs file) Default sampling rate still is HW latency * 1000, but this will now end up in lower values on latest (Intel and AMD) hardware as these can switch really fast and sampling rate mostly was limited to the 80ms or 200ms (depending on whether HZ=250 or HZ=1000 is used). Signed-off-by: Thomas Renninger <trenn@suse.de> Cc: Pallipadi Venkatesh <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2009-06-15 11:49:41 -04:00
Mathieu Desnoyers	b14893a62c	[CPUFREQ] fix timer teardown in ondemand governor * Rafael J. Wysocki (rjw@sisk.pl) wrote: > This message has been generated automatically as a part of a report > of regressions introduced between 2.6.28 and 2.6.29. > > The following bug entry is on the current list of known regressions > introduced between 2.6.28 and 2.6.29. Please verify if it still should > be listed and let me know (either way). > > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13186 > Subject : cpufreq timer teardown problem > Submitter : Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> > Date : 2009-04-23 14:00 (24 days old) > References : http://marc.info/?l=linux-kernel&m=124049523515036&w=4 > Handled-By : Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> > Patch : http://patchwork.kernel.org/patch/19754/ > http://patchwork.kernel.org/patch/19753/ > (updated changelog) cpufreq fix timer teardown in ondemand governor The problem is that dbs_timer_exit() uses cancel_delayed_work() when it should use cancel_delayed_work_sync(). cancel_delayed_work() does not wait for the workqueue handler to exit. The ondemand governor does not seem to be affected because the "if (!dbs_info->enable)" check at the beginning of the workqueue handler returns immediately without rescheduling the work. The conservative governor in 2.6.30-rc has the same check as the ondemand governor, which makes things usually run smoothly. However, if the governor is quickly stopped and then started, this could lead to the following race : dbs_enable could be reenabled and multiple do_dbs_timer handlers would run. This is why a synchronized teardown is required. The following patch applies to, at least, 2.6.28.x, 2.6.29.1, 2.6.30-rc2. Depends on patch cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> CC: Andrew Morton <akpm@linux-foundation.org> CC: gregkh@suse.de CC: stable@kernel.org CC: cpufreq@vger.kernel.org CC: Ingo Molnar <mingo@elte.hu> CC: rjw@sisk.pl CC: Ben Slusky <sluskyb@paranoiacs.org> Signed-off-by: Dave Jones <davej@redhat.com>	2009-05-26 12:04:50 -04:00
Thomas Renninger	112124ab0a	[CPUFREQ] ondemand/conservative: sanitize sampling_rate restrictions Limit sampling rate to transition_latency * 100 or kernel limits. If sampling_rate is tried to be set too low, set the lowest allowed value. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>	2009-02-24 22:47:31 -05:00
Thomas Renninger	9411b4ef7f	[CPUFREQ] ondemand/conservative: deprecate sampling_rate{min,max} The same info can be obtained via the transition_latency sysfs file Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>	2009-02-24 22:47:31 -05:00
Dave Jones	2b03f891ad	[CPUFREQ] checkpatch cleanups for ondemand governor. Signed-off-by: Dave Jones <davej@redhat.com>	2009-02-24 22:47:30 -05:00
Venkatesh Pallipadi	1ca3abdb6a	[CPUFREQ] Make ignore_nice_load setting of ondemand work as expected. ondemand micro-accounting of idle time changes broke ignore_nice_load sysfs setting due to a thinko in the code. The bug entry: http://bugzilla.kernel.org/show_bug.cgi?id=12310 Reported-by: Jim Bray <jimsantelmo@gmail.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2009-02-05 12:25:26 -05:00
Rusty Russell	835481d9bc	cpumask: convert struct cpufreq_policy to cpumask_var_t Impact: use new cpumask API to reduce memory usage This is part of an effort to reduce structure sizes for machines configured with large NR_CPUS. cpumask_t gets replaced by cpumask_var_t, which is either struct cpumask[1] (small NR_CPUS) or struct cpumask * (large NR_CPUS). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Dave Jones <davej@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-06 09:05:31 +01:00
Andrea Righi	4f6e6b9f97	[CPUFREQ] Fix BUG: using smp_processor_id() in preemptible code Use get_cpu()/put_cpu() in cpufreq_ondemand init routine, instead of smp_processor_id() to avoid the following BUG: [ 35.313118] BUG: using smp_processor_id() in preemptible [00000000] code=: modprobe/4952 [ 35.313132] caller is cpufreq_gov_dbs_init+0xa/0x8f [cpufreq_ondemand] [ 35.313140] Pid: 4952, comm: modprobe Not tainted 2.6.27-rc5-mm1 #23 [ 35.313145] Call Trace: [ 35.313158] [<ffffffff80361ff7>] debug_smp_processor_id+0xd7/0xe0 [ 35.313167] [<ffffffffa010800a>] cpufreq_gov_dbs_init+0xa/0x8f [cpufreq_ondemand] [ 35.313176] [<ffffffff8020903b>] _stext+0x3b/0x160 [ 35.313185] [<ffffffff804768c5>] __mutex_unlock_slowpath+0xe5/0x190 [ 35.313195] [<ffffffff8026236a>] trace_hardirqs_on_caller+0xca/0x140 [ 35.313205] [<ffffffff8026ef4c>] sys_init_module+0xdc/0x210 [ 35.313212] [<ffffffff8020b7cb>] system_call_fastpath+0x16/0x1b Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Dave Jones <davej@redhat.com>	2008-10-09 13:52:44 -04:00
Sven Wegener	c4d14bc0bb	[CPUFREQ] Don't export governors for default governor We don't need to export the governors for use as the default governor, because the default governor will be built-in anyway and we can access the symbol directly. This also fixes the following sparse warnings: drivers/cpufreq/cpufreq_conservative.c:578:25: warning: symbol 'cpufreq_gov_conservative' was not declared. Should it be static? drivers/cpufreq/cpufreq_ondemand.c:582:25: warning: symbol 'cpufreq_gov_ondemand' was not declared. Should it be static? drivers/cpufreq/cpufreq_performance.c:39:25: warning: symbol 'cpufreq_gov_performance' was not declared. Should it be static? drivers/cpufreq/cpufreq_powersave.c:38:25: warning: symbol 'cpufreq_gov_powersave' was not declared. Should it be static? drivers/cpufreq/cpufreq_userspace.c:190:25: warning: symbol 'cpufreq_gov_userspace' was not declared. Should it be static? Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Signed-off-by: Dave Jones <davej@redhat.com>	2008-10-09 13:52:44 -04:00
venkatesh.pallipadi@intel.com	8080091310	[CPUFREQ][6/6] cpufreq: Add idle microaccounting in ondemand governor Use get_cpu_idle_time_us() to get micro-accounted idle information. This enables ondemand to get more accurate idle and busy timings than the jiffy based calculation. As a result, we can decrease the ondemand safety gaurd band from 80-10 to 95-3. Results in more aggressive power savings. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2008-10-09 13:52:44 -04:00
venkatesh.pallipadi@intel.com	e9d95bf7eb	[CPUFREQ][4/6] cpufreq_ondemand: Parameterize down differential Use a parameter for down differential, instead of hardcoded 10%. Follow-on patch changes the down-differential dynamically, based on whether we are using idle micro-accounting or not. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2008-10-09 13:52:44 -04:00
venkatesh.pallipadi@intel.com	3430502d35	[CPUFREQ][3/6] cpufreq: get_cpu_idle_time() changes in ondemand for idle-microaccounting Preparatory changes for doing idle micro-accounting in ondemand governor. get_cpu_idle_time() gets extra parameter and returns idle time and also the wall time that corresponds to the idle time measurement. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2008-10-09 13:52:44 -04:00
venkatesh.pallipadi@intel.com	c43aa3bd99	[CPUFREQ][2/6] cpufreq: Change load calculation in ondemand for software coordination Change the load calculation algorithm in ondemand to work well with software coordination of frequency across the dependent cpus. Multiply individual CPU utilization with the average freq of that logical CPU during the measurement interval (using getavg call). And find the max CPU utilization number in terms of CPU freq. That number is then used to get to the target freq for next sampling interval. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2008-10-09 13:52:43 -04:00
venkatesh.pallipadi@intel.com	bf0b90e357	[CPUFREQ][1/6] cpufreq: Add cpu number parameter to __cpufreq_driver_getavg() Add a cpu parameter to __cpufreq_driver_getavg(). This is needed for software cpufreq coordination where policy->cpu may not be same as the CPU on which we want to getavg frequency. A follow-on patch will use this parameter to getavg freq from all cpus in policy->cpus. Change since last patch. Fix the offline/online and suspend/resume oops reported by Youquan Song <youquan.song@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2008-10-09 13:52:43 -04:00
Akinobu Mita	888a794cac	[CPUFREQ] add error handling for cpufreq_register_governor() error Add error handling for cpufreq_register_governor() error Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: cpufreq@lists.linux.org.uk Signed-off-by: Dave Jones <davej@redhat.com>	2008-10-09 13:52:43 -04:00
Mike Travis	068b12772a	cpufreq: use performance variant for_each_cpu_mask_nr Change references from for_each_cpu_mask to for_each_cpu_mask_nr where appropriate Reviewed-by: Paul Jackson <pj@sgi.com> Reviewed-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-23 18:35:12 +02:00
Johannes Weiner	6915719b36	cpufreq: Initialise default governor before use When the cpufreq driver starts up at boot time, it calls into the default governor which might not be initialised yet. This hurts when the governor's worker function relies on memory that is not yet set up by its init function. This migrates all governors from module_init() to fs_initcall() when being the default, as was already done in cpufreq_performance when it was the only possible choice. The performance governor is always initialized early because it might be used as fallback even when not being the default. Fixes at least one actual oops where ondemand is the default governor and cpufreq_governor_dbs() uses the uninitialised kondemand_wq work-queue during boot-time. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-17 15:38:58 -08:00
Thomas Renninger	1c2562459f	[CPUFREQ] allow ondemand and conservative cpufreq governors to be used as default Depending on the transition latency of the HW for cpufreq switches, the ondemand or conservative governor cannot be used with certain cpufreq drivers. Still the ondemand should be the default governor on a wide range of systems. This patch allows this and lets the governor fallback to the performance governor at cpufreq driver load time, if the driver does not support fast enough frequency switching. Main benefit is that on e.g. installation or other systems without userspace support a working dynamic cpufreq support can be achieved on most systems by simply loading the cpufreq driver. This is especially essential for recent x86(_64) laptop hardware which may rely on working dynamic cpufreq OS support. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Bryan Wu <bryan.wu@analog.com> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Jones <davej@redhat.com>	2007-10-04 18:40:57 -04:00
Venki Pallipadi	ea48761519	[CPUFREQ] ondemand: fix tickless accounting and software coordination bug With tickless kernel and software coordination os P-states, ondemand can look at wrong idle statistics. This can happen when ondemand sampling is happening on CPU 0 and due to software coordination sampling also looks at utilization of CPU 1. If CPU 1 is in tickless state at that moment, its idle statistics will not be uptodate and CPU 0 thinks CPU 1 is idle for less amount of time than it actually is. This can be resolved by looking at all the busy times of CPUs, which is accurate, even with tickless, and use that to determine idle time in a round about way (total time - busy time). Thanks to Arjan for originally reporting the ondemand bug on Lenovo T61. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2007-06-21 12:57:53 -04:00
Venki Pallipadi	0af99b13c9	[CPUFREQ] ondemand: add a check to avoid negative load calculation Due to rounding and inexact jiffy accounting, idle_ticks can sometimes be higher than total_ticks. Make sure those cases are handled as zero load case. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2007-06-21 12:57:53 -04:00
Venki Pallipadi	28287033e1	Add a new deferrable delayed work init Add a new deferrable delayed work init. This can be used to schedule work that are 'unimportant' when CPU is idle and can be called later, when CPU eventually comes out of idle. Use this init in cpufreq ondemand governor. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:05 -07:00
Oleg Nesterov	48ac3271e5	[CPUFREQ] cpufreq_ondemand.c: don't use _WORK_NAR Looks like dbs_timer() is very careful wrt per_cpu(cpu_dbs_info), and it doesn't need the help of WORK_STRUCT_NOAUTOREL. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-By: David Howells <dhowells@redhat.com> Signed-off-by: Dave Jones <davej@redhat.com>	2007-02-20 14:23:43 -05:00
Dave Jones	c18a1483f4	[CPUFREQ] Whitespace fixup Signed-off-by: Dave Jones <davej@redhat.com>	2007-02-10 20:03:51 -05:00
Venkatesh Pallipadi	56463b78cd	[CPUFREQ] ondemand governor use new cpufreq rwsem locking in work callback Eliminate flush_workqueue in cpufreq_governor(STOP) callpath. Using flush there has a deadlock potential as in http://uwsg.iu.edu/hypermail/linux/kernel/0611.3/1223.html Also, cleanup the locking issues with do_dbs_timer delayed_work callback. As it changes the CPU frequency using __cpufreq_target, it needs to have policy_rwsem in write mode, which also protects it from hot plug. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Jones <davej@redhat.com>	2007-02-10 20:01:48 -05:00
Venkatesh Pallipadi	529af7a14f	[CPUFREQ] ondemand governor restructure the work callback Restructure the delayed_work callback in ondemand. This eliminates the need for smp_processor_id in the callback function and also helps in proper locking and avoiding flush_workqueue when stopping the governor (done in subsequent patch). Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Jones <davej@redhat.com>	2007-02-10 20:01:47 -05:00
Dave Jones	c120069779	[CPUFREQ] Remove hotplug cpu crap The hotplug CPU locking in cpufreq is horrendous. No-one seems to care enough to fix it, so just remove it so that the 99.9% of the real world users of this code can use cpufreq without being bothered by warnings. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Jones <davej@redhat.com>	2007-02-10 20:01:47 -05:00
Dave Jones	c4366889dd	Merge ../linus Conflicts: drivers/cpufreq/cpufreq.c	2006-12-12 17:41:41 -05:00
David Howells	c4028958b6	WorkStruct: make allyesconfig Fix up for make allyesconfig. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-11-22 14:57:56 +00:00
Gautham R Shenoy	e08f5f5bb5	[CPUFREQ] Fix coding style issues in cpufreq. Clean up cpufreq subsystem to fix coding style issues and to improve the readability. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Dave Jones <davej@redhat.com>	2006-11-06 19:16:34 -05:00
Jeff Garzik	914f7c31b0	[CPUFREQ] handle sysfs errors Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Dave Jones <davej@redhat.com>	2006-10-21 01:33:12 -04:00
Venkatesh Pallipadi	dfde5d62ed	[CPUFREQ][8/8] acpi-cpufreq: Add support for freq feedback from hardware Enable ondemand governor and acpi-cpufreq to use IA32_APERF and IA32_MPERF MSR to get active frequency feedback for the last sampling interval. This will make ondemand take right frequency decisions when hardware coordination of frequency is going on. Without APERF/MPERF, ondemand can take wrong decision at times due to underlying hardware coordination or TM2. Example: * CPU 0 and CPU 1 are hardware cooridnated. * CPU 1 running at highest frequency. * CPU 0 was running at highest freq. Now ondemand reduces it to some intermediate frequency based on utilization. * Due to underlying hardware coordination with other CPU 1, CPU 0 continues to run at highest frequency (as long as other CPU is at highest). * When ondemand samples CPU 0 again next time, without actual frequency feedback from APERF/MPERF, it will think that previous frequency change was successful and can go to wrong target frequency. This is because it thinks that utilization it has got this sampling interval is when running at intermediate frequency, rather than actual highest frequency. More information about IA32_APERF IA32_MPERF MSR: Refer to IA-32 Intel® Architecture Software Developer's Manual at http://developer.intel.com Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2006-10-15 19:57:11 -04:00
Dave Jones	3906f4edee	[CPUFREQ] Fix sparse warning in ondemand drivers/cpufreq/cpufreq_ondemand.c:323:2: warning: Using plain integer as NULL pointer Signed-off-by: Dave Jones <davej@redhat.com>	2006-09-05 17:15:47 -04:00
Adrian Bunk	b5ecf60fe6	[CPUFREQ] make drivers/cpufreq/cpufreq_ondemand.c:powersave_bias_target() static This patch makes the needlessly global powersave_bias_target() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Dave Jones <davej@redhat.com>	2006-08-14 01:18:54 -04:00
Alexey Starikovskiy	05ca0350e8	[CPUFREQ][2/2] ondemand: updated add powersave_bias tunable ondemand selects the minimum frequency that can retire a workload with negligible idle time -- ideally resulting in the highest performance/power efficiency with negligible performance impact. But on some systems and some workloads, this algorithm is more performance biased than necessary, and de-tuning it a bit to allow some performance impact can save measurable power. This patch adds a "powersave_bias" tunable to ondemand to allow it to reduce its target frequency by a specified percent. By default, the powersave_bias is 0 and has no effect. powersave_bias is in units of 0.1%, so it has an effective range of 1 through 1000, resulting in 0.1% to 100% impact. In practice, users will not be able to detect a difference between 0.1% increments, but 1.0% increments turned out to be too large. Also, the max value of 1000 (100%) would simply peg the system in its deepest power saving P-state, unless the processor really has a hardware P-state at 0Hz:-) For example, If ondemand requests 2.0GHz based on utilization, and powersave_bias=100, this code will knock 10% off the target and seek a target of 1.8GHz instead of 2.0GHz until the next sampling. If 1.8 is an exact match with an hardware frequency we use it, otherwise we average our time between the frequency next higher than 1.8 and next lower than 1.8. Note that a user or administrative program can change powersave_bias at run-time depending on how they expect the system to be used. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi at intel.com> Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy at intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2006-08-11 17:59:57 -04:00
Alexey Starikovskiy	1ce28d6b19	[CPUFREQ][1/2] ondemand: updated tune for hardware coordination Try to make dbs_check_cpu() call on all CPUs at the same jiffy. This will help when multiple cores share P-states via Hardware Coordination. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi at intel.com> Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy at intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2006-08-11 17:59:56 -04:00
Arjan van de Ven	153d7f3fca	[PATCH] Reorganize the cpufreq cpu hotplug locking to not be totally bizare The patch below moves the cpu hotplugging higher up in the cpufreq layering; this is needed to avoid recursive taking of the cpu hotplug lock and to otherwise detangle the mess. The new rules are: 1. you must do lock_cpu_hotplug() around the following functions: __cpufreq_driver_target __cpufreq_governor (for CPUFREQ_GOV_LIMITS operation only) __cpufreq_set_policy 2. governer methods (.governer) must NOT take the lock_cpu_hotplug() lock in any way; they are called with the lock taken already 3. if your governer spawns a thread that does things, like calling __cpufreq_driver_target, your thread must honor rule #1. 4. the policy lock and other cpufreq internal locks nest within the lock_cpu_hotplug() lock. I'm not entirely happy about how the __cpufreq_governor rule ended up (conditional locking rule depending on the argument) but basically all callers pass this as a constant so it's not too horrible. The patch also removes the cpufreq_governor() function since during the locking audit it turned out to be entirely unused (so no need to fix it) The patch works on my testbox, but it could use more testing (otoh... it can't be much worse than the current code) Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-26 07:21:40 -07:00
Linus Torvalds	2cd7cbdf4b	[cpufreq] ondemand: make shutdown sequence more robust Shutting down the ondemand policy was fraught with potential problems, causing issues for SMP suspend (which wants to hot- unplug) all but the last CPU. This should fix at least the worst problems (divide-by-zero and infinite wait for the workqueue to shut down). Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-23 12:05:00 -07:00
Venkatesh Pallipadi	ffac80e925	[CPUFREQ] Misc cleanups in ondemand. Misc cleanups in ondemand. Should have zero functional impact. Also adding Alexey as author. Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2006-06-30 01:36:40 -04:00
Venkatesh Pallipadi	2f8a835c70	[CPUFREQ] Make ondemand sampling per CPU and remove the mutex usage in sampling path. Make ondemand sampling per CPU and remove the mutex usage in sampling path. Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2006-06-30 01:33:31 -04:00
Venkatesh Pallipadi	ccb2fe209d	[CPUFREQ] Remove slowdown from ondemand sampling path. Remove slowdown from ondemand sampling path. This reduces the code path length in dbs_check_cpu() by half. slowdown was not used by ondemand by default. If there are any user level tools that were using this tunable, they may report error now. Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2006-06-30 01:29:47 -04:00
Andrew Morton	138a0128c0	[PATCH] cpufreq build fix drivers/cpufreq/cpufreq_ondemand.c: In function 'do_dbs_timer': drivers/cpufreq/cpufreq_ondemand.c:374: warning: implicit declaration of function 'lock_cpu_hotplug' drivers/cpufreq/cpufreq_ondemand.c:381: warning: implicit declaration of function 'unlock_cpu_hotplug' drivers/cpufreq/cpufreq_conservative.c: In function 'do_dbs_timer': drivers/cpufreq/cpufreq_conservative.c:425: warning: implicit declaration of function 'lock_cpu_hotplug' drivers/cpufreq/cpufreq_conservative.c:432: warning: implicit declaration of function 'unlock_cpu_hotplug' Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-23 08:47:27 -07:00
Venkatesh Pallipadi	4ec223d02f	[CPUFREQ] Fix ondemand vs suspend deadlock Rootcaused the bug to a deadlock in cpufreq and ondemand. Due to non-existent ordering between cpu_hotplug lock and dbs_mutex. Basically a race condition between cpu_down() and do_dbs_timer(). cpu_down() flow: * cpu_down() call for CPU 1 * Takes hot plug lock * Calls pre down notifier * cpufreq notifier handler calls cpufreq_driver_target() which takes cpu_hotplug lock again. OK as cpu_hotplug lock is recursive in same process context * CPU 1 goes down * Calls post down notifier * cpufreq notifier handler calls ondemand event stop which takes dbs_mutex So, cpu_hotplug lock is taken before dbs_mutex in this flow. do_dbs_timer is triggerred by a periodic timer event. It first takes dbs_mutex and then takes cpu_hotplug lock in cpufreq_driver_target(). Note the reverse order here compared to above. So, if this timer event happens at right moment during cpu_down, system will deadlok. Attached patch fixes the issue for both ondemand and conservative. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2006-06-21 18:30:26 -04:00
Andi Kleen	6810b548b2	[PATCH] x86_64: Move ondemand timer into own work queue Taking the cpu hotplug semaphore in a normal events workqueue is unsafe because other tasks can wait for any workqueues with it hold. This results in a deadlock. Move the DBS timer into its own work queue which is not affected by other work queue flushes to avoid this. Has been acked by Venkatesh. Cc: venkatesh.pallipadi@intel.com Cc: cpufreq@lists.linux.org.uk Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-05-08 09:34:56 -07:00
Dominik Brodowski	7c9d8c0e84	[PATCH] cpufreq_ondemand: add range check Assert that cpufreq_target is, at least, called with the minimum frequency allowed by this policy, not something lower. It triggered problems on ARM. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>	2006-03-26 11:11:03 +02:00
Eric Piel	9cbad61b41	[PATCH] cpufreq_ondemand: keep ignore_nice_load value when it is reselected Keep the value of ignore_nice_load of the ondemand governor even after the governor has been deselected and selected back. This is the behavior of the other exported values of the ondemand governor and it's much more user-friendly. Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>	2006-03-26 10:46:18 +02:00
Eric Piel	ff8c288d7d	[PATCH] cpufreq_ondemand: Warn if it cannot run due to too long transition latency Display a warning if the ondemand governor can not be selected due to a transition latency of the cpufreq driver which is too long. Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>	2006-03-26 10:43:06 +02:00
Dave Jones	32ee8c3e47	[CPUFREQ] Lots of whitespace & CodingStyle cleanup. Signed-off-by: Dave Jones <davej@redhat.com>	2006-02-28 00:43:23 -05:00
akpm@osdl.org	3fc54d37ab	[CPUFREQ] Convert drivers/cpufreq semaphores to mutexes. Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Dave Jones <davej@redhat.com>	2006-01-18 13:53:45 -08:00
Alexander Clouter	001893cda2	[PATCH] cpufreq_conservative/ondemand: invert meaning of 'ignore nice' The use of the 'ignore_nice' sysfs file is confusing to anyone using it. This removes the sysfs file 'ignore_nice' and in its place creates a 'ignore_nice_load' entry that defaults to '0'; meaning nice'd processes _are_ counted towards the 'business' calculation. WARNING: this obvious breaks any userland tools that expected ignore_nice' to exist, to draw attention to this fact it was concluded on the mailing list that the entry should be removed altogether so the userland app breaks and so the author can build simple to detect workaround. Having said that it seems currently very few tools even make use of this functionality; all I could find was a Gentoo Wiki entry. Signed-off-by: Alexander Clouter <alex-kernel@digriz.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Dave Jones <davej@redhat.com>	2005-12-01 01:23:23 -08:00
Dave Jones	df8b59be09	[CPUFREQ] Avoid the ondemand cpufreq governor to use a too high frequency for stats. The problem is in the ondemand governor, there is a periodic measurement of the CPU usage. This CPU usage is updated by the scheduler after every tick (basically, by adding 1 either to "idle" or to "user" or to "system"). So if the frequency of the governor is too high, the stat will be meaningless (as mostly no number have changed). So this patch checks that the measurements are separated by at least 10 ticks. It means that by default, stats will have about 5% error (20 ticks). Of course those numbers can be argued but, IMHO, they look sane. The patch also includes a small clean-up to check more explictly the result of the conversion from ns to µs being null. Let's note that (on x86) this has never been really needed before 2.6.13 because HZ was always 1000. Now that HZ can be 100, some CPU might be affected by this problem. For instance when HZ=100, the centrino ,which has a 10µs transition latency, would lead to the governor allowing to read stats every tick (10ms)! Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net> Signed-off-by: Dave Jones <davej@redhat.com>	2005-09-20 12:39:35 -07:00
Dave Jones	e131832ca7	[CPUFREQ] ondemand governor default sampling downfactor as 1 [PATCH] [5/5] ondemand governor default sampling downfactor as 1 Make default sampling downfactor 1. This works better with earlier auto downscaling change in ondemand governor. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:50 -07:00
Dave Jones	c29f140309	[CPUFREQ] ondemand governor automatic downscaling [PATCH] [4/5] ondemand governor automatic downscaling Here is a change of policy for the ondemand governor. The modification concerns the frequency downscaling. Instead of decreasing to a lower frequency when the CPU usage is under 20%, this new policy automatically scales to the optimal frequency. The optimal frequency being the lowest frequency which provides enough power to not trigger the upscaling policy. Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:50 -07:00
Dave Jones	9c7d269b9b	[CPUFREQ] ondemand,conservative governor idle_tick clean-up [PATCH] [3/5] ondemand,conservative governor idle_tick clean-up Ondemand and conservative governor clean-up, it factorises the idle ticks measurement. Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:49 -07:00
Dave Jones	790d76fa97	[CPUFREQ] ondemand,conservative governor store the idle ticks for all cpus [PATCH] [2/5] ondemand,conservative governor store the idle ticks for all cpus Ondemand, conservative governor did not store prev_cpu_idle_up into prev_cpu_idle_down for other CPUs than the current CPU. Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:49 -07:00
Dave Jones	dac1c1a562	[CPUFREQ] ondemand,conservative minor bug-fix and cleanup [PATCH] [1/5] ondemand,conservative minor bug-fix and cleanup Attached patch fixes some minor issues with Alexander's patch and related cleanup in both ondemand and conservative governor. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:49 -07:00
Dave Jones	1206aaac28	[CPUFREQ] Allow ondemand stepping to be changed by user. Adds support so that the cpufreq change stepping is no longer fixed at 5% and can be changed dynamically by the user Signed-off-by: Alexander Clouter <alex-kernel@digriz.org.uk> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:48 -07:00
Dave Jones	c11420a616	[CPUFREQ] Prevents un-necessary cpufreq changes if we are already at min/max Signed-off-by: Alexander Clouter <alex-kernel@digriz.org.uk> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:48 -07:00
Dave Jones	3d5ee9e55d	[CPUFREQ] Add support to cpufreq_ondemand to ignore 'nice' cpu time Signed-off-by: Alexander Clouter <alex-kernel@digriz.org.uk> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:47 -07:00
Dave Jones	7f335d4ef2	[CPUFREQ] make cpufreq_gov_dbs static This patch makes a needlessly global and EXPORT_SYMBOL'ed struct static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:46 -07:00
Dave Jones	6fe711658f	[CPUFREQ] ondemand: trivial clean-ups Trivial ondemand governor clean-ups: - change from sampling_rate_in_HZ() to the official function usecs_to_jiffies(). - use for_each_online_cpu() to instead of using "if (cpu_online(i))" Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:44 -07:00
Linus Torvalds	1da177e4c3	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!	2005-04-16 15:20:36 -07:00

1 2 3

122 Commits