linux_old1

Commit Graph

Author	SHA1	Message	Date
Rafael J. Wysocki	6d45b719cb	intel_pstate: Fix intel_pstate_get() After commit `8fa520af50` "intel_pstate: Remove freq calculation from intel_pstate_calc_busy()" intel_pstate_get() calls get_avg_frequency() to compute the average frequency, which is problematic for two reasons. First, intel_pstate_get() may be invoked before the driver reads the CPU feedback registers for the first time and if that happens, get_avg_frequency() will attempt to divide by zero. Second, the get_avg_frequency() call in intel_pstate_get() is racy with respect to intel_pstate_sample() and it may end up returning completely meaningless values for this reason. Moreover, after commit `7349ec0470` "intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance()" sample.core_pct_busy is never computed on Atom, but it is used in intel_pstate_adjust_busy_pstate() in that case too. To address those problems notice that if sample.core_pct_busy was used in the average frequency computation carried out by get_avg_frequency(), both the divide by zero problem and the race with respect to intel_pstate_sample() would be avoided. Accordingly, move the invocation of intel_pstate_calc_busy() from get_target_pstate_use_performance() to intel_pstate_update_util(), which also will take care of the uninitialized sample.core_pct_busy on Atom, and modify get_avg_frequency() to use sample.core_pct_busy as per the above. Reported-by: kernel test robot <ying.huang@linux.intel.com> Link: http://marc.info/?l=linux-kernel&m=146226437623173&w=4 Fixes: `8fa520af50` "intel_pstate: Remove freq calculation from intel_pstate_calc_busy()" Fixes: `7349ec0470` "intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance()" Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-05-04 14:09:16 +02:00
Rafael J. Wysocki	ba41e1bc28	cpufreq: intel_pstate: Fix HWP on boot CPU after system resume Commit `41cfd64cf4` "Update frequencies of policy->cpus only from ->set_policy()" changed the way the intel_pstate driver's ->set_policy callback updates the HWP (hardware-managed P-states) settings. A side effect of it is that if those settings are modified on the boot CPU during system suspend and wakeup, they will never be restored during subsequent system resume. To address this problem, allow cpufreq drivers that don't provide ->target or ->target_index callbacks to use ->suspend and ->resume callbacks and add a ->resume callback to intel_pstate to restore the HWP settings on the CPUs that belong to the given policy. Fixes: `41cfd64cf4` "Update frequencies of policy->cpus only from ->set_policy()" Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-05-02 13:48:15 +02:00
Aneesh Kumar K.V	d2adba3fd1	powerpc/mm: Abstraction for switch_mmu_context() How we switch MMU context differs between hash and radix. For hash we need to switch the SLB details and for radix we need to switch the PID SPR. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-01 18:33:04 +10:00
Rafael J. Wysocki	81be193b7e	Merge branch 'pm-cpufreq-fixes' * pm-cpufreq-fixes: cpufreq: intel_pstate: Fix processing for turbo activation ratio Revert "cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC"	2016-04-29 14:22:25 +02:00
Sudeep Holla	2482bc31ca	cpufreq: st: enable selective initialization based on the platform The sti-cpufreq does unconditional registration of the cpufreq-dt driver which causes issue on an multi-platform build. For example, on Vexpress TC2 platform, we get the following error on boot: cpu cpu0: OPP-v2 not supported cpu cpu0: Not doing voltage scaling cpu: dev_pm_opp_of_cpumask_add_table: couldn't find opp table for cpu:0, -19 cpu cpu0: dev_pm_opp_get_max_volt_latency: Invalid regulator (-6) ... arm_big_little: bL_cpufreq_register: Failed registering platform driver: vexpress-spc, err: -17 The actual driver fails to initialise as cpufreq-dt is probed successfully, which is incorrect. This issue can happen to any platform not using cpufreq-dt in a multi-platform build. This patch adds a check to do selective initialization of the driver. Fixes: `ab0ea257fc` (cpufreq: st: Provide runtime initialised driver for ST's platforms) Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Lee Jones <lee.jones@linaro.org> Cc: 4.5+ <stable@vger.kernel.org> # 4.5+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-28 15:25:56 +02:00
Viresh Kumar	9f123def55	cpufreq: mvebu: Move cpufreq code into drivers/cpufreq/ Move cpufreq bits for mvebu into drivers/cpufreq/ directory, that's where they really belong to. Compiled tested only. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-28 15:22:43 +02:00
Viresh Kumar	eb96924acd	cpufreq: dt: Kill platform-data There are no more users of platform-data for cpufreq-dt driver, get rid of it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-28 15:22:43 +02:00
Viresh Kumar	1530b9963e	cpufreq: dt: Identify cpu-sharing for platforms without operating-points-v2 Existing platforms, which do not support operating-points-v2, can explicitly tell the opp core that some of the CPUs share opp tables, with help of dev_pm_opp_set_sharing_cpus(). For such platforms, explicitly ask the opp core to provide list of CPUs sharing the opp table with current cpu device, before falling back to platform data. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-28 15:22:42 +02:00
Rafael J. Wysocki	b4f4b4b371	cpufreq: governor: Change confusing struct field and variable names The name of the prev_cpu_wall field in struct cpu_dbs_info is confusing, because it doesn't represent wall time, but the previous update time as returned by get_cpu_idle_time() (that may be the current value of jiffies_64 in some cases, for example). Moreover, the names of some related variables in dbs_update() take that confusion further. Rename all of those things to make their names reflect the purpose more accurately. While at it, drop unnecessary parens from one of the updated expressions. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Chen Yu <yu.c.chen@intel.com>	2016-04-28 15:10:08 +02:00
Srinivas Pandruvada	2b3ec76505	cpufreq: intel_pstate: Enable PPC enforcement for servers For platforms which are controlled via remove node manager, enable _PPC by default. These platforms are mostly categorized as enterprise server or performance servers. These platforms needs to go through some certifications tests, which tests control via _PPC. The relative risk of enabling by default is low as this is is less likely that these systems have broken _PSS table. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-28 01:01:39 +02:00
Srinivas Pandruvada	3be9200d51	cpufreq: intel_pstate: Adjust policy->max When policy->max is changed via _PPC or sysfs and is more than the max non turbo frequency, it does not really change resulting performance in some processors. When policy->max results in a P-State ratio more than the turbo activation ratio, then processor can choose any P-State up to max turbo. So the user or _PPC setting has no value, but this can cause undesirable side effects like: - Showing reduced max percentage in Intel P-State sysfs - It can cause reduced max performance under certain boundary conditions: The requested max scaling frequency either via _PPC or via cpufreq-sysfs, will be converted into a fixed floating point max percent scale. In majority of the cases this will result in correct max. But not 100% of the time. If the _PPC is requested at a point where the calculation lead to a lower max, this can result in a lower P-State then expected and it will impact performance. Example of this condition using a Broadwell laptop with config TDP. ACPI _PSS table from a Broadwell laptop `2301000` 2300000 2200000 2000000 1900000 1800000 1700000 1500000 1400000 1300000 1100000 1000000 900000 800000 600000 500000 The actual results by disabling config TDP so that we can get what is requested on or below 2300000Khz. scaling_max_freq Max Requested P-State Resultant scaling max ---------------------------------------- ---------------------- 2400000 18 `2900000` (max turbo) 2300000 17 2300000 (max physical non turbo) 2200000 15 2100000 2100000 15 2100000 2000000 13 1900000 1900000 13 1900000 1800000 12 1800000 1700000 11 1700000 1600000 10 1600000 1500000 f 1500000 1400000 e 1400000 1300000 d 1300000 1200000 c 1200000 1100000 a 1000000 1000000 a 1000000 900000 9 900000 800000 8 800000 700000 7 700000 600000 6 600000 500000 5 500000 ------------------------------------------------------------------ Now set the config TDP level 1 ratio as 0x0b (equivalent to 1100000KHz) in BIOS (not every system will let you adjust this). The turbo activation ratio will be set to one less than that, which will be 0x0a (So any request above 1000000KHz should result in turbo region assuming no thermal limits). Here _PPC will request max to 1100000KHz (which basically should still result in turbo as this is more than the turbo activation ratio up to max allowable turbo frequency), but actual calculation resulted in a max ceiling P-State which is 0x0a. So under any load condition, this driver will not request turbo P-States. This will be a huge performance hit. When config TDP feature is ON, if the _PPC points to a frequency above turbo activation ratio, the performance can still reach max turbo. In this case we don't need to treat this as the reduced frequency in set_policy callback. In this change when config TDP is active (by checking if the physical max non turbo ratio is more than the current max non turbo ratio), any request above current max non turbo is treated as full performance. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw : Minor cleanups ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-28 01:01:39 +02:00
Srinivas Pandruvada	9522a2ff9c	cpufreq: intel_pstate: Enforce _PPC limits Use ACPI _PPC notification to limit max P state driver will request. ACPI _PPC change notification is sent by BIOS to limit max P state in several cases: - Reduce impact of platform thermal condition - When Config TDP feature is used, a changed _PPC is sent to follow TDP change - Remote node managers in server want to control platform power via baseboard management controller (BMC) This change registers with ACPI processor performance lib so that _PPC changes are notified to cpufreq core, which in turns will result in call to .setpolicy() callback. Also the way _PSS table identifies a turbo frequency is not compatible to max turbo frequency in intel_pstate, so the very first entry in _PSS needs to be adjusted. This feature can be turned on by using kernel parameters: intel_pstate=support_acpi_ppc Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Minor cleanups ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-28 01:01:39 +02:00
Akshay Adiga	eaa2c3aeef	cpufreq: powernv: Ramp-down global pstate slower than local-pstate The frequency transition latency from pmin to pmax is observed to be in few millisecond granurality. And it usually happens to take a performance penalty during sudden frequency rampup requests. This patch set solves this problem by using an entity called "global pstates". The global pstate is a Chip-level entity, so the global entitiy (Voltage) is managed across the cores. The local pstate is a Core-level entity, so the local entity (frequency) is managed across threads. This patch brings down global pstate at a slower rate than the local pstate. Hence by holding global pstates higher than local pstate makes the subsequent rampups faster. A per policy structure is maintained to keep track of the global and local pstate changes. The global pstate is brought down using a parabolic equation. The ramp down time to pmin is set to ~5 seconds. To make sure that the global pstates are dropped at regular interval , a timer is queued for every 2 seconds during ramp-down phase, which eventually brings the pstate down to local pstate. Iozone results show fairly consistent performance boost. YCSB on redis shows improved Max latencies in most cases. Iozone write/rewite test were made with filesizes 200704Kb and 401408Kb with different record sizes . The following table shows IOoperations/sec with and without patch. Iozone Results ( in op/sec) ( mean over 3 iterations ) --------------------------------------------------------------------- file size- with without % recordsize-IOtype patch patch change ---------------------------------------------------------------------- 200704-1-SeqWrite 1616532 1615425 0.06 200704-1-Rewrite 2423195 2303130 5.21 200704-2-SeqWrite 1628577 1602620 1.61 200704-2-Rewrite 2428264 2312154 5.02 200704-4-SeqWrite 1617605 1617182 0.02 200704-4-Rewrite 2430524 2351238 3.37 200704-8-SeqWrite 1629478 1600436 1.81 200704-8-Rewrite `2415308` 2298136 5.09 200704-16-SeqWrite 1619632 1618250 0.08 200704-16-Rewrite 2396650 2352591 1.87 200704-32-SeqWrite 1632544 1598083 2.15 200704-32-Rewrite 2425119 2329743 4.09 200704-64-SeqWrite 1617812 1617235 0.03 200704-64-Rewrite 2402021 2321080 3.48 200704-128-SeqWrite 1631998 1600256 1.98 200704-128-Rewrite 2422389 2304954 5.09 200704-256 SeqWrite 1617065 1616962 0.00 200704-256-Rewrite 2432539 2301980 5.67 200704-512-SeqWrite 1632599 1598656 2.12 200704-512-Rewrite 2429270 2323676 4.54 200704-1024-SeqWrite 1618758 1616156 0.16 200704-1024-Rewrite 2431631 2315889 4.99 401408-1-SeqWrite 1631479 1608132 1.45 401408-1-Rewrite 2501550 2459409 1.71 401408-2-SeqWrite 1617095 1626069 -0.55 401408-2-Rewrite 2507557 2443621 2.61 401408-4-SeqWrite 1629601 1611869 1.10 401408-4-Rewrite 2505909 2462098 1.77 401408-8-SeqWrite 1617110 1626968 -0.60 401408-8-Rewrite 2512244 2456827 2.25 401408-16-SeqWrite 1632609 1609603 1.42 401408-16-Rewrite 2500792 2451405 2.01 401408-32-SeqWrite 1619294 1628167 -0.54 401408-32-Rewrite 2510115 2451292 2.39 401408-64-SeqWrite 1632709 1603746 1.80 401408-64-Rewrite 2506692 2433186 3.02 401408-128-SeqWrite 1619284 1627461 -0.50 401408-128-Rewrite 2518698 2453361 2.66 401408-256-SeqWrite 1634022 1610681 1.44 401408-256-Rewrite 2509987 2446328 2.60 401408-512-SeqWrite 1617524 1628016 -0.64 401408-512-Rewrite 2504409 2442899 2.51 401408-1024-SeqWrite 1629812 1611566 1.13 401408-1024-Rewrite 2507620 2442968 2.64 Tested with YCSB workload (50% update + 50% read) over redis for 1 million records and 1 million operation. Each test was carried out with target operations per second and persistence disabled. Max-latency (in us)( mean over 5 iterations ) --------------------------------------------------------------- op/s Operation with patch without patch %change --------------------------------------------------------------- 15000 Read 61480.6 50261.4 22.32 15000 cleanup 215.2 293.6 -26.70 15000 update 25666.2 25163.8 2.00 25000 Read 32626.2 89525.4 -63.56 25000 cleanup 292.2 263.0 11.10 25000 update 32293.4 90255.0 -64.22 35000 Read 34783.0 33119.0 5.02 35000 cleanup 321.2 395.8 -18.8 35000 update 36047.0 38747.8 -6.97 40000 Read 38562.2 42357.4 -8.96 40000 cleanup 371.8 384.6 -3.33 40000 update 27861.4 41547.8 -32.94 45000 Read 42271.0 88120.6 -52.03 45000 cleanup 263.6 383.0 -31.17 45000 update 29755.8 81359.0 -63.43 (test without target op/s) 47659 Read 83061.4 136440.6 -39.12 47659 cleanup 195.8 193.8 1.03 47659 update 73429.4 124971.8 -41.24 Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-27 23:56:58 +02:00
Shilpasri G Bhat	2920e9ce8f	cpufreq: powernv: Remove flag use-case of policy->driver_data commit `1b0289848d` ("cpufreq: powernv: Add sysfs attributes to show throttle stats") used policy->driver_data as a flag for one-time creation of throttle sysfs files. Instead of this use 'kernfs_find_and_get()' to check if the attribute already exists. This is required as policy->driver_data is used for other purposes in the later patch. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-27 23:56:58 +02:00
Javier Martinez Canillas	6de0dc4b53	cpufreq: e_powersaver: Use IS_ENABLED() instead of checking for built-in or module The IS_ENABLED() macro checks if a Kconfig symbol has been enabled either built-in or as a module, use that macro instead of open coding the same. Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-27 22:42:34 +02:00
Srinivas Pandruvada	1becf03545	cpufreq: intel_pstate: Fix processing for turbo activation ratio When the config TDP level is not nominal (level = 0), the MSR values for reading level 1 and level 2 ratios contain power in low 14 bits and actual ratio bits are at bits [23:16]. The current processing for level 1 and level 2 is wrong as there is no shift done to get actual ratio. Fixes: `6a35fc2d6c` (cpufreq: intel_pstate: get P1 from TAR when available) Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: 4.4+ <stable@vger.kernel.org> # 4.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 23:39:09 +02:00
Rafael J. Wysocki	ba1ca654f3	cpufreq: governor: Fix prev_load initialization in cpufreq_governor_start() The way cpufreq_governor_start() initializes j_cdbs->prev_load is questionable. First off, j_cdbs->prev_cpu_wall used as a denominator in the computation may be zero. The case this happens is when get_cpu_idle_time_us() returns -1 and get_cpu_idle_time_jiffy() used to return that number is called exactly at the jiffies_64 wrap time. It is rather hard to trigger that error, but it is not impossible and it will just crash the kernel then. Second, j_cdbs->prev_load is computed as the average load during the entire time since the system started and it may not reflect the load in the previous sampling period (as it is expected to). That doesn't play well with the way dbs_update() uses that value. Namely, if the update time delta (wall_time) happens do be greater than twice the sampling rate on the first invocation of it, the initial value of j_cdbs->prev_load (which may be completely off) will be returned to the caller as the current load (unless it is equal to zero and unless another CPU sharing the same policy object has a greater load value). For this reason, notice that the prev_load field of struct cpu_dbs_info is only used by dbs_update() and only in that one place, so if cpufreq_governor_start() is modified to always initialize it to 0, it will make dbs_update() always compute the actual load first time it checks the update time delta against the doubled sampling rate (after initialization) and there won't be any side effects of it. Consequently, modify cpufreq_governor_start() as described. Fixes: `18b46abd00` (cpufreq: governor: Be friendly towards latency-sensitive bursty workloads) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-25 16:21:34 +02:00
Viresh Kumar	3920be471c	cpufreq: hisilicon: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:24 +02:00
Viresh Kumar	5e4249c6d9	cpufreq: zynq: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:24 +02:00
Viresh Kumar	117d4f59af	cpufreq: sunxi: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:24 +02:00
Viresh Kumar	a399dc9fc5	cpufreq: shmobile: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:23 +02:00
Finley Xiao	014400c127	cpufreq: rockchip: Use generic platdev driver This patch add rockchip's compatible string to the compat list and remove similar code from platform code for supporting generic platdev driver. Signed-off-by: Finley Xiao <finley.xiao@rock-chips.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:23 +02:00
Viresh Kumar	7694ca6e1d	cpufreq: omap: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:23 +02:00
Viresh Kumar	7ead83f6df	cpufreq: imx: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Note that the complete routine imx27_dt_init() is removed as of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL); has same effect as a NULL .init_machine machine callback pointer. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:23 +02:00
Viresh Kumar	a59511d1da	cpufreq: berlin: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Antoine Tenart <antoine.tenart@free-electrons.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:23 +02:00
Viresh Kumar	e92bb16674	cpufreq: dt: Mark platdev machines array as __initconst The machines array in cpufreq-dt-platdev is used only once at boot time and so should be marked with __initconst, so that kernel can free up memory used for it, if required. Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:22 +02:00
Jia Hongtao	394cb8316b	cpufreq: qoriq: Fix cooling device registration issue during suspend Cooling device is registered by ready callback. It's also invoked while system resuming from sleep (Enabling non-boot cpus). Thus cooling device may be multiple registered. Matchable unregistration is added to exit callback to fix this issue. Signed-off-by: Jia Hongtao <hongtao.jia@nxp.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:07:02 +02:00
Jia Hongtao	495c716f17	cpufreq: qoriq: Remove __exit macro from .exit callback .exit callback (qoriq_cpufreq_cpu_exit()) is also used during suspend. So __exit macro should be removed or the function will be discarded. Signed-off-by: Jia Hongtao <hongtao.jia@nxp.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:07:02 +02:00
Jia Hongtao	27b8fe8daf	cpufreq: qoriq: Don't show cooling device messages if THERMAL_OF undefined When THERMAL_OF is undefined the cooling device messages should not be shown. -ENOSYS is returned from of_cpufreq_cooling_register() when THERMAL_OF is undefined. Signed-off-by: Jia Hongtao <hongtao.jia@nxp.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:00:42 +02:00
Ashwin Chaugule	a29a1e7678	cpufreq: ACPI / CPPC: Add module support for cppc_cpufreq driver Add a function to cleanup at module exit and export appropriate GPL string to enable moduler support for the cppc_cpufreq driver. Reported-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com> Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 15:59:35 +02:00
Philippe Longepe	bdcaa23fb8	cpufreq: intel_pstate: Use average P-State instead of current P-State The result returned by pid_calc() is subtracted from current_pstate (which is the P-State requested during the last period) in order to obtain the target P-State for the current iteration. However, current_pstate may not reflect the real current P-State of the CPU. In particular, that P-State may be higher because of the frequency sharing per module. The theory is: - The load is the percentage of time spent in C0 and is related to the average P-State during the same period. - The last requested P-State can be completely different than the average P-State (because of frequency sharing or throttling). - The P-State shift computed by the pid_calc is based on the load computed at average P-State, so the shift must be relative to this average P-State. Using the average P-State instead of current P-State improves power without significant performance penalty in cases when a task migrates from one core to other core sharing frequency and voltage. Performance and power comparison with this patch on Cherry Trail platform using Android: Benchmark ?Perf ?Power FishTank 10.45% 3.1% SmartBench-Gaming -0.1% -10.4% SmartBench-Productivity -0.8% -10.4% CandyCrush n/a -17.4% AngryBirds n/a -5.9% videoPlayback n/a -13.9% audioPlayback n/a -4.9% IcyRocks-20-50 0.0% -38.4% iozone RR -0.16% -1.3% iozone RW 0.74% -1.3% Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 15:45:11 +02:00
Rafael J. Wysocki	1cbc99dfe5	Merge back cpufreq changes for v4.7.	2016-04-25 15:44:01 +02:00
Rafael J. Wysocki	94862a62df	Revert "cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC" Revert commit `0df35026c6` (cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC) that introduced a regression by causing the ondemand cpufreq governor to misbehave for CONFIG_TICK_CPU_ACCOUNTING unset (the frequency goes up to the max at one point and stays there indefinitely). The revert takes subsequent modifications of the code in question into account. Fixes: `0df35026c6` (cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC) Link: https://bugzilla.kernel.org/show_bug.cgi?id=115261 Reported-and-tested-by: Timo Valtoaho <timo.valtoaho@gmail.com> Cc: 4.5+ <stable@vger.kernel.org> # 4.5+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-25 02:39:20 +02:00
Rafael J. Wysocki	395da1259a	Merge branch 'pm-cpufreq-fixes' * pm-cpufreq-fixes: cpufreq: Abort cpufreq_update_current_freq() for cpufreq_suspended set intel_pstate: Avoid getting stuck in high P-states when idle	2016-04-21 20:57:46 +02:00
Ingo Molnar	6666ea558b	Linux 4.6-rc4 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJXFELfAAoJEHm+PkMAQRiGRYIH+wWsUva7TR9arN1ZrURvI17b KqyQH8Ov9zJBsIaq/rFXOr5KfNgx7BU9BL9h7QkBy693HXTWf+GTZ1czHM4N12C3 0ZdHGrLwTHo2zdisiQaFORZSfhSVTUNGXGHXw13bUMgEqatPgkozXEnsvXXNdt1Z HtlcuJn3pcj+QIY7qDXZgTLTwgn248hi1AgNag+ntFcWiz21IYaMIi7/mCY9QUIi AY+Y3hqFQM7/8cVyThGS5wZPTg1YzdhsLJpoCk0TbS8FvMEnA+ylcTgc15C78bwu AxOwM3OCmH4gMsd7Dd/O+i9lE3K6PFrgzdDisYL3P7eHap+EdiLDvVzPDPPx0xg= =Q7r3 -----END PGP SIGNATURE----- Merge tag 'v4.6-rc4' into x86/asm, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-04-19 10:38:52 +02:00
Rafael J. Wysocki	c9d9c929e6	cpufreq: Abort cpufreq_update_current_freq() for cpufreq_suspended set Since governor operations are generally skipped if cpufreq_suspended is set, cpufreq_start_governor() should do nothing in that case. That function is called in the cpufreq_online() path, and may also be called from cpufreq_offline() in some cases, which are invoked by the nonboot CPUs disabing/enabling code during system suspend to RAM and resume. That happens when all devices have been suspended, so if the cpufreq driver relies on things like I2C to get the current frequency, it may not be ready to do that then. To prevent problems from happening for this reason, make cpufreq_update_current_freq(), which is the only function invoked by cpufreq_start_governor() that doesn't check cpufreq_suspended already, return 0 upfront if cpufreq_suspended is set. Fixes: `3bbf8fe3ae` (cpufreq: Always update current frequency before startig governor) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-18 23:47:42 +02:00
Borislav Petkov	93984fbd4e	x86/cpufeature: Replace cpu_has_apic with boot_cpu_has() usage Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: iommu@lists.linux-foundation.org Cc: linux-pm@vger.kernel.org Cc: oprofile-list@lists.sf.net Link: http://lkml.kernel.org/r/1459801503-15600-8-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-04-13 11:37:41 +02:00
Paul Gortmaker	6a0bcab9c6	drivers/cpufreq: make ppc_cbe_cpufreq_pmi driver explicitly non-modular The Kconfig for this driver is currently: config CPU_FREQ_CBE_PMI bool "CBE frequency scaling using PMI interface" ...meaning that it currently is not being built as a module by anyone. Lets remove the modular and unused code here, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We also delete the MODULE_LICENSE tag etc. since all that information is already contained at the top of the file in the comments. Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Christian Krafft <krafft@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-pm@vger.kernel.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-11 20:30:43 +10:00
Rafael J. Wysocki	ffb810563c	intel_pstate: Avoid getting stuck in high P-states when idle Jörg Otte reports that commit `a4675fbc4a` (cpufreq: intel_pstate: Replace timers with utilization update callbacks) caused the CPUs in his Haswell-based system to stay in the very high frequency region even if the system is completely idle. That turns out to be an existing problem in the intel_pstate driver's P-state selection algorithm for Core processors. Namely, all decisions made by that algorithm are based on the average frequency of the CPU between sampling events and on the P-state requested on the last invocation, so it may get stuck at a very hight frequency even if the utilization of the CPU is very low (in fact, it may get stuck in a inadequate P-state regardless of the CPU utilization). The only way to kick it out of that limbo is a sufficiently long idle period (3 times longer than the prescribed sampling interval), but if that doesn't happen often enough (eg. due to a timing change like after the above commit), the P-state of the CPU may be inadequate pretty much all the time. To address the most egregious manifestations of that issue, reset the core_busy value used to determine the next P-state to request if the utilization of the CPU, determined with the help of the MPERF feedback register and the TSC, is below 1%. Link: https://bugzilla.kernel.org/show_bug.cgi?id=115771 Reported-and-tested-by: Jörg Otte <jrg.otte@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-10 05:59:10 +02:00
Viresh Kumar	8cee1eed8e	cpufreq: ACPI: Remove freq_table from acpi_cpufreq_data The freq-table is stored in struct cpufreq_policy also and there is absolutely no need of keeping a copy of its reference in struct acpi_cpufreq_data. Drop it. Also policy->freq_table can't be NULL in the target() callback, remove the useless check as well. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Rebase ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:54:31 +02:00
Viresh Kumar	9b55f55af8	cpufreq: ACPI: policy->driver_data can't be NULL in ->exit() Its always set by ->init() and so it will always be there in ->exit(). There is no need to have a special check for just that. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Rebase ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:41:50 +02:00
Rafael J. Wysocki	a794d6138c	cpufreq: Rearrange cpufreq_add_dev() Reorganize the code in cpufreq_add_dev() to avoid using the ret variable and reduce the indentation level in it. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-09 01:37:07 +02:00
Rafael J. Wysocki	cd73e9b01f	cpufreq: Simplify switch () in cpufreq_cpu_callback() Merge two switch entries that do the same thing in cpufreq_cpu_callback(). No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-09 01:37:07 +02:00
Joe Perches	1c5864e26c	cpufreq: Use consistent prefixing via pr_fmt Use the more common kernel style adding a define for pr_fmt. Miscellanea: o Remove now unused PFX defines Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:35:18 +02:00
Joe Perches	b49c22a6ca	cpufreq: Convert printk(KERN_<LEVEL> to pr_<level> Use the more common logging style. Miscellanea: o Coalesce formats o Realign arguments o Add a missing space between a coalesced format Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:35:18 +02:00
Joe Perches	4836df173a	intel_pstate: Use pr_fmt Prefix the output using the more common kernel style. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Rebase ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:33:42 +02:00
Geliang Tang	d2499d05f0	cpufreq: mt8173: use list_for_each_entry() Use list_for_each_entry() instead of list_for_each*() to simplify the code. Signed-off-by: Geliang Tang <geliangtang@163.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:27:54 +02:00
Rafael J. Wysocki	22590efb98	intel_pstate: Avoid pointless FRAC_BITS shifts under div_fp() There are multiple places in intel_pstate where int_tofp() is applied to both arguments of div_fp(), but this is pointless, because int_tofp() simply shifts its argument to the left by FRAC_BITS which mathematically is equivalent to multuplication by 2^FRAC_BITS, so if this is done to both arguments of a division, the extra factors will cancel each other during that operation anyway. Drop the pointless int_tofp() applied to div_fp() arguments throughout the driver. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:25:58 +02:00
Viresh Kumar	2249c00a0b	cpufreq: exynos: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:18:42 +02:00
Viresh Kumar	f56aad1d98	cpufreq: dt: Add generic platform-device creation support Multiple platforms are using the generic cpufreq-dt driver now, and all of them are required to create a platform device with name "cpufreq-dt", in order to get the cpufreq-dt probed. Many of them do it from platform code, others have special drivers just to do that. It would be more sensible to do this at a generic place, where all such platform can mark their entries. This patch adds a separate file to get this device created. Currently the compat list of platforms that we support is empty, and will be filled in as and when we move platforms to use it. It always compiles as part of the kernel and so doesn't need a module-exit operation. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:18:42 +02:00
Viresh Kumar	f3f24dea2c	cpufreq: tegra124: No need of setting platform-data All CPUs on Tegra platform share clock/voltage lines and there is absolutely no need of setting platform data for 'cpufreq-dt' platform device, as that's the default case. Stop setting platform data for cpufreq-dt device. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Thierry Reding <treding@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:12:09 +02:00
Paul Gortmaker	dbbe972c11	cpufreq: ppc_cbe_cpufreq_pmi: make the driver explicitly non-modular The Kconfig for this driver is currently: config CPU_FREQ_CBE_PMI bool "CBE frequency scaling using PMI interface" ...meaning that it currently is not being built as a module by anyone. Lets remove the modular and unused code here, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We also delete the MODULE_LICENSE tag etc. since all that information is already contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:11:04 +02:00
Rafael J. Wysocki	4b42fafc1c	Merge branch 'pm-cpufreq-sched' into pm-cpufreq	2016-04-09 01:08:02 +02:00
Rafael J. Wysocki	6c9d9c8192	cpufreq: Call cpufreq_disable_fast_switch() in sugov_exit() Due to differences in the cpufreq core's handling of runtime CPU offline and nonboot CPUs disabling during system suspend-to-RAM, fast frequency switching gets disabled after a suspend-to-RAM and resume cycle on all of the nonboot CPUs. To prevent that from happening, move the invocation of cpufreq_disable_fast_switch() from cpufreq_exit_governor() to sugov_exit(), as the schedutil governor is the only user of fast frequency switching today anyway. That simply prevents cpufreq_disable_fast_switch() from being called without invoking the ->governor callback for the CPUFREQ_GOV_POLICY_EXIT event (which happens during system suspend now). Fixes: `b7898fda5b` (cpufreq: Support for fast frequency switching) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-08 22:41:36 +02:00
Rafael J. Wysocki	fa81e66ec8	Merge branches 'pm-cpufreq', 'pm-cpuidle' and 'acpi-cppc' * pm-cpufreq: cpufreq: dt: Drop stale comment cpufreq: intel_pstate: Documenation for structures cpufreq: intel_pstate: fix inconsistency in setting policy limits intel_pstate: Avoid extra invocation of intel_pstate_sample() intel_pstate: Do not set utilization update hook too early * pm-cpuidle: intel_idle: Add KBL support intel_idle: Add SKX support intel_idle: Clean up all registered devices on exit. intel_idle: Propagate hot plug errors. intel_idle: Don't overreact to a cpuidle registration failure. intel_idle: Setup the timer broadcast only on successful driver load. intel_idle: Avoid a double free of the per-CPU data. intel_idle: Fix dangling registration on error path. intel_idle: Fix deallocation order on the driver exit path. intel_idle: Remove redundant initialization calls. intel_idle: Fix a helper function's return value. intel_idle: remove useless return from void function. * acpi-cppc: mailbox: pcc: Don't access an unmapped memory address space	2016-04-08 21:46:05 +02:00
Viresh Kumar	b318556479	cpufreq: dt: Drop stale comment The comment in file header doesn't hold true anymore, drop it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-05 03:40:44 +02:00
Srinivas Pandruvada	13ad7701f9	cpufreq: intel_pstate: Documenation for structures No code change. Only added kernel doc style comments for structures. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-05 03:39:05 +02:00
Srinivas Pandruvada	30a3915385	cpufreq: intel_pstate: fix inconsistency in setting policy limits When user sets performance policy using cpufreq interface, it is possible that because of policy->max limits, the actual performance is still limited. But the current implementation will silently switch the policy to powersave and start using powersave limits. If user modifies any limits using intel_pstate sysfs, this is actually changing powersave limits. The current implementation tracks limits under powersave and performance policy using two different variables. When policy->max is less than policy->cpuinfo.max_freq, only powersave limit variable is used. This fix causes the performance limits variable to be used always when the policy is performance. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-05 03:37:13 +02:00
Rafael J. Wysocki	9bdcb44e39	cpufreq: schedutil: New governor based on scheduler utilization data Add a new cpufreq scaling governor, called "schedutil", that uses scheduler-provided CPU utilization information as input for making its decisions. Doing that is possible after commit `34e2c555f3` (cpufreq: Add mechanism for registering utilization update callbacks) that introduced cpufreq_update_util() called by the scheduler on utilization changes (from CFS) and RT/DL task status updates. In particular, CPU frequency scaling decisions may be based on the the utilization data passed to cpufreq_update_util() by CFS. The new governor is relatively simple. The frequency selection formula used by it depends on whether or not the utilization is frequency-invariant. In the frequency-invariant case the new CPU frequency is given by next_freq = 1.25 * max_freq * util / max where util and max are the last two arguments of cpufreq_update_util(). In turn, if util is not frequency-invariant, the maximum frequency in the above formula is replaced with the current frequency of the CPU: next_freq = 1.25 * curr_freq * util / max The coefficient 1.25 corresponds to the frequency tipping point at (util / max) = 0.8. All of the computations are carried out in the utilization update handlers provided by the new governor. One of those handlers is used for cpufreq policies shared between multiple CPUs and the other one is for policies with one CPU only (and therefore it doesn't need to use any extra synchronization means). The governor supports fast frequency switching if that is supported by the cpufreq driver in use and possible for the given policy. In the fast switching case, all operations of the governor take place in its utilization update handlers. If fast switching cannot be used, the frequency switch operations are carried out with the help of a work item which only calls __cpufreq_driver_target() (under a mutex) to trigger a frequency update (to a value already computed beforehand in one of the utilization update handlers). Currently, the governor treats all of the RT and DL tasks as "unknown utilization" and sets the frequency to the allowed maximum when updated from the RT or DL sched classes. That heavy-handed approach should be replaced with something more subtle and specifically targeted at RT and DL tasks. The governor shares some tunables management code with the "ondemand" and "conservative" governors and uses some common definitions from cpufreq_governor.h, but apart from that it is stand-alone. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2016-04-02 01:09:12 +02:00
Rafael J. Wysocki	b7898fda5b	cpufreq: Support for fast frequency switching Modify the ACPI cpufreq driver to provide a method for switching CPU frequencies from interrupt context and update the cpufreq core to support that method if available. Introduce a new cpufreq driver callback, ->fast_switch, to be invoked for frequency switching from interrupt context by (future) governors supporting that feature via (new) helper function cpufreq_driver_fast_switch(). Add two new policy flags, fast_switch_possible, to be set by the cpufreq driver if fast frequency switching can be used for the given policy and fast_switch_enabled, to be set by the governor if it is going to use fast frequency switching for the given policy. Also add a helper for setting the latter. Since fast frequency switching is inherently incompatible with cpufreq transition notifiers, make it possible to set the fast_switch_enabled only if there are no transition notifiers already registered and make the registration of new transition notifiers fail if fast_switch_enabled is set for at least one policy. Implement the ->fast_switch callback in the ACPI cpufreq driver and make it set fast_switch_possible during policy initialization as appropriate. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-02 01:09:03 +02:00
Rafael J. Wysocki	379480d825	cpufreq: Move governor symbols to cpufreq.h Move definitions of symbols related to transition latency and sampling rate to include/linux/cpufreq.h so they can be used by (future) goverernors located outside of drivers/cpufreq/. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-02 01:09:02 +02:00
Rafael J. Wysocki	66893b6ac9	cpufreq: Move governor attribute set headers to cpufreq.h Move definitions and function headers related to struct gov_attr_set to include/linux/cpufreq.h so they can be used by (future) goverernors located outside of drivers/cpufreq/. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-02 01:09:02 +02:00
Rafael J. Wysocki	2d0c58ad60	cpufreq: governor: Move abstract gov_attr_set code to seperate file Move abstract code related to struct gov_attr_set to a separate (new) file so it can be shared with (future) goverernors that won't share more code with "ondemand" and "conservative". No intentional functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-02 01:09:01 +02:00
Rafael J. Wysocki	0dd3c1d678	cpufreq: governor: New data type for management part of dbs_data In addition to fields representing governor tunables, struct dbs_data contains some fields needed for the management of objects of that type. As it turns out, that part of struct dbs_data may be shared with (future) governors that won't use the common code used by "ondemand" and "conservative", so move it to a separate struct type and modify the code using struct dbs_data to follow. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-02 01:09:00 +02:00
Rafael J. Wysocki	0bed612be6	cpufreq: sched: Helpers to add and remove update_util hooks Replace the single helper for adding and removing cpufreq utilization update hooks, cpufreq_set_update_util_data(), with a pair of helpers, cpufreq_add_update_util_hook() and cpufreq_remove_update_util_hook(), and modify the users of cpufreq_set_update_util_data() accordingly. With the new helpers, the code using them doesn't need to worry about the internals of struct update_util_data and in particular it doesn't need to worry about populating the func field in it properly upfront. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2016-04-02 01:08:43 +02:00
Rafael J. Wysocki	9fa64d6424	Merge back intel_pstate fixes for v4.6. * pm-cpufreq: intel_pstate: Avoid extra invocation of intel_pstate_sample() intel_pstate: Do not set utilization update hook too early	2016-04-02 01:07:17 +02:00
Rafael J. Wysocki	febce40feb	intel_pstate: Avoid extra invocation of intel_pstate_sample() The initialization of intel_pstate for a given CPU involves populating the fields of its struct cpudata that represent the previous sample, but currently that is done in a problematic way. Namely, intel_pstate_init_cpu() makes an extra call to intel_pstate_sample() so it reads the current register values that will be used to populate the "previous sample" record during the next invocation of intel_pstate_sample(). However, after commit `a4675fbc4a` (cpufreq: intel_pstate: Replace timers with utilization update callbacks) that doesn't work for last_sample_time, because the time value is passed to intel_pstate_sample() as an argument now. Passing 0 to it from intel_pstate_init_cpu() is problematic, because that causes cpu->last_sample_time == 0 to be visible in get_target_pstate_use_performance() (and hence the extra cpu->last_sample_time > 0 check in there) and effectively allows the first invocation of intel_pstate_sample() from intel_pstate_update_util() to happen immediately after the initialization which may lead to a significant "turn on" effect in the governor algorithm. To mitigate that issue, rework the initialization to avoid the extra intel_pstate_sample() call from intel_pstate_init_cpu(). Instead, make intel_pstate_sample() return false if it has been called with cpu->sample.time equal to zero, which will make intel_pstate_update_util() skip the sample in that case, and reset cpu->sample.time from intel_pstate_set_update_util_hook() to make the algorithm start properly every time the hook is set. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-02 01:06:21 +02:00
Rafael J. Wysocki	bb6ab52f2b	intel_pstate: Do not set utilization update hook too early The utilization update hook in the intel_pstate driver is set too early, as it only should be set after the policy has been fully initialized by the core. That may cause intel_pstate_update_util() to use incorrect data and put the CPUs into incorrect P-states as a result. To prevent that from happening, make intel_pstate_set_policy() set the utilization update hook instead of intel_pstate_init_cpu() so intel_pstate_update_util() only runs when all things have been initialized as appropriate. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-31 17:42:15 +02:00
Linus Torvalds	3d66c6ba3f	Power management and ACPI material for v4.6-rc1, part 2 - Fix for an intel_pstate driver issue related to the handling of MSR updates uncovered by the recent cpufreq rework (Rafael Wysocki). - cpufreq core cleanups related to starting governors and frequency synchronization during resume from system suspend and a locking fix for cpufreq_quick_get() (Rafael Wysocki, Richard Cochran). - acpi-cpufreq and powernv cpufreq driver updates (Jisheng Zhang, Michael Neuling, Richard Cochran, Shilpasri Bhat). - intel_idle driver update preventing some Skylake-H systems from hanging during initialization by disabling deep C-states mishandled by the platform in the problematic configurations (Len Brown). - Intel Xeon Phi Processor x200 support for intel_idle (Dasaratharaman Chandramouli). - cpuidle menu governor updates to make it always honor PM QoS latency constraints (and prevent C1 from being used as the fallback C-state on x86 when they are set below its exit latency) and to restore the previous behavior to fall back to C1 if the next timer event is set far enough in the future that was changed in 4.4 which led to an energy consumption regression (Rik van Riel, Rafael Wysocki). - New device ID for a future AMD UART controller in the ACPI driver for AMD SoCs (Wang Hongcheng). - Rockchip rk3399 support for the rockchip-io-domain adaptive voltage scaling (AVS) driver (David Wu). - ACPI PCI resources management fix for the handling of IO space resources on architectures where the IO space is memory mapped (IA64 and ARM64) broken by the introduction of common ACPI resources parsing for PCI host bridges in 4.4 (Lorenzo Pieralisi). - Fix for the ACPI backend of the generic device properties API to make it parse non-device (data node only) children of an ACPI device correctly (Irina Tirdea). - Fixes for the handling of global suspend flags (introduced in 4.4) during hibernation and resume from it (Lukas Wunner). - Support for obtaining configuration information from Device Trees in the PM clocks framework (Jon Hunter). - ACPI _DSM helper code and devfreq framework cleanups (Colin Ian King, Geert Uytterhoeven). / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJW9JaRAAoJEILEb/54YlRx/GAQAJujANWilWHZYm24a9JDcIE9 rsNZIC/FdeBVilPtRTZQnig/Pj32Z4Jm7IZ/DLOq0Deu1YK/9uv3y59M3BcX6WyL H5VR80L8geUJZ7RRk0WfM5D4X82ovzwpE/kWt2Z7HDuvJSCBmFBZOvNrXbaRncKD jIvat/p6uCuxt5c08+ebnBLQ6tOs8wLTWiCx3fO128GIrGRGN2xFV6hzRWVGnJ4g WXGAR+AdLxRMZz4PPmqdTfRj4TNSR071GjKyaeKfZUjQGAsf5O9A77JFjeNVomDx g1K37Byid2bTByzVavlEXPJZ7eKb5dAhlo7IJ9HAcOAXChLqH2Czjrpd+1XjR9MF SV/78rCnF8eet83QYLbGV/Mzf7gbJP2Xp6wiaM22VAPpGe+sYfphJoQka9XRTfId OgAjyYMYdWAKo5DhxVNI8WyN0W5dsoBFPxnaUFhHSGDCIJH7Ksy20m6y3plG2Bxf ahoiQhmd9ohjtB5JbRnf4MY0hjekp8Srdf+DoNKsk/+JscIyROpYY3msQ3smUKo+ f628MC/wAosMpSV+l+KOYkbjCbtB49IabWtZ//NVD9hYB3E1f6aTN59yFbWB+1rp L7Y8iaxzSkyJy/yYVuBal3rSk356+BvvoXBlLXmBsyu1TMlcDjALIYztSiTVT5MB RZBhgNwdkxNCYJfU3ex+ =hUVj -----END PGP SIGNATURE----- Merge tag 'pm+acpi-4.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management and ACPI updates from Rafael Wysocki: "The second batch of power management and ACPI updates for v4.6. Included are fixups on top of the previous PM/ACPI pull request and other material that didn't make into it but still should go into 4.6. Among other things, there's a fix for an intel_pstate driver issue uncovered by recent cpufreq changes, a workaround for a boot hang on Skylake-H related to the handling of deep C-states by the platform and a PCI/ACPI fix for the handling of IO port resources on non-x86 architectures plus some new device IDs and similar. Specifics: - Fix for an intel_pstate driver issue related to the handling of MSR updates uncovered by the recent cpufreq rework (Rafael Wysocki). - cpufreq core cleanups related to starting governors and frequency synchronization during resume from system suspend and a locking fix for cpufreq_quick_get() (Rafael Wysocki, Richard Cochran). - acpi-cpufreq and powernv cpufreq driver updates (Jisheng Zhang, Michael Neuling, Richard Cochran, Shilpasri Bhat). - intel_idle driver update preventing some Skylake-H systems from hanging during initialization by disabling deep C-states mishandled by the platform in the problematic configurations (Len Brown). - Intel Xeon Phi Processor x200 support for intel_idle (Dasaratharaman Chandramouli). - cpuidle menu governor updates to make it always honor PM QoS latency constraints (and prevent C1 from being used as the fallback C-state on x86 when they are set below its exit latency) and to restore the previous behavior to fall back to C1 if the next timer event is set far enough in the future that was changed in 4.4 which led to an energy consumption regression (Rik van Riel, Rafael Wysocki). - New device ID for a future AMD UART controller in the ACPI driver for AMD SoCs (Wang Hongcheng). - Rockchip rk3399 support for the rockchip-io-domain adaptive voltage scaling (AVS) driver (David Wu). - ACPI PCI resources management fix for the handling of IO space resources on architectures where the IO space is memory mapped (IA64 and ARM64) broken by the introduction of common ACPI resources parsing for PCI host bridges in 4.4 (Lorenzo Pieralisi). - Fix for the ACPI backend of the generic device properties API to make it parse non-device (data node only) children of an ACPI device correctly (Irina Tirdea). - Fixes for the handling of global suspend flags (introduced in 4.4) during hibernation and resume from it (Lukas Wunner). - Support for obtaining configuration information from Device Trees in the PM clocks framework (Jon Hunter). - ACPI _DSM helper code and devfreq framework cleanups (Colin Ian King, Geert Uytterhoeven)" * tag 'pm+acpi-4.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (23 commits) PM / AVS: rockchip-io: add io selectors and supplies for rk3399 intel_idle: Support for Intel Xeon Phi Processor x200 Product Family intel_idle: prevent SKL-H boot failure when C8+C9+C10 enabled ACPI / PM: Runtime resume devices when waking from hibernate PM / sleep: Clear pm_suspend_global_flags upon hibernate cpufreq: governor: Always schedule work on the CPU running update cpufreq: Always update current frequency before startig governor cpufreq: Introduce cpufreq_update_current_freq() cpufreq: Introduce cpufreq_start_governor() cpufreq: powernv: Add sysfs attributes to show throttle stats cpufreq: acpi-cpufreq: make Intel/AMD MSR access, io port access static PCI: ACPI: IA64: fix IO port generic range check ACPI / util: cast data to u64 before shifting to fix sign extension cpufreq: powernv: Define per_cpu chip pointer to optimize hot-path cpuidle: menu: Fall back to polling if next timer event is near cpufreq: acpi-cpufreq: Clean up hot plug notifier callback intel_pstate: Do not call wrmsrl_on_cpu() with disabled interrupts cpufreq: Make cpufreq_quick_get() safe to call ACPI / property: fix data node parsing in acpi_get_next_subnode() ACPI / APD: Add device HID for future AMD UART controller ...	2016-03-24 22:59:58 -07:00
Rafael J. Wysocki	33068b61f8	Merge branches 'pm-cpufreq' and 'pm-cpuidle' * pm-cpufreq: cpufreq: governor: Always schedule work on the CPU running update cpufreq: Always update current frequency before startig governor cpufreq: Introduce cpufreq_update_current_freq() cpufreq: Introduce cpufreq_start_governor() cpufreq: powernv: Add sysfs attributes to show throttle stats cpufreq: acpi-cpufreq: make Intel/AMD MSR access, io port access static cpufreq: powernv: Define per_cpu chip pointer to optimize hot-path cpufreq: acpi-cpufreq: Clean up hot plug notifier callback intel_pstate: Do not call wrmsrl_on_cpu() with disabled interrupts cpufreq: Make cpufreq_quick_get() safe to call * pm-cpuidle: intel_idle: Support for Intel Xeon Phi Processor x200 Product Family intel_idle: prevent SKL-H boot failure when C8+C9+C10 enabled cpuidle: menu: Fall back to polling if next timer event is near cpuidle: menu: use high confidence factors only when considering polling	2016-03-25 00:57:22 +01:00
Rafael J. Wysocki	539a4c4247	cpufreq: governor: Always schedule work on the CPU running update Modify dbs_irq_work() to always schedule the process-context work on the current CPU which also ran the dbs_update_util_handler() that the irq_work being handled came from. This causes the entire frequency update handling (involving the "ondemand" or "conservative" governors) to be carried out by the CPU whose frequency is to be updated and reduces the overall amount of inter-CPU noise related to cpufreq. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-22 23:13:36 +01:00
Rafael J. Wysocki	3bbf8fe3ae	cpufreq: Always update current frequency before startig governor Make policy->cur match the current frequency returned by the driver's ->get() callback before starting the governor in case they went out of sync in the meantime and drop the piece of code attempting to resync policy->cur with the real frequency of the boot CPU from cpufreq_resume() as it serves no purpose any more (and it's racy and super-ugly anyway). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-22 23:13:36 +01:00
Rafael J. Wysocki	999f572983	cpufreq: Introduce cpufreq_update_current_freq() Move the part of cpufreq_update_policy() that obtains the current frequency from the driver and updates policy->cur if necessary to a separate function, cpufreq_get_current_freq(). That should not introduce functional changes and subsequent change set will need it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-22 23:13:36 +01:00
Rafael J. Wysocki	0a300767e5	cpufreq: Introduce cpufreq_start_governor() Starting a governor in cpufreq always follows the same pattern involving two calls to cpufreq_governor(), one with the event argument set to CPUFREQ_GOV_START and one with that argument set to CPUFREQ_GOV_LIMITS. Introduce cpufreq_start_governor() that will carry out those two operations and make all places where governors are started use it. That slightly modifies the behavior of cpufreq_set_policy() which now also will go back to the old governor if the second call to cpufreq_governor() (the one with event equal to CPUFREQ_GOV_LIMITS) fails, but that really is how it should work in the first place. Also cpufreq_resume() will now pring an error message if the CPUFREQ_GOV_LIMITS call to cpufreq_governor() fails, but that makes it follow cpufreq_add_policy_cpu() and cpufreq_offline() in that respect. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-22 23:13:36 +01:00
Shilpasri G Bhat	1b0289848d	cpufreq: powernv: Add sysfs attributes to show throttle stats Create sysfs attributes to export throttle information in /sys/devices/system/cpu/cpuX/cpufreq/throttle_stats directory. The newly added sysfs files are as follows: 1)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/turbo_stat 2)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/sub-turbo_stat 3)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/unthrottle 4)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/powercap 5)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/overtemp 6)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/supply_fault 7)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/overcurrent 8)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/occ_reset Detailed explanation of each attribute is added to Documentation/ABI/testing/sysfs-devices-system-cpu Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-22 23:11:18 +01:00
Jisheng Zhang	ac13b9967d	cpufreq: acpi-cpufreq: make Intel/AMD MSR access, io port access static These frequency register read/write operations' implementations for the given processor (Intel/AMD MSR access or I/O port access) are only used internally in acpi-cpufreq, so make them static. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-22 23:09:50 +01:00
Michael Neuling	3e5963bc34	cpufreq: powernv: Define per_cpu chip pointer to optimize hot-path Commit `96c4726f01` "cpufreq: powernv: Remove cpu_to_chip_id() from hot-path" introduced a 'core_to_chip_map' array to cache the chip-ids of all cores. Replace this with a per-CPU variable that stores the pointer to the chip-array. This removes the linear lookup and provides a neater and simpler solution. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-22 01:08:55 +01:00
Linus Torvalds	46e595a17d	ARM: SoC driver updates for v4.6 Driver updates for ARM SoCs, these contain various things that touch the drivers/ directory but got merged through arm-soc for practical reasons: - Rockchip rk3368 gains power domain support - Small updates for the ARM spmi driver - The Atmel PMC driver saw a larger rework, touching both arch/arm/mach-at91 and drivers/clk/at91 - All reset controller driver changes alway get merged through arm-soc, though this time the largest change is the addition of a MIPS pistachio reset driver - One bugfix for the NXP (formerly Freescale) i.MX weim bus driver -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIVAwUAVu67OmCrR//JCVInAQJ64hAAqNemdAMloJhh8mk4O74egd/XNE8GLK3v gGefpZNi0TC8u/GWMhU1aFCElaCmbNlL0IlqaRrU/vydOmQcZYht7Fg3bAm4r3ck TlKijGTJap4sdHhxSeui+7bhaBToxcklQTdcrKFgOwsype7CAWJCl5otIC/GHO5L fn4QSjQbqr5kqH1XfuVIphj/fJjDKRRze5D7zn0nExq46OyoYyjc2lm/QkLgeeS2 vDpzOULYXcjf5GfsPknCJGGjenISD7cIAwZukGvJXFh8WrXkEPZZ7B7bBI/8ZeBU MkdWvOm9fHEWpIPnuTcLeQNlfdzQ0Z0zijgJqnXjwSYXK2Es1UKEoIFvZUyGA9zG uyLtddFcKbP4QBDUKVMbyYM6x4Cj7LO96dB2pe8iH5rvnoLS32EjJ/4glnbPQFB7 75JKb7eU1pijoy9c3x/G10vINHzbPjyUN3sYTFKMomPFzEF4OVQ3GDclSuD7jjDr GnqmAqlj29+qGU6iQBBHp9TfLTxwrs/4MKPEZ+tTGvtINnzOpLGA3TUnji7nVFQc BYy3qaEvg9MfHI3uXhAl2L4CGCVvHfqFs5B7giZfAkbbcTNAHs9PkZ6gMYH+GG3p tEbTf/dMHmkkqttSz4f7LZS7D56cSfm3cD8kFCRJPLKifmGAk3w1HZ7JoCXdjr1K 22HSKRMxlhU= =HS4G -----END PGP SIGNATURE----- Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC driver updates from Arnd Bergmann: "Driver updates for ARM SoCs, these contain various things that touch the drivers/ directory but got merged through arm-soc for practical reasons: - Rockchip rk3368 gains power domain support - Small updates for the ARM spmi driver - The Atmel PMC driver saw a larger rework, touching both arch/arm/mach-at91 and drivers/clk/at91 - All reset controller driver changes alway get merged through arm-soc, though this time the largest change is the addition of a MIPS pistachio reset driver - One bugfix for the NXP (formerly Freescale) i.MX weim bus driver" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (43 commits) bus: imx-weim: Take the 'status' property value into account clk: at91: remove useless includes clk: at91: pmc: remove useless capacities handling clk: at91: pmc: drop at91_pmc_base usb: gadget: atmel: access the PMC using regmap ARM: at91: remove useless includes and function prototypes ARM: at91: pm: move idle functions to pm.c ARM: at91: pm: find and remap the pmc ARM: at91: pm: simply call at91_pm_init clk: at91: pmc: move pmc structures to C file clk: at91: pmc: merge at91_pmc_init in atmel_pmc_probe clk: at91: remove IRQ handling and use polling clk: at91: make use of syscon/regmap internally clk: at91: make use of syscon to share PMC registers in several drivers hwmon: (scpi) add energy meter support firmware: arm_scpi: add support for 64-bit sensor values firmware: arm_scpi: decrease Tx timeout to 20ms firmware: arm_scpi: fix send_message and sensor_get_value for big-endian reset: sti: Make reset_control_ops const reset: zynq: Make reset_control_ops const ...	2016-03-20 15:40:32 -07:00
Linus Torvalds	33b3d2e88c	ARM: SoC platform updates for v4.6 Newly added support for additional SoCs: - Axis Artpec-6 SoC family - Allwinner A83T SoC - Mediatek MT7623 - NXP i.MX6QP SoC - ST Microelectronics stm32f469 microcontroller New features: - SMP support for Mediatek mt2701 - Big-endian support for NXP i.MX - DaVinci now uses the new DMA engine dma_slave_map - OMAP now uses the new DMA engine dma_slave_map - earlyprintk support for palmchip uart on mach-tango - delay timer support for orion Other: - Exynos PMU driver moved out to drivers/soc/ - Various smaller updates for Renesas, Xilinx, PXA, AT91, OMAP, uniphier -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIVAwUAVu68DGCrR//JCVInAQIHVQ//Wblms+NKj3aKh6m2Sscs/YkSbFaQ4sY2 rNyfxLIYsLXkth1kbdHRFSMyL68Ym+xutErgw/3HQPB2D1YtYJE3VJ/y8AU92SU3 oHyQIty+atB8d8zBbtlkWmat94NIfYf0I8PQETreGb1LMaJqAf0mDEDAyorTLZcZ UtQ817Ihn7urqwdTJpTO58V41RmY/vflbHI5T6bIjUJn6fF1e/7+VqtMIfq5sjJ6 0EPEQdu8s5AJ7gcGlGi9I5gAtSnWSA/9phAxul9P8/HrMpUWIxreSEAy8FY7W14F 4TON3sQrnw7nyA72U80KGIXhgLy7SbEmHcSqyy4YJK3ycdk6VYk0CBO7nWVYAiD1 knLisOH6jwe0LIj9WXiRR+Y2Q53pXN8SF77pLDahSnvuShnYEjEH5uELHtxe7Vxh gn+NH1rDkRTgdYgt4RWlVyUoLkddQWzLb1m4QyQlvxtTR25cJJayXdVX2MRrNPF5 c1zRa9HH+b8LJQIMdWfo/NoHhHtftkkGGsqHAAaypZqdpyk0j2HpJYk5ecPR4f5C /8o/h/5xOI9gEzp/DVYSZ1VAvRqBQGIDfKBXWq6GuoZaF0aN8ISe5IxFn5Yx2F46 fNaxqiNpWmyywl8D+tSWPFK6aE21AXKGi5zIzexZZqy283aDjlUPI+tgF2GKIuKP 3ayYTDeBpLI= =ynNj -----END PGP SIGNATURE----- Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC platform updates from Arnd Bergmann: "Newly added support for additional SoCs: - Axis Artpec-6 SoC family - Allwinner A83T SoC - Mediatek MT7623 - NXP i.MX6QP SoC - ST Microelectronics stm32f469 microcontroller New features: - SMP support for Mediatek mt2701 - Big-endian support for NXP i.MX - DaVinci now uses the new DMA engine dma_slave_map - OMAP now uses the new DMA engine dma_slave_map - earlyprintk support for palmchip uart on mach-tango - delay timer support for orion Other: - Exynos PMU driver moved out to drivers/soc/ - Various smaller updates for Renesas, Xilinx, PXA, AT91, OMAP, uniphier" * tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (83 commits) ARM: uniphier: rework SMP code to support new System Bus binding ARM: uniphier: add missing of_node_put() ARM: at91: avoid defining CONFIG_* symbols in source code ARM: DRA7: hwmod: Add data for eDMA tpcc, tptc0, tptc1 ARM: imx: Make reset_control_ops const ARM: imx: Do L2 errata only if the L2 cache isn't enabled ARM: imx: select ARM_CPU_SUSPEND only for imx6 dmaengine: pxa_dma: fix the maximum requestor line ARM: alpine: select the Alpine MSI controller driver ARM: pxa: add the number of DMA requestor lines dmaengine: mmp-pdma: add number of requestors dma: mmp_pdma: Add the #dma-requests DT property documentation ARM: OMAP2+: Add rtc hwmod configuration for ti81xx ARM: s3c24xx: Avoid warning for inb/outb ARM: zynq: Move early printk virtual address to vmalloc area ARM: DRA7: hwmod: Add custom reset handler for PCIeSS ARM: SAMSUNG: Remove unused register offset definition ARM: EXYNOS: Cleanup header files inclusion drivers: soc: samsung: Enable COMPILE_TEST MAINTAINERS: Add maintainers entry for drivers/soc/samsung ...	2016-03-20 14:57:08 -07:00
Richard Cochran	ed72662a84	cpufreq: acpi-cpufreq: Clean up hot plug notifier callback This driver has two issues. First, it tries to fiddle with the hot plugged CPU's MSR on the UP_PREPARE event, at a time when the CPU is not yet online. Second, the driver sets the "boost-disable" bit for a CPU when going down, but does not clear the bit again if the CPU comes up again due to DOWN_FAILED. This patch fixes the issues by changing the driver to react to the ONLINE/DOWN_FAILED events instead of UP_PREPARE. As an added benefit, the driver also becomes symmetric with respect to the hot plug mechanism. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-20 00:39:04 +01:00
Rafael J. Wysocki	fdfdb2b130	intel_pstate: Do not call wrmsrl_on_cpu() with disabled interrupts After commit `a4675fbc4a` (cpufreq: intel_pstate: Replace timers with utilization update callbacks) wrmsrl_on_cpu() cannot be called in the intel_pstate_adjust_busy_pstate() path as that is executed with disabled interrupts. However, atom_set_pstate() called from there via intel_pstate_set_pstate() uses wrmsrl_on_cpu() to update the IA32_PERF_CTL MSR which triggers the WARN_ON_ONCE() in smp_call_function_single(). The reason why wrmsrl_on_cpu() is used by atom_set_pstate() is because intel_pstate_set_pstate() calling it is also invoked during the initialization and cleanup of the driver and in those cases it is not guaranteed to be run on the CPU that is being updated. However, in the case when intel_pstate_set_pstate() is called by intel_pstate_adjust_busy_pstate(), wrmsrl() can be used to update the register safely. Moreover, intel_pstate_set_pstate() already contains code that only is executed if the function is called by intel_pstate_adjust_busy_pstate() and there is a special argument passed to it because of that. To fix the problem at hand, rearrange the code taking the above observations into account. First, replace the ->set() callback in struct pstate_funcs with a ->get_val() one that will return the value to be written to the IA32_PERF_CTL MSR without updating the register. Second, split intel_pstate_set_pstate() into two functions, intel_pstate_update_pstate() to be called by intel_pstate_adjust_busy_pstate() that will contain all of the intel_pstate_set_pstate() code which only needs to be executed in that case and will use wrmsrl() to update the MSR (after obtaining the value to write to it from the ->get_val() callback), and intel_pstate_set_min_pstate() to be invoked during the initialization and cleanup that will set the P-state to the minimum one and will update the MSR using wrmsrl_on_cpu(). Finally, move the code shared between intel_pstate_update_pstate() and intel_pstate_set_min_pstate() to a new static inline function intel_pstate_record_pstate() and make them both call it. Of course, that unifies the handling of the IA32_PERF_CTL MSR writes between Atom and Core. Fixes: `a4675fbc4a` (cpufreq: intel_pstate: Replace timers with utilization update callbacks) Reported-and-tested-by: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-20 00:37:09 +01:00
Richard Cochran	c75361c0b0	cpufreq: Make cpufreq_quick_get() safe to call The function, cpufreq_quick_get, accesses the global 'cpufreq_driver' and its fields without taking the associated lock, cpufreq_driver_lock. Without the locking, nothing guarantees that 'cpufreq_driver' remains consistent during the call. This patch fixes the issue by taking the lock before accessing the data structure. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-18 01:49:01 +01:00
Rafael J. Wysocki	4ed3900427	Merge branch 'pm-cpufreq' * pm-cpufreq: (94 commits) intel_pstate: Do not skip samples partially intel_pstate: Remove freq calculation from intel_pstate_calc_busy() intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance() intel_pstate: Optimize calculation for max/min_perf_adj intel_pstate: Remove extra conversions in pid calculation cpufreq: Move scheduler-related code to the sched directory Revert "cpufreq: postfix policy directory with the first CPU in related_cpus" cpufreq: Reduce cpufreq_update_util() overhead a bit cpufreq: Select IRQ_WORK if CPU_FREQ_GOV_COMMON is set cpufreq: Remove 'policy->governor_enabled' cpufreq: Rename __cpufreq_governor() to cpufreq_governor() cpufreq: Relocate handle_update() to kill its declaration cpufreq: governor: Drop unnecessary checks from show() and store() cpufreq: governor: Fix race in dbs_update_util_handler() cpufreq: governor: Make gov_set_update_util() static cpufreq: governor: Narrow down the dbs_data_mutex coverage cpufreq: governor: Make dbs_data_mutex static cpufreq: governor: Relocate definitions of tuners structures cpufreq: governor: Move per-CPU data to the common code cpufreq: governor: Make governor private data per-policy ...	2016-03-14 14:22:03 +01:00
Rafael J. Wysocki	4fec7ad5f6	intel_pstate: Do not skip samples partially If the current value of MPERF or the current value of TSC is the same as the previous one, respectively, intel_pstate_sample() bails out early and skips the sample. However, intel_pstate_adjust_busy_pstate() is still called in that case which is not correct, so modify intel_pstate_sample() to return a bool value indicating whether or not the sample has been taken and use it to decide whether or not to call intel_pstate_adjust_busy_pstate(). While at it, remove redundant parentheses from the MPERF/TSC check in intel_pstate_sample(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>	2016-03-11 00:07:51 +01:00
Philippe Longepe	8fa520af50	intel_pstate: Remove freq calculation from intel_pstate_calc_busy() Use a helper function to compute the average pstate and call it only where it is needed (only when tracing or in intel_pstate_get). Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-11 00:04:58 +01:00
Philippe Longepe	7349ec0470	intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance() The cpu_load algorithm doesn't need to invoke intel_pstate_calc_busy(), so move that call from intel_pstate_sample() to get_target_pstate_use_performance(). Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-11 00:04:58 +01:00
Philippe Longepe	a158bed5dc	intel_pstate: Optimize calculation for max/min_perf_adj mul_fp(int_tofp(A), B) expands to: ((A << FRAC_BITS) * B) >> FRAC_BITS, so the same result can be obtained via simple multiplication A * B. Apply this observation to max_perf * limits->max_perf and max_perf * limits->min_perf in intel_pstate_get_min_max()." Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-11 00:04:58 +01:00
Philippe Longepe	b54a0dfd56	intel_pstate: Remove extra conversions in pid calculation pid->setpoint and pid->deadband can be initialized in fixed point, so we can avoid the int_tofp in pid_calc. Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-11 00:04:42 +01:00
Rafael J. Wysocki	a5acbfbd70	Merge branch 'pm-cpufreq-governor' into pm-cpufreq	2016-03-10 20:46:03 +01:00
Rafael J. Wysocki	adaf9fcd13	cpufreq: Move scheduler-related code to the sched directory Create cpufreq.c under kernel/sched/ and move the cpufreq code related to the scheduler to that file and to sched.h. Redefine cpufreq_update_util() as a static inline function to avoid function calls at its call sites in the scheduler code (as suggested by Peter Zijlstra). Also move the definition of struct update_util_data and declaration of cpufreq_set_update_util_data() from include/linux/cpufreq.h to include/linux/sched.h. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2016-03-10 20:44:47 +01:00
Viresh Kumar	edd4a893e0	Revert "cpufreq: postfix policy directory with the first CPU in related_cpus" Revert commit `3510fac454` (cpufreq: postfix policy directory with the first CPU in related_cpus). Earlier, the policy->kobj was added to the kobject core, before ->init() callback was called for the cpufreq drivers. Which allowed those drivers to add or remove, driver dependent, sysfs files/directories to the same kobj from their ->init() and ->exit() callbacks. That isn't possible anymore after commit `3510fac454`. Now, there is no other clean alternative that people can adopt. Its better to revert the earlier commit to allow cpufreq drivers to create/remove sysfs files from ->init() and ->exit() callbacks. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 21:42:45 +01:00
Rafael J. Wysocki	08f511fd41	cpufreq: Reduce cpufreq_update_util() overhead a bit Use the observation that cpufreq_update_util() is only called by the scheduler with rq->lock held, so the callers of cpufreq_set_update_util_data() can use synchronize_sched() instead of synchronize_rcu() to wait for cpufreq_update_util() to complete. Moreover, if they are updated to do that, rcu_read_(un)lock() calls in cpufreq_update_util() might be replaced with rcu_read_(un)lock_sched(), respectively, but those aren't really necessary, because the scheduler calls that function from RCU-sched read-side critical sections already. In addition to that, if cpufreq_set_update_util_data() checks the func field in the struct update_util_data before setting the per-CPU pointer to it, the data->func check may be dropped from cpufreq_update_util() as well. Make the above changes to reduce the overhead from cpufreq_update_util() in the scheduler paths invoking it and to make the cleanup after removing its callbacks less heavy-weight somewhat. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2016-03-09 15:07:58 +01:00
Rafael J. Wysocki	e6f036571e	cpufreq: Select IRQ_WORK if CPU_FREQ_GOV_COMMON is set Commit 0eb463be3436 (cpufreq: governor: Replace timers with utilization update callbacks) made CPU_FREQ select IRQ_WORK, but that's not necessary, as it is sufficient for IRQ_WORK to be selected by CPU_FREQ_GOV_COMMON, so modify the cpufreq Kconfig to that effect. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 15:01:37 +01:00
Viresh Kumar	242aa883a6	cpufreq: Remove 'policy->governor_enabled' The entire sequence of events (like INIT/START or STOP/EXIT) for which cpufreq_governor() is called, is guaranteed to be protected by policy->rwsem now. The additional checks that were added earlier (as we were forced to drop policy->rwsem before calling cpufreq_governor() for EXIT event), aren't required anymore. Over that, they weren't sufficient really. They just take care of START/STOP events, but not INIT/EXIT and the state machine was never maintained properly by them. Kill the unnecessary checks and policy->governor_enabled field. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:41:12 +01:00
Viresh Kumar	a1317e091a	cpufreq: Rename __cpufreq_governor() to cpufreq_governor() The __ at the beginning of the routine aren't really necessary at all. Rename it to cpufreq_governor() instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:41:11 +01:00
Viresh Kumar	11eb69b984	cpufreq: Relocate handle_update() to kill its declaration handle_update() is declared at the top of the file as its user appear before its definition. Relocate the routine to get rid of this. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:41:11 +01:00
Viresh Kumar	f737236b12	cpufreq: governor: Drop unnecessary checks from show() and store() The show() and store() routines in the cpufreq-governor core don't need to check if the struct governor_attr they want to use really provides the callbacks they need as expected (if that's not the case, it means a bug in the code anyway), so change them to avoid doing that. Also change the error value to -EBUSY, if the governor is getting removed and we aren't allowed to store any more changes. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:41:11 +01:00
Rafael J. Wysocki	27de348239	cpufreq: governor: Fix race in dbs_update_util_handler() There is a scenario that may lead to undesired results in dbs_update_util_handler(). Namely, if two CPUs sharing a policy enter the funtion at the same time, pass the sample delay check and then one of them is stalled until dbs_work_handler() (queued up by the other CPU) clears the work counter, it may update the work counter and queue up another work item prematurely. To prevent that from happening, use the observation that the CPU queuing up a work item in dbs_update_util_handler() updates the last sample time. This means that if another CPU was stalling after passing the sample delay check and now successfully updated the work counter as a result of the race described above, it will see the new value of the last sample time which is different from what it used in the sample delay check before. If that happens, the sample delay check passed previously is not valid any more, so the CPU should not continue. Fixes: f17cbb53783c (cpufreq: governor: Avoid atomic operations in hot paths) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:10 +01:00
Rafael J. Wysocki	94ab5e030f	cpufreq: governor: Make gov_set_update_util() static The gov_set_update_util() routine is only used internally by the common governor code and it doesn't need to be exported, so make it static. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:10 +01:00
Rafael J. Wysocki	1112e9d83e	cpufreq: governor: Narrow down the dbs_data_mutex coverage Since cpufreq_governor_dbs() is now always called with policy->rwsem held, it cannot be executed twice in parallel for the same policy. Thus it is not necessary to hold dbs_data_mutex around the invocations of cpufreq_governor_start/stop/limits() from it as those functions never modify any data that can be shared between different policies. However, cpufreq_governor_dbs() may be executed twice in parallal for different policies using the same gov->gdbs_data object and dbs_data_mutex is still necessary to protect that object against concurrent updates. For this reason, narrow down the dbs_data_mutex locking to cpufreq_governor_init/exit() where it is needed and rename the mutex to gov_dbs_data_mutex to reflect its purpose. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:10 +01:00
Rafael J. Wysocki	e3f5ed9393	cpufreq: governor: Make dbs_data_mutex static That mutex is only used by cpufreq_governor_dbs() and it doesn't need to be exported to modules, so make it static and drop the export incantation. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:09 +01:00
Rafael J. Wysocki	47ebaac1f3	cpufreq: governor: Relocate definitions of tuners structures Move the definitions of struct od_dbs_tuners and struct cs_dbs_tuners from the common governor header to the ondemand and conservative governor code, respectively, as they don't need to be in the common header any more. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:09 +01:00
Rafael J. Wysocki	8c8f77fd07	cpufreq: governor: Move per-CPU data to the common code After previous changes there is only one piece of code in the ondemand governor making references to per-CPU data structures, but it can be easily modified to avoid doing that, so modify it accordingly and move the definition of per-CPU data used by the ondemand and conservative governors to the common code. Next, change that code to access the per-CPU data structures directly rather than via a governor callback. This causes the ->get_cpu_cdbs governor callback to become unnecessary, so drop it along with the macro and function definitions related to it. Finally, drop the definitions of struct od_cpu_dbs_info_s and struct cs_cpu_dbs_info_s that aren't necessary any more. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:09 +01:00
Rafael J. Wysocki	7d5a9956af	cpufreq: governor: Make governor private data per-policy Some fields in struct od_cpu_dbs_info_s and struct cs_cpu_dbs_info_s are only used for a limited set of CPUs. Namely, if a policy is shared between multiple CPUs, those fields will only be used for one of them (policy->cpu). This means that they really are per-policy rather than per-CPU and holding room for them in per-CPU data structures is generally wasteful. Also moving those fields into per-policy data structures will allow some significant simplifications to be made going forward. For this reason, introduce struct cs_policy_dbs_info and struct od_policy_dbs_info to hold those fields. Define each of the new structures as an extension of struct policy_dbs_info (such that struct policy_dbs_info is embedded in each of them) and introduce new ->alloc and ->free governor callbacks to allocate and free those structures, respectively, such that ->alloc() will return a pointer to the struct policy_dbs_info embedded in the allocated data structure and ->free() will take that pointer as its argument. With that, modify the code accessing the data fields in question in per-CPU data objects to look for them in the new structures via the struct policy_dbs_info pointer available to it and drop them from struct od_cpu_dbs_info_s and struct cs_cpu_dbs_info_s. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:08 +01:00
Rafael J. Wysocki	d1db75fffc	cpufreq: ondemand: Rework the handling of powersave bias updates The ondemand_powersave_bias_init() function used for resetting data fields related to the powersave bias tunable of the ondemand governor works by walking all of the online CPUs in the system and updating the od_cpu_dbs_info_s structures for all of them. However, if governor tunables are per policy, the update should not touch the CPUs that are not associated with the given dbs_data. Moreover, since the data fields in question are only ever used for policy->cpu in each policy governed by ondemand, the update can be limited to those specific CPUs. Rework the code to take the above observations into account. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:08 +01:00
Rafael J. Wysocki	a33cce1c6c	cpufreq: governor: Fix CPU load information updates via ->store The ->store() callbacks of some tunable sysfs attributes of the ondemand and conservative governors trigger immediate updates of the CPU load information for all CPUs "governed" by the given dbs_data by walking the cpu_dbs_info structures for all online CPUs in the system and updating them. This is questionable for two reasons. First, it may lead to a lot of extra overhead on a system with many CPUs if the given dbs_data is only associated with a few of them. Second, if governor tunables are per-policy, the CPUs associated with the other sets of governor tunables should not be updated. To address this issue, use the observation that in all of the places in question the update operation may be carried out in the same way (because all of the tunables involved are now located in struct dbs_data and readily available to the common code) and make the code in those places invoke the same (new) helper function that will carry out the update correctly. That new function always checks the ignore_nice_load tunable value and updates the CPUs' prev_cpu_nice data fields if that's set, which wasn't done by the original code in store_io_is_busy(), but it should have been done in there too. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:08 +01:00
Rafael J. Wysocki	76c5f66aa1	cpufreq: ondemand: Drop one more callback from struct od_ops The ->powersave_bias_init_cpu callback in struct od_ops is only used in one place and that invocation may be replaced with a direct call to the function pointed to by that callback, so change the code accordingly and drop the callback. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:07 +01:00
Rafael J. Wysocki	8434dadbb4	cpufreq: governor: Drop unused governor callback and data fields After some previous changes, the ->get_cpu_dbs_info_s governor callback and the "governor" field in struct dbs_governor (whose value represents the governor type) are not used any more, so drop them. Also drop the unused gov_ops field from struct dbs_governor. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:07 +01:00
Rafael J. Wysocki	702c9e542a	cpufreq: governor: Add a ->start callback for governors To avoid having to check the governor type explicitly in the common code in order to initialize data structures specific to the governor type properly, add a ->start callback to struct dbs_governor and use it to initialize those data structures for the ondemand and conservative governors. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:07 +01:00
Rafael J. Wysocki	8847e038c1	cpufreq: governor: Move io_is_busy to struct dbs_data The io_is_busy governor tunable is only used by the ondemand governor and is located in the ondemand-specific data structure, but it is looked at by the common governor code that has to do ugly things to get to that value, so move it to struct dbs_data and modify ondemand accordingly. Since the conservative governor never touches that field, it will be always 0 for that governor and it won't have any effect on the results of computations in that case. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:06 +01:00
Rafael J. Wysocki	574ef14d5d	cpufreq: governor: Close dbs_data update race condition It is possible for a dbs_data object to be updated after its usage counter has become 0. That may happen if governor_store() runs (via a govenor tunable sysfs attribute write) in parallel with cpufreq_governor_exit() called for the last cpufreq policy associated with the dbs_data in question. In that case, if governor_store() acquires dbs_data->mutex right after cpufreq_governor_exit() has released it, the ->store() callback invoked by it may operate on dbs_data with no users. Although sysfs will cause the kobject_put() in cpufreq_governor_exit() to block until governor_store() has returned, that situation may lead to some unexpected results, depending on the implementation of the ->store callback, and therefore it should be avoided. To that end, modify governor_store() to check the dbs_data's usage count before invoking the ->store() callback and return an error if it is 0 at that point. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:06 +01:00
Rafael J. Wysocki	8eb055d3f5	cpufreq: ondemand: Drop unused callback from struct od_ops The ->freq_increase callback in struct od_ops is never invoked, so drop it. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:06 +01:00
Rafael J. Wysocki	a7f35cffb9	cpufreq: ondemand: Simplify od_update() slightly Drop some lines of code from od_update() by arranging the statements in there in a more logical way. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:05 +01:00
Rafael J. Wysocki	07aa4402a0	cpufreq: governor: Use microseconds in sample delay computations Do not convert microseconds to jiffies and the other way around in governor computations related to the sampling rate and sample delay and drop delay_for_sampling_rate() which isn't of any use then. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:05 +01:00
Rafael J. Wysocki	6e96c5b3ac	cpufreq: ondemand: Simplify conditionals in od_dbs_timer() Reduce the indentation level in the conditionals in od_dbs_timer() and drop the delay variable from it. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:05 +01:00
Rafael J. Wysocki	57dc3bcd45	cpufreq: governor: Move rate_mult to struct policy_dbs The rate_mult field in struct od_cpu_dbs_info_s is used by the code shared with the conservative governor and to access it that code has to do an ugly governor type check. However, first of all it is ever only used for policy->cpu, so it is per-policy rather than per-CPU and second, it is initialized to 1 by cpufreq_governor_start(), so if the conservative governor never modifies it, it will have no effect on the results of any computations. For these reasons, move rate_mult to struct policy_dbs_info (as a common field). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:04 +01:00
Rafael J. Wysocki	78347cdb89	cpufreq: governor: Reset sample delay in store_sampling_rate() If store_sampling_rate() updates the sample delay when the ondemand governor is in the middle of its high/low dance (OD_SUB_SAMPLE sample type is set), the governor will still do the bottom half of the previous sample which may take too much time. To prevent that from happening, change store_sampling_rate() to always reset the sample delay to 0 which also is consistent with the new behavior of cpufreq_governor_limits(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:04 +01:00
Rafael J. Wysocki	4cccf75557	cpufreq: governor: Get rid of the ->gov_check_cpu callback The way the ->gov_check_cpu governor callback is used by the ondemand and conservative governors is not really straightforward. Namely, the governor calls dbs_check_cpu() that updates the load information for the policy and the invokes ->gov_check_cpu() for the governor. To get rid of that entanglement, notice that cpufreq_governor_limits() doesn't need to call dbs_check_cpu() directly. Instead, it can simply reset the sample delay to 0 which will cause a sample to be taken immediately. The result of that is practically equivalent to calling dbs_check_cpu() except that it will trigger a full update of governor internal state and not just the ->gov_check_cpu() part. Following that observation, make cpufreq_governor_limits() reset the sample delay and turn dbs_check_cpu() into a function that will simply evaluate the load and return the result called dbs_update(). That function can now be called by governors from the routines that previously were pointed to by ->gov_check_cpu and those routines can be called directly by each governor instead of dbs_check_cpu(). This way ->gov_check_cpu becomes unnecessary, so drop it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:04 +01:00
Rafael J. Wysocki	57eb832f90	cpufreq: governor: Clean up load-related computations Clean up some load-related computations in dbs_check_cpu() and cpufreq_governor_start() to get rid of unnecessary operations and type casts and make the code easier to read. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:03 +01:00
Rafael J. Wysocki	679b8fe43a	cpufreq: governor: Fix nice contribution computation in dbs_check_cpu() The contribution of the CPU nice time to the idle time in dbs_check_cpu() is computed in a bogus way, as the code may subtract current and previous nice values for different CPUs. That doesn't matter for cases when cpufreq policies are not shared, but may lead to problems otherwise. Fix the computation and simplify it to avoid taking unnecessary steps. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:03 +01:00
Rafael J. Wysocki	e4db2813d2	cpufreq: governor: Avoid atomic operations in hot paths Rework the handling of work items by dbs_update_util_handler() and dbs_work_handler() so the former (which is executed in scheduler paths) only uses atomic operations when absolutely necessary. That is, when the policy is shared and dbs_update_util_handler() has already decided that this is the time to queue up a work item. In particular, this avoids the atomic ops entirely on platforms where policy objects are never shared. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:03 +01:00
Rafael J. Wysocki	f62b93740c	cpufreq: governor: Simplify gov_cancel_work() slightly The atomic work counter incrementation in gov_cancel_work() is not necessary any more, because work items won't be queued up after gov_clear_update_util() anyway, so drop it along with the comment about how it may be missed by the gov_clear_update_util(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:41:02 +01:00
Rafael J. Wysocki	b9db42730a	cpufreq: governor: Avoid irq_work_queue_on() crash on non-SMP ARM As it turns out, irq_work_queue_on() will crash if invoked on non-SMP ARM platforms, but in fact it is not necessary to use that function in the cpufreq governor code (as it doesn't matter to that code which CPU will handle the irq_work), so change it to always use irq_work_queue(). Fixes: 8fb47ff100af (cpufreq: governor: Replace timers with utilization update callbacks) Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Reported-and-tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:41:02 +01:00
Viresh Kumar	a23d6d1809	cpufreq: ondemand: Rearrange od_dbs_timer() to avoid updating delay Avoid extra checks in od_dbs_timer() by rearranging updates to the local delay variable in it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:41:02 +01:00
Viresh Kumar	aded387b94	cpufreq: conservative: Update sample_delay_ns immediately The ondemand governor already updates sample_delay_ns immediately on updates to the sampling rate, but conservative doesn't do that. It was left out earlier as the code was really too complex to get that done easily. Things are sorted out very well now, however, and the conservative governor can be modified to follow ondemand in that respect. Moreover, since the code needed to implement that in the conservative governor would be identical to the corresponding ondemand governor's code, make that code common and change both governors to use it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:41:01 +01:00
Viresh Kumar	581c214b21	cpufreq: governor: No need to manage state machine now The cpufreq core now guarantees that policy->rwsem won't be dropped while running the ->governor callback for the CPUFREQ_GOV_POLICY_EXIT event and will be held acquired until the complete sequence of governor state changes has finished. This allows governor state machine checks to be dropped from multiple functions in cpufreq_governor.c. This also means that policy_dbs->policy can be initialized upfront, so the entire initialization of struct policy_dbs can be carried out in one place. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:41:01 +01:00
Viresh Kumar	99522fe678	cpufreq: Remove cpufreq_governor_lock We used to drop policy->rwsem just before calling __cpufreq_governor() in some cases earlier and so it was possible that __cpufreq_governor() ran concurrently via separate threads for the same policy. In order to guarantee valid state transitions for governors, 'governor_enabled' was required to be protected using some locking and cpufreq_governor_lock was added for that. But now __cpufreq_governor() is always called under policy->rwsem, and 'governor_enabled' is protected against races even without cpufreq_governor_lock. Get rid of the extra lock now. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> [ rjw : Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:41:01 +01:00
Viresh Kumar	49f18560f8	cpufreq: Call __cpufreq_governor() with policy->rwsem held The cpufreq core code is not consistent with respect to invoking __cpufreq_governor() under policy->rwsem. Changing all code to always hold policy->rwsem around __cpufreq_governor() invocations will allow us to remove cpufreq_governor_lock that is used today because we can't guarantee that __cpufreq_governor() isn't executed twice in parallel for the same policy. We should also ensure that policy->rwsem is held across governor state changes. For example, while adding a CPU to the policy in the CPU online path, we need to stop the governor, change policy->cpus, start the governor and then refresh its limits. The complete sequence must be guaranteed to complete without interruptions by concurrent governor state updates. That can be achieved by holding policy->rwsem around those sequences of operations. Also note that after this patch cpufreq_driver->stop_cpu() and ->exit() will get called under policy->rwsem which wasn't the case earlier. That shouldn't have any side effects, though. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:41:00 +01:00
Viresh Kumar	69cee7147b	cpufreq: Merge cpufreq_offline_prepare/finish routines Commit `1aee40ac9c` (cpufreq: Invoke __cpufreq_remove_dev_finish() after releasing cpu_hotplug.lock) split the cpufreq's CPU offline routine in two pieces, one of them to be run with CPU offline/online locked and the other to be called later. The reason for that split was a possible deadlock scenario involving cpufreq sysfs attributes and CPU offline. However, the handling of CPU offline in cpufreq has changed since then. Policy sysfs attributes are never removed during CPU offline, so there's no need to worry about accessing them during CPU offline, because that can't lead to any deadlocks now. Governor sysfs attributes are still removed in __cpufreq_governor(_EXIT), but there is a new kobject type for them now and its show/store callbacks don't lock CPU offline/online (they don't need to do that). This means that the CPU offline code in cpufreq doesn't need to be split any more, so combine cpufreq_offline_prepare() with cpufreq_offline_finish(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Changelog ] Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:41:00 +01:00
Viresh Kumar	c54df07184	cpufreq: governor: Create and traverse list of policy_dbs to avoid deadlock The dbs_data_mutex lock is currently used in two places. First, cpufreq_governor_dbs() uses it to guarantee mutual exclusion between invocations of governor operations from the core. Second, it is used by ondemand governor's update_sampling_rate() to ensure the stability of data structures walked by it. The second usage is quite problematic, because update_sampling_rate() is called from a governor sysfs attribute's ->store callback and that leads to a deadlock scenario involving cpufreq_governor_exit() which runs under dbs_data_mutex. Thus it is better to rework the code so update_sampling_rate() doesn't need to acquire dbs_data_mutex. To that end, rework update_sampling_rate() to walk a list of policy_dbs objects supported by the dbs_data one it has been called for (instead of walking cpu_dbs_info object for all CPUs). The list manipulation is protected with dbs_data->mutex which also is held around the execution of update_sampling_rate(), it is not necessary to hold dbs_data_mutex in that function any more. Reported-by: Juri Lelli <juri.lelli@arm.com> Reported-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:40:59 +01:00
Viresh Kumar	68e80dae09	Revert "cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT" Earlier, when the struct freq-attr was used to represent governor attributes, the standard cpufreq show/store sysfs attribute callbacks were applied to the governor tunable attributes and they always acquire the policy->rwsem lock before carrying out the operation. That could have resulted in an ABBA deadlock if governor tunable attributes are removed under policy->rwsem while one of them is being accessed concurrently (if sysfs attributes removal wins the race, it will wait for the access to complete with policy->rwsem held while the attribute callback will block on policy->rwsem indefinitely). We attempted to address this issue by dropping policy->rwsem around governor tunable attributes removal (that is, around invocations of the ->governor callback with the event arg equal to CPUFREQ_GOV_POLICY_EXIT) in cpufreq_set_policy(), but that opened up race conditions that had not been possible with policy->rwsem held all the time. The previous commit, "cpufreq: governor: New sysfs show/store callbacks for governor tunables", fixed the original ABBA deadlock by adding new governor specific show/store callbacks. We don't have to drop rwsem around invocations of governor event CPUFREQ_GOV_POLICY_EXIT anymore, and original fix can be reverted now. Fixes: `955ef48335` (cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT) Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reported-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:40:59 +01:00
Viresh Kumar	fd8ddc482a	cpufreq: governor: Drop unused macros for creating governor tunable attributes The previous commit introduced a new set of macros for creating sysfs attributes that represent governor tunables and the old macros used for this purpose are not needed any more, so drop them. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:40:58 +01:00
Viresh Kumar	c443563036	cpufreq: governor: New sysfs show/store callbacks for governor tunables The ondemand and conservative governors use the global-attr or freq-attr structures to represent sysfs attributes corresponding to their tunables (which of them is actually used depends on whether or not different policy objects can use the same governor with different tunables at the same time and, consequently, on where those attributes are located in sysfs). Unfortunately, in the freq-attr case, the standard cpufreq show/store sysfs attribute callbacks are applied to the governor tunable attributes and they always acquire the policy->rwsem lock before carrying out the operation. That may lead to an ABBA deadlock if governor tunable attributes are removed under policy->rwsem while one of them is being accessed concurrently (if sysfs attributes removal wins the race, it will wait for the access to complete with policy->rwsem held while the attribute callback will block on policy->rwsem indefinitely). We attempted to address this issue by dropping policy->rwsem around governor tunable attributes removal (that is, around invocations of the ->governor callback with the event arg equal to CPUFREQ_GOV_POLICY_EXIT) in cpufreq_set_policy(), but that opened up race conditions that had not been possible with policy->rwsem held all the time. Therefore policy->rwsem cannot be dropped in cpufreq_set_policy() at any point, but the deadlock situation described above must be avoided too. To that end, use the observation that in principle governor tunables may be represented by the same data type regardless of whether the governor is system-wide or per-policy and introduce a new structure, struct governor_attr, for representing them and new corresponding macros for creating show/store sysfs callbacks for them. Also make their parent kobject use a new kobject type whose default show/store callbacks are not related to the standard core cpufreq ones in any way (and they don't acquire policy->rwsem in particular). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> [ rjw: Subject & changelog + rebase ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:40:58 +01:00
Viresh Kumar	ff4b17895e	cpufreq: governor: Move common tunables to 'struct dbs_data' There are a few common tunables shared between the ondemand and conservative governors. Move them to struct dbs_data to simplify code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:40:58 +01:00
Viresh Kumar	d0684d3b89	cpufreq: governor: Create generic macro for common tunables Some tunables are present in governor-specific structures, whereas one (min_sampling_rate) is located directly in struct dbs_data. There is a special macro for creating its sysfs attribute and the show/store callbacks, but since more tunables are going to be moved to struct dbs_data, a new generic macro for such cases will be useful, so add it and use it for min_sampling_rate. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-09 14:40:57 +01:00
Rafael J. Wysocki	fafd5e8ab2	cpufreq: governor: Drop pointless goto from cpufreq_governor_init() It is silly to jump around "return 0", so don't do that. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:40:57 +01:00
Rafael J. Wysocki	686cc637c9	cpufreq: governor: Rename skip_work to work_count The skip_work field in struct policy_dbs_info technically is a counter, so give it a new name to reflect that. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:40:57 +01:00
Rafael J. Wysocki	cea6a9e772	cpufreq: governor: Symmetrize cpu_dbs_info initialization and cleanup Make the initialization of struct cpu_dbs_info objects in alloc_policy_dbs_info() and the code that cleans them up in free_policy_dbs_info() more symmetrical. In particular, set/clear the update_util.func field in those functions along with the policy_dbs field. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:40:56 +01:00
Rafael J. Wysocki	bc505475b8	cpufreq: governor: Rearrange governor data structures The struct policy_dbs_info objects representing per-policy governor data are not accessible directly from the corresponding policy objects. To access them, one has to get a pointer to the struct cpu_dbs_info of policy->cpu and use the policy_dbs field of that which isn't really straightforward. To address that rearrange the governor data structures so the governor_data pointer in struct cpufreq_policy will point to struct policy_dbs_info (instead of struct dbs_data) and that will contain a pointer to struct dbs_data. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:40:56 +01:00
Rafael J. Wysocki	e975189400	cpufreq: governor: Simplify cpufreq_governor_limits() Use the observation that cpufreq_governor_limits() doesn't have to get to the policy object it wants to manipulate by walking the reference chain cdbs->policy_dbs->policy, as the final pointer is actually equal to its argument, and make it access the policy object directy via its argument. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:40:56 +01:00
Rafael J. Wysocki	d10b5eb5fc	cpufreq: governor: Drop cpu argument from dbs_check_cpu() Since policy->cpu is always passed as the second argument to dbs_check_cpu(), it is not really necessary to pass it, because the function can obtain that value via its first argument just fine. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:40:55 +01:00
Rafael J. Wysocki	e40e7b255e	cpufreq: governor: Rename cpu_common_dbs_info to policy_dbs_info The struct cpu_common_dbs_info structure represents the per-policy part of the governor data (for the ondemand and conservative governors), but its name doesn't reflect its purpose. Rename it to struct policy_dbs_info and rename variables related to it accordingly. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:40:55 +01:00
Rafael J. Wysocki	ea59ee0dc9	cpufreq: governor: Drop the gov pointer from struct dbs_data Since it is possible to obtain a pointer to struct dbs_governor from a pointer to the struct governor embedded in it with the help of container_of(), the additional gov pointer in struct dbs_data isn't really necessary. Drop that pointer and make the code using it reach the dbs_governor object via policy->governor. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:40:55 +01:00
Rafael J. Wysocki	906a6e5aae	cpufreq: governor: Rework cpufreq_governor_dbs() Since it is possible to obtain a pointer to struct dbs_governor from a pointer to the struct governor embedded in it via container_of(), the second argument of cpufreq_governor_init() is not necessary. Accordingly, cpufreq_governor_dbs() doesn't need its second argument either and the ->governor callbacks for both the ondemand and conservative governors may be set to cpufreq_governor_dbs() directly. Make that happen. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Saravana Kannan <skannan@codeaurora.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:40:54 +01:00
Rafael J. Wysocki	7bdad34d08	cpufreq: governor: Rename some data types and variables The ondemand and conservative governors are represented by struct common_dbs_data whose name doesn't reflect the purpose it is used for, so rename it to struct dbs_governor and rename variables of that type accordingly. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:40:54 +01:00
Rafael J. Wysocki	af92618523	cpufreq: governor: Put governor structure into common_dbs_data For the ondemand and conservative governors (generally, governors that use the common code in cpufreq_governor.c), there are two static data structures representing the governor, the struct governor structure (the interface to the cpufreq core) and the struct common_dbs_data one (the interface to the cpufreq_governor.c code). There's no fundamental reason why those two structures have to be separate. Moreover, if the struct governor one is included into struct common_dbs_data, it will be possible to reach the latter from the policy via its policy->governor pointer, so it won't be necessary to pass a separate pointer to it around. For this reason, embed struct governor in struct common_dbs_data. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Saravana Kannan <skannan@codeaurora.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:40:54 +01:00
Rafael J. Wysocki	5da3dd1e00	cpufreq: governor: Avoid passing dbs_data pointers around unnecessarily Do not pass struct dbs_data pointers to the family of functions implementing governor operations in cpufreq_governor.c as they can take that pointer from policy->governor by themselves. The cpufreq_governor_init() case is slightly more complicated, since policy->governor may be NULL when it is invoked, but then it can reach the pointer in question via its cdata argument just fine. While at it, rework cpufreq_governor_dbs() to avoid a pointless policy_governor check in the CPUFREQ_GOV_POLICY_INIT case. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:40:53 +01:00
Rafael J. Wysocki	2bb8d94fb0	cpufreq: governor: Use common mutex for dbs_data protection Every governor relying on the common code in cpufreq_governor.c has to provide its own mutex in struct common_dbs_data. However, there actually is no need to have a separate mutex per governor for this purpose, they may be using the same global mutex just fine. Accordingly, introduce a single common mutex for that and drop the mutex field from struct common_dbs_data. That at least will ensure that the mutex is always present and initialized regardless of what the particular governors do. Another benefit is that the common code does not need a pointer to a governor-related structure to get to the mutex which sometimes helps. Finally, it makes the code generally easier to follow. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Saravana Kannan <skannan@codeaurora.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-03-09 14:40:53 +01:00
Rafael J. Wysocki	9be4fd2c77	cpufreq: governor: Replace timers with utilization update callbacks Instead of using a per-CPU deferrable timer for queuing up governor work items, register a utilization update callback that will be invoked from the scheduler on utilization changes. The sampling rate is still the same as what was used for the deferrable timers and the added irq_work overhead should be offset by the eliminated timers overhead, so in theory the functional impact of this patch should not be significant. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>	2016-03-09 14:40:53 +01:00
Rafael J. Wysocki	a4675fbc4a	cpufreq: intel_pstate: Replace timers with utilization update callbacks Instead of using a per-CPU deferrable timer for utilization sampling and P-states adjustments, register a utilization update callback that will be invoked from the scheduler on utilization changes. The sampling rate is still the same as what was used for the deferrable timers, so the functional impact of this patch should not be significant. Based on an earlier patch from Srinivas Pandruvada. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>	2016-03-09 14:40:52 +01:00
Rafael J. Wysocki	34e2c555f3	cpufreq: Add mechanism for registering utilization update callbacks Introduce a mechanism by which parts of the cpufreq subsystem ("setpolicy" drivers or the core) can register callbacks to be executed from cpufreq_update_util() which is invoked by the scheduler's update_load_avg() on CPU utilization changes. This allows the "setpolicy" drivers to dispense with their timers and do all of the computations they need and frequency/voltage adjustments in the update_load_avg() code path, among other things. The update_load_avg() changes were suggested by Peter Zijlstra. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@kernel.org>	2016-03-09 14:39:19 +01:00
Rafael J. Wysocki	ed757a2c7b	cpufreq: acpi-cpufreq: Make read and write operations more efficient Setting a new CPU frequency and reading the current request value in the ACPI cpufreq driver involves each at least two switch instructions (there's more if the policy is shared). One of them is present in drv_read/write() that prepares a command structure and the other happens in subsequent do_drv_read/write() when that structure is interpreted. However, all of those switches may be avoided by using function pointers. To that end, add two function pointers to struct acpi_cpufreq_data to represent read and write operations on the frequency register and set them up during policy intitialization to point to the pair of routines suitable for the given processor (Intel/AMD MSR access or I/O port access). Then, use those pointers in do_drv_read/write() and modify drv_read/write() to prepare the command structure for them without any checks. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-03 03:57:50 +01:00
Arnd Bergmann	3c2002aec3	cpufreq: mediatek: allow building as a module The MT8173 cpufreq driver can currently only be built-in, but it has a Kconfig dependency on the thermal core. THERMAL can be a loadable module, which in turn makes this driver impossible to build. It is nicer to make the cpufreq driver a module as well, so this patch turns the option in to a 'tristate' and adapts the dependency accordingly. The driver has no module_exit() function, so it will continue to not support unloading, but it can be built as a module and loaded at runtime now. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: `5269e7067c` (cpufreq: Add ARM_MT8173_CPUFREQ dependency on THERMAL) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-01 02:43:05 +01:00
Arnd Bergmann	ddd30ef474	cpufreq: qoriq: allow building as module with THERMAL=m My previous patch to avoid link errors with the qoriq cpufreq driver disallowed all of the broken cases, but also prevented the driver from being built when CONFIG_THERMAL is a module. This changes the dependency to allow the cpufreq driver to also be a module in this case, just not built-in. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: `8ae1702a0d` (cpufreq: qoriq: Register cooling device based on device tree) Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-03-01 02:43:05 +01:00
Arnd Bergmann	1c277cae14	This is the pxa changes for v4.6 cycle. This is a minor cycle with : - cleanup fixes from Arnd, mainly build oriented and sparse type ones - dma fixes for requestors above 32 (impacting mainly camera driver) - some minor cleanup on pxa3xx device-tree side -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJW0Y3SAAoJEAP2et0duMsSfesQAIGKccj/WRWtlCxje+8Aj3UW eu6vV9xGVBZu0RJxCchwCBP5ZO8WSvXBYvX5MIaBH2TFdJqJEjie6ES4mRlGVuup XMUZybcFOu9N5j0WMddj2OHhEeYGgMsSZwSOgK4nsQ/eyhrcI5fjykdafmPKywjW +/USV90ucVp+38C+yg4yXSuI8FOEABIn6VoX//+YpDZetvIoJCUQne1g+uPE2YoF dcPqVbY2OeAvuABkI3Wqdrc3Ico9i8Ns8erg8EuDe5xv2TvJXhn/mIeoVNZ2s4So 5aCD87RQu3rwDfqAv6FzW06k9AYAE/p/VKh0smI12D8MCxhSD0EZP+jBfDvuwx/n IICMH7YuunRSRe7VDFgWyIz7wduHQu3xctF9scSYZD+kBpvAD274sZs9WYs4fGd6 hXxxV4iXrP8af6A+sddDeo0Gq25jC7JoL43YlQTUzBMqbrzK4W3c0dXCib6hh/eN W/YVdPERGs9cTG/IZFbn3cn1QIYdA4exNoE38txWkKpeiVxu0tZKSsg6m9xCPu7+ vMQj1m8E8mkgiMq0BJnei02QC8Xw5ekf4wNUsLOOou29c7CTt3zicM8YtnDmANV2 fIIT4BK0izIZj4N0RZp9KT6h/IkF1VHRz3pcw3vYXJLbaKqHk/6doDUW30E/Ol1f PzJedKaWujhOV1DG1SLb =lBYC -----END PGP SIGNATURE----- Merge tag 'pxa-for-4.6' of https://github.com/rjarzmik/linux into next/soc Merge "pxa changes for v4.6 cycle" from Robert Jarzmik: This is a minor cycle with : - cleanup fixes from Arnd, mainly build oriented and sparse type ones - dma fixes for requestors above 32 (impacting mainly camera driver) - some minor cleanup on pxa3xx device-tree side * tag 'pxa-for-4.6' of https://github.com/rjarzmik/linux: dmaengine: pxa_dma: fix the maximum requestor line ARM: pxa: add the number of DMA requestor lines dmaengine: mmp-pdma: add number of requestors dma: mmp_pdma: Add the #dma-requests DT property documentation ARM: pxa: pxa3xx device-tree support cleanup ARM: pxa: don't select RFKILL if CONFIG_NET is disabled ARM: pxa: fix building without IWMMXT ARM: pxa: move extern declarations to pm.h ARM: pxa: always select one of the two CPU types ARM: pxa: don't select GPIO_SYSFS for MIOA701 ARM: pxa: mark unused eseries code as __maybe_unused ARM: pxa: mark spitz_card_pwr_ctrl as __maybe_unused ARM: pxa: define clock registers as __iomem	2016-03-01 00:24:43 +01:00
Shilpasri G Bhat	c5e29ea7ac	cpufreq: powernv: Fix bugs in powernv_cpufreq_{init/exit} Unregister the notifiers if cpufreq_driver_register() fails in powernv_cpufreq_init(). Re-arrange the unregistration and cleanup routines in powernv_cpufreq_exit() to free all the resources after the driver has unregistered. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-26 22:18:19 +01:00
Srinivas Pandruvada	f05c966585	cpufreq: intel_pstate: disable HWP notifications Disable HWP Interrupt notification before enabling HWP. Since we don't have HWP interrupt handling for possible performance interrupts, there is not much use of enabling HWP interrupts. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-26 22:15:38 +01:00
Srinivas Pandruvada	7791e4aa59	cpufreq: intel_pstate: Enable HWP by default If the processor supports HWP, enable it by default without checking for the cpu model. This will allow to enable HWP in all supported processors without driver change. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-26 22:15:38 +01:00
Rafael J. Wysocki	9a909a142f	cpufreq: acpi-cpufreq: Drop pointless label from acpi_cpufreq_target() The "out" label at the final return statement in acpi_cpufreq_target() is totally pointless, so drop them and modify the code to return the right values immediately instead of jumping to it. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-02-26 22:11:55 +01:00
Rafael J. Wysocki	6019d23a73	cpufreq: Rearrange __cpufreq_driver_target() Drop a pointless label at a return statement from __cpufreq_driver_target() and rearrange that function to reduce the indentation level. No intentional functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-02-26 22:11:55 +01:00
Olof Johansson	6997e172dc	Exynos-specific driver changes for v4.6: 1. Minor cleanup in s5pv210-cpufreq driver. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWwRh1AAoJEME3ZuaGi4PXyggP/RrBIvcSMAcGsmKZVYb0+osL IOy363iayW2D92QlyPOx5CSML9TenVwi5F+sNeKnvMevb6YxWR3HrixI6cDOmmnx IDEa6/cW8pmbDxea2an79sW4LRUnFuZA+UfdcLMFRFUF6HYKeU/G8zACncX2x+4H IcYnsNSuUAFn9twIWa3eLBLW46x+8L5YUAdrEd6HdfYhQ0+pK/KjBEcje1jY4nwM megjhwT6WevmRxzEp1vbGNVJIfFRy1CSAOm+40gmFp8YJLAEkybGnHgbFojNESF7 3BV0ncuKJsk0HylR3A2TCwBeaUbgcNgJoI90B7pgQUOBvL7St+x3PuXqR5dbV3xg TtC461pNV5Yw2F5S1ZkqVxr4u6ThNpBtao+AQjBTQVL1RuO3v/gmBFL03jd2xc8I Mn4FVQDTOZmEJIhpZsvk27ayJMkOF5Gtgd2T7rhlEdy8s8EuXjYOAlRvA9C2BgXx DxE1BUwnZsjY3ygq3BLO5vVuPjFlcYytOo1tsswzrsayI3UmJkbUWqThliXj2pXW t4OWirqX0q1nt5U/e3SsEgTyHBqcltwwcrhEt5CFTccAstQfdukgiOxSeMEvO5ie o9oseH+FP//5/DpuLOnkaRXJlLCuNw9ZTK4S2y/Gnfw0Qh8kwWNzuqBH+h2cQmZe eUzaSkpq6YFh6dCNxABJ =nkpz -----END PGP SIGNATURE----- Merge tag 'samsung-drivers-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into next/drivers Exynos-specific driver changes for v4.6: 1. Minor cleanup in s5pv210-cpufreq driver. * tag 'samsung-drivers-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: cpufreq: s5pv210: remove superfluous CONFIG_PM ifdefs Signed-off-by: Olof Johansson <olof@lixom.net>	2016-02-24 14:30:05 -08:00
Viresh Kumar	41cfd64cf4	intel_pstate: Update frequencies of policy->cpus only from ->set_policy() The intel-pstate driver is using intel_pstate_hwp_set() from two separate paths, i.e. ->set_policy() callback and sysfs update path for the files present in /sys/devices/system/cpu/intel_pstate/ directory. While an update to the sysfs path applies to all the CPUs being managed by the driver (which essentially means all the online CPUs), the update via the ->set_policy() callback applies to a smaller group of CPUs managed by the policy for which ->set_policy() is called. And so, intel_pstate_hwp_set() should update frequencies of only the CPUs that are part of policy->cpus mask, while it is called from ->set_policy() callback. In order to do that, add a parameter (cpumask) to intel_pstate_hwp_set() and apply the frequency changes only to the concerned CPUs. For ->set_policy() path, we are only concerned about policy->cpus, and so policy->rwsem lock taken by the core prior to calling ->set_policy() is enough to take care of any races. The larger lock acquired by get_online_cpus() is required only for the updates to sysfs files. Add another routine, intel_pstate_hwp_set_online_cpus(), and call it from the sysfs update paths. This also fixes a lockdep reported recently, where policy->rwsem and get_online_cpus() could have been acquired in any order causing an ABBA deadlock. The sequence of events leading to that was: intel_pstate_init(...) ...cpufreq_online(...) down_write(&policy->rwsem); // Locks policy->rwsem ... cpufreq_init_policy(policy); ...intel_pstate_hwp_set(); get_online_cpus(); // Temporarily locks cpu_hotplug.lock ... up_write(&policy->rwsem); pm_suspend(...) ...disable_nonboot_cpus() _cpu_down() cpu_hotplug_begin(); // Locks cpu_hotplug.lock __cpu_notify(CPU_DOWN_PREPARE, ...); ...cpufreq_offline_prepare(); down_write(&policy->rwsem); // Locks policy->rwsem Reported-and-tested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-23 01:10:44 +01:00
Eric Biggers	fd7dc7e6b6	cpufreq: simplify for_each_suitable_policy() macro Signed-off-by: Eric Biggers <ebiggers3@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-22 13:56:41 +01:00
Eric Biggers	63af405572	cpufreq: fix comment about return value of cpufreq_register_driver() The comment has been incorrect since commit `4dea5806d3` ("cpufreq: return EEXIST instead of EBUSY for second registering"). Signed-off-by: Eric Biggers <ebiggers3@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-22 13:56:41 +01:00
Rafael J. Wysocki	6541aef01a	cpufreq: Drop unnecessary checks from show() and store() The show() and store() routines in the cpufreq core don't need to check if the struct freq_attr they want to use really provides the callbacks they need as expected (if that's not the case, it means a bug in the code anyway), so change them to avoid doing that. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-02-12 23:56:21 +01:00
Viresh Kumar	dd02a3d920	cpufreq: dt: No need to allocate resources anymore OPP layer manages it now and cpufreq-dt driver doesn't need it. But, we still need to check for availability of resources for deferred probing. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-11 00:24:37 +01:00
Viresh Kumar	df2c8ec28e	cpufreq: dt: No need to fetch voltage-tolerance Its already done by core and we don't need to get it anymore. And so, we don't need to get of node in cpufreq_init() anymore, move that to find_supply_name() instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-11 00:24:37 +01:00
Viresh Kumar	78c3ba5df9	cpufreq: dt: Use dev_pm_opp_set_rate() to switch frequency OPP core supports frequency/voltage changes based on the target frequency now, use that instead of open coding the same in cpufreq-dt driver. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-11 00:24:37 +01:00
Viresh Kumar	755b888ff0	cpufreq: dt: Reuse dev_pm_opp_get_max_transition_latency() OPP layer has all the information now to calculate transition latency (clock_latency + voltage_latency). Lets reuse the OPP layer helper dev_pm_opp_get_max_transition_latency() instead of open coding the same in cpufreq-dt driver. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-11 00:24:37 +01:00
Viresh Kumar	6def6ea75e	cpufreq: dt: Unsupported OPPs are already disabled The core already have a valid regulator set for the device opp and the unsupported OPPs are already disabled by the core. There is no need to repeat that in the user drivers, get rid of it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-11 00:24:37 +01:00
Viresh Kumar	050794aaeb	cpufreq: dt: Pass regulator name to the OPP core OPP core can handle the regulators by itself, and but it needs to know the name of the regulator to fetch. Add support for that. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-11 00:24:36 +01:00
Viresh Kumar	391d9aef81	cpufreq: dt: OPP layers handles clock-latency for V1 bindings as well "clock-latency" is handled by OPP layer for all bindings and so there is no need to make special calls for V1 bindings. Use dev_pm_opp_get_max_clock_latency() for both the cases. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-11 00:24:36 +01:00
Viresh Kumar	457e99e60a	cpufreq: dt: Rename 'need_update' to 'opp_v1' That's the real purpose of this field, i.e. to take special care of old OPP V1 bindings. Lets name it accordingly, so that it can be used elsewhere. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-11 00:24:36 +01:00
Viresh Kumar	896d6a4c0f	cpufreq: dt: Convert few pr_debug/err() calls to dev_dbg/err() We have the device structure available now, lets use it for better print messages. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-11 00:24:36 +01:00
Shilpasri G Bhat	c89f2682a3	cpufreq: powernv: Replace pr_info with trace print for throttle event Currently we use printk message to notify the throttle event. But this can flood the console if the cpu is throttled frequently. So replace the printk with the tracepoint to notify the throttle event. And also events like throttle below nominal frequency and OCC_RESET are reduced to pr_warn/pr_warn_once as pointed by MFG to not mark them as critical messages. This patch adds 'throttle_reason' to struct chip to store the throttle reason. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-05 02:38:03 +01:00
Shilpasri G Bhat	96c4726f01	cpufreq: powernv: Remove cpu_to_chip_id() from hot-path cpu_to_chip_id() does a DT walk through to find out the chip id by taking a contended device tree lock. This adds an unnecessary overhead in a hot path. So instead of calling cpu_to_chip_id() everytime cache the chip ids for all cores in the array 'core_to_chip_map' and use it in the hotpath. Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-05 02:38:02 +01:00
Shilpasri G Bhat	6d167a44e6	cpufreq: powernv: Hot-plug safe the kworker thread In the kworker_thread powernv_cpufreq_work_fn(), we can end up sending an IPI to a cpu going offline. This is a rare corner case which is fixed using {get/put}_online_cpus(). Along with this fix, this patch adds changes to do oneshot cpumask_{clear/and} operation. Suggested-by: Shreyas B Prabhu <shreyas@linux.vnet.ibm.com> Suggested-by: Gautham R Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-05 02:38:02 +01:00
Shilpasri G Bhat	86622cb8c5	cpufreq: powernv: Free 'chips' on module exit This will free the dynamically allocated memory of 'chips' on module exit. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-02-05 02:38:01 +01:00
Rafael J. Wysocki	de1df26b7c	cpufreq: Clean up default and fallback governor setup The preprocessor magic used for setting the default cpufreq governor (and for using the performance governor as a fallback one for that matter) is really nasty, so replace it with __weak functions and overrides. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Saravana Kannan <skannan@codeaurora.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-02-05 02:37:42 +01:00
Arnd Bergmann	ea7743e271	ARM: pxa: define clock registers as __iomem We should not dereference registers as pointers, so use readl/writel instead for these registers. The clock registers are accessed in multiple files, so we have to change them all at once. I stumbled over these registers while looking at something unrelated. There are in fact other registers with the same problem, but I did not try to address those at this point. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>	2016-02-01 21:43:41 +01:00
Rafael J. Wysocki	ad1ac94767	Merge branches 'pm-cpuidle', 'pm-cpufreq', 'pm-domains' and 'pm-sleep' * pm-cpuidle: cpuidle: coupled: remove unused define cpuidle_coupled_lock cpuidle: fix fallback mechanism for suspend to idle in absence of enter_freeze * pm-cpufreq: cpufreq: cpufreq-dt: avoid uninitialized variable warnings: cpufreq: pxa2xx: fix pxa_cpufreq_change_voltage prototype cpufreq: Use list_is_last() to check last entry of the policy list cpufreq: Fix NULL reference crash while accessing policy->governor_data * pm-domains: PM / Domains: Fix typo in comment PM / Domains: Fix potential deadlock while adding/removing subdomains PM / domains: fix lockdep issue for all subdomains * pm-sleep: PM: APM_EMULATION does not depend on PM	2016-01-29 21:45:17 +01:00
Arnd Bergmann	b331bc20d9	cpufreq: cpufreq-dt: avoid uninitialized variable warnings: gcc warns quite a bit about values returned from allocate_resources() in cpufreq-dt.c: cpufreq-dt.c: In function 'cpufreq_init': cpufreq-dt.c:327:6: error: 'cpu_dev' may be used uninitialized in this function [-Werror=maybe-uninitialized] cpufreq-dt.c:197:17: note: 'cpu_dev' was declared here cpufreq-dt.c:376:2: error: 'cpu_clk' may be used uninitialized in this function [-Werror=maybe-uninitialized] cpufreq-dt.c:199:14: note: 'cpu_clk' was declared here cpufreq-dt.c: In function 'dt_cpufreq_probe': cpufreq-dt.c:461:2: error: 'cpu_clk' may be used uninitialized in this function [-Werror=maybe-uninitialized] cpufreq-dt.c:447:14: note: 'cpu_clk' was declared here The problem is that it's slightly hard for gcc to follow return codes across PTR_ERR() calls. This patch uses explicit assignments to the "ret" variable to make it easier for gcc to verify that the code is actually correct, without the need to add a bogus initialization. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-01-27 23:23:54 +01:00
Arnd Bergmann	fb2a24a1c6	cpufreq: pxa2xx: fix pxa_cpufreq_change_voltage prototype There are two definitions of pxa_cpufreq_change_voltage, with slightly different prototypes after one of them had its argument marked 'const'. Now the other one (for !CONFIG_REGULATOR) produces a harmless warning: drivers/cpufreq/pxa2xx-cpufreq.c: In function 'pxa_set_target': drivers/cpufreq/pxa2xx-cpufreq.c:291:36: warning: passing argument 1 of 'pxa_cpufreq_change_voltage' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers] ret = pxa_cpufreq_change_voltage(&pxa_freq_settings[idx]); ^ drivers/cpufreq/pxa2xx-cpufreq.c:205:12: note: expected 'struct pxa_freqs ' but argument is of type 'const struct pxa_freqs ' static int pxa_cpufreq_change_voltage(struct pxa_freqs *pxa_freq) ^ This changes the prototype in the same way as the other, which avoids the warning. Fixes: `03c2299063` (cpufreq: pxa: make pxa_freqs arrays const) Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.2+ <stable@vger.kernel.org> # 4.2+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-01-27 23:22:50 +01:00
Gautham R Shenoy	2dadfd7564	cpufreq: Use list_is_last() to check last entry of the policy list Currently next_policy() explicitly checks if a policy is the last policy in the cpufreq_policy_list. Use the standard list_is_last primitive instead. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-01-27 23:13:59 +01:00
Viresh Kumar	e4b133cc4b	cpufreq: Fix NULL reference crash while accessing policy->governor_data There is a race discovered by Juri, where we are able to: - create and read a sysfs file before policy->governor_data is being set to a non NULL value. OR - set policy->governor_data to NULL, and reading a file before being destroyed. And so such a crash is reported: Unable to handle kernel NULL pointer dereference at virtual address 0000000c pgd = edfc8000 [0000000c] pgd=bfc8c835 Internal error: Oops: 17 [#1] SMP ARM Modules linked in: CPU: 4 PID: 1730 Comm: cat Not tainted 4.5.0-rc1+ #463 Hardware name: ARM-Versatile Express task: ee8e8480 ti: ee930000 task.ti: ee930000 PC is at show_ignore_nice_load_gov_pol+0x24/0x34 LR is at show+0x4c/0x60 pc : [<c058f1bc>] lr : [<c058ae88>] psr: a0070013 sp : ee931dd0 ip : ee931de0 fp : ee931ddc r10: ee4bc290 r9 : 00001000 r8 : ef2cb000 r7 : ee4bc200 r6 : ef2cb000 r5 : c0af57b0 r4 : ee4bc2e0 r3 : 00000000 r2 : 00000000 r1 : c0928df4 r0 : ef2cb000 Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c5387d Table: adfc806a DAC: 00000051 Process cat (pid: 1730, stack limit = 0xee930210) Stack: (0xee931dd0 to 0xee932000) 1dc0: ee931dfc ee931de0 c058ae88 c058f1a4 1de0: edce3bc0 c07bfca4 edce3ac0 00001000 ee931e24 ee931e00 c01fcb90 c058ae48 1e00: 00000001 edce3bc0 00000000 00000001 ee931e50 ee8ff480 ee931e34 ee931e28 1e20: c01fb33c c01fcb0c ee931e8c ee931e38 c01a5210 c01fb314 ee931e9c ee931e48 1e40: 00000000 edce3bf0 befe4a00 ee931f78 00000000 00000000 000001e4 00000000 1e60: c00545a8 edce3ac0 00001000 00001000 befe4a00 ee931f78 00000000 00001000 1e80: ee931ed4 ee931e90 c01fbed8 c01a5038 ed085a58 00020000 00000000 00000000 1ea0: c0ad72e4 ee931f78 ee8ff488 ee8ff480 c077f3fc 00001000 befe4a00 ee931f78 1ec0: 00000000 00001000 ee931f44 ee931ed8 c017c328 c01fbdc4 00001000 00000000 1ee0: ee8ff480 00001000 ee931f44 ee931ef8 c017c65c c03deb10 ee931fac ee931f08 1f00: c0009270 c001f290 c0a8d968 ef2cb000 ef2cb000 ee8ff480 00000020 ee8ff480 1f20: ee8ff480 befe4a00 00001000 ee931f78 00000000 00000000 ee931f74 ee931f48 1f40: c017d1ec c017c2f8 c019c724 c019c684 ee8ff480 ee8ff480 00001000 befe4a00 1f60: 00000000 00000000 ee931fa4 ee931f78 c017d2a8 c017d160 00000000 00000000 1f80: 000a9f20 00001000 befe4a00 00000003 c000ffe4 ee930000 00000000 ee931fa8 1fa0: c000fe40 c017d264 000a9f20 00001000 00000003 befe4a00 00001000 00000000 Unable to handle kernel NULL pointer dereference at virtual address 0000000c 1fc0: 000a9f20 00001000 befe4a00 00000003 00000000 00000000 00000003 00000001 pgd = edfc4000 [0000000c] pgd=bfcac835 1fe0: 00000000 befe49dc 000197f8 b6e35dfc 60070010 00000003 3065b49d 134ac2c9 [<c058f1bc>] (show_ignore_nice_load_gov_pol) from [<c058ae88>] (show+0x4c/0x60) [<c058ae88>] (show) from [<c01fcb90>] (sysfs_kf_seq_show+0x90/0xfc) [<c01fcb90>] (sysfs_kf_seq_show) from [<c01fb33c>] (kernfs_seq_show+0x34/0x38) [<c01fb33c>] (kernfs_seq_show) from [<c01a5210>] (seq_read+0x1e4/0x4e4) [<c01a5210>] (seq_read) from [<c01fbed8>] (kernfs_fop_read+0x120/0x1a0) [<c01fbed8>] (kernfs_fop_read) from [<c017c328>] (__vfs_read+0x3c/0xe0) [<c017c328>] (__vfs_read) from [<c017d1ec>] (vfs_read+0x98/0x104) [<c017d1ec>] (vfs_read) from [<c017d2a8>] (SyS_read+0x50/0x90) [<c017d2a8>] (SyS_read) from [<c000fe40>] (ret_fast_syscall+0x0/0x1c) Code: e5903044 e1a00001 e3081df4 e34c1092 (e593300c) ---[ end trace 5994b9a5111f35ee ]--- Fix that by making sure, policy->governor_data is updated at the right places only. Cc: 4.2+ <stable@vger.kernel.org> # 4.2+ Reported-and-tested-by: Juri Lelli <juri.lelli@arm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-01-27 23:13:53 +01:00
Bartlomiej Zolnierkiewicz	9b3f105ef6	cpufreq: s5pv210: remove superfluous CONFIG_PM ifdefs CONFIG_PM ifdefs are superfluous and can be removed. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Kukjin Kim <kgene@kernel.org> Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>	2016-01-25 15:13:53 +09:00
Linus Torvalds	f689b742f2	powerpc updates for 4.5 - Ground work for the new Power9 MMU from Aneesh Kumar K.V - Optimise FP/VMX/VSX context switching from Anton Blanchard - Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica Gupta, Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling, Andrew Donnellan - Allow wrapper to work on non-english system from Laurent Vivier - Add rN aliases to the pt_regs_offset table from Rashmica Gupta - Fix module autoload for rackmeter & axonram drivers from Luis de Bethencourt - Include KVM guest test in all interrupt vectors from Paul Mackerras - Fix DSCR inheritance over fork() from Anton Blanchard - Make value-returning atomics & {cmp}xchg* & their atomic_ versions fully ordered from Boqun Feng - Print MSR TM bits in oops messages from Michael Neuling - Add TM signal return & invalid stack selftests from Michael Neuling - Limit EPOW reset event warnings from Vipin K Parashar - Remove the Cell QPACE code from Rashmica Gupta - Append linux_banner to exception information in xmon from Rashmica Gupta - Add selftest to check if VSRs are corrupted from Rashmica Gupta - Remove broken GregorianDay() from Daniel Axtens - Import Anton's context_switch2 benchmark into selftests from Michael Ellerman - Add selftest script to test HMI functionality from Daniel Axtens - Remove obsolete OPAL v2 support from Stewart Smith - Make enter_rtas() private from Michael Ellerman - PPR exception cleanups from Michael Ellerman - Add page soft dirty tracking from Laurent Dufour - Add support for Nvlink NPUs from Alistair Popple - Add support for kexec on 476fpe from Alistair Popple - Enable kernel CPU dlpar from sysfs from Nathan Fontenot - Copy only required pieces of the mm_context_t to the paca from Michael Neuling - Add a kmsg_dumper that flushes OPAL console output on panic from Russell Currey - Implement save_stack_trace_regs() to enable kprobe stack tracing from Steven Rostedt - Add HWCAP bits for Power9 from Michael Ellerman - Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V - Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins - scripts/recordmcount.pl: support data in text section on powerpc from Ulrich Weigand - Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand - cxl: Fix possible idr warning when contexts are released from Vaibhav Jain - cxl: use correct operator when writing pcie config space values from Andrew Donnellan - cxl: Fix DSI misses when the context owning task exits from Vaibhav Jain - cxl: fix build for GCC 4.6.x from Brian Norris - cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris - cxl: Enable PCI device ID for future IBM CXL adapter from Uma Krishnan - Freescale updates from Scott: Highlights include moving QE code out of arch/powerpc (to be shared with arm), device tree updates, and minor fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWmIxeAAoJEFHr6jzI4aWAA+cQAIXAw4WfVWJ2V4ZK+1eKfB57 fdXG71PuXG+WYIWy71ly8keLHdzzD1NQ2OUB64bUVRq202nRgVc15ZYKRJ/FE/sP SkxaQ2AG/2kI2EflWshOi0Lu9qaZ+LMHJnszIqE/9lnGSB2kUI/cwsSXgziiMKXR XNci9v14SdDd40YV/6BSZXoxApwyq9cUbZ7rnzFLmz4hrFuKmB/L3LABDF8QcpH7 sGt/YaHGOtqP0UX7h5KQTFLGe1OPvK6NWixSXeZKQ71ED6cho1iKUEOtBA9EZeIN QM5JdHFWgX8MMRA0OHAgidkSiqO38BXjmjkVYWoIbYz7Zax3ThmrDHB4IpFwWnk3 l7WBykEXY7KEqpZzbh0GFGehZWzVZvLnNgDdvpmpk/GkPzeYKomBj7ZZfm3H1yGD BTHPwuWCTX+/K75yEVNO8aJO12wBg7DRl4IEwBgqhwU8ga4FvUOCJkm+SCxA1Dnn qlpS7qPwTXNIEfKMJcxp5X0KiwDY1EoOotd4glTN0jbeY5GEYcxe+7RQ302GrYxP zcc8EGLn8h6BtQvV3ypNHF5l6QeTW/0ZlO9c236tIuUQ5gQU39SQci7jQKsYjSzv BB1XdLHkbtIvYDkmbnr1elbeJCDbrWL9rAXRUTRyfuCzaFWTfZmfVNe8c8qwDMLk TUxMR/38aI7bLcIQjwj9 =R5bX -----END PGP SIGNATURE----- Merge tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Core: - Ground work for the new Power9 MMU from Aneesh Kumar K.V - Optimise FP/VMX/VSX context switching from Anton Blanchard Misc: - Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica Gupta, Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling, Andrew Donnellan - Allow wrapper to work on non-english system from Laurent Vivier - Add rN aliases to the pt_regs_offset table from Rashmica Gupta - Fix module autoload for rackmeter & axonram drivers from Luis de Bethencourt - Include KVM guest test in all interrupt vectors from Paul Mackerras - Fix DSCR inheritance over fork() from Anton Blanchard - Make value-returning atomics & {cmp}xchg* & their atomic_ versions fully ordered from Boqun Feng - Print MSR TM bits in oops messages from Michael Neuling - Add TM signal return & invalid stack selftests from Michael Neuling - Limit EPOW reset event warnings from Vipin K Parashar - Remove the Cell QPACE code from Rashmica Gupta - Append linux_banner to exception information in xmon from Rashmica Gupta - Add selftest to check if VSRs are corrupted from Rashmica Gupta - Remove broken GregorianDay() from Daniel Axtens - Import Anton's context_switch2 benchmark into selftests from Michael Ellerman - Add selftest script to test HMI functionality from Daniel Axtens - Remove obsolete OPAL v2 support from Stewart Smith - Make enter_rtas() private from Michael Ellerman - PPR exception cleanups from Michael Ellerman - Add page soft dirty tracking from Laurent Dufour - Add support for Nvlink NPUs from Alistair Popple - Add support for kexec on 476fpe from Alistair Popple - Enable kernel CPU dlpar from sysfs from Nathan Fontenot - Copy only required pieces of the mm_context_t to the paca from Michael Neuling - Add a kmsg_dumper that flushes OPAL console output on panic from Russell Currey - Implement save_stack_trace_regs() to enable kprobe stack tracing from Steven Rostedt - Add HWCAP bits for Power9 from Michael Ellerman - Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V - Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins - scripts/recordmcount.pl: support data in text section on powerpc from Ulrich Weigand - Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand cxl: - cxl: Fix possible idr warning when contexts are released from Vaibhav Jain - cxl: use correct operator when writing pcie config space values from Andrew Donnellan - cxl: Fix DSI misses when the context owning task exits from Vaibhav Jain - cxl: fix build for GCC 4.6.x from Brian Norris - cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris - cxl: Enable PCI device ID for future IBM CXL adapter from Uma Krishnan Freescale: - Freescale updates from Scott: Highlights include moving QE code out of arch/powerpc (to be shared with arm), device tree updates, and minor fixes" * tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (149 commits) powerpc/module: Handle R_PPC64_ENTRY relocations scripts/recordmcount.pl: support data in text section on powerpc powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff powerpc/mm: Fix _PAGE_PTE breaking swapoff cxl: Enable PCI device ID for future IBM CXL adapter cxl: use -Werror only with CONFIG_PPC_WERROR cxl: fix build for GCC 4.6.x powerpc: Add HWCAP bits for Power9 powerpc/powernv: Reserve PE#0 on NPU powerpc/powernv: Change NPU PE# assignment powerpc/powernv: Fix update of NVLink DMA mask powerpc/powernv: Remove misleading comment in pci.c powerpc: Implement save_stack_trace_regs() to enable kprobe stack tracing powerpc: Fix build break due to paca mm_context_t changes cxl: Fix DSI misses when the context owning task exits MAINTAINERS: Update Scott Wood's e-mail address powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery() powerpc: Fix style of self-test config prompts powerpc/powernv: Only delay opal_rtc_read() retry when necessary ...	2016-01-15 13:18:47 -08:00
Andrzej Hajda	929ca89c30	cpufreq-dt: fix handling regulator_get_voltage() result The function can return negative values so it should be assigned to signed type. The problem has been detected using proposed semantic patch scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci. Link: http://permalink.gmane.org/gmane.linux.kernel/2038576 Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-01-05 02:01:38 +01:00
Chen Yu	0df35026c6	cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC It is reported that, with CONFIG_HZ_PERIODIC=y cpu stays at the lowest frequency even if the usage goes to 100%, neither ondemand nor conservative governor works, however performance and userspace work as expected. If set with CONFIG_NO_HZ_FULL=y, everything goes well. This problem is caused by improper calculation of the idle_time when the load is extremely high(near 100%). Firstly, cpufreq_governor uses get_cpu_idle_time to get the total idle time for specific cpu, then: 1.If the system is configured with CONFIG_NO_HZ_FULL, the idle time is returned by ktime_get, which is always increasing, it's OK. 2.However, if the system is configured with CONFIG_HZ_PERIODIC, get_cpu_idle_time might not guarantee to be always increasing, because it will leverage get_cpu_idle_time_jiffy to calculate the idle_time, consider the following scenario: At T1: idle_tick_1 = total_tick_1 - user_tick_1 sample period(80ms)... At T2: ( T2 = T1 + 80ms): idle_tick_2 = total_tick_2 - user_tick_2 Currently the algorithm is using (idle_tick_2 - idle_tick_1) to get the delta idle_time during the past sample period, however it CAN NOT guarantee that idle_tick_2 >= idle_tick_1, especially when cpu load is high. (Yes, total_tick_2 >= total_tick_1, and user_tick_2 >= user_tick_1, but how about idle_tick_2 and idle_tick_1? No guarantee.) So governor might get a negative value of idle_time during the past sample period, which might mislead the system that the idle time is very big(converted to unsigned int), and the busy time is nearly zero, which causes the governor to always choose the lowest cpufreq, then cause this problem. In theory there are two solutions: 1.The logic should not rely on the idle tick during every sample period, but be based on the busy tick directly, as this is how 'top' is implemented. 2.Or the logic must make sure that the idle_time is strictly increasing during each sample period, then there would be no negative idle_time anymore. This solution requires minimum modification to current code and this patch uses method 2. Link: https://bugzilla.kernel.org/show_bug.cgi?id=69821 Reported-by: Jan Fikar <j.fikar@gmail.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-01-05 02:01:32 +01:00
Pi-Cheng Chen	a889331d75	cpufreq: mt8173: migrate to use operating-points-v2 bindings Modify mt8173-cpufreq driver to get OPP-sharing information and set up OPP table provided by operating-points-v2 bindings. Signed-off-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-01-02 00:49:33 +01:00
Rafael J. Wysocki	7a6c79f2fe	cpufreq: Simplify core code related to boost support Notice that the boost_supported field in struct cpufreq_driver is redundant, because the driver's ->set_boost callback may be left unset if "boost" is not supported. Moreover, the only driver populating the ->set_boost callback is acpi_cpufreq, so make it avoid populating that callback if "boost" is not supported, rework the core to check ->set_boost instead of boost_supported to verify "boost" support and drop boost_supported which isn't used any more. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-01-01 03:49:51 +01:00
Rafael J. Wysocki	17135782b8	cpufreq: acpi-cpufreq: Simplify boost-related code The store_boost() routine is only used by store_cpb(), so move the code from it directly to that function and rename _store_boost() to set_boost() to make its name reflect the name of the driver callback pointing to it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-01-01 03:49:51 +01:00
Rafael J. Wysocki	41669da030	cpufreq: Make cpufreq_boost_supported() static cpufreq_boost_supported() is not used outside of cpufreq.c, so make it static. While at it, refactor it as a one-liner (which it really is). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-01-01 03:49:51 +01:00
Markus Elfring	d23f8cadf9	blackfin-cpufreq: Mark cpu_set_cclk() as static The cpu_set_cclk() function was only used in a single source file so far. Indicate this setting also by the corresponding linkage specifier. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-28 01:51:36 +01:00
Markus Elfring	a5810b4f07	blackfin-cpufreq: Change return type of cpu_set_cclk() to that of clk_set_rate() The return type "unsigned long" was used by the cpu_set_cclk() function while the type "int" is provided by the clk_set_rate() function. Let us make this usage consistent. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-28 01:51:35 +01:00
Rafael J. Wysocki	9ad55cd9e6	Merge back earlier cpufreq material for v4.5.	2015-12-28 01:34:35 +01:00
Dan Carpenter	a7def561c2	cpufreq: scpi-cpufreq: signedness bug in scpi_get_dvfs_info() The "domain" variable needs to be signed for the error handling to work. Fixes: `8def31034d` (cpufreq: arm_big_little: add SCPI interface driver) Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-24 02:11:37 +01:00
Rafael J. Wysocki	4157c2fc84	Merge back earlier cpufreq material for v4.5.	2015-12-21 03:15:15 +01:00
Stewart Smith	e4d54f71d2	powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPAL Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is just OPALv3, with nobody ever expecting anything on pre-OPALv3 to be cared about or supported by mainline kernels. So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL exclusively. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:54 +11:00
Rafael J. Wysocki	f1b9fc591e	Merge branches 'powercap', 'pm-cpufreq' and 'pm-domains' * powercap: powercap / RAPL: fix BIOS lock check * pm-cpufreq: cpufreq: intel_pstate: Minor cleanup for FRAC_BITS cpufreq: tegra: add regulator dependency for T124 * pm-domains: PM / Domains: Allow runtime PM callbacks to be re-used during system PM	2015-12-14 22:58:57 +01:00
Linus Torvalds	097b285d32	ARM: SoC fixes for 4.4-rc Here are a bunch of small bug fixes for various ARM platforms, nothing really sticks out this week, most of either fixes bugs in code that was just added in 4.4, or that has been broken for many years without anyone noticing. at91/sama5d2 - fix sama5de hardware setup of sd/mmc interface - proper selection of pinctrl drivers. PIO4 is necessary for sama5d2 berlin - fix incorrect clock input for SDIO exynos - Fix potential NULL pointer dereference in Exynos PMU driver. imx - Fix vf610 SAI clock configuration bug which is discovered by the newly added master mode support in SAI audio driver. - Fix buggy L2 cache latency values in vf610 device trees, which may cause system hang when cpu runs at a higher frequency. ixp4xx - fix prototypes for readl/writel functions ls2080a - use little-endian register access for GPIO and SDHCI omap - Fix clock source for ARM TWD and global timers on am437x - Always select REGULATOR_FIXED_VOLTAGE for omap2+ instead of when MACH_OMAP3_PANDORA is selected - Fix SPI DMA handles for dm816x as only some were mapped - Fix up mbox cells for dm816x to make mailbox usable pxa - use PWM lookup table for all ezx machines s3c24xx - Remove incorrect __init annotation from s3c24xx cpufreq driver structures. versatile - fix PCI IRQ mapping on Versatile PB -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIVAwUAVmyQMWCrR//JCVInAQIIDA//VyJ2UoTJ2JC3thVP56P/ZXh7Pz8VDqnq cgoFUio27IeHPSgs+W9qWliOrb+LaXkuOl8CKgepm+Bv7j8Y+uryP4X2rKQ3ZRmy 2f5+uUqAIZ0Co2aJdtG395lY9TKNHl6cPEskcbgL7cjdgj7QBqfIyj22QZbj6yRp kp8pj+cKXBFRLa5PvePon2w03MA/bLaP30VzKCSL1zchcs52rxekU694V3ISNa63 eshyyKf354Sl9hP4Y8xCdl/mboymKzQxEGDQS/Fcb8h/OQ3djoh+7EKdVbdyZ2A7 phgfazd2aE7wQ5GVIkMNV/MzGHj9xpiD4Z1Hi/2E8WdzuXJTRicS4bJihRAIualt H1FOEdgqT+xS4JUYxAvl46fwwqcFJfixtGgKka27sJTtk+Y1kHjASWvueZKlHMIK ln9CF7PoecF0InQaY2N8Vy05Qcp5MuoB/0v+XlftI0sAtIXNeo142H2NQZCsO+1U bJDyb5E4z06jzqk7IOK4/AKyEAV9KZPDws+ZxcNH/faPT10epK7MeZdetbD7b8q3 pkY7s5iXV8uBox7FtHoamrlMFgAzN9Qh0E4bcw70aKaJZZ02ozTXCvJIKjoIPMne FsvidQToznqbA2RSXpxRQrcXrMxvURaPCRBe7CxrCoynmhIxd4UHND2HJ4OG645z 4SAGOzOlZKM= =fgEd -----END PGP SIGNATURE----- Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "Here are a bunch of small bug fixes for various ARM platforms, nothing really sticks out this week, most of either fixes bugs in code that was just added in 4.4, or that has been broken for many years without anyone noticing. at91/sama5d2: - fix sama5de hardware setup of sd/mmc interface - proper selection of pinctrl drivers. PIO4 is necessary for sama5d2 berlin: - fix incorrect clock input for SDIO exynos: - Fix potential NULL pointer dereference in Exynos PMU driver. imx: - Fix vf610 SAI clock configuration bug which is discovered by the newly added master mode support in SAI audio driver. - Fix buggy L2 cache latency values in vf610 device trees, which may cause system hang when cpu runs at a higher frequency. ixp4xx: - fix prototypes for readl/writel functions ls2080a: - use little-endian register access for GPIO and SDHCI omap: - Fix clock source for ARM TWD and global timers on am437x - Always select REGULATOR_FIXED_VOLTAGE for omap2+ instead of when MACH_OMAP3_PANDORA is selected - Fix SPI DMA handles for dm816x as only some were mapped - Fix up mbox cells for dm816x to make mailbox usable pxa: - use PWM lookup table for all ezx machines s3c24xx: - Remove incorrect __init annotation from s3c24xx cpufreq driver structures. versatile: - fix PCI IRQ mapping on Versatile PB" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ls2080a/dts: Add little endian property for GPIO IP block dt-bindings: define little-endian property for QorIQ GPIO ARM64: dts: ls2080a: fix eSDHC endianness ARM: dts: vf610: use reset values for L2 cache latencies ARM: pxa: use PWM lookup table for all machines ARM: dts: berlin: add 2nd clock for BG2Q sdhci0 and sdhci1 ARM: dts: berlin: correct BG2Q's sdhci2 2nd clock ARM: dts: am4372: fix clock source for arm twd and global timers ARM: at91: fix pinctrl driver selection ARM: at91/dt: add always-on to 1.8V regulator ARM: dts: vf610: fix clock definition for SAI2 ARM: imx: clk-vf610: fix SAI clock tree ARM: ixp4xx: fix read{b,w,l} return types irqchip/versatile-fpga: Fix PCI IRQ mapping on Versatile PB ARM: OMAP2+: enable REGULATOR_FIXED_VOLTAGE ARM: dts: add dm816x missing spi DT dma handles ARM: dts: add dm816x missing #mbox-cells cpufreq: s3c24xx: Do not mark s3c2410_plls_add as __init ARM: EXYNOS: Fix potential NULL pointer access in exynos_sys_powerdown_conf	2015-12-12 16:43:44 -08:00
Lee Jones	ab0ea257fc	cpufreq: st: Provide runtime initialised driver for ST's platforms The bootloader is charged with the responsibility to provide platform specific Dynamic Voltage and Frequency Scaling (DVFS) information via Device Tree. This driver takes the supplied configuration and registers it with the new generic OPP framework, to then be used with CPUFreq. Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-12 02:55:21 +01:00
Prarit Bhargava	88b7b7c0c2	cpufreq: intel_pstate: Minor cleanup for FRAC_BITS `785ee27` ("cpufreq: intel_pstate: Fix limits->max_perf rounding error") hardcodes the value of FRAC_BITS. This patch fixes that minor issue. Fixes: `785ee27881` (cpufreq: intel_pstate: Fix limits->max_perf rounding error) Signed-off-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-12 02:28:19 +01:00
Pi-Cheng Chen	89b56047f6	cpufreq: mt8173: Move resources allocation into ->probe() Since the return value of ->init() of cpufreq driver is not propagated to the device driver model now, move resources allocation into ->probe() to handle -EPROBE_DEFER properly. Signed-off-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-12 02:27:16 +01:00
Arnd Bergmann	b5832e4b62	cpufreq: tegra: add regulator dependency for T124 This driver is the only one that calls regulator_sync_voltage(), but it can currently be built with CONFIG_REGULATOR disabled, producing this build error: drivers/cpufreq/tegra124-cpufreq.c: In function 'tegra124_cpu_switch_to_pllx': drivers/cpufreq/tegra124-cpufreq.c:68:2: error: implicit declaration of function 'regulator_sync_voltage' [-Werror=implicit-function-declaration] regulator_sync_voltage(priv->vdd_cpu_reg); My first attempt was to implement a helper for this function for regulator_sync_voltage, but Mark Brown explained: We don't do this for all regulator API functions - there's some where using them strongly suggests that there is actually a dependency on the regulator API. This does seem like it might be falling into the specialist category [...] Looking at the code I'm pretty unclear on what the authors think the use of _sync_voltage() is doing in the first place so it may be even better to just remove the call. It seems to have been included in the first commit so there's not changelog explaining things and there's no comment either. I'd expect it to be a noop as far as I can see. This adds the dependency to make the driver always build successfully or not be enabled at all. Alternatively, we could investigate if the driver should stop calling regulator_sync_voltage instead. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-12 02:25:56 +01:00
Arnd Bergmann	789d73b384	Fixes for Exynos: 1. Fix potential NULL pointer dereference in Exynos PMU driver. 2. Remove incorrect __init annotation from s3c24xx cpufreq driver structures. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWV66yAAoJEME3ZuaGi4PXxdkP/jHTVeUGKt0NmDuAQ8oHLXsL AGAFF0KdkrWvAcjiLBfyNw4DTQwZu6pyF4E+PFffhXRjJFp3ku7pKcGcf2+vvrdt mc5Sg9dvICoqqQiJmnyMymG+dKXWGEK5NF8HPZSATAl+7DrTsACRWU4UPMLlOjmQ /U2KABSJFcwuAPBd0wKL6B5Wbx7ijWaUI1xnNxqpuZW6YsXEqz4GYxJ41UWqNrN/ htoOgYCDTh9aATWpJeDRUALEYiIPS+pAjdCYaiZzrmJerCDiNO8RdMKqVX2LG+sk tqBem2cy9TlCuJFdZX50Y700nK6G+iniTVaGoJO93w4XLRwTKAJROr/+7uNGqWqq Q/Ofkh6SEJpOiNunREqf7zCLOwUs6d2bQSTp6FilAsSxQCeRVSvXkTfTv388UWOk vD3TtfCPPokPwyWubtPeRYn6lDTfXUZr8cVkh1+/lcTCXEMvlK9AyDp3BsKDQHyQ zG5yZuZ6VQ2bmhE4B43yCMdqxFTLe+vf+GetJl4aDSAQNMuBdXZ84tyLwQqAMpvK PXTjL9RqgA78YurlS5LamqChQEI4abVnWvwBk5rTxwaqL5B5BC9i7U5Q8DuOFPv2 W52GcBCdYriFmPhpXSVLDn47s1eyE360mBknUh2V0p6+GQzaAjzyjiTxU9/suoHQ IPGSkgxOfIiBMiHYPG0O =jNcR -----END PGP SIGNATURE----- Merge tag 'samsung-fixes-4.4' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into fixes Merge "Fixes for Exynos" from Krzysztof Kozlowski: 1. Fix potential NULL pointer dereference in Exynos PMU driver. 2. Remove incorrect __init annotation from s3c24xx cpufreq driver structures. * tag 'samsung-fixes-4.4' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: cpufreq: s3c24xx: Do not mark s3c2410_plls_add as __init ARM: EXYNOS: Fix potential NULL pointer access in exynos_sys_powerdown_conf	2015-12-11 00:19:37 +01:00
Philippe Longepe	63d1d656a5	cpufreq: intel_pstate: Account for IO wait time In cases where we have many IOs, the global load becomes low and the load algorithm will decrease the requested P-State. Because of that, the IOs overheads will increase and impact the IO performances. To improve IO bound work, we can count the io-wait time as busy time in calculating CPU busy. This change uses get_cpu_iowait_time_us() to obtain the IO wait time value and converts time into number of cycles spent waiting on IO at the TSC rate. At the moment, this trick is only used for Atom. Signed-off-by: Philippe Longepe <philippe.longepe@intel.com> Signed-off-by: Stephane Gasparini <stephane.gasparini@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-10 01:17:40 +01:00
Philippe Longepe	e70eed2b64	cpufreq: intel_pstate: Account for non C0 time The current function to calculate cpu utilization uses the average P-state ratio (APerf/Mperf) scaled by the ratio of the current P-state to the max available non-turbo one. This leads to an overestimation of utilization which causes higher-performance P-states to be selected more often and that leads to increased energy consumption. This is a problem for low-power systems, so it is better to use a different utilization calculation algorithm for them. Namely, the Percent Busy value (or load) can be estimated as the ratio of the MPERF counter that runs at a constant rate only during active periods (C0) to the time stamp counter (TSC) that also runs (at the same rate) during idle. That is: Percent Busy = 100 * (delta_mperf / delta_tsc) Use this algorithm for platforms with SoCs based on the Airmont and Silvermont Atom cores. Signed-off-by: Philippe Longepe <philippe.longepe@intel.com> Signed-off-by: Stephane Gasparini <stephane.gasparini@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-10 01:17:40 +01:00
Philippe Longepe	157386b6fc	cpufreq: intel_pstate: Configurable algorithm to get target pstate Target systems using different cpus have different power and performance requirements. They may use different algorithms to get the next P-state based on their power or performance preference. For example, power-constrained systems may not want to use high-performance P-states as aggressively as a full-size desktop or a server platform. A server platform may want to run close to the max to achieve better performance, while laptop-like systems may prefer sacrificing performance for longer battery lifes. For the above reasons, modify intel_pstate to allow the target P-state selection algorithm to be depend on the CPU ID. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Philippe Longepe <philippe.longepe@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-10 01:17:40 +01:00
Pi-Cheng Chen	40be4c3ccb	cpufreq: mt8173: check return value of regulator_get_voltage() call Sometimes regulator_get_voltage() call returns negative values for reasons(e.g. underlying I2C bus timeout). Add check for the return values and fail out early. Signed-off-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-10 00:27:33 +01:00
Pi-Cheng Chen	93625d52e7	cpufreq: mt8173: remove redundant regulator_get_voltage() call Remove redundant regulator_get_voltage() call to get Vsram value since it will be obtained later at the beginning of voltage tracking loop. Signed-off-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-10 00:27:33 +01:00
Pi-Cheng Chen	9bb46b87d6	cpufreq: mt8173: add CPUFREQ_HAVE_GOVERNOR_PER_POLICY flag Add CPUFREQ_HAVE_GOVERNOR_PER_POLICY to have individual set of tunables for each cluster of MT8173. Signed-off-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-10 00:27:32 +01:00
Hongtao Jia	8ae1702a0d	cpufreq: qoriq: Register cooling device based on device tree Register the qoriq cpufreq driver as a cooling device, based on the thermal device tree framework. When temperature crosses the passive trip point cpufreq is used to throttle CPUs. Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-10 00:19:35 +01:00
Jacob Tanenbaum	790d849bf8	cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency The cpufreq documentation specifies policy->cpuinfo.transition_latency the time it takes on this CPU to switch between two frequencies in nanoseconds (if appropriate, else specify CPUFREQ_ETERNAL) currently pcc-cpufreq does not expose the value and sets it to zero. I changed the pcc-cpufreq driver and it's documentation to conform to the default value specified in Documentation/cpu-freq/cpu-drivers.txt Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-10 00:17:03 +01:00
Punit Agrawal	2f7e8a175d	cpufreq: arm_big_little: Add support to register a cpufreq cooling device Register passive cooling devices when initialising cpufreq on big.LITTLE systems. If the device tree provides a dynamic power coefficient for the CPUs then the bound cooling device will support the extensions that allow it to be used with all the existing thermal governors including the power allocator governor. A cooling device will be created per individual frequency domain and can be bound to thermal zones via the thermal DT bindings. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-10 00:14:58 +01:00
Punit Agrawal	f8fa8ae06b	cpufreq-dt: Supply power coefficient when registering cooling devices Support registering cooling devices with dynamic power coefficient where provided by the device tree. This allows OF registered cooling devices driver to be used with the power_allocator thermal governor. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-10 00:14:58 +01:00
Rafael J. Wysocki	2dd3e724b4	cpufreq: governor: Use lockless timer function It is possible to get rid of the timer_lock spinlock used by the governor timer function for synchronization, but a couple of races need to be avoided. The first race is between multiple dbs_timer_handler() instances that may be running in parallel with each other on different CPUs. Namely, one of them has to queue up the work item, but it cannot be queued up more than once. To achieve that, atomic_inc_return() can be used on the skip_work field of struct cpu_common_dbs_info. The second race is between an already running dbs_timer_handler() and gov_cancel_work(). In that case the dbs_timer_handler() might not notice the skip_work incrementation in gov_cancel_work() and it might queue up its work item after gov_cancel_work() had returned (and that work item would corrupt skip_work going forward). To prevent that from happening, gov_cancel_work() can be made wait for the timer function to complete (on all CPUs) right after skip_work has been incremented. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2015-12-09 22:26:13 +01:00
Viresh Kumar	f08f638b9c	cpufreq: ondemand: update update_sampling_rate() to make it more efficient Currently update_sampling_rate() runs over each online CPU and cancels/queues timers on all policy->cpus every time. This should be done just once for any cpu belonging to a policy. Create a cpumask and keep on clearing it as and when we process policies, so that we don't have to traverse through all CPUs of the same policy. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-09 22:26:12 +01:00
Viresh Kumar	70f43e5e79	cpufreq: governor: replace per-CPU delayed work with timers cpufreq governors evaluate load at sampling rate and based on that they update frequency for a group of CPUs belonging to the same cpufreq policy. This is required to be done in a single thread for all policy->cpus, but because we don't want to wakeup idle CPUs to do just that, we use deferrable work for this. If we would have used a single delayed deferrable work for the entire policy, there were chances that the CPU required to run the handler can be in idle and we might end up not changing the frequency for the entire group with load variations. And so we were forced to keep per-cpu works, and only the one that expires first need to do the real work and others are rescheduled for next sampling time. We have been using the more complex solution until now, where we used a delayed deferrable work for this, which is a combination of a timer and a work. This could be made lightweight by keeping per-cpu deferred timers with a single work item, which is scheduled by the first timer that expires. This patch does just that and here are important changes: - The timer handler will run in irq context and so we need to use a spin_lock instead of the timer_mutex. And so a separate timer_lock is created. This also makes the use of the mutex and lock quite clear, as we know what exactly they are protecting. - A new field 'skip_work' is added to track when the timer handlers can queue a work. More comments present in code. Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Ashwin Chaugule <ashwin.chaugule@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-09 22:26:00 +01:00
Viresh Kumar	5e4500d8db	cpufreq: governor: initialize/destroy timer_mutex with 'shared' timer_mutex is required to be initialized only while memory for 'shared' is allocated and in a similar way it is required to be destroyed only when memory for 'shared' is freed. There is no need to do the same every time we start/stop the governor. Move code to initialize/destroy timer_mutex to the relevant places. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-07 02:20:23 +01:00
Viresh Kumar	affde5d06a	cpufreq: governor: Pass policy as argument to ->gov_dbs_timer() Pass 'policy' as argument to ->gov_dbs_timer() instead of cdbs and dbs_data. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-07 02:20:22 +01:00
Viresh Kumar	e68fe18c5b	cpufreq: ondemand: Work is guaranteed to be pending We are guaranteed to have works scheduled for policy->cpus, as the policy isn't stopped yet. And so there is no need to check that again. Drop it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-07 02:20:21 +01:00
Viresh Kumar	e128c86407	cpufreq: ondemand: Update sampling rate only for concerned policies We are comparing policy->governor against cpufreq_gov_ondemand to make sure that we update sampling rate only for the concerned CPUs. But that isn't enough. In case of governor_per_policy, there can be multiple instances of ondemand governor and we will always end up updating all of them with current code. What we rather need to do, is to compare dbs_data with poilcy->governor_data, which will match only for the policies governed by dbs_data. This code is also racy as the governor might be getting stopped at that time and we may end up scheduling work for a policy, which we have just disabled. Fix that by protecting the entire function with &od_dbs_cdata.mutex, which will prevent against races with policy START/STOP/etc. After these locks are in place, we can safely get the policy via per-cpu dbs_info. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-07 02:20:10 +01:00
Rafael J. Wysocki	d441fe25e7	Merge branches 'pm-domains' and 'pm-cpufreq' * pm-domains: PM / Domains: Fix bad of_node_put() in failure paths of genpd_dev_pm_attach() PM / Domains: Validate cases of a non-bound driver in genpd governor * pm-cpufreq: cpufreq: use last policy after online for drivers with ->setpolicy	2015-12-04 14:01:42 +01:00
Srinivas Pandruvada	69030dd1c3	cpufreq: use last policy after online for drivers with ->setpolicy For cpufreq drivers which use setpolicy interface, after offline->online the policy is set to default. This can be reproduced by setting the default policy of intel_pstate or longrun to ondemand and then change to "performance". After offline and online, the setpolicy will be called with the policy=ondemand. For drivers using governors this condition is handled by storing last_governor, during offline and restoring during online. The same should be done for drivers using setpolicy interface. Storing last_policy during offline and restoring during online. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-12-02 23:50:33 +01:00
Rafael J. Wysocki	f28a1b0df7	Merge branches 'pm-cpufreq' and 'acpi-cppc' * pm-cpufreq: intel_pstate: Fix "performance" mode behavior with HWP enabled cpufreq: SCPI: Depend on SCPI clk driver cpufreq: intel_pstate: Fix limits->max_perf rounding error cpufreq: intel_pstate: Fix limits->max_policy_pct rounding error cpufreq: Always remove sysfs cpuX/cpufreq link on ->remove_dev() * acpi-cppc: cpufreq: CPPC: Initialize and check CPUFreq CPU co-ord type correctly	2015-11-27 16:23:59 +01:00
Arnd Bergmann	62f49ee26f	cpufreq: s3c24xx: Do not mark s3c2410_plls_add as __init s3c2410_plls_add is a device notifier that may be called at runtime and is correctly not marked __init. However it calls s3c_plltab_register() which is marked __init, and that triggers a build error when we are checking for section mismatches: WARNING: vmlinux.o(.text+0x195e0): Section mismatch in reference from the function s3c2410_plls_add() to the function .init.text:s3c_plltab_register() The function s3c2410_plls_add() references the function __init s3c_plltab_register(). This is often because s3c2410_plls_add lacks a __init annotation or the annotation of s3c_plltab_register is wrong. This removes the __init annotation from s3c2410_plls_add as well as the __initdata section annotations from s3c2440_plls_12 and s3c2440_plls_169344, which in turn are referenced from s3c2410_plls_add. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>	2015-11-27 10:10:32 +09:00
Alexandra Yates	584ee3dcb1	intel_pstate: Fix "performance" mode behavior with HWP enabled If hardware-driven P-state selection (HWP) is enabled, the "performance" mode of intel_pstate should only allow the processor to use the highest-performance P-state available. That is not the case currently, so make it actually happen. Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Alexandra Yates <alexandra.yates@linux.intel.com> [ rjw: Subject and changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-25 23:37:44 +01:00
Punit Agrawal	73124ced9c	cpufreq: SCPI: Depend on SCPI clk driver The SCPI clk driver registers the virtual cpufreq device that kicks off initialisation of the SCPI cpufreq driver. Also, clk_get() will fail for the cpufreq driver if the SCPI clk driver is missing. Fix this by making the SCPI cpufreq driver explicitly depend on the SCPI clk driver. Fixes: `8def31034d` (cpufreq: arm_big_little: add SCPI interface driver) Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-23 23:50:27 +01:00
Rafael J. Wysocki	b0ceed0685	Merge back earlier cpufreq fixes for v4.4.	2015-11-23 23:49:57 +01:00
Prarit Bhargava	785ee27881	cpufreq: intel_pstate: Fix limits->max_perf rounding error A rounding error was found in the calculation of limits->max_perf in intel_pstate_set_policy(), which is used to calculate the max and min pstate values in intel_pstate_get_min_max(). In that code, limits->max_perf is truncated to 2 hex digits such that, for example, 0x169 was incorrectly calculated to 0x16 instead of 0x17. This resulted in the pstate being set one level too low. This patch rounds the value of limits->max_perf up instead of down so that the correct max pstate can be reached. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-23 23:15:34 +01:00
Prarit Bhargava	8478f53946	cpufreq: intel_pstate: Fix limits->max_policy_pct rounding error I have a Intel (6,63) processor with a "marketing" frequency (from /proc/cpuinfo) of 2100MHz, and a max turbo frequency of 2600MHz. I can execute cpupower frequency-set -g powersave --min 1200MHz --max 2100MHz and the max_freq_pct is set to 80. When adding load to the system I noticed that the cpu frequency only reached 2000MHZ and not 2100MHz as expected. This is because limits->max_policy_pct is calculated as 2100 * 100 /2600 = 80.7 and is rounded down to 80 when it should be rounded up to 81. This patch adds a DIV_ROUND_UP() which will return the correct value. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-23 23:14:10 +01:00
Viresh Kumar	f344dae0fe	cpufreq: Always remove sysfs cpuX/cpufreq link on ->remove_dev() Subsys interface's ->remove_dev() is called when the cpufreq driver is unregistering or the CPU is getting physically removed. We keep removing the cpuX/cpufreq link for all CPUs except the last one, which is a mistake as all CPUs contain a link now. Because of this, one CPU from each policy will still contain a link (to an already removed policyX directory), after the cpufreq driver is unregistered. Fix that by removing the link first and then only see if the policy is required to be freed. That will make sure that no links are left out. Fixes: `96bdda61f5` ("cpufreq: create cpu/cpufreq/policyX directories") Reported-and-tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-23 22:49:42 +01:00
Ashwin Chaugule	9dc1791773	cpufreq: CPPC: Initialize and check CPUFreq CPU co-ord type correctly The CPU policy struct indicates the co-ordination type for all CPUs of a common freq domain. Initialize it correctly using the CPU specific data gathered from CPPC ACPI lib via acpi_get_psd_map(). The PSD object is optional, so the cpu->shared_type can also be 0. So instead of assuming any value other than SW_ANY(0xFD) is unsupported, explictly check if shared_type is SW_ALL and then bail. Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-23 22:21:18 +01:00
Rafael J. Wysocki	9832bf3a35	Merge branches 'pm-cpufreq' and 'acpi-cppc' * pm-cpufreq: Revert "Documentation: kernel_parameters for Intel P state driver" cpufreq: mediatek: fix build error cpufreq: intel_pstate: Add separate support for Airmont cores cpufreq: intel_pstate: Replace BYT with ATOM Revert "cpufreq: intel_pstate: Use ACPI perf configuration" Revert "cpufreq: intel_pstate: Avoid calculation for max/min" * acpi-cppc: ACPI / CPPC: Use h/w reduced version of the PCCT structure	2015-11-20 01:22:10 +01:00
Arnd Bergmann	2d4ee30367	cpufreq: mediatek: fix build error The recently added mt8173 cpufreq driver relies on the cpu topology that is always present on ARM64 but optional on ARM32: drivers/cpufreq/mt8173-cpufreq.c: In function 'mtk_cpufreq_init': drivers/cpufreq/mt8173-cpufreq.c:441:30: error: 'cpu_topology' undeclared (first use in this function) cpumask_copy(policy->cpus, &cpu_topology[policy->cpu].core_sibling); This refines the Kconfig dependencies so that we can still build on ARM32, but only if COMPILE_TEST is selected and the CPU topology code is present. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-19 00:21:47 +01:00
Philippe Longepe	1421df63c3	cpufreq: intel_pstate: Add separate support for Airmont cores There are two flavors of Atom cores to be supported by intel_pstate, Silvermont and Airmont, so make the driver distinguish between them by adding separate frequency tables. Separate the CPU defaults params for each of them and match the CPU IDs against them as appropriate. Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com> Signed-off-by: Stephane Gasparini <stephane.gasparini@linux.intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Subject and changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-19 00:21:46 +01:00
Philippe Longepe	938d21a2a6	cpufreq: intel_pstate: Replace BYT with ATOM Rename symbol and function names starting with "BYT" or "byt" to start with "ATOM" or "atom", respectively, so as to make it clear that they may apply to Atom in general and not just to Baytrail (the goal is to support several Atoms architectures eventually). This should not lead to any functional changes. Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com> Signed-off-by: Stephane Gasparini <stephane.gasparini@linux.intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw : Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-19 00:21:46 +01:00
Rafael J. Wysocki	6ee11e413c	Revert "cpufreq: intel_pstate: Use ACPI perf configuration" Revert commit `37afb00032` (cpufreq: intel_pstate: Use ACPI perf configuration) that is reported to cause a regression to happen on a system where invalid data are returned by the ACPI _PSS object. Since that commit makes assumptions regarding the _PSS output correctness that may turn out to be overly optimistic in general, there is a concern that it may introduce regression on more systems, so it's better to revert it now and we'll revisit the underlying issue in the next cycle with a more robust solution. Conflicts: drivers/cpufreq/intel_pstate.c Fixes: `37afb00032` (cpufreq: intel_pstate: Use ACPI perf configuration) Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-19 00:20:42 +01:00
Rafael J. Wysocki	799281a3c4	Revert "cpufreq: intel_pstate: Avoid calculation for max/min" Revert commit `4ef4514870` (cpufreq: intel_pstate: Avoid calculation for max/min) as it depends on commit `37afb00032` (cpufreq: intel_pstate: Use ACPI perf configuration) that causes problems to happen and needs to be reverted. Conflicts: drivers/cpufreq/intel_pstate.c Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-18 23:29:56 +01:00
Linus Torvalds	be23c9d20b	More power management and ACPI updates for v4.4-rc1 - Support for the ACPI _CCA configuration object intended to tell the OS whether or not a bus master device supports hardware managed cache coherency and a new set of functions to allow drivers to check the cache coherency support for devices in a platform firmware interface agnostic way (Suravee Suthikulpanit, Jeremy Linton). - ACPI backlight quirks for ESPRIMO Mobile M9410 and Dell XPS L421X (Aaron Lu, Hans de Goede). - Fixes for the arm_big_little and s5pv210-cpufreq cpufreq drivers (Jon Medhurst, Nicolas Pitre). - kfree()-related fixup for the recently introduced CPPC cpufreq frontend (Markus Elfring). - intel_pstate fix reducing kernel log noise on systems where P-states are managed by hardware (Prarit Bhargava). - intel_pstate maintainers information update (Srinivas Pandruvada). - cpufreq core optimization related to the handling of delayed work items used by governors (Viresh Kumar). - Locking fixes and cleanups of the Operating Performance Points (OPP) framework (Viresh Kumar). - Generic power domains framework cleanups (Lina Iyer). - cpupower tool updates (Jacob Tanenbaum, Sriram Raghunathan, Thomas Renninger). - turbostat tool updates (Len Brown). / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJWQ96OAAoJEILEb/54YlRxyYYQALJ1HXu76SvYX1re2aawOw6Y WgzF3Ly7JX034E1VvA2xP6wgkWpBRBDcpnRDeltNA4dYXPBDei/eTcRZTLX12N3g AfFRGjGWTtLJfpNPecNMmUyF5xHjgDgMIQRabY+Is5NfP5STkPHJeqULnEpvTtx8 bd0lnC5jc4vuZiPEh1xVb+ClYDqWS8YQPyFJVjV/BaIf8Qwe5+oRX36byMBaKc9D ZgmvmCk5n/HLQQ1uQsqe4xnhFLHN2rypt2BLvFrOtlnSz9VNNpQyB+OIW1mgCD4f LhpKIwjP8NhZNQUq8HFu7nDlm8ciQtWmeMPB5NdGQ+OESu7yfKAOzQ+3U6Gl2Gaf 66zVGyV6SOJJwfDVJ3qKTtroWps9QV7ZClOJ+zJGgiujwU+tJ3pDQyZM9pa7CL3C s7ZAUsI6IigSBjD3nJVOyG4DO0a8KQFCIE1mDmyqId45Qz8xJoOrYP33/ZnDuOdo 2OtL/emyfWsz9ixbHVfwIhb7EC6aoaUxQrhSWmNraaQS43YfioZR7h4we8gwenph X4E1KY4SdML+uFf2VKIcd45NM3IBprCxx5UgFAJdrqe8+otqPNF2AVosG4iqhg/b k4nxwuIvw2a8Fm77U9ytyXDYMItU/wIlAHMbnmgx+oTwRv6AbZ07MHkyfuQLYuhD tq5Y14qSiTS7prNacx98 =XZiP -----END PGP SIGNATURE----- Merge tag 'pm+acpi-4.4-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management and ACPI updates from Rafael Wysocki: "The only new feature in this batch is support for the ACPI _CCA device configuration object, which it a pre-requisite for future ACPI PCI support on ARM64, but should not affect the other architectures. The rest is fixes and cleanups, mostly in cpufreq (including intel_pstate), the Operating Performace Points (OPP) framework and tools (cpupower and turbostat). Specifics: - Support for the ACPI _CCA configuration object intended to tell the OS whether or not a bus master device supports hardware managed cache coherency and a new set of functions to allow drivers to check the cache coherency support for devices in a platform firmware interface agnostic way (Suravee Suthikulpanit, Jeremy Linton). - ACPI backlight quirks for ESPRIMO Mobile M9410 and Dell XPS L421X (Aaron Lu, Hans de Goede). - Fixes for the arm_big_little and s5pv210-cpufreq cpufreq drivers (Jon Medhurst, Nicolas Pitre). - kfree()-related fixup for the recently introduced CPPC cpufreq frontend (Markus Elfring). - intel_pstate fix reducing kernel log noise on systems where P-states are managed by hardware (Prarit Bhargava). - intel_pstate maintainers information update (Srinivas Pandruvada). - cpufreq core optimization related to the handling of delayed work items used by governors (Viresh Kumar). - Locking fixes and cleanups of the Operating Performance Points (OPP) framework (Viresh Kumar). - Generic power domains framework cleanups (Lina Iyer). - cpupower tool updates (Jacob Tanenbaum, Sriram Raghunathan, Thomas Renninger). - turbostat tool updates (Len Brown)" * tag 'pm+acpi-4.4-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits) PCI: ACPI: Add support for PCI device DMA coherency PCI: OF: Move of_pci_dma_configure() to pci_dma_configure() of/pci: Fix pci_get_host_bridge_device leak device property: ACPI: Remove unused DMA APIs device property: ACPI: Make use of the new DMA Attribute APIs device property: Adding DMA Attribute APIs for Generic Devices ACPI: Adding DMA Attribute APIs for ACPI Device device property: Introducing enum dev_dma_attr ACPI: Honor ACPI _CCA attribute setting cpufreq: CPPC: Delete an unnecessary check before the function call kfree() PM / OPP: Add opp_rcu_lockdep_assert() to _find_device_opp() PM / OPP: Hold dev_opp_list_lock for writers PM / OPP: Protect updates to list_dev with mutex PM / OPP: Propagate error properly from dev_pm_opp_set_sharing_cpus() cpufreq: s5pv210-cpufreq: fix wrong do_div() usage MAINTAINERS: update for intel P-state driver Creating a common structure initialization pattern for struct option cpupower: Enable disabled Cstates if they are below max latency cpupower: Remove debug message when using cpupower idle-set -D switch cpupower: cpupower monitor reports uninitialized values for offline cpus ...	2015-11-12 11:50:33 -08:00
Linus Torvalds	b44a3d2a85	ARM: SoC driver updates for v4.4 As we've enabled multiplatform kernels on ARM, and greatly done away with the contents under arch/arm/mach-, there's still need for SoC-related drivers to go somewhere. Many of them go in through other driver trees, but we still have drivers/soc to hold some of the "doesn't fit anywhere" lowlevel code that might be shared between ARM and ARM64 (or just in general makes sense to not have under the architecture directory). This branch contains mostly such code: - Drivers for qualcomm SoCs for SMEM, SMD and SMD-RPM, used to communicate with power management blocks on these SoCs for use by clock, regulator and bus frequency drivers. - Allwinner Reduced Serial Bus driver, again used to communicate with PMICs. - Drivers for ARM's SCPI (System Control Processor). Not to be confused with PSCI (Power State Coordination Interface). SCPI is used to communicate with the assistant embedded cores doing power management, and we have yet to see how many of them will implement this for their hardware vs abstracting in other ways (or not at all like in the past). - To make confusion between SCPI and PSCI more likely, this release also includes an update of PSCI to interface version 1.0. - Rockchip support for power domains. - A driver to talk to the firmware on Raspberry Pi. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWQC+cAAoJEIwa5zzehBx3jEUP/0GpxfDVanEUkudVLLe7J0RH CNlRan107Cw6hXRUJo7elEsuCALjccXjc1CAH4+RnNpOAeBKW97n+WU7trTv+wUZ sQX4SkBPKFBlgwGF2qhsi5q74gms/BrgtCa4kNb9joOYso039tlfIOPzK80DMkOm TkyIJdUCgFJMjCQLhX6kGT0PDcrbIjb6aA2cF3FAVeaJA7uz8lNe/eHJr3oHxIEY CvC651yJ2mIHQUU4BJx/AJo+wXg3dRUXNCAtBjwLRPEAzduYZXYm1ZTVIby/1q9r dR2KDFEuibODXmXrDBzKNJwCu/TLJEwo/1oPaEIVfY91XLKfiWUhgVqa1o1I+d9U XoGPibCW461qFahjQW87MfInALpCOA7/RbTNjFp+MVyipCYvkaYq7KFiYEldgFDx z4Qx/J4hYc2TlDWrpNiUCZMfmhwi7y+Ib+tnenYTO1eyMuw0e9mfnVdjk5iU3Pvk Ye4qPqpYclJruyHbYi164878+1lLaW2NCUgC3rkBO/GWPAzp7d9iLWoZ3PuyD5i5 PEjs668UcRdZYbI4rdrhGHL8Eq9Gnuc4Rthu7HxPOK+DG0XgP8r97PhM8aYGYVDO +yikBtjWRsA9fPj3rMKA3UsQ61DAeR9LmZ0XPGjWFMCjCG0JlUoIMaA+Uu0i8fr8 95qxBVxbO7rhL39r1rhV =dm+I -----END PGP SIGNATURE----- Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC driver updates from Olof Johansson: "As we've enabled multiplatform kernels on ARM, and greatly done away with the contents under arch/arm/mach-, there's still need for SoC-related drivers to go somewhere. Many of them go in through other driver trees, but we still have drivers/soc to hold some of the "doesn't fit anywhere" lowlevel code that might be shared between ARM and ARM64 (or just in general makes sense to not have under the architecture directory). This branch contains mostly such code: - Drivers for qualcomm SoCs for SMEM, SMD and SMD-RPM, used to communicate with power management blocks on these SoCs for use by clock, regulator and bus frequency drivers. - Allwinner Reduced Serial Bus driver, again used to communicate with PMICs. - Drivers for ARM's SCPI (System Control Processor). Not to be confused with PSCI (Power State Coordination Interface). SCPI is used to communicate with the assistant embedded cores doing power management, and we have yet to see how many of them will implement this for their hardware vs abstracting in other ways (or not at all like in the past). - To make confusion between SCPI and PSCI more likely, this release also includes an update of PSCI to interface version 1.0. - Rockchip support for power domains. - A driver to talk to the firmware on Raspberry Pi" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (57 commits) soc: qcom: smd-rpm: Correct size of outgoing message bus: sunxi-rsb: Add driver for Allwinner Reduced Serial Bus bus: sunxi-rsb: Add Allwinner Reduced Serial Bus (RSB) controller bindings ARM: bcm2835: add mutual inclusion protection drivers: psci: make PSCI 1.0 functions initialization version dependent dt-bindings: Correct paths in Rockchip power domains binding document soc: rockchip: power-domain: don't try to print the clock name in error case soc: qcom/smem: add HWSPINLOCK dependency clk: berlin: add cpuclk ARM: berlin: dts: add CLKID_CPU for BG2Q ARM: bcm2835: Add the Raspberry Pi firmware driver soc: qcom: smem: Move RPM message ram out of smem DT node soc: qcom: smd-rpm: Correct the active vs sleep state flagging soc: qcom: smd: delete unneeded of_node_put firmware: qcom-scm: build for correct architecture level soc: qcom: smd: Correct SMEM items for upper channels qcom-scm: add missing prototype for qcom_scm_is_available() qcom-scm: fix endianess issue in __qcom_scm_is_call_available soc: qcom: smd: Reject send of too big packets soc: qcom: smd: Handle big endian CPUs ...	2015-11-10 15:00:03 -08:00
Rafael J. Wysocki	1f47b0ddf3	Merge branch 'pm-cpufreq' * pm-cpufreq: cpufreq: s5pv210-cpufreq: fix wrong do_div() usage MAINTAINERS: update for intel P-state driver cpufreq: governor: Quit work-handlers early if governor is stopped intel_pstate: decrease number of "HWP enabled" messages cpufreq: arm_big_little: fix frequency check when bL switcher is active	2015-11-07 01:30:49 +01:00
Rafael J. Wysocki	3930f660b4	Merge branches 'acpi-video' and 'acpi-cppc' * acpi-video: ACPI / video: only register backlight for LCD device ACPI / video: Add a quirk to force acpi-video backlight on Dell XPS L421X * acpi-cppc: cpufreq: CPPC: Delete an unnecessary check before the function call kfree()	2015-11-07 01:30:22 +01:00
Markus Elfring	efb2d3be53	cpufreq: CPPC: Delete an unnecessary check before the function call kfree() The kfree() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-07 00:03:18 +01:00
Nicolas Pitre	d7e53e35f9	cpufreq: s5pv210-cpufreq: fix wrong do_div() usage It is wrong to use do_div() with 32-bit dividends (unsigned long is 32 bits on 32-bit architectures). Signed-off-by: Nicolas Pitre <nico@linaro.org> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-05 22:50:48 +01:00
Viresh Kumar	3a91b069ea	cpufreq: governor: Quit work-handlers early if governor is stopped gov_queue_work() acquires cpufreq_governor_lock to allow cpufreq_governor_stop() to drain delayed work items possibly scheduled on CPUs that share the policy with a CPU being taken offline. However, the same goal may be achieved in a more straightforward way if the policy pointer in the struct cpu_dbs_info matching the policy CPU is reset upfront by cpufreq_governor_stop() under the timer_mutex belonging to it and checked against NULL, under the same lock, at the beginning of dbs_timer(). In that case every instance of dbs_timer() run for a struct cpu_dbs_info sharing the policy pointer in question after cpufreq_governor_stop() has started will notice that that pointer is NULL and bail out immediately without queuing up any new work items. In turn, gov_cancel_work() called by cpufreq_governor_stop() before destroying timer_mutex will wait for all of the delayed work items currently running on the CPUs sharing the policy to drop the mutex, so it may be destroyed safely. Make cpufreq_governor_stop() and dbs_timer() work as described and modify gov_queue_work() so it does not acquire cpufreq_governor_lock any more. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-02 02:07:11 +01:00
Prarit Bhargava	539342f60b	intel_pstate: decrease number of "HWP enabled" messages When booting an HWP enabled system the kernel displays one "HWP enabled" message for each cpu. The messages are superfluous since HWP is globally enabled across all CPUs. This patch also adds an informational message when HWP is disabled via intel_pstate=no_hwp. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-02 02:01:18 +01:00
$Jon Medhurst $Tixy$$ Jon Medhurst $Tixy$	14f1ba3af6	cpufreq: arm_big_little: fix frequency check when bL switcher is active The check for correct frequency being set in bL_cpufreq_set_rate is broken when the big.LITTLE switcher is active, for two reasons. 1. The 'new_rate' variable gets overwritten before the test by the code calculating the frequency of the old cluster. 2. The frequency returned by bL_cpufreq_get_rate will be the virtual frequency, not the actual one the intended version of new_rate contains. This means the function always returns an error causing an endless stream of: "cpufreq: __target_index: Failed to change cpu frequency: -5" As the intent is to check for errors that clk_set_rate doesn't report lets move the check to immediately after that and directly use clk_get_rate, rather than the arm_big_little helpers which only confuse matters. Also, update the comment to be hopefully clearer about the purpose of the code. Fixes: `0a95e630b4` (cpufreq: arm_big_little: check if the frequency is set correctly) Signed-off-by: Jon Medhurst <tixy@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Michael Turquette <mturquette@baylibre.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-11-02 01:58:27 +01:00
Rafael J. Wysocki	394f7164e6	Merge branch 'pm-opp' * pm-opp: PM / OPP: passing NULL to PTR_ERR() PM / OPP: Move cpu specific code to opp/cpu.c PM / OPP: Move opp core to its own directory PM / OPP: Prefix exported opp routines with dev_pm_opp_ PM / OPP: Rename opp init/free table routines PM / OPP: reuse of_parse_phandle()	2015-11-02 00:54:37 +01:00

... 3 4 5 6 7 ...

2031 Commits