In case thermal_zone_xxx_register() returns an error, priv->zone
isn't NULL any more, but contains the error code.
This is passed to thermal_zone_device_unregister(), then. This checks
for priv->zone being NULL, but the error code is != NULL. So it works
with the error code as a pointer. Crashing immediately.
To fix this, reset priv->zone to NULL before entering
rcar_gen3_thermal_remove().
Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
of_match_device could return NULL, and so cause a NULL pointer
dereference later at line 472:
data->socdata = of_id->data;
For fixing this problem, we use of_device_get_match_data(), this will
simplify the code a little by using a standard function for
getting the match data.
Reported-by: coverity (CID 1324128)
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
The field "owner" is set by core. Thus delete an extra initialisation.
Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Currently all CPU cooling devices share a
`struct thermal_cooling_device_ops` instance. The thermal core uses the
presence of functions in this struct to determine if a cooling device
has a power model (see cdev_is_power_actor). cpu_cooling.c adds the
power model functions to the shared struct when a device is registered
with a power model.
Therefore, if a CPU cooling device is registered using
[of_]cpufreq_power_cooling_register, _all_ devices will be determined to
have a power model, including any registered with
[of_]cpufreq_cooling_register. This can result in cpufreq_state2power
being called on a device where dyn_power_table is NULL.
With this commit, instead of having a shared thermal_cooling_device_ops
which is mutated, we have two versions: one with the power functions and
one without.
Signed-off-by: Brendan Jackman <brendan.jackman@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@gmail.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Javi Merino <javi.merino@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
The driver allocates the mutex but not initialize it.
Use mutex_init() on it to initialize it correctly.
This is detected by Coccinelle semantic patch.
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
When multiple thermal zones are bound to the same cooling device, multiple
kernel threads may want to update the cooling device state by calling
thermal_cdev_update(). Having cdev not protected by a mutex can lead to a race
condition. Consider the following situation with two kernel threads k1 and k2:
Thread k1 Thread k2
||
|| call thermal_cdev_update()
|| ...
|| set_cur_state(cdev, target);
call power_actor_set_power() ||
... ||
instance->target = state; ||
cdev->updated = false; ||
|| cdev->updated = true;
|| // completes execution
call thermal_cdev_update() ||
// cdev->updated == true ||
return; ||
\/
time
k2 has already looped through the thermal instances looking for the deepest
cooling device state and is preempted right before setting cdev->updated to
true. Now, k1 runs, modifies the thermal instance state and sets cdev->updated
to false. Then, k1 is preempted and k2 continues the execution by setting
cdev->updated to true, therefore preventing k1 from performing the update.
Notice that this is not an issue if k2 looks at the instance->target modified by
k1 "after" it is assigned by k1. In fact, in this case the update will happen
anyway and k1 can safely return immediately from thermal_cdev_update().
This may lead to a situation where a thermal governor never updates the cooling
device. For example, this is the case for the step_wise governor: when calling
the function thermal_zone_trip_update(), the governor may always get a new state
equal to the old one (which, however, wasn't notified to the cooling device) and
will therefore skip the update.
CC: Zhang Rui <rui.zhang@intel.com>
CC: Eduardo Valentin <edubezval@gmail.com>
CC: Peter Feuerer <peter@piie.net>
Reported-by: Toby Huang <toby.huang@arm.com>
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Added suspend/resume callback to disable/enable PCH thermal sensor
respectively. If the sensor is enabled by the BIOS, then the sensor status
will not be changed during suspend/resume.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Pull thermal management fixes from Zhang Rui:
- fix an ordering issue in cpu cooling that cooling device is
registered before it's ready (freq_table being populated).
(Lukasz Luba)
- fix a missing comment update (Caesar Wang)
* 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
thermal: add the note for set_trip_temp
thermal: cpu_cooling: fix improper order during initialization
Most of the callers of cpufreq_frequency_get_table() already have the
pointer to a valid 'policy' structure and they don't really need to go
through the per-cpu variable first and then a check to validate the
frequency, in order to find the freq-table for the policy.
Directly use the policy->freq_table field instead for them.
Only one user of that API is left after above changes, cpu_cooling.c and
it accesses the freq_table in a racy way as the policy can get freed in
between.
Fix it by using cpufreq_cpu_get() properly.
Since there are no more users of cpufreq_frequency_get_table() left, get
rid of it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Javi Merino <javi.merino@arm.com> (cpu_cooling.c)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The X86_FAMILY_ANY in here is bogus. "BYT" and model 0x37 are
family-6 only.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: jacob.jun.pan@intel.com
Cc: linux-pm@vger.kernel.org
Link: http://lkml.kernel.org/r/20160603001952.9B6E114D@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The freq_table array is not populated before calling
thermal_of_cooling_register. The code which populates the freq table was
introduced in commit f6859014.
This should be done before registering new thermal cooling device.
The log shows effects of this wrong decision.
[ 2.172614] cpu cpu1: Failed to get voltage for frequency 1984518656000: -34
[ 2.220863] cpu cpu0: Failed to get voltage for frequency 1984524416000: -34
Cc: <stable@vger.kernel.org> # 3.19+
Fixes: f6859014c7 ("thermal: cpu_cooling: Store frequencies in descending order")
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Javi Merino <javi.merino@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
commit 059500940d (ACPI/video: export acpi_video_get_levels)
mistakenly dropped the correct value of max_level and that caused the
set_level function following failed and the acpi_video backlight interface
didn't get created. Fix this by passing back the correct max_level value.
While at it, also fix the param used in acpi_video_device_lcd_query_levels
where acpi_handle is expected but acpi_video_device is passed.
Fixes: 059500940d (ACPI/video: export acpi_video_get_levels)
Reported-and-tested-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull thermal management updates from Zhang Rui:
- Introduce generic ADC thermal driver, based on OF thermal (Laxman
Dewangan)
- Introduce new thermal driver for Tango chips (Marc Gonzalez)
- Rockchip driver support for RK3399, RK3366, and some fixes (Caesar
Wang, Elaine Zhang and Shawn Lin)
- Add CPU power cooling model to Mediatek thermal driver (Dawei Chien)
- Wider usage of dev_thermal_zone_of_sensor_register (Eduardo Valentin)
- TI thermal driver gained a new maintainer (Keerthy).
- Enabled powerclamp driver by checking CPU feature and package cstate
counter instead of CPU whitelist (Jacob Pan)
- Various fixes on thermal governor, OF thermal, Tegra, and RCAR
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (50 commits)
thermal: tango: initialize TEMPSI_CFG
thermal: rockchip: use the usleep_range instead of udelay
thermal: rockchip: add the notes for better reading
thermal: rockchip: Support RK3366 SoCs in the thermal driver
thermal: rockchip: handle the power sequence for tsadc controller
thermal: rockchip: update the tsadc table for rk3399
thermal: rockchip: fixes the code_to_temp for tsadc driver
thermal: rockchip: disable thermal->clk in err case
thermal: tegra: add Tegra132 specific SOC_THERM driver
thermal: fix ptr_ret.cocci warnings
thermal: mediatek: Add cpu dynamic power cooling model.
thermal: generic-adc: Add ADC based thermal sensor driver
thermal: generic-adc: Add DT binding for ADC based thermal sensor
thermal: tegra: fix static checker warning
thermal: tegra: mark PM functions __maybe_unused
thermal: add temperature sensor support for tango SoC
thermal: hisilicon: fix IRQ imbalance enabling
thermal: hisilicon: support to use any sensor
thermal: rcar: Remove binding docs for r8a7794
thermal: tegra: add PM support
...
TEMPSI_CFG is not equal to 0 at reset. It must be initialized.
Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Documentation/timers/timers-howto.txt recommends to use
usleep_range on delays > 10usec.
The usleep_range indeed reduces CPU load, since the udelay will busy wait
for enough loop cycles to achieve the desired delay.
Fixes commit b06c52db39fd ("thermal: rockchip:
handle the power sequence for tsadc controller").
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Suggested-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
To update the notes for keeping in mind that quickly in case
someone re-read this driver in the future.
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
The RK3366 SoCs have two Temperature Sensors, channel 0 is for CPU
channel 1 is for GPU.
Signed-off-by: Elaine Zhang <zhangqing@rock-chips.com>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
This adds the grf property to handle the tsadc power sequence on
rockchip some SoCs.
Verified on rk3399 can work with this patch on now.
while true; do grep "" /sys/class/thermal/thermal_zone[0-1]/temp
sleep .5; done
/sys/class/thermal/thermal_zone0/temp:40555
/sys/class/thermal/thermal_zone1/temp:41111
/sys/class/thermal/thermal_zone0/temp:40555
/sys/class/thermal/thermal_zone1/temp:41111
/sys/class/thermal/thermal_zone0/temp:40555
/sys/class/thermal/thermal_zone1/temp:41666
/sys/class/thermal/thermal_zone0/temp:40555
/sys/class/thermal/thermal_zone1/temp:41111
/sys/class/thermal/thermal_zone0/temp:40555
/sys/class/thermal/thermal_zone1/temp:41111
/sys/class/thermal/thermal_zone0/temp:40555
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
This patch fixes the incorrect conversion table.
The Code to Temperature mapping is updated based on sillcon results.
Fixes commit b0d70338bc
("thermal: rockchip: Support the RK3399 SoCs in thermal driver").
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
We should judge the table.id[mid].code insearch algorithm on matter the
adc value increment or decrement.
Or otherwise, the temperature return the incorrect value in some cases.
[ 1.438589] adc_val=402,temp=-40000
[ 1.438903] adc_val=403,temp=-39375
[ 1.439217] adc_val=404,temp=-38750
...
[ 1.441102] adc_val=410,temp=-40000
[ 1.441416] adc_val=411,temp=-34445
[ 1.441737] adc_val=412,temp=-33889
...
Let's fix it right now.
Fixes commit 020ba95dbb ("thermal: rockchip:
Add the sort mode for adc value increment or decrement").
Reported-by: Rocky Hao <rocky.hao@rock-chips.com>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Disable thermal->clk when enabling pclk fails in
resume routine.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
drivers/thermal/tango_thermal.c:86:1-3: WARNING: PTR_ERR_OR_ZERO can be used
Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR
Generated by: scripts/coccinelle/api/ptr_ret.cocci
CC: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
In some of platform, thermal sensors like NCT thermistors are
connected to the one of ADC channel. The temperature is read by
reading the voltage across the sensor resistance via ADC. Lookup
table for ADC read value to temperature is referred to get
temperature. ADC is read via IIO framework.
Add support for thermal sensor driver which read the voltage across
sensor resistance from ADC through IIO framework.
Acked-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
There has a static checker warning:
warn: variable dereferenced before check 'dev' (see line 222)
Since check 'dev' is unnecessary, so remove this check.
Fixes: ee6d79f202a4 ("thermal: tegra: add thermtrip function")
Signed-off-by: Wei Ni <wni@nvidia.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
After the PM support has been added to this driver, we get
a harmless warning when that support is disabled at compile
time:
drivers/thermal/tegra/soctherm.c:641:12: error: 'soctherm_resume' defined but not used [-Werror=unused-function]
static int soctherm_resume(struct device *dev)
This marks the two PM functions as __maybe_unused to shut up
the warning. This is preferred over adding an #ifdef around
them, as it is harder to get wrong, and provides better
compile-time coverage.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: a134b4143b65 ("thermal: tegra: add PM support")
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
The Tango thermal driver provides support for the primitive temperature
sensor embedded in Tango chips since the SMP8758.
This sensor only generates a 1-bit signal to indicate whether the die
temperature exceeds a programmable threshold.
Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
When register sensors into thermal zone during initialization phase, it
reports error for IRQ imbalance enabling:
[ 2.040713] WARNING: at kernel/irq/manage.c:513
[ 2.040719] Modules linked in:
[ 2.040721]
[ 2.040729] CPU: 1 PID: 804 Comm: irq/33-hisi_the Not tainted 4.5.0-rc4+ #505
[ 2.040732] Hardware name: HiKey Development Board (DT)
[ 2.040736] task: ffffffc03ae82580 ti: ffffffc0379c8000 task.ti: ffffffc0379c8000
[ 2.040745] PC is at __enable_irq+0x74/0x84
[ 2.040749] LR is at __enable_irq+0x74/0x84
This warning is for IRQ imbalance enabling, which is caused by
enable_irq() twice. During sensor's initialization it tries to enable
IRQ, the driver will call thermal_zone_of_sensor_register() to bind
sensors and read sensor's temperature. But at this moment the flag
"data->irq_enabled" has been not initialized as correct state, so it
finally introduces the function enabled_irq() to be called twice. In
essentially this is caused by the flag "data->irq_enabled" is
inconsistent with real hardware IRQ enabling state.
So this patch is to fix this issue, firstly init "irq_enabled" flag
before binding sensors to thermal zone. Also change to use the function
irq_get_irqchip_state() to read back real interrupt line state.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
In current code sensor driver registers all 4 sensors together and if
any of them has not bound to thermal zone successfully then driver will
return failure for driver's initialization. As a result, if DT binds
thermal zone with only one sensor, then the thermal driver will not work
well anymore.
So this patch is to fix this issue. It allows the thermal sensor driver
can register any number sensors at initialization phase, and fix up code
for other related code to skip related sensor's accessing if the sensor
has not been enabled in initialization phase.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Handle HW initialization in one function soctherm_init(),
so that the codes are more clear.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Handle clock enable/disable codes in one function
soctherm_clk_enable(), so that the codes are more clear.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Add support for hardware critical thermal limits to the
SOC_THERM driver. It use the Linux thermal framework to
create critical trip temp, and set it to SOC_THERM hardware.
If these limits are breached, the chip will reset, and if
appropriately configured, will turn off the PMIC.
This support is critical for safe usage of the chip.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
In current of-thermal, the .set_trip_temp only support to
set trip_temp for SW. But some sensors support to set
trip_temp on hardware, so that can trigger interrupt,
shutdown or any other events.
This patch adds .set_trip_temp() callback in
thermal_zone_of_device_ops{}, so that the sensor device can
use it to set trip_temp on hardware.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Add a debugfs interface to show register contents for debug.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Split most of the Tegra124 data and code into a Tegra124-specific
file.
Split most of the fuse-related code into a fuse-related source file.
This is in preparation for adding a Tegra210-specific driver in a
future patch.
Beyond the maintainability improvements, this is intended to separate
chip-specific ATE and characterization-related hacks into chip-specific
files, in the hopes that they won't pollute code for other chips.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Get rid of T124-specific PDIV/HOTSPOT hack.
tegra-soctherm.c contained a hack to set the SENSOR_PDIV and
SENSOR_HOTSPOT_OFFSET registers - it just did two writes of
T124-specific opaque values. Convert these into a form that can be
substituted on a per-chip basis, and into structure fields that have
at least some independent meaning.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Combine sensor group-related data structures into struct
tegra_tsensor_group. This provides a single location for
sensor group data storage.
More sensor group data will be added in subsequent patches.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Move Tegra soctherm driver to tegra directory, it's easy to maintain
and add more new function support for Tegra platforms.
This will also help to split soctherm driver into common parts and
chip specific data related parts.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
This changes the driver to use the devm_ version
of thermal_zone_of_sensor_register and cleans
up the local points and unregister calls.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-omap@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
This changes the driver to use the devm_ version
of thermal_zone_of_sensor_register and cleans
up the local points and unregister calls.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-tegra@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>