All backends are reimplementing a variation of the same CPU reference
count handling. They are also responsible for driving the MCPM special
low-level locking. This is needless duplication, involving algorithmic
requirements that are not necessarily obvious to the uninitiated.
And from past code review experience, those were all initially
implemented badly.
After 3 years, it is time to refactor as much common code to the core
MCPM facility to make the backends as simple as possible. To avoid a
flag day, the new scheme is introduced in parallel to the existing
backend interface. When all backends are converted over, the
compatibility interface could be removed.
The new MCPM backend interface implements simpler methods addressing
very platform specific tasks performed under lock protection while
keeping the algorithmic complexity and race avoidance local to the
core code.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
The kernel already has the responsibility to handle resources such as the
CCI when hotplugging CPUs, during the booting of secondary CPUs, and when
resuming from suspend/idle. It would be more coherent and less confusing
if the CCI for the boot CPU (or cluster) was also initialized by the
kernel rather than expecting the firmware/bootloader to do it and only in
that case. After all, the kernel has all the necessary code already and
the bootloader shouldn't have to care at all.
The CCI may be turned on only when the cache is off. Leveraging the CPU
suspend code to loop back through the low-level MCPM entry point is all
that is needed to properly turn on the CCI from the kernel by using the
same code as during secondary boot.
Let's provide a generic MCPM loopback function that can be invoked by
backend initialization code to set things (CCI or similar) on the boot
CPU just as it is done for the other CPUs.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The name "power_down_finish" seems to be causing some confusion,
because it suggests that this function is responsible for taking
some action to cause the specified CPU to complete its power down.
This patch renames the affected functions to "wait_for_powerdown"
and similar, since this function's intended purpose is just to wait
for the hardware to finish a powerdown initiated by a previous
cpu_power_down.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The switcher should not depend on MAX_CLUSTER to determine ifit should
be activated or not. In a multiplatform kernel binary it is possible to
have dual-cluster and quad-cluster platforms configured in. In that case
MAX_CLUSTER which is a build time limit should be 4 and that shouldn't
prevent the switcher from working if the kernel is booted on a b.L
dual-cluster system.
In bL_switcher_halve_cpus() we already have a runtime validation check
to make sure we're dealing with only two clusters, so booting on a quad
cluster system will be caught and switcher activation aborted.
However, the b.L switcher must ensure the MCPM layer is initialized on
the booted hardware before doing anything. The mcpm_is_available()
function is added to that effect.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Abhilash Kesavan <kesavan.abhilash@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
sync_cache_w already includes a dsb, so we can just use sev() directly
then following a cache-sync.
Acked-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We have a handy macro to replace open coded __cpuc_flush_dcache_area(()
and outer_clean_range() sequences. Let's use it. No functional change.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
CPU hotplug and kexec rely on smp_ops.cpu_kill(), which is supposed
to wait for the CPU to park or power down, and perform the last
rites (such as disabling clocks etc., where the platform doesn't do
this automatically).
kexec in particular is unsafe without performing this
synchronisation to park secondaries. Without it, the secondaries
might not be parked when kexec trashes the kernel.
There is no generic way to do this synchronisation, so a new mcpm
platform_ops method power_down_finish() is added by this patch.
The new method is mandatory. A platform which provides no way to
detect when CPUs are parked is likely broken.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Currently mcpm_cpu_power_down() and mcpm_cpu_suspend() trigger BUG()
if mcpm_platform_register() is not called beforehand. This may occur
for many reasons such as some incomplete device tree passed to the kernel
or the like.
Let's be nicer to users and avoid killing the kernel if that happens by
logging a warning and returning to the caller. The mcpm_cpu_suspend()
user is already set to deal with this situation, and so is cpu_die()
invoking mcpm_cpu_die().
The problematic case would have been the B.L switcher's usage of
mcpm_cpu_power_down(), however it has to call mcpm_cpu_power_up() first
which is already set to catch an error resulting from a missing
mcpm_platform_register() call.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This allows to poke a predetermined value into a specific address
upon entering the early boot code in bL_head.S.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
This provides helper methods to coordinate between CPUs coming down
and CPUs going up, as well as documentation on the used algorithms,
so that cluster teardown and setup
operations are not done for a cluster simultaneously.
For use in the power_down() implementation:
* __mcpm_cpu_going_down(unsigned int cluster, unsigned int cpu)
* __mcpm_outbound_enter_critical(unsigned int cluster)
* __mcpm_outbound_leave_critical(unsigned int cluster)
* __mcpm_cpu_down(unsigned int cluster, unsigned int cpu)
The power_up_setup() helper should do platform-specific setup in
preparation for turning the CPU on, such as invalidating local caches
or entering coherency. It must be assembler for now, since it must
run before the MMU can be switched on. It is passed the affinity level
for which initialization should be performed.
Because the mcpm_sync_struct content is looked-up and modified
with the cache enabled or disabled depending on the code path, it is
crucial to always ensure proper cache maintenance to update main memory
right away. The sync_cache_*() helpers are used to that end.
Also, in order to prevent a cached writer from interfering with an
adjacent non-cached writer, we ensure each state variable is located to
a separate cache line.
Thanks to Nicolas Pitre and Achin Gupta for the help with this
patch.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
This is the basic API used to handle the powering up/down of individual
CPUs in a (multi-)cluster system. The platform specific backend
implementation has the responsibility to also handle the cluster level
power as well when the first/last CPU in a cluster is brought up/down.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
CPUs in cluster based systems, such as big.LITTLE, have special needs
when entering the kernel due to a hotplug event, or when resuming from
a deep sleep mode.
This is vectorized so multiple CPUs can enter the kernel in parallel
without serialization.
The mcpm prefix stands for "multi cluster power management", however
this is usable on single cluster systems as well. Only the basic
structure is introduced here. This will be extended with later patches.
In order not to complexify things more than they currently have to,
the planned work to make runtime adjusted MPIDR based indexing and
dynamic memory allocation for cluster states is postponed to a later
cycle. The MAX_NR_CLUSTERS and MAX_CPUS_PER_CLUSTER static definitions
should be sufficient for those systems expected to be available in the
near future.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>