Merge branch 'VExpress_DCSCB' of git://git.linaro.org/people/nico/linux into next/soc

From Nicolas Pitre:

This is the first MCPM backend submission for VExpress running on RTSM
aka Fast Models implementing the big.LITTLE system architecture.  This
enables SMP secondary boot as well as CPU hotplug on this platform.

A big prerequisite for this support is the CCI driver from Lorenzo
included in this pull request.

Also included is Rob Herring's set_auxcr/get_auxcr allowing nicer code.

Signed-off-by: Olof Johansson <olof@lixom.net>

* 'VExpress_DCSCB' of git://git.linaro.org/people/nico/linux:
  ARM: vexpress: Select multi-cluster SMP operation if required
  ARM: vexpress/dcscb: handle platform coherency exit/setup and CCI
  ARM: vexpress/dcscb: do not hardcode number of CPUs per cluster
  ARM: vexpress/dcscb: add CPU use counts to the power up/down API implementation
  ARM: vexpress: introduce DCSCB support
  ARM: introduce common set_auxcr/get_auxcr functions
  drivers/bus: arm-cci: function to enable CCI ports from early boot code
  drivers: bus: add ARM CCI support
This commit is contained in:
Olof Johansson 2013-05-31 23:39:35 -07:00
commit db9bde2fa5
15 changed files with 1132 additions and 14 deletions

View File

@ -0,0 +1,172 @@
=======================================================
ARM CCI cache coherent interconnect binding description
=======================================================
ARM multi-cluster systems maintain intra-cluster coherency through a
cache coherent interconnect (CCI) that is capable of monitoring bus
transactions and manage coherency, TLB invalidations and memory barriers.
It allows snooping and distributed virtual memory message broadcast across
clusters, through memory mapped interface, with a global control register
space and multiple sets of interface control registers, one per slave
interface.
Bindings for the CCI node follow the ePAPR standard, available from:
www.power.org/documentation/epapr-version-1-1/
with the addition of the bindings described in this document which are
specific to ARM.
* CCI interconnect node
Description: Describes a CCI cache coherent Interconnect component
Node name must be "cci".
Node's parent must be the root node /, and the address space visible
through the CCI interconnect is the same as the one seen from the
root node (ie from CPUs perspective as per DT standard).
Every CCI node has to define the following properties:
- compatible
Usage: required
Value type: <string>
Definition: must be set to
"arm,cci-400"
- reg
Usage: required
Value type: <prop-encoded-array>
Definition: A standard property. Specifies base physical
address of CCI control registers common to all
interfaces.
- ranges:
Usage: required
Value type: <prop-encoded-array>
Definition: A standard property. Follow rules in the ePAPR for
hierarchical bus addressing. CCI interfaces
addresses refer to the parent node addressing
scheme to declare their register bases.
CCI interconnect node can define the following child nodes:
- CCI control interface nodes
Node name must be "slave-if".
Parent node must be CCI interconnect node.
A CCI control interface node must contain the following
properties:
- compatible
Usage: required
Value type: <string>
Definition: must be set to
"arm,cci-400-ctrl-if"
- interface-type:
Usage: required
Value type: <string>
Definition: must be set to one of {"ace", "ace-lite"}
depending on the interface type the node
represents.
- reg:
Usage: required
Value type: <prop-encoded-array>
Definition: the base address and size of the
corresponding interface programming
registers.
* CCI interconnect bus masters
Description: masters in the device tree connected to a CCI port
(inclusive of CPUs and their cpu nodes).
A CCI interconnect bus master node must contain the following
properties:
- cci-control-port:
Usage: required
Value type: <phandle>
Definition: a phandle containing the CCI control interface node
the master is connected to.
Example:
cpus {
#size-cells = <0>;
#address-cells = <1>;
CPU0: cpu@0 {
device_type = "cpu";
compatible = "arm,cortex-a15";
cci-control-port = <&cci_control1>;
reg = <0x0>;
};
CPU1: cpu@1 {
device_type = "cpu";
compatible = "arm,cortex-a15";
cci-control-port = <&cci_control1>;
reg = <0x1>;
};
CPU2: cpu@100 {
device_type = "cpu";
compatible = "arm,cortex-a7";
cci-control-port = <&cci_control2>;
reg = <0x100>;
};
CPU3: cpu@101 {
device_type = "cpu";
compatible = "arm,cortex-a7";
cci-control-port = <&cci_control2>;
reg = <0x101>;
};
};
dma0: dma@3000000 {
compatible = "arm,pl330", "arm,primecell";
cci-control-port = <&cci_control0>;
reg = <0x0 0x3000000 0x0 0x1000>;
interrupts = <10>;
#dma-cells = <1>;
#dma-channels = <8>;
#dma-requests = <32>;
};
cci@2c090000 {
compatible = "arm,cci-400";
#address-cells = <1>;
#size-cells = <1>;
reg = <0x0 0x2c090000 0 0x1000>;
ranges = <0x0 0x0 0x2c090000 0x6000>;
cci_control0: slave-if@1000 {
compatible = "arm,cci-400-ctrl-if";
interface-type = "ace-lite";
reg = <0x1000 0x1000>;
};
cci_control1: slave-if@4000 {
compatible = "arm,cci-400-ctrl-if";
interface-type = "ace";
reg = <0x4000 0x1000>;
};
cci_control2: slave-if@5000 {
compatible = "arm,cci-400-ctrl-if";
interface-type = "ace";
reg = <0x5000 0x1000>;
};
};
This CCI node corresponds to a CCI component whose control registers sits
at address 0x000000002c090000.
CCI slave interface @0x000000002c091000 is connected to dma controller dma0.
CCI slave interface @0x000000002c094000 is connected to CPUs {CPU0, CPU1};
CCI slave interface @0x000000002c095000 is connected to CPUs {CPU2, CPU3};

View File

@ -0,0 +1,19 @@
ARM Dual Cluster System Configuration Block
-------------------------------------------
The Dual Cluster System Configuration Block (DCSCB) provides basic
functionality for controlling clocks, resets and configuration pins in
the Dual Cluster System implemented by the Real-Time System Model (RTSM).
Required properties:
- compatible : should be "arm,rtsm,dcscb"
- reg : physical base address and the size of the registers window
Example:
dcscb@60000000 {
compatible = "arm,rtsm,dcscb";
reg = <0x60000000 0x1000>;
};

View File

@ -61,6 +61,20 @@ static inline void set_cr(unsigned int val)
isb(); isb();
} }
static inline unsigned int get_auxcr(void)
{
unsigned int val;
asm("mrc p15, 0, %0, c1, c0, 1 @ get AUXCR" : "=r" (val));
return val;
}
static inline void set_auxcr(unsigned int val)
{
asm volatile("mcr p15, 0, %0, c1, c0, 1 @ set AUXCR"
: : "r" (val));
isb();
}
#ifndef CONFIG_SMP #ifndef CONFIG_SMP
extern void adjust_cr(unsigned long mask, unsigned long set); extern void adjust_cr(unsigned long mask, unsigned long set);
#endif #endif

View File

@ -57,4 +57,13 @@ config ARCH_VEXPRESS_CORTEX_A5_A9_ERRATA
config ARCH_VEXPRESS_CA9X4 config ARCH_VEXPRESS_CA9X4
bool "Versatile Express Cortex-A9x4 tile" bool "Versatile Express Cortex-A9x4 tile"
config ARCH_VEXPRESS_DCSCB
bool "Dual Cluster System Control Block (DCSCB) support"
depends on MCPM
select ARM_CCI
help
Support for the Dual Cluster System Configuration Block (DCSCB).
This is needed to provide CPU and cluster power management
on RTSM implementing big.LITTLE.
endmenu endmenu

View File

@ -6,5 +6,6 @@ ccflags-$(CONFIG_ARCH_MULTIPLATFORM) := -I$(srctree)/$(src)/include \
obj-y := v2m.o obj-y := v2m.o
obj-$(CONFIG_ARCH_VEXPRESS_CA9X4) += ct-ca9x4.o obj-$(CONFIG_ARCH_VEXPRESS_CA9X4) += ct-ca9x4.o
obj-$(CONFIG_ARCH_VEXPRESS_DCSCB) += dcscb.o dcscb_setup.o
obj-$(CONFIG_SMP) += platsmp.o obj-$(CONFIG_SMP) += platsmp.o
obj-$(CONFIG_HOTPLUG_CPU) += hotplug.o obj-$(CONFIG_HOTPLUG_CPU) += hotplug.o

View File

@ -6,6 +6,8 @@
void vexpress_dt_smp_map_io(void); void vexpress_dt_smp_map_io(void);
bool vexpress_smp_init_ops(void);
extern struct smp_operations vexpress_smp_ops; extern struct smp_operations vexpress_smp_ops;
extern void vexpress_cpu_die(unsigned int cpu); extern void vexpress_cpu_die(unsigned int cpu);

View File

@ -0,0 +1,253 @@
/*
* arch/arm/mach-vexpress/dcscb.c - Dual Cluster System Configuration Block
*
* Created by: Nicolas Pitre, May 2012
* Copyright: (C) 2012-2013 Linaro Limited
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/io.h>
#include <linux/spinlock.h>
#include <linux/errno.h>
#include <linux/of_address.h>
#include <linux/vexpress.h>
#include <linux/arm-cci.h>
#include <asm/mcpm.h>
#include <asm/proc-fns.h>
#include <asm/cacheflush.h>
#include <asm/cputype.h>
#include <asm/cp15.h>
#define RST_HOLD0 0x0
#define RST_HOLD1 0x4
#define SYS_SWRESET 0x8
#define RST_STAT0 0xc
#define RST_STAT1 0x10
#define EAG_CFG_R 0x20
#define EAG_CFG_W 0x24
#define KFC_CFG_R 0x28
#define KFC_CFG_W 0x2c
#define DCS_CFG_R 0x30
/*
* We can't use regular spinlocks. In the switcher case, it is possible
* for an outbound CPU to call power_down() while its inbound counterpart
* is already live using the same logical CPU number which trips lockdep
* debugging.
*/
static arch_spinlock_t dcscb_lock = __ARCH_SPIN_LOCK_UNLOCKED;
static void __iomem *dcscb_base;
static int dcscb_use_count[4][2];
static int dcscb_allcpus_mask[2];
static int dcscb_power_up(unsigned int cpu, unsigned int cluster)
{
unsigned int rst_hold, cpumask = (1 << cpu);
unsigned int all_mask = dcscb_allcpus_mask[cluster];
pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
if (cpu >= 4 || cluster >= 2)
return -EINVAL;
/*
* Since this is called with IRQs enabled, and no arch_spin_lock_irq
* variant exists, we need to disable IRQs manually here.
*/
local_irq_disable();
arch_spin_lock(&dcscb_lock);
dcscb_use_count[cpu][cluster]++;
if (dcscb_use_count[cpu][cluster] == 1) {
rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
if (rst_hold & (1 << 8)) {
/* remove cluster reset and add individual CPU's reset */
rst_hold &= ~(1 << 8);
rst_hold |= all_mask;
}
rst_hold &= ~(cpumask | (cpumask << 4));
writel_relaxed(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
} else if (dcscb_use_count[cpu][cluster] != 2) {
/*
* The only possible values are:
* 0 = CPU down
* 1 = CPU (still) up
* 2 = CPU requested to be up before it had a chance
* to actually make itself down.
* Any other value is a bug.
*/
BUG();
}
arch_spin_unlock(&dcscb_lock);
local_irq_enable();
return 0;
}
static void dcscb_power_down(void)
{
unsigned int mpidr, cpu, cluster, rst_hold, cpumask, all_mask;
bool last_man = false, skip_wfi = false;
mpidr = read_cpuid_mpidr();
cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
cpumask = (1 << cpu);
all_mask = dcscb_allcpus_mask[cluster];
pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
BUG_ON(cpu >= 4 || cluster >= 2);
__mcpm_cpu_going_down(cpu, cluster);
arch_spin_lock(&dcscb_lock);
BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
dcscb_use_count[cpu][cluster]--;
if (dcscb_use_count[cpu][cluster] == 0) {
rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
rst_hold |= cpumask;
if (((rst_hold | (rst_hold >> 4)) & all_mask) == all_mask) {
rst_hold |= (1 << 8);
last_man = true;
}
writel_relaxed(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
} else if (dcscb_use_count[cpu][cluster] == 1) {
/*
* A power_up request went ahead of us.
* Even if we do not want to shut this CPU down,
* the caller expects a certain state as if the WFI
* was aborted. So let's continue with cache cleaning.
*/
skip_wfi = true;
} else
BUG();
if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
arch_spin_unlock(&dcscb_lock);
/*
* Flush all cache levels for this cluster.
*
* A15/A7 can hit in the cache with SCTLR.C=0, so we don't need
* a preliminary flush here for those CPUs. At least, that's
* the theory -- without the extra flush, Linux explodes on
* RTSM (to be investigated).
*/
flush_cache_all();
set_cr(get_cr() & ~CR_C);
flush_cache_all();
/*
* This is a harmless no-op. On platforms with a real
* outer cache this might either be needed or not,
* depending on where the outer cache sits.
*/
outer_flush_all();
/* Disable local coherency by clearing the ACTLR "SMP" bit: */
set_auxcr(get_auxcr() & ~(1 << 6));
/*
* Disable cluster-level coherency by masking
* incoming snoops and DVM messages:
*/
cci_disable_port_by_cpu(mpidr);
__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
} else {
arch_spin_unlock(&dcscb_lock);
/*
* Flush the local CPU cache.
*
* A15/A7 can hit in the cache with SCTLR.C=0, so we don't need
* a preliminary flush here for those CPUs. At least, that's
* the theory -- without the extra flush, Linux explodes on
* RTSM (to be investigated).
*/
flush_cache_louis();
set_cr(get_cr() & ~CR_C);
flush_cache_louis();
/* Disable local coherency by clearing the ACTLR "SMP" bit: */
set_auxcr(get_auxcr() & ~(1 << 6));
}
__mcpm_cpu_down(cpu, cluster);
/* Now we are prepared for power-down, do it: */
dsb();
if (!skip_wfi)
wfi();
/* Not dead at this point? Let our caller cope. */
}
static const struct mcpm_platform_ops dcscb_power_ops = {
.power_up = dcscb_power_up,
.power_down = dcscb_power_down,
};
static void __init dcscb_usage_count_init(void)
{
unsigned int mpidr, cpu, cluster;
mpidr = read_cpuid_mpidr();
cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
BUG_ON(cpu >= 4 || cluster >= 2);
dcscb_use_count[cpu][cluster] = 1;
}
extern void dcscb_power_up_setup(unsigned int affinity_level);
static int __init dcscb_init(void)
{
struct device_node *node;
unsigned int cfg;
int ret;
if (!cci_probed())
return -ENODEV;
node = of_find_compatible_node(NULL, NULL, "arm,rtsm,dcscb");
if (!node)
return -ENODEV;
dcscb_base = of_iomap(node, 0);
if (!dcscb_base)
return -EADDRNOTAVAIL;
cfg = readl_relaxed(dcscb_base + DCS_CFG_R);
dcscb_allcpus_mask[0] = (1 << (((cfg >> 16) >> (0 << 2)) & 0xf)) - 1;
dcscb_allcpus_mask[1] = (1 << (((cfg >> 16) >> (1 << 2)) & 0xf)) - 1;
dcscb_usage_count_init();
ret = mcpm_platform_register(&dcscb_power_ops);
if (!ret)
ret = mcpm_sync_init(dcscb_power_up_setup);
if (ret) {
iounmap(dcscb_base);
return ret;
}
pr_info("VExpress DCSCB support installed\n");
/*
* Future entries into the kernel can now go
* through the cluster entry vectors.
*/
vexpress_flags_set(virt_to_phys(mcpm_entry_point));
return 0;
}
early_initcall(dcscb_init);

View File

@ -0,0 +1,38 @@
/*
* arch/arm/include/asm/dcscb_setup.S
*
* Created by: Dave Martin, 2012-06-22
* Copyright: (C) 2012-2013 Linaro Limited
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/linkage.h>
ENTRY(dcscb_power_up_setup)
cmp r0, #0 @ check affinity level
beq 2f
/*
* Enable cluster-level coherency, in preparation for turning on the MMU.
* The ACTLR SMP bit does not need to be set here, because cpu_resume()
* already restores that.
*
* A15/A7 may not require explicit L2 invalidation on reset, dependent
* on hardware integration decisions.
* For now, this code assumes that L2 is either already invalidated,
* or invalidation is not required.
*/
b cci_enable_port_for_self
2: @ Implementation-specific local CPU setup operations should go here,
@ if any. In this case, there is nothing to do.
bx lr
ENDPROC(dcscb_power_up_setup)

View File

@ -12,9 +12,11 @@
#include <linux/errno.h> #include <linux/errno.h>
#include <linux/smp.h> #include <linux/smp.h>
#include <linux/io.h> #include <linux/io.h>
#include <linux/of.h>
#include <linux/of_fdt.h> #include <linux/of_fdt.h>
#include <linux/vexpress.h> #include <linux/vexpress.h>
#include <asm/mcpm.h>
#include <asm/smp_scu.h> #include <asm/smp_scu.h>
#include <asm/mach/map.h> #include <asm/mach/map.h>
@ -203,3 +205,21 @@ struct smp_operations __initdata vexpress_smp_ops = {
.cpu_die = vexpress_cpu_die, .cpu_die = vexpress_cpu_die,
#endif #endif
}; };
bool __init vexpress_smp_init_ops(void)
{
#ifdef CONFIG_MCPM
/*
* The best way to detect a multi-cluster configuration at the moment
* is to look for the presence of a CCI in the system.
* Override the default vexpress_smp_ops if so.
*/
struct device_node *node;
node = of_find_compatible_node(NULL, NULL, "arm,cci-400");
if (node && of_device_is_available(node)) {
mcpm_smp_set_ops();
return true;
}
#endif
return false;
}

View File

@ -456,6 +456,7 @@ static const char * const v2m_dt_match[] __initconst = {
DT_MACHINE_START(VEXPRESS_DT, "ARM-Versatile Express") DT_MACHINE_START(VEXPRESS_DT, "ARM-Versatile Express")
.dt_compat = v2m_dt_match, .dt_compat = v2m_dt_match,
.smp = smp_ops(vexpress_smp_ops), .smp = smp_ops(vexpress_smp_ops),
.smp_init = smp_init_ops(vexpress_smp_init_ops),
.map_io = v2m_dt_map_io, .map_io = v2m_dt_map_io,
.init_early = v2m_dt_init_early, .init_early = v2m_dt_init_early,
.init_irq = irqchip_init, .init_irq = irqchip_init,

View File

@ -26,4 +26,11 @@ config OMAP_INTERCONNECT
help help
Driver to enable OMAP interconnect error handling driver. Driver to enable OMAP interconnect error handling driver.
config ARM_CCI
bool "ARM CCI driver support"
depends on ARM
help
Driver supporting the CCI cache coherent interconnect for ARM
platforms.
endmenu endmenu

View File

@ -7,3 +7,5 @@ obj-$(CONFIG_OMAP_OCP2SCP) += omap-ocp2scp.o
# Interconnect bus driver for OMAP SoCs. # Interconnect bus driver for OMAP SoCs.
obj-$(CONFIG_OMAP_INTERCONNECT) += omap_l3_smx.o omap_l3_noc.o obj-$(CONFIG_OMAP_INTERCONNECT) += omap_l3_smx.o omap_l3_noc.o
# CCI cache coherent interconnect for ARM platforms
obj-$(CONFIG_ARM_CCI) += arm-cci.o

533
drivers/bus/arm-cci.c Normal file
View File

@ -0,0 +1,533 @@
/*
* CCI cache coherent interconnect driver
*
* Copyright (C) 2013 ARM Ltd.
* Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* This program is distributed "as is" WITHOUT ANY WARRANTY of any
* kind, whether express or implied; without even the implied warranty
* of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*/
#include <linux/arm-cci.h>
#include <linux/io.h>
#include <linux/module.h>
#include <linux/of_address.h>
#include <linux/slab.h>
#include <asm/cacheflush.h>
#include <asm/smp_plat.h>
#define CCI_PORT_CTRL 0x0
#define CCI_CTRL_STATUS 0xc
#define CCI_ENABLE_SNOOP_REQ 0x1
#define CCI_ENABLE_DVM_REQ 0x2
#define CCI_ENABLE_REQ (CCI_ENABLE_SNOOP_REQ | CCI_ENABLE_DVM_REQ)
struct cci_nb_ports {
unsigned int nb_ace;
unsigned int nb_ace_lite;
};
enum cci_ace_port_type {
ACE_INVALID_PORT = 0x0,
ACE_PORT,
ACE_LITE_PORT,
};
struct cci_ace_port {
void __iomem *base;
unsigned long phys;
enum cci_ace_port_type type;
struct device_node *dn;
};
static struct cci_ace_port *ports;
static unsigned int nb_cci_ports;
static void __iomem *cci_ctrl_base;
static unsigned long cci_ctrl_phys;
struct cpu_port {
u64 mpidr;
u32 port;
};
/*
* Use the port MSB as valid flag, shift can be made dynamic
* by computing number of bits required for port indexes.
* Code disabling CCI cpu ports runs with D-cache invalidated
* and SCTLR bit clear so data accesses must be kept to a minimum
* to improve performance; for now shift is left static to
* avoid one more data access while disabling the CCI port.
*/
#define PORT_VALID_SHIFT 31
#define PORT_VALID (0x1 << PORT_VALID_SHIFT)
static inline void init_cpu_port(struct cpu_port *port, u32 index, u64 mpidr)
{
port->port = PORT_VALID | index;
port->mpidr = mpidr;
}
static inline bool cpu_port_is_valid(struct cpu_port *port)
{
return !!(port->port & PORT_VALID);
}
static inline bool cpu_port_match(struct cpu_port *port, u64 mpidr)
{
return port->mpidr == (mpidr & MPIDR_HWID_BITMASK);
}
static struct cpu_port cpu_port[NR_CPUS];
/**
* __cci_ace_get_port - Function to retrieve the port index connected to
* a cpu or device.
*
* @dn: device node of the device to look-up
* @type: port type
*
* Return value:
* - CCI port index if success
* - -ENODEV if failure
*/
static int __cci_ace_get_port(struct device_node *dn, int type)
{
int i;
bool ace_match;
struct device_node *cci_portn;
cci_portn = of_parse_phandle(dn, "cci-control-port", 0);
for (i = 0; i < nb_cci_ports; i++) {
ace_match = ports[i].type == type;
if (ace_match && cci_portn == ports[i].dn)
return i;
}
return -ENODEV;
}
int cci_ace_get_port(struct device_node *dn)
{
return __cci_ace_get_port(dn, ACE_LITE_PORT);
}
EXPORT_SYMBOL_GPL(cci_ace_get_port);
static void __init cci_ace_init_ports(void)
{
int port, ac, cpu;
u64 hwid;
const u32 *cell;
struct device_node *cpun, *cpus;
cpus = of_find_node_by_path("/cpus");
if (WARN(!cpus, "Missing cpus node, bailing out\n"))
return;
if (WARN_ON(of_property_read_u32(cpus, "#address-cells", &ac)))
ac = of_n_addr_cells(cpus);
/*
* Port index look-up speeds up the function disabling ports by CPU,
* since the logical to port index mapping is done once and does
* not change after system boot.
* The stashed index array is initialized for all possible CPUs
* at probe time.
*/
for_each_child_of_node(cpus, cpun) {
if (of_node_cmp(cpun->type, "cpu"))
continue;
cell = of_get_property(cpun, "reg", NULL);
if (WARN(!cell, "%s: missing reg property\n", cpun->full_name))
continue;
hwid = of_read_number(cell, ac);
cpu = get_logical_index(hwid & MPIDR_HWID_BITMASK);
if (cpu < 0 || !cpu_possible(cpu))
continue;
port = __cci_ace_get_port(cpun, ACE_PORT);
if (port < 0)
continue;
init_cpu_port(&cpu_port[cpu], port, cpu_logical_map(cpu));
}
for_each_possible_cpu(cpu) {
WARN(!cpu_port_is_valid(&cpu_port[cpu]),
"CPU %u does not have an associated CCI port\n",
cpu);
}
}
/*
* Functions to enable/disable a CCI interconnect slave port
*
* They are called by low-level power management code to disable slave
* interfaces snoops and DVM broadcast.
* Since they may execute with cache data allocation disabled and
* after the caches have been cleaned and invalidated the functions provide
* no explicit locking since they may run with D-cache disabled, so normal
* cacheable kernel locks based on ldrex/strex may not work.
* Locking has to be provided by BSP implementations to ensure proper
* operations.
*/
/**
* cci_port_control() - function to control a CCI port
*
* @port: index of the port to setup
* @enable: if true enables the port, if false disables it
*/
static void notrace cci_port_control(unsigned int port, bool enable)
{
void __iomem *base = ports[port].base;
writel_relaxed(enable ? CCI_ENABLE_REQ : 0, base + CCI_PORT_CTRL);
/*
* This function is called from power down procedures
* and must not execute any instruction that might
* cause the processor to be put in a quiescent state
* (eg wfi). Hence, cpu_relax() can not be added to this
* read loop to optimize power, since it might hide possibly
* disruptive operations.
*/
while (readl_relaxed(cci_ctrl_base + CCI_CTRL_STATUS) & 0x1)
;
}
/**
* cci_disable_port_by_cpu() - function to disable a CCI port by CPU
* reference
*
* @mpidr: mpidr of the CPU whose CCI port should be disabled
*
* Disabling a CCI port for a CPU implies disabling the CCI port
* controlling that CPU cluster. Code disabling CPU CCI ports
* must make sure that the CPU running the code is the last active CPU
* in the cluster ie all other CPUs are quiescent in a low power state.
*
* Return:
* 0 on success
* -ENODEV on port look-up failure
*/
int notrace cci_disable_port_by_cpu(u64 mpidr)
{
int cpu;
bool is_valid;
for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
is_valid = cpu_port_is_valid(&cpu_port[cpu]);
if (is_valid && cpu_port_match(&cpu_port[cpu], mpidr)) {
cci_port_control(cpu_port[cpu].port, false);
return 0;
}
}
return -ENODEV;
}
EXPORT_SYMBOL_GPL(cci_disable_port_by_cpu);
/**
* cci_enable_port_for_self() - enable a CCI port for calling CPU
*
* Enabling a CCI port for the calling CPU implies enabling the CCI
* port controlling that CPU's cluster. Caller must make sure that the
* CPU running the code is the first active CPU in the cluster and all
* other CPUs are quiescent in a low power state or waiting for this CPU
* to complete the CCI initialization.
*
* Because this is called when the MMU is still off and with no stack,
* the code must be position independent and ideally rely on callee
* clobbered registers only. To achieve this we must code this function
* entirely in assembler.
*
* On success this returns with the proper CCI port enabled. In case of
* any failure this never returns as the inability to enable the CCI is
* fatal and there is no possible recovery at this stage.
*/
asmlinkage void __naked cci_enable_port_for_self(void)
{
asm volatile ("\n"
" mrc p15, 0, r0, c0, c0, 5 @ get MPIDR value \n"
" and r0, r0, #"__stringify(MPIDR_HWID_BITMASK)" \n"
" adr r1, 5f \n"
" ldr r2, [r1] \n"
" add r1, r1, r2 @ &cpu_port \n"
" add ip, r1, %[sizeof_cpu_port] \n"
/* Loop over the cpu_port array looking for a matching MPIDR */
"1: ldr r2, [r1, %[offsetof_cpu_port_mpidr_lsb]] \n"
" cmp r2, r0 @ compare MPIDR \n"
" bne 2f \n"
/* Found a match, now test port validity */
" ldr r3, [r1, %[offsetof_cpu_port_port]] \n"
" tst r3, #"__stringify(PORT_VALID)" \n"
" bne 3f \n"
/* no match, loop with the next cpu_port entry */
"2: add r1, r1, %[sizeof_struct_cpu_port] \n"
" cmp r1, ip @ done? \n"
" blo 1b \n"
/* CCI port not found -- cheaply try to stall this CPU */
"cci_port_not_found: \n"
" wfi \n"
" wfe \n"
" b cci_port_not_found \n"
/* Use matched port index to look up the corresponding ports entry */
"3: bic r3, r3, #"__stringify(PORT_VALID)" \n"
" adr r0, 6f \n"
" ldmia r0, {r1, r2} \n"
" sub r1, r1, r0 @ virt - phys \n"
" ldr r0, [r0, r2] @ *(&ports) \n"
" mov r2, %[sizeof_struct_ace_port] \n"
" mla r0, r2, r3, r0 @ &ports[index] \n"
" sub r0, r0, r1 @ virt_to_phys() \n"
/* Enable the CCI port */
" ldr r0, [r0, %[offsetof_port_phys]] \n"
" mov r3, #"__stringify(CCI_ENABLE_REQ)" \n"
" str r3, [r0, #"__stringify(CCI_PORT_CTRL)"] \n"
/* poll the status reg for completion */
" adr r1, 7f \n"
" ldr r0, [r1] \n"
" ldr r0, [r0, r1] @ cci_ctrl_base \n"
"4: ldr r1, [r0, #"__stringify(CCI_CTRL_STATUS)"] \n"
" tst r1, #1 \n"
" bne 4b \n"
" mov r0, #0 \n"
" bx lr \n"
" .align 2 \n"
"5: .word cpu_port - . \n"
"6: .word . \n"
" .word ports - 6b \n"
"7: .word cci_ctrl_phys - . \n"
: :
[sizeof_cpu_port] "i" (sizeof(cpu_port)),
#ifndef __ARMEB__
[offsetof_cpu_port_mpidr_lsb] "i" (offsetof(struct cpu_port, mpidr)),
#else
[offsetof_cpu_port_mpidr_lsb] "i" (offsetof(struct cpu_port, mpidr)+4),
#endif
[offsetof_cpu_port_port] "i" (offsetof(struct cpu_port, port)),
[sizeof_struct_cpu_port] "i" (sizeof(struct cpu_port)),
[sizeof_struct_ace_port] "i" (sizeof(struct cci_ace_port)),
[offsetof_port_phys] "i" (offsetof(struct cci_ace_port, phys)) );
unreachable();
}
/**
* __cci_control_port_by_device() - function to control a CCI port by device
* reference
*
* @dn: device node pointer of the device whose CCI port should be
* controlled
* @enable: if true enables the port, if false disables it
*
* Return:
* 0 on success
* -ENODEV on port look-up failure
*/
int notrace __cci_control_port_by_device(struct device_node *dn, bool enable)
{
int port;
if (!dn)
return -ENODEV;
port = __cci_ace_get_port(dn, ACE_LITE_PORT);
if (WARN_ONCE(port < 0, "node %s ACE lite port look-up failure\n",
dn->full_name))
return -ENODEV;
cci_port_control(port, enable);
return 0;
}
EXPORT_SYMBOL_GPL(__cci_control_port_by_device);
/**
* __cci_control_port_by_index() - function to control a CCI port by port index
*
* @port: port index previously retrieved with cci_ace_get_port()
* @enable: if true enables the port, if false disables it
*
* Return:
* 0 on success
* -ENODEV on port index out of range
* -EPERM if operation carried out on an ACE PORT
*/
int notrace __cci_control_port_by_index(u32 port, bool enable)
{
if (port >= nb_cci_ports || ports[port].type == ACE_INVALID_PORT)
return -ENODEV;
/*
* CCI control for ports connected to CPUS is extremely fragile
* and must be made to go through a specific and controlled
* interface (ie cci_disable_port_by_cpu(); control by general purpose
* indexing is therefore disabled for ACE ports.
*/
if (ports[port].type == ACE_PORT)
return -EPERM;
cci_port_control(port, enable);
return 0;
}
EXPORT_SYMBOL_GPL(__cci_control_port_by_index);
static const struct cci_nb_ports cci400_ports = {
.nb_ace = 2,
.nb_ace_lite = 3
};
static const struct of_device_id arm_cci_matches[] = {
{.compatible = "arm,cci-400", .data = &cci400_ports },
{},
};
static const struct of_device_id arm_cci_ctrl_if_matches[] = {
{.compatible = "arm,cci-400-ctrl-if", },
{},
};
static int __init cci_probe(void)
{
struct cci_nb_ports const *cci_config;
int ret, i, nb_ace = 0, nb_ace_lite = 0;
struct device_node *np, *cp;
struct resource res;
const char *match_str;
bool is_ace;
np = of_find_matching_node(NULL, arm_cci_matches);
if (!np)
return -ENODEV;
cci_config = of_match_node(arm_cci_matches, np)->data;
if (!cci_config)
return -ENODEV;
nb_cci_ports = cci_config->nb_ace + cci_config->nb_ace_lite;
ports = kcalloc(sizeof(*ports), nb_cci_ports, GFP_KERNEL);
if (!ports)
return -ENOMEM;
ret = of_address_to_resource(np, 0, &res);
if (!ret) {
cci_ctrl_base = ioremap(res.start, resource_size(&res));
cci_ctrl_phys = res.start;
}
if (ret || !cci_ctrl_base) {
WARN(1, "unable to ioremap CCI ctrl\n");
ret = -ENXIO;
goto memalloc_err;
}
for_each_child_of_node(np, cp) {
if (!of_match_node(arm_cci_ctrl_if_matches, cp))
continue;
i = nb_ace + nb_ace_lite;
if (i >= nb_cci_ports)
break;
if (of_property_read_string(cp, "interface-type",
&match_str)) {
WARN(1, "node %s missing interface-type property\n",
cp->full_name);
continue;
}
is_ace = strcmp(match_str, "ace") == 0;
if (!is_ace && strcmp(match_str, "ace-lite")) {
WARN(1, "node %s containing invalid interface-type property, skipping it\n",
cp->full_name);
continue;
}
ret = of_address_to_resource(cp, 0, &res);
if (!ret) {
ports[i].base = ioremap(res.start, resource_size(&res));
ports[i].phys = res.start;
}
if (ret || !ports[i].base) {
WARN(1, "unable to ioremap CCI port %d\n", i);
continue;
}
if (is_ace) {
if (WARN_ON(nb_ace >= cci_config->nb_ace))
continue;
ports[i].type = ACE_PORT;
++nb_ace;
} else {
if (WARN_ON(nb_ace_lite >= cci_config->nb_ace_lite))
continue;
ports[i].type = ACE_LITE_PORT;
++nb_ace_lite;
}
ports[i].dn = cp;
}
/* initialize a stashed array of ACE ports to speed-up look-up */
cci_ace_init_ports();
/*
* Multi-cluster systems may need this data when non-coherent, during
* cluster power-up/power-down. Make sure it reaches main memory.
*/
sync_cache_w(&cci_ctrl_base);
sync_cache_w(&cci_ctrl_phys);
sync_cache_w(&ports);
sync_cache_w(&cpu_port);
__sync_cache_range_w(ports, sizeof(*ports) * nb_cci_ports);
pr_info("ARM CCI driver probed\n");
return 0;
memalloc_err:
kfree(ports);
return ret;
}
static int cci_init_status = -EAGAIN;
static DEFINE_MUTEX(cci_probing);
static int __init cci_init(void)
{
if (cci_init_status != -EAGAIN)
return cci_init_status;
mutex_lock(&cci_probing);
if (cci_init_status == -EAGAIN)
cci_init_status = cci_probe();
mutex_unlock(&cci_probing);
return cci_init_status;
}
/*
* To sort out early init calls ordering a helper function is provided to
* check if the CCI driver has beed initialized. Function check if the driver
* has been initialized, if not it calls the init function that probes
* the driver and updates the return value.
*/
bool __init cci_probed(void)
{
return cci_init() == 0;
}
EXPORT_SYMBOL_GPL(cci_probed);
early_initcall(cci_init);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("ARM CCI support");

View File

@ -37,20 +37,6 @@
extern void highbank_set_cpu_jump(int cpu, void *jump_addr); extern void highbank_set_cpu_jump(int cpu, void *jump_addr);
extern void *scu_base_addr; extern void *scu_base_addr;
static inline unsigned int get_auxcr(void)
{
unsigned int val;
asm("mrc p15, 0, %0, c1, c0, 1 @ get AUXCR" : "=r" (val) : : "cc");
return val;
}
static inline void set_auxcr(unsigned int val)
{
asm volatile("mcr p15, 0, %0, c1, c0, 1 @ set AUXCR"
: : "r" (val) : "cc");
isb();
}
static noinline void calxeda_idle_restore(void) static noinline void calxeda_idle_restore(void)
{ {
set_cr(get_cr() | CR_C); set_cr(get_cr() | CR_C);

61
include/linux/arm-cci.h Normal file
View File

@ -0,0 +1,61 @@
/*
* CCI cache coherent interconnect support
*
* Copyright (C) 2013 ARM Ltd.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef __LINUX_ARM_CCI_H
#define __LINUX_ARM_CCI_H
#include <linux/errno.h>
#include <linux/types.h>
struct device_node;
#ifdef CONFIG_ARM_CCI
extern bool cci_probed(void);
extern int cci_ace_get_port(struct device_node *dn);
extern int cci_disable_port_by_cpu(u64 mpidr);
extern int __cci_control_port_by_device(struct device_node *dn, bool enable);
extern int __cci_control_port_by_index(u32 port, bool enable);
#else
static inline bool cci_probed(void) { return false; }
static inline int cci_ace_get_port(struct device_node *dn)
{
return -ENODEV;
}
static inline int cci_disable_port_by_cpu(u64 mpidr) { return -ENODEV; }
static inline int __cci_control_port_by_device(struct device_node *dn,
bool enable)
{
return -ENODEV;
}
static inline int __cci_control_port_by_index(u32 port, bool enable)
{
return -ENODEV;
}
#endif
#define cci_disable_port_by_device(dev) \
__cci_control_port_by_device(dev, false)
#define cci_enable_port_by_device(dev) \
__cci_control_port_by_device(dev, true)
#define cci_disable_port_by_index(dev) \
__cci_control_port_by_index(dev, false)
#define cci_enable_port_by_index(dev) \
__cci_control_port_by_index(dev, true)
#endif