Commit Graph

35 Commits

Author SHA1 Message Date
Rob Herring 92871b94a5 ARM: 7855/1: Add check for Cortex-A15 errata 798181 ECO
The work-around for A15 errata 798181 is not needed if appropriate ECO
fixes have been applied to r3p2 and earlier core revisions. This can be
checked by reading REVIDR register bits 4 and 9. If only bit 4 is set,
then the IPI broadcast can be skipped.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-10-29 11:06:13 +00:00
Will Deacon 62cbbc42e0 ARM: tlb: reduce scope of barrier domains for TLB invalidation
Our TLB invalidation routines may require a barrier before the
maintenance (in order to ensure pending page table writes are visible to
the hardware walker) and barriers afterwards (in order to ensure
completion of the maintenance and visibility in the instruction stream).

Whilst this is expensive, the cost can be reduced somewhat by reducing
the scope of the barrier instructions:

  - The barrier before only needs to apply to stores (pte writes)
  - Local ops are required only to affect the non-shareable domain
  - Global ops are required only to affect the inner-shareable domain

This patch makes these changes for the TLB flushing code.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12 12:25:45 +01:00
Will Deacon 2c813980c6 ARM: tlb: don't perform inner-shareable invalidation for local BP ops
Now that the ASID allocator doesn't require inner-shareable maintenance,
we can convert the local_bp_flush_all function to perform only
non-shareable flushing, in a similar manner to the TLB invalidation
routines.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12 12:25:44 +01:00
Will Deacon 587b9b6487 ARM: tlb: don't bother with barriers for branch predictor maintenance
Branch predictor maintenance is only required when we are either
changing the kernel's view of memory (switching tables completely) or
dealing with ASID rollover.

Both of these use-cases require subsequent TLB invalidation, which has
the relevant barrier instructions to ensure completion and visibility
of the maintenance, so this patch removes the instruction barrier from
[local_]flush_bp_all.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12 12:25:44 +01:00
Will Deacon f0915781bd ARM: tlb: don't perform inner-shareable invalidation for local TLB ops
Inner-shareable TLB invalidation is typically more expensive than local
(non-shareable) invalidation, so performing the broadcasting for
local_flush_tlb_* operations is a waste of cycles and needlessly
clobbers entries in the TLBs of other CPUs.

This patch introduces __flush_tlb_* versions for many of the TLB
invalidation functions, which only respect inner-shareable variants of
the invalidation instructions when presented with the TLB_V7_UIS_FULL
flag. The local version is also inlined to prevent SMP_ON_UP kernels
from missing flushes, where the __flush variant would be called with
the UP flags.

This gains us around 0.5% in hackbench scores for a dual-core A15, but I
would expect this to improve as more cores (and clusters) are added to
the equation.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Albin Tonnerre <Albin.Tonnerre@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12 12:25:44 +01:00
Fabio Estevam 1f49856bb0 ARM: 7789/1: Do not run dummy_flush_tlb_a15_erratum() on non-Cortex-A15
Commit 93dc688 (ARM: 7684/1: errata: Workaround for Cortex-A15 erratum 798181 (TLBI/DSB operations)) causes the following undefined instruction error on a mx53 (Cortex-A8):

Internal error: Oops - undefined instruction: 0 [#1] SMP ARM
CPU: 0 PID: 275 Comm: modprobe Not tainted 3.11.0-rc2-next-20130722-00009-g9b0f371 #881
task: df46cc00 ti: df48e000 task.ti: df48e000
PC is at check_and_switch_context+0x17c/0x4d0
LR is at check_and_switch_context+0xdc/0x4d0

This problem happens because check_and_switch_context() calls dummy_flush_tlb_a15_erratum() without checking if we are really running on a Cortex-A15 or not.

To avoid this issue, only call dummy_flush_tlb_a15_erratum() inside
check_and_switch_context() if erratum_a15_798181() returns true, which means that we are really running on a Cortex-A15.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-26 12:02:09 +01:00
Russell King b3f288de7c Merge branch 'for-rmk/hugepages' of git://git.linaro.org/people/stevecapper/linux into devel-stable
These changes bring both HugeTLB support and Transparent HugePage
(THP) support to ARM.  Only long descriptors (LPAE) are supported
in this series.

The code has been tested on an Arndale board (Exynos 5250).
2013-06-18 20:05:48 +01:00
Jonathan Austin 8d655d835b ARM: nommu: add stub local_flush_bp_all() for !CONFIG_MMUU
Since the merging of Will's tlb-ops branch, specifically 89c7e4b8bb
(ARM: 7661/1: mm: perform explicit branch predictor maintenance when required),
building SMP without CONFIG_MMU has been broken.

The local_flush_bp_all function is only called for operations related to
changing the kernel's view of memory and ASID rollover - both of which are
irrelevant to an !MMU kernel.

This patch adds a stub local_flush_bp_all() function to the other tlb
maintenance stubs and restores the ability to build an SMP !MMU kernel.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
2013-06-07 17:02:46 +01:00
Will Deacon 5c709e6998 ARM: nommu: define dummy TLB operations for nommu configurations
nommu platforms do not perform address translation and therefore clearly
don't have TLBs. However, some SMP code assumes the presence of the TLB
flushing routines and will therefore fail to compile for a nommu system.

This patch defines dummy local_* TLB operations and #defines
tlb_ops_need_broadcast() as 0, therefore causing the usual ARM SMP TLB
operations to call the local variants instead.

Signed-off-by: Will Deacon <will.deacon@arm.com>
CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
CC: Nicolas Pitre <nico@linaro.org>
2013-06-07 17:02:42 +01:00
Catalin Marinas 8d96250700 ARM: mm: Transparent huge page support for LPAE systems.
The patch adds support for THP (transparent huge pages) to LPAE
systems. When this feature is enabled, the kernel tries to map
anonymous pages as 2MB sections where possible.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[steve.capper@linaro.org: symbolic constants used, value of
PMD_SECT_SPLITTING adjusted, tlbflush.h included in pgtable.h,
added PROT_NONE support.]
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
2013-06-04 16:52:38 +01:00
Russell King 946342d03e Merge branches 'devel-stable', 'entry', 'fixes', 'mach-types', 'misc' and 'smp-hotplug' into for-linus 2013-05-02 21:30:36 +01:00
Russell King 4d855021dd Merge branch 'for-rmk/740t' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into fixes 2013-04-17 10:35:23 +01:00
Will Deacon ae8a8b9553 ARM: 7691/1: mm: kill unused TLB_CAN_READ_FROM_L1_CACHE and use ALT_SMP instead
Many ARMv7 cores have hardware page table walkers that can read the L1
cache. This is discoverable from the ID_MMFR3 register, although this
can be expensive to access from the low-level set_pte functions and is a
pain to cache, particularly with multi-cluster systems.

A useful observation is that the multi-processing extensions for ARMv7
require coherent table walks, meaning that we can make use of ALT_SMP
patching in proc-v7-* to patch away the cache flush safely for these
cores.

Reported-by: Albin Tonnerre <Albin.Tonnerre@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-04-03 17:39:07 +01:00
Catalin Marinas 93dc68876b ARM: 7684/1: errata: Workaround for Cortex-A15 erratum 798181 (TLBI/DSB operations)
On Cortex-A15 (r0p0..r3p2) the TLBI/DSB are not adequately shooting down
all use of the old entries. This patch implements the erratum workaround
which consists of:

1. Dummy TLBIMVAIS and DSB on the CPU doing the TLBI operation.
2. Send IPI to the CPUs that are running the same mm (and ASID) as the
   one being invalidated (or all the online CPUs for global pages).
3. CPU receiving the IPI executes a DMB and CLREX (part of the exception
   return code already).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-04-03 16:45:49 +01:00
Will Deacon 4cc3daaf39 ARM: tlbflush: remove ARMv3 support
We no longer support any ARMv3 platforms, so remove the old tlbflushing
code.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-03-25 13:31:55 +00:00
Will Deacon 862c588f06 ARM: 7660/1: tlb: add branch predictor maintenance operations
The ARM architecture requires explicit branch predictor maintenance
when updating an instruction stream for a given virtual address. In
reality, this isn't so much of a burden because the branch predictor
is flushed during the cache maintenance required to make the new
instructions visible to the I-side of the processor.

However, there are still some cases where explicit flushing is required,
so add a local_bp_flush_all operation to deal with this.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-03-03 22:54:15 +00:00
Russell King 357c9c1f07 ARM: Remove support for ARMv3 ARM610 and ARM710 CPUs
This patch removes support for ARMv3 CPUs, which haven't worked properly
for quite some time (see the FIXME comment in arch/arm/mm/fault.c).  The
only V3 parts left is the cache model for ARMv3, which is needed for some
odd reason by ARM740T CPUs, and being able to build with -march=armv3,
which is required for the RiscPC platform due to its bus structure.

Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-05-05 05:50:50 +01:00
Russell King 87067a935a ARM: Optimize multi-CPU tlb flushing a little more
The compiler does not conditionalize the assembly instructions for
the tlb operations, which leads to sub-optimal code being generated
when building a kernel for multiple CPUs.

We can tweak things fairly simply as the code fragment below shows:

    17f8:       e3120001        tst     r2, #1  ; 0x1
...
    1800:       0a000000        beq     1808 <handle_pte_fault+0x194>
    1804:       ee061f10        mcr     15, 0, r1, cr6, cr0, {0}
    1808:       e3120004        tst     r2, #4  ; 0x4
    180c:       0a000000        beq     1814 <handle_pte_fault+0x1a0>
    1810:       ee081f36        mcr     15, 0, r1, cr8, cr6, {1}
becomes:
    17f0:       e3120001        tst     r2, #1  ; 0x1
    17f4:       1e063f10        mcrne   15, 0, r3, cr6, cr0, {0}
    17f8:       e3120004        tst     r2, #4  ; 0x4
    17fc:       1e083f36        mcrne   15, 0, r3, cr8, cr6, {1}

Overall, for Realview with V6 and V7 CPUs configured:

   text    data     bss     dec     hex filename
4153998  207340 5371036 9732374  948116 ../build/realview/vmlinux.before
4153366  207332 5371036 9731734  947e96 ../build/realview/vmlinux.after

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24 09:38:52 +00:00
Catalin Marinas 442e70c0b3 ARM: 7076/1: LPAE: Add (pte|pmd)val_t type definitions as u32
This patch defines the (pte|pmd)val_t as u32 and changes the page table
types to be based on these. The PMD bits are converted to the
corresponding type using the _AT macro.

The flush_pmd_entry/clean_pmd_entry argument was changed to (void *) to
allow them to be used with both PGD and PMD pointers and avoid code
duplication.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-06 15:40:05 +01:00
Russell King 4348810a24 ARM: btc: avoid invalidating the branch target cache on kernel TLB maintanence
Kernel space needs very little in the way of BTC maintanence as most
mappings which are created and destroyed are non-executable, and so
could never enter the instruction stream.

The case which does warrant BTC maintanence is when a module is loaded.
This creates a new executable mapping, but at that point the pages have
not been initialized with code and data, so at that point they contain
unpredictable information.  Invalidating the BTC at this stage serves
little useful purpose.

Before we execute module code, we call flush_icache_range(), which deals
with the BTC maintanence requirements.  This ensures that we have a BTC
maintanence operation before we execute code via the newly created
mapping.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-07-19 11:44:06 +01:00
Russell King 58e9c47fa0 ARM: tlb: move noMMU tlb_flush() to asm/tlb.h
There's no need to noMMU to put tlb_flush() in asm/tlbflush.h - it's
part of the tlb shootdown interface.  Move it to asm/tlb.h instead, as
per x86.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-02-21 19:29:28 +00:00
Russell King 23beab76b4 Merge branches 'at91', 'dcache', 'ftrace', 'hwbpt', 'misc', 'mmci', 's3c', 'st-ux' and 'unwind' into devel 2010-10-18 22:34:25 +01:00
Russell King f00ec48fad ARM: Allow SMP kernels to boot on UP systems
UP systems do not implement all the instructions that SMP systems have,
so in order to boot a SMP kernel on a UP system, we need to rewrite
parts of the kernel.

Do this using an 'alternatives' scheme, where the kernel code and data
is modified prior to initialization to replace the SMP instructions,
thereby rendering the problematical code ineffectual.  We use the linker
to generate a list of 32-bit word locations and their replacement values,
and run through these replacements when we detect a UP system.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-10-04 20:23:36 +01:00
Catalin Marinas 6012191aa9 ARM: 6380/1: Introduce __sync_icache_dcache() for VIPT caches
On SMP systems, there is a small chance of a PTE becoming visible to a
different CPU before the current cache maintenance operations in
update_mmu_cache(). To avoid this, cache maintenance must be handled in
set_pte_at() (similar to IA-64 and PowerPC).

This patch provides a unified VIPT cache handling mechanism and
implements the __sync_icache_dcache() function for ARMv6 onwards
architectures. It is called from set_pte_at() and replaces the
update_mmu_cache(). The latter is still used on VIVT hardware where a
vm_area_struct is required.

Tested-by: Rabin Vincent <rabin.vincent@stericsson.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-19 12:17:44 +01:00
Catalin Marinas c01778001a ARM: 6379/1: Assume new page cache pages have dirty D-cache
There are places in Linux where writes to newly allocated page cache
pages happen without a subsequent call to flush_dcache_page() (several
PIO drivers including USB HCD). This patch changes the meaning of
PG_arch_1 to be PG_dcache_clean and always flush the D-cache for a newly
mapped page in update_mmu_cache().

The patch also sets the PG_arch_1 bit in the DMA cache maintenance
function to avoid additional cache flushing in update_mmu_cache().

Tested-by: Rabin Vincent <rabin.vincent@stericsson.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-19 12:17:43 +01:00
Will Deacon cdf357f1e1 ARM: 6299/1: errata: TLBIASIDIS and TLBIMVAIS operations can broadcast a faulty ASID
On versions of the Cortex-A9 prior to r2p0, performing TLB invalidations by
ASID match can result in the incorrect ASID being broadcast to other CPUs.
As a consequence of this, the targetted TLB entries are not invalidated
across the system.

This workaround changes the TLB flushing routines to invalidate entries
regardless of the ASID.

Cc: <stable@kernel.org>
Tested-by: Rob Clark <rob@ti.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-08-10 22:10:54 +01:00
Catalin Marinas b8349b569a ARM: 6112/1: Use the Inner Shareable I-cache and BTB ops on ARMv7 SMP
The standard I-cache Invalidate All (ICIALLU) and Branch Predication
Invalidate All (BPIALL) operations are not automatically broadcast to
the other CPUs in an ARMv7 MP system. The patch adds the Inner Shareable
variants, ICIALLUIS and BPIALLIS, if ARMv7 and SMP.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-08 10:44:30 +01:00
Russell King 4b3073e1c5 MM: Pass a PTE pointer to update_mmu_cache() rather than the PTE itself
On VIVT ARM, when we have multiple shared mappings of the same file
in the same MM, we need to ensure that we have coherency across all
copies.  We do this via make_coherent() by making the pages
uncacheable.

This used to work fine, until we allowed highmem with highpte - we
now have a page table which is mapped as required, and is not available
for modification via update_mmu_cache().

Ralf Beache suggested getting rid of the PTE value passed to
update_mmu_cache():

  On MIPS update_mmu_cache() calls __update_tlb() which walks pagetables
  to construct a pointer to the pte again.  Passing a pte_t * is much
  more elegant.  Maybe we might even replace the pte argument with the
  pte_t?

Ben Herrenschmidt would also like the pte pointer for PowerPC:

  Passing the ptep in there is exactly what I want.  I want that
  -instead- of the PTE value, because I have issue on some ppc cases,
  for I$/D$ coherency, where set_pte_at() may decide to mask out the
  _PAGE_EXEC.

So, pass in the mapped page table pointer into update_mmu_cache(), and
remove the PTE value, updating all implementations and call sites to
suit.

Includes a fix from Stephen Rothwell:

  sparc: fix fallout from update_mmu_cache API change

  Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-02-20 16:41:46 +00:00
Santosh Shilimkar daaeb6c938 ARM: 5763/1: ARM: SMP: Fix the BUG with CONFIG_PREEMPT enabled
This patch fixes the BUG: using smp_processor_id() in preemptible
Below is the stripped backtrace.

BUG: using smp_processor_id() in preemptible [00000000] code: init/1
caller is flush_tlb_mm+0x44/0x70
Backtrace:
[<c00225c4>] (dump_backtrace+0x0/0x110) from [<c01713a0>] (dump_stack+0x18/0x1c)
 r7:00000000 r6:c00234f0 r5:00000001 r4:c7828000
[<c0171388>] (dump_stack+0x0/0x1c) from [<c0135364>] (debug_smp_processor_id+0xc0/0xf0)
[<c01352a4>] (debug_smp_processor_id+0x0/0xf0) from [<c00234f0>] (flush_tlb_mm+0x44/0x70)
 r7:00000000 r6:c60b41a0 r5:c60b4154 r4:00000001
[<c00234ac>] (flush_tlb_mm+0x0/0x70) from [<c0039568>] (dup_mm+0x304/0x38c)
 r5:c1f09058 r4:00000000
[<c0039264>] (dup_mm+0x0/0x38c) from [<c0039de4>] (copy_process+0x7b8/0xeb0)
[<c003962c>] (copy_process+0x0/0xeb0) from [<c003a638>] (do_fork+0x15c/0x29c)
[<c003a4dc>] (do_fork+0x0/0x29c) from [<c0021df0>] (sys_clone+0x34/0x3c)
[<c0021dbc>] (sys_clone+0x0/0x3c) from [<c001efa0>] (ret_fast_syscall+0x0/0x2c)

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-10-15 15:45:15 +01:00
Rusty Russell 56f8ba83a5 cpumask: use mm_cpumask() wrapper: arm
Makes code futureproof against the impending change to mm->cpu_vm_mask.

It's also a chance to use the new cpumask_ ops which take a pointer
(the older ones are deprecated, but there's no hurry for arch code).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24 09:34:49 +09:30
Catalin Marinas faa7bc51c1 Check whether the TLB operations need broadcasting on SMP systems
ARMv7 SMP hardware can handle the TLB maintenance operations
broadcasting in hardware so that the software can avoid the costly IPIs.
This patch adds the necessary checks (the MMFR3 CPUID register) to avoid
the broadcasting if already supported by the hardware.

(this patch is based on the work done by Tony Thompson @ ARM)

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2009-05-30 14:00:14 +01:00
Paulius Zaleckas 28853ac8fe ARM: Add support for FA526 v2
Adds support for Faraday FA526 core. This core is used at least by:
Cortina Systems Gemini and Centroid family
Cavium Networks ECONA family
Grain Media GM8120
Pixelplus ImageARM
Prolific PL-1029
Faraday IP evaluation boards

v2:
- move TLB_BTB to separate patch
- update copyrights

Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
2009-03-25 13:10:01 +02:00
Paulius Zaleckas bba7d0b9ba ARM: tlbflush.h: introduce TLB_BTB flag
Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
2009-03-25 10:58:47 +02:00
Paul Walmsley 61db7fb1c7 [ARM] 5192/1: ARM TLB: add v7wbi_{possible,always}_flags to {possible,always}_tlb_flags
Commit 2ccdd1e77d doesn't add
v7wbi_possible_flags and v7wbi_always_flags to possible_tlb_flags and
always_tlb_flags.  This causes the L2 cache flush in clean_pmd_entry()
(intended for Feroceon only) to execute on ARMv7, and the CPU hangs.

This patch is required for OMAP3 boards to boot.

Signed-off-by: Paul Walmsley <paul@pwsan.com>
Acked-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-08-12 19:54:08 +01:00
Russell King 4baa992243 [ARM] move include/asm-arm to arch/arm/include/asm
Move platform independent header files to arch/arm/include/asm, leaving
those in asm/arch* and asm/plat* alone.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-08-02 21:32:35 +01:00