When v6 and >=v7 boards are supported in the same kernel, the
__und_usr code currently makes a build-time assumption that Thumb-2
instructions occurring in userspace don't need to be supported.
Strictly speaking this is incorrect.
This patch fixes the above case by doing a run-time check on the
CPU architecture in these cases. This only affects kernels which
support v6 and >=v7 CPUs together: plain v6 and plain v7 kernels
are unaffected.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Reviewed-by: Jon Medhurst <tixy@yxit.co.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When testing whether a Thumb-2 instruction is 32 bits long or not,
the masking done in order to test bits 11-15 of the first
instruction halfword won't affect the result of the comparison, so
remove it.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Reviewed-by: Jon Medhurst <tixy@yxit.co.uk>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The SVC IRQ, prefetch and data abort handlers preserve the SPSR value
via r5 across the exception. Rather than re-loading it from pt_regs,
use the preserved value instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tail-call the main C data abort handler code from the per-CPU helper
code. Update the comments in the code wrt the new calling and return
register state.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tail-call the main C prefetch abort handler code from the per-CPU
helper code. Also note that the helper function becomes ABI
compliant in terms of the registers preserved.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This avoids the irq entry assembly corrupting r5, thereby allowing it
to be preserved through to the svc exit code.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
All handlers now call trace_hardirqs_off, so move this common code into
the (svc|usr)_entry assembler macros.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As we no longer re-enable interrupts in these exception handlers, add
the irqsoff tracing calls to them so that the kernel tracks the state
more accurately.
Note that these calls are conditional on IRQSOFF_TRACER:
kernel ----------> user ---------> kernel
^ irqs enabled ^ irqs disabled
No kernel code can run on the local CPU until we've re-entered the
kernel through one of the exception handlers - and userspace can not
take any locks etc. So, the kernel doesn't care about the IRQ mask
state while userspace is running unless we're doing IRQ off latency
tracing. So, we can (and do) avoid the overhead of updating the IRQ
mask state on every kernel->user and user->kernel transition.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add irqtrace function calls to the undefined exception handler, so
that we get sane lockdep traces from locking problems in undefined
exception handlers.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Avoid enabling interrupts if the parent context had interrupts enabled
in the abort handler assembly code, and move this into the breakpoint/
page/alignment fault handlers instead.
This gets rid of some special-casing for the breakpoint fault handlers
from the low level abort handler path.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This avoids unnecessary instructions for CPUs which implement the IFAR
(instruction fault address register).
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This allows us to avoid moving registers twice to work around the
clobbered registers when we add calls to trace_hardirqs_{on,off}.
Ensure that all SVC handlers return with SPSR in r5 for consistency.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There's no point checking to see whether IRQs were masked in the parent
context when returning from IRQ handling - the fact that we're handling
an IRQ means that the parent context must have had IRQs unmasked.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
irq_enter() and irq_exit() already take care of the preempt_count
handling for interrupts, which increment and decrement the hardirq
bits of the preempt count. So we can remove the preempt count handing
in our IRQ entry/exit assembly, like x86 did some 9 years ago.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Replace r4 with ip for calling abort helpers - ip is allowed to be
corrupted by called functions in the ABI, so it makes more sense to
use such a register.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Some user space applications are designed around the ability to perform
atomic operations on 64 bit values. Since this is natively possible
only with ARMv6k and above, let's provide a new kuser helper to perform
the operation with kernel supervision on pre ARMv6k hardware.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Dave Martin <dave.martin@linaro.org>
Digging into some assembly file in order to get information about the
kuser helpers is not that convivial. Let's move that information to
a better formatted file in Documentation/arm/ and improve on it a bit.
Thanks to Dave Martin <dave.martin@linaro.org> for the initial cleanup and
clarifications.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Acked-by: Dave Martin <dave.martin@linaro.org>
This patch fixes the lockdep warning of "unannotated irqs-off"[1].
After entering __irq_usr, arm core will disable interrupt automatically,
but __irq_usr does not annotate the irq disable, so lockdep may complain
the warning if it has chance to check this in irq handler.
This patch adds trace_hardirqs_off in __irq_usr before entering irq_handler
to handle the irq, also calls ret_to_user_from_irq to avoid calling
disable_irq again.
This is also a fix for irq off tracer.
[1], lockdep warning log of "unannotated irqs-off"
[ 13.804687] ------------[ cut here ]------------
[ 13.809570] WARNING: at kernel/lockdep.c:3335 check_flags+0x78/0x1d0()
[ 13.816467] Modules linked in:
[ 13.819732] Backtrace:
[ 13.822357] [<c01cb42c>] (dump_backtrace+0x0/0x100) from [<c06abb14>] (dump_stack+0x20/0x24)
[ 13.831268] r6:c07d8c2c r5:00000d07 r4:00000000 r3:00000000
[ 13.837280] [<c06abaf4>] (dump_stack+0x0/0x24) from [<c01ffc04>] (warn_slowpath_common+0x5c/0x74)
[ 13.846649] [<c01ffba8>] (warn_slowpath_common+0x0/0x74) from [<c01ffc48>] (warn_slowpath_null+0x2c/0x34)
[ 13.856781] r8:00000000 r7:00000000 r6:c18b8194 r5:60000093 r4:ef182000
[ 13.863708] r3:00000009
[ 13.866485] [<c01ffc1c>] (warn_slowpath_null+0x0/0x34) from [<c0237d84>] (check_flags+0x78/0x1d0)
[ 13.875823] [<c0237d0c>] (check_flags+0x0/0x1d0) from [<c023afc8>] (lock_acquire+0x4c/0x150)
[ 13.884704] [<c023af7c>] (lock_acquire+0x0/0x150) from [<c06af638>] (_raw_spin_lock+0x4c/0x84)
[ 13.893798] [<c06af5ec>] (_raw_spin_lock+0x0/0x84) from [<c01f9a44>] (sched_ttwu_pending+0x58/0x8c)
[ 13.903320] r6:ef92d040 r5:00000003 r4:c18b8180
[ 13.908233] [<c01f99ec>] (sched_ttwu_pending+0x0/0x8c) from [<c01f9a90>] (scheduler_ipi+0x18/0x1c)
[ 13.917663] r6:ef183fb0 r5:00000003 r4:00000000 r3:00000001
[ 13.923645] [<c01f9a78>] (scheduler_ipi+0x0/0x1c) from [<c01bc458>] (do_IPI+0x9c/0xfc)
[ 13.932006] [<c01bc3bc>] (do_IPI+0x0/0xfc) from [<c06b0888>] (__irq_usr+0x48/0xe0)
[ 13.939971] Exception stack(0xef183fb0 to 0xef183ff8)
[ 13.945281] 3fa0: ffffffc3 0001500c 00000001 0001500c
[ 13.953948] 3fc0: 00000050 400b45f0 400d9000 00000000 00000001 400d9600 6474e552 bea05b3c
[ 13.962585] 3fe0: 400d96c0 bea059c0 400b6574 400b65d8 20000010 ffffffff
[ 13.969573] r6:00000403 r5:fa240100 r4:ffffffff r3:20000010
[ 13.975585] ---[ end trace efc4896ab0fb62cb ]---
[ 13.980468] possible reason: unannotated irqs-off.
[ 13.985534] irq event stamp: 1610
[ 13.989044] hardirqs last enabled at (1610): [<c01c703c>] no_work_pending+0x8/0x2c
[ 13.997131] hardirqs last disabled at (1609): [<c01c7024>] ret_slow_syscall+0xc/0x1c
[ 14.005371] softirqs last enabled at (0): [<c01fe5e4>] copy_process+0x2cc/0xa24
[ 14.013183] softirqs last disabled at (0): [< (null)>] (null)
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This allows the cache/processor/fault glue to be more easily used
from assembler code. Tested on Assabet and Tegra 2.
Tested-by: Colin Cross <ccross@android.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Per subarch interrupt handler macros V3.
This patch breaks out code from the irq_handler macro
into arch_irq_handler and arch_irq_handler_default.
The macros are put in the header file "entry-macro-multi.S"
The arch_irq_handler_default macro is designed to be
used by irq_handler in entry-armv.S while arch_irq_handler
is suitable for per-subarch use.
Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Normally different ARM platform has different way to decode the IRQ
hardware status and demultiplex to the corresponding IRQ handler.
This is highly optimized by macro irq_handler in entry-armv.S, and
each machine defines their own macro to decode the IRQ number.
However, this prevents multiple machine classes to be built into a
single kernel.
By allowing each machine to specify thier own handler, and making
function pointer 'handle_arch_irq' to point to it at run time, this
can be solved. And introduce CONFIG_MULTI_IRQ_HANDLER to allow both
solutions to work.
Comparing with the highly optimized macro of irq_handler, the new
function must be written with care not to lose too much performance.
And the IPI stuff on SMP is expected to move to the provided arch
IRQ handler as well.
The assembly code to invoke handle_arch_irq is optimized by Russell
King.
Signed-off-by: Eric Miao <eric.miao@canonical.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* __fixup_smp_on_up has been modified with support for the
THUMB2_KERNEL case. For THUMB2_KERNEL only, fixups are split
into halfwords in case of misalignment, since we can't rely on
unaligned accesses working before turning the MMU on.
No attempt is made to optimise the aligned case, since the
number of fixups is typically small, and it seems best to keep
the code as simple as possible.
* Add a rotate in the fixup_smp code in order to support
CPU_BIG_ENDIAN, as suggested by Nicolas Pitre.
* Add an assembly-time sanity-check to ALT_UP() to ensure that
the content really is the right size (4 bytes).
(No check is done for ALT_SMP(). Possibly, this could be fixed
by splitting the two uses ot ALT_SMP() (ALT_SMP...SMP_UP versus
ALT_SMP...SMP_UP_B) into two macros. In the first case,
ALT_SMP needs to expand to >= 4 bytes, not == 4.)
* smp_mpidr.h (which implements ALT_SMP()/ALT_UP() manually due
to macro limitations) has not been modified: the affected
instruction (mov) has no 16-bit encoding, so the correct
instruction size is satisfied in this case.
* A "mode" parameter has been added to smp_dmb:
smp_dmb arm @ assumes 4-byte instructions (for ARM code, e.g. kuser)
smp_dmb @ uses W() to ensure 4-byte instructions for ALT_SMP()
This avoids assembly failures due to use of W() inside smp_dmb,
when assembling pure-ARM code in the vectors page.
There might be a better way to achieve this.
* Kconfig: make SMP_ON_UP depend on
(!THUMB2_KERNEL || !BIG_ENDIAN) i.e., THUMB2_KERNEL is now
supported, but only if !BIG_ENDIAN (The fixup code for Thumb-2
currently assumes little-endian order.)
Tested using a single generic realview kernel on:
ARM RealView PB-A8 (CONFIG_THUMB2_KERNEL={n,y})
ARM RealView PBX-A9 (SMP)
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On ARM, debug exceptions occur in the form of data or prefetch aborts.
One difference is that debug exceptions require access to per-cpu banked
registers and data structures which are not saved in the low-level exception
code. For kernels built with CONFIG_PREEMPT, there is an unlikely scenario
that the debug handler ends up running on a different CPU from the one
that originally signalled the event, resulting in random data being read
from the wrong registers.
This patch adds a debug_entry macro to the low-level exception handling
code which checks whether the taken exception is a debug exception. If
it is, the preempt count for the faulting process is incremented. After
the debug handler has finished, the count is decremented.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The existing code invokes the syscall with rubbish in r7,
due to what looks like an incorrect literal load idiom.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This allows us to use smp_cross_call() to trigger a number of different
software generated interrupts, rather than combining them all on one
SGI. Recover the SGI number via do_IPI.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch removes the domain switching functionality via the set_fs and
__switch_to functions on cores that have a TLS register.
Currently, the ioremap and vmalloc areas share the same level 1 page
tables and therefore have the same domain (DOMAIN_KERNEL). When the
kernel domain is modified from Client to Manager (via the __set_fs or in
the __switch_to function), the XN (eXecute Never) bit is overridden and
newer CPUs can speculatively prefetch the ioremap'ed memory.
Linux performs the kernel domain switching to allow user-specific
functions (copy_to/from_user, get/put_user etc.) to access kernel
memory. In order for these functions to work with the kernel domain set
to Client, the patch modifies the LDRT/STRT and related instructions to
the LDR/STR ones.
The user pages access rights are also modified for kernel read-only
access rather than read/write so that the copy-on-write mechanism still
works. CPU_USE_DOMAINS gets disabled only if the hardware has a TLS register
(CPU_32v6K is defined) since writing the TLS value to the high vectors page
isn't possible.
The user addresses passed to the kernel are checked by the access_ok()
function so that they do not point to the kernel space.
Tested-by: Anton Vorontsov <cbouatmailru@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
UP systems do not implement all the instructions that SMP systems have,
so in order to boot a SMP kernel on a UP system, we need to rewrite
parts of the kernel.
Do this using an 'alternatives' scheme, where the kernel code and data
is modified prior to initialization to replace the SMP instructions,
thereby rendering the problematical code ineffectual. We use the linker
to generate a list of 32-bit word locations and their replacement values,
and run through these replacements when we detect a UP system.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
CPU: Testing write buffer coherency: ok
------------[ cut here ]------------
WARNING: at kernel/lockdep.c:3145 check_flags+0xcc/0x1dc()
Modules linked in:
[<c0035120>] (unwind_backtrace+0x0/0xf8) from [<c0355374>] (dump_stack+0x20/0x24)
[<c0355374>] (dump_stack+0x20/0x24) from [<c0060c04>] (warn_slowpath_common+0x58/0x70)
[<c0060c04>] (warn_slowpath_common+0x58/0x70) from [<c0060c3c>] (warn_slowpath_null+0x20/0x24)
[<c0060c3c>] (warn_slowpath_null+0x20/0x24) from [<c008f224>] (check_flags+0xcc/0x1dc)
[<c008f224>] (check_flags+0xcc/0x1dc) from [<c00945dc>] (lock_acquire+0x50/0x140)
[<c00945dc>] (lock_acquire+0x50/0x140) from [<c0358434>] (_raw_spin_lock+0x50/0x88)
[<c0358434>] (_raw_spin_lock+0x50/0x88) from [<c00fd114>] (set_task_comm+0x2c/0x60)
[<c00fd114>] (set_task_comm+0x2c/0x60) from [<c007e184>] (kthreadd+0x30/0x108)
[<c007e184>] (kthreadd+0x30/0x108) from [<c0030104>] (kernel_thread_exit+0x0/0x8)
---[ end trace 1b75b31a2719ed1c ]---
possible reason: unannotated irqs-on.
irq event stamp: 3
hardirqs last enabled at (2): [<c0059bb0>] finish_task_switch+0x48/0xb0
hardirqs last disabled at (3): [<c002f0b0>] ret_slow_syscall+0xc/0x1c
softirqs last enabled at (0): [<c005f3e0>] copy_process+0x394/0xe5c
softirqs last disabled at (0): [<(null)>] (null)
Fix this by ensuring that the lockdep interrupt state is manipulated in
the appropriate places. We essentially treat userspace as an entirely
separate environment which isn't relevant to lockdep (lockdep doesn't
monitor userspace.) We don't tell lockdep that IRQs will be enabled
in that environment.
Instead, when creating kernel threads (which is a rare event compared
to entering/leaving userspace) we have to update the lockdep state. Do
this by starting threads with IRQs disabled, and in the kthread helper,
tell lockdep that IRQs are enabled, and enable them.
This provides lockdep with a consistent view of the current IRQ state
in kernel space.
This also revert portions of 0d928b0b61
which didn't fix the problem.
Tested-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The TLS register is only available on ARM1136 r1p0 and later.
Set HWCAP_TLS flags if hardware TLS is available and test for
it if CONFIG_CPU_32v6K is not set for V6.
Note that we set the TLS instruction in __kuser_get_tls
dynamically as suggested by Jamie Lokier <jamie@shareable.org>.
Also the __switch_to code is optimized out in most cases as
suggested by Nicolas Pitre <nico@fluxnic.net>.
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
A new random value for the canary is stored in the task struct whenever
a new task is forked. This is meant to allow for different canary values
per task. On ARM, GCC expects the canary value to be found in a global
variable called __stack_chk_guard. So this variable has to be updated
with the value stored in the task struct whenever a task switch occurs.
Because the variable GCC expects is global, this cannot work on SMP
unfortunately. So, on SMP, the same initial canary value is kept
throughout, making this feature a bit less effective although it is still
useful.
One way to overcome this GCC limitation would be to locate the
__stack_chk_guard variable into a memory page of its own for each CPU,
and then use TLB locking to have each CPU see its own page at the same
virtual address for each of them.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
With CONFIG_KPROBES enabled two section are getting created which
leads to below build break.
LOG:
AS arch/arm/kernel/entry-armv.o
arch/arm/kernel/entry-armv.S: Assembler messages:
arch/arm/kernel/entry-armv.S:431: Error: symbol ret_from_exception is in a different section
arch/arm/kernel/entry-armv.S:490: Error: symbol ret_from_exception is in a different section
arch/arm/kernel/entry-armv.S:491: Error: symbol __und_usr_unknown is in a different section
This was introduced by commit 4260415f6a
Reported-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
/tmp/ccJ3ssZW.s: Assembler messages:
/tmp/ccJ3ssZW.s:1952: Error: can't resolve `.text' {.text section} - `.LFB1077'
This is caused because:
.section .data
.section .text
.section .text
.previous
does not return us to the .text section, but the .data section; this
makes use of .previous dangerous if the ordering of previous sections
is not known.
Fix up the other users of .previous; .pushsection and .popsection are
a safer pairing to use than .section and .previous.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The __kuser_cmpxchg code uses an ARMv6 dmb instruction, rather than
one based upon the architecture being built for. Switch to using
the macro provided for this purpose, which also eliminates the
need for an ifdef.
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Use a definition for the cmpxchg SWI instead of hard-coding the number.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
The 32-bit wide variant of "mov pc, reg" in Thumb-2 is unpredictable
causing improper handling of the undefined instructions not caught by
the kernel. This patch adds a movw_pc macro for such situations
(currently only used in call_fpe).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Instruction fault status register, IFSR, was introduced on ARMv6 to
provide status information about the last insturction fault. It
needed for proper prefetch abort handling.
Now we have three prefetch abort model:
* legacy - for CPUs before ARMv6. They doesn't provide neither
IFSR nor IFAR. We simulate IFSR with section translation fault
status for them to generalize code;
* ARMv6 - provides IFSR, but not IFAR;
* ARMv7 - provides both IFSR and IFAR.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
VFP instructions in the kernel may trigger undefined exceptions if VFP
hardware is not present. This patch corrects the loading of such Thumb-2
instructions. It also marks the "no_fp" label as a function so that the
linker generate a Thumb address.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The patch adds a CLREX or dummy STREX to the exception return path. This
is needed because several atomic/locking operations use a pair of
LDREX/STREXEQ and the EQ condition may not always be satisfied. This
would leave the exclusive monitor status set and may cause problems with
atomic/locking operations in the interrupted code.
With this patch, the atomic_set() operation can be a simple STR
instruction (on SMP systems, the global exclusive monitor is cleared by
STR anyway). Clearing the exclusive monitor during context switch is no
longer needed as this is handled by the exception return path anyway.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Jamie Lokier <jamie@shareable.org>
Before this patch enabling and disabling irqs in assembler code and by
the hardware wasn't tracked completly.
I had to transpose two instructions in arch/arm/lib/bitops.h because
restore_irqs doesn't preserve the flags with CONFIG_TRACE_IRQFLAGS=y
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Since the Thumb-2 instructions can be 16-bit wide, data in the .text
sections may not be aligned to a 32-bit word and this leads to unaligned
exceptions. This patch does not affect the ARM code generation.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Starting with ARMv6, the CPUs support the BE-8 variant of big-endian
(byte-invariant). This patch adds the core support:
- setting of the BE-8 mode via the CPSR.E register for both kernel and
user threads
- big-endian page table walking
- REV used to rotate instructions read from memory during fault
processing as they are still little-endian format
- Kconfig and Makefile support for BE-8. The --be8 option must be passed
to the final linking stage to convert the instructions to
little-endian
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Mathieu Desnoyers pointed out that the ARM barriers were lacking:
- cmpxchg, xchg and atomic add return need memory barriers on
architectures which can reorder the relative order in which memory
read/writes can be seen between CPUs, which seems to include recent
ARM architectures. Those barriers are currently missing on ARM.
- test_and_xxx_bit were missing SMP barriers.
So put these barriers in. Provide separate atomic_add/atomic_sub
operations which do not require barriers.
Reported-Reviewed-and-Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This is needed to allow or stop the unwinding at certain points in the
kernel like exception entries.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Aaro says:
> With spinlock debugs enabled I get might_sleep() warnings when using
> ptrace.
tracked down to a missing enable_irq before calling do_undefinstr().
Reported-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Tested-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This declaration specifies the "function" type and size for various
assembly functions, mainly needed for generating the correct branch
instructions in Thumb-2.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch implements Thumb-2 application support in Linux. Original
implementation by Paul Brook with fixes for VFP and Neon by Catalin
Marinas.
Signed-off-by: Paul Brook <paul@codesourcery.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch adds a prefetch abort handler similar to the data abort one
and renames the latter for consistency. Initial implementation by Paul
Brook with some renaming by Catalin Marinas.
Signed-off-by: Paul Brook <paul@codesourcery.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
If kprobes installs a breakpoint on a "stmdb sp!, {...}" instruction,
and then single-step it by simulation from the exception context, it will
corrupt the saved regs on the stack from the previous context.
To avoid this, let's add an optional parameter to the svc_entry macro
allowing for a hole to be created on the stack before saving the
interrupted context, and use it in the undef_svc handler when kprobes
is enabled.
Signed-off-by: Nicolas Pitre <nico@marvell.com>
This patch enables the use of the Advanced SIMD (NEON) extension on
ARMv7. The NEON technology is a 64/128-bit hybrid SIMD architecture
for accelerating the performance of multimedia and signal processing
applications. The extension shares the registers with the VFP unit and
enabling/disabling and saving/restoring follow the same rules. In
addition, there are instructions that do not have the appropriate CP
number encoded, the checks being made in the call_fpe function.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The ARM __kuser_cmpxchg routine is meant to implement an atomic cmpxchg
in user space. It however can produce spurious false negative if a
processor exception occurs in the middle of the operation. Normally
this is not a problem since cmpxchg is typically called in a loop until
it succeeds to implement an atomic increment for example.
Some use cases which don't involve a loop require that the operation be
100% reliable though. This patch changes the implementation so to
reattempt the operation after an exception has occurred in the critical
section rather than abort it.
Here's a simple program to test the fix (don't use CONFIG_NO_HZ in your
kernel as this depends on a sufficiently high interrupt rate):
#include <stdio.h>
typedef int (__kernel_cmpxchg_t)(int oldval, int newval, int *ptr);
#define __kernel_cmpxchg (*(__kernel_cmpxchg_t *)0xffff0fc0)
int main()
{
int i, x = 0;
for (i = 0; i < 100000000; i++) {
int v = x;
if (__kernel_cmpxchg(v, v+1, &x))
printf("failed at %d: %d vs %d\n", i, v, x);
}
printf("done with %d vs %d\n", i, x);
return 0;
}
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
get_irqnr_preamble allows machines to take some action before entering the
get_irqnr_and_base loop. On iop we enable cp6 access.
arch_ret_to_user is added to the userspace return path to allow individual
architectures to take actions, like disabling coprocessor access, before
the final return to userspace.
Per Nicolas Pitre's note, there is no need to cp_wait on the return to user
as the latency to return is sufficient.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
bad_mode() currently prints the mode which caused the exception, and
then causes an oops dump to be printed which again displays this
information (since the CPSR in the struct pt_regs is correct.) This
leads to processor_modes[] being shared between traps.c and process.c
with a local declaration of it.
We can clean this up by moving processor_modes[] to process.c and
removing the duplication, resulting in processor_modes[] becoming
static.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
If the kernel attempts to execute a CP1 or CP2 instruction and it
aborts, and a FP emulator is not loaded, we try to return as if to
a user context, instead of the proper kernel context. Since the
fault came from kernel mode, we must use the kernel return paths.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
XScale cores either have a DSP coprocessor (which contains a single
40 bit accumulator register), or an iWMMXt coprocessor (which contains
eight 64 bit registers.)
Because of the small amount of state in the DSP coprocessor, access to
the DSP coprocessor (CP0) is always enabled, and DSP context switching
is done unconditionally on every task switch. Access to the iWMMXt
coprocessor (CP0/CP1) is enabled only when an iWMMXt instruction is
first issued, and iWMMXt context switching is done lazily.
CONFIG_IWMMXT is supposed to mean 'the cpu we will be running on will
have iWMMXt support', but boards are supposed to select this config
symbol by hand, and at least one pxa27x board doesn't get this right,
so on that board, proc-xscale.S will incorrectly assume that we have a
DSP coprocessor, enable CP0 on boot, and we will then only save the
first iWMMXt register (wR0) on context switches, which is Bad.
This patch redefines CONFIG_IWMMXT as 'the cpu we will be running on
might have iWMMXt support, and we will enable iWMMXt context switching
if it does.' This means that with this patch, running a CONFIG_IWMMXT=n
kernel on an iWMMXt-capable CPU will no longer potentially corrupt iWMMXt
state over context switches, and running a CONFIG_IWMMXT=y kernel on a
non-iWMMXt capable CPU will still do DSP context save/restore.
These changes should make iWMMXt work on PXA3xx, and as a side effect,
enable proper acc0 save/restore on non-iWMMXt capable xsc3 cores such
as IOP13xx and IXP23xx (which will not have CONFIG_CPU_XSCALE defined),
as well as setting and using HWCAP_IWMMXT properly.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
The userspace helpers in clean/arch/arm/kernel/entry-armv.S are called
directly in/from userspace. They need to cope with being called from
Thumb code.
Patch below uses the bx interworking instruction when
CONFIG_ARM_THUMB=y.
Based on an earlier patch from Paul Brook <paul@codesourcery.com>
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (44 commits)
[ARM] 3541/2: workaround for PXA27x erratum E7
[ARM] nommu: provide a way for correct control register value selection
[ARM] 3705/1: add supersection support to ioremap()
[ARM] 3707/1: iwmmxt: use the generic thread notifier infrastructure
[ARM] 3706/2: ep93xx: add cirrus logic edb9315a support
[ARM] 3704/1: format IOP Kconfig with tabs, create more consistency
[ARM] 3703/1: Add help description for ARCH_EP80219
[ARM] 3678/1: MMC: Make OMAP MMC work
[ARM] 3677/1: OMAP: Update H2 defconfig
[ARM] 3676/1: ARM: OMAP: Fix dmtimers and timer32k to compile on OMAP1
[ARM] Add section support to ioremap
[ARM] Fix sa11x0 SDRAM selection
[ARM] Set bit 4 on section mappings correctly depending on CPU
[ARM] 3666/1: TRIZEPS4 [1/5] core
ARM: OMAP: Multiplexing for 24xx GPMC wait pin monitoring
ARM: OMAP: Fix SRAM to use MT_MEMORY instead of MT_DEVICE
ARM: OMAP: Update dmtimers
ARM: OMAP: Make clock variables static
ARM: OMAP: Fix GPMC compilation when DEBUG is defined
ARM: OMAP: Mux updates for external DMA and GPIO
...
Patch from Lennert Buytenhek
This patch makes the iWMMXt context switch hook use the generic
thread notifier infrastructure that was recently merged in commit
d6551e884c.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
Add the necessary kernel bits for crunch task switching.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Some machine classes need to allow VFP support to be built into the
kernel, but still allow the kernel to run even though VFP isn't
present. Unfortunately, the kernel hard-codes VFP instructions
into the thread switch, which prevents this being run-time selectable.
Solve this by introducing a notifier which things such as VFP can
hook into to be informed of events which affect the VFP subsystem
(eg, creation and destruction of threads, switches between threads.)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Paul Brook
The example code in the source documentation for __kernel_dmb
clobbers r0 but doesn't list it the asm clobber list.
Signed-off-by: Paul Brook <paul@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Allow the individual coprocessor handlers to decide when to enable
interrupts, rather than unconditionally enabling them.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
negative
Patch from Nicolas Pitre
The pre ARMv5 implementation can be aborted if an exception occurs in
the middle of it. Because of that, the ARMv6 implementation doesn't
re-attempt the operation on a failed strex either. Let's make this
transient nature of such a false positive more explicit in the
definition.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
The cmpxchg emulation on pre-ARMv5 relies on user code executed from a
kernel address. If the operation cannot complete atomically, it is
aborted from the usr_entry macro by clearing the Z flag. This clearing
of the Z flag is done whenever the user pc is above TASK_SIZE.
However this "pc >= TASK_SIZE" test cannot work in the non MMU case.
Worse: the current code will corrupt the Z flag on every entry to the
kernel.
Let's disable it in the non MMU case for now. Using NPTL on non MMU
targets needs to be worked out anyway.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
This is kernel provided user space code.
Since a syscall is used, it has to be updated to work with EABI.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
The ARM EABI says that the stack pointer has to be 64-bit aligned for
reasons already mentioned in patch #3101 when calling C functions.
We therefore must verify and adjust sp accordingly when taking an
exception from kernel mode since sp might not necessarily be 64-bit
aligned if the exception occurs in the middle of a kernel function.
If the exception occurs while in user mode then no sp fixup is needed as
long as sizeof(struct pt_regs) as well as any additional syscall data
stack space remain multiples of 8.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds register switch support in nommu mode.
Signed-off-by: Hyok S. Choi <hyok.choi@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
arch/arm/kernel/entry-armv.S has contained a comment suggesting
that asm/hardware.h and asm/arch/irqs.h should be moved into the
asm/arch/entry-macro.S include. So move the includes to these
two files as required.
Add missing includes (asm/hardware.h, asm/io.h) to asm/arch/system.h
includes which use those facilities, and remove asm/io.h from
kernel/process.c.
Remove other unnecessary includes from arch/arm/kernel, arch/arm/mm
and arch/arm/mach-footbridge.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
Strictly speaking, the NPTL kernel helpers are required for pre ARMv6
only. They are available on ARMv6+ as well for obvious compatibility
reasons. However there are cases where extra memory barriers are needed
when using an SMP ARMv6 machine but not on pre-ARMv6.
This patch adds a memory barrier kernel helper that glibc can use as
needed for pre-ARMv6 binaries to be forward compatible with an SMP
kernel on ARMv6, as well as the necessary dmb instructions to the
cmpxchg helper.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Acked-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add infrastructure for supporting per-cpu local timers to update
the profiling information and update system time accounting.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
Since we know the value of cpsr on entry, we can replace the bic+orr with
a single eor. Also remove a possible result delay (at least on XScale).
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
This patch allows for assorted type of cleanups by letting assembly code
use the same set of defines for constant values and avoid duplicated
definitions that might not always be in sync, or that might simply be
confusing due to the different names for the same thing.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We accidentally corrupted the TLS value when clearing out the ARMv6
exclusive monitor. Avoid doing so.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
Not that there might be many of them on the planet, but at least RMK
apparently has one.
Signed-off-by: Nicolas Pitre
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The current vector entry system does not allow for SMP. In
order to work around this, we need to eliminate our reliance
on the fixed save areas, which breaks the way we enable
alignment traps. This patch changes the way we handle the
save areas such that we can have one per CPU.
Signed-off-by: Russell King <rmk@arm.linux.org.uk>