Commit Graph

Dave Martin abf73988a7 arm64: signal: Verify extra data is user-readable in sys_rt_sigreturn
Currently sys_rt_sigreturn() verifies that the base sigframe is
readable, but no similar check is performed on the extra data to
which an extra_context record points.

This matters because the extra data will be read with the
unprotected user accessors.  However, this is not a problem at
present because the extra data base address is required to be
exactly at the end of the base sigframe.  So, there would need to
be a non-user-readable kernel address within about 59K
(SIGFRAME_MAXSZ - sizeof(struct rt_sigframe)) of some address for
which access_ok(VERIFY_READ) returns true, in order for sigreturn
to be able to read kernel memory that should be inaccessible to the
user task.  This is currently impossible due to the untranslatable
address hole between the TTBR0 and TTBR1 address ranges.

Disappearance of the hole between the TTBR0 and TTBR1 mapping
ranges would require the VA size for TTBR0 and TTBR1 to grow to at
least 55 bits, and either the disabling of tagged pointers for
userspace or enabling of tagged pointers for kernel space; none of
which is currently envisaged.

Even so, it is wrong to use the unprotected user accessors without
an accompanying access_ok() check.

To avoid the potential for future surprises, this patch does an
explicit access_ok() check on the extra data space when parsing an
extra_context record.
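
A minimal sketch of the added check, assuming the access_ok() interface
of this kernel generation; 'extra' and 'extra_size' stand in for the
pointer and size parsed from the extra_context record:

  if (!access_ok(VERIFY_READ, (const void __user *)extra, extra_size))
          return -EFAULT;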

Fixes: 33f082614c ("arm64: signal: Allow expansion of the signal frame")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-03 15:24:11 +00:00
Dave Martin 94ef7ecbdf arm64: fpsimd: Correctly annotate exception helpers called from asm
A couple of FPSIMD exception handling functions that are called
from entry.S are currently not annotated as such.

This is not a big deal since asmlinkage does nothing on arm/arm64,
but fixing the annotations is more consistent and may help avoid
future surprises.

This patch adds appropriate asmlinkage annotations for
do_fpsimd_acc() and do_fpsimd_exc().
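
For reference, the annotated declarations look roughly like this (the
signatures follow the description above; treat this as a sketch):

  asmlinkage void do_fpsimd_acc(unsigned int esr, struct pt_regs *regs);
  asmlinkage void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs);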

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-03 15:24:11 +00:00
Dave Martin 27e64b4be4 regset: Add support for dynamically sized regsets
Currently the regset API doesn't allow for the possibility that
regsets (or at least, the amount of meaningful data in a regset)
may change in size.

In particular, this results in useless padding being added to
coredumps if a regset's current size is smaller than its
theoretical maximum size.

This patch adds a get_size() function to struct user_regset.
Individual regset implementations can implement this function to
return the current size of the regset data.  A regset_size()
function is added to provide callers with an abstract interface for
determining the size of a regset without needing to know whether
the regset is dynamically sized or not.

The only affected user of this interface is the ELF coredump code:
This patch ports ELF coredump to dump regsets with their actual
size in the coredump.  This has no effect except for new regsets
that are dynamically sized and provide a get_size() implementation.
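
A hedged sketch of the interface described above (field placement and
the fallback to the static n * size are assumed from this description):

  struct user_regset {
          /* ... existing fields and callbacks ... */
          unsigned int (*get_size)(struct task_struct *target,
                                   const struct user_regset *regset);
  };

  static inline unsigned int regset_size(struct task_struct *target,
                                         const struct user_regset *regset)
  {
          if (!regset->get_size)
                  return regset->n * regset->size;
          return regset->get_size(target, regset);
  }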

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: H. J. Lu <hjl.tools@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-03 15:24:11 +00:00
Suzuki K Poulose c7f5828bf7 arm-ccn: perf: Prevent module unload while PMU is in use
When the PMU driver is built as a module, perf expects pmu->module
to be valid, so that the driver is prevented from being unloaded
while it is in use. Fix the CCN PMU driver to fill in this field.
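
The fix amounts to a one-line field initialisation, roughly as below
(the CCN driver's struct layout is assumed here):

  ccn->dt.pmu = (struct pmu) {
          .module = THIS_MODULE,
          /* ... remaining fields unchanged ... */
  };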

Fixes: a33b0daab7 ("bus: ARM CCN PMU driver")
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-03 15:24:06 +00:00
Suzuki K Poulose 19b4aff202 perf: arm_spe: Prevent module unload while the PMU is in use
When the PMU driver is built as a module, perf expects pmu->module
to be valid, so that the driver is prevented from being unloaded
while it is in use. Fix the SPE PMU driver to fill in this field.

Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-03 15:23:55 +00:00
Julien Thierry d125bffcef arm64: Fix static use of function graph
Function graph does not work currently when CONFIG_DYNAMIC_FTRACE is not
set. This is because ftrace_trace_function is not always set to ftrace_stub
when function_graph is in use.

Do not skip checking of graph tracer functions when ftrace_trace_function
is set.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-03 12:05:23 +00:00
Xie XiuQi a92d4d1454 arm64: entry.S: move SError handling into a C function for future expansion
Today SError is taken using the inv_entry macro that ends up in
bad_mode.

SError can be used by the RAS Extensions to notify either the OS or
firmware of CPU problems, some of which may have been corrected.

To allow this handling to be added, add a do_serror() C function
that just panic()s. Add the entry.S boilerplate to save/restore the
CPU registers and unmask debug exceptions. Future patches may change
do_serror() to return if the SError Interrupt was notification of a
corrected error.
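
A sketch of the minimal handler described above (the exact message and
surrounding accounting are assumptions, not the final code):

  asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
  {
          nmi_enter();
          console_verbose();
          pr_crit("SError Interrupt on CPU%d, code 0x%08x\n",
                  smp_processor_id(), esr);
          panic("Asynchronous SError Interrupt");
  }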

Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Wang Xiongfeng <wangxiongfengi2@huawei.com>
[Split out of a bigger patch, added compat path, renamed, enabled debug
 exceptions]
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02 15:55:41 +00:00
James Morse b282e1ce29 arm64: entry.S: convert elX_irq
Following our 'dai' order, irqs should be processed with debug and
serror exceptions unmasked.

Add a helper to unmask these two (and FIQ for good measure).
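
In C terms the helper clears the D, A and F bits of PSTATE.DAIF while
leaving I masked; a hedged equivalent of the entry.S macro:

  /* daifclr immediate: bit 3 = Debug, bit 2 = SError (A), bit 0 = FIQ */
  asm volatile("msr daifclr, #0xd" ::: "memory");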

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02 15:55:41 +00:00
James Morse 746647c75a arm64: entry.S: convert el0_sync
el0_sync also unmasks exceptions on a case-by-case basis: debug
exceptions are enabled unless this was a debug exception, and IRQs are
unmasked for some exception types but not for others.

el0_dbg should run with everything masked to prevent us taking a debug
exception from do_debug_exception. For the other cases we can unmask
everything. This changes the behaviour of fpsimd_{acc,exc} and el0_inv
which previously ran with irqs masked.

This patch removes the last user of enable_dbg_and_irq; remove the macro too.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02 15:55:41 +00:00
James Morse b55a5a1b0a arm64: entry.S: convert el1_sync
el1_sync unmasks exceptions on a case-by-case basis: debug exceptions
are unmasked unless this was a debug exception, and IRQs are unmasked
for instruction and data aborts only if the interrupted context had
IRQs unmasked.

Following our 'dai' order, el1_dbg should run with everything masked.
For the other cases we can inherit whatever we interrupted.

Add a macro inherit_daif to set daif based on the interrupted pstate.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02 15:55:41 +00:00
James Morse 84d0fb1bb6 arm64: entry.S: Remove disable_dbg
enable_step_tsk is the only user of disable_dbg, which doesn't respect
our 'dai' order for exception masking. enable_step_tsk may enable
single-step, so it previously needed to mask debug exceptions to prevent
us from single-stepping kernel_exit. enable_step_tsk is called at the
end of the ret_to_user loop, which has already masked all exceptions,
so this is no longer needed.

Remove disable_dbg, add a comment that enable_step_tsk's caller should
have masked debug.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02 15:55:41 +00:00
James Morse 8d66772e86 arm64: Mask all exceptions during kernel_exit
To take RAS exceptions as quickly as possible we need to keep SError
unmasked as much as possible. We need to mask it during kernel_exit,
as taking an error from this code would overwrite the exception registers.

Adding a naked 'disable_daif' to kernel_exit causes a performance problem
for micro-benchmarks that do no real work (e.g. calling getpid() in a
loop). This is because the ret_to_user loop has already masked IRQs so
that the TIF_WORK_MASK thread flags can't change underneath it; adding
disable_daif would be an additional self-synchronising operation.

In the future, the RAS APEI code may need to modify the TIF_WORK_MASK
flags from an SError, in which case the ret_to_user loop must mask SError
while it examines the flags.

Disable all exceptions for return to EL1. For return to EL0 get the
ret_to_user loop to leave all exceptions masked once it has done its
work, this avoids an extra pstate-write.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02 15:55:41 +00:00
James Morse 41bd5b5d22 arm64: Move the async/fiq helpers to explicitly set process context flags
Remove the local_{async,fiq}_{en,dis}able macros as they don't respect
our newly defined order and are only used to set the flags for process
context when we bring CPUs online.

Add a helper to do this. The IRQ flag varies as we want it masked on
the boot CPU until we are ready to handle interrupts.
The boot CPU unmasks SError during early boot once it can print an error
message. If we can print an error message about SError, we can do the
same for FIQ. Debug exceptions are already enabled by __cpu_setup(),
which has also configured MDSCR_EL1 to disable MDE and KDE.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02 15:55:41 +00:00
James Morse 65be7a1b79 arm64: introduce an order for exceptions
Currently SError is always masked in the kernel. To support RAS exceptions
using SError on hardware with the v8.2 RAS Extensions we need to unmask
SError as much as possible.

Let's define an order for masking and unmasking exceptions. 'dai' is
memorable and effectively what we have today.

Disabling debug exceptions should cause all other exceptions to be masked.
Masking SError should mask irq, but not disable debug exceptions.
Masking irqs has no side effects for other flags. Keeping to this order
makes it easier for entry.S to know which exceptions should be unmasked.

FIQ is never expected, but we mask it when we mask debug exceptions, and
unmask it at all other times.

Given masking debug exceptions masks everything, we don't need macros
to save/restore that bit independently. Remove them and switch the last
caller over to use the daif calls.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02 15:55:41 +00:00
James Morse 0fbeb31875 arm64: explicitly mask all exceptions
There are a few places where we want to mask all exceptions. Today we
do this in a piecemeal fashion: typically we expect the caller to have
masked IRQs and the arch code masks debug exceptions, ignoring SError,
which is probably masked.

Make it clear that 'mask all exceptions' is the intention by adding
helpers to do exactly that.

This will let us unmask SError without having to add 'oh and SError'
to these paths.
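
A hedged sketch of such a helper, following the PSTATE.DAIF layout (the
actual helper names and header placement may differ):

  static inline void local_daif_mask(void)
  {
          /* mask Debug, SError, IRQ and FIQ in one go */
          asm volatile("msr daifset, #0xf" ::: "memory");
  }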

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02 15:55:40 +00:00
Yisheng Xie c10f0d06ad arm64: suspend: remove useless included file
After commit 9e8e865bbe ("arm64: unify idmap removal"), we no longer
need to flush the TLB in suspend.c, so the include of tlbflush.h can be
removed.

Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02 14:37:01 +00:00
Will Deacon 80b6eb04b5 arm64: Don't walk page table for user faults in do_mem_abort
Commit 42dbf54e88 ("arm64: consistently log ESR and page table")
dumps page table entries for user faults hitting do_bad entries in the
fault handler table. Whilst this shouldn't really happen in practice,
it's not beyond the realms of possibility if e.g. running an old kernel
on a new CPU.

Generally, we want to avoid exposing physical addresses under the control
of userspace (see commit bf396c09c2 ("arm64: mm: don't print out page
table entries on EL0 faults")), so walk the page tables only on exceptions
from EL1.
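
The resulting logic is roughly the following sketch (show_pte()'s
signature is assumed; the surrounding handler is elided):

  if (!user_mode(regs))
          show_pte(addr);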

Reported-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02 13:52:48 +00:00
Mark Rutland c80ed088a5 arm64: vdso: fix clock_getres for 4GiB-aligned res
The vdso tries to check for a NULL res pointer in __kernel_clock_getres,
but only checks the lower 32 bits, as it uses CBZ on the W register in
which the res pointer is held.

Thus, if the res pointer happened to be aligned to a 4GiB boundary, we'd
spuriously skip storing the timespec to it, while returning a zero error code
to the caller.

Prevent this by checking the whole pointer, using CBZ on the X register
the res pointer is held in.
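
A C-level illustration of the difference (the vdso itself is assembly;
this is not the real code):

  if ((u32)(unsigned long)res == 0)   /* buggy: CBZ wN tests low 32 bits */
          return 0;
  if ((unsigned long)res == 0)        /* fixed: CBZ xN tests the whole pointer */
          return 0;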

Fixes: 9031fefde6 ("arm64: VDSO support")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Andrew Pinski <apinski@cavium.com>
Reported-by: Mark Salyzyn <salyzyn@android.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 09:49:33 +00:00
Nick Desaulniers fd9dde6abc arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27
Upon upgrading to binutils 2.27, we found that our lz4 and gzip
compressed kernel images were significantly larger, resulting in 10ms
boot time regressions.

As noted by Rahul:
"aarch64 binaries uses RELA relocations, where each relocation entry
includes an addend value. This is similar to x86_64.  On x86_64, the
addend values are also stored at the relocation offset for relative
relocations. This is an optimization: in the case where code does not
need to be relocated, the loader can simply skip processing relative
relocations.  In binutils-2.25, both bfd and gold linkers did this for
x86_64, but only the gold linker did this for aarch64.  The kernel build
here is using the bfd linker, which stored zeroes at the relocation
offsets for relative relocations.  Since a set of zeroes compresses
better than a set of non-zero addend values, this behavior was resulting
in much better lz4 compression.

The bfd linker in binutils-2.27 is now storing the actual addend values
at the relocation offsets. The behavior is now consistent with what it
does for x86_64 and what gold linker does for both architectures.  The
change happened in this upstream commit:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=1f56df9d0d5ad89806c24e71f296576d82344613
Since a bunch of zeroes got replaced by non-zero addend values, we see
the side effect of lz4 compressed image being a bit bigger.

To get the old behavior from the bfd linker, "--no-apply-dynamic-relocs"
flag can be used:
$ LDFLAGS="--no-apply-dynamic-relocs" make
With this flag, the compressed image size is back to what it was with
binutils-2.25.

If the kernel is using ASLR, there aren't additional runtime costs to
--no-apply-dynamic-relocs, as the relocations will need to be applied
again anyway after the kernel is relocated to a random address.

If the kernel is not using ASLR, then presumably the current default
behavior of the linker is better. Since the static linker performed the
dynamic relocs, and the kernel is not moved to a different address at
load time, it can skip applying the relocations all over again."

Some measurements:

$ ld -v
GNU ld (binutils-2.25-f3d35cf6) 2.25.51.20141117
                    ^
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300652760 Oct 26 11:57 vmlinux
$ ls -l Image.lz4-dtb
-rw-r----- 1 ndesaulniers eng 16932627 Oct 26 11:57 Image.lz4-dtb

$ ld -v
GNU ld (binutils-2.27-53dd00a1) 2.27.0.20170315
                    ^
pre patch:
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300376208 Oct 26 11:43 vmlinux
$ ls -l Image.lz4-dtb
-rw-r----- 1 ndesaulniers eng 18159474 Oct 26 11:43 Image.lz4-dtb

post patch:
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300376208 Oct 26 12:06 vmlinux
$ ls -l Image.lz4-dtb
-rw-r----- 1 ndesaulniers eng 16932466 Oct 26 12:06 Image.lz4-dtb

By Siqi's measurement w/ gzip:
binutils 2.27 with this patch (with --no-apply-dynamic-relocs):
Image 41535488
Image.gz 13404067

binutils 2.27 without this patch (without --no-apply-dynamic-relocs):
Image 41535488
Image.gz 14125516

Any compression scheme should be able to get better results from the
longer runs of zeros, not just GZIP and LZ4.

10ms boot time savings isn't anything to get excited about, but users of
arm64+compression+bfd-2.27 should not have to pay a penalty for no
runtime improvement.

Reported-by: Gopinath Elanchezhian <gelanchezhian@google.com>
Reported-by: Sindhuri Pentyala <spentyala@google.com>
Reported-by: Wei Wang <wvw@google.com>
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Suggested-by: Rahul Chaudhry <rahulchaudhry@google.com>
Suggested-by: Siqi Lin <siqilin@google.com>
Suggested-by: Stephen Hines <srhines@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: added comment to Makefile]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-30 13:45:12 +00:00
Catalin Marinas 6218f96c58 arm64: Implement arch-specific pte_access_permitted()
The generic pte_access_permitted() implementation only checks for
pte_present() (together with the write permission where applicable).
However, for both kernel ptes and PROT_NONE mappings pte_present() also
returns true on arm64 even though such mappings are not user accessible.
Additionally, arm64 now supports execute-only user permission
(PROT_EXEC) which is implemented by clearing the PTE_USER bit.

With this patch the arm64 implementation of pte_access_permitted()
checks for the PTE_VALID and PTE_USER bits together with writable access
if applicable.
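
A hedged sketch of the resulting check, using arm64's existing pte
helpers (exact macro names assumed):

  #define pte_valid_user(pte) \
          ((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER))

  #define pte_access_permitted(pte, write) \
          (pte_valid_user(pte) && (!(write) || pte_write(pte)))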

Cc: <stable@vger.kernel.org>
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-30 12:41:15 +00:00
Will Deacon d7b1d22d38 arm64: uapi: Remove PSR_Q_BIT
PSTATE.Q only exists for AArch32, which can be referred to using
COMPAT_PSR_Q_BIT. Remove PSR_Q_BIT, since the native bit doesn't exist
in the architecture.

Tested-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-27 16:27:03 +01:00
Will Deacon b7300d4c03 arm64: traps: Pretty-print pstate in register dumps
We can decode the PSTATE easily enough, so pretty-print it in register
dumps.

Tested-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-27 16:26:58 +01:00
Will Deacon a25ffd3a63 arm64: traps: Don't print stack or raw PC/LR values in backtraces
Printing raw pointer values in backtraces has potential security
implications and is of questionable value anyway.

This patch follows x86's lead and removes the "Exception stack:" dump
from kernel backtraces, as well as converting PC/LR values to symbols
such as "sysrq_handle_crash+0x20/0x30".

Tested-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-27 16:26:53 +01:00
Mark Rutland 42dbf54e88 arm64: consistently log ESR and page table
When we take a fault we can't handle, we try to dump some relevant
information, but we're not consistent about doing so.

In do_mem_abort(), we log the full ESR, but don't dump a page table
walk. In __do_kernel_fault, we dump an attempted decoding of the ESR
(but not the ESR itself) along with a page table walk.

Let's try to make things more consistent by dumping the full ESR in
mem_abort_decode(), and having do_mem_abort dump a page table walk. The
existing dump of the ESR in do_mem_abort() is rendered redundant, and
removed.

Tested-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Julien Thierry <julien.thierry@arm.com>
Cc: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-27 16:26:47 +01:00
Dave Martin fa3eb71d96 arm64: asm-bug: Renumber macro local labels to avoid clashes
Currently ASM_BUG() and its constituent macros define local
assembler labels 0, 1 and 2 internally, which carries a high risk
of clash with callers' labels and consequent mis-assembly.

This patch gives the labels a big random offset to minimise the
chance of such errors.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-25 15:57:15 +01:00
Julien Thierry 6436beeee5 arm64: Fix single stepping in kernel traps
A Software Step exception is missing after stepping a trapped instruction.

Ensure SPSR.SS gets set to 0 after emulating/skipping a trapped instruction
before doing ERET.
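
Conceptually the fix looks like this when skipping a trapped instruction
(helper name hypothetical; user_fastforward_single_step() is the generic
ptrace hook):

  static void skip_faulting_instruction(struct pt_regs *regs,
                                        unsigned long size)
  {
          regs->pc += size;

          /* Re-arm single-step so the step exception fires after ERET. */
          user_fastforward_single_step(current);
  }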

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
[will: replaced AARCH32_INSN_SIZE with 4]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-25 11:57:33 +01:00
Julien Thierry e28cc02559 arm64: Use existing defines for mdscr
Literal values are being used to set single stepping in mdscr from
assembly code. There are already defines representing those values; use
those instead of the literal values.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-25 11:56:59 +01:00
Mark Salyzyn 9ca255bf04 arm64: Avoid aligning normal memory pointers in __memcpy_{to,from}io
__memcpy_{to,from}io fall back to byte-at-a-time copying if both the
source and destination pointers are not 8-byte aligned. Since one of the
pointers always points at normal memory, this is unnecessary and
detrimental to performance, so only do byte copying until we hit an 8-byte
boundary for the device pointer.

This change was motivated by performance issues in the pstore driver.
On a test platform, the measured probe time for pstore, with a console
buffer size of 1/4MB and a pmsg buffer of 1/2MB, was in the 90-107ms
region. This change reduced it to 10-25ms, an improvement in boot time.
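
A hedged sketch of the head of __memcpy_fromio() after the change
(byte-copy only until the device-side pointer is aligned):

  while (count && !IS_ALIGNED((unsigned long)from, 8)) {
          *(u8 *)to = __raw_readb(from);
          from++;
          to++;
          count--;
  }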

Cc: Kees Cook <keescook@chromium.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-24 16:23:07 +01:00
Will Deacon 1e0c661f05 Merge branch 'for-next/perf' into aarch64/for-next/core
Merge in ARM PMU and perf updates for 4.15:

  - Support for the Statistical Profiling Extension
  - Support for Hisilicon's SoC PMU

Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-24 16:06:56 +01:00
Julien Thierry 611479c79a arm/arm64: pmu: Distinguish percpu irq and percpu_devid irq
arm_pmu interrupts are marked as PERCPU even when these are not local
physical interrupts to a single CPU. When using non-local interrupts,
interrupts marked as PERCPU will not get freed nor disabled properly
by the PMU driver.

Check if interrupts are local to a single CPU with PERCPU_DEVID since
this is what the PMU driver really needs to know.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-24 16:04:05 +01:00
Julien Thierry 08395c7f4d irqdesc: Add function to identify percpu_devid irqs
irq_is_percpu indicates whether an irq should only target a single CPU.
The PERCPU_DEVID flag indicates that an irq can be configured differently
on each CPU it can target.

Provide a function to check whether an irq is PERCPU_DEVID.
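
A sketch of the new helper, mirroring the existing irq_is_percpu()
(exact field access assumed):

  bool irq_is_percpu_devid(unsigned int irq)
  {
          struct irq_desc *desc = irq_to_desc(irq);

          return desc && desc->status_use_accessors & IRQ_PER_CPU_DEVID;
  }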

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
2017-10-24 16:03:28 +01:00
Suzuki K Poulose 5bdecb7971 arm64: Fix the feature type for ID register fields
Now that the ARM ARM clearly specifies the rules for inferring
the values of the ID register fields, fix the types of the
feature bits we have in the kernel.

As per ARM ARM DDI0487B.b, section D10.1.4 "Principles of the
ID scheme for fields in ID registers" lists the registers to
which the scheme applies along with the exceptions.

This patch changes the relevant feature bits from FTR_EXACT
to FTR_LOWER_SAFE to select the safer value. This will enable
an older kernel running on a new CPU to detect the safer option
rather than completely disabling the feature.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19 17:42:08 +01:00
Shaokun Zhang 0714134214 arm64: MAINTAINERS: hisi: Add HiSilicon SoC PMU support
Add a MAINTAINERS entry for the HiSilicon SoC uncore PMU driver.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19 17:06:35 +01:00
Shaokun Zhang 904dcf03f0 perf: hisi: Add support for HiSilicon SoC DDRC PMU driver
This patch adds support for the DDRC PMU driver in the HiSilicon SoC
chip. Each DDRC has its own control, counter and interrupt registers
and is a separate PMU. Each DDRC PMU has 8 fixed-purpose counters which
the hardware maps to 8 events, so the driver assumes that the counter
index is equal to the event code (0 - 7). An interrupt is supported to
handle counter (32-bit) overflow.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Anurup M <anurup.m@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19 17:06:35 +01:00
Shaokun Zhang 2bab3cf910 perf: hisi: Add support for HiSilicon SoC HHA PMU driver
L3 cache coherence is maintained by the Hydra Home Agent (HHA) in the
HiSilicon SoC. This patch adds support for the HHA PMU driver. Each HHA
has its own control, counter and interrupt registers and is a separate
PMU. Each HHA PMU has 16 programmable counters and each counter is
free-running. An interrupt is supported to handle counter (48-bit)
overflow.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Anurup M <anurup.m@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19 17:06:35 +01:00
Shaokun Zhang 2940bc4333 perf: hisi: Add support for HiSilicon SoC L3C PMU driver
This patch adds support for the L3C PMU driver in the HiSilicon SoC
chip. Each L3C has its own control, counter and interrupt registers and
is a separate PMU. Each L3C PMU has 8 programmable counters and each
counter is free-running. An interrupt is supported to handle counter
(48-bit) overflow.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Anurup M <anurup.m@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19 17:06:34 +01:00
Shaokun Zhang 6ce4ef9419 perf: hisi: Add support for HiSilicon SoC uncore PMU driver
This patch adds the HiSilicon SoC uncore PMU driver framework and
interfaces.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Anurup M <anurup.m@huawei.com>
[will: Fix leader accounting in uncore group validation]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19 17:06:34 +01:00
Shaokun Zhang 3125b5b2a3 Documentation: perf: hisi: Documentation for HiSilicon SoC PMU driver
This patch adds documentation for the uncore PMUs on HiSilicon SoC.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Anurup M <anurup.m@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19 17:06:34 +01:00
Julien Thierry 3f7c86b238 arm64: Update fault_info table with new exception types
Based on: ARM Architecture Reference Manual, ARMv8 (DDI 0487B.b).

ARMv8.1 introduces the optional feature ARMv8.1-TTHM, which can trigger a
new type of memory abort. This exception is triggered when a hardware
update of the page table flags is not atomic with regard to other memory
accesses. Replace the corresponding unknown entry with a more accurate
one.

Cf: Section D10.2.28 ESR_ELx, Exception Syndrome Register (p D10-2381),
section D4.4.11 Restriction on memory types for hardware updates on page
tables (p D4-2116 - D4-2117).

ARMv8.2 does not add new exception types; however, it is worth mentioning
that when the RAS feature (mandatory in ARMv8.2, optional for
ARMv8.{0,1}) is implemented, exceptions related to "Synchronous parity or
ECC error on memory access, not on translation table walk" become
reserved and should not occur.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19 10:57:40 +01:00
Will Deacon d5d9696b03 drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension
The ARMv8.2 architecture introduces the optional Statistical Profiling
Extension (SPE).

SPE can be used to profile a population of operations in the CPU pipeline
after instruction decode. These are either architected instructions (i.e.
a dynamic instruction trace) or CPU-specific uops and the choice is fixed
statically in the hardware and advertised to userspace via caps/. Sampling
is controlled using a sampling interval, similar to a regular PMU counter,
but also with an optional random perturbation to avoid falling into patterns
where you continuously profile the same instruction in a hot loop.

After each operation is decoded, the interval counter is decremented. When
it hits zero, an operation is chosen for profiling and tracked within the
pipeline until it retires. Along the way, information such as TLB lookups,
cache misses, time spent to issue etc is captured in the form of a sample.
The sample is then filtered according to certain criteria (e.g. load
latency) that can be specified in the event config (described under
format/) and, if the sample satisfies the filter, it is written out to
memory as a record, otherwise it is discarded. Only one operation can
be sampled at a time.

The in-memory buffer is linear and virtually addressed, raising an
interrupt when it fills up. The PMU driver handles these interrupts to
give the appearance of a ring buffer, as expected by the AUX code.

The in-memory trace-like format is self-describing (though not parseable
in reverse) and written as a series of records, with each record
corresponding to a sample and consisting of a sequence of packets. These
packets are defined by the architecture, although some have CPU-specific
fields for recording information specific to the microarchitecture.

As a simple example, a record generated for a branch instruction may
consist of the following packets:

  0 (Address) : Virtual PC of the branch instruction
  1 (Type)    : Conditional direct branch
  2 (Counter) : Number of cycles taken from Dispatch to Issue
  3 (Address) : Virtual branch target + condition flags
  4 (Counter) : Number of cycles taken from Dispatch to Complete
  5 (Events)  : Mispredicted as not-taken
  6 (END)     : End of record

It is also possible to toggle properties such as timestamp packets in
each record.

This patch adds support for SPE in the form of a new perf driver.

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-18 12:53:34 +01:00
Will Deacon 4b8b77a4ec dt-bindings: Document devicetree binding for ARM SPE
This patch documents the devicetree binding in use for ARM SPE.

Cc: Rob Herring <robh@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-18 12:53:33 +01:00
Will Deacon b0c57e1071 arm64: head: Init PMSCR_EL2.{PA,PCT} when entered at EL2 without VHE
When booting at EL2, ensure that we permit the EL1 host to sample
physical addresses and physical counter values using SPE.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-18 12:53:32 +01:00
Will Deacon a173c390d9 arm64: sysreg: Move SPE registers and PSB into common header files
SPE is part of the v8.2 architecture, so move its system register and
field definitions into sysreg.h and the new PSB barrier into barrier.h

Finally, move KVM over to using the generic definitions so that it
doesn't have to open-code its own versions.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-18 12:53:32 +01:00
Will Deacon 085b30625e perf/core: Add PERF_AUX_FLAG_COLLISION to report colliding samples
The ARM SPE architecture permits an implementation to ignore a sample
if the sample is due to be taken whilst another sample is already being
produced. In this case, it is desirable to report the collision to
userspace, as they may want to lower the sample period.

This patch adds a PERF_AUX_FLAG_COLLISION flag, so that such events can
be relayed to userspace.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-18 12:53:31 +01:00
Will Deacon bc1d202023 perf/core: Export AUX buffer helpers to modules
Perf PMU drivers using AUX buffers cannot be built as modules unless
the AUX helpers are exported.

This patch exports perf_aux_output_{begin,end,skip} and perf_get_aux to
modules.

Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-18 12:53:30 +01:00
Will Deacon 5ffeb0501c genirq: export irq_get_percpu_devid_partition to modules
Any modular driver using cluster-affine PPIs needs to be able to call
irq_get_percpu_devid_partition so that it can enable the IRQ on the
correct subset of CPUs.

This patch exports the symbol so that it can be called from within a
module.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-18 12:53:29 +01:00
Will Deacon 0515ce0ff8 ACPI IORT updates for v4.15

Merge tag 'acpi/iort-for-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/linux into aarch64/for-next/core

Pull arm64 ACPI IORT updates from Lorenzo Pieralisi:

- Code clean-ups (A.Yadav, L.Pieralisi)
- Platform devices initialization rework in preparation for IORT PMCG
  handling (L.Pieralisi)
- Mapping API rework to enable MSIs for IORT components as defined in
  IORT specification issue C (H.Guo, L.Pieralisi)

Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-17 12:33:13 +01:00
Lorenzo Pieralisi 65637901a3 ACPI/IORT: Enable SMMUv3/PMCG IORT MSI domain set-up
ITS specific mappings for SMMUv3/PMCG components can be retrieved
through special index mapping entries introduced in IORT revision C.

Introduce a new API, iort_set_device_domain(), to set the MSI domain for
SMMUv3/PMCG nodes (extendable to any future IORT node requiring special
index ITS mapping entries), in order to enable MSI support for the
devices those nodes represent.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
2017-10-16 14:30:15 +01:00
Hanjun Guo 86456a3f19 ACPI/IORT: Add SMMUv3 specific special index mapping handling
IORT revision C introduced a mapping entry binding to describe ITS
device ID mapping for SMMUv3 MSI interrupts.

Enable the single mapping flag (ie the one used by the SMMUv3 component
for its special index mappings) for the SMMUv3 node in the IORT mapping
API, and add IORT code to handle the special index mapping entry for
SMMUv3 IORT nodes to enable their MSI interrupts. In case the ACPICA
support for the SMMUv3 device ID mapping is not ready, use the ACPICA
version as a guard for the function iort_get_id_mapping_index().

Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
[lorenzo.pieralisi@arm.com: patch split, typos fixing, rewrote the log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2017-10-16 14:30:05 +01:00
Hanjun Guo 8c8df8dcd6 ACPI/IORT: Enable special index ITS group mappings for IORT nodes
IORT revision C introduced SMMUv3 and PMCG MSI support by adding
specific mapping entries in the SMMUv3/PMCG subtables to retrieve
the device ID and the ITS group it maps to for a given SMMUv3/PMCG
IORT node.

Introduce a mapping function (ie iort_get_id_mapping_index()), that
for a given IORT node looks up if an ITS specific ID mapping entry
exists and if so retrieve the corresponding mapping index in the IORT
node mapping array.

Since an ITS specific index mapping can be present for an IORT
node that is not a leaf node (eg SMMUv3 - to describe its own
ITS device ID), special handling is required for two-step mapping
cases such as PCI/NamedComponent--->SMMUv3--->ITS, because the SMMUv3
ITS specific index mapping entry must be skipped to prevent the
IORT API from treating it as a regular mapping entry.

If we take the following IORT topology example:

|----------------------|
|  Root Complex Node   |
|----------------------|
|    map entry[x]      |
|----------------------|
|       id value       |
| output_reference     |
|---|------------------|
    |
    |   |----------------------|
    |-->|        SMMUv3        |
        |----------------------|
        |     SMMUv3 dev ID    |
        |     mapping index 0  |
        |----------------------|
        |      map entry[0]    |
        |----------------------|
        |       id value       |
        | output_reference-----------> ITS 1 (SMMU MSI domain)
        |----------------------|
        |      map entry[1]    |
        |----------------------|
        |       id value       |
        | output_reference-----------> ITS 2 (PCI MSI domain)
        |----------------------|

where the SMMUv3 ITS specific mapping entry is index 0 and it
represents the SMMUv3 ITS specific index mapping entry (describing its
own ITS device ID), we need to skip that mapping entry while carrying
out the Root Complex Node regular mappings to prevent erroneous
translations.

Reuse the iort_get_id_mapping_index() function to detect the ITS
specific mapping index for a specific IORT node and skip it in the IORT
mapping API (ie iort_node_map_id()) loop to prevent it from being
considered a normal PCI/Named Component ID mapping entry.

Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
[lorenzo.pieralisi@arm.com: split patch/rewrote commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2017-10-16 14:29:55 +01:00