Cache clean to PoP is subject to the same access controls as to PoC, so
if we are trapping userspace cache maintenance with SCTLR_EL1.UCI, we
need to be prepared to handle it. To avoid getting into complicated
fights with binutils about ARMv8.2 options, we'll just cheat and use the
raw SYS instruction rather than the 'proper' DC alias.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The ARMv8.2-DCPoP feature introduces persistent memory support to the
architecture, by defining a point of persistence in the memory
hierarchy, and a corresponding cache maintenance operation, DC CVAP.
Expose the support via HWCAP and MRS emulation.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
__inval_cache_range() is already the odd one out among our data cache
maintenance routines as the only remaining range-based one; as we're
going to want an invalidation routine to call from C code for the pmem
API, let's tweak the prototype and name to bring it in line with the
clean operations, and to make its relationship with __dma_inv_area()
neatly mirror that of __clean_dcache_area_poc() and __dma_clean_area().
The loop clearing the early page tables gets mildly massaged in the
process for the sake of consistency.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Rather than continue adding CPU-specific event maps, instead look up by
default in the PMUv3 event map and only fallback to the CPU-specific maps
if either the event isn't described by PMUv3, or it is described but
the PMCEID registers say that it is unsupported by the current CPU.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently, when unwinding the call stack, we validate the frame pointer
of each frame against frame.sp, whose value is not clearly defined, and
which makes it more difficult to link stack frames together across
different stacks. It is far better to simply check whether the frame
pointer itself points into a valid stack.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Our IRQ_STACK_PTR() and on_irq_stack() helpers both take a cpu argument,
used to generate a percpu address. In all cases, they are passed
{raw_,}smp_processor_id(), so this parameter is redundant.
Since {raw_,}smp_processor_id() use a percpu variable internally, this
approach means we generate a percpu offset to find the current cpu, then
use this to index an array of percpu offsets, which we then use to find
the current CPU's IRQ stack pointer. Thus, most of the work is
redundant.
Instead, we can consistently use raw_cpu_ptr() to generate the CPU's
irq_stack pointer by simply adding the percpu offset to the irq_stack
address, which is simpler in both respects.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Currently, cpu_switch_to and ret_from_fork both live in .entry.text,
though neither form the critical path for an exception entry.
In subsequent patches, we will require that code in .entry.text is part
of the critical path for exception entry, for which we can assume
certain properties (e.g. the presence of exception regs on the stack).
Neither cpu_switch_to nor ret_from_fork will meet these requirements, so
we must move them out of .entry.text. To ensure that neither are kprobed
after being moved out of .entry.text, we must explicitly blacklist them,
requiring a new NOKPROBE() asm helper.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
In most cases, our exception entry assembly branches to C handlers with
a BL instruction, but in cases where we do not expect to return, we use
B instead.
While this is correct today, it means that backtraces for fatal
exceptions miss the entry assembly (as the LR is stale at the point we
call C code), while non-fatal exceptions have the entry assembly in the
LR. In subsequent patches, we will need the LR to be set in these cases
in order to backtrace reliably.
This patch updates these sites to use a BL, ensuring consistency, and
preparing for backtrace rework. An ASM_BUG() is added after each of
these new BLs, which both catches unexpected returns, and ensures that
the LR value doesn't point to another function label.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Currently:
$ perf stat -e cycles:u -e cycles:k true
Performance counter stats for 'true':
2,24,699 cycles:u
<not counted> cycles:k (0.00%)
0.000788087 seconds time elapsed
We can not count more than one cycle counter in one instance,because we
allow to map cycle counter into PMCCNTR_EL0 only. However, if I did not
miss anything then specification do not prohibit to use PMEVCNTR<n>_EL0
for cycle count as well.
Modify the code so that it still prefers to use PMCCNTR_EL0 for cycle
counter, however allow to use PMEVCNTR<n>_EL0 if PMCCNTR_EL0 is already
in use.
After this patch:
$ perf stat -e cycles:u -e cycles:k true
Performance counter stats for 'true':
2,17,310 cycles:u
7,40,009 cycles:k
0.000764149 seconds time elapsed
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
may_use_simd() can be invoked from loadable modules and it accesses
kernel_neon_busy. Make sure that the latter is exported.
Fixes: cb84d11e16 ("arm64: neon: Remove support for nested or hardirq kernel-mode NEON")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The -1 "no syscall" value is written in various ways, shared with
the user ABI in some places, and generally obscure.
This patch attempts to make things a little more consistent and
readable by replacing all these uses with a single #define. A
couple of symbolic helpers are provided to clarify the intent
further.
Because the in-syscall check in do_signal() is changed from >= 0 to
!= NO_SYSCALL by this patch, different behaviour may be observable
if syscallno is set to values less than -1 by a tracer. However,
this is not different from the behaviour that is already observable
if a tracer sets syscallno to a value >= __NR_(compat_)syscalls.
It appears that this can cause spurious syscall restarting, but
that is not a new behaviour either, and does not appear harmful.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The upper 32 bits of the syscallno field in thread_struct are
handled inconsistently, being sometimes zero extended and sometimes
sign-extended. In fact, only the lower 32 bits seem to have any
real significance for the behaviour of the code: it's been OK to
handle the upper bits inconsistently because they don't matter.
Currently, the only place I can find where those bits are
significant is in calling trace_sys_enter(), which may be
unintentional: for example, if a compat tracer attempts to cancel a
syscall by passing -1 to (COMPAT_)PTRACE_SET_SYSCALL at the
syscall-enter-stop, it will be traced as syscall 4294967295
rather than -1 as might be expected (and as occurs for a native
tracer doing the same thing). Elsewhere, reads of syscallno cast
it to an int or truncate it.
There's also a conspicuous amount of code and casting to bodge
around the fact that although semantically an int, syscallno is
stored as a u64.
Let's not pretend any more.
In order to preserve the stp x instruction that stores the syscall
number in entry.S, this patch special-cases the layout of struct
pt_regs for big endian so that the newly 32-bit syscallno field
maps onto the low bits of the stored value. This is not beautiful,
but benchmarking of the getpid syscall on Juno suggests indicates a
minor slowdown if the stp is split into an stp x and stp w.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Support for kernel-mode NEON to be nested and/or used in hardirq
context adds significant complexity, and the benefits may be
marginal. In practice, kernel-mode NEON is not used in hardirq
context, and is rarely used in softirq context (by certain mac80211
drivers).
This patch implements an arm64 may_use_simd() function to allow
clients to check whether kernel-mode NEON is usable in the current
context, and simplifies kernel_neon_{begin,end}() to handle only
saving of the task FPSIMD state (if any). Without nesting, there
is no other state to save.
The partial fpsimd save/restore functions become redundant as a
result of these changes, so they are removed too.
The save/restore model is changed to operate directly on
task_struct without additional percpu storage. This simplifies the
code and saves a bit of memory, but means that softirqs must now be
disabled when manipulating the task fpsimd state from task context:
correspondingly, preempt_{en,dis}sable() calls are upgraded to
local_bh_{en,dis}able() as appropriate. fpsimd_thread_switch()
already runs with hardirqs disabled and so is already protected
from softirqs.
These changes should make it easier to support kernel-mode NEON in
the presence of the Scalable Vector extension in the future.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In order to be able to cope with kernel-mode NEON being unavailable
in hardirq/nmi context and non-nestable, we need special handling
for EFI runtime service calls that may be made during an interrupt
that interrupted a kernel_neon_begin()..._end() block. This will
occur if the kernel tries to write diagnostic data to EFI
persistent storage during a panic triggered by an NMI for example.
EFI runtime services specify an ABI that clobbers the FPSIMD state,
rather than being able to use it optionally as an accelerator.
This means that EFI is really a special case and can be handled
specially.
To enable EFI calls from interrupts, this patch creates dedicated
__efi_fpsimd_{begin,end}() helpers solely for this purpose, which
save/restore to a separate percpu buffer if called in a context
where kernel_neon_begin() is not usable.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
__this_cpu_ ops are not used consistently with regard to this_cpu_
ops in a couple of places in fpsimd.c.
Since preemption is explicitly disabled in
fpsimd_restore_current_state() and fpsimd_update_current_state(),
this patch converts this_cpu_ ops in those functions to __this_cpu_
ops. This doesn't save cost on arm64, but benefits from additional
assertions in the core code.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Multiple architectures define this as a trivial function, and I'm adding
another one as part of the RISC-V port. Add a __weak version of
pcibios_align_resource() and delete the now-obselete ones in a handful of
ports.
The only functional change should be that a handful of ports used to export
pcibios_fixup_bus(). Only some architectures export this, so I just
dropped it.
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Multiple architectures define this as an empty function, and I'm adding
another one as part of the RISC-V port. Add a __weak version of
pcibios_fixup_bus() and delete the now-obselete ones in a handful of
ports.
The only functional change should be that microblaze used to export
pcibios_fixup_bus(). None of the other architectures exports this, so I
just dropped it.
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In an ideal world, CNTFRQ_EL0 always contains the timer frequency
for the kernel to use. Sadly, we get quite a few broken systems
where the firmware authors cannot be bothered to program that
register on all CPUs, and rely on DT to provide that frequency.
So when trapping CNTFRQ_EL0, make sure to return the actual rate
(as known by the kernel), and not CNTFRQ_EL0.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Sparse complains about wrong address space used in __acpi_map_table()
and in __acpi_unmap_table().
arch/x86/kernel/acpi/boot.c:127:29: warning: incorrect type in return expression (different address spaces)
arch/x86/kernel/acpi/boot.c:127:29: expected char *
arch/x86/kernel/acpi/boot.c:127:29: got void [noderef] <asn:2>*
arch/x86/kernel/acpi/boot.c:135:23: warning: incorrect type in argument 1 (different address spaces)
arch/x86/kernel/acpi/boot.c:135:23: expected void [noderef] <asn:2>*addr
arch/x86/kernel/acpi/boot.c:135:23: got char *map
Correct address space to be in align of type of returned and passed
parameter.
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
struct siginfo is a union and the kernel since 2.4 has been hiding a union
tag in the high 16bits of si_code using the values:
__SI_KILL
__SI_TIMER
__SI_POLL
__SI_FAULT
__SI_CHLD
__SI_RT
__SI_MESGQ
__SI_SYS
While this looks plausible on the surface, in practice this situation has
not worked well.
- Injected positive signals are not copied to user space properly
unless they have these magic high bits set.
- Injected positive signals are not reported properly by signalfd
unless they have these magic high bits set.
- These kernel internal values leaked to userspace via ptrace_peek_siginfo
- It was possible to inject these kernel internal values and cause the
the kernel to misbehave.
- Kernel developers got confused and expected these kernel internal values
in userspace in kernel self tests.
- Kernel developers got confused and set si_code to __SI_FAULT which
is SI_USER in userspace which causes userspace to think an ordinary user
sent the signal and that it was not kernel generated.
- The values make it impossible to reorganize the code to transform
siginfo_copy_to_user into a plain copy_to_user. As si_code must
be massaged before being passed to userspace.
So remove these kernel internal si codes and make the kernel code simpler
and more maintainable.
To replace these kernel internal magic si_codes introduce the helper
function siginfo_layout, that takes a signal number and an si_code and
computes which union member of siginfo is being used. Have
siginfo_layout return an enumeration so that gcc will have enough
information to warn if a switch statement does not handle all of union
members.
A couple of architectures have a messed up ABI that defines signal
specific duplications of SI_USER which causes more special cases in
siginfo_layout than I would like. The good news is only problem
architectures pay the cost.
Update all of the code that used the previous magic __SI_ values to
use the new SIL_ values and to call siginfo_layout to get those
values. Escept where not all of the cases are handled remove the
defaults in the switch statements so that if a new case is missed in
the future the lack will show up at compile time.
Modify the code that copies siginfo si_code to userspace to just copy
the value and not cast si_code to a short first. The high bits are no
longer used to hold a magic union member.
Fixup the siginfo header files to stop including the __SI_ values in
their constants and for the headers that were missing it to properly
update the number of si_codes for each signal type.
The fixes to copy_siginfo_from_user32 implementations has the
interesting property that several of them perviously should never have
worked as the __SI_ values they depended up where kernel internal.
With that dependency gone those implementations should work much
better.
The idea of not passing the __SI_ values out to userspace and then
not reinserting them has been tested with criu and criu worked without
changes.
Ref: 2.4.0-test1
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Will Deacon <will.deacon@arm.com>
In current die(), the irq is disabled for __die() handle, not
including the possible panic() handling. Since the log in __die()
can take several hundreds ms, new irq might come and interrupt
current die().
If the process calling die() holds some critical resource, and some
other process scheduled later also needs it, then it would deadlock.
The first panic will not be executed.
So here disable irq for the whole flow of die().
Signed-off-by: Qiao Zhou <qiaozhou@asrmicro.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Ensure the address limit is a user-mode segment before returning to
user-mode. Otherwise a process can corrupt kernel-mode memory and
elevate privileges [1].
The set_fs function sets the TIF_SETFS flag to force a slow path on
return. In the slow path, the address limit is checked to be USER_DS if
needed.
[1] https://bugs.chromium.org/p/project-zero/issues/detail?id=990
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: kernel-hardening@lists.openwall.com
Cc: Will Deacon <will.deacon@arm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Will Drewry <wad@chromium.org>
Cc: linux-api@vger.kernel.org
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/20170615011203.144108-3-thgarnie@google.com
- Better machine check handling for HV KVM
- Ability to support guests with threads=2, 4 or 8 on POWER9
- Fix for a race that could cause delayed recognition of signals
- Fix for a bug where POWER9 guests could sleep with interrupts pending.
ARM:
- VCPU request overhaul
- allow timer and PMU to have their interrupt number selected from userspace
- workaround for Cavium erratum 30115
- handling of memory poisonning
- the usual crop of fixes and cleanups
s390:
- initial machine check forwarding
- migration support for the CMMA page hinting information
- cleanups and fixes
x86:
- nested VMX bugfixes and improvements
- more reliable NMI window detection on AMD
- APIC timer optimizations
Generic:
- VCPU request overhaul + documentation of common code patterns
- kvm_stat improvements
There is a small conflict in arch/s390 due to an arch-wide field rename.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJZW4XTAAoJEL/70l94x66DkhMH/izpk54KI17PtyQ9VYI2sYeZ
BWK6Kl886g3ij4pFi3pECqjDJzWaa3ai+vFfzzpJJ8OkCJT5Rv4LxC5ERltVVmR8
A3T1I/MRktSC0VJLv34daPC2z4Lco/6SPipUpPnL4bE2HATKed4vzoOjQ3tOeGTy
dwi7TFjKwoVDiM7kPPDRnTHqCe5G5n13sZ49dBe9WeJ7ttJauWqoxhlYosCGNPEj
g8ZX8+cvcAhVnz5uFL8roqZ8ygNEQq2mgkU18W8ZZKuiuwR0gdsG0gSBFNTdwIMK
NoreRKMrw0+oLXTIB8SZsoieU6Qi7w3xMAMabe8AJsvYtoersugbOmdxGCr1lsA=
=OD7H
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"PPC:
- Better machine check handling for HV KVM
- Ability to support guests with threads=2, 4 or 8 on POWER9
- Fix for a race that could cause delayed recognition of signals
- Fix for a bug where POWER9 guests could sleep with interrupts pending.
ARM:
- VCPU request overhaul
- allow timer and PMU to have their interrupt number selected from userspace
- workaround for Cavium erratum 30115
- handling of memory poisonning
- the usual crop of fixes and cleanups
s390:
- initial machine check forwarding
- migration support for the CMMA page hinting information
- cleanups and fixes
x86:
- nested VMX bugfixes and improvements
- more reliable NMI window detection on AMD
- APIC timer optimizations
Generic:
- VCPU request overhaul + documentation of common code patterns
- kvm_stat improvements"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (124 commits)
Update my email address
kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS
x86: kvm: mmu: use ept a/d in vmcs02 iff used in vmcs12
kvm: x86: mmu: allow A/D bits to be disabled in an mmu
x86: kvm: mmu: make spte mmio mask more explicit
x86: kvm: mmu: dead code thanks to access tracking
KVM: PPC: Book3S: Fix typo in XICS-on-XIVE state saving code
KVM: PPC: Book3S HV: Close race with testing for signals on guest entry
KVM: PPC: Book3S HV: Simplify dynamic micro-threading code
KVM: x86: remove ignored type attribute
KVM: LAPIC: Fix lapic timer injection delay
KVM: lapic: reorganize restart_apic_timer
KVM: lapic: reorganize start_hv_timer
kvm: nVMX: Check memory operand to INVVPID
KVM: s390: Inject machine check into the nested guest
KVM: s390: Inject machine check into the guest
tools/kvm_stat: add new interactive command 'b'
tools/kvm_stat: add new command line switch '-i'
tools/kvm_stat: fix error on interactive command 'g'
KVM: SVM: suppress unnecessary NMI singlestep on GIF=0 and nested exit
...
Here is the big driver core update for 4.13-rc1.
The large majority of this is a lot of cleanup of old fields in the
driver core structures and their remaining usages in random drivers.
All of those fixes have been reviewed by the various subsystem
maintainers. There's also some small firmware updates in here, a new
kobject uevent api interface that makes userspace interaction easier,
and a few other minor things.
All of these have been in linux-next for a long while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWVpX4A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymobgCfd0d13IfpZoq1N41wc6z2Z0xD7cwAnRMeH1/p
kEeISGpHPYP9f8PBh9FO
=Hfqt
-----END PGP SIGNATURE-----
Merge tag 'driver-core-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the big driver core update for 4.13-rc1.
The large majority of this is a lot of cleanup of old fields in the
driver core structures and their remaining usages in random drivers.
All of those fixes have been reviewed by the various subsystem
maintainers. There's also some small firmware updates in here, a new
kobject uevent api interface that makes userspace interaction easier,
and a few other minor things.
All of these have been in linux-next for a long while with no reported
issues"
* tag 'driver-core-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (56 commits)
arm: mach-rpc: ecard: fix build error
zram: convert remaining CLASS_ATTR() to CLASS_ATTR_RO()
driver-core: remove struct bus_type.dev_attrs
powerpc: vio_cmo: use dev_groups and not dev_attrs for bus_type
powerpc: vio: use dev_groups and not dev_attrs for bus_type
USB: usbip: convert to use DRIVER_ATTR_RW
s390: drivers: convert to use DRIVER_ATTR_RO/WO
platform: thinkpad_acpi: convert to use DRIVER_ATTR_RO/RW
pcmcia: ds: convert to use DRIVER_ATTR_RO
wireless: ipw2x00: convert to use DRIVER_ATTR_RW
net: ehea: convert to use DRIVER_ATTR_RO
net: caif: convert to use DRIVER_ATTR_RO
TTY: hvc: convert to use DRIVER_ATTR_RW
PCI: pci-driver: convert to use DRIVER_ATTR_WO
IB: nes: convert to use DRIVER_ATTR_RW
HID: hid-core: convert to use DRIVER_ATTR_RO and drv_groups
arm: ecard: fix dev_groups patch typo
tty: serdev: use dev_groups and not dev_attrs for bus_type
sparc: vio: use dev_groups and not dev_attrs for bus_type
hid: intel-ish-hid: use dev_groups and not dev_attrs for bus_type
...
Pull SMP hotplug updates from Thomas Gleixner:
"This update is primarily a cleanup of the CPU hotplug locking code.
The hotplug locking mechanism is an open coded RWSEM, which allows
recursive locking. The main problem with that is the recursive nature
as it evades the full lockdep coverage and hides potential deadlocks.
The rework replaces the open coded RWSEM with a percpu RWSEM and
establishes full lockdep coverage that way.
The bulk of the changes fix up recursive locking issues and address
the now fully reported potential deadlocks all over the place. Some of
these deadlocks have been observed in the RT tree, but on mainline the
probability was low enough to hide them away."
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
cpu/hotplug: Constify attribute_group structures
powerpc: Only obtain cpu_hotplug_lock if called by rtasd
ARM/hw_breakpoint: Fix possible recursive locking for arch_hw_breakpoint_init
cpu/hotplug: Remove unused check_for_tasks() function
perf/core: Don't release cred_guard_mutex if not taken
cpuhotplug: Link lock stacks for hotplug callbacks
acpi/processor: Prevent cpu hotplug deadlock
sched: Provide is_percpu_thread() helper
cpu/hotplug: Convert hotplug locking to percpu rwsem
s390: Prevent hotplug rwsem recursion
arm: Prevent hotplug rwsem recursion
arm64: Prevent cpu hotplug rwsem recursion
kprobes: Cure hotplug lock ordering issues
jump_label: Reorder hotplug lock and jump_label_lock
perf/tracing/cpuhotplug: Fix locking order
ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus()
PCI: Replace the racy recursion prevention
PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode()
x86/perf: Drop EXPORT of perf_check_microcode
...
Pull timer updates from Thomas Gleixner:
"A rather large update for timers/timekeeping:
- compat syscall consolidation (Al Viro)
- Posix timer consolidation (Christoph Helwig / Thomas Gleixner)
- Cleanup of the device tree based initialization for clockevents and
clocksources (Daniel Lezcano)
- Consolidation of the FTTMR010 clocksource/event driver (Linus
Walleij)
- The usual set of small fixes and updates all over the place"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (93 commits)
timers: Make the cpu base lock raw
clocksource/drivers/mips-gic-timer: Fix an error code in 'gic_clocksource_of_init()'
clocksource/drivers/fsl_ftm_timer: Unmap region obtained by of_iomap
clocksource/drivers/tcb_clksrc: Make IO endian agnostic
clocksource/drivers/sun4i: Switch to the timer-of common init
clocksource/drivers/timer-of: Fix invalid iomap check
Revert "ktime: Simplify ktime_compare implementation"
clocksource/drivers: Fix uninitialized variable use in timer_of_init
kselftests: timers: Add test for frequency step
kselftests: timers: Fix inconsistency-check to not ignore first timestamp
time: Add warning about imminent deprecation of CONFIG_GENERIC_TIME_VSYSCALL_OLD
time: Clean up CLOCK_MONOTONIC_RAW time handling
posix-cpu-timers: Make timespec to nsec conversion safe
itimer: Make timeval to nsec conversion range limited
timers: Fix parameter description of try_to_del_timer_sync()
ktime: Simplify ktime_compare implementation
clocksource/drivers/fttmr010: Factor out clock read code
clocksource/drivers/fttmr010: Implement delay timer
clocksource/drivers: Add timer-of common init routine
clocksource/drivers/tcb_clksrc: Save timer context on suspend/resume
...
Pull scheduler updates from Ingo Molnar:
"The main changes in this cycle were:
- Add the SYSTEM_SCHEDULING bootup state to move various scheduler
debug checks earlier into the bootup. This turns silent and
sporadically deadly bugs into nice, deterministic splats. Fix some
of the splats that triggered. (Thomas Gleixner)
- A round of restructuring and refactoring of the load-balancing and
topology code (Peter Zijlstra)
- Another round of consolidating ~20 of incremental scheduler code
history: this time in terms of wait-queue nomenclature. (I didn't
get much feedback on these renaming patches, and we can still
easily change any names I might have misplaced, so if anyone hates
a new name, please holler and I'll fix it.) (Ingo Molnar)
- sched/numa improvements, fixes and updates (Rik van Riel)
- Another round of x86/tsc scheduler clock code improvements, in hope
of making it more robust (Peter Zijlstra)
- Improve NOHZ behavior (Frederic Weisbecker)
- Deadline scheduler improvements and fixes (Luca Abeni, Daniel
Bristot de Oliveira)
- Simplify and optimize the topology setup code (Lauro Ramos
Venancio)
- Debloat and decouple scheduler code some more (Nicolas Pitre)
- Simplify code by making better use of llist primitives (Byungchul
Park)
- ... plus other fixes and improvements"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits)
sched/cputime: Refactor the cputime_adjust() code
sched/debug: Expose the number of RT/DL tasks that can migrate
sched/numa: Hide numa_wake_affine() from UP build
sched/fair: Remove effective_load()
sched/numa: Implement NUMA node level wake_affine()
sched/fair: Simplify wake_affine() for the single socket case
sched/numa: Override part of migrate_degrades_locality() when idle balancing
sched/rt: Move RT related code from sched/core.c to sched/rt.c
sched/deadline: Move DL related code from sched/core.c to sched/deadline.c
sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabled
sched/fair: Spare idle load balancing on nohz_full CPUs
nohz: Move idle balancer registration to the idle path
sched/loadavg: Generalize "_idle" naming to "_nohz"
sched/core: Drop the unused try_get_task_struct() helper function
sched/fair: WARN() and refuse to set buddy when !se->on_rq
sched/debug: Fix SCHED_WARN_ON() to return a value on !CONFIG_SCHED_DEBUG as well
sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming
sched/wait: Move bit_wait_table[] and related functionality from sched/core.c to sched/wait_bit.c
sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h>
sched/wait: Re-adjust macro line continuation backslashes in <linux/wait.h>
...
Pull EFI updates from Ingo Molnar:
"The main changes in this cycle were:
- Rework the EFI capsule loader to allow for workarounds for
non-compliant firmware (Ard Biesheuvel)
- Implement a capsule loader quirk for Quark X102x (Jan Kiszka)
- Enable SMBIOS/DMI support for the ARM architecture (Ard Biesheuvel)
- Add CONFIG_EFI_PGT_DUMP=y support for x86-32 and kexec (Sai
Praneeth)
- Fixes for EFI support for Xen dom0 guests running under x86-64
hosts (Daniel Kiper)"
* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/xen/efi: Initialize only the EFI struct members used by Xen
efi: Process the MEMATTR table only if EFI_MEMMAP is enabled
efi/arm: Enable DMI/SMBIOS
x86/efi: Extend CONFIG_EFI_PGT_DUMP support to x86_32 and kexec as well
efi/efi_test: Use memdup_user() helper
efi/capsule: Add support for Quark security header
efi/capsule-loader: Use page addresses rather than struct page pointers
efi/capsule-loader: Redirect calls to efi_capsule_setup_info() via weak alias
efi/capsule: Remove NULL test on kmap()
efi/capsule-loader: Use a cached copy of the capsule header
efi/capsule: Adjust return type of efi_capsule_setup_info()
efi/capsule: Clean up pr_err/_info() messages
efi/capsule: Remove pr_debug() on ENOMEM or EFAULT
efi/capsule: Fix return code on failing kmap/vmap
With the introduction of struct pci_host_bridge.map_irq pointer it is
possible to assign IRQs for all devices originating from a PCI host bridge
at probe time; this is implemented through pci_assign_irq() that relies on
the struct pci_host_bridge.map_irq pointer to map IRQ for a given device.
The benefits this brings are twofold:
- the IRQ for a device is assigned once at probe time
- the IRQ assignment works also for hotplugged devices
With all DT based PCI host bridges converted to the struct
pci_host_bridge.{map/swizzle}_irq hooks mechanism the DT IRQ allocation in
ARM64 pcibios_alloc_irq() is now redundant and can be removed.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Will Deacon <will.deacon@arm.com>
attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by <linux/sysfs.h> work with const
attribute_group. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
- vcpu request overhaul
- allow timer and PMU to have their interrupt number
selected from userspace
- workaround for Cavium erratum 30115
- handling of memory poisonning
- the usual crop of fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAllWCM0VHG1hcmMuenlu
Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDjJ0QAI16x6+trKhH31lTSYekYfqm4hZ2
Fp7IbALW9KNCaY35tZov2Zuh99qGRduxTh7ewqhKpON8kkU+UKj0F7zH22+vfN4m
yas/+uNr8R9VLyvea4ysPsgx8Q8v1Ix9setohHYNZIL9/klVqtaHpYvArHVF/mzq
p2j/NxRS2dlp9r2TtoMRMhA05u6r0wolhUuh+z9v2ipib0gfOBIG24jsqCTEcD9n
5A/cVd+ztYshkrV95h3y9peahwt3zOA4QBGzrQ2K25jp0s54nqhmC7JTNSa8dtar
YGW2MuAMoIFTwCFAlpwCzrwpOJFzF3Q6A8bOxei2fjclzjPMgT1xQxuhOoe4ntFa
lTPxSHalm5W6dFTW90YSo2DBcPe+N7sQkhjR0cCeY3GYsOFhXMLTlOl5Pt1YK1or
+3FAI74tFRKvVmb9mhZeGTvuzhDgRvtf3Qq5rjwlGzKc2BBOEgtMyj/Wgwo4N6Dz
IjOnoRaUGELoBCWoTorMxLpsPBdPVSUxNyJTdAhqZ/ZtT1xqjhFNLZcrVWmOTzDM
1cav+jZkla4sLmJSNDD54aCSvvtPHis0nZn9PRlh12xgOyYiAVx4K++MNuWP0P37
hbh1gbPT+FcoVxPurUsX/pjNlTucPZcBwFytZDQlpwtPBpEFzJiImLYe/PldRb0f
9WQOH1Y1+q14MF+N
=6hNK
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/ARM updates for 4.13
- vcpu request overhaul
- allow timer and PMU to have their interrupt number
selected from userspace
- workaround for Cavium erratum 30115
- handling of memory poisonning
- the usual crop of fixes and cleanups
Conflicts:
arch/s390/include/asm/kvm_host.h
Now that compat_vfp_get() uses the regset API to copy the FPSCR
value out to userspace, compat_vfp_set() looks inconsistent. In
particular, compat_vfp_set() will fail if called with kbuf != NULL
&& ubuf == NULL (which is valid usage according to the regset API).
This patch fixes compat_vfp_set() to use user_regset_copyin(),
similarly to compat_vfp_get().
This also squashes a sparse warning triggered by the cast that
drops __user when calling get_user().
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
compat_vfp_set() checks for userspace trying to write an excessive
amount of data to the regset. However this check is conspicuous
for its absence from every other _set() in the arm64 ptrace
implementation. In fact, the core ptrace_regset() already clamps
userspace's iov_len to the regset size before the individual regset
.{get,set}() methods get called.
This patch removes the redundant check.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
If get_user() fails when reading the new FPSCR value from userspace
in compat_vfp_get(), then garbage* will be written to the task's
FPSR and FPCR registers.
This patch prevents this by checking the return from get_user()
first.
[*] Actually, zero, due to the behaviour of get_user() on error, but
that's still not what userspace expects.
Fixes: 478fcb2cdb ("arm64: Debugging support")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
get_alt_insn() is used to read and create ARM instructions, which
are always stored in memory in little-endian order. These values
are thus correctly converted to/from native order when processed
but the pointers used to hold the address of these instructions
are declared as for native order values.
Fix this by declaring the pointers as __le32* instead of u32* and
make the few appropriate needed changes like removing the unneeded
cast '(u32*)' in front of __ALT_PTR()'s definition.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In the flattened device tree format, all integer properties are
in big-endian order.
Here the property "kaslr-seed" is read from the fdt and then
correctly converted to native order (via fdt64_to_cpu()) but the
pointer used for this is not annotated as being for big-endian.
Fix this by declaring the pointer as fdt64_t instead of u64
(fdt64_t being itself typedefed to __be64).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Here both variables 'cpu_id' and 'entry_point' are read via
read[lq]_relaxed(), from a little-endian annotated pointer
and then used as a native endian value.
This is correct since the read[lq]() family of function
internally do a little-to-native endian conversion.
But in this case, it is wrong to declare these variable as
little-endian since there are native ones.
Fix this by changing the declaration of these variables
as 'u32' or 'u64' instead of '__le32' / '__le64'.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Here the entrypoint, declared as a 64 bit integer, is read from
a pointer to 64bit integer but the read is done via readl_relaxed()
which is for 32bit quantities.
All the high bits will thus be lost which change the meaning
of the test against zero done later.
Fix this by using readq_relaxed() instead as it should be for
64bit quantities.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Here the functions reloc_insn_movw() & reloc_insn_imm() are used
to read, modify and write back ARM instructions, which are always
stored in memory in little-endian order. These values are thus
correctly converted to/from native order but the pointers used to
hold their addresses are declared as for native order values.
Fix this by declaring the pointers as __le32* and remove the
casts that are now unneeded.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
aarch64_insn_write() is used to write an instruction.
As on ARM64 in-memory instructions are always stored
in little-endian order, this function, taking the instruction
opcode in native order, correctly convert it to little-endian
before sending it to an helper function __aarch64_insn_write()
which will do the effective write.
This is all good, but the variable and argument holding the
converted value are not annotated for a little-endian value
but left for native values.
Fix this by adjusting the prototype of the helper and
directly using the result of cpu_to_le32() without passing
by an intermediate variable (which was not a distinct one
but the same as the one holding the native value).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The function arch64_insn_read() is used to read an instruction.
On AM64 instructions are always stored in little-endian order
and thus the function correctly do a little-to-native endian
conversion to the value just read.
However, the variable used to hold the value before the conversion
is not declared for a little-endian value but for a native one.
Fix this by using the correct type for the declaration: __le32
Note: This only works because the function reading the value,
probe_kernel_read((), takes a void pointer and void pointers
are endian-agnostic. Otherwise probe_kernel_read() should
also be properly annotated (or worse, need to be specialized).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Here we're reading thumb or ARM instructions, which are always
stored in memory in little-endian order. These values are thus
correctly converted to native order but the intermediate value
should be annotated as for little-endian values.
Fix this by declaring the intermediate var as __le32 or __le16.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Here we're reading thumb or ARM instructions, which are always
stored in memory in little-endian order. These values are thus
correctly converted to native order but the intermediate value
should be annotated as for little-endian values.
Fix this by declaring the intermediate var as __le32 or __le16.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When a kernel is built without CONFIG_ARM64_MODULE_PLTS, we don't
generate the expected branch instruction in ftrace_make_nop(). This
means we pass zero (rather than a valid branch) to ftrace_modify_code()
as the expected instruction to validate. This causes us to return
-EINVAL to the core ftrace code for a valid case, resulting in a splat
at boot time.
This was an unintended effect of commit:
687644209a ("arm64: ftrace: fix building without CONFIG_MODULES")
... which incorrectly moved the generation of the branch instruction
into the ifdef for CONFIG_ARM64_MODULE_PLTS.
This patch fixes the issue by moving the ifdef inside of the relevant
if-else case, and always checking that the branch is in range,
regardless of CONFIG_ARM64_MODULE_PLTS. This ensures that we generate
the expected branch instruction, and also improves our sanity checks.
For consistency, both ftrace_make_nop() and ftrace_make_call() are
updated with this pattern.
Fixes: 687644209a ("arm64: ftrace: fix building without CONFIG_MODULES")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch defines an extra_context signal frame record that can be
used to describe an expanded signal frame, and modifies the context
block allocator and signal frame setup and parsing code to create,
populate, parse and decode this block as necessary.
To avoid abuse by userspace, parse_user_sigframe() attempts to
ensure that:
* no more than one extra_context is accepted;
* the extra context data is a sensible size, and properly placed
and aligned.
The extra_context data is required to start at the first 16-byte
aligned address immediately after the dummy terminator record
following extra_context in rt_sigframe.__reserved[] (as ensured
during signal delivery). This serves as a sanity-check that the
signal frame has not been moved or copied without taking the extra
data into account.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
[will: add __force annotation when casting extra_datap to __user pointer]
Signed-off-by: Will Deacon <will.deacon@arm.com>
When debugging a kernel panic(), it can be useful to know which CPU
features have been detected by the kernel, as some code paths can depend
on these (and may have been patched at runtime).
This patch adds a notifier to dump the detected CPU caps (as a hex
string) at panic(), when we log other information useful for debugging.
On a Juno R1 system running v4.12-rc5, this looks like:
[ 615.431249] Kernel panic - not syncing: Fatal exception in interrupt
[ 615.437609] SMP: stopping secondary CPUs
[ 615.441872] Kernel Offset: disabled
[ 615.445372] CPU features: 0x02086
[ 615.448522] Memory Limit: none
A developer can decode this by looking at the corresponding
<asm/cpucaps.h> bits. For example, the above decodes as:
* bit 1: ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE
* bit 2: ARM64_WORKAROUND_845719
* bit 7: ARM64_WORKAROUND_834220
* bit 13: ARM64_HAS_32BIT_EL0
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When reading current's user-writable TLS register (which occurs
when dumping core for native tasks), it is possible that userspace
has modified it since the time the task was last scheduled out.
The new TLS register value is not guaranteed to have been written
immediately back to thread_struct in this case.
As a result, a coredump can capture stale data for this register.
Reading the register for a stopped task via ptrace is unaffected.
For native tasks, this patch explicitly flushes the TPIDR_EL0
register back to thread_struct before dumping when operating on
current, thus ensuring that coredump contents are up to date. For
compat tasks, the TLS register is not user-writable and so cannot
be out of sync, so no flush is required in compat_tls_get().
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When reading the FPSIMD state of current (which occurs when dumping
core), it is possible that userspace has modified the FPSIMD
registers since the time the task was last scheduled out. Such
changes are not guaranteed to be reflected immedately in
thread_struct.
As a result, a coredump can contain stale values for these
registers. Reading the registers of a stopped task via ptrace is
unaffected.
This patch explicitly flushes the CPU state back to thread_struct
before dumping when operating on current, thus ensuring that
coredump contents are up to date.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently, VFP registers are omitted from coredumps for compat
processes, due to a bug in the REGSET_COMPAT_VFP regset
implementation.
compat_vfp_get() needs to transfer non-contiguous data from
thread_struct.fpsimd_state, and uses put_user() to handle the
offending trailing word (FPSCR). This fails when copying to a
kernel address (i.e., kbuf && !ubuf), which is what happens when
dumping core. As a result, the ELF coredump core code silently
omits the NT_ARM_VFP note from the dump.
It would be possible to work around this with additional special
case code for the put_user(), but since user_regset_copyout() is
explicitly designed to handle this scenario it is cleaner to port
the put_user() to a user_regset_copyout() call, which this patch
does.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Merge time(keeping) updates from John Stultz:
"Just a small set of changes, the biggest changes being the MONOTONIC_RAW
handling cleanup, and a new kselftest from Miroslav. Also a a clear
warning deprecating CONFIG_GENERIC_TIME_VSYSCALL_OLD, which affects ppc
and ia64."
Now that we fixed the sub-ns handling for CLOCK_MONOTONIC_RAW,
remove the duplicitive tk->raw_time.tv_nsec, which can be
stored in tk->tkr_raw.xtime_nsec (similarly to how its handled
for monotonic time).
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Daniel Mentz <danielmentz@google.com>
Tested-by: Daniel Mentz <danielmentz@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
This patch factors out the allocator for signal frame optional
records into a separate function, to ensure consistency and
facilitate later expansion.
No overrun checking is currently done, because the allocation is in
user memory and anyway the kernel never tries to allocate enough
space in the signal frame yet for an overrun to occur. This
behaviour will be refined in future patches.
The approach taken in this patch to allocation of the terminator
record is not very clean: this will also be replaced in subsequent
patches.
For future extension, a comment is added in sigcontext.h
documenting the current static allocations in __reserved[]. This
will be important for determining under what circumstances
userspace may or may not see an expanded signal frame.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently, rt_sigreturn does very limited checking on the
sigcontext coming from userspace.
Future additions to the sigcontext data will increase the potential
for surprises. Also, it is not clear whether the sigcontext
extension records are supposed to occur in a particular order.
To allow the parsing code to be extended more easily, this patch
factors out the sigcontext parsing into a separate function, and
adds extra checks to validate the well-formedness of the sigcontext
structure.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In order to be able to increase the amount of the data currently
written to the __reserved[] array in the signal frame, it is
necessary to overwrite the locations currently occupied by the
{fp,lr} frame link record pushed at the top of the signal stack.
In order for this to work, this patch detaches the frame link
record from struct rt_sigframe and places it separately at the top
of the signal stack. This will allow subsequent patches to insert
data between it and __reserved[].
This change relies on the non-ABI status of the placement of the
frame record with respect to struct sigframe: this status is
undocumented, but the placement is not declared or described in the
user headers, and known unwinder implementations (libgcc,
libunwind, gdb) appear not to rely on it.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Conflicts:
kernel/sched/Makefile
Pick up the waitqueue related renames - it didn't get much feedback,
so it appears to be uncontroversial. Famous last words? ;-)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Recently vDSO support for CLOCK_MONOTONIC_RAW was added in
49eea433b3 ("arm64: Add support for CLOCK_MONOTONIC_RAW in
clock_gettime() vDSO"). Noticing that the core timekeeping code
never set tkr_raw.xtime_nsec, the vDSO implementation didn't
bother exposing it via the data page and instead took the
unshifted tk->raw_time.tv_nsec value which was then immediately
shifted left in the vDSO code.
Unfortunately, by accellerating the MONOTONIC_RAW clockid, it
uncovered potential 1ns time inconsistencies caused by the
timekeeping core not handing sub-ns resolution.
Now that the core code has been fixed and is actually setting
tkr_raw.xtime_nsec, we need to take that into account in the
vDSO by adding it to the shifted raw_time value, in order to
fix the user-visible inconsistency. Rather than do that at each
use (and expand the data page in the process), instead perform
the shift/addition operation when populating the data page and
remove the shift from the vDSO code entirely.
[jstultz: minor whitespace tweak, tried to improve commit
message to make it more clear this fixes a regression]
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Daniel Mentz <danielmentz@google.com>
Acked-by: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: "stable #4 . 8+" <stable@vger.kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Link: http://lkml.kernel.org/r/1496965462-20003-4-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The kernel watchdog is a great debugging tool for finding tasks that
consume a disproportionate amount of CPU time in contiguous chunks. One
can imagine building a similar watchdog for arbitrary driver threads
using save_stack_trace_tsk() and print_stack_trace(). However, this is
not viable for dynamically loaded driver modules on ARM platforms
because save_stack_trace_tsk() is not exported for those architectures.
Export save_stack_trace_tsk() for the ARM64 architecture to align with
x86 and support various debugging use cases such as arbitrary driver
thread watchdog timers.
Signed-off-by: Dustin Brown <dustinb@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Some Cavium Thunder CPUs suffer a problem where a KVM guest may
inadvertently cause the host kernel to quit receiving interrupts.
Use the Group-0/1 trapping in order to deal with it.
[maz]: Adapted patch to the Group-0/1 trapping, reworked commit log
Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
The function name is now renamed to 'timer_probe' for consistency with
the CLOCKSOURCE_OF_DECLARE => TIMER_OF_DECLARE change.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
When CONFIG_MODULES is disabled, we cannot dereference a module pointer:
arch/arm64/kernel/ftrace.c: In function 'ftrace_make_call':
arch/arm64/kernel/ftrace.c:107:36: error: dereferencing pointer to incomplete type 'struct module'
trampoline = (unsigned long *)mod->arch.ftrace_trampoline;
Also, the within_module() function is not defined:
arch/arm64/kernel/ftrace.c: In function 'ftrace_make_nop':
arch/arm64/kernel/ftrace.c:171:8: error: implicit declaration of function 'within_module'; did you mean 'init_module'? [-Werror=implicit-function-declaration]
This addresses both by adding replacing the IS_ENABLED(CONFIG_ARM64_MODULE_PLTS)
checks with #ifdef versions.
Fixes: e71a4e1beb ("arm64: ftrace: add support for far branches to dynamic ftrace")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently, dynamic ftrace support in the arm64 kernel assumes that all
core kernel code is within range of ordinary branch instructions that
occur in module code, which is usually the case, but is no longer
guaranteed now that we have support for module PLTs and address space
randomization.
Since on arm64, all patching of branch instructions involves function
calls to the same entry point [ftrace_caller()], we can emit the modules
with a trampoline that has unlimited range, and patch both the trampoline
itself and the branch instruction to redirect the call via the trampoline.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: minor clarification to smp_wmb() comment]
Signed-off-by: Will Deacon <will.deacon@arm.com>
When turning branch instructions into NOPs, we attempt to validate the
action by comparing the old value at the call site with the opcode of
a direct relative branch instruction pointing at the old target.
However, these call sites are statically initialized to call _mcount(),
and may be redirected via a PLT entry if the module is loaded far away
from the kernel text, leading to false negatives and spurious errors.
So skip the validation if CONFIG_ARM64_MODULE_PLTS is configured.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Wire up the existing arm64 support for SMBIOS tables (aka DMI) for ARM as
well, by moving the arm64 init code to drivers/firmware/efi/arm-runtime.c
(which is shared between ARM and arm64), and adding a asm/dmi.h header to
ARM that defines the mapping routines for the firmware tables.
This allows userspace to access these tables to discover system information
exposed by the firmware. It also sets the hardware name used in crash
dumps, e.g.:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = ed3c0000
[00000000] *pgd=bf1f3835
Internal error: Oops: 817 [#1] SMP THUMB2
Modules linked in:
CPU: 0 PID: 759 Comm: bash Not tainted 4.10.0-09601-g0e8f38792120-dirty #112
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
^^^
NOTE: This does *NOT* enable or encourage the use of DMI quirks, i.e., the
the practice of identifying the platform via DMI to decide whether
certain workarounds for buggy hardware and/or firmware need to be
enabled. This would require the DMI subsystem to be enabled much
earlier than we do on ARM, which is non-trivial.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20170602135207.21708-14-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 3fde2999fa ("arm64: cpufeature: Don't dump useless backtrace on
CPU_OUT_OF_SPEC") changed the cpufeature detection code to use add_taint
instead of WARN_TAINT_ONCE when detecting a heterogeneous system with
mismatched feature support. Unfortunately, this resulted in all systems
getting the taint, regardless of any feature mismatch.
This patch fixes the problem by conditionalising the taint on detecting
a feature mismatch.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now that some functions that deal with arch topology information live
under drivers, there is a clash of naming that might create confusion.
Tidy things up by creating a topology namespace for interfaces used by
arch code; achieve this by prepending a 'topology_' prefix to driver
interfaces.
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Create a new header file (include/linux/arch_topology.h) and put there
declarations of interfaces used by arm, arm64 and drivers code.
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reduce the scope of cap_parsing_failed (making it static in
drivers/base/arch_topology.c) by slightly changing {arm,arm64} DT
parsing code.
For arm checking for !cap_parsing_failed before calling normalize_
cpu_capacity() is superfluous, as returning an error from parse_
cpu_capacity() (above) means cap_from _dt is set to false.
For arm64 we can simply check if raw_capacity points to something,
which is not if capacity parsing has failed.
Suggested-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
arm and arm64 share lot of code relative to parsing CPU capacity
information from DT, using that information for appropriate scaling and
exposing a sysfs interface for chaging such values at runtime.
Factorize such code in a common place (driver/base/arch_topology.c) in
preparation for further additions.
Suggested-by: Will Deacon <will.deacon@arm.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Perf has supported ARMv8.1 feature with 16-bit evtCount filed [see c210ae8
arm64: perf: Extend event mask for ARMv8.1], event config should be
extended to 16-bit too, otherwise, if use -e event_name whose event_code
is more than 0x3ff, pmu_config_term will return -EINVAL because function
pmu_format_max_value depends on event config.
This patch extends event config to 16-bit.
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
PCI core requires the NUMA node for the struct pci_host_bridge.dev to
be set by using the pcibus_to_node(struct pci_bus*) API, that on ARM64
systems relies on the struct pci_host_bridge->bus.dev NUMA node.
The struct pci_host_bridge.dev NUMA node is then propagated through
the PCI device hierarchy as PCI devices (and bridges) are enumerated
under it.
Therefore, in order to set-up the PCI NUMA hierarchy appropriately, the
struct pci_host_bridge->bus.dev NUMA node must be set before core
code calls pcibus_to_node(struct pci_bus*) on it so that PCI core can
retrieve the NUMA node for the struct pci_host_bridge.dev device and can
propagate it through the PCI bus tree.
On ARM64 ACPI based systems the struct pci_host_bridge->bus.dev NUMA
node can be set-up in pcibios_root_bridge_prepare() by parsing the root
bridge ACPI device firmware binding.
Add code to the pcibios_root_bridge_prepare() that, when booting with
ACPI, parse the root bridge ACPI device companion NUMA binding and set
the corresponding struct pci_host_bridge->bus.dev NUMA node
appropriately.
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Tested-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
It's useless to print machine name and setup arch-specific system
identifiers if of_flat_dt_get_machine_name() return NULL, especially
when ACPI-based boot.
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Unfortunately, it turns out that mismatched CPU features in big.LITTLE
systems are starting to appear in the wild. Whilst we should continue to
taint the kernel with CPU_OUT_OF_SPEC for features that differ in ways
that we can't fix up, dumping a useless backtrace out of the cpufeature
code is pointless and irritating.
This patch removes the backtrace from the taint.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Generic code expects show_regs() to dump the stack, but arm64's
show_regs() does not. This makes it hard to debug softlockups and
other issues that result in show_regs() being called.
This patch updates arm64's show_regs() to dump the stack, as common
code expects.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
[will: folded in bug_handler fix from mrutland]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Generic code expects show_regs() to also dump the stack, but arm64's
show_reg() does not do this. Some arm64 callers of show_regs() *only*
want the registers dumped, without the stack.
To enable generic code to work as expected, we need to make
show_regs() dump the stack. Where we only want the registers dumped,
we must use __show_regs().
This patch updates code to use __show_regs() where only registers are
desired. A subsequent patch will modify show_regs().
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The text patching functions which are invoked from jump_label and kprobes
code are protected against cpu hotplug at the call sites.
Use stop_machine_cpuslocked() to avoid recursion on the cpu hotplug
rwsem. stop_machine_cpuslocked() contains a lockdep assertion to catch any
unprotected callers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20170524081549.197070135@linutronix.de
Commit 093d24a204 ("arm64: PCI: Manage controller-specific data on
per-controller basis") added code to allocate ACPI PCI root_ops
dynamically on a per host bridge basis but failed to update the
corresponding memory allocation failure path in pci_acpi_scan_root()
leading to a potential memory leakage.
Fix it by adding the required kfree call.
Fixes: 093d24a204 ("arm64: PCI: Manage controller-specific data on per-controller basis")
Reviewed-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Timmy Li <lixiaoping3@huawei.com>
[lorenzo.pieralisi@arm.com: refactored code, rewrote commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.
Adjust the system_state check in smp_send_stop() to handle the extra states.
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20170516184735.112589728@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently, cpus_set_cap() calls static_branch_enable_cpuslocked(), which
must take the jump_label mutex.
We call cpus_set_cap() in the secondary bringup path, from the idle
thread where interrupts are disabled. Taking a mutex in this path "is a
NONO" regardless of whether it's contended, and something we must avoid.
We didn't spot this until recently, as ___might_sleep() won't warn for
this case until all CPUs have been brought up.
This patch avoids taking the mutex in the secondary bringup path. The
poking of static keys is deferred until enable_cpu_capabilities(), which
runs in a suitable context on the boot CPU. To account for the static
keys being set later, cpus_have_const_cap() is updated to use another
static key to check whether the const cap keys have been initialised,
falling back to the caps bitmap until this is the case.
This means that users of cpus_have_const_cap() gain should only gain a
single additional NOP in the fast path once the const caps are
initialised, but should always see the current cap value.
The hyp code should never dereference the caps array, since the caps are
initialized before we run the module initcall to initialise hyp. A check
is added to the hyp init code to document this requirement.
This change will sidestep a number of issues when the upcoming hotplug
locking rework is merged.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyniger <marc.zyngier@arm.com>
Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Sewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
commit d98ecdaca2 ("arm64: perf: Count EL2 events if the kernel is
running in HYP") returns -EINVAL when perf system call perf_event_open is
called with exclude_hv != exclude_kernel. This change breaks applications
on VHE enabled ARMv8.1 platforms. The issue was observed with HHVM
application, which calls perf_event_open with exclude_hv = 1 and
exclude_kernel = 0.
There is no separate hypervisor privilege level when VHE is enabled, the
host kernel runs at EL2. So when VHE is enabled, we should ignore
exclude_hv from the application. This behaviour is consistent with PowerPC
where the exclude_hv is ignored when the hypervisor is not present and with
x86 where this flag is ignored.
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
[will: added comment to justify the behaviour of exclude_hv]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
- Silence module allocation failures when CONFIG_ARM*_MODULE_PLTS is
enabled. This requires a check for __GFP_NOWARN in alloc_vmap_area()
- Improve/sanitise user tagged pointers handling in the kernel
- Inline asm fixes/cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZFJszAAoJEGvWsS0AyF7xASwQAKsY72jJMu+FbLqzn9vS7Frx
AGlx+M20odn6htFBBEDhaJQxFTFSfuBUNb6z4WmRsVVcVZ722EHsvEFFkHU4naR1
lAdZ1iFNHBRwGxV/JwCt08JwG0ipuqvcuNQH7XaYeuqldQLWaVTf4cangH4cZGX4
Fcl54DI7Nfy6QYBnfkBSzi6Pqjhkdn6vh1JlNvkX40BwkT6Zt9WryXzvCwQha9A0
EsstRhBECK6yCSaBcp7MbwyRbpB56PyOxUaeRUNoPaag+bSa8xs65JFq/yvolmpa
Cm1Bt/hlVHvi3rgMIYnm+z1C4IVgLA1ouEKYAGdq4IpWA46BsPxwOBmmYG/0qLqH
b7F5my5W8bFm9w1LI9I9l4FwoM1BU7b+n8KOZDZGpgfTwy86jIODhb42e7E4vEtn
yHCwwu688zkxoI+JTt7PvY3Oue69zkP1/kXUWt5SILKH5LFyweZvdGc+VCSeQoGo
fjwlnxI0l12vYIt2RnZWGJcA+W/T1E4cPJtIvvid9U9uuXs3Vv/EQ3F5wgaXoPN2
UDyJTxwrv/iT2yMoZmaaVh36+6UDUPV+b2alA9Wq/3996axGlzeI3go+cdhQXj+E
8JFzWph+kIZqCnGUaWMt/FTphFhOHjMxC36WEgxVRQZigXrajdrKAgvCj+7n2Qtm
X0wL+XDgsWA8yPgt4WLK
=WZ6G
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull more arm64 updates from Catalin Marinas:
- Silence module allocation failures when CONFIG_ARM*_MODULE_PLTS is
enabled. This requires a check for __GFP_NOWARN in alloc_vmap_area()
- Improve/sanitise user tagged pointers handling in the kernel
- Inline asm fixes/cleanups
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Silence first allocation with CONFIG_ARM64_MODULE_PLTS=y
ARM: Silence first allocation with CONFIG_ARM_MODULE_PLTS=y
mm: Silence vmap() allocation failures based on caller gfp_flags
arm64: uaccess: suppress spurious clang warning
arm64: atomic_lse: match asm register sizes
arm64: armv8_deprecated: ensure extension of addr
arm64: uaccess: ensure extension of access_ok() addr
arm64: ensure extension of smp_store_release value
arm64: xchg: hazard against entire exchange variable
arm64: documentation: document tagged pointer stack constraints
arm64: entry: improve data abort handling of tagged pointers
arm64: hw_breakpoint: fix watchpoint matching for tagged pointers
arm64: traps: fix userspace cache maintenance emulation on a tagged pointer
When CONFIG_ARM64_MODULE_PLTS is enabled, the first allocation using the
module space fails, because the module is too big, and then the module
allocation is attempted from vmalloc space. Silence the first allocation
failure in that case by setting __GFP_NOWARN.
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Our compat swp emulation holds the compat user address in an unsigned
int, which it passes to __user_swpX_asm(). When a 32-bit value is passed
in a register, the upper 32 bits of the register are unknown, and we
must extend the value to 64 bits before we can use it as a base address.
This patch casts the address to unsigned long to ensure it has been
suitably extended, avoiding the potential issue, and silencing a related
warning from clang.
Fixes: bd35a4adc4 ("arm64: Port SWP/SWPB emulation support from arm")
Cc: <stable@vger.kernel.org> # 3.19.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When handling a data abort from EL0, we currently zero the top byte of
the faulting address, as we assume the address is a TTBR0 address, which
may contain a non-zero address tag. However, the address may be a TTBR1
address, in which case we should not zero the top byte. This patch fixes
that. The effect is that the full TTBR1 address is passed to the task's
signal handler (or printed out in the kernel log).
When handling a data abort from EL1, we leave the faulting address
intact, as we assume it's either a TTBR1 address or a TTBR0 address with
tag 0x00. This is true as far as I'm aware, we don't seem to access a
tagged TTBR0 address anywhere in the kernel. Regardless, it's easy to
forget about address tags, and code added in the future may not always
remember to remove tags from addresses before accessing them. So add tag
handling to the EL1 data abort handler as well. This also makes it
consistent with the EL0 data abort handler.
Fixes: d50240a5f6 ("arm64: mm: permit use of tagged pointers at EL0")
Cc: <stable@vger.kernel.org> # 3.12.x-
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When we take a watchpoint exception, the address that triggered the
watchpoint is found in FAR_EL1. We compare it to the address of each
configured watchpoint to see which one was hit.
The configured watchpoint addresses are untagged, while the address in
FAR_EL1 will have an address tag if the data access was done using a
tagged address. The tag needs to be removed to compare the address to
the watchpoints.
Currently we don't remove it, and as a result can report the wrong
watchpoint as being hit (specifically, always either the highest TTBR0
watchpoint or lowest TTBR1 watchpoint). This patch removes the tag.
Fixes: d50240a5f6 ("arm64: mm: permit use of tagged pointers at EL0")
Cc: <stable@vger.kernel.org> # 3.12.x-
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When we emulate userspace cache maintenance in the kernel, we can
currently send the task a SIGSEGV even though the maintenance was done
on a valid address. This happens if the address has a non-zero address
tag, and happens to not be mapped in.
When we get the address from a user register, we don't currently remove
the address tag before performing cache maintenance on it. If the
maintenance faults, we end up in either __do_page_fault, where find_vma
can't find the VMA if the address has a tag, or in do_translation_fault,
where the tagged address will appear to be above TASK_SIZE. In both
cases, the address is not mapped in, and the task is sent a SIGSEGV.
This patch removes the tag from the address before using it. With this
patch, the fault is handled correctly, the address gets mapped in, and
the cache maintenance succeeds.
As a second bug, if cache maintenance (correctly) fails on an invalid
tagged address, the address gets passed into arm64_notify_segfault,
where find_vma fails to find the VMA due to the tag, and the wrong
si_code may be sent as part of the siginfo_t of the segfault. With this
patch, the correct si_code is sent.
Fixes: 7dd01aef05 ("arm64: trap userspace "dc cvau" cache operation on errata-affected core")
Cc: <stable@vger.kernel.org> # 4.8.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
support; virtual interrupt controller performance improvements; support
for userspace virtual interrupt controller (slower, but necessary for
KVM on the weird Broadcom SoCs used by the Raspberry Pi 3)
* MIPS: basic support for hardware virtualization (ImgTec
P5600/P6600/I6400 and Cavium Octeon III)
* PPC: in-kernel acceleration for VFIO
* s390: support for guests without storage keys; adapter interruption
suppression
* x86: usual range of nVMX improvements, notably nested EPT support for
accessed and dirty bits; emulation of CPL3 CPUID faulting
* generic: first part of VCPU thread request API; kvm_stat improvements
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJZEHUkAAoJEL/70l94x66DBeYH/09wrpJ2FjU4Rqv7FxmqgWfH
9WGi4wvn/Z+XzQSyfMJiu2SfZVzU69/Y67OMHudy7vBT6knB+ziM7Ntoiu/hUfbG
0g5KsDX79FW15HuvuuGh9kSjUsj7qsQdyPZwP4FW/6ZoDArV9mibSvdjSmiUSMV/
2wxaoLzjoShdOuCe9EABaPhKK0XCrOYkygT6Paz1pItDxaSn8iW3ulaCuWMprUfG
Niq+dFemK464E4yn6HVD88xg5j2eUM6bfuXB3qR3eTR76mHLgtwejBzZdDjLG9fk
32PNYKhJNomBxHVqtksJ9/7cSR6iNPs7neQ1XHemKWTuYqwYQMlPj1NDy0aslQU=
=IsiZ
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"ARM:
- HYP mode stub supports kexec/kdump on 32-bit
- improved PMU support
- virtual interrupt controller performance improvements
- support for userspace virtual interrupt controller (slower, but
necessary for KVM on the weird Broadcom SoCs used by the Raspberry
Pi 3)
MIPS:
- basic support for hardware virtualization (ImgTec P5600/P6600/I6400
and Cavium Octeon III)
PPC:
- in-kernel acceleration for VFIO
s390:
- support for guests without storage keys
- adapter interruption suppression
x86:
- usual range of nVMX improvements, notably nested EPT support for
accessed and dirty bits
- emulation of CPL3 CPUID faulting
generic:
- first part of VCPU thread request API
- kvm_stat improvements"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
kvm: nVMX: Don't validate disabled secondary controls
KVM: put back #ifndef CONFIG_S390 around kvm_vcpu_kick
Revert "KVM: Support vCPU-based gfn->hva cache"
tools/kvm: fix top level makefile
KVM: x86: don't hold kvm->lock in KVM_SET_GSI_ROUTING
KVM: Documentation: remove VM mmap documentation
kvm: nVMX: Remove superfluous VMX instruction fault checks
KVM: x86: fix emulation of RSM and IRET instructions
KVM: mark requests that need synchronization
KVM: return if kvm_vcpu_wake_up() did wake up the VCPU
KVM: add explicit barrier to kvm_vcpu_kick
KVM: perform a wake_up in kvm_make_all_cpus_request
KVM: mark requests that do not need a wakeup
KVM: remove #ifndef CONFIG_S390 around kvm_vcpu_wake_up
KVM: x86: always use kvm_make_request instead of set_bit
KVM: add kvm_{test,clear}_request to replace {test,clear}_bit
s390: kvm: Cpu model support for msa6, msa7 and msa8
KVM: x86: remove irq disablement around KVM_SET_CLOCK/KVM_GET_CLOCK
kvm: better MWAIT emulation for guests
KVM: x86: virtualize cpuid faulting
...
- kdump support, including two necessary memblock additions:
memblock_clear_nomap() and memblock_cap_memory_range()
- ARMv8.3 HWCAP bits for JavaScript conversion instructions, complex
numbers and weaker release consistency
- arm64 ACPI platform MSI support
- arm perf updates: ACPI PMU support, L3 cache PMU in some Qualcomm
SoCs, Cortex-A53 L2 cache events and DTLB refills, MAINTAINERS update
for DT perf bindings
- architected timer errata framework (the arch/arm64 changes only)
- support for DMA_ATTR_FORCE_CONTIGUOUS in the arm64 iommu DMA API
- arm64 KVM refactoring to use common system register definitions
- remove support for ASID-tagged VIVT I-cache (no ARMv8 implementation
using it and deprecated in the architecture) together with some
I-cache handling clean-up
- PE/COFF EFI header clean-up/hardening
- define BUG() instruction without CONFIG_BUG
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZDKMoAAoJEGvWsS0AyF7xR+YP/0EMEz5MDfCv0PVYj7/AIa0G
Zphl7OhysIkeDAz7urXw9Jdl0NfORNIqmD1vZNVSc321IyNp56Od+kWd82lBrOWB
ad3nNT67pEmu0pAW7CO48ju3rTesEnEl3ra45E1tULeLihmv93jc4ZlfXgumlKq3
/GE84XJ5ZFmluuhq1zgNefeUtyl1tbxTxHJ74+INF7dTd/5sJcphpqS4Dzpb+msT
20WYliccQCBF9zBFUYHc2KjcXXKRQGxLulGS3MuoN2DLkD+U9YyR/OmA7SoXh2J2
WXC5b0x856xTQJFCJ39pb7rw5xHjt3l5zfU3VLSvqEVL/+asBqCcgGNtNUgOW1Es
dEHC6bc66Ley6mn7bbpFE3MK8D+K5q8HwMF6G5KDtIVB6DB/iQ6kzi5aXKoupxtb
1EuU4OW6cDhmOFQYjgIDofLgqbmVvJofdF6+NfxasfZmWrMgHzv0rYvaCDnAV/Tr
t7bhH7hf9/KcP/wpk86O2AMKKpgoNTqe1Qy8cWVFFLnut567Pb6zs/L3ZXfleoLv
t613yM8Zj2fE05ja8ylMDjaasidNpXGttb08/4kAn06Daaoueqla0jmduAhy4aaV
dQ3OFP9lJ5MFaFnMMTPfU3vtvNLMHuo9MZsYCrv5zCaNNs3lpAPUiPNh588ZscKa
sWx4PEiaCi+wcOsLsJvh
=SDkm
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- kdump support, including two necessary memblock additions:
memblock_clear_nomap() and memblock_cap_memory_range()
- ARMv8.3 HWCAP bits for JavaScript conversion instructions, complex
numbers and weaker release consistency
- arm64 ACPI platform MSI support
- arm perf updates: ACPI PMU support, L3 cache PMU in some Qualcomm
SoCs, Cortex-A53 L2 cache events and DTLB refills, MAINTAINERS update
for DT perf bindings
- architected timer errata framework (the arch/arm64 changes only)
- support for DMA_ATTR_FORCE_CONTIGUOUS in the arm64 iommu DMA API
- arm64 KVM refactoring to use common system register definitions
- remove support for ASID-tagged VIVT I-cache (no ARMv8 implementation
using it and deprecated in the architecture) together with some
I-cache handling clean-up
- PE/COFF EFI header clean-up/hardening
- define BUG() instruction without CONFIG_BUG
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (92 commits)
arm64: Fix the DMA mmap and get_sgtable API with DMA_ATTR_FORCE_CONTIGUOUS
arm64: Print DT machine model in setup_machine_fdt()
arm64: pmu: Wire-up Cortex A53 L2 cache events and DTLB refills
arm64: module: split core and init PLT sections
arm64: pmuv3: handle pmuv3+
arm64: Add CNTFRQ_EL0 trap handler
arm64: Silence spurious kbuild warning on menuconfig
arm64: pmuv3: use arm_pmu ACPI framework
arm64: pmuv3: handle !PMUv3 when probing
drivers/perf: arm_pmu: add ACPI framework
arm64: add function to get a cpu's MADT GICC table
drivers/perf: arm_pmu: split out platform device probe logic
drivers/perf: arm_pmu: move irq request/free into probe
drivers/perf: arm_pmu: split cpu-local irq request/free
drivers/perf: arm_pmu: rename irq request/free functions
drivers/perf: arm_pmu: handle no platform_device
drivers/perf: arm_pmu: simplify cpu_pmu_request_irqs()
drivers/perf: arm_pmu: factor out pmu registration
drivers/perf: arm_pmu: fold init into alloc
drivers/perf: arm_pmu: define armpmu_init_fn
...
Pull networking updates from David Millar:
"Here are some highlights from the 2065 networking commits that
happened this development cycle:
1) XDP support for IXGBE (John Fastabend) and thunderx (Sunil Kowuri)
2) Add a generic XDP driver, so that anyone can test XDP even if they
lack a networking device whose driver has explicit XDP support
(me).
3) Sparc64 now has an eBPF JIT too (me)
4) Add a BPF program testing framework via BPF_PROG_TEST_RUN (Alexei
Starovoitov)
5) Make netfitler network namespace teardown less expensive (Florian
Westphal)
6) Add symmetric hashing support to nft_hash (Laura Garcia Liebana)
7) Implement NAPI and GRO in netvsc driver (Stephen Hemminger)
8) Support TC flower offload statistics in mlxsw (Arkadi Sharshevsky)
9) Multiqueue support in stmmac driver (Joao Pinto)
10) Remove TCP timewait recycling, it never really could possibly work
well in the real world and timestamp randomization really zaps any
hint of usability this feature had (Soheil Hassas Yeganeh)
11) Support level3 vs level4 ECMP route hashing in ipv4 (Nikolay
Aleksandrov)
12) Add socket busy poll support to epoll (Sridhar Samudrala)
13) Netlink extended ACK support (Johannes Berg, Pablo Neira Ayuso,
and several others)
14) IPSEC hw offload infrastructure (Steffen Klassert)"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2065 commits)
tipc: refactor function tipc_sk_recv_stream()
tipc: refactor function tipc_sk_recvmsg()
net: thunderx: Optimize page recycling for XDP
net: thunderx: Support for XDP header adjustment
net: thunderx: Add support for XDP_TX
net: thunderx: Add support for XDP_DROP
net: thunderx: Add basic XDP support
net: thunderx: Cleanup receive buffer allocation
net: thunderx: Optimize CQE_TX handling
net: thunderx: Optimize RBDR descriptor handling
net: thunderx: Support for page recycling
ipx: call ipxitf_put() in ioctl error path
net: sched: add helpers to handle extended actions
qed*: Fix issues in the ptp filter config implementation.
qede: Fix concurrency issue in PTP Tx path processing.
stmmac: Add support for SIMATIC IOT2000 platform
net: hns: fix ethtool_get_strings overflow in hns driver
tcp: fix wraparound issue in tcp_lp
bpf, arm64: fix jit branch offset related to ldimm64
bpf, arm64: implement jiting of BPF_XADD
...
Pull EFI updates from Ingo Molnar:
"The main changes in this cycle were:
- move BGRT handling to drivers/acpi so it can be shared between x86
and ARM
- bring the EFI stub's initrd and FDT allocation logic in line with
the latest changes to the arm64 boot protocol
- improvements and fixes to the EFI stub's command line parsing
routines
- randomize the virtual mapping of the UEFI runtime services on
ARM/arm64
- ... and other misc enhancements, cleanups and fixes"
* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/libstub/arm: Don't use TASK_SIZE when randomizing the RT space
ef/libstub/arm/arm64: Randomize the base of the UEFI rt services region
efi/libstub/arm/arm64: Disable debug prints on 'quiet' cmdline arg
efi/libstub: Unify command line param parsing
efi/libstub: Fix harmless command line parsing bug
efi/arm32-stub: Allow boot-time allocations in the vmlinux region
x86/efi: Clean up a minor mistake in comment
efi/pstore: Return error code (if any) from efi_pstore_write()
efi/bgrt: Enable ACPI BGRT handling on arm64
x86/efi/bgrt: Move efi-bgrt handling out of arch/x86
efi/arm-stub: Round up FDT allocation to mapping size
efi/arm-stub: Correct FDT and initrd allocation rules for arm64
Pull timer updates from Thomas Gleixner:
"The timer departement delivers:
- more year 2038 rework
- a massive rework of the arm achitected timer
- preparatory patches to allow NTP correction of clock event devices
to avoid early expiry
- the usual pile of fixes and enhancements all over the place"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (91 commits)
timer/sysclt: Restrict timer migration sysctl values to 0 and 1
arm64/arch_timer: Mark errata handlers as __maybe_unused
Clocksource/mips-gic: Remove redundant non devicetree init
MIPS/Malta: Probe gic-timer via devicetree
clocksource: Use GENMASK_ULL in definition of CLOCKSOURCE_MASK
acpi/arm64: Add SBSA Generic Watchdog support in GTDT driver
clocksource: arm_arch_timer: add GTDT support for memory-mapped timer
acpi/arm64: Add memory-mapped timer support in GTDT driver
clocksource: arm_arch_timer: simplify ACPI support code.
acpi/arm64: Add GTDT table parse driver
clocksource: arm_arch_timer: split MMIO timer probing.
clocksource: arm_arch_timer: add structs to describe MMIO timer
clocksource: arm_arch_timer: move arch_timer_needs_of_probing into DT init call
clocksource: arm_arch_timer: refactor arch_timer_needs_probing
clocksource: arm_arch_timer: split dt-only rate handling
x86/uv/time: Set ->min_delta_ticks and ->max_delta_ticks
unicore32/time: Set ->min_delta_ticks and ->max_delta_ticks
um/time: Set ->min_delta_ticks and ->max_delta_ticks
tile/time: Set ->min_delta_ticks and ->max_delta_ticks
score/time: Set ->min_delta_ticks and ->max_delta_ticks
...
Pull uaccess unification updates from Al Viro:
"This is the uaccess unification pile. It's _not_ the end of uaccess
work, but the next batch of that will go into the next cycle. This one
mostly takes copy_from_user() and friends out of arch/* and gets the
zero-padding behaviour in sync for all architectures.
Dealing with the nocache/writethrough mess is for the next cycle;
fortunately, that's x86-only. Same for cleanups in iov_iter.c (I am
sold on access_ok() in there, BTW; just not in this pile), same for
reducing __copy_... callsites, strn*... stuff, etc. - there will be a
pile about as large as this one in the next merge window.
This one sat in -next for weeks. -3KLoC"
* 'work.uaccess' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (96 commits)
HAVE_ARCH_HARDENED_USERCOPY is unconditional now
CONFIG_ARCH_HAS_RAW_COPY_USER is unconditional now
m32r: switch to RAW_COPY_USER
hexagon: switch to RAW_COPY_USER
microblaze: switch to RAW_COPY_USER
get rid of padding, switch to RAW_COPY_USER
ia64: get rid of copy_in_user()
ia64: sanitize __access_ok()
ia64: get rid of 'segment' argument of __do_{get,put}_user()
ia64: get rid of 'segment' argument of __{get,put}_user_check()
ia64: add extable.h
powerpc: get rid of zeroing, switch to RAW_COPY_USER
esas2r: don't open-code memdup_user()
alpha: fix stack smashing in old_adjtimex(2)
don't open-code kernel_setsockopt()
mips: switch to RAW_COPY_USER
mips: get rid of tail-zeroing in primitives
mips: make copy_from_user() zero tail explicitly
mips: clean and reorder the forest of macros...
mips: consolidate __invoke_... wrappers
...