This was entirely automated, using the script by Al:
PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)
to do the replacement at the end of the merge window.
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull exec domain removal from Richard Weinberger:
"This series removes execution domain support from Linux.
The idea behind exec domains was to support different ABIs. The
feature was never complete nor stable. Let's rip it out and make the
kernel signal handling code less complicated"
* 'exec_domain_rip_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc: (27 commits)
arm64: Removed unused variable
sparc: Fix execution domain removal
Remove rest of exec domains.
arch: Remove exec_domain from remaining archs
arc: Remove signal translation and exec_domain
xtensa: Remove signal translation and exec_domain
xtensa: Autogenerate offsets in struct thread_info
x86: Remove signal translation and exec_domain
unicore32: Remove signal translation and exec_domain
um: Remove signal translation and exec_domain
tile: Remove signal translation and exec_domain
sparc: Remove signal translation and exec_domain
sh: Remove signal translation and exec_domain
s390: Remove signal translation and exec_domain
mn10300: Remove signal translation and exec_domain
microblaze: Remove signal translation and exec_domain
m68k: Remove signal translation and exec_domain
m32r: Remove signal translation and exec_domain
m32r: Autogenerate offsets in struct thread_info
frv: Remove signal translation and exec_domain
...
As execution domain support is gone we can remove
signal translation from the signal code and remove
exec_domain from thread_info.
Signed-off-by: Richard Weinberger <richard@nod.at>
If an attacker can cause a controlled kernel stack overflow, overwriting
the restart block is a very juicy exploit target. This is because the
restart_block is held in the same memory allocation as the kernel stack.
Moving the restart block to struct task_struct prevents this exploit by
making the restart_block harder to locate.
Note that there are other fields in thread_info that are also easy
targets, at least on some architectures.
It's also a decent simplification, since the restart code is more or less
identical on all architectures.
[james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Miller <davem@davemloft.net>
Acked-by: Richard Weinberger <richard@nod.at>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When invoking syscall handlers on sh32, the saved userspace registers
are at the top of the stack. This seems to have been intentional, as it
is an easy way to pass r0, r1, ... to the handler as parameters 5, 6,
...
It causes problems, however, because the compiler is allowed to generate
code for a function which clobbers that function's own parameters. For
example, gcc generates the following code for clone:
<SyS_clone>:
mov.l 8c020714 <SyS_clone+0xc>,r1 ! 8c020540 <do_fork>
mov.l r7,@r15
mov r6,r7
jmp @r1
mov #0,r6
nop
.word 0x0540
.word 0x8c02
The `mov.l r7,@r15` clobbers the saved value of r0 passed from
userspace. For most system calls, this might not be a problem, because
we'll be overwriting r0 with the return value anyway. But in the case
of clone, copy_thread will need the original value of r0 if the
CLONE_SETTLS flag was specified.
The first patch in this series fixes this issue for system calls by
pushing to the stack and extra copy of r0-r2 before invoking the
handler. We discard this copy before restoring the userspace registers,
so it is not a problem if they are clobbered.
Exception handlers also receive the userspace register values in a
similar manner, and may hit the same problem. The second patch removes
the do_fpu_error handler, which looks susceptible to this problem and
which, as far as I can tell, has not been used in some time. The third
patch addresses other exception handlers.
This patch (of 3):
The userspace registers are stored at the top of the stack when the
syscall handler is invoked, which allows r0-r2 to act as parameters 5-7.
Parameters passed on the stack may be clobbered by the syscall handler.
The solution is to push an extra copy of the registers which might be
used as syscall parameters to the stack, so that the authoritative set
of saved register values does not get clobbered.
A few system call handlers are also updated to get the userspace
registers using current_pt_regs() instead of from the stack.
Signed-off-by: Bobby Bingham <koorogi@koorogi.info>
Cc: Paul Mundt <paul.mundt@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This include is no longer needed.
(seems to be a leftover from try_to_freeze())
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Does block_sigmask() + tracehook_signal_handler(); called when
sigframe has been successfully built. All architectures converted
to it; block_sigmask() itself is gone now (merged into this one).
I'm still not too happy with the signature, but that's a separate
story (IMO we need a structure that would contain signal number +
siginfo + k_sigaction, so that get_signal_to_deliver() would fill one,
signal_delivered(), handle_signal() and probably setup...frame() -
take one).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Only 3 out of 63 do not. Renamed the current variant to __set_current_blocked(),
added set_current_blocked() that will exclude unblockable signals, switched
open-coded instances to it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
replace boilerplate "should we use ->saved_sigmask or ->blocked?"
with calls of obvious inlined helper...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
first fruits of ..._restore_sigmask() helpers: now we can take
boilerplate "signal didn't have a handler, clear RESTORE_SIGMASK
and restore the blocked mask from ->saved_mask" into a common
helper. Open-coded instances switched...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Complete the move of sh64 to it, trim the crap from prototypes,
tidy up a bit. Infrastructure in do_signal() had already been
there, in signal_64 as well as in signal_32 (where it was already
used).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
guts of saved_sigmask-based sigsuspend/rt_sigsuspend. Takes
kernel sigset_t *.
Open-coded instances replaced with calling it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.15 (GNU/Linux)
iEYEABECAAYFAk91TL0ACgkQGkmNcg7/o7hEjwCgmuz6QQKkow7e5q0x7DR5Z2NH
1YoAn3TpODDmpaBiou26uMRPhcR6e1qC
=JCA0
-----END PGP SIGNATURE-----
Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh
Pull SuperH updates from Paul Mundt.
* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh: (25 commits)
sh: Support I/O space swapping where needed.
sh: use set_current_blocked() and block_sigmask()
sh: no need to reset handler if SA_ONESHOT
sh: intc: Fix up section mismatch for intc_ack_data
sh: select ARCH_DISCARD_MEMBLOCK.
sh: Consolidate duplicate _32/_64 unistd definitions.
sh: ecovec: switch SDHI controllers to card polling
sh: Avoid exporting unimplemented syscalls.
sh: add platform_device for RSPI in setup-sh7757
SH: pci-sh7780: enable big-endian operation.
serial: sh-sci: fix a race of DMA submit_tx on transfer
sh: dma: Collect up CHCR of SH7763, SH7764, SH7780 and SH7785
sh: dma: Collect up CHCR of SH7723 and SH7730
sh/next: Fix build fail by asm/system.h in asm/bitops.h
arch/sh/drivers/dma/{dma-g2,dmabrg}.c: ensure arguments to request_irq and free_irq are compatible
sh: cpufreq: Wire up scaling_available_freqs support.
sh: cpufreq: notify about rate rounding fallback.
sh: cpufreq: Support CPU clock frequency table.
sh: cpufreq: struct device lookup from CPU topology.
sh: cpufreq: percpu struct clk accounting.
...
As described in e6fa16ab ("signal: sigprocmask() should do
retarget_shared_pending()") the modification of current->blocked is
incorrect as we need to check whether the signal we're about to block is
pending in the shared queue.
Also, use the new helper function introduced in commit 5e6292c0f2
("signal: add block_sigmask() for adding sigmask to current->blocked")
which centralises the code for updating current->blocked after
successfully delivering a signal and reduces the amount of duplicate code
across architectures. In the past some architectures got this code wrong,
so using this helper function should stop that from happening again.
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
get_signal_to_deliver() already resets the signal handler if SA_ONESHOT is
set in ka->sa.sa_flags, there's no need to do it again in handle_signal().
Furthermore, because we were modifying ka->sa.sa_handler (which is a copy
of sighand->action[]) instead of sighand->action[] the original code had
no effect on signal delivery.
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The old ctrl in/out routines are non-portable and unsuitable for
cross-platform use. While drivers/sh has already been sanitized, there
is still quite a lot of code that is not. This converts the arch/sh/ bits
over, which permits us to flag the routines as deprecated whilst still
building with -Werror for the architecture code, and to ensure that
future users are not added.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This follows the x86 xstate changes and implements a task_xstate slab
cache that is dynamically sized to match one of hard FP/soft FP/FPU-less.
This also tidies up and consolidates some of the SH-2A/SH-4 FPU
fragmentation. Now fpu state restorers are commonly defined, with the
init_fpu()/fpu_init() mess reworked to follow the x86 convention.
The fpu_init() register initialization has been replaced by xstate setup
followed by writing out to hardware via the standard restore path.
As init_fpu() now performs a slab allocation a secondary lighterweight
restorer is also introduced for the context switch.
In the future the DSP state will be rolled in here, too.
More work remains for math emulation and the SH-5 FPU, which presently
uses its own special (UP-only) interfaces.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Replace TIF_RESTORE_SIGMASK with TS_RESTORE_SIGMASK and define our own
set_restore_sigmask() function. This saves the costly SMP-safe set_bit
operation, which we do not need for the sigmask flag since TIF_SIGPENDING
always has to be set too.
Based on the x86 and powerpc change.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This only needs to flush the return code via the legacy path, and just
invalidates uselessly otherwise. This makes the behaviour consistent for
all of the trampoline setup paths.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
We do not want to use smp_processor_id() from these paths, as they trip
preempt BUGs. Switch the test over to the boot cpu directly.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add a keyctl to install a process's session keyring onto its parent. This
replaces the parent's session keyring. Because the COW credential code does
not permit one process to change another process's credentials directly, the
change is deferred until userspace next starts executing again. Normally this
will be after a wait*() syscall.
To support this, three new security hooks have been provided:
cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in
the blank security creds and key_session_to_parent() - which asks the LSM if
the process may replace its parent's session keyring.
The replacement may only happen if the process has the same ownership details
as its parent, and the process has LINK permission on the session keyring, and
the session keyring is owned by the process, and the LSM permits it.
Note that this requires alteration to each architecture's notify_resume path.
This has been done for all arches barring blackfin, m68k* and xtensa, all of
which need assembly alteration to support TIF_NOTIFY_RESUME. This allows the
replacement to be performed at the point the parent process resumes userspace
execution.
This allows the userspace AFS pioctl emulation to fully emulate newpag() and
the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to
alter the parent process's PAG membership. However, since kAFS doesn't use
PAGs per se, but rather dumps the keys into the session keyring, the session
keyring of the parent must be replaced if, for example, VIOCSETTOK is passed
the newpag flag.
This can be tested with the following program:
#include <stdio.h>
#include <stdlib.h>
#include <keyutils.h>
#define KEYCTL_SESSION_TO_PARENT 18
#define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0)
int main(int argc, char **argv)
{
key_serial_t keyring, key;
long ret;
keyring = keyctl_join_session_keyring(argv[1]);
OSERROR(keyring, "keyctl_join_session_keyring");
key = add_key("user", "a", "b", 1, keyring);
OSERROR(key, "add_key");
ret = keyctl(KEYCTL_SESSION_TO_PARENT);
OSERROR(ret, "KEYCTL_SESSION_TO_PARENT");
return 0;
}
Compiled and linked with -lkeyutils, you should see something like:
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: _ses
355907932 --alswrv 4043 -1 \_ keyring: _uid.4043
[dhowells@andromeda ~]$ /tmp/newpag
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: _ses
1055658746 --alswrv 4043 4043 \_ user: a
[dhowells@andromeda ~]$ /tmp/newpag hello
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: hello
340417692 --alswrv 4043 4043 \_ user: a
Where the test program creates a new session keyring, sticks a user key named
'a' into it and then installs it on its parent.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
GCC does not issue unwind information for function epilogues.
Unfortunately we can catch a signal during an epilogue. The signal
handler writes the current context and signal return code onto the stack
overwriting previous contents. During unwinding, libgcc can try to
restore registers from the stack and restores corrupted ones. This can
lead to segmentation, misaligned access and sigbus faults.
For example, consider the following code:
mov.l r12,@-r15
mov.l r14,@-r15
sts.l pr,@-r15
mov r15,r14
<do stuff>
mov r14, r15
lds.l @r15+, pr
<<< SIGNAL HERE
mov.l @r15+, r14
mov.l @r15+, r12
rts
Unwind is aware that pr was pushed to stack in prolog, so tries to
restore it. Unfortunately it restores the last word of the signal
handler code placed on the stack by the kernel.
This patch tries to avoid the problem by adding a guard region on the
stack between where the function pushes data and where the signal handler
pushes its return code. We probably don't see this problem often because
exception handling unwinding in an epilogue only occurs due to a pthread
cancel signal. Also the kernel signal stack handler alignment of 8 bytes
could hide the occurance of this problem sometimes as the stack may not
be trampled at a particular required word.
This is not guaranteed to always work. It relies on a frame pointer
existing for the function (so it can get the correct sp value) which is
not always the case for the SH4.
Modifications will also be made to libgcc for the case where there is no
fp.
Signed-off-by: Carl Shaw <carl.shaw@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
GCC 4.5.0 complains about the declaration of variables
__kernel_sigreturn and __kernel_rt_sigreturn because they have type
void. Correctly declare these symbols as functions to fix the
following error,
arch/sh/kernel/signal_32.c: In function 'setup_frame':
arch/sh/kernel/signal_32.c:368:14: error: taking address of expression of type 'void'
arch/sh/kernel/signal_32.c: In function 'setup_rt_frame':
arch/sh/kernel/signal_32.c:452:14: error: taking address of expression of type 'void'
make[1]: *** [arch/sh/kernel/signal_32.o] Error 1
make: *** [arch/sh/kernel] Error 2
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The T-bit manipulation for syscall error checking had the side effect of
spuriously returning ERESTART* errno values over EINTR. So, we simplify
the error checking a bit and leave the T-bit alone.
Reported-by: Kaz Kojima <kkojima@rr.iij4u.or.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This follows the changes in commits:
7d6d637dac4f72c4279e
on powerpc. Adding in TIF_NOTIFY_RESUME, and cleaning up the syscall
tracing to be more generic. This is an incremental step to turning
on tracehook, as well as unifying more of the ptrace and signal code
across the 32/64 split.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The current kernel behaviour is to reenable interrupts unconditionally
when taking a page fault. This patch changes this to only enable them
if interrupts were previously enabled.
It also fixes a problem seen with this fix in place: the kernel previously
flushed the vsyscall page when handling a signal, which is not only
unncessary, but caused a possible sleep with interrupts disabled.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add implementation of flush_icache_range() suitable for signal handler
and kprobes. Remove flush_cache_sigtramp() and change signal.c to use
flush_icache_range().
Signed-off-by: Chris Smith <chris.smith@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Presently with preempt enabled there's the possibility to be preempted
after the TIF_USEDFPU test and the register save, leading to bogus
state post-__switch_to(). Use an explicit preempt_disable()/enable()
pair around unlazy_fpu()/clear_fpu() to avoid this. Follows the x86
change.
Reported-by: Takuo Koguchi <takuo.koguchi.sw@hitachi.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This implements kernel-level atomic rollback built on top of gUSA,
as an alternative non-IRQ based atomicity method. This is generally
a faster method for platforms that are lacking the LL/SC pairs that
SH-4A and later use, and is only supportable on legacy cores.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>