Like syscall entry all architectures have similar and pointlessly different
code to handle pending work before returning from a syscall to user space.
1) One-time syscall exit work:
- rseq syscall exit
- audit
- syscall tracing
- tracehook (single stepping)
2) Preparatory work
- Exit to user mode loop (common TIF handling).
- Architecture specific one time work arch_exit_to_user_mode_prepare()
- Address limit and lockdep checks
3) Final transition (lockdep, tracing, context tracking, RCU). Invokes
arch_exit_to_user_mode() to handle e.g. speculation mitigations
Provide a generic version based on the x86 code which has all the RCU and
instrumentation protections right.
Provide a variant for interrupt return to user mode as well which shares
the above #2 and #3 work items.
After syscall_exit_to_user_mode() and irqentry_exit_to_user_mode() the
architecture code just has to return to user space. The code after
returning from these functions must not be instrumented.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220519.613977173@linutronix.de
On syscall entry certain work needs to be done:
- Establish state (lockdep, context tracking, tracing)
- Conditional work (ptrace, seccomp, audit...)
This code is needlessly duplicated and different in all
architectures.
Provide a generic version based on the x86 implementation which has all the
RCU and instrumentation bits right.
As interrupt/exception entry from user space needs parts of the same
functionality, provide a function for this as well.
syscall_enter_from_user_mode() and irqentry_enter_from_user_mode() must be
called right after the low level ASM entry. The calling code must be
non-instrumentable. After the functions returns state is correct and the
subsequent functions can be instrumented.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220519.513463269@linutronix.de