linux/arch/x86/kernel
Andy Lutomirski 3d28ebceaf x86/mm: Rework lazy TLB to track the actual loaded mm
Lazy TLB state is currently managed in a rather baroque manner.
AFAICT, there are three possible states:

 - Non-lazy.  This means that we're running a user thread or a
   kernel thread that has called use_mm().  current->mm ==
   current->active_mm == cpu_tlbstate.active_mm and
   cpu_tlbstate.state == TLBSTATE_OK.

 - Lazy with user mm.  We're running a kernel thread without an mm
   and we're borrowing an mm_struct.  We have current->mm == NULL,
   current->active_mm == cpu_tlbstate.active_mm, cpu_tlbstate.state
   != TLBSTATE_OK (i.e. TLBSTATE_LAZY or 0).  The current cpu is set
   in mm_cpumask(current->active_mm).  CR3 points to
   current->active_mm->pgd.  The TLB is up to date.

 - Lazy with init_mm.  This happens when we call leave_mm().  We
   have current->mm == NULL, current->active_mm ==
   cpu_tlbstate.active_mm, but that mm is only relelvant insofar as
   the scheduler is tracking it for refcounting.  cpu_tlbstate.state
   != TLBSTATE_OK.  The current cpu is clear in
   mm_cpumask(current->active_mm).  CR3 points to swapper_pg_dir,
   i.e. init_mm->pgd.

This patch simplifies the situation.  Other than perf, x86 stops
caring about current->active_mm at all.  We have
cpu_tlbstate.loaded_mm pointing to the mm that CR3 references.  The
TLB is always up to date for that mm.  leave_mm() just switches us
to init_mm.  There are no longer any special cases for mm_cpumask,
and switch_mm() switches mms without worrying about laziness.

After this patch, cpu_tlbstate.state serves only to tell the TLB
flush code whether it may switch to init_mm instead of doing a
normal flush.

This makes fairly extensive changes to xen_exit_mmap(), which used
to look a bit like black magic.

Perf is unchanged.  With or without this change, perf may behave a bit
erratically if it tries to read user memory in kernel thread context.
We should build on this patch to teach perf to never look at user
memory when cpu_tlbstate.loaded_mm != current->mm.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-05 09:59:44 +02:00
..
acpi Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-01 23:54:56 -07:00
apic Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-01 21:41:07 -07:00
cpu x86/microcode/AMD: Change load_microcode_amd()'s param to bool to fix preemptibility bug 2017-05-29 08:22:48 +02:00
fpu KVM: x86: Fix load damaged SSEx MXCSR register 2017-05-15 16:08:56 +02:00
kprobes kprobes/x86: Fix to set RWX bits correctly before releasing trampoline 2017-05-26 22:37:00 -04:00
.gitignore
Makefile x86/ftrace: Use Makefile logic instead of #ifdef for compiling ftrace_*.o 2017-03-24 10:14:08 +01:00
alternative.c x86/alternatives: Prevent uninitialized stack byte read in apply_alternatives() 2017-05-24 16:18:12 +02:00
amd_gart_64.c x86: use set_memory.h header 2017-05-08 17:15:13 -07:00
amd_nb.c x86/amd_nb: Add SMN and Indirect Data Fabric access for AMD Fam17h 2016-11-16 20:46:38 +01:00
apb_timer.c Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-25 14:30:04 -08:00
aperture_64.c x86/boot/e820: Prefix the E820_* type names with "E820_TYPE_" 2017-01-28 22:55:22 +01:00
apm_32.c x86: Remap GDT tables in the fixmap section 2017-03-16 09:06:35 +01:00
asm-offsets.c efi: Get and store the secure boot status 2017-02-07 10:42:10 +01:00
asm-offsets_32.c sched/x86: Rewrite the switch_to() code 2016-08-24 12:31:41 +02:00
asm-offsets_64.c x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64 2017-02-21 12:48:35 +01:00
audit_64.c
bootflag.c
check.c
cpuid.c Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-12 19:25:04 -08:00
crash.c x86/boot/e820: Clean up the E820 table size define names 2017-01-28 22:55:23 +01:00
crash_dump_32.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
crash_dump_64.c
devicetree.c x86/cpufeature: Replace cpu_has_apic with boot_cpu_has() usage 2016-04-13 11:37:41 +02:00
doublefault.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
dumpstack.c x86/unwind: Ensure stack pointer is aligned 2017-04-18 10:30:23 +02:00
dumpstack_32.c x86/debug: Implement __WARN() using UD0 2017-03-27 10:20:28 +02:00
dumpstack_64.c x86/debug: Implement __WARN() using UD0 2017-03-27 10:20:28 +02:00
e820.c x86/boot/e820: Remove a redundant self assignment 2017-04-14 11:43:21 +02:00
early-quirks.c main drm pull request for 4.12 kernel 2017-05-03 11:44:24 -07:00
early_printk.c x86/earlyprintk: Add support for earlyprintk via USB3 debug port 2017-03-21 12:30:16 +01:00
ebda.c x86/boot: Simplify EBDA-vs-BIOS reservation logic 2016-07-22 11:46:01 +02:00
espfix_64.c x86/espfix: Add support for 5-level paging 2017-04-04 08:22:34 +02:00
ftrace.c x86/ftrace: Make sure that ftrace trampolines are not RWX 2017-05-26 22:37:02 -04:00
ftrace_32.S x86/ftrace: Fix ebp in ftrace_regs_caller that screws up unwinder 2017-04-21 09:48:16 +02:00
ftrace_64.S x86/ftrace: Use Makefile logic instead of #ifdef for compiling ftrace_*.o 2017-03-24 10:14:08 +01:00
head32.c x86/boot/e820: Move asm/e820.h to asm/e820/api.h 2017-01-28 09:31:13 +01:00
head64.c Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-01 20:51:12 -07:00
head_32.S x86/boot/32: Convert the 32-bit pgtable setup code from assembly to C 2017-01-06 08:39:26 +01:00
head_64.S x86/boot/64: Rename start_cpu() 2017-03-07 13:57:25 +01:00
hpet.c x86/hpet: Prevent might sleep splat on resume 2017-03-02 09:33:47 +01:00
hw_breakpoint.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
i8237.c
i8253.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
i8259.c x86: i8259: export legacy_pic symbol 2017-04-14 12:08:51 +02:00
io_delay.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
ioport.c Second batch of KVM changes for 4.11 merge window 2017-03-04 11:36:19 -08:00
irq.c x86/irq: Optimize free vector check in the CPU offline path 2017-04-20 15:25:09 +02:00
irq_32.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
irq_64.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
irq_work.c x86/irq, trace: Add __irq_entry annotation to x86's platform IRQ handlers 2017-01-05 08:58:49 +01:00
irqinit.c x86/irq: Remove a redundant #ifdef directive 2017-04-14 22:43:01 +02:00
itmt.c sched/x86: Remove unnecessary TBM3 check to update topology 2017-01-19 08:42:37 +01:00
jump_label.c locking/jump_labels: Update bug_at() boot message 2017-01-12 09:43:07 +01:00
kdebugfs.c x86/kdebugfs: Move boot params hierarchy under (debugfs)/x86/ 2017-03-01 09:57:02 +01:00
kexec-bzimage64.c x86/boot/e820: Clean up the E820 table size define names 2017-01-28 22:55:23 +01:00
kgdb.c sched/x86: Add 'struct inactive_task_frame' to better document the sleeping task stack frame 2016-08-24 12:27:41 +02:00
ksysfs.c x86: Apply more __ro_after_init and const 2016-08-10 14:55:05 +02:00
kvm.c x86/kvm: virt_xxx memory barriers instead of mandatory barriers 2017-04-12 20:17:38 +02:00
kvmclock.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> 2017-03-02 08:42:27 +01:00
ldt.c x86/mm: Rework lazy TLB to track the actual loaded mm 2017-06-05 09:59:44 +02:00
livepatch.c livepatch/x86: apply alternatives and paravirt patches after relocations 2016-08-18 23:41:55 +02:00
machine_kexec_32.c x86: use set_memory.h header 2017-05-08 17:15:13 -07:00
machine_kexec_64.c Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-12 10:11:50 -07:00
mmconf-fam10h_64.c
module.c mm, vmalloc: use __GFP_HIGHMEM implicitly 2017-05-08 17:15:13 -07:00
mpparse.c x86/boot/e820: Rename early_reserve_e820() to e820__memblock_alloc() and document it 2017-01-28 14:42:30 +01:00
msr.c x86/msr: Remove bogus cleanup from the error path 2016-12-25 10:47:41 +01:00
nmi.c * An EDAC driver for Cavium ThunderX RAS IP (Sergey Temerkhanov) 2017-05-01 11:36:00 -07:00
nmi_selftest.c
paravirt-spinlocks.c 4.11 is going to be a relatively large release for KVM, with a little over 2017-02-22 18:22:53 -08:00
paravirt.c x86/paravirt: Add 5-level support to the paravirt code 2017-04-04 08:22:34 +02:00
paravirt_patch_32.c x86/paravirt: Mark unused patch_default label 2016-12-22 17:43:35 +01:00
paravirt_patch_64.c x86/paravirt: Mark unused patch_default label 2016-12-22 17:43:35 +01:00
pci-calgary_64.c x86/pci-calgary: Use setup_timer() instead of open coding it. 2017-03-31 10:21:04 +02:00
pci-dma.c This is a tree wide change and has been kept separate for that reason. 2017-02-25 13:45:43 -08:00
pci-iommu_table.c x86: Fix non-static inlines 2016-04-16 13:21:40 +02:00
pci-nommu.c treewide: Constify most dma_map_ops structures 2017-01-24 12:23:35 -05:00
pci-swiotlb.c treewide: Constify most dma_map_ops structures 2017-01-24 12:23:35 -05:00
pcspeaker.c
perf_regs.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
platform-quirks.c x86/init: Add i8042 state to the platform data 2016-12-19 11:34:15 +01:00
pmem.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
probe_roms.c x86/boot/e820: Move asm/e820.h to asm/e820/api.h 2017-01-28 09:31:13 +01:00
process.c x86/arch_prctl: Add ARCH_[GET|SET]_CPUID 2017-03-20 16:10:34 +01:00
process_32.c x86/debug/32: Convert a smp_processor_id() call to raw to avoid DEBUG_PREEMPT warning 2017-05-29 08:22:49 +02:00
process_64.c x86/xen: add CONFIG_XEN_PV to Kconfig 2017-05-02 10:50:19 +02:00
ptrace.c x86/arch_prctl/64: Rename do_arch_prctl() to do_arch_prctl_64() 2017-03-20 16:10:32 +01:00
pvclock.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/nmi.h> 2017-03-02 08:42:30 +01:00
quirks.c x86/quirks: Hide maybe-uninitialized warning 2016-10-25 11:45:13 +02:00
reboot.c x86/mce: Handle broadcasted MCE gracefully with kexec 2017-03-13 20:18:07 +01:00
reboot_fixups_32.c
relocate_kernel_32.S
relocate_kernel_64.S
resource.c x86/boot/e820: Harmonize the 'struct e820_table' fields 2017-01-28 09:33:16 +01:00
rtc.c timekeeping: Ignore the bogus sleep time if pm_trace is enabled 2016-11-29 18:02:58 +01:00
setup.c x86/timers: Move simple_udelay_calibration past init_hypervisor_platform 2017-05-26 13:04:09 +02:00
setup_percpu.c x86/boot/32: Fix UP boot on Quark and possibly other platforms 2017-05-09 08:14:24 +02:00
signal.c x86/debug: Fix the printk() debug output of signal_fault(), do_trap() and do_general_protection() 2017-04-11 09:11:13 +02:00
signal_compat.c x86/signals: Fix lower/upper bound reporting in compat siginfo 2017-04-05 10:16:43 +02:00
smp.c Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-01 20:51:12 -07:00
smpboot.c x86: Remap GDT tables in the fixmap section 2017-03-16 09:06:35 +01:00
stacktrace.c stacktrace/x86: add function for detecting reliable stack traces 2017-03-08 09:18:02 +01:00
step.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
sys_x86_64.c x86/hugetlb: Adjust to the new native/compat mmap bases 2017-03-14 16:29:16 +01:00
sysfb.c
sysfb_efi.c Merge branch 'linus' into efi/core, to pick up fixes 2016-05-07 07:00:07 +02:00
sysfb_simplefb.c x86/sysfb: Fix lfb_size calculation 2016-11-16 09:38:23 +01:00
tboot.c IOMMU Updates for Linux v4.12 2017-05-09 15:15:47 -07:00
tce_64.c x86/cpufeature: Remove cpu_has_clflush 2016-03-31 13:35:09 +02:00
time.c
tls.c x86/tls: Forcibly set the accessed bit in TLS segments 2017-03-19 12:14:35 +01:00
tls.h
topology.c
trace_clock.c
tracepoint.c tracing: Have the reg function allow to fail 2016-12-09 09:13:30 -05:00
traps.c Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-01 22:07:51 -07:00
tsc.c sched/clock, x86/perf: Fix "perf test tsc" 2017-03-23 07:31:49 +01:00
tsc_msr.c x86/tsc: Set TSC_KNOWN_FREQ and TSC_RELIABLE flags on Intel Atom SoCs 2016-11-18 10:58:31 +01:00
tsc_sync.c x86/tsc: Make the TSC ADJUST sanitizing work for tsc_reliable 2017-02-10 09:47:17 +01:00
unwind_frame.c x86/unwind: Add end-of-stack check for ftrace handlers 2017-05-24 09:05:16 +02:00
unwind_guess.c x86/unwind: Ensure stack pointer is aligned 2017-04-18 10:30:23 +02:00
uprobes.c uprobes/x86: Fix RIP-relative handling of EVEX-encoded instructions 2016-08-12 08:29:24 +02:00
verify_cpu.S
vm86_32.c x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly() 2017-04-26 10:02:06 +02:00
vmlinux.lds.S debug: Fix __bug_table[] in arch linker scripts 2017-04-03 10:22:40 +02:00
vsmp_64.c
x86_init.c x86/boot/e820: Rename default_machine_specific_memory_setup() to e820__memory_setup_default() 2017-01-28 14:42:26 +01:00