Commit Graph

16596 Commits

Author SHA1 Message Date
Tony W Wang-oc 33b4711df4 x86/cpu/centaur: Add Centaur family >=7 CPUs initialization support
Add Centaur family >=7 CPUs specific initialization support.

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/1599562666-31351-3-git-send-email-TonyWWang-oc@zhaoxin.com
2020-09-11 10:53:19 +02:00
Tony W Wang-oc 8687bdc041 x86/cpu/centaur: Replace two-condition switch-case with an if statement
Use a normal if statements instead of a two-condition switch-case.

 [ bp: Massage commit message. ]

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/1599562666-31351-2-git-send-email-TonyWWang-oc@zhaoxin.com
2020-09-11 10:50:01 +02:00
Christoph Hellwig 2f5388a29b dma-direct: remove dma_direct_{alloc,free}_pages
Just merge these helpers into the main dma_direct_{alloc,free} routines,
as the additional checks are always false for the two callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2020-09-11 09:10:29 +02:00
Borislav Petkov e2def7d49d x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR
If an exception needs to be handled while reading an MSR - which is in
most of the cases caused by a #GP on a non-existent MSR - then this
is most likely the incarnation of a BIOS or a hardware bug. Such bug
violates the architectural guarantee that MCA banks are present with all
MSRs belonging to them.

The proper fix belongs in the hardware/firmware - not in the kernel.

Handling an #MC exception which is raised while an NMI is being handled
would cause the nasty NMI nesting issue because of the shortcoming of
IRET of reenabling NMIs when executed. And the machine is in an #MC
context already so <Deity> be at its side.

Tracing MSR accesses while in #MC is another no-no due to tracing being
inherently a bad idea in atomic context:

  vmlinux.o: warning: objtool: do_machine_check()+0x4a: call to mce_rdmsrl() leaves .noinstr.text section

so remove all that "additional" functionality from mce_rdmsrl() and
provide it with a special exception handler which panics the machine
when that MSR is not accessible.

The exception handler prints a human-readable message explaining what
the panic reason is but, what is more, it panics while in the #GP
handler and latter won't have executed an IRET, thus opening the NMI
nesting issue in the case when the #MC has happened while handling
an NMI. (#MC itself won't be reenabled until MCG_STATUS hasn't been
cleared).

Suggested-by: Andy Lutomirski <luto@kernel.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
[ Add missing prototypes for ex_handler_* ]
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200906212130.GA28456@zn.tnic
2020-09-11 08:25:43 +02:00
Martin Radev f5ed777586 x86/sev-es: Check required CPU features for SEV-ES
Make sure the machine supports RDRAND, otherwise there is no trusted
source of randomness in the system.

To also check this in the pre-decompression stage, make has_cpuflag()
not depend on CONFIG_RANDOMIZE_BASE anymore.

Signed-off-by: Martin Radev <martin.b.radev@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-73-joro@8bytes.org
2020-09-10 21:49:25 +02:00
Tom Lendacky 39336f4ffb x86/efi: Add GHCB mappings when SEV-ES is active
Calling down to EFI runtime services can result in the firmware
performing VMGEXIT calls. The firmware is likely to use the GHCB of the
OS (e.g., for setting EFI variables), so each GHCB in the system needs
to be identity-mapped in the EFI page tables, as unencrypted, to avoid
page faults.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: Moved GHCB mapping loop to sev-es.c ]
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-72-joro@8bytes.org
2020-09-10 21:48:50 +02:00
Arvind Sankar 0a4bb5e550 x86/fpu: Allow multiple bits in clearcpuid= parameter
Commit

  0c2a3913d6 ("x86/fpu: Parse clearcpuid= as early XSAVE argument")

changed clearcpuid parsing from __setup() to cmdline_find_option().
While the __setup() function would have been called for each clearcpuid=
parameter on the command line, cmdline_find_option() will only return
the last one, so the change effectively made it impossible to disable
more than one bit.

Allow a comma-separated list of bit numbers as the argument for
clearcpuid to allow multiple bits to be disabled again. Log the bits
being disabled for informational purposes.

Also fix the check on the return value of cmdline_find_option(). It
returns -1 when the option is not found, so testing as a boolean is
incorrect.

Fixes: 0c2a3913d6 ("x86/fpu: Parse clearcpuid= as early XSAVE argument")
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907213919.2423441-1-nivedita@alum.mit.edu
2020-09-10 18:32:05 +02:00
Julien Thierry ee819aedf3 objtool: Make unwind hint definitions available to other architectures
Unwind hints are useful to provide objtool with information about stack
states in non-standard functions/code.

While the type of information being provided might be very arch
specific, the mechanism to provide the information can be useful for
other architectures.

Move the relevant unwint hint definitions for all architectures to
see.

[ jpoimboe: REGS_IRET -> REGS_PARTIAL ]

Signed-off-by: Julien Thierry <jthierry@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-09-10 10:43:13 -05:00
Julien Thierry 00089c048e objtool: Rename frame.h -> objtool.h
Header frame.h is getting more code annotations to help objtool analyze
object files.

Rename the file to objtool.h.

[ jpoimboe: add objtool.h to MAINTAINERS ]

Signed-off-by: Julien Thierry <jthierry@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-09-10 10:43:13 -05:00
Ahmed S. Darwish a1f1066133 x86/tsc: Use seqcount_latch_t
Latch sequence counters have unique read and write APIs, and thus
seqcount_latch_t was recently introduced at seqlock.h.

Use that new data type instead of plain seqcount_t. This adds the
necessary type-safety and ensures that only latching-safe seqcount APIs
are to be used.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
[peterz: unwreck cyc2ns_read_begin()]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200827114044.11173-7-a.darwish@linutronix.de
2020-09-10 11:19:29 +02:00
Joerg Roedel 4ca68e023b x86/sev-es: Handle NMI State
When running under SEV-ES, the kernel has to tell the hypervisor when to
open the NMI window again after an NMI was injected. This is done with
an NMI-complete message to the hypervisor.

Add code to the kernel's NMI handler to send this message right at the
beginning of do_nmi(). This always allows nesting NMIs.

 [ bp: Mark __sev_es_nmi_complete() noinstr:
   vmlinux.o: warning: objtool: exc_nmi()+0x17: call to __sev_es_nmi_complete()
	leaves .noinstr.text section
   While at it, use __pa_nodebug() for the same reason due to
   CONFIG_DEBUG_VIRTUAL=y:
   vmlinux.o: warning: objtool: __sev_es_nmi_complete()+0xd9: call to __phys_addr()
   	leaves .noinstr.text section ]

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-71-joro@8bytes.org
2020-09-09 18:02:35 +02:00
Joerg Roedel 094794f597 x86/sev-es: Support CPU offline/online
Add a play_dead handler when running under SEV-ES. This is needed
because the hypervisor can't deliver an SIPI request to restart the AP.
Instead, the kernel has to issue a VMGEXIT to halt the VCPU until the
hypervisor wakes it up again.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-70-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Joerg Roedel 3ecacdbd23 x86/head/64: Don't call verify_cpu() on starting APs
The APs are not ready to handle exceptions when verify_cpu() is called
in secondary_startup_64().

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-69-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Joerg Roedel 520d030852 x86/smpboot: Load TSS and getcpu GDT entry before loading IDT
The IDT on 64-bit contains vectors which use paranoid_entry() and/or IST
stacks. To make these vectors work, the TSS and the getcpu GDT entry need
to be set up before the IDT is loaded.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-68-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Tom Lendacky 8940ac9ced x86/realmode: Setup AP jump table
As part of the GHCB specification, the booting of APs under SEV-ES
requires an AP jump table when transitioning from one layer of code to
another (e.g. when going from UEFI to the OS). As a result, each layer
that parks an AP must provide the physical address of an AP jump table
to the next layer via the hypervisor.

Upon booting of the kernel, read the AP jump table address from the
hypervisor. Under SEV-ES, APs are started using the INIT-SIPI-SIPI
sequence. Before issuing the first SIPI request for an AP, the start
CS and IP is programmed into the AP jump table. Upon issuing the SIPI
request, the AP will awaken and jump to that start CS:IP address.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: - Adapted to different code base
                   - Moved AP table setup from SIPI sending path to
		     real-mode setup code
		   - Fix sparse warnings ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-67-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Doug Covelli 1a222de8dc x86/vmware: Add VMware-specific handling for VMMCALL under SEV-ES
Add VMware-specific handling for #VC faults caused by VMMCALL
instructions.

Signed-off-by: Doug Covelli <dcovelli@vmware.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: - Adapt to different paravirt interface ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-65-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Tom Lendacky 99419b251e x86/kvm: Add KVM-specific VMMCALL handling under SEV-ES
Implement the callbacks to copy the processor state required by KVM to
the GHCB.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: - Split out of a larger patch
                   - Adapt to different callback functions ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-64-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Joerg Roedel f6a9f8a458 x86/paravirt: Allow hypervisor-specific VMMCALL handling under SEV-ES
Add two new paravirt callbacks to provide hypervisor-specific processor
state in the GHCB and to copy state from the hypervisor back to the
processor.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-63-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Joerg Roedel cb1ad3ecea x86/sev-es: Handle #DB Events
Handle #VC exceptions caused by #DB exceptions in the guest. Those
must be handled outside of instrumentation_begin()/end() so that the
handler will not be raised recursively.

Handle them by calling the kernel's debug exception handler.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-62-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Joerg Roedel a2d0171a9c x86/sev-es: Handle #AC Events
Implement a handler for #VC exceptions caused by #AC exceptions. The
#AC exception is just forwarded to do_alignment_check() and not pushed
down to the hypervisor, as requested by the SEV-ES GHCB Standardization
Specification.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-61-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Tom Lendacky 2eb7dcf0cc x86/sev-es: Handle VMMCALL Events
Implement a handler for #VC exceptions caused by VMMCALL instructions.
This is only a starting point, VMMCALL emulation under SEV-ES needs
further hypervisor-specific changes to provide additional state.

 [ bp: Drop "this patch". ]

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: Adapt to #VC handling infrastructure ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-60-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Tom Lendacky ded476bbe2 x86/sev-es: Handle MWAIT/MWAITX Events
Implement a handler for #VC exceptions caused by MWAIT and MWAITX
instructions.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: Adapt to #VC handling infrastructure ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-59-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Tom Lendacky 0c2fd2ef64 x86/sev-es: Handle MONITOR/MONITORX Events
Implement a handler for #VC exceptions caused by MONITOR and MONITORX
instructions.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: Adapt to #VC handling infrastructure ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-58-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Tom Lendacky 8b4ce83707 x86/sev-es: Handle INVD Events
Implement a handler for #VC exceptions caused by INVD instructions.
Since Linux should never use INVD, just mark it as unsupported.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: Adapt to #VC handling infrastructure ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-57-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Tom Lendacky 5d55cf78a8 x86/sev-es: Handle RDPMC Events
Implement a handler for #VC exceptions caused by RDPMC instructions.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: Adapt to #VC handling infrastructure ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-56-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Tom Lendacky 4711e7acaa x86/sev-es: Handle RDTSC(P) Events
Implement a handler for #VC exceptions caused by RDTSC and RDTSCP
instructions. Also make it available in the pre-decompression stage
because the KASLR code uses RDTSC/RDTSCP to gather entropy and some
hypervisors intercept these instructions.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: - Adapt to #VC handling infrastructure
                   - Make it available early ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-55-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Tom Lendacky a14a92fc4b x86/sev-es: Handle WBINVD Events
Implement a handler for #VC exceptions caused by WBINVD instructions.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: Adapt to #VC handling framework ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-54-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Tom Lendacky 479a7bf5c1 x86/sev-es: Handle DR7 read/write events
Add code to handle #VC exceptions on DR7 register reads and writes.
This is needed early because show_regs() reads DR7 to print it out.

Under SEV-ES, there is currently no support for saving/restoring the
DRx registers but software expects to be able to write to the DR7
register. For now, cache the value written to DR7 and return it on
read attempts, but do not touch the real hardware DR7.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: - Adapt to #VC handling framework
                   - Support early usage ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-53-joro@8bytes.org
2020-09-09 11:33:20 +02:00
Tom Lendacky a4afa6081c x86/sev-es: Handle MSR events
Implement a handler for #VC exceptions caused by RDMSR/WRMSR
instructions.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: Adapt to #VC handling infrastructure. ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-52-joro@8bytes.org
2020-09-09 11:33:19 +02:00
Joerg Roedel 0118b604c2 x86/sev-es: Handle MMIO String Instructions
Add handling for emulation of the MOVS instruction on MMIO regions, as
done by the memcpy_toio() and memcpy_fromio() functions.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-51-joro@8bytes.org
2020-09-09 11:33:19 +02:00
Tom Lendacky 51ee7d6e3d x86/sev-es: Handle MMIO events
Add a handler for #VC exceptions caused by MMIO intercepts. These
intercepts come along as nested page faults on pages with reserved
bits set.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: Adapt to VC handling framework ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-50-joro@8bytes.org
2020-09-09 11:33:19 +02:00
Joerg Roedel 5e3427a7bc x86/sev-es: Handle instruction fetches from user-space
When a #VC exception is triggered by user-space, the instruction decoder
needs to read the instruction bytes from user addresses. Enhance
vc_decode_insn() to safely fetch kernel and user instructions.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-49-joro@8bytes.org
2020-09-09 11:33:19 +02:00
Joerg Roedel d3529bb73f x86/sev-es: Wire up existing #VC exit-code handlers
Re-use the handlers for CPUID- and IOIO-caused #VC exceptions in the
early boot handler.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-48-joro@8bytes.org
2020-09-09 11:33:19 +02:00
Tom Lendacky 0786138c78 x86/sev-es: Add a Runtime #VC Exception Handler
Add the handlers for #VC exceptions invoked at runtime.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-47-joro@8bytes.org
2020-09-09 11:33:19 +02:00
Joerg Roedel a13644f3a5 x86/entry/64: Add entry code for #VC handler
The #VC handler needs special entry code because:

	1. It runs on an IST stack

	2. It needs to be able to handle nested #VC exceptions

To make this work, the entry code is implemented to pretend it doesn't
use an IST stack. When entered from user-mode or early SYSCALL entry
path it switches to the task stack. If entered from kernel-mode it tries
to switch back to the previous stack in the IRET frame.

The stack found in the IRET frame is validated first, and if it is not
safe to use it for the #VC handler, the code will switch to a
fall-back stack (the #VC2 IST stack). From there, it can cause nested
exceptions again.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-46-joro@8bytes.org
2020-09-09 11:33:19 +02:00
Joerg Roedel 6b27edd74a x86/dumpstack/64: Add noinstr version of get_stack_info()
The get_stack_info() functionality is needed in the entry code for the
#VC exception handler. Provide a version of it in the .text.noinstr
section which can be called safely from there.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-45-joro@8bytes.org
2020-09-09 11:33:19 +02:00
Joerg Roedel 315562c9af x86/sev-es: Adjust #VC IST Stack on entering NMI handler
When an NMI hits in the #VC handler entry code before it has switched to
another stack, any subsequent #VC exception in the NMI code-path will
overwrite the interrupted #VC handler's stack.

Make sure this doesn't happen by explicitly adjusting the #VC IST entry
in the NMI handler for the time it can cause #VC exceptions.

 [ bp: Touchups, spelling fixes. ]

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-44-joro@8bytes.org
2020-09-09 11:33:19 +02:00
Joerg Roedel 02772fb9b6 x86/sev-es: Allocate and map an IST stack for #VC handler
Allocate and map an IST stack and an additional fall-back stack for
the #VC handler.  The memory for the stacks is allocated only when
SEV-ES is active.

The #VC handler needs to use an IST stack because a #VC exception can be
raised from kernel space with unsafe stack, e.g. in the SYSCALL entry
path.

Since the #VC exception can be nested, the #VC handler switches back to
the interrupted stack when entered from kernel space. If switching back
is not possible, the fall-back stack is used.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-43-joro@8bytes.org
2020-09-09 11:33:19 +02:00
Tom Lendacky 885689e47d x86/sev-es: Setup per-CPU GHCBs for the runtime handler
The runtime handler needs one GHCB per-CPU. Set them up and map them
unencrypted.

 [ bp: Touchups and simplification. ]

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-42-joro@8bytes.org
2020-09-09 11:33:19 +02:00
Joerg Roedel 1aa9aa8ee5 x86/sev-es: Setup GHCB-based boot #VC handler
Add the infrastructure to handle #VC exceptions when the kernel runs on
virtual addresses and has mapped a GHCB. This handler will be used until
the runtime #VC handler takes over.

Since the handler runs very early, disable instrumentation for sev-es.c.

 [ bp: Make vc_ghcb_invalidate() __always_inline so that it can be
   inlined in noinstr functions like __sev_es_nmi_complete(). ]

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200908123816.GB3764@8bytes.org
2020-09-09 11:32:27 +02:00
Joerg Roedel 74d8d9d531 x86/sev-es: Setup an early #VC handler
Setup an early handler for #VC exceptions. There is no GHCB mapped
yet, so just re-use the vc_no_ghcb_handler(). It can only handle
CPUID exit-codes, but that should be enough to get the kernel through
verify_cpu() and __startup_64() until it runs on virtual addresses.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
[ boot failure Error: kernel_ident_mapping_init() failed. ]
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lkml.kernel.org/r/20200908123517.GA3764@8bytes.org
2020-09-09 10:45:24 +02:00
Joerg Roedel f980f9c31a x86/sev-es: Compile early handler code into kernel image
Setup sev-es.c and include the code from the pre-decompression stage
to also build it into the image of the running kernel. Temporarily add
__maybe_unused annotations to avoid build warnings until the functions
get used.

 [ bp: Use the non-tracing rd/wrmsr variants because:
   vmlinux.o: warning: objtool: __sev_es_nmi_complete()+0x11f: \
	   call to do_trace_write_msr() leaves .noinstr.text section
   as __sev_es_nmi_complete() is noinstr due to being called from the
   NMI handler exc_nmi(). ]

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-39-joro@8bytes.org
2020-09-09 10:44:46 +02:00
Christoph Hellwig 47058bb54b x86: remove address space overrides using set_fs()
Stop providing the possibility to override the address space using
set_fs() now that there is no need for that any more.  To properly
handle the TASK_SIZE_MAX checking for 4 vs 5-level page tables on
x86 a new alternative is introduced, which just like the one in
entry_64.S has to use the hardcoded virtual address bits to escape
the fact that TASK_SIZE_MAX isn't actually a constant when 5-level
page tables are enabled.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-08 22:21:36 -04:00
Masami Hiramatsu d7641289da x86/kprobes: Use generic kretprobe trampoline handler
Use the generic kretprobe trampoline handler. Use regs->sp
for framepointer verification.

[ mingo: Minor edits. ]

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/159870601250.1229682.14598707734683575237.stgit@devnote2
2020-09-08 11:52:32 +02:00
Joerg Roedel 4b47cdbda6 x86/head/64: Move early exception dispatch to C code
Move the assembly coded dispatch between page-faults and all other
exceptions to C code to make it easier to maintain and extend.

Also change the return-type of early_make_pgtable() to bool and make it
static.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-36-joro@8bytes.org
2020-09-07 22:49:18 +02:00
Joerg Roedel 097ee5b778 x86/idt: Make IDT init functions static inlines
Move these two functions from kernel/idt.c to include/asm/desc.h:

	* init_idt_data()
	* idt_init_desc()

These functions are needed to setup IDT entries very early and need to
be called from head64.c. To be usable this early, these functions need
to be compiled without instrumentation and the stack-protector feature.

These features need to be kept enabled for kernel/idt.c, so head64.c
must use its own versions.

 [ bp: Take Kees' suggested patch title and add his Rev-by. ]

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-35-joro@8bytes.org
2020-09-07 22:44:43 +02:00
Joerg Roedel f5963ba7a4 x86/head/64: Install a CPU bringup IDT
Add a separate bringup IDT for the CPU bringup code that will be used
until the kernel switches to the idt_table. There are two reasons for a
separate IDT:

	1) When the idt_table is set up and the secondary CPUs are
	   booted, it contains entries (e.g. IST entries) which
	   require certain CPU state to be set up. This includes a
	   working TSS (for IST), MSR_GS_BASE (for stack protector) or
	   CR4.FSGSBASE (for paranoid_entry) path. By using a
	   dedicated IDT for early boot this state need not to be set
	   up early.

	2) The idt_table is static to idt.c, so any function
	   using/modifying must be in idt.c too. That means that all
	   compiler driven instrumentation like tracing or KASAN is
	   also active in this code. But during early CPU bringup the
	   environment is not set up for this instrumentation to work
	   correctly.

To avoid all of these hassles and make early exception handling robust,
use a dedicated bringup IDT.

The IDT is loaded two times, first on the boot CPU while the kernel is
still running on direct mapped addresses, and again later after the
switch to kernel addresses has happened. The second IDT load happens on
the boot and secondary CPUs.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-34-joro@8bytes.org
2020-09-07 22:18:38 +02:00
Joerg Roedel 3add38cb96 x86/head/64: Switch to initial stack earlier
Make sure there is a stack once the kernel runs from virtual addresses.
At this stage any secondary CPU which boots will have lost its stack
because the kernel switched to a new page-table which does not map the
real-mode stack anymore.

This is needed for handling early #VC exceptions caused by instructions
like CPUID.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-33-joro@8bytes.org
2020-09-07 21:44:01 +02:00
Joerg Roedel 7b99819dfb x86/head/64: Load segment registers earlier
Make sure segments are properly set up before setting up an IDT and
doing anything that might cause a #VC exception. This is later needed
for early exception handling.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-32-joro@8bytes.org
2020-09-07 21:40:53 +02:00
Joerg Roedel e04b883363 x86/head/64: Load GDT after switch to virtual addresses
Load the GDT right after switching to virtual addresses to make sure
there is a defined GDT for exception handling.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-31-joro@8bytes.org
2020-09-07 21:35:54 +02:00
Joerg Roedel 866b556efa x86/head/64: Install startup GDT
Handling exceptions during boot requires a working GDT. The kernel GDT
can't be used on the direct mapping, so load a startup GDT and setup
segments.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-30-joro@8bytes.org
2020-09-07 21:33:17 +02:00
Joerg Roedel 4bed2266cc x86/idt: Split idt_data setup out of set_intr_gate()
The code to setup idt_data is needed for early exception handling, but
set_intr_gate() can't be used that early because it has pv-ops in its
code path which don't work that early.

Split out the idt_data initialization part from set_intr_gate() so
that it can be used separately.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-29-joro@8bytes.org
2020-09-07 21:30:38 +02:00
Tom Lendacky a7de15d489 x86/sev-es: Add CPUID handling to #VC handler
Handle #VC exceptions caused by CPUID instructions. These happen in
early boot code when the KASLR code checks for RDTSC.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: Adapt to #VC handling framework ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-28-joro@8bytes.org
2020-09-07 20:15:51 +02:00
Tom Lendacky 25189d08e5 x86/sev-es: Add support for handling IOIO exceptions
Add support for decoding and handling #VC exceptions for IOIO events.

[ jroedel@suse.de: Adapted code to #VC handling framework ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-26-joro@8bytes.org
2020-09-07 19:45:26 +02:00
Joerg Roedel 597cfe4821 x86/boot/compressed/64: Setup a GHCB-based VC Exception handler
Install an exception handler for #VC exception that uses a GHCB. Also
add the infrastructure for handling different exit-codes by decoding
the instruction that caused the exception and error handling.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-24-joro@8bytes.org
2020-09-07 19:45:25 +02:00
Joerg Roedel 29dcc60f6a x86/boot/compressed/64: Add stage1 #VC handler
Add the first handler for #VC exceptions. At stage 1 there is no GHCB
yet because the kernel might still be running on the EFI page table.

The stage 1 handler is limited to the MSR-based protocol to talk to the
hypervisor and can only support CPUID exit-codes, but that is enough to
get to stage 2.

 [ bp: Zap superfluous newlines after rd/wrmsr instruction mnemonics. ]

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-20-joro@8bytes.org
2020-09-07 19:45:25 +02:00
Joerg Roedel 172639d799 x86/umip: Factor out instruction decoding
Factor out the code used to decode an instruction with the correct
address and operand sizes to a helper function.

No functional changes.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-10-joro@8bytes.org
2020-09-07 19:45:24 +02:00
Joerg Roedel 172b75e56b x86/umip: Factor out instruction fetch
Factor out the code to fetch the instruction from user-space to a helper
function.

No functional changes.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-9-joro@8bytes.org
2020-09-07 19:45:24 +02:00
Tom Lendacky 360e7c5c4c x86/cpufeatures: Add SEV-ES CPU feature
Add CPU feature detection for Secure Encrypted Virtualization with
Encrypted State. This feature enhances SEV by also encrypting the
guest register state, making it in-accessible to the hypervisor.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-6-joro@8bytes.org
2020-09-07 19:45:24 +02:00
Borislav Petkov c48f46ac7b Merge 'x86/cpu' to pick up dependent bits
Pick up work happening in parallel to avoid nasty merge conflicts later.

Signed-off-by: Borislav Petkov <bp@suse.de>
2020-09-07 19:43:43 +02:00
Colin Ian King 93921baa3f x86/resctrl: Fix spelling in user-visible warning messages
Fix spelling mistake "Could't" -> "Couldn't" in user-visible warning
messages.

 [ bp: Massage commit message; s/cpu/CPU/g ]

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200810075508.46490-1-colin.king@canonical.com
2020-09-05 01:24:17 +02:00
Peter Zijlstra d53d9bc0cf x86/debug: Change thread.debugreg6 to thread.virtual_dr6
Current usage of thread.debugreg6 is convoluted at best. It starts life as
a copy of the hardware DR6 value, but then various bits are cleared and
set.

Replace this with a new variable thread.virtual_dr6 that is initialized to
0 when DR6 is read and only gains bits, at the same time the actual (on
stack) dr6 value which is read from the hardware only gets bits cleared.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133201.415372940@infradead.org
2020-09-04 15:12:58 +02:00
Peter Zijlstra f4956cf83e x86/debug: Support negative polarity DR6 bits
DR6 has a whole bunch of bits that have negative polarity; they were
architecturally reserved and defined to be 1 and are now getting used.
Since they're 1 by default, 0 becomes the signal value.

Handle this by xor'ing the read DR6 value by the reserved mask, this
will flip them around such that 1 is the signal value (positive
polarity).

Current Linux doesn't yet support any of these bits, but there's two
defined:

 - DR6[11] Bus Lock Debug Exception		(ISEr39)
 - DR6[16] Restricted Transactional Memory	(SDM)

Update ptrace_{set,get}_debugreg() to provide/consume the value in
architectural polarity. Although afaict ptrace_set_debugreg(6) is
pointless, the value is not consumed anywhere.

Change hw_breakpoint_restore() to alway write the DR6_RESERVED value
to DR6, again, no consumer for that write.

Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133201.354220797@infradead.org
2020-09-04 15:12:57 +02:00
Peter Zijlstra 21d44be7b6 x86/debug: Simplify hw_breakpoint_handler()
This is called with interrupts disabled, there's no point in using
get_cpu() and per_cpu().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133201.292906672@infradead.org
2020-09-04 15:12:56 +02:00
Peter Zijlstra b84d42b6c6 x86/debug: Remove aout_dump_debugregs()
Unused remnants for the bit-bucket.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133201.233022474@infradead.org
2020-09-04 15:12:55 +02:00
Peter Zijlstra 389cd0cd8b x86/debug: Remove the historical junk
Remove the historical junk and replace it with a WARN and a comment.

The problem is that even though the kernel only uses TF single-step in
kprobes and KGDB, both of which consume the event before this, QEMU/KVM has
bugs in this area that can trigger this state so it has to be dealt with.

Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133201.170216274@infradead.org
2020-09-04 15:12:55 +02:00
Peter Zijlstra f0b67c39c1 x86/debug: Move cond_local_irq_enable() block into exc_debug_user()
The cond_local_irq_enable() block, dealing with vm86 and sending
signals is only relevant for #DB-from-user, move it there.

This then reduces handle_debug() to only the notifier call, so rename
it to notify_debug().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133201.094265982@infradead.org
2020-09-04 15:12:53 +02:00
Peter Zijlstra 4eb5acc391 x86/debug: Move historical SYSENTER junk into exc_debug_kernel()
The historical SYSENTER junk is explicitly for from-kernel, so move it
to the #DB-from-kernel handler.

It is ordered after the notifier, which is important for KGDB which uses TF
single-step and needs to consume the event before that point.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133201.031099736@infradead.org
2020-09-04 15:12:53 +02:00
Peter Zijlstra 4182e94369 x86/debug: Simplify #DB signal code
There's no point in calculating si_code if it's not going to be used.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133200.967434217@infradead.org
2020-09-04 15:12:52 +02:00
Peter Zijlstra 7043679a98 x86/debug: Remove handle_debug(.user) argument
The handle_debug(.user) argument is used to terminate the #DB handler early
for the INT1-from-kernel case, since the kernel doesn't use INT1.

Remove the argument and handle this explicitly in #DB-from-kernel.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20200902133200.907020598@infradead.org
2020-09-04 15:12:52 +02:00
Peter Zijlstra 20a6e35a94 x86/debug: Move kprobe_debug_handler() into exc_debug_kernel()
Kprobes are on kernel text, and thus only matter for #DB-from-kernel.
Kprobes are ordered before the generic notifier, preserve that order.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20200902133200.847465360@infradead.org
2020-09-04 15:12:52 +02:00
Peter Zijlstra c182487da1 x86/debug: Sync BTF earlier
Move the BTF sync near the DR6 load, as this will be the only common
code guaranteed to run on every #DB.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20200902133200.786888252@infradead.org
2020-09-04 15:12:52 +02:00
Andy Lutomirski d5c678aed5 x86/debug: Allow a single level of #DB recursion
Trying to clear DR7 around a #DB from usermode malfunctions if the tasks
schedules when delivering SIGTRAP.

Rather than trying to define a special no-recursion region, just allow a
single level of recursion.  The same mechanism is used for NMI, and it
hasn't caused any problems yet.

Fixes: 9f58fdde95 ("x86/db: Split out dr6/7 handling")
Reported-by: Kyle Huey <me@kylehuey.com>
Debugged-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/8b9bd05f187231df008d48cf818a6a311cbd5c98.1597882384.git.luto@kernel.org
Link: https://lore.kernel.org/r/20200902133200.726584153@infradead.org
2020-09-04 15:09:29 +02:00
Vincenzo Frascino 74f1082487 arm64: mte: Add specific SIGSEGV codes
Add MTE-specific SIGSEGV codes to siginfo.h and update the x86
BUILD_BUG_ON(NSIGSEGV != 7) compile check.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
[catalin.marinas@arm.com: renamed precise/imprecise to sync/async]
[catalin.marinas@arm.com: dropped #ifdef __aarch64__, renumbered]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Will Deacon <will@kernel.org>
2020-09-04 12:46:06 +01:00
Nicolin Chen 1e9d90dbed dma-mapping: introduce dma_get_seg_boundary_nr_pages()
We found that callers of dma_get_seg_boundary mostly do an ALIGN
with page mask and then do a page shift to get number of pages:
    ALIGN(boundary + 1, 1 << shift) >> shift

However, the boundary might be as large as ULONG_MAX, which means
that a device has no specific boundary limit. So either "+ 1" or
passing it to ALIGN() would potentially overflow.

According to kernel defines:
    #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
    #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)

We can simplify the logic here into a helper function doing:
  ALIGN(boundary + 1, 1 << shift) >> shift
= ALIGN_MASK(b + 1, (1 << s) - 1) >> s
= {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
= [b + 1 + (1 << s) - 1] >> s
= [b + (1 << s)] >> s
= (b >> s) + 1

This patch introduces and applies dma_get_seg_boundary_nr_pages()
as an overflow-free helper for the dma_get_seg_boundary() callers
to get numbers of pages. It also takes care of the NULL dev case
for non-DMA API callers.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-03 18:12:15 +02:00
Kees Cook 5354e84598 x86/build: Add asserts for unwanted sections
In preparation for warning on orphan sections, enforce other
expected-to-be-zero-sized sections (since discarding them might hide
problems with them suddenly gaining unexpected entries).

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200821194310.3089815-25-keescook@chromium.org
2020-09-01 10:03:18 +02:00
Kees Cook 815d680771 x86/build: Enforce an empty .got.plt section
The .got.plt section should always be zero (or filled only with the
linker-generated lazy dispatch entry). Enforce this with an assert and
mark the section as INFO. This is more sensitive than just blindly
discarding the section.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200821194310.3089815-24-keescook@chromium.org
2020-09-01 10:03:18 +02:00
Peter Zijlstra a945c8345e static_call: Allow early init
In order to use static_call() to wire up x86_pmu, we need to
initialize earlier, specifically before memory allocation works; copy
some of the tricks from jump_label to enable this.

Primarily we overload key->next to store a sites pointer when there
are no modules, this avoids having to use kmalloc() to initialize the
sites and allows us to run much earlier.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20200818135805.220737930@infradead.org
2020-09-01 09:58:06 +02:00
Peter Zijlstra 6c3fce794e static_call: Add some validation
Verify the text we're about to change is as we expect it to be.

Requested-by: Steven Rostedt <rostedt@goodmis.org>

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200818135805.161974981@infradead.org
2020-09-01 09:58:06 +02:00
Peter Zijlstra 5b06fd3bb9 static_call: Handle tail-calls
GCC can turn our static_call(name)(args...) into a tail call, in which
case we get a JMP.d32 into the trampoline (which then does a further
tail-call).

Teach objtool to recognise and mark these in .static_call_sites and
adjust the code patching to deal with this.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20200818135805.101186767@infradead.org
2020-09-01 09:58:06 +02:00
Peter Zijlstra 452cddbff7 static_call: Add static_call_cond()
Extend the static_call infrastructure to optimize the following common
pattern:

	if (func_ptr)
		func_ptr(args...)

For the trampoline (which is in effect a tail-call), we patch the
JMP.d32 into a RET, which then directly consumes the trampoline call.

For the in-line sites we replace the CALL with a NOP5.

NOTE: this is 'obviously' limited to functions with a 'void' return type.

NOTE: DEFINE_STATIC_COND_CALL() only requires a typename, as opposed
      to a full function.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20200818135805.042977182@infradead.org
2020-09-01 09:58:05 +02:00
Peter Zijlstra c43a43e439 x86/alternatives: Teach text_poke_bp() to emulate RET
Future patches will need to poke a RET instruction, provide the
infrastructure required for this.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20200818135804.982214828@infradead.org
2020-09-01 09:58:05 +02:00
Josh Poimboeuf 1e7e478838 x86/static_call: Add inline static call implementation for x86-64
Add the inline static call implementation for x86-64. The generated code
is identical to the out-of-line case, except we move the trampoline into
it's own section.

Objtool uses the trampoline naming convention to detect all the call
sites. It then annotates those call sites in the .static_call_sites
section.

During boot (and module init), the call sites are patched to call
directly into the destination function.  The temporary trampoline is
then no longer used.

[peterz: merged trampolines, put trampoline in section]

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20200818135804.864271425@infradead.org
2020-09-01 09:58:05 +02:00
Josh Poimboeuf e6d6c071f2 x86/static_call: Add out-of-line static call implementation
Add the x86 out-of-line static call implementation.  For each key, a
permanent trampoline is created which is the destination for all static
calls for the given key.  The trampoline has a direct jump which gets
patched by static_call_update() when the destination function changes.

[peterz: fixed trampoline, rewrote patching code]

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20200818135804.804315175@infradead.org
2020-09-01 09:58:05 +02:00
Peter Zijlstra 6333e8f73b static_call: Avoid kprobes on inline static_call()s
Similar to how we disallow kprobes on any other dynamic text
(ftrace/jump_label) also disallow kprobes on inline static_call()s.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200818135804.744920586@infradead.org
2020-09-01 09:58:04 +02:00
Kees Cook c604abc3f6 vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUG
The .comment section doesn't belong in STABS_DEBUG. Split it out into a
new macro named ELF_DETAILS. This will gain other non-debug sections
that need to be accounted for when linking with --orphan-handling=warn.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-arch@vger.kernel.org
Link: https://lore.kernel.org/r/20200821194310.3089815-5-keescook@chromium.org
2020-09-01 09:50:35 +02:00
Linus Torvalds dcc5c6f013 Three interrupt related fixes for X86:
- Move disabling of the local APIC after invoking fixup_irqs() to ensure
    that interrupts which are incoming are noted in the IRR and not ignored.
 
  - Unbreak affinity setting. The rework of the entry code reused the
    regular exception entry code for device interrupts. The vector number is
    pushed into the errorcode slot on the stack which is then lifted into an
    argument and set to -1 because that's regs->orig_ax which is used in
    quite some places to check whether the entry came from a syscall. But it
    was overlooked that orig_ax is used in the affinity cleanup code to
    validate whether the interrupt has arrived on the new target. It turned
    out that this vector check is pointless because interrupts are never
    moved from one vector to another on the same CPU. That check is a
    historical leftover from the time where x86 supported multi-CPU
    affinities, but not longer needed with the now strict single CPU
    affinity. Famous last words ...
 
  - Add a missing check for an empty cpumask into the matrix allocator. The
    affinity change added a warning to catch the case where an interrupt is
    moved on the same CPU to a different vector. This triggers because a
    condition with an empty cpumask returns an assignment from the allocator
    as the allocator uses for_each_cpu() without checking the cpumask for
    being empty. The historical inconsistent for_each_cpu() behaviour of
    ignoring the cpumask and unconditionally claiming that CPU0 is in the
    mask striked again. Sigh.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl9L6WYTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoRV5D/9dRq/4pn5g1esnzm4GhIr2To3Qp6cl
 s7VswTdN8FmWBqVz79ZVYqj663UpL3pPY1np01ctrxRQLeDVfWcI2BMR5irnny8h
 otORhFysuDUl+yfuomWVbzfQQNJ+VeQVeWKD3cIhD1I3sXqDX5Wpa8n086hYKQXx
 eutVC3+JdzJZFm68xarlLW7h2f1au1eZZFgVnyY+J5KO9Dwm63a4RITdDVk7KV4t
 uKEDza5P9SY+kE9LAGNq8BAEObf9FeMXw0mRM7atRKVsJQQGVk6bgiuaRr01w1+W
 hQCPx/3g6PHFnGgx/KQgHf1jgrZFhXOyIDo6ZeFy+SJGIZRB3n8o5Kjns2l8Pa+K
 2qy1TRoZIsGkwGCi/BM6viLzBikbh/gnGYy/8KTEJdKs8P3ZKHUZVSAB1dpapOWX
 4n+rKoVPnvxgRSeZZo+tgLkvUdh+/9Huyr9vHiYjtbbB8tFvjlkOmrZ6sirHByDy
 jg6TjOJVb1CC/PoW4M7JNfmeKvHQnTACwH6djdVGDLPJspuUsYkPI0Uk0CX21SA3
 45Tuylvl9jT6+vq95Av2RbAiipmSpZ/O1NHV8Paf466SKmhUgG3lv5PHh3xTm1U2
 Be/RbJ75x4Muuw42ttU1LcpcLPcOZRQNEREoNd5UysgYYgWRekBvU+ZRQNW4g2nw
 3JDgJgm0iBUN9w==
 =zIi4
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "Three interrupt related fixes for X86:

   - Move disabling of the local APIC after invoking fixup_irqs() to
     ensure that interrupts which are incoming are noted in the IRR and
     not ignored.

   - Unbreak affinity setting.

     The rework of the entry code reused the regular exception entry
     code for device interrupts. The vector number is pushed into the
     errorcode slot on the stack which is then lifted into an argument
     and set to -1 because that's regs->orig_ax which is used in quite
     some places to check whether the entry came from a syscall.

     But it was overlooked that orig_ax is used in the affinity cleanup
     code to validate whether the interrupt has arrived on the new
     target. It turned out that this vector check is pointless because
     interrupts are never moved from one vector to another on the same
     CPU. That check is a historical leftover from the time where x86
     supported multi-CPU affinities, but not longer needed with the now
     strict single CPU affinity. Famous last words ...

   - Add a missing check for an empty cpumask into the matrix allocator.

     The affinity change added a warning to catch the case where an
     interrupt is moved on the same CPU to a different vector. This
     triggers because a condition with an empty cpumask returns an
     assignment from the allocator as the allocator uses for_each_cpu()
     without checking the cpumask for being empty. The historical
     inconsistent for_each_cpu() behaviour of ignoring the cpumask and
     unconditionally claiming that CPU0 is in the mask struck again.
     Sigh.

  plus a new entry into the MAINTAINER file for the HPE/UV platform"

* tag 'x86-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/matrix: Deal with the sillyness of for_each_cpu() on UP
  x86/irq: Unbreak interrupt affinity setting
  x86/hotplug: Silence APIC only after all interrupts are migrated
  MAINTAINERS: Add entry for HPE Superdome Flex (UV) maintainers
2020-08-30 12:01:23 -07:00
Linus Torvalds b69bea8a65 A set of fixes for lockdep, tracing and RCU:
- Prevent recursion by using raw_cpu_* operations
 
   - Fixup the interrupt state in the cpu idle code to be consistent
 
   - Push rcu_idle_enter/exit() invocations deeper into the idle path so
     that the lock operations are inside the RCU watching sections
 
   - Move trace_cpu_idle() into generic code so it's called before RCU goes
     idle.
 
   - Handle raw_local_irq* vs. local_irq* operations correctly
 
   - Move the tracepoints out from under the lockdep recursion handling
     which turned out to be fragile and inconsistent.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl9L5qETHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoV/NEADG+h02tj2I4gP7IQ3nVodEzS1+odPI
 orabY5ggH0kn4YIhPB4UtOd5zKZjr3FJs9wEhyhQpV6ZhvFfgaIKiYqfg+Q81aMO
 /BXrfh6jBD2Hu7gaPBnVdkKeh1ehl+w0PhTeJhPBHEEvbGeLUYWwyPNlaKz//VQl
 XCWl7e7o/Uw2UyJ469SCx3z+M2DMNqwdMys/zcqvTLiBdLNCwp4TW5ACzEA0rfHh
 Pepu3eIKnMURyt82QanrOATvT2io9pOOaUh59zeKi2WM8ikwKd/Eho2kXYng6GvM
 GzX4Kn13MsNobZXf9BhqEGICdRkaJqLsXlmBNmbJdSTCn5W2lLZqu2wCEp5VZHCc
 XwMbey8ek+BRskJMqAV4oq2GA8Om9KEYWOOdixyOG0UJCiW5qDowuDYBXTLV7FWj
 XhzLGuHpUF9eKLKokJ7ideLaDcpzwYjHr58pFLQrqPwmjVKWguLeYMg5BhhTiEuV
 wNfiLIGdMNsCpYKhnce3o9paV8+hy1ZveWhNy+/4HaDLoEwI2T62i8R7xxbrcWMg
 sgdAiQG+kVLwSJ13bN+Cz79uLYTIbqGaZHtOXmeIT3jSxBjx5RlXfzocwTHSYrNk
 GuLYHd7+QaemN49Rrf4bPR16Db7ifL32QkUtLBTBLcnos9jM+fcl+BWyqYRxhgDv
 xzDS+vfK8DvRiA==
 =Hgt6
 -----END PGP SIGNATURE-----

Merge tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Thomas Gleixner:
 "A set of fixes for lockdep, tracing and RCU:

   - Prevent recursion by using raw_cpu_* operations

   - Fixup the interrupt state in the cpu idle code to be consistent

   - Push rcu_idle_enter/exit() invocations deeper into the idle path so
     that the lock operations are inside the RCU watching sections

   - Move trace_cpu_idle() into generic code so it's called before RCU
     goes idle.

   - Handle raw_local_irq* vs. local_irq* operations correctly

   - Move the tracepoints out from under the lockdep recursion handling
     which turned out to be fragile and inconsistent"

* tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lockdep,trace: Expose tracepoints
  lockdep: Only trace IRQ edges
  mips: Implement arch_irqs_disabled()
  arm64: Implement arch_irqs_disabled()
  nds32: Implement arch_irqs_disabled()
  locking/lockdep: Cleanup
  x86/entry: Remove unused THUNKs
  cpuidle: Move trace_cpu_idle() into generic code
  cpuidle: Make CPUIDLE_FLAG_TLB_FLUSHED generic
  sched,idle,rcu: Push rcu_idle deeper into the idle path
  cpuidle: Fixup IRQ state
  lockdep: Use raw_cpu_*() for per-cpu variables
2020-08-30 11:43:50 -07:00
Wang Hai e33ab20648 x86/mpparse: Remove duplicate io_apic.h include
Remove asm/io_apic.h which is included more than once.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Link: https://lkml.kernel.org/r/20200819112910.7629-1-wanghai38@huawei.com
2020-08-27 12:00:28 +02:00
Thomas Gleixner e027fffff7 x86/irq: Unbreak interrupt affinity setting
Several people reported that 5.8 broke the interrupt affinity setting
mechanism.

The consolidation of the entry code reused the regular exception entry code
for device interrupts and changed the way how the vector number is conveyed
from ptregs->orig_ax to a function argument.

The low level entry uses the hardware error code slot to push the vector
number onto the stack which is retrieved from there into a function
argument and the slot on stack is set to -1.

The reason for setting it to -1 is that the error code slot is at the
position where pt_regs::orig_ax is. A positive value in pt_regs::orig_ax
indicates that the entry came via a syscall. If it's not set to a negative
value then a signal delivery on return to userspace would try to restart a
syscall. But there are other places which rely on pt_regs::orig_ax being a
valid indicator for syscall entry.

But setting pt_regs::orig_ax to -1 has a nasty side effect vs. the
interrupt affinity setting mechanism, which was overlooked when this change
was made.

Moving interrupts on x86 happens in several steps. A new vector on a
different CPU is allocated and the relevant interrupt source is
reprogrammed to that. But that's racy and there might be an interrupt
already in flight to the old vector. So the old vector is preserved until
the first interrupt arrives on the new vector and the new target CPU. Once
that happens the old vector is cleaned up, but this cleanup still depends
on the vector number being stored in pt_regs::orig_ax, which is now -1.

That -1 makes the check for cleanup: pt_regs::orig_ax == new_vector
always false. As a consequence the interrupt is moved once, but then it
cannot be moved anymore because the cleanup of the old vector never
happens.

There would be several ways to convey the vector information to that place
in the guts of the interrupt handling, but on deeper inspection it turned
out that this check is pointless and a leftover from the old affinity model
of X86 which supported multi-CPU affinities. Under this model it was
possible that an interrupt had an old and a new vector on the same CPU, so
the vector match was required.

Under the new model the effective affinity of an interrupt is always a
single CPU from the requested affinity mask. If the affinity mask changes
then either the interrupt stays on the CPU and on the same vector when that
CPU is still in the new affinity mask or it is moved to a different CPU, but
it is never moved to a different vector on the same CPU.

Ergo the cleanup check for the matching vector number is not required and
can be removed which makes the dependency on pt_regs:orig_ax go away.

The remaining check for new_cpu == smp_processsor_id() is completely
sufficient. If it matches then the interrupt was successfully migrated and
the cleanup can proceed.

For paranoia sake add a warning into the vector assignment code to
validate that the assumption of never moving to a different vector on
the same CPU holds.

Fixes: 633260fa14 ("x86/irq: Convey vector as argument and not in ptregs")
Reported-by: Alex bykov <alex.bykov@scylladb.com>
Reported-by: Avi Kivity <avi@scylladb.com>
Reported-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Graf <graf@amazon.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87wo1ltaxz.fsf@nanos.tec.linutronix.de
2020-08-27 09:29:23 +02:00
Ashok Raj 52d6b926aa x86/hotplug: Silence APIC only after all interrupts are migrated
There is a race when taking a CPU offline. Current code looks like this:

native_cpu_disable()
{
	...
	apic_soft_disable();
	/*
	 * Any existing set bits for pending interrupt to
	 * this CPU are preserved and will be sent via IPI
	 * to another CPU by fixup_irqs().
	 */
	cpu_disable_common();
	{
		....
		/*
		 * Race window happens here. Once local APIC has been
		 * disabled any new interrupts from the device to
		 * the old CPU are lost
		 */
		fixup_irqs(); // Too late to capture anything in IRR.
		...
	}
}

The fix is to disable the APIC *after* cpu_disable_common().

Testing was done with a USB NIC that provided a source of frequent
interrupts. A script migrated interrupts to a specific CPU and
then took that CPU offline.

Fixes: 60dcaad573 ("x86/hotplug: Silence APIC and NMI when CPU is dead")
Reported-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Tested-by: Evan Green <evgreen@chromium.org>
Reviewed-by: Evan Green <evgreen@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/lkml/875zdarr4h.fsf@nanos.tec.linutronix.de/
Link: https://lore.kernel.org/r/1598501530-45821-1-git-send-email-ashok.raj@intel.com
2020-08-27 09:29:23 +02:00
Tony Luck 1e36d9c688 x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check()
A long time ago, Linux cleared IA32_MCG_STATUS at the very end of machine
check processing.

Then, some fancy recovery and IST manipulation was added in:

  d4812e169d ("x86, mce: Get rid of TIF_MCE_NOTIFY and associated mce tricks")

and clearing IA32_MCG_STATUS was pulled earlier in the function.

Next change moved the actual recovery out of do_machine_check() and
just used task_work_add() to schedule it later (before returning to the
user):

  5567d11c21 ("x86/mce: Send #MC singal from task work")

Most recently the fancy IST footwork was removed as no longer needed:

  b052df3da8 ("x86/entry: Get rid of ist_begin/end_non_atomic()")

At this point there is no reason remaining to clear IA32_MCG_STATUS early.
It can move back to the very end of the function.

Also move sync_core(). The comments for this function say that it should
only be called when instructions have been changed/re-mapped. Recovery
for an instruction fetch may change the physical address. But that
doesn't happen until the scheduled work runs (which could be on another
CPU).

 [ bp: Massage commit message. ]

Reported-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200824221237.5397-1-tony.luck@intel.com
2020-08-26 18:40:18 +02:00
Fenghua Yu 29b6bd41ee x86/resctrl: Enable user to view thread or core throttling mode
Early Intel hardware implementations of Memory Bandwidth Allocation (MBA)
could only control bandwidth at the processor core level. This meant that
when two processes with different bandwidth allocations ran simultaneously
on the same core the hardware had to resolve this difference. It did so by
applying the higher throttling value (lower bandwidth) to both processes.

Newer implementations can apply different throttling values to each
thread on a core.

Introduce a new resctrl file, "thread_throttle_mode", on Intel systems
that shows to the user how throttling values are allocated, per-core or
per-thread.

On systems that support per-core throttling, the file will display "max".
On newer systems that support per-thread throttling, the file will display
"per-thread".

AMD confirmed in [1] that AMD bandwidth allocation is already at thread
level but that the AMD implementation does not use a memory delay
throttle mode. So to avoid confusion the thread throttling mode would be
UNDEFINED on AMD systems and the "thread_throttle_mode" file will not be
visible.

Originally-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/1598296281-127595-3-git-send-email-fenghua.yu@intel.com
Link: [1] https://lore.kernel.org/lkml/18d277fd-6523-319c-d560-66b63ff606b8@amd.com
2020-08-26 17:53:22 +02:00
Fenghua Yu e48cb1a3fb x86/resctrl: Enumerate per-thread MBA controls
Some systems support per-thread Memory Bandwidth Allocation (MBA) which
applies a throttling delay value to each hardware thread instead of to
a core. Per-thread MBA is enumerated by CPUID.

No feature flag is shown in /proc/cpuinfo. User applications need to
check a resctrl throttling mode info file to know if the feature is
supported.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/1598296281-127595-2-git-send-email-fenghua.yu@intel.com
2020-08-26 17:46:12 +02:00
Peter Zijlstra 9864f5b594 cpuidle: Move trace_cpu_idle() into generic code
Remove trace_cpu_idle() from the arch_cpu_idle() implementations and
put it in the generic code, right before disabling RCU. Gets rid of
more trace_*_rcuidle() users.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Marco Elver <elver@google.com>
Link: https://lkml.kernel.org/r/20200821085348.428433395@infradead.org
2020-08-26 12:41:54 +02:00
Borislav Petkov 5f1dd4dda5 x86/fsgsbase: Replace static_cpu_has() with boot_cpu_has()
ptrace and prctl() are not really fast paths to warrant the use of
static_cpu_has() and cause alternatives patching for no good reason.
Replace with boot_cpu_has() which is simple and fast enough.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200818103715.32736-1-bp@alien8.de
2020-08-24 18:18:32 +02:00
Gustavo A. R. Silva df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Chris Down c31feed846 x86/msr: Make source of unrecognised MSR writes unambiguous
In many cases, task_struct.comm isn't enough to distinguish the
offender, since for interpreted languages it's likely just going to be
"python3" or whatever. Add the pid to make it unambiguous.

 [ bp: Make the printk string a single line for easier grepping. ]

Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/6f6fbd0ee6c99bc5e47910db700a6642159db01b.1598011595.git.chris@chrisdown.name
2020-08-22 11:40:38 +02:00
Chris Down 1f35c9c0ce x86/msr: Prevent userspace MSR access from dominating the console
Applications which manipulate MSRs from userspace often do so
infrequently, and all at once. As such, the default printk ratelimit
architecture supplied by pr_err_ratelimited() doesn't do enough to prevent
kmsg becoming completely overwhelmed with their messages and pushing
other salient information out of the circular buffer.

In one case, I saw over 80% of kmsg being filled with these messages,
and the default kmsg buffer being completely filled less than 5 minutes
after boot(!).

Make things much less aggressive, while still achieving the original
goal of fiter_write(). Operators will still get warnings that MSRs are
being manipulated from userspace, but they won't have other also
potentially useful messages pushed out of the kmsg buffer.

Of course, one can boot with `allow_writes=1` to avoid these messages at
all, but that then has the downfall that one doesn't get _any_
notification at all about these problems in the first place, and so is
much less likely to forget to fix it.

One might rather it was less binary: it was still logged, just less
often, so that application developers _do_ have the incentive to improve
their current methods, without the kernel having to push other useful
stuff out of the kmsg buffer.

This one example isn't the point, of course: I'm sure there are plenty
of other non-ideal-but-pragmatic cases where people are writing to MSRs
from userspace right now, and it will take time for those people to find
other solutions.

Overall, keep the intent of the original patch, while mitigating its
sometimes heavy effects on kmsg composition.

 [ bp: Massage a bit. ]

Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/563994ef132ce6cffd28fc659254ca37d032b5ef.1598011595.git.chris@chrisdown.name
2020-08-22 11:27:40 +02:00
Brendan Shanks b91e7089ae x86/umip: Add emulation/spoofing for SLDT and STR instructions
Add emulation/spoofing of SLDT and STR for both 32- and 64-bit
processes.

Wine users have found a small number of Windows apps using SLDT that
were crashing when run on UMIP-enabled systems.

Originally-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Reported-by: Andreas Rammhold <andi@notmuch.email>
Signed-off-by: Brendan Shanks <bshanks@codeweavers.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Tested-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Link: https://lkml.kernel.org/r/20200710224525.21966-1-bshanks@codeweavers.com
2020-08-20 19:10:26 +02:00
Christian Brauner c723523bf3
x86: switch to kernel_clone()
The old _do_fork() helper is removed in favor of the new kernel_clone() helper.
The latter adheres to naming conventions for kernel internal syscall helpers.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20200819104655.436656-8-christian.brauner@ubuntu.com
2020-08-20 13:12:58 +02:00
Yazen Ghannam 368d188720 x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type
was intended to be used by the kernel to filter out invalid error codes
on a system. However, this is unnecessary after a few product releases
because the hardware will only report valid error codes. Thus, there's
no need for it with future systems.

Remove the xec_bitmap field and all references to it.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200720145353.43924-1-Yazen.Ghannam@amd.com
2020-08-20 10:34:38 +02:00
James Morse 709c436272 cacheinfo: Move resctrl's get_cache_id() to the cacheinfo header file
resctrl/core.c defines get_cache_id() for use in its cpu-hotplug
callbacks. This gets the id attribute of the cache at the corresponding
level of a CPU.

Later rework means this private function needs to be shared. Move
it to the header file.

The name conflicts with a different definition in intel_cacheinfo.c,
name it get_cpu_cacheinfo_id() to show its relation with
get_cpu_cacheinfo().

Now this is visible on other architectures, check the id attribute
has actually been set.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20200708163929.2783-11-james.morse@arm.com
2020-08-19 11:04:23 +02:00
James Morse 316e7f901f x86/resctrl: Add struct rdt_cache::arch_has_{sparse, empty}_bitmaps
Intel CPUs expect the cache bitmap provided by user-space to have on a
single span of 1s, whereas AMD can support bitmaps like 0xf00f. Arm's
MPAM support also allows sparse bitmaps.

Similarly, Intel CPUs check at least one bit set, whereas AMD CPUs are
quite happy with an empty bitmap. Arm's MPAM allows an empty bitmap.

To move resctrl out to /fs/, platform differences like this need to be
explained.

Add two resource properties arch_has_{empty,sparse}_bitmaps. Test these
around the relevant parts of cbm_validate().

Merging the validate calls causes AMD to gain the min_cbm_bits test
needed for Haswell, but as it always sets this value to 1, it will never
match.

 [ bp: Massage commit message. ]

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20200708163929.2783-10-james.morse@arm.com
2020-08-19 10:41:40 +02:00
James Morse 5df3ca9334 x86/resctrl: Merge AMD/Intel parse_bw() calls
Now after arch_needs_linear has been added, the parse_bw() calls are
almost the same between AMD and Intel.

The difference is '!is_mba_sc()', which is not checked on AMD. This
will always be true on AMD CPUs as mba_sc cannot be enabled as
is_mba_linear() is false.

Removing this duplication means user-space visible behaviour and
error messages are not validated or generated in different places.

Reviewed-by : Babu Moger <babu.moger@amd.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20200708163929.2783-9-james.morse@arm.com
2020-08-19 09:38:57 +02:00
James Morse 41215b7947 x86/resctrl: Add struct rdt_membw::arch_needs_linear to explain AMD/Intel MBA difference
The configuration values user-space provides to the resctrl filesystem
are ABI. To make this work on another architecture, all the ABI bits
should be moved out of /arch/x86 and under /fs.

To do this, the differences between AMD and Intel CPUs needs to be
explained to resctrl via resource properties, instead of function
pointers that let the arch code accept subtly different values on
different platforms/architectures.

For MBA, Intel CPUs reject configuration attempts for non-linear
resources, whereas AMD ignore this field as its MBA resource is never
linear. To merge the parse/validate functions, this difference needs to
be explained.

Add struct rdt_membw::arch_needs_linear to indicate the arch code needs
the linear property to be true to configure this resource. AMD can set
this and delay_linear to false. Intel can set arch_needs_linear to
true to keep the existing "No support for non-linear MB domains" error
message for affected platforms.

 [ bp: convert "we" etc to passive voice. ]

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Link: https://lkml.kernel.org/r/20200708163929.2783-8-james.morse@arm.com
2020-08-19 09:34:51 +02:00
James Morse e6b2fac36f x86/resctrl: Use is_closid_match() in more places
rdtgroup_tasks_assigned() and show_rdt_tasks() loop over threads testing
for a CTRL/MON group match by closid/rmid with the provided rdtgrp.
Further down the file are helpers to do this, move these further up and
make use of them here.

These helpers additionally check for alloc/mon capable. This is harmless
as rdtgroup_mkdir() tests these capable flags before allowing the config
directories to be created.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20200708163929.2783-7-james.morse@arm.com
2020-08-19 09:08:36 +02:00
James Morse f995801ba3 x86/resctrl: Use container_of() in delayed_work handlers
mbm_handle_overflow() and cqm_handle_limbo() are both provided with
the domain's work_struct when called, but use get_domain_from_cpu()
to find the domain, along with the appropriate error handling.

container_of() saves some list walking and bitmap testing, use that
instead.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20200708163929.2783-5-james.morse@arm.com
2020-08-18 17:05:08 +02:00
James Morse ae0fbedd2a x86/resctrl: Fix stale comment
The comment in rdtgroup_init() refers to the non existent function
rdt_mount(), which has now been renamed rdt_get_tree(). Fix the
comment.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20200708163929.2783-4-james.morse@arm.com
2020-08-18 17:02:24 +02:00
James Morse e89f85b917 x86/resctrl: Remove struct rdt_membw::max_delay
max_delay is used by x86's __get_mem_config_intel() as a local variable.
Remove it, replacing it with a local variable.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20200708163929.2783-3-james.morse@arm.com
2020-08-18 17:01:23 +02:00
James Morse abe8f12b44 x86/resctrl: Remove unused struct mbm_state::chunks_bw
Nothing reads struct mbm_states's chunks_bw value, its a copy of
chunks. Remove it.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20200708163929.2783-2-james.morse@arm.com
2020-08-18 16:51:55 +02:00
Linus Torvalds 50f6c7dbd9 Misc fixes and small updates all around the place:
- Fix mitigation state sysfs output
  - Fix an FPU xstate/sxave code assumption bug triggered by Architectural LBR support
  - Fix Lightning Mountain SoC TSC frequency enumeration bug
  - Fix kexec debug output
  - Fix kexec memory range assumption bug
  - Fix a boundary condition in the crash kernel code
 
  - Optimize porgatory.ro generation a bit
  - Enable ACRN guests to use X2APIC mode
  - Reduce a __text_poke() IRQs-off critical section for the benefit of PREEMPT_RT
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl83ybgRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iJnQ/+OAkE5hiQ+F1ikQ4rKyjaT6FjvynReNUA
 ysQjcCypGB4x+slR8o3k5yrzYJ9WbDfOz7a0uekZtNHvJ80+3yheV5Yvf+Uz3EYM
 Jj/OubCNMNnvS5cJMNXs196SGd/ELLWBbCjwUWPsiWJ0ZMTgKmpZz1LgB1QZjhyw
 fbAc1WgTLVO+emE5FwBrmFzvgBxn5EtiFoLhegFtACHadNcJLiKpXpiK3NKkEirO
 owF1/Qg6mn6MowKDBDkWgmwi0HVYbraqu0hXRrCq9o105CVwgwUdORTwjK3rnUNs
 et10Zz2UmSpjXJOhKZdZLFCtYOmrADmS4pnoXF6W6cLLFvkq4b2ducnlFBtNKqMh
 ljPkIT04sF99gIKijEYWsru+MgS4qO1VNHtJxkr/ZCUjqahsa1nN9F0lP0QOXjwf
 hbK4h1NrML3UiCGAe2hjIh9zY2c8s2Q90PyCvZkKNKquSQ1E011hzcEE2RIoBBYB
 mc1d6lgfCFWVkbgRA5sx1CVtgnAvHk2wu9w/8N9XTGjPgiQJRr3I8cNUZw59gaMH
 43auWyvpVAA4vdfbKJrPVrTLhTTnQYv0A966l7/i0d8MkGN4u09sAiB3ZevZMEK9
 45b7IXWluCi0ikBAmCvQ+qEzhg7pApCziVKuaZ/4j+qPLTDAutGwz7YuaXyOKrUX
 Aj/uCev6D6c=
 =fvpv
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "Misc fixes and small updates all around the place:

   - Fix mitigation state sysfs output

   - Fix an FPU xstate/sxave code assumption bug triggered by
     Architectural LBR support

   - Fix Lightning Mountain SoC TSC frequency enumeration bug

   - Fix kexec debug output

   - Fix kexec memory range assumption bug

   - Fix a boundary condition in the crash kernel code

   - Optimize porgatory.ro generation a bit

   - Enable ACRN guests to use X2APIC mode

   - Reduce a __text_poke() IRQs-off critical section for the benefit of
     PREEMPT_RT"

* tag 'x86-urgent-2020-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/alternatives: Acquire pte lock with interrupts enabled
  x86/bugs/multihit: Fix mitigation reporting when VMX is not in use
  x86/fpu/xstate: Fix an xstate size check warning with architectural LBRs
  x86/purgatory: Don't generate debug info for purgatory.ro
  x86/tsr: Fix tsc frequency enumeration bug on Lightning Mountain SoC
  kexec_file: Correctly output debugging information for the PT_LOAD ELF header
  kexec: Improve & fix crash_exclude_mem_range() to handle overlapping ranges
  x86/crash: Correct the address boundary of function parameters
  x86/acrn: Remove redundant chars from ACRN signature
  x86/acrn: Allow ACRN guest to use X2APIC mode
2020-08-15 10:38:03 -07:00
Juergen Gross e1ac3e66d3 x86/paravirt: Remove set_pte_at() pv-op
On x86 set_pte_at() is now always falling back to set_pte(). So instead
of having this fallback after the paravirt maze just drop the
set_pte_at paravirt operation and let set_pte_at() use the set_pte()
function directly.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200815100641.26362-6-jgross@suse.com
2020-08-15 13:52:12 +02:00
Juergen Gross 0cabf99149 x86/paravirt: Remove 32-bit support from CONFIG_PARAVIRT_XXL
The last 32-bit user of stuff under CONFIG_PARAVIRT_XXL is gone.

Remove 32-bit specific parts.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200815100641.26362-2-jgross@suse.com
2020-08-15 13:52:11 +02:00
Linus Torvalds 0520058d05 xen: branch for v5.9-rc1b
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCXzaSXAAKCRCAXGG7T9hj
 vuSEAP4qOIv7Hr1wMJfTsN7ZoNNr/K6ph8ADcjFm9RGikn8MawD8CU/OfcFKJFwl
 UVwM1HPnRG6pvCI9bmHS4WYrIBYBVw0=
 =Bi6R
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.9-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull more xen updates from Juergen Gross:

 - Remove support for running as 32-bit Xen PV-guest.

   32-bit PV guests are rarely used, are lacking security fixes for
   Meltdown, and can be easily replaced by PVH mode. Another series for
   doing more cleanup will follow soon (removal of 32-bit-only pvops
   functionality).

 - Fixes and additional features for the Xen display frontend driver.

* tag 'for-linus-5.9-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  drm/xen-front: Pass dumb buffer data offset to the backend
  xen: Sync up with the canonical protocol definition in Xen
  drm/xen-front: Add YUYV to supported formats
  drm/xen-front: Fix misused IS_ERR_OR_NULL checks
  xen/gntdev: Fix dmabuf import with non-zero sgt offset
  x86/xen: drop tests for highmem in pv code
  x86/xen: eliminate xen-asm_64.S
  x86/xen: remove 32-bit Xen PV guest support
2020-08-14 13:34:37 -07:00
Linus Torvalds cd94257d7a hyperv-fixes for 5.9-rc
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAl82Y6cTHHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXlcxCACP21ZI7RZvQBcFtTj5MWa0uwoofFqF
 JDG0MvZ5zFKIJFX0pwlZIUrZY5aVJ1NwCDgCI0EXbZEazTaNCD2knFPqrLe3WUFY
 mSDF9df7oW9UvTe9L4g3rYAdqsrkbgqhBypm9Vpbcazg/Ki6QVCgAhIo1lbq62+m
 J2/0kLO1lVY6opr6vyobaWbm/Y4b0fbrx7N6KwUDhZUYGLGKaOc+WvsZinNl4XW6
 VPiEVQUApvVxwG43rLNXjPe83DtassJ2GevSS1whXnZ+K0bViWhyYicbqEl9iV1i
 nlNIkEMX5A1rdwV1zEAGyY/zWi+fi2+IdKGGEbtyUsely1vHtZuaDCiQ
 =DE2Y
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyper-v fixes from Wei Liu:

 - fix oops reporting on Hyper-V

 - make objtool happy

* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  x86/hyperv: Make hv_setup_sched_clock inline
  Drivers: hv: vmbus: Only notify Hyper-V for die events that are oops
2020-08-14 13:31:25 -07:00
Eric Dumazet 8ab49526b5 x86/fsgsbase/64: Fix NULL deref in 86_fsgsbase_read_task
syzbot found its way in 86_fsgsbase_read_task() and triggered this oops:

   KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
   CPU: 0 PID: 6866 Comm: syz-executor262 Not tainted 5.8.0-syzkaller #0
   RIP: 0010:x86_fsgsbase_read_task+0x16d/0x310 arch/x86/kernel/process_64.c:393
   Call Trace:
     putreg32+0x3ab/0x530 arch/x86/kernel/ptrace.c:876
     genregs32_set arch/x86/kernel/ptrace.c:1026 [inline]
     genregs32_set+0xa4/0x100 arch/x86/kernel/ptrace.c:1006
     copy_regset_from_user include/linux/regset.h:326 [inline]
     ia32_arch_ptrace arch/x86/kernel/ptrace.c:1061 [inline]
     compat_arch_ptrace+0x36c/0xd90 arch/x86/kernel/ptrace.c:1198
     __do_compat_sys_ptrace kernel/ptrace.c:1420 [inline]
     __se_compat_sys_ptrace kernel/ptrace.c:1389 [inline]
     __ia32_compat_sys_ptrace+0x220/0x2f0 kernel/ptrace.c:1389
     do_syscall_32_irqs_on arch/x86/entry/common.c:84 [inline]
     __do_fast_syscall_32+0x57/0x80 arch/x86/entry/common.c:126
     do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:149
     entry_SYSENTER_compat_after_hwframe+0x4d/0x5c

This can happen if ptrace() or sigreturn() pokes an LDT selector into FS
or GS for a task with no LDT and something tries to read the base before
a return to usermode notices the bad selector and fixes it.

The fix is to make sure ldt pointer is not NULL.

Fixes: 07e1d88ada ("x86/fsgsbase/64: Fix ptrace() to read the FS/GS base accurately")
Co-developed-by: Jann Horn <jannh@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-14 13:30:18 -07:00
Sebastian Andrzej Siewior a6d996cbd3 x86/alternatives: Acquire pte lock with interrupts enabled
pte lock is never acquired in-IRQ context so it does not require interrupts
to be disabled. The lock is a regular spinlock which cannot be acquired
with interrupts disabled on RT.

RT complains about pte_lock() in __text_poke() because it's invoked after
disabling interrupts.

__text_poke() has to disable interrupts as use_temporary_mm() expects
interrupts to be off because it invokes switch_mm_irqs_off() and uses
per-CPU (current active mm) data.

Move the PTE lock handling outside the interrupt disabled region.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by; Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20200813105026.bvugytmsso6muljw@linutronix.de
2020-08-13 14:11:54 +02:00
Michael Kelley b9d8cf2eb3 x86/hyperv: Make hv_setup_sched_clock inline
Make hv_setup_sched_clock inline so the reference to pv_ops works
correctly with objtool updates to detect noinstr violations.
See https://lore.kernel.org/patchwork/patch/1283635/

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/1597022991-24088-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2020-08-11 10:41:15 +00:00
Juergen Gross a13f2ef168 x86/xen: remove 32-bit Xen PV guest support
Xen is requiring 64-bit machines today and since Xen 4.14 it can be
built without 32-bit PV guest support. There is no need to carry the
burden of 32-bit PV guest support in the kernel any longer, as new
guests can be either HVM or PVH, or they can use a 64 bit kernel.

Remove the 32-bit Xen PV support from the kernel.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2020-08-11 08:26:48 +02:00
Linus Torvalds 97d052ea3f A set of locking fixes and updates:
- Untangle the header spaghetti which causes build failures in various
     situations caused by the lockdep additions to seqcount to validate that
     the write side critical sections are non-preemptible.
 
   - The seqcount associated lock debug addons which were blocked by the
     above fallout.
 
     seqcount writers contrary to seqlock writers must be externally
     serialized, which usually happens via locking - except for strict per
     CPU seqcounts. As the lock is not part of the seqcount, lockdep cannot
     validate that the lock is held.
 
     This new debug mechanism adds the concept of associated locks.
     sequence count has now lock type variants and corresponding
     initializers which take a pointer to the associated lock used for
     writer serialization. If lockdep is enabled the pointer is stored and
     write_seqcount_begin() has a lockdep assertion to validate that the
     lock is held.
 
     Aside of the type and the initializer no other code changes are
     required at the seqcount usage sites. The rest of the seqcount API is
     unchanged and determines the type at compile time with the help of
     _Generic which is possible now that the minimal GCC version has been
     moved up.
 
     Adding this lockdep coverage unearthed a handful of seqcount bugs which
     have been addressed already independent of this.
 
     While generaly useful this comes with a Trojan Horse twist: On RT
     kernels the write side critical section can become preemtible if the
     writers are serialized by an associated lock, which leads to the well
     known reader preempts writer livelock. RT prevents this by storing the
     associated lock pointer independent of lockdep in the seqcount and
     changing the reader side to block on the lock when a reader detects
     that a writer is in the write side critical section.
 
  - Conversion of seqcount usage sites to associated types and initializers.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8xmPYTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTuQEACyzQCjU8PgehPp9oMqWzaX2fcVyuZO
 QU2yw6gmz2oTz3ZHUNwdW8UnzGh2OWosK3kDruoD9FtSS51lER1/ISfSPCGfyqxC
 KTjOcB1Kvxwq/3LcCx7Zi3ZxWApat74qs3EhYhKtEiQ2Y9xv9rLq8VV1UWAwyxq0
 eHpjlIJ6b6rbt+ARslaB7drnccOsdK+W/roNj4kfyt+gezjBfojGRdMGQNMFcpnv
 shuTC+vYurAVIiVA/0IuizgHfwZiXOtVpjVoEWaxg6bBH6HNuYMYzdSa/YrlDkZs
 n/aBI/Xkvx+Eacu8b1Zwmbzs5EnikUK/2dMqbzXKUZK61eV4hX5c2xrnr1yGWKTs
 F/juh69Squ7X6VZyKVgJ9RIccVueqwR2EprXWgH3+RMice5kjnXH4zURp0GHALxa
 DFPfB6fawcH3Ps87kcRFvjgm6FBo0hJ1AxmsW1dY4ACFB9azFa2euW+AARDzHOy2
 VRsUdhL9CGwtPjXcZ/9Rhej6fZLGBXKr8uq5QiMuvttp4b6+j9FEfBgD4S6h8csl
 AT2c2I9LcbWqyUM9P4S7zY/YgOZw88vHRuDH7tEBdIeoiHfrbSBU7EQ9jlAKq/59
 f+Htu2Io281c005g7DEeuCYvpzSYnJnAitj5Lmp/kzk2Wn3utY1uIAVszqwf95Ul
 81ppn2KlvzUK8g==
 =7Gj+
 -----END PGP SIGNATURE-----

Merge tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Thomas Gleixner:
 "A set of locking fixes and updates:

   - Untangle the header spaghetti which causes build failures in
     various situations caused by the lockdep additions to seqcount to
     validate that the write side critical sections are non-preemptible.

   - The seqcount associated lock debug addons which were blocked by the
     above fallout.

     seqcount writers contrary to seqlock writers must be externally
     serialized, which usually happens via locking - except for strict
     per CPU seqcounts. As the lock is not part of the seqcount, lockdep
     cannot validate that the lock is held.

     This new debug mechanism adds the concept of associated locks.
     sequence count has now lock type variants and corresponding
     initializers which take a pointer to the associated lock used for
     writer serialization. If lockdep is enabled the pointer is stored
     and write_seqcount_begin() has a lockdep assertion to validate that
     the lock is held.

     Aside of the type and the initializer no other code changes are
     required at the seqcount usage sites. The rest of the seqcount API
     is unchanged and determines the type at compile time with the help
     of _Generic which is possible now that the minimal GCC version has
     been moved up.

     Adding this lockdep coverage unearthed a handful of seqcount bugs
     which have been addressed already independent of this.

     While generally useful this comes with a Trojan Horse twist: On RT
     kernels the write side critical section can become preemtible if
     the writers are serialized by an associated lock, which leads to
     the well known reader preempts writer livelock. RT prevents this by
     storing the associated lock pointer independent of lockdep in the
     seqcount and changing the reader side to block on the lock when a
     reader detects that a writer is in the write side critical section.

   - Conversion of seqcount usage sites to associated types and
     initializers"

* tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  locking/seqlock, headers: Untangle the spaghetti monster
  locking, arch/ia64: Reduce <asm/smp.h> header dependencies by moving XTP bits into the new <asm/xtp.h> header
  x86/headers: Remove APIC headers from <asm/smp.h>
  seqcount: More consistent seqprop names
  seqcount: Compress SEQCNT_LOCKNAME_ZERO()
  seqlock: Fold seqcount_LOCKNAME_init() definition
  seqlock: Fold seqcount_LOCKNAME_t definition
  seqlock: s/__SEQ_LOCKDEP/__SEQ_LOCK/g
  hrtimer: Use sequence counter with associated raw spinlock
  kvm/eventfd: Use sequence counter with associated spinlock
  userfaultfd: Use sequence counter with associated spinlock
  NFSv4: Use sequence counter with associated spinlock
  iocost: Use sequence counter with associated spinlock
  raid5: Use sequence counter with associated spinlock
  vfs: Use sequence counter with associated spinlock
  timekeeping: Use sequence counter with associated raw spinlock
  xfrm: policy: Use sequence counters with associated lock
  netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
  netfilter: conntrack: Use sequence counter with associated spinlock
  sched: tasks: Use sequence counter with associated spinlock
  ...
2020-08-10 19:07:44 -07:00
Linus Torvalds fc80c51fd4 Kbuild updates for v5.9
- run the checker (e.g. sparse) after the compiler
 
  - remove unneeded cc-option tests for old compiler flags
 
  - fix tar-pkg to install dtbs
 
  - introduce ccflags-remove-y and asflags-remove-y syntax
 
  - allow to trace functions in sub-directories of lib/
 
  - introduce hostprogs-always-y and userprogs-always-y syntax
 
  - various Makefile cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl8wJXEVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGMGEP/0jDq/WafbfPN0aU83EqEWLt/sKg
 bluzmf/6HGx3XVRnuAzsHNNqysUx77WJiDsU/jbC/zdH8Iox3Sc1diE2sELLNAfY
 iJmQ8NBPggyU74aYG3OJdpDjz8T9EX/nVaYrjyFlbuXElM+Qvo8Z4Fz6NpWqKWlA
 gU+yGxEPPdX6MLHcSPSIu1hGWx7UT4fgfx3zDFTI2qvbQgQjKtzyTjAH5Cm3o87h
 rfomvHSSoAUg+Fh1LediRh1tJlkdVO+w7c+LNwCswmdBtkZuxecj1bQGUTS8GaLl
 CCWOKYfWp0KsVf1veXNNNaX/ecbp+Y34WErFq3V9Fdq5RmVlp+FPSGMyjDMRiQ/p
 LGvzbJLPpG586MnK8of0dOj6Es6tVPuq6WH2HuvsyTGcZJDpFTTxRcK3HDkE8ig6
 ZtuM3owB/Mep8IzwY2yWQiDrc7TX5Fz8S4hzGPU1zG9cfj4VT6TBqHGAy1Eql/0l
 txj6vJpnbQSdXiIX8MIU3yH35Y7eW3JYWgspTZH5Woj1S/wAWwuG93Fuuxq6mQIJ
 q6LSkMavtOfuCjOA9vJBZewpKXRU6yo0CzWNL/5EZ6z/r/I+DGtfb/qka8oYUDjX
 9H0cecL37AQxDHRPTxCZDQF0TpYiFJ6bmnMftK9NKNuIdvsk9DF7UBa3EdUNIj38
 yKS3rI7Lw55xWuY3
 =bkNQ
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - run the checker (e.g. sparse) after the compiler

 - remove unneeded cc-option tests for old compiler flags

 - fix tar-pkg to install dtbs

 - introduce ccflags-remove-y and asflags-remove-y syntax

 - allow to trace functions in sub-directories of lib/

 - introduce hostprogs-always-y and userprogs-always-y syntax

 - various Makefile cleanups

* tag 'kbuild-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: stop filtering out $(GCC_PLUGINS_CFLAGS) from cc-option base
  kbuild: include scripts/Makefile.* only when relevant CONFIG is enabled
  kbuild: introduce hostprogs-always-y and userprogs-always-y
  kbuild: sort hostprogs before passing it to ifneq
  kbuild: move host .so build rules to scripts/gcc-plugins/Makefile
  kbuild: Replace HTTP links with HTTPS ones
  kbuild: trace functions in subdirectories of lib/
  kbuild: introduce ccflags-remove-y and asflags-remove-y
  kbuild: do not export LDFLAGS_vmlinux
  kbuild: always create directories of targets
  powerpc/boot: add DTB to 'targets'
  kbuild: buildtar: add dtbs support
  kbuild: remove cc-option test of -ffreestanding
  kbuild: remove cc-option test of -fno-stack-protector
  Revert "kbuild: Create directory for target DTB"
  kbuild: run the checker after the compiler
2020-08-09 14:10:26 -07:00
Linus Torvalds 32663c78c1 Tracing updates for 5.9
- The biggest news in that the tracing ring buffer can now time events that
    interrupted other ring buffer events. Before this change, if an interrupt
    came in while recording another event, and that interrupt also had an
    event, those events would all have the same time stamp as the event it
    interrupted. Now, with the new design, those events will have a unique time
    stamp and rightfully display the time for those events that were recorded
    while interrupting another event.
 
  - Bootconfig how has an "override" operator that lets the users have a
    default config, but then add options to override the default.
 
  - A fix was made to properly filter function graph tracing to the ftrace
    PIDs. This came in at the end of the -rc cycle, and needs to be backported.
 
  - Several clean ups, performance updates, and minor fixes as well.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXy3GOBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qphsAP9ci1jtrC2+cMBMCNKb/AFpA/nDaKsD
 hpsDzvD0YPOmCAEA9QbZset8wUNG49R4FexP7egQ8Ad2S6Oa5f60jWleDQY=
 =lH+q
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:

 - The biggest news in that the tracing ring buffer can now time events
   that interrupted other ring buffer events.

   Before this change, if an interrupt came in while recording another
   event, and that interrupt also had an event, those events would all
   have the same time stamp as the event it interrupted.

   Now, with the new design, those events will have a unique time stamp
   and rightfully display the time for those events that were recorded
   while interrupting another event.

 - Bootconfig how has an "override" operator that lets the users have a
   default config, but then add options to override the default.

 - A fix was made to properly filter function graph tracing to the
   ftrace PIDs. This came in at the end of the -rc cycle, and needs to
   be backported.

 - Several clean ups, performance updates, and minor fixes as well.

* tag 'trace-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (39 commits)
  tracing: Add trace_array_init_printk() to initialize instance trace_printk() buffers
  kprobes: Fix compiler warning for !CONFIG_KPROBES_ON_FTRACE
  tracing: Use trace_sched_process_free() instead of exit() for pid tracing
  bootconfig: Fix to find the initargs correctly
  Documentation: bootconfig: Add bootconfig override operator
  tools/bootconfig: Add testcases for value override operator
  lib/bootconfig: Add override operator support
  kprobes: Remove show_registers() function prototype
  tracing/uprobe: Remove dead code in trace_uprobe_register()
  kprobes: Fix NULL pointer dereference at kprobe_ftrace_handler
  ftrace: Fix ftrace_trace_task return value
  tracepoint: Use __used attribute definitions from compiler_attributes.h
  tracepoint: Mark __tracepoint_string's __used
  trace : Have tracing buffer info use kvzalloc instead of kzalloc
  tracing: Remove outdated comment in stack handling
  ftrace: Do not let direct or IPMODIFY ftrace_ops be added to module and set trampolines
  ftrace: Setup correct FTRACE_FL_REGS flags for module
  tracing/hwlat: Honor the tracing_cpumask
  tracing/hwlat: Drop the duplicate assignment in start_kthread()
  tracing: Save one trace_event->type by using __TRACE_LAST_TYPE
  ...
2020-08-07 18:29:15 -07:00
Linus Torvalds 81e11336d9 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - a few MM hotfixes

 - kthread, tools, scripts, ntfs and ocfs2

 - some of MM

Subsystems affected by this patch series: kthread, tools, scripts, ntfs,
ocfs2 and mm (hofixes, pagealloc, slab-generic, slab, slub, kcsan,
debug, pagecache, gup, swap, shmem, memcg, pagemap, mremap, mincore,
sparsemem, vmalloc, kasan, pagealloc, hugetlb and vmscan).

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (162 commits)
  mm: vmscan: consistent update to pgrefill
  mm/vmscan.c: fix typo
  khugepaged: khugepaged_test_exit() check mmget_still_valid()
  khugepaged: retract_page_tables() remember to test exit
  khugepaged: collapse_pte_mapped_thp() protect the pmd lock
  khugepaged: collapse_pte_mapped_thp() flush the right range
  mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible
  mm: thp: replace HTTP links with HTTPS ones
  mm/page_alloc: fix memalloc_nocma_{save/restore} APIs
  mm/page_alloc.c: skip setting nodemask when we are in interrupt
  mm/page_alloc: fallbacks at most has 3 elements
  mm/page_alloc: silence a KASAN false positive
  mm/page_alloc.c: remove unnecessary end_bitidx for [set|get]_pfnblock_flags_mask()
  mm/page_alloc.c: simplify pageblock bitmap access
  mm/page_alloc.c: extract the common part in pfn_to_bitidx()
  mm/page_alloc.c: replace the definition of NR_MIGRATETYPE_BITS with PB_migratetype_bits
  mm/shuffle: remove dynamic reconfiguration
  mm/memory_hotplug: document why shuffle_zone() is relevant
  mm/page_alloc: remove nr_free_pagecache_pages()
  mm: remove vm_total_pages
  ...
2020-08-07 11:39:33 -07:00
Mike Rapoport ca15ca406f mm: remove unneeded includes of <asm/pgalloc.h>
Patch series "mm: cleanup usage of <asm/pgalloc.h>"

Most architectures have very similar versions of pXd_alloc_one() and
pXd_free_one() for intermediate levels of page table.  These patches add
generic versions of these functions in <asm-generic/pgalloc.h> and enable
use of the generic functions where appropriate.

In addition, functions declared and defined in <asm/pgalloc.h> headers are
used mostly by core mm and early mm initialization in arch and there is no
actual reason to have the <asm/pgalloc.h> included all over the place.
The first patch in this series removes unneeded includes of
<asm/pgalloc.h>

In the end it didn't work out as neatly as I hoped and moving
pXd_alloc_track() definitions to <asm-generic/pgalloc.h> would require
unnecessary changes to arches that have custom page table allocations, so
I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local
to mm/.

This patch (of 8):

In most cases <asm/pgalloc.h> header is required only for allocations of
page table memory.  Most of the .c files that include that header do not
use symbols declared in <asm/pgalloc.h> and do not require that header.

As for the other header files that used to include <asm/pgalloc.h>, it is
possible to move that include into the .c file that actually uses symbols
from <asm/pgalloc.h> and drop the include from the header file.

The process was somewhat automated using

	sed -i -E '/[<"]asm\/pgalloc\.h/d' \
                $(grep -L -w -f /tmp/xx \
                        $(git grep -E -l '[<"]asm/pgalloc\.h'))

where /tmp/xx contains all the symbols defined in
arch/*/include/asm/pgalloc.h.

[rppt@linux.ibm.com: fix powerpc warning]

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Link: http://lkml.kernel.org/r/20200627143453.31835-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200627143453.31835-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:26 -07:00
Linus Torvalds e1ec517e18 Merge branch 'hch.init_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull init and set_fs() cleanups from Al Viro:
 "Christoph's 'getting rid of ksys_...() uses under KERNEL_DS' series"

* 'hch.init_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (50 commits)
  init: add an init_dup helper
  init: add an init_utimes helper
  init: add an init_stat helper
  init: add an init_mknod helper
  init: add an init_mkdir helper
  init: add an init_symlink helper
  init: add an init_link helper
  init: add an init_eaccess helper
  init: add an init_chmod helper
  init: add an init_chown helper
  init: add an init_chroot helper
  init: add an init_chdir helper
  init: add an init_rmdir helper
  init: add an init_unlink helper
  init: add an init_umount helper
  init: add an init_mount helper
  init: mark create_dev as __init
  init: mark console_on_rootfs as __init
  init: initialize ramdisk_execute_command at compile time
  devtmpfs: refactor devtmpfsd()
  ...
2020-08-07 09:40:34 -07:00
Linus Torvalds 19b39c38ab Merge branch 'work.regset' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull ptrace regset updates from Al Viro:
 "Internal regset API changes:

   - regularize copy_regset_{to,from}_user() callers

   - switch to saner calling conventions for ->get()

   - kill user_regset_copyout()

  The ->put() side of things will have to wait for the next cycle,
  unfortunately.

  The balance is about -1KLoC and replacements for ->get() instances are
  a lot saner"

* 'work.regset' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (41 commits)
  regset: kill user_regset_copyout{,_zero}()
  regset(): kill ->get_size()
  regset: kill ->get()
  csky: switch to ->regset_get()
  xtensa: switch to ->regset_get()
  parisc: switch to ->regset_get()
  nds32: switch to ->regset_get()
  nios2: switch to ->regset_get()
  hexagon: switch to ->regset_get()
  h8300: switch to ->regset_get()
  openrisc: switch to ->regset_get()
  riscv: switch to ->regset_get()
  c6x: switch to ->regset_get()
  ia64: switch to ->regset_get()
  arc: switch to ->regset_get()
  arm: switch to ->regset_get()
  sh: convert to ->regset_get()
  arm64: switch to ->regset_get()
  mips: switch to ->regset_get()
  sparc: switch to ->regset_get()
  ...
2020-08-07 09:29:25 -07:00
Pawan Gupta f29dfa53cc x86/bugs/multihit: Fix mitigation reporting when VMX is not in use
On systems that have virtualization disabled or unsupported, sysfs
mitigation for X86_BUG_ITLB_MULTIHIT is reported incorrectly as:

  $ cat /sys/devices/system/cpu/vulnerabilities/itlb_multihit
  KVM: Vulnerable

System is not vulnerable to DoS attack from a rogue guest when
virtualization is disabled or unsupported in the hardware. Change the
mitigation reporting for these cases.

Fixes: b8e8c8303f ("kvm: mmu: ITLB_MULTIHIT mitigation")
Reported-by: Nelson Dsouza <nelson.dsouza@linux.intel.com>
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/0ba029932a816179b9d14a30db38f0f11ef1f166.1594925782.git.pawan.kumar.gupta@linux.intel.com
2020-08-07 01:32:00 +02:00
Kan Liang 76d10256a9 x86/fpu/xstate: Fix an xstate size check warning with architectural LBRs
An xstate size check warning is triggered on machines which support
Architectural LBRs.

    XSAVE consistency problem, dumping leaves
    WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:649 fpu__init_system_xstate+0x4d4/0xd0e
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted intel-arch_lbr+
    RIP: 0010:fpu__init_system_xstate+0x4d4/0xd0e

The xstate size check routine, init_xstate_size(), compares the size
retrieved from the hardware with the size of task->fpu, which is
calculated by the software.

The size from the hardware is the total size of the enabled xstates in
XCR0 | IA32_XSS. Architectural LBR state is a dynamic supervisor
feature, which sets the corresponding bit in the IA32_XSS at boot time.
The size from the hardware includes the size of the Architectural LBR
state.

However, a dynamic supervisor feature doesn't allocate a buffer in the
task->fpu. The size of task->fpu doesn't include the size of the
Architectural LBR state. The mismatch will trigger the warning.

Three options as below were considered to fix the issue:

- Correct the size from the hardware by subtracting the size of the
  dynamic supervisor features.
  The purpose of the check is to compare the size CPU told with the size
  of the XSAVE buffer, which is calculated by the software. If the
  software mucks with the number from hardware, it removes the value of
  the check.
  This option is not a good option.

- Prevent the hardware from counting the size of the dynamic supervisor
  feature by temporarily removing the corresponding bits in IA32_XSS.
  Two extra MSR writes are required to flip the IA32_XSS. The option is
  not pretty, but it is workable. The check is only called once at early
  boot time. The synchronization or context-switching doesn't need to be
  worried.
  This option is implemented here.

- Remove the check entirely, because the check hasn't found any real
  problems. The option may be an alternative as option 2.
  This option is not implemented here.

Add a new function, get_xsaves_size_no_dynamic(), which retrieves the
total size without the dynamic supervisor features from the hardware.
The size will be used to compare with the size of task->fpu.

Fixes: f0dccc9da4 ("x86/fpu/xstate: Support dynamic supervisor feature for LBR")
Reported-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Link: https://lore.kernel.org/r/1595253051-75374-1-git-send-email-kan.liang@linux.intel.com
2020-08-07 01:32:00 +02:00
Dilip Kota 7d98585860 x86/tsr: Fix tsc frequency enumeration bug on Lightning Mountain SoC
Frequency descriptor of Lightning Mountain SoC doesn't have all the
frequency entries so resulting in the below failure causing a kernel hang:

    Error MSR_FSB_FREQ index 15 is unknown
    tsc: Fast TSC calibration failed

So, add all the frequency entries in the Lightning Mountain SoC frequency
descriptor.

Fixes: 0cc5359d8f ("x86/cpu: Update init data for new Airmont CPU model")
Fixes: 812c2d7506 ("x86/tsc_msr: Use named struct initializers")
Signed-off-by: Dilip Kota <eswara.kota@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/211c643ae217604b46cbec43a2c0423946dc7d2d.1596440057.git.eswara.kota@linux.intel.com
2020-08-07 01:32:00 +02:00
Lianbo Jiang a3e1c3bb24 x86/crash: Correct the address boundary of function parameters
Let's carefully handle the boundary of the function parameter to make
sure that the arguments passed doesn't exceed the address range.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Young <dyoung@redhat.com>
Link: https://lore.kernel.org/r/20200804044933.1973-2-lijiang@redhat.com
2020-08-07 01:32:00 +02:00
Shuo Liu 4c7bfa383e x86/acrn: Remove redundant chars from ACRN signature
hypervisor_cpuid_base() only handles 12 chars of the hypervisor
signature string but is provided with 14 chars.

Remove the redundancy. Additionally, replace the user space uint32_t
with preferred kernel type u32.

Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20200806114111.9448-1-shuo.a.liu@intel.com
2020-08-07 01:32:00 +02:00
Shuo Liu 86d709ce30 x86/acrn: Allow ACRN guest to use X2APIC mode
The ACRN Hypervisor did not support x2APIC and thus x2APIC support was
disabled by always returning false when VM checked for x2APIC support.

ACRN received full support of x2APIC and exports the capability through
CPUID feature bits.

Let VM decide if it needs to switch to x2APIC mode according to CPUID
features.

Originally-by: Yakui Zhao <yakui.zhao@intel.com>
Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20200806113802.9325-1-shuo.a.liu@intel.com
2020-08-07 01:32:00 +02:00
Linus Torvalds 921d2597ab s390: implement diag318
x86:
 * Report last CPU for debugging
 * Emulate smaller MAXPHYADDR in the guest than in the host
 * .noinstr and tracing fixes from Thomas
 * nested SVM page table switching optimization and fixes
 
 Generic:
 * Unify shadow MMU cache data structures across architectures
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl8pC+oUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNcOwgAjomqtEqQNlp7DdZT7VyyklzbxX1/
 ud7v+oOJ8K4sFlf64lSthjPo3N9rzZCcw+yOXmuyuITngXOGc3tzIwXpCzpLtuQ1
 WO1Ql3B/2dCi3lP5OMmsO1UAZqy9pKLg1dfeYUPk48P5+p7d/NPmk+Em5kIYzKm5
 JsaHfCp2EEXomwmljNJ8PQ1vTjIQSSzlgYUBZxmCkaaX7zbEUMtxAQCStHmt8B84
 33LczwXBm3viSWrzsoBV37I70+tseugiSGsCfUyupXOvq55d6D9FCqtCb45Hn4Vh
 Ik8ggKdalsk/reiGEwNw1/3nr6mRMkHSbl+Mhc4waOIFf9dn0urgQgOaDg==
 =YVx0
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "s390:
   - implement diag318

  x86:
   - Report last CPU for debugging
   - Emulate smaller MAXPHYADDR in the guest than in the host
   - .noinstr and tracing fixes from Thomas
   - nested SVM page table switching optimization and fixes

  Generic:
   - Unify shadow MMU cache data structures across architectures"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
  KVM: SVM: Fix sev_pin_memory() error handling
  KVM: LAPIC: Set the TDCR settable bits
  KVM: x86: Specify max TDP level via kvm_configure_mmu()
  KVM: x86/mmu: Rename max_page_level to max_huge_page_level
  KVM: x86: Dynamically calculate TDP level from max level and MAXPHYADDR
  KVM: VXM: Remove temporary WARN on expected vs. actual EPTP level mismatch
  KVM: x86: Pull the PGD's level from the MMU instead of recalculating it
  KVM: VMX: Make vmx_load_mmu_pgd() static
  KVM: x86/mmu: Add separate helper for shadow NPT root page role calc
  KVM: VMX: Drop a duplicate declaration of construct_eptp()
  KVM: nSVM: Correctly set the shadow NPT root level in its MMU role
  KVM: Using macros instead of magic values
  MIPS: KVM: Fix build error caused by 'kvm_run' cleanup
  KVM: nSVM: remove nonsensical EXITINFO1 adjustment on nested NPF
  KVM: x86: Add a capability for GUEST_MAXPHYADDR < HOST_MAXPHYADDR support
  KVM: VMX: optimize #PF injection when MAXPHYADDR does not match
  KVM: VMX: Add guest physical address check in EPT violation and misconfig
  KVM: VMX: introduce vmx_need_pf_intercept
  KVM: x86: update exception bitmap on CPUID changes
  KVM: x86: rename update_bp_intercept to update_exception_bitmap
  ...
2020-08-06 12:59:31 -07:00
Peter Zijlstra 0cd39f4600 locking/seqlock, headers: Untangle the spaghetti monster
By using lockdep_assert_*() from seqlock.h, the spaghetti monster
attacked.

Attack back by reducing seqlock.h dependencies from two key high level headers:

 - <linux/seqlock.h>:               -Remove <linux/ww_mutex.h>
 - <linux/time.h>:                  -Remove <linux/seqlock.h>
 - <linux/sched.h>:                 +Add    <linux/seqlock.h>

The price was to add it to sched.h ...

Core header fallout, we add direct header dependencies instead of gaining them
parasitically from higher level headers:

 - <linux/dynamic_queue_limits.h>:  +Add <asm/bug.h>
 - <linux/hrtimer.h>:               +Add <linux/seqlock.h>
 - <linux/ktime.h>:                 +Add <asm/bug.h>
 - <linux/lockdep.h>:               +Add <linux/smp.h>
 - <linux/sched.h>:                 +Add <linux/seqlock.h>
 - <linux/videodev2.h>:             +Add <linux/kernel.h>

Arch headers fallout:

 - PARISC: <asm/timex.h>:           +Add <asm/special_insns.h>
 - SH:     <asm/io.h>:              +Add <asm/page.h>
 - SPARC:  <asm/timer_64.h>:        +Add <uapi/asm/asi.h>
 - SPARC:  <asm/vvar.h>:            +Add <asm/processor.h>, <asm/barrier.h>
                                    -Remove <linux/seqlock.h>
 - X86:    <asm/fixmap.h>:          +Add <asm/pgtable_types.h>
                                    -Remove <asm/acpi.h>

There's also a bunch of parasitic header dependency fallout in .c files, not listed
separately.

[ mingo: Extended the changelog, split up & fixed the original patch. ]

Co-developed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200804133438.GK2674@hirez.programming.kicks-ass.net
2020-08-06 16:13:13 +02:00
Ingo Molnar 13c01139b1 x86/headers: Remove APIC headers from <asm/smp.h>
The APIC headers are relatively complex and bring in additional
header dependencies - while smp.h is a relatively simple header
included from high level headers.

Remove the dependency and add in the missing #include's in .c
files where they gained it indirectly before.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-08-06 16:13:09 +02:00
Ingo Molnar a703f3633f Merge branch 'WIP.locking/seqlocks' into locking/urgent
Pick up the full seqlock series PeterZ is working on.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-08-06 10:16:38 +02:00
Linus Torvalds 8186749621 drm next for 5.9-rc1
core:
 - add user def flag to cmd line modes
 - dma_fence_wait added might_sleep
 - dma-fence lockdep annotations
 - indefinite fences are bad documentation
 - gem CMA functions used in more drivers
 - struct mutex removal
 - more drm_ debug macro usage
 - set/drop master api fixes
 - fix for drm/mm hole size comparison
 - drm/mm remove invalid entry optimization
 - optimise drm/mm hole handling
 - VRR debugfs added
 - uncompressed AFBC modifier support
 - multiple display id blocks in EDID
 - multiple driver sg handling fixes
 - __drm_atomic_helper_crtc_reset in all drivers
 - managed vram helpers
 
 ttm:
 - ttm_mem_reg handling cleanup
 - remove bo offset field
 - drop CMA memtype flag
 - drop mappable flag
 
 xilinx:
 - New Xilinx ZynqMP DisplayPort Subsystem driver
 
 nouveau:
 - add CRC support
 - start using NVIDIA published class header files
 - convert all push buffer emission to new macros
 - Proper push buffer space management for EVO/NVD channels.
 - firmware loading fixes
 - 2MiB system memory pages support on Pascal and newer
 
 vkms:
 - larget cursor support
 
 i915:
 - Rocketlake platform enablement
 - Early DG1 enablement
 - Numerous GEM refactorings
 - DP MST fixes
 - FBC, PSR, Cursor, Color, Gamma fixes
 - TGL, RKL, EHL workaround updates
 - TGL 8K display support fixes
 - SDVO/HDMI/DVI fixes
 
 amdgpu:
 - Initial support for Sienna Cichlid GPU
 - Initial support for Navy Flounder GPU
 - SI UVD/VCE support
 - expose rotation property
 - Add support for unique id on Arcturus
 - Enable runtime PM on vega10 boards that support BACO
 - Skip BAR resizing if the bios already did id
 - Major swSMU code cleanup
 - Fixes for DCN bandwidth calculations
 
 amdkfd:
 - Track SDMA usage per process
 - SMI events interface
 
 radeon:
 - Default to on chip GART for AGP boards on all arches
 - Runtime PM reference count fixes
 
 msm:
 - headers regenerated causing churn
 - a650/a640 display and GPU enablement
 - dpu dither support for 6bpc panels
 - dpu cursor fix
 - dsi/mdp5 enablement for sdm630/sdm636/sdm66
 
 tegra:
 - video capture prep support
 - reflection support
 
 mediatek:
 - convert mtk_dsi to bridge API
 
 meson:
 - FBC support
 
 sun4i:
 - iommu support
 
 rockchip:
 - register locking fix
 - per-pixel alpha support PX30 VOP
 
 -
 mgag200:
 - ported to simple and shmem helpers
 - device init cleanups
 - use managed pci functions
 - dropped hw cursor support
 
 ast:
 - use managed pci functions
 - use managed VRAM helpers
 - rework cursor support
 
 malidp:
 - dev_groups support
 
 hibmc:
 - refactor hibmc_drv_vdac:
 
 vc4:
 - create TXP CRTC
 
 imx:
 - error path fixes and cleanups
 
 etnaviv:
 - clock handling and error handling cleanups
 - use pin_user_pages
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJfK1atAAoJEAx081l5xIa+vDkQAJvl/mjbEA7fDy8Ysa0cgPLI
 8nI4Bo/MaxkyRfUcP8+f/n3QQrRME37C0xa/Mn6SG1oFAdlovPwDqmDr5kjhkrMI
 geo8oJb2Q+AsrJr+ejpuF+iq0FxWi64bLbwJFJ2nBet+lHTMzoPceWeq3gG1Vvfl
 h6PV4B/9TjrnbhcKLIQSEmJ0kZp9uMkDBF/iynVn4+AKAkG1rQNjigdTH48IFPoz
 28KuqG0B4NWu648zYXhjsN0kD3Dxjv3YOH+FsoWQpQa9icCTySYbySsQ7l0/XvA3
 4BPtP3rWMhU46FHTBkWF72WQR4F0B4wm5DJJKMeG4vb1mXakOqAKcAq7JAbka+wL
 PBIiU+AcAKRSiwHmYDuDWtDoSpvYncuo0p3IvNP5hhih+7hqCnLIULDWS+V8AUzW
 39usS/DXsVKk/HGYIYC89cRwsqWYD4c7edzOBdPQxW4LNYCD2gXezLJ5TeeR2lih
 y9JIVnPiluWleOovs4W3BoZNRuLc1rHBO6COToXjlme/48Z+sRHBAoge6UZurqRN
 jr+e60cS7n/DOeJQuNf4UHZnK48Pc24+3kVfMHlX+OKn8VuKPGr+USkeHV/NYL/B
 USiKCAxkkZM0dxerSb1/Ra9kGnchf0QBpA6Fsem8kV61Z4GVc+K6xJWg7KXB6n/3
 7ZyalUKLwlOCz9sYsCCe
 =Yvtd
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2020-08-06' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "New xilinx displayport driver, AMD support for two new GPUs (more
  header files), i915 initial support for RocketLake and some work on
  their DG1 (discrete chip).

  The core also grew some lockdep annotations to try and constrain what
  drivers do with dma-fences, and added some documentation on why the
  idea of indefinite fences doesn't work.

  The long list is below.

  I do have some fixes trees outstanding, but I'll follow up with those
  later.

  core:
   - add user def flag to cmd line modes
   - dma_fence_wait added might_sleep
   - dma-fence lockdep annotations
   - indefinite fences are bad documentation
   - gem CMA functions used in more drivers
   - struct mutex removal
   - more drm_ debug macro usage
   - set/drop master api fixes
   - fix for drm/mm hole size comparison
   - drm/mm remove invalid entry optimization
   - optimise drm/mm hole handling
   - VRR debugfs added
   - uncompressed AFBC modifier support
   - multiple display id blocks in EDID
   - multiple driver sg handling fixes
   - __drm_atomic_helper_crtc_reset in all drivers
   - managed vram helpers

  ttm:
   - ttm_mem_reg handling cleanup
   - remove bo offset field
   - drop CMA memtype flag
   - drop mappable flag

  xilinx:
   - New Xilinx ZynqMP DisplayPort Subsystem driver

  nouveau:
   - add CRC support
   - start using NVIDIA published class header files
   - convert all push buffer emission to new macros
   - Proper push buffer space management for EVO/NVD channels.
   - firmware loading fixes
   - 2MiB system memory pages support on Pascal and newer

  vkms:
   - larger cursor support

  i915:
   - Rocketlake platform enablement
   - Early DG1 enablement
   - Numerous GEM refactorings
   - DP MST fixes
   - FBC, PSR, Cursor, Color, Gamma fixes
   - TGL, RKL, EHL workaround updates
   - TGL 8K display support fixes
   - SDVO/HDMI/DVI fixes

  amdgpu:
   - Initial support for Sienna Cichlid GPU
   - Initial support for Navy Flounder GPU
   - SI UVD/VCE support
   - expose rotation property
   - Add support for unique id on Arcturus
   - Enable runtime PM on vega10 boards that support BACO
   - Skip BAR resizing if the bios already did id
   - Major swSMU code cleanup
   - Fixes for DCN bandwidth calculations

  amdkfd:
   - Track SDMA usage per process
   - SMI events interface

  radeon:
   - Default to on chip GART for AGP boards on all arches
   - Runtime PM reference count fixes

  msm:
   - headers regenerated causing churn
   - a650/a640 display and GPU enablement
   - dpu dither support for 6bpc panels
   - dpu cursor fix
   - dsi/mdp5 enablement for sdm630/sdm636/sdm66

  tegra:
   - video capture prep support
   - reflection support

  mediatek:
   - convert mtk_dsi to bridge API

  meson:
   - FBC support

  sun4i:
   - iommu support

  rockchip:
   - register locking fix
   - per-pixel alpha support PX30 VOP

  mgag200:
   - ported to simple and shmem helpers
   - device init cleanups
   - use managed pci functions
   - dropped hw cursor support

  ast:
   - use managed pci functions
   - use managed VRAM helpers
   - rework cursor support

  malidp:
   - dev_groups support

  hibmc:
   - refactor hibmc_drv_vdac:

  vc4:
   - create TXP CRTC

  imx:
   - error path fixes and cleanups

  etnaviv:
   - clock handling and error handling cleanups
   - use pin_user_pages"

* tag 'drm-next-2020-08-06' of git://anongit.freedesktop.org/drm/drm: (1747 commits)
  drm/msm: use kthread_create_worker instead of kthread_run
  drm/msm/mdp5: Add MDP5 configuration for SDM636/660
  drm/msm/dsi: Add DSI configuration for SDM660
  drm/msm/mdp5: Add MDP5 configuration for SDM630
  drm/msm/dsi: Add phy configuration for SDM630/636/660
  drm/msm/a6xx: add A640/A650 hwcg
  drm/msm/a6xx: hwcg tables in gpulist
  drm/msm/dpu: add SM8250 to hw catalog
  drm/msm/dpu: add SM8150 to hw catalog
  drm/msm/dpu: intf timing path for displayport
  drm/msm/dpu: set missing flush bits for INTF_2 and INTF_3
  drm/msm/dpu: don't use INTF_INPUT_CTRL feature on sdm845
  drm/msm/dpu: move some sspp caps to dpu_caps
  drm/msm/dpu: update UBWC config for sm8150 and sm8250
  drm/msm/dpu: use right setup_blend_config for sm8150 and sm8250
  drm/msm/a6xx: set ubwc config for A640 and A650
  drm/msm/adreno: un-open-code some packets
  drm/msm: sync generated headers
  drm/msm/a6xx: add build_bw_table for A640/A650
  drm/msm/a6xx: fix crashstate capture for A650
  ...
2020-08-05 19:50:06 -07:00
Linus Torvalds 2324d50d05 It's been a busy cycle for documentation - hopefully the busiest for a
while to come.  Changes include:
 
  - Some new Chinese translations
 
  - Progress on the battle against double words words and non-HTTPS URLs
 
  - Some block-mq documentation
 
  - More RST conversions from Mauro.  At this point, that task is
    essentially complete, so we shouldn't see this kind of churn again for a
    while.  Unless we decide to switch to asciidoc or something...:)
 
  - Lots of typo fixes, warning fixes, and more.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAl8oVkwPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YoW8H/jJ/xnXFn7tkgVPQAlL3k5HCnK7A5nDP9RVR
 cg1pTx1cEFdjzxPlJyExU6/v+AImOvtweHXC+JDK7YcJ6XFUNYXJI3LxL5KwUXbY
 BL/xRFszDSXH2C7SJF5GECcFYp01e/FWSLN3yWAh+g+XwsKiTJ8q9+CoIDkHfPGO
 7oQsHKFu6s36Af0LfSgxk4sVB7EJbo8e4psuPsP5SUrl+oXRO43Put0rXkR4yJoH
 9oOaB51Do5fZp8I4JVAqGXvpXoExyLMO4yw0mASm6YSZ3KyjR8Fae+HD9Cq4ZuwY
 0uzb9K+9NEhqbfwtyBsi99S64/6Zo/MonwKwevZuhtsDTK4l4iU=
 =JQLZ
 -----END PGP SIGNATURE-----

Merge tag 'docs-5.9' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "It's been a busy cycle for documentation - hopefully the busiest for a
  while to come. Changes include:

   - Some new Chinese translations

   - Progress on the battle against double words words and non-HTTPS
     URLs

   - Some block-mq documentation

   - More RST conversions from Mauro. At this point, that task is
     essentially complete, so we shouldn't see this kind of churn again
     for a while. Unless we decide to switch to asciidoc or
     something...:)

   - Lots of typo fixes, warning fixes, and more"

* tag 'docs-5.9' of git://git.lwn.net/linux: (195 commits)
  scripts/kernel-doc: optionally treat warnings as errors
  docs: ia64: correct typo
  mailmap: add entry for <alobakin@marvell.com>
  doc/zh_CN: add cpu-load Chinese version
  Documentation/admin-guide: tainted-kernels: fix spelling mistake
  MAINTAINERS: adjust kprobes.rst entry to new location
  devices.txt: document rfkill allocation
  PCI: correct flag name
  docs: filesystems: vfs: correct flag name
  docs: filesystems: vfs: correct sync_mode flag names
  docs: path-lookup: markup fixes for emphasis
  docs: path-lookup: more markup fixes
  docs: path-lookup: fix HTML entity mojibake
  CREDITS: Replace HTTP links with HTTPS ones
  docs: process: Add an example for creating a fixes tag
  doc/zh_CN: add Chinese translation prefer section
  doc/zh_CN: add clearing-warn-once Chinese version
  doc/zh_CN: add admin-guide index
  doc:it_IT: process: coding-style.rst: Correct __maybe_unused compiler label
  futex: MAINTAINERS: Re-add selftests directory
  ...
2020-08-04 22:47:54 -07:00
Linus Torvalds 4da9f33026 Support for FSGSBASE. Almost 5 years after the first RFC to support it,
this has been brought into a shape which is maintainable and actually
 works.
 
 This final version was done by Sasha Levin who took it up after Intel
 dropped the ball. Sasha discovered that the SGX (sic!) offerings out there
 ship rogue kernel modules enabling FSGSBASE behind the kernels back which
 opens an instantanious unpriviledged root hole.
 
 The FSGSBASE instructions provide a considerable speedup of the context
 switch path and enable user space to write GSBASE without kernel
 interaction. This enablement requires careful handling of the exception
 entries which go through the paranoid entry path as they cannot longer rely
 on the assumption that user GSBASE is positive (as enforced via prctl() on
 non FSGSBASE enabled systemn). All other entries (syscalls, interrupts and
 exceptions) can still just utilize SWAPGS unconditionally when the entry
 comes from user space. Converting these entries to use FSGSBASE has no
 benefit as SWAPGS is only marginally slower than WRGSBASE and locating and
 retrieving the kernel GSBASE value is not a free operation either. The real
 benefit of RD/WRGSBASE is the avoidance of the MSR reads and writes.
 
 The changes come with appropriate selftests and have held up in field
 testing against the (sanitized) Graphene-SGX driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8pGnoTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTYJD/9873GkwvGcc/Vq/dJH1szGTgFftPyZ
 c/Y9gzx7EGBPLo25BS820L+ZlynzXHDxExKfCEaD10TZfe5XIc1vYNR0J74M2NmK
 IBgEDstJeW93ai+rHCFRXIevhpzU4GgGYJ1MeeOgbVMN3aGU1g6HfzMvtF0fPn8Y
 n6fsLZa43wgnoTdjwjjikpDTrzoZbaL1mbODBzBVPAaTbim7IKKTge6r/iCKrOjz
 Uixvm3g9lVzx52zidJ9kWa8esmbOM1j0EPe7/hy3qH9DFo87KxEzjHNH3T6gY5t6
 NJhRAIfY+YyTHpPCUCshj6IkRudE6w/qjEAmKP9kWZxoJrvPCTWOhCzelwsFS9b9
 gxEYfsnaKhsfNhB6fi0PtWlMzPINmEA7SuPza33u5WtQUK7s1iNlgHfvMbjstbwg
 MSETn4SG2/ZyzUrSC06lVwV8kh0RgM3cENc/jpFfIHD0vKGI3qfka/1RY94kcOCG
 AeJd0YRSU2RqL7lmxhHyG8tdb8eexns41IzbPCLXX2sF00eKNkVvMRYT2mKfKLFF
 q8v1x7yuwmODdXfFR6NdCkGm9IU7wtL6wuQ8Nhu9UraFmcXo6X6FLJC18FqcvSb9
 jvcRP4XY/8pNjjf44JB8yWfah0xGQsaMIKQGP4yLv4j6Xk1xAQKH1MqcC7l1D2HN
 5Z24GibFqSK/vA==
 =QaAN
 -----END PGP SIGNATURE-----

Merge tag 'x86-fsgsbase-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fsgsbase from Thomas Gleixner:
 "Support for FSGSBASE. Almost 5 years after the first RFC to support
  it, this has been brought into a shape which is maintainable and
  actually works.

  This final version was done by Sasha Levin who took it up after Intel
  dropped the ball. Sasha discovered that the SGX (sic!) offerings out
  there ship rogue kernel modules enabling FSGSBASE behind the kernels
  back which opens an instantanious unpriviledged root hole.

  The FSGSBASE instructions provide a considerable speedup of the
  context switch path and enable user space to write GSBASE without
  kernel interaction. This enablement requires careful handling of the
  exception entries which go through the paranoid entry path as they
  can no longer rely on the assumption that user GSBASE is positive (as
  enforced via prctl() on non FSGSBASE enabled systemn).

  All other entries (syscalls, interrupts and exceptions) can still just
  utilize SWAPGS unconditionally when the entry comes from user space.
  Converting these entries to use FSGSBASE has no benefit as SWAPGS is
  only marginally slower than WRGSBASE and locating and retrieving the
  kernel GSBASE value is not a free operation either. The real benefit
  of RD/WRGSBASE is the avoidance of the MSR reads and writes.

  The changes come with appropriate selftests and have held up in field
  testing against the (sanitized) Graphene-SGX driver"

* tag 'x86-fsgsbase-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  x86/fsgsbase: Fix Xen PV support
  x86/ptrace: Fix 32-bit PTRACE_SETREGS vs fsbase and gsbase
  selftests/x86/fsgsbase: Add a missing memory constraint
  selftests/x86/fsgsbase: Fix a comment in the ptrace_write_gsbase test
  selftests/x86: Add a syscall_arg_fault_64 test for negative GSBASE
  selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASE
  selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base write
  Documentation/x86/64: Add documentation for GS/FS addressing mode
  x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2
  x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit
  x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit
  x86/entry/64: Introduce the FIND_PERCPU_BASE macro
  x86/entry/64: Switch CR3 before SWAPGS in paranoid entry
  x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation
  x86/process/64: Use FSGSBASE instructions on thread copy and ptrace
  x86/process/64: Use FSBSBASE in switch_to() if available
  x86/process/64: Make save_fsgs_for_kvm() ready for FSGSBASE
  x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions
  x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions
  x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
  ...
2020-08-04 21:16:22 -07:00
Linus Torvalds 125cfa0d4d The conversion of X86 syscall, interrupt and exception entry/exit handling
to the generic code. Pretty much a straight forward 1:1 conversion plus the
 consolidation of the KVM handling of pending work before entering guest
 mode.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8pEFgTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYocEwD/474Eb7LzZ8yahyUBirWJP3k3qzgs9j
 dZUxqB6LNuDOstEyTGLPdx1dmQP2vHbFfjoM7YBOH37EGcHsqjGliLvn2Y05ZD7O
 6kYwjz6qVnJcm3IMtfSUn/8LkfO5pGUdKd3U5ngDmPLpkeaQ4nPKqiO0uIb0wzwa
 cO7l10tG4YjMCWQxPNIaOh8kncLieQBediJPFjkQjV+Fh33kSU3LWTl3fccz6b5+
 mgSUFL0qjQpp+Nl7lCaDQQiAop9GTUETfDtximRydZauiM2NpCfz+QBmQzq50Xv1
 G3DWZoBIZBjmWJUgfSmS/s4GOYkBTBnT/fUcZmIDcgdRwvtEvRzIhcP87/wn7P3N
 UKpLdHqmvA0BFDXZbNZgS362++29pj5Lnb+u3QbWSKQ9UqHN0NUlSY4wzfTLXsGp
 Mzpp4TW0u/8kyOlo7wK3lVDgNJaPG31aiNVuDPgLe4cEluO5cq7/7g2GcFBqF1Ly
 SqNGD1IccteNQTNvDopczPy7qUl5Lal+Ia06szNSPR48gLrvhSWdyYr2i1sD7vx4
 hAhR0Hsi9dacGv46TrRw1OdDzq9bOW68G8GIgLJgDXaayPXLnx6TQEUjzQtIkE/i
 ydTPUarp5QOFByt+RBjI90ZcW4RuLgMTOEVONPXtSn8IoCP2Kdg9u3gD9AmUW3Q2
 JFkKMiSiJPGxlw==
 =84y7
 -----END PGP SIGNATURE-----

Merge tag 'x86-entry-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 conversion to generic entry code from Thomas Gleixner:
 "The conversion of X86 syscall, interrupt and exception entry/exit
  handling to the generic code.

  Pretty much a straight-forward 1:1 conversion plus the consolidation
  of the KVM handling of pending work before entering guest mode"

* tag 'x86-entry-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/kvm: Use __xfer_to_guest_mode_work_pending() in kvm_run_vcpu()
  x86/kvm: Use generic xfer to guest work function
  x86/entry: Cleanup idtentry_enter/exit
  x86/entry: Use generic interrupt entry/exit code
  x86/entry: Cleanup idtentry_entry/exit_user
  x86/entry: Use generic syscall exit functionality
  x86/entry: Use generic syscall entry function
  x86/ptrace: Provide pt_regs helper for entry/exit
  x86/entry: Move user return notifier out of loop
  x86/entry: Consolidate 32/64 bit syscall entry
  x86/entry: Consolidate check_user_regs()
  x86: Correct noinstr qualifiers
  x86/idtentry: Remove stale comment
2020-08-04 21:05:46 -07:00
Linus Torvalds 9ba27414f2 fork-v5.9
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXyge/QAKCRCRxhvAZXjc
 oildAQCCWpnTeXm6hrIE3VZ36X5npFtbaEthdBVAUJM7mo0FYwEA8+Wbnubg6jCw
 mztkXCnTfU7tApUdhKtQzcpEws45/Qk=
 =REE/
 -----END PGP SIGNATURE-----

Merge tag 'fork-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull fork cleanups from Christian Brauner:
 "This is cleanup series from when we reworked a chunk of the process
  creation paths in the kernel and switched to struct
  {kernel_}clone_args.

  High-level this does two main things:

   - Remove the double export of both do_fork() and _do_fork() where
     do_fork() used the incosistent legacy clone calling convention.

     Now we only export _do_fork() which is based on struct
     kernel_clone_args.

   - Remove the copy_thread_tls()/copy_thread() split making the
     architecture specific HAVE_COYP_THREAD_TLS config option obsolete.

  This switches all remaining architectures to select
  HAVE_COPY_THREAD_TLS and thus to the copy_thread_tls() calling
  convention. The current split makes the process creation codepaths
  more convoluted than they need to be. Each architecture has their own
  copy_thread() function unless it selects HAVE_COPY_THREAD_TLS then it
  has a copy_thread_tls() function.

  The split is not needed anymore nowadays, all architectures support
  CLONE_SETTLS but quite a few of them never bothered to select
  HAVE_COPY_THREAD_TLS and instead simply continued to use copy_thread()
  and use the old calling convention. Removing this split cleans up the
  process creation codepaths and paves the way for implementing clone3()
  on such architectures since it requires the copy_thread_tls() calling
  convention.

  After having made each architectures support copy_thread_tls() this
  series simply renames that function back to copy_thread(). It also
  switches all architectures that call do_fork() directly over to
  _do_fork() and the struct kernel_clone_args calling convention. This
  is a corollary of switching the architectures that did not yet support
  it over to copy_thread_tls() since do_fork() is conditional on not
  supporting copy_thread_tls() (Mostly because it lacks a separate
  argument for tls which is trivial to fix but there's no need for this
  function to exist.).

  The do_fork() removal is in itself already useful as it allows to to
  remove the export of both do_fork() and _do_fork() we currently have
  in favor of only _do_fork(). This has already been discussed back when
  we added clone3(). The legacy clone() calling convention is - as is
  probably well-known - somewhat odd:

    #
    # ABI hall of shame
    #
    config CLONE_BACKWARDS
    config CLONE_BACKWARDS2
    config CLONE_BACKWARDS3

  that is aggravated by the fact that some architectures such as sparc
  follow the CLONE_BACKWARDSx calling convention but don't really select
  the corresponding config option since they call do_fork() directly.

  So do_fork() enforces a somewhat arbitrary calling convention in the
  first place that doesn't really help the individual architectures that
  deviate from it. They can thus simply be switched to _do_fork()
  enforcing a single calling convention. (I really hope that any new
  architectures will __not__ try to implement their own calling
  conventions...)

  Most architectures already have made a similar switch (m68k comes to
  mind).

  Overall this removes more code than it adds even with a good portion
  of added comments. It simplifies a chunk of arch specific assembly
  either by moving the code into C or by simply rewriting the assembly.

  Architectures that have been touched in non-trivial ways have all been
  actually boot and stress tested: sparc and ia64 have been tested with
  Debian 9 images. They are the two architectures which have been
  touched the most. All non-trivial changes to architectures have seen
  acks from the relevant maintainers. nios2 with a custom built
  buildroot image. h8300 I couldn't get something bootable to test on
  but the changes have been fairly automatic and I'm sure we'll hear
  people yell if I broke something there.

  All other architectures that have been touched in trivial ways have
  been compile tested for each single patch of the series via git rebase
  -x "make ..." v5.8-rc2. arm{64} and x86{_64} have been boot tested
  even though they have just been trivially touched (removal of the
  HAVE_COPY_THREAD_TLS macro from their Kconfig) because well they are
  basically "core architectures" and since it is trivial to get your
  hands on a useable image"

* tag 'fork-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  arch: rename copy_thread_tls() back to copy_thread()
  arch: remove HAVE_COPY_THREAD_TLS
  unicore: switch to copy_thread_tls()
  sh: switch to copy_thread_tls()
  nds32: switch to copy_thread_tls()
  microblaze: switch to copy_thread_tls()
  hexagon: switch to copy_thread_tls()
  c6x: switch to copy_thread_tls()
  alpha: switch to copy_thread_tls()
  fork: remove do_fork()
  h8300: select HAVE_COPY_THREAD_TLS, switch to kernel_clone_args
  nios2: enable HAVE_COPY_THREAD_TLS, switch to kernel_clone_args
  ia64: enable HAVE_COPY_THREAD_TLS, switch to kernel_clone_args
  sparc: unconditionally enable HAVE_COPY_THREAD_TLS
  sparc: share process creation helpers between sparc and sparc64
  sparc64: enable HAVE_COPY_THREAD_TLS
  fork: fold legacy_clone_args_valid() into _do_fork()
2020-08-04 14:47:45 -07:00
Linus Torvalds 3950e97543 Merge branch 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull execve updates from Eric Biederman:
 "During the development of v5.7 I ran into bugs and quality of
  implementation issues related to exec that could not be easily fixed
  because of the way exec is implemented. So I have been diggin into
  exec and cleaning up what I can.

  This cycle I have been looking at different ideas and different
  implementations to see what is possible to improve exec, and cleaning
  the way exec interfaces with in kernel users. Only cleaning up the
  interfaces of exec with rest of the kernel has managed to stabalize
  and make it through review in time for v5.9-rc1 resulting in 2 sets of
  changes this cycle.

   - Implement kernel_execve

   - Make the user mode driver code a better citizen

  With kernel_execve the code size got a little larger as the copying of
  parameters from userspace and copying of parameters from userspace is
  now separate. The good news is kernel threads no longer need to play
  games with set_fs to use exec. Which when combined with the rest of
  Christophs set_fs changes should security bugs with set_fs much more
  difficult"

* 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (23 commits)
  exec: Implement kernel_execve
  exec: Factor bprm_stack_limits out of prepare_arg_pages
  exec: Factor bprm_execve out of do_execve_common
  exec: Move bprm_mm_init into alloc_bprm
  exec: Move initialization of bprm->filename into alloc_bprm
  exec: Factor out alloc_bprm
  exec: Remove unnecessary spaces from binfmts.h
  umd: Stop using split_argv
  umd: Remove exit_umh
  bpfilter: Take advantage of the facilities of struct pid
  exit: Factor thread_group_exited out of pidfd_poll
  umd: Track user space drivers with struct pid
  bpfilter: Move bpfilter_umh back into init data
  exec: Remove do_execve_file
  umh: Stop calling do_execve_file
  umd: Transform fork_usermode_blob into fork_usermode_driver
  umd: Rename umd_info.cmdline umd_info.driver_name
  umd: For clarity rename umh_info umd_info
  umh: Separate the user mode driver and the user mode helper support
  umh: Remove call_usermodehelper_setup_file.
  ...
2020-08-04 14:27:25 -07:00
Linus Torvalds 99ea1521a0 Remove uninitialized_var() macro for v5.9-rc1
- Clean up non-trivial uses of uninitialized_var()
 - Update documentation and checkpatch for uninitialized_var() removal
 - Treewide removal of uninitialized_var()
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl8oYLQWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJsfjEACvf0D3WL3H7sLHtZ2HeMwOgAzq
 il08t6vUscINQwiIIK3Be43ok3uQ1Q+bj8sr2gSYTwunV2IYHFferzgzhyMMno3o
 XBIGd1E+v1E4DGBOiRXJvacBivKrfvrdZ7AWiGlVBKfg2E0fL1aQbe9AYJ6eJSbp
 UGqkBkE207dugS5SQcwrlk1tWKUL089lhDAPd7iy/5RK76OsLRCJFzIerLHF2ZK2
 BwvA+NWXVQI6pNZ0aRtEtbbxwEU4X+2J/uaXH5kJDszMwRrgBT2qoedVu5LXFPi8
 +B84IzM2lii1HAFbrFlRyL/EMueVFzieN40EOB6O8wt60Y4iCy5wOUzAdZwFuSTI
 h0xT3JI8BWtpB3W+ryas9cl9GoOHHtPA8dShuV+Y+Q2bWe1Fs6kTl2Z4m4zKq56z
 63wQCdveFOkqiCLZb8s6FhnS11wKtAX4czvXRXaUPgdVQS1Ibyba851CRHIEY+9I
 AbtogoPN8FXzLsJn7pIxHR4ADz+eZ0dQ18f2hhQpP6/co65bYizNP5H3h+t9hGHG
 k3r2k8T+jpFPaddpZMvRvIVD8O2HvJZQTyY6Vvneuv6pnQWtr2DqPFn2YooRnzoa
 dbBMtpon+vYz6OWokC5QNWLqHWqvY9TmMfcVFUXE4AFse8vh4wJ8jJCNOFVp8On+
 drhmmImUr1YylrtVOw==
 =xHmk
 -----END PGP SIGNATURE-----

Merge tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull uninitialized_var() macro removal from Kees Cook:
 "This is long overdue, and has hidden too many bugs over the years. The
  series has several "by hand" fixes, and then a trivial treewide
  replacement.

   - Clean up non-trivial uses of uninitialized_var()

   - Update documentation and checkpatch for uninitialized_var() removal

   - Treewide removal of uninitialized_var()"

* tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  compiler: Remove uninitialized_var() macro
  treewide: Remove uninitialized_var() usage
  checkpatch: Remove awareness of uninitialized_var() macro
  mm/debug_vm_pgtable: Remove uninitialized_var() usage
  f2fs: Eliminate usage of uninitialized_var() macro
  media: sur40: Remove uninitialized_var() usage
  KVM: PPC: Book3S PR: Remove uninitialized_var() usage
  clk: spear: Remove uninitialized_var() usage
  clk: st: Remove uninitialized_var() usage
  spi: davinci: Remove uninitialized_var() usage
  ide: Remove uninitialized_var() usage
  rtlwifi: rtl8192cu: Remove uninitialized_var() usage
  b43: Remove uninitialized_var() usage
  drbd: Remove uninitialized_var() usage
  x86/mm/numa: Remove uninitialized_var() usage
  docs: deprecated.rst: Add uninitialized_var()
2020-08-04 13:49:43 -07:00
Linus Torvalds e53bc3ff99 Boris is on vacation and he asked us to send you the pending RAS bits:
- Print the PPIN field on CPUs that fill them out
  - Fix an MCE injection bug
  - Simplify a kzalloc in dev_mcelog_init_device()
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oZ+ERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iIVQ//XLP3qky/X/DOcOhdqw7JEDUpElK+sLdb
 nFLcSP0vgdoip0jjz0voEf2tb0nRbZRqjnCIuHj5Dq8fTWbM/qcD0zM35N5lk5bJ
 N+y9cE+FQg2oayU/0O3IezoT0lAW3YzdE2HdzhZ2ZEiDKx3fAj9QoJbKlc4TxshC
 Tg93IpMq5dmnj6Ge3BfNE56O4WCLIQxs2MCIWHWjBEU8qb4Nry7vEL9hijmFQNRd
 85i9pDrOkdnLDKhTagHUR4HbhkAacWeJKe0UoraJFr8/1nCm8Hcii5hI1t5N9KXg
 /+asjkhLX75uYy2aMmwF7qelSLNbOieMNji7rt42vevBsqrAtX0zuSvbEBb8bNdx
 f4WS5GcmUTZVHTuppPzAX3YNmYdrGqmecerAY+1j+uHtWFGs07ZCnKx+kQbiK0m4
 LqaQmi16KsV/Jj9BYZu8wLd/uIAsRqnY5nJlbe7of0/kFPzlJkRkT8Z3DPhSp/ee
 qHc3eR1EFMQvNr8F1e17TPCTcq5YjLM9BXFX4JJppaX30YOb6ZH0z5jYMbouJeKD
 8qtQ1BtawWKqkoi/0PtbH6khRFXH6oUNWZ+2aAKQqWwfJDdP8Q7fFyIvgN487s79
 u7fN4h8qj7w1AeIRJQH5v1G/Es14gdBcotgkM2AflQLF+c7QormnHdCJZad4gTAj
 /X8Ks5DE51w=
 =z0MD
 -----END PGP SIGNATURE-----

Merge tag 'ras-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 RAS updates from Ingo Molnar:
 "Boris is on vacation and he asked us to send you the pending RAS bits:

   - Print the PPIN field on CPUs that fill them out

   - Fix an MCE injection bug

   - Simplify a kzalloc in dev_mcelog_init_device()"

* tag 'ras-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce, EDAC/mce_amd: Print PPIN in machine check records
  x86/mce/dev-mcelog: Use struct_size() helper in kzalloc()
  x86/mce/inject: Fix a wrong assignment of i_mce.status
2020-08-03 17:42:23 -07:00
Linus Torvalds 5183a617ec The biggest change is the removal of SGI UV1 support, which allowed the
removal of the legacy EFI old_mmap code as well.
 
 This removes quite a bunch of old code & quirks.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oYBoRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1izAg//adabmMctmGxpWETEj1UmZdgJcKu5l+H3
 SiC6hoB46NB5J6E2whhOXuJGqgBGY/qx9Qy3LMFwgXNki85TQEEFTcGfd+RcXTXz
 B+oASL5Z+KOJ+QQqFf8W/dYye4ZXXxQN8gFSl9zaY3iXSY28JLMFrwkckkIrBwLZ
 9DmB7NaxeMQ22IEan641sK/YafGIriHvMpe34s2/p2WpemyncG1y1tZcPxZ/9K9U
 7OZ+4O6gr9n3FloD9yp3z1jFXiU4gSWEgbZvy3bD/AjbXs+oaaaIFDZ6yM/C3JYo
 Qw0PPTBSXkKTmG3UaK4vml4NPROLqyAzv8k5UTJmYQiVlSkyAVW80mzMQ9qDAhfK
 ny2+XE7ahjsFiNLFbboSOqLuTyHogkzcFCBrwW1wCTnPDEX5QCQF9t0agrY4pqqZ
 8lSMLj0BmkAPoi5XHq4WVEJL+0GyYFWmJwdtUAzd2Hj5w6mse0TiL7PtqAUfdFH0
 E5nBnY5RvqtWwfEungGl+zFmSRvH8UzOUydMMuG8koNlGaNZF4TwNVrMukbVqArX
 34Zt1bQrLUY/T4xNM+Zx1BWOFd5D3BUuI7xVhE5zMg20A8PHtPzMByoomoQrgGX2
 MDVUoqnPxeXWUil06+ND6koOtaL+xJy0JsUIFs7h9SGsF+JxsrYKbbLfjjDvtpuK
 sdb+fYzmgBU=
 =NChe
 -----END PGP SIGNATURE-----

Merge tag 'x86-platform-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 platform updates from Ingo Molnar:
 "The biggest change is the removal of SGI UV1 support, which allowed
  the removal of the legacy EFI old_mmap code as well.

  This removes quite a bunch of old code & quirks"

* tag 'x86-platform-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Remove unused EFI_UV1_MEMMAP code
  x86/platform/uv: Remove uv bios and efi code related to EFI_UV1_MEMMAP
  x86/efi: Remove references to no-longer-used efi_have_uv1_memmap()
  x86/efi: Delete SGI UV1 detection.
  x86/platform/uv: Remove efi=old_map command line option
  x86/platform/uv: Remove vestigial mention of UV1 platform from bios header
  x86/platform/uv: Remove support for UV1 platform from uv
  x86/platform/uv: Remove support for uv1 platform from uv_hub
  x86/platform/uv: Remove support for UV1 platform from uv_bau
  x86/platform/uv: Remove support for UV1 platform from uv_mmrs
  x86/platform/uv: Remove support for UV1 platform from x2apic_uv_x
  x86/platform/uv: Remove support for UV1 platform from uv_tlb
  x86/platform/uv: Remove support for UV1 platform from uv_time
2020-08-03 17:38:43 -07:00
Linus Torvalds c813e8c9df Filter MSR writes from user-space by default, and print a syslog entry if
they happen outside the allowed set of MSRs, which is a single one for now,
 MSR_IA32_ENERGY_PERF_BIAS.
 
 The plan is to eventually disable MSR writes by default (they can still be
 enabled via allow_writes=on).
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oWVwRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1j61Q/+KDteLFjfVaJIEe0yfZ8YWjAPb1HRDg1u
 ia8wyZael5hHpLOQSOhkaTwbvF8dMSLK3oky8ZO2oA5gv99jyB+OZxOGS6mvMcGM
 aI9chECEghlvUKy1i+NNHSK4W9p4SSu0eK57EkX1VRalZmntTVEX7nlw7oI6nacW
 8iP7MIqYWdF1QI6n5gDh0i/62/ictOjG2yemooPNh1I5E5Cw+MhOqTZ7gL+nu1M6
 e2+oqo9i3+1yBAN3RH4M8tCwrtTwoVnmQlgPAqT5u2eOATu1b8cTOs2/aC6dfcGg
 3u1hcbXuQKRE4DuT7QhDJN1cQZt/GMdymTkR7xIlVqzmNPkM+DCiWPujVMU7r2ia
 z1Chwxd1gmDrX3g6XPrcXneGrasLkHIRAwYS8EN2wZyNIpb71Y9/uWZVJPza6OO2
 Gdxz/y1vCYUu43FSy+gCCirWW+h/HFc4BEzikvB+splQPSiJWJw/Nol+xkBnPds7
 LjQTpKSXd1nDtDyTsz0nZAp7EEft78Ez2mj4zYNgqZDBXLEiRPES47VEO3Ls9SS4
 t7AtIY1GtdUlFIPrA0mrbcggf/gYxJCFofxR2J7gzyH3DFXIlfw9HbS4+r4xv9RP
 v2wJUwN35y5SJRNymztanArTNPkGXOlcJ3s7nYQk13UQ0hWTtW3e86pMPsi5C3ui
 6DjHG8GXWh4=
 =X+M1
 -----END PGP SIGNATURE-----

Merge tag 'x86-misc-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 MSR filtering from Ingo Molnar:
 "Filter MSR writes from user-space by default, and print a syslog entry
  if they happen outside the allowed set of MSRs, which is a single one
  for now, MSR_IA32_ENERGY_PERF_BIAS.

  The plan is to eventually disable MSR writes by default (they can
  still be enabled via allow_writes=on)"

* tag 'x86-misc-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/msr: Filter MSR writes
2020-08-03 17:23:49 -07:00
Linus Torvalds 69094c2032 A single commit that removes the microcode loader's FW_LOADER coupling.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oVl8RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gb5w//Y6V8o2qam5GMG3CAFMWeuaPLehTLX+e3
 GxXB+rQpVbe8WrroroP14KOoYYI5Nq6T/giO5HwL0h3/CbYZr8xcv12HSh7AQ8hX
 Zluybu9eN3P4ot0HEyIOphQCSppvjF9E0Zljs77miDbAC7AljdF/BQ3aqXOSxmJR
 haMvO0VDtE36JaxKIBKcrt/dmk9iSfdlY2FFjZ1Ia52bDplFzDHEmBi/3MfjahXa
 AkygeTwoRvUsvpBiY6jzRDLJb6JMuP8bqoOxEhzbNWKNye/EYUJdIehWl772+Rdz
 U+iXshAT302XPJ88Fo2VOQU3JhLbWHf9VBpMA+DrnWh0bFKvB6nZKlPu9TkMo25J
 WbKuyKFKoxUHXLH1mr8MkKHrUXDcnocBIn//n9WXQLA2/gTsxmN/Wtt0y0f5iQK0
 Y3KH/GfkdJimbIg5Y/NvepbRlghQomm919IRtECOw3QkPMdgDarpEh3Un8VSSoI3
 6pj9NPNmvnaP8wEg4BZR55YdMC+7EsYT+oRWdBiWeGbu6+TSCWjamrTrA2e0OtLM
 Jw25YS67iLVJW7Vsp3FnBovnJ3FxY/ss0OcF70VPaj6/P8YZhaESqoZfUs8U2NRP
 IxP8dxlxIM6AT7XdKmLDQ7S2YtEdCvoJtoK2yU4Nx1hNusy3d6WeOxLRbnofG4yo
 e3vRLccLt6M=
 =/LiU
 -----END PGP SIGNATURE-----

Merge tag 'x86-microcode-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 microcode update from Ingo Molnar:
 "Remove the microcode loader's FW_LOADER coupling"

* tag 'x86-microcode-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode: Do not select FW_LOADER
2020-08-03 17:22:13 -07:00
Linus Torvalds 335ad94c21 Misc changes:
- Prepare for Intel's new SERIALIZE instruction
  - Enable split-lock debugging on more CPUs
  - Add more Intel CPU models
  - Optimize stack canary initialization a bit
  - Simplify the Spectre logic a bit
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oTsQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gueQ//Vh9sTi8+q5ZCxXnJQOi59SZsFy1quC2Q
 6bFoSQ46npMBoYyC2eDQ4exBncWLqorT8Vq/evlW3XPldUzHKOk7b4Omonwyrrj5
 dg5fqcRjpjU8ni6egmy4ElMjab53gDuv0yNazjONeBGeWuBGu4vI2bP2eY3Addfm
 2eo2d5ZIMRCdShrUNwToJWWt6q4DzL/lcrVZAlX0LwlWVLqUCdIARALRM7V1XDsC
 udxS8KnvhTaJ7l63BSJREe3AGksLQd9P4UkJS4IE4t0zINBIrME043BYBMTh2Vvk
 y3jykKegIbmhPquGXG8grJbPDUF2/3FxmGKTIhpoo++agb2fxt921y5kqMJwniNS
 H/Gk032iGzjjwWnOoWE56UeuDTOlweSIrm4EG22HyEDK7kOMJusjYAV5fB4Sv7vj
 TBy5q0PCIutjXDTL1hIWf0WDiQt6eGNQS/yt3FlapLBGVRQwMU/pKYVVIOIaFtNs
 szx1ZeiT358Ww8a2fQlb8pqv50Upmr2wqFkAsMbm+NN3N92cqK6gJlo1p7fnxIuG
 +YVASobjsqbn0S62v/9SB/KRJz07adlZ6Tl/O/ILRvWyqik7COCCHDVJ62Zzaz5z
 LqR2daVM5H+Lp6jGZuIoq/JiUkxUe2K990eWHb3PUpOC4Rh73PvtMc7WFhbAjbye
 XV3eOEDi65c=
 =sL2Q
 -----END PGP SIGNATURE-----

Merge tag 'x86-cpu-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cpu updates from Ingo Molar:

 - prepare for Intel's new SERIALIZE instruction

 - enable split-lock debugging on more CPUs

 - add more Intel CPU models

 - optimize stack canary initialization a bit

 - simplify the Spectre logic a bit

* tag 'x86-cpu-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Refactor sync_core() for readability
  x86/cpu: Relocate sync_core() to sync_core.h
  x86/cpufeatures: Add enumeration for SERIALIZE instruction
  x86/split_lock: Enable the split lock feature on Sapphire Rapids and Alder Lake CPUs
  x86/cpu: Add Lakefield, Alder Lake and Rocket Lake models to the to Intel CPU family
  x86/stackprotector: Pre-initialize canary for secondary CPUs
  x86/speculation: Merge one test in spectre_v2_user_select_mitigation()
2020-08-03 17:08:02 -07:00
Linus Torvalds 4ee4810315 Improve x86 debuggability: print registers with the same log level as the backtrace.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oSVwRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iVgA//dJgCrIXMtv30UCHH6i4pWqUB9uPAoqk4
 tK3A6L2+D+1nSbREe3DW/8Lz5val3KjfZmDkwZzdtT6W9B3lv7fNE8dU/2yA18Ov
 DPAAf+s+A/rQlNOi+PNQ9tm4zRitHQzdth2p9CeR+Ty5WcaPZ+sTjlSZH4me6a2h
 M4bK8pemseifO7DLlB+ZehrsFF1dcxqgKeZGXY4bpL7w3iunfc8+SNhVa3QX2Yay
 i1XqCHXaAL7LEpmxyTtH0NYrMoEO2z6qpJ4r93I41ku4lHBXjyY+6siJYHX4cmRr
 RoUbPNlI6BWODS+O0RRYnwlWwkqzC6kahHWjsq8kq6n3QXDXHh8oqqw+qV5BCtSL
 q6CBCAjjYOCH+fMKp22U23kKG+YpTtbtFEewGhMvto5Czis4UdLcuGjuLSmhSrYC
 Lc78DE6V2xNb5bb769oVHSk99ztb/6JhJk3jvIpCAY4Trl52m4RZBTNAGjlSekyQ
 A1EA7siA7W6Ccs0S0XXXVJ8Uzx/UzwzLFfPGZJPCWV/G/+CH4Gq+7dkKh6TKZ+pu
 Cc8yFxs2eBRr6a8GZtmlCN4ydLBkrOzwgWRaLO1CcZC/mgUJhWKoQYuhuuN/Fcnv
 q208MaVOXzV4mAHlqjqTccmFVeB5rL4/+s+7vADj0CBi8xFLkiIySmBrVyy+01Q6
 UfhzuNsJBUk=
 =zyiS
 -----END PGP SIGNATURE-----

Merge tag 'x86-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 debug fixlets from Ingo Molnar:
 "Improve x86 debuggability: print registers with the same log level as
  the backtrace"

* tag 'x86-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/dumpstack: Show registers dump with trace's log level
  x86/dumpstack: Add log_lvl to __show_regs()
  x86/dumpstack: Add log_lvl to show_iret_regs()
2020-08-03 17:00:00 -07:00
Linus Torvalds 37e88224c0 Misc cleanups all around the place.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oRTgRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1huHQ//T2hZk5zlpOtojxvdAzsPgtV4tHawseK8
 +ZZEbrH5qo5/ZMF18qyEJCm9p1yg8uIu71InULRCSgjU3v82GVCcuLXuE36U904G
 gHUqkYPnqxCqx+Li125aye9tKWahXe1DxX+uWbV0Ju7fiCO0rwYIzpWn1bnR6ilp
 fmLGSbgPlTVJwZ9mBvyi3VUlH5tDYidFN74TREUOwx2g5uhg+8uEo44Eb/bx8ESF
 dGt1Z/fnfDHkUZtmhzJk5Uz8nbw7rPHU/EZ4iZAxEzxTutY5PhsvbIfLO4t4HhGn
 utZCk/pIdiLLQ1GaTvFxqi3iolDqpOuXpnDlfEAJD8UlMCnwyh1Certq5LaRbtHS
 8SW3/CeJgzqzrrsYhkxVu2PMFWriSMxgKTLiN0KnzJN0Hu7A5lHbBY/6G7zpsF/A
 2KJ4e8lZiPCcNF7LteSRroUe4hNOYxZ2FlYTXm3AgycSL189UMfWlHFb5c+b4m1a
 cNJpz+jAom8foXN4KhRkl5PFKXVXDGTVln3NRJCh1Mqd1Ef4hsTo9H6FgHX/EfHg
 slJDwwPac80v0dzlMTSsMkyseaKRAqIObWOiknPt1wv/qja7ibVZ5mUbZ+/mfJX/
 YWybcPi1omgUSNt7TNx6jtma67rUjmJW0x9g7UJ/ttEkf6yG2lemrdusydBYuIni
 0Z2+hWzI9MM=
 =X7o0
 -----END PGP SIGNATURE-----

Merge tag 'x86-cleanups-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cleanups from Ingo Molnar:
 "Misc cleanups all around the place"

* tag 'x86-cleanups-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ioperm: Initialize pointer bitmap with NULL rather than 0
  x86: uv: uv_hub.h: Delete duplicated word
  x86: cmpxchg_32.h: Delete duplicated word
  x86: bootparam.h: Delete duplicated word
  x86/mm: Remove the unused mk_kernel_pgd() #define
  x86/tsc: Remove unused "US_SCALE" and "NS_SCALE" leftover macros
  x86/ioapic: Remove unused "IOAPIC_AUTO" define
  x86/mm: Drop unused MAX_PHYSADDR_BITS
  x86/msr: Move the F15h MSRs where they belong
  x86/idt: Make idt_descr static
  initrd: Remove erroneous comment
  x86/mm/32: Fix -Wmissing prototypes warnings for init.c
  cpu/speculation: Add prototype for cpu_show_srbds()
  x86/mm: Fix -Wmissing-prototypes warnings for arch/x86/mm/init.c
  x86/asm: Unify __ASSEMBLY__ blocks
  x86/cpufeatures: Mark two free bits in word 3
  x86/msr: Lift AMD family 0x15 power-specific MSRs
2020-08-03 16:53:28 -07:00
Linus Torvalds 97c6f57dc9 A single commit that improves the alternatives patching syslog debug output.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oJc8RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1jlkRAAkF72DJULcyWrBvoNNeawf+I1DBlRnz0Y
 brZDicQ4SOe5zlIywhUNJHRMY1JCu9qpBaPFVh9yerdaK3IgV6eYE8u+NVrf4Jth
 4azle40SMI2W5eJe73JIEdo3Yn9/2fOZnuwpIxLdeNBCMvLNEuSDwiY+J7kVRfRZ
 OtrkD8mbrZGqOaZxk/AKU4KO0Rh0SfMsaZ6nxdpGOxkh1KwEnClb4ehOo1SXL2xl
 JyNWINOJkjoJhrUTJXmQIJ2lJTtHiI8DttnkPYgygnarBvC3uwoEo7e2O7X696Wm
 iZy75iGjW/yLWHsobJJKwhs16ijFp3P/S+l7BS9w5327ZIemrr7eSbVSu1lOWxGT
 c7tJsPPhfo6658+Zaq9u9LEfCotn4oHREI8HmDTTQiF+MwF2T/R1g0cqgAV/Woo9
 UF+uSbyFsAfoz9N/B+aMfS5rpuSE8ugfZ654GV2nOLhSlDsmUkfOpbzL6PBJSF9P
 DaB4FW7JMHFAU0mVaCAMtAahTp5PPLIdfV/m0AmTIIrMgtmH53Ta/Tt0xFoQy6Ev
 u8GmS0SzM4xgZIe0rKaSgXAooI9eyY5zqF0s8Mwj64+55DLwQ6aEkBxyDiVJzO+b
 nY8t52do6OLCxnG9QEJEsXJXipyFzWOvcj7PTZuZRUACpLe8Bxt42ViTr1Mu0hZs
 ojVETEqeNcU=
 =cpIN
 -----END PGP SIGNATURE-----

Merge tag 'x86-alternatives-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86/alternatives update from Ingo Molnar:
 "A single commit that improves the alternatives patching syslog debug
  output"

* tag 'x86-alternatives-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/alternatives: Add pr_fmt() to debug macros
2020-08-03 15:56:37 -07:00
Linus Torvalds e4cbce4d13 The main changes in this cycle were:
- Improve uclamp performance by using a static key for the fast path
 
  - Add the "sched_util_clamp_min_rt_default" sysctl, to optimize for
    better power efficiency of RT tasks on battery powered devices.
    (The default is to maximize performance & reduce RT latencies.)
 
  - Improve utime and stime tracking accuracy, which had a fixed boundary
    of error, which created larger and larger relative errors as the values
    become larger. This is now replaced with more precise arithmetics,
    using the new mul_u64_u64_div_u64() helper in math64.h.
 
  - Improve the deadline scheduler, such as making it capacity aware
 
  - Improve frequency-invariant scheduling
 
  - Misc cleanups in energy/power aware scheduling
 
  - Add sched_update_nr_running tracepoint to track changes to nr_running
 
  - Documentation additions and updates
 
  - Misc cleanups and smaller fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oJDURHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ixLg//bqWzFlfWirvngTgDxDnplwUTyKXmMCcq
 R1IYhlyK2O5FxvhbRmdmW11W3yzyTPvgCs6Q/70negGaPNe2w1OxfxiK9NMKz5eu
 M1LoXas7pL5g7Pr/ZxxHk/8VqJLV4t9MkodiiInmV6lTaznT3sU6a/kpYQjJyFnG
 Tuu9jd6JhdRKmePDJnNmUBoGQ7JiOQDcX4HtkcQ3OA+An3624tmJzbW1yts+uj7J
 ZWo2EY60RfbA9MxQXGPOaR/nAjngWs4Q6tddAh10mftsPq1gR2iFUKju1d31MQt/
 RHLdiqJf+AyUC4popKG7a+7ilCKMBwPociSreTJNPyEUQ1X4AM3vUVk4yjUoiDph
 k2WdsCF8/JRdhXg0NnrpPUqOaAbQj53EeXnitEb92E7WyTZgLOvAtpV//xZo6utp
 2QHerfrQ9SoGQjz/ho78za5vQtV1x25yDhd+X4XV4QEhIy85G9/2JCpC/Kc/TXLf
 OO7A4X69XztKTEJhP60g8ldCPUe4N2vbh1vKY6oAD8AFQVVNZ6n7375/Qa//b0/k
 ++hcYkPc2EK97/aBFdvzDgqb7aUo7Mtn2ibke16sQU4szulaoRuAHQG4jdGKMwbD
 dk2VBoxyxeYFXWHsNneSe87+ha3sd0dSN0ul1EB/SlFrVELMvy634YXnMYGW8ima
 PzyPB0ezpuA=
 =PbO7
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:

 - Improve uclamp performance by using a static key for the fast path

 - Add the "sched_util_clamp_min_rt_default" sysctl, to optimize for
   better power efficiency of RT tasks on battery powered devices.
   (The default is to maximize performance & reduce RT latencies.)

 - Improve utime and stime tracking accuracy, which had a fixed boundary
   of error, which created larger and larger relative errors as the
   values become larger. This is now replaced with more precise
   arithmetics, using the new mul_u64_u64_div_u64() helper in math64.h.

 - Improve the deadline scheduler, such as making it capacity aware

 - Improve frequency-invariant scheduling

 - Misc cleanups in energy/power aware scheduling

 - Add sched_update_nr_running tracepoint to track changes to nr_running

 - Documentation additions and updates

 - Misc cleanups and smaller fixes

* tag 'sched-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  sched/doc: Factorize bits between sched-energy.rst & sched-capacity.rst
  sched/doc: Document capacity aware scheduling
  sched: Document arch_scale_*_capacity()
  arm, arm64: Fix selection of CONFIG_SCHED_THERMAL_PRESSURE
  Documentation/sysctl: Document uclamp sysctl knobs
  sched/uclamp: Add a new sysctl to control RT default boost value
  sched/uclamp: Fix a deadlock when enabling uclamp static key
  sched: Remove duplicated tick_nohz_full_enabled() check
  sched: Fix a typo in a comment
  sched/uclamp: Remove unnecessary mutex_init()
  arm, arm64: Select CONFIG_SCHED_THERMAL_PRESSURE
  sched: Cleanup SCHED_THERMAL_PRESSURE kconfig entry
  arch_topology, sched/core: Cleanup thermal pressure definition
  trace/events/sched.h: fix duplicated word
  linux/sched/mm.h: drop duplicated words in comments
  smp: Fix a potential usage of stale nr_cpus
  sched/fair: update_pick_idlest() Select group with lowest group_util when idle_cpus are equal
  sched: nohz: stop passing around unused "ticks" parameter.
  sched: Better document ttwu()
  sched: Add a tracepoint to track rq->nr_running
  ...
2020-08-03 14:58:38 -07:00
Linus Torvalds b34133fec8 - HW support updates:
- Add uncore support for Intel Comet Lake
 
    - Add RAPL support for Hygon Fam18h
 
    - Add Intel "IIO stack to PMON mapping" support on Skylake-SP CPUs,
      which enumerates per device performance counters via sysfs and enables
      the perf stat --iiostat functionality
 
    - Add support for Intel "Architectural LBRs", which generalized the model
      specific LBR hardware tracing feature into a model-independent, architected
      performance monitoring feature. Usage is mostly seamless to tooling, as the
      pre-existing LBR features are kept, but there's a couple of advantages
      under the hood, such as faster context-switching, faster LBR reads,
      cleaner exposure of LBR features to guest kernels, etc.
 
      ( Since architectural LBRs are supported via XSAVE, there's related
        changes to the x86 FPU code as well. )
 
  - ftrace/perf updates: Add support to add a text poke event to record changes
                         to kernel text (i.e. self-modifying code) in order to
                         support tracers like Intel PT decoding through
                         jump labels, kprobes and ftrace trampolines.
 
  - Misc cleanups, smaller fixes.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oAgcRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gcSA/9EwKeLF03jkEXwzF/a/YhCxZXODH/klz/
 5D/Li+0HJy9TTVQWaSOxu31VcnWyAPER97aRjHohNMrAFKpAC4GwzxF2fjKUzzKJ
 eoWIgXvtlMM+nQb93UTB2+9Z3eHBEpKsqP8oc6qeXa74b2p3WfmvFRPBWFuzmOlH
 nb26F/Cu46HTEUfWvggU9flS0HpkdZ8X2Rt14sRwq5Gi2Wa/5+ygaksD+5nwRlGM
 r7jBrZBDTOGhy7HjrjpDPif056YU31giKmMQ/j17h1NaT3ciyXYSi0FuKEghDCII
 2OFyH0wZ1vsp63GISosIKFLFoBmOd4He4/sKjdtOtnosan250t3/ZDH/7tw6Rq2V
 tf1o/dMbDmV9v0lAVBZO76Z74ZQbk3+TvFxyDwtBSQYBe2eVfNz0VY4YjSRlRIp0
 1nIbJqiMLa7uquL2K4zZKapt7qsMaVqLO4YUVTzYPvv3luAqFLvC83a2+hapz4cs
 w4nET8lpWanUBK0hidQe1J6NPM4v1mnsvuZfM0p/QwKN9uvV5KoT6YJhRqfTy51g
 je+G80q0XqOH0H8x9iWuLiJe0G72UyhRqzSTxg+Cjj9cAhnsFPFLCNMWSVHqioLP
 JXGQiTp+6SQM6JDXkj5F8InsyT4KfzqizMSnAaH+6bsv9iQKDL4AbD7r92g6nbN9
 PP43QQh23Fg=
 =4pKU
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf event updates from Ingo Molnar:
 "HW support updates:

   - Add uncore support for Intel Comet Lake

   - Add RAPL support for Hygon Fam18h

   - Add Intel "IIO stack to PMON mapping" support on Skylake-SP CPUs,
     which enumerates per device performance counters via sysfs and
     enables the perf stat --iiostat functionality

   - Add support for Intel "Architectural LBRs", which generalized the
     model specific LBR hardware tracing feature into a
     model-independent, architected performance monitoring feature.

     Usage is mostly seamless to tooling, as the pre-existing LBR
     features are kept, but there's a couple of advantages under the
     hood, such as faster context-switching, faster LBR reads, cleaner
     exposure of LBR features to guest kernels, etc.

     ( Since architectural LBRs are supported via XSAVE, there's related
       changes to the x86 FPU code as well. )

  ftrace/perf updates:

   - Add support to add a text poke event to record changes to kernel
     text (i.e. self-modifying code) in order to support tracers like
     Intel PT decoding through jump labels, kprobes and ftrace
     trampolines.

  Misc cleanups, smaller fixes..."

* tag 'perf-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
  perf/x86/rapl: Add Hygon Fam18h RAPL support
  kprobes: Remove unnecessary module_mutex locking from kprobe_optimizer()
  x86/perf: Fix a typo
  perf: <linux/perf_event.h>: drop a duplicated word
  perf/x86/intel/lbr: Support XSAVES for arch LBR read
  perf/x86/intel/lbr: Support XSAVES/XRSTORS for LBR context switch
  x86/fpu/xstate: Add helpers for LBR dynamic supervisor feature
  x86/fpu/xstate: Support dynamic supervisor feature for LBR
  x86/fpu: Use proper mask to replace full instruction mask
  perf/x86: Remove task_ctx_size
  perf/x86/intel/lbr: Create kmem_cache for the LBR context data
  perf/core: Use kmem_cache to allocate the PMU specific data
  perf/core: Factor out functions to allocate/free the task_ctx_data
  perf/x86/intel/lbr: Support Architectural LBR
  perf/x86/intel/lbr: Factor out intel_pmu_store_lbr
  perf/x86/intel/lbr: Factor out rdlbr_all() and wrlbr_all()
  perf/x86/intel/lbr: Mark the {rd,wr}lbr_{to,from} wrappers __always_inline
  perf/x86/intel/lbr: Unify the stored format of LBR information
  perf/x86/intel/lbr: Support LBR_CTL
  perf/x86: Expose CPUID enumeration bits for arch LBR
  ...
2020-08-03 14:51:09 -07:00
Linus Torvalds 9ba19ccd2d These were the main changes in this cycle:
- LKMM updates: mostly documentation changes, but also some new litmus tests for atomic ops.
 
  - KCSAN updates: the most important change is that GCC 11 now has all fixes in place
                   to support KCSAN, so GCC support can be enabled again. Also more annotations.
 
  - futex updates: minor cleanups and simplifications
 
  - seqlock updates: merge preparatory changes/cleanups for the 'associated locks' facilities.
 
  - lockdep updates:
     - simplify IRQ trace event handling
     - add various new debug checks
     - simplify header dependencies, split out <linux/lockdep_types.h>, decouple
       lockdep from other low level headers some more
     - fix NMI handling
 
  - misc cleanups and smaller fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8n9/wRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hZFQ//dD+AKw9Nym+WbylovmeD0qxWxPyeN/jG
 vBVDTOJIJLtZTkZf6YHcYOJlPwaMDYUQluqTPQhsaQZy/NoEb5NM2cFAj2R9gjyT
 O8665T1dvhW9Sh353mBpuwviqdrnvCeHTBEcglSlFY7hxToYAflUN0+DXGVtNys8
 PFNf3L9SHT0GLVC8+di/eJzQaRqxiB0Pq7kvh2RvPJM/dcQNA9Ho3CCNO5j6qGoY
 u7OnMT8xJXkgbdjjUO4RO0v9VjMuNthZ2JiONDgvgKtJfIL2wt5YXIv1EYX0GuWp
 WZgIzE4o1G7GJOOzKpFfZFyK8grHu2fWgK1plvodWjlLkBmltJZ1qyOM+wngd/m2
 TgtPo73/YFbxFUbbBpkb0eiIaH2t99kMvfCWd05+GiPCtzn9UL9GfFRWd42vonwc
 sQWjFrHKlnuzifUfNcLmKg7R2nUtF3Dm/SydiTJ+9NtH/QA17YJKWnlE1moulNtQ
 p7H7+8UdcvSQ7F38A74v2IYNIyDsv5qcE8ar4QHdaanBBX/LCyD0UlfgsgxEReXf
 GDKkpx7LFQlI6Y2YB+dZgkCwhNBl3/OQ3v6hC95B37fA67dAIQyPIWHiHbaM+029
 gghqU4GcUcbjSnHPzl9PPL+hi9MyXrMjpb7CBXytg4NI4EE1waHR+0kX14V8ndRj
 MkWQOKPUgB0=
 =3MTT
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:

 - LKMM updates: mostly documentation changes, but also some new litmus
   tests for atomic ops.

 - KCSAN updates: the most important change is that GCC 11 now has all
   fixes in place to support KCSAN, so GCC support can be enabled again.
   Also more annotations.

 - futex updates: minor cleanups and simplifications

 - seqlock updates: merge preparatory changes/cleanups for the
   'associated locks' facilities.

 - lockdep updates:
    - simplify IRQ trace event handling
    - add various new debug checks
    - simplify header dependencies, split out <linux/lockdep_types.h>,
      decouple lockdep from other low level headers some more
    - fix NMI handling

 - misc cleanups and smaller fixes

* tag 'locking-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
  kcsan: Improve IRQ state trace reporting
  lockdep: Refactor IRQ trace events fields into struct
  seqlock: lockdep assert non-preemptibility on seqcount_t write
  lockdep: Add preemption enabled/disabled assertion APIs
  seqlock: Implement raw_seqcount_begin() in terms of raw_read_seqcount()
  seqlock: Add kernel-doc for seqcount_t and seqlock_t APIs
  seqlock: Reorder seqcount_t and seqlock_t API definitions
  seqlock: seqcount_t latch: End read sections with read_seqcount_retry()
  seqlock: Properly format kernel-doc code samples
  Documentation: locking: Describe seqlock design and usage
  locking/qspinlock: Do not include atomic.h from qspinlock_types.h
  locking/atomic: Move ATOMIC_INIT into linux/types.h
  lockdep: Move list.h inclusion into lockdep.h
  locking/lockdep: Fix TRACE_IRQFLAGS vs. NMIs
  futex: Remove unused or redundant includes
  futex: Consistently use fshared as boolean
  futex: Remove needless goto's
  futex: Remove put_futex_key()
  rwsem: fix commas in initialisation
  docs: locking: Replace HTTP links with HTTPS ones
  ...
2020-08-03 14:39:35 -07:00
Linus Torvalds 3b4b84b2ea Fix a recent IRQ affinities regression, add in a missing debugfs printout
that helps the debugging of IRQ affinity logic bugs, and fix a memory leak.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8nEn8RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ipYA/+KOWjDuRp1YBZeZ4/55RjGzimsW5jkLIY
 0Na3WGjN/QBKCzmRJNnMyW1UjRgpHpBhOsphTcHVdhJo9jg5+DX+XdVTwKGTqAI+
 7DqzP4dzifSgUwdcxIbKwtZquBRzKk1K0Z25b6Jc0WJwkGRx3LWhhRDERPUEHtXg
 Sl07XxiuqFLcQZz9o3hisKzEfA2llB4bfXOjLCJlLK3HUZKccoBjWKbTrI3ymCiz
 f0iV9a7kNzo4fJNddKOBTtDWFEhpj6NgEVtLNdAaDti7MSSjPbB1BsiK64UInGMQ
 4881ItYAOHGuCHe8yYnjlWA5kmwX14KjN6c3RAXK3n4+wvf+17RJC+FLH1PbkFIx
 hZ8k9x2Y5Dpt8vD8fGkoqi2nr2JYbIiOm79AjrD+Li+wWKG3iw4AGEoBHBJzKHUb
 naEGiUDJpn7pdpPWMACoctAIhy7/gDA1pPyb5F7Bf/RwoskIyu4i/d/xz05zBg3H
 HZMC2Lqcgh7LTS91NnmCx8XELdgL14mN19LK5enH3QTIPtdxmZ5x4quKw6ajMAAQ
 jwRpExqy6E1TQkIG5T5hjT0EMuj4uA6OzaoeOroFzKuzo+jiEDl49WAx+9Im9oBb
 i7hT4PM/wR7BcfmTMVhmmns4Dp0LkW7dRxHIjo7Fzft5iF8UkO7o7A4VUoAIxrSm
 xDFlBO/mo3w=
 =oKU9
 -----END PGP SIGNATURE-----

Merge tag 'irq-urgent-2020-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Ingo Molnar:
 "Fix a recent IRQ affinities regression, add in a missing debugfs
  printout that helps the debugging of IRQ affinity logic bugs, and fix
  a memory leak"

* tag 'irq-urgent-2020-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/debugfs: Add missing irqchip flags
  genirq/affinity: Make affinity setting if activated opt-in
  irqdomain/treewide: Free firmware node after domain removal
2020-08-03 14:21:52 -07:00
Ingo Molnar 28cff52eae Merge branch 'linus' into locking/core, to resolve conflict
Conflicts:
	arch/arm/include/asm/percpu.h

As Stephen Rothwell noted, there's a conflict between this commit
in locking/core:

  a21ee6055c ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables")

and this fresh upstream commit:

  aa54ea903a ("ARM: percpu.h: fix build error")

a21ee6055c is a simpler solution to the dependency problem and doesn't
further increase header hell - so this conflict resolution effectively
reverts aa54ea903a and uses the a21ee6055c solution.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-31 12:16:09 +02:00
Christoph Hellwig c8376994c8 initrd: remove support for multiple floppies
Remove the special handling for multiple floppies in the initrd code.
No one should be using floppies for booting these days. (famous last
words..)

Includes a spelling fix from Colin Ian King <colin.king@canonical.com>.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-30 08:22:33 +02:00
Thomas Gleixner bdd6558959 x86/i8259: Use printk_deferred() to prevent deadlock
0day reported a possible circular locking dependency:

Chain exists of:
  &irq_desc_lock_class --> console_owner --> &port_lock_key

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&port_lock_key);
                               lock(console_owner);
                               lock(&port_lock_key);
  lock(&irq_desc_lock_class);

The reason for this is a printk() in the i8259 interrupt chip driver
which is invoked with the irq descriptor lock held, which reverses the
lock operations vs. printk() from arbitrary contexts.

Switch the printk() to printk_deferred() to avoid that.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87365abt2v.fsf@nanos.tec.linutronix.de
2020-07-29 16:27:16 +02:00
Ingo Molnar e89d4ca1df Linux 5.8-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8d8h4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGd0sH/2iktYhMwPxzzpnb
 eI3OuTX/mRn4vUFOfpx9dmGVleMfKkpbvnn3IY7wA62Qfv7J7lkFRa1Bd1DlqXfW
 yyGTGDSKG5chiRCOU3s9ni92M4xIzFlrojyt/dIK2lUGMzUPI9FGlZRGQLKqqwLh
 2syOXRWbcQ7e52IHtDSy3YBNveKRsP4NyqV+GxGiex18SMB/M3Pw9EMH614eDPsE
 QAGQi5uGv4hPJtFHgXgUyBPLFHIyFAiVxhFRIj7u2DSEKY79+wO1CGWFiFvdTY4B
 CbqKXLffY3iQdFsLJkj9Dl8cnOQnoY44V0EBzhhORxeOp71StUVaRwQMFa5tp48G
 171s5Hs=
 =BQIl
 -----END PGP SIGNATURE-----

Merge tag 'v5.8-rc7' into perf/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-28 13:18:01 +02:00
Al Viro 0557d64d98 x86: switch to ->regset_get()
All instances of ->get() in arch/x86 switched; that might or might
not be worth splitting up.  Notes:

	* for xstateregs_get() the amount we want to store is determined at
the boot time; see init_xstate_size() and update_regset_xstate_info() for
details.  task->thread.fpu.state.xsave ends with a flexible array member and
the amount of data in it depends upon the FPU features supported/enabled.

	* fpregs_get() writes slightly less than full ->thread.fpu.state.fsave
(the last word is not copied); we pass the full size of state.fsave and let
membuf_write() trim to the amount declared by regset - __regset_get() will
make sure that the space in buffer is no more than that.

	* copy_xstate_to_user() and its helpers are gone now.

	* fpregs_soft_get() was getting user_regset_copyout() arguments
wrong.  Since "x86: x86 user_regset math_emu" back in 2008...  I really
doubt that it's worth splitting out for -stable, though - you need
a 486SX box for that to trigger...

[Kevin's braino fix for copy_xstate_to_kernel() essentially duplicated here]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-07-27 14:31:07 -04:00
Thomas Gleixner f0c7baca18 genirq/affinity: Make affinity setting if activated opt-in
John reported that on a RK3288 system the perf per CPU interrupts are all
affine to CPU0 and provided the analysis:

 "It looks like what happens is that because the interrupts are not per-CPU
  in the hardware, armpmu_request_irq() calls irq_force_affinity() while
  the interrupt is deactivated and then request_irq() with IRQF_PERCPU |
  IRQF_NOBALANCING.  

  Now when irq_startup() runs with IRQ_STARTUP_NORMAL, it calls
  irq_setup_affinity() which returns early because IRQF_PERCPU and
  IRQF_NOBALANCING are set, leaving the interrupt on its original CPU."

This was broken by the recent commit which blocked interrupt affinity
setting in hardware before activation of the interrupt. While this works in
general, it does not work for this particular case. As contrary to the
initial analysis not all interrupt chip drivers implement an activate
callback, the safe cure is to make the deferred interrupt affinity setting
at activation time opt-in.

Implement the necessary core logic and make the two irqchip implementations
for which this is required opt-in. In hindsight this would have been the
right thing to do, but ...

Fixes: baedb87d1b ("genirq/affinity: Handle affinity setting on inactive interrupts correctly")
Reported-by: John Keeping <john@metanate.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/87blk4tzgm.fsf@nanos.tec.linutronix.de
2020-07-27 16:20:40 +02:00
Ricardo Neri 9998a9832c x86/cpu: Relocate sync_core() to sync_core.h
Having sync_core() in processor.h is problematic since it is not possible
to check for hardware capabilities via the *cpu_has() family of macros.
The latter needs the definitions in processor.h.

It also looks more intuitive to relocate the function to sync_core.h.

This changeset does not make changes in functionality.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20200727043132.15082-3-ricardo.neri-calderon@linux.intel.com
2020-07-27 12:42:06 +02:00
Ingo Molnar 538b10856b Linux 5.8-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8d8h4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGd0sH/2iktYhMwPxzzpnb
 eI3OuTX/mRn4vUFOfpx9dmGVleMfKkpbvnn3IY7wA62Qfv7J7lkFRa1Bd1DlqXfW
 yyGTGDSKG5chiRCOU3s9ni92M4xIzFlrojyt/dIK2lUGMzUPI9FGlZRGQLKqqwLh
 2syOXRWbcQ7e52IHtDSy3YBNveKRsP4NyqV+GxGiex18SMB/M3Pw9EMH614eDPsE
 QAGQi5uGv4hPJtFHgXgUyBPLFHIyFAiVxhFRIj7u2DSEKY79+wO1CGWFiFvdTY4B
 CbqKXLffY3iQdFsLJkj9Dl8cnOQnoY44V0EBzhhORxeOp71StUVaRwQMFa5tp48G
 171s5Hs=
 =BQIl
 -----END PGP SIGNATURE-----

Merge tag 'v5.8-rc7' into x86/cpu, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-27 12:41:13 +02:00
Ingo Molnar 2d65685a4a Merge branch 'x86/urgent' into x86/cleanups
Refresh the branch for a dependent commit.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-26 19:52:30 +02:00
Ingo Molnar f87032aec4 Merge branch 'locking/nmi' into x86/entry
Resolve conflicts with ongoing lockdep work that fixed the NMI entry code.

Conflicts:
	arch/x86/entry/common.c
	arch/x86/include/asm/idtentry.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-26 18:26:25 +02:00
Ingo Molnar c84d53051f Linux 5.8-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8UzA4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGQ7cH/3v+Gv+SmHJCvaT2
 CSu0+7okVnYbY3UTb3hykk7/aOqb6284KjxR03r0CWFzsEsZVhC5pvvruASSiMQg
 Pi04sLqv6CsGLHd1n+pl4AUYEaxq6k4KS3uU3HHSWxrahDDApQoRUx2F8lpOxyj8
 RiwnoO60IMPA7IFJqzcZuFqsgdxqiiYvnzT461KX8Mrw6fyMXeR2KAj2NwMX8dZN
 At21Sf8+LSoh6q2HnugfiUd/jR10XbfxIIx2lXgIinb15GXgWydEQVrDJ7cUV7ix
 Jd0S+dtOtp+lWtFHDoyjjqqsMV7+G8i/rFNZoxSkyZqsUTaKzaR6JD3moSyoYZgG
 0+eXO4A=
 =9EpR
 -----END PGP SIGNATURE-----

Merge tag 'v5.8-rc6' into locking/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-25 21:49:36 +02:00
Fenghua Yu 3aae57f0c3 x86/split_lock: Enable the split lock feature on Sapphire Rapids and Alder Lake CPUs
Add Sapphire Rapids and Alder Lake processors to CPU list to enumerate
and enable the split lock feature.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/1595634320-79689-1-git-send-email-fenghua.yu@intel.com
2020-07-25 12:17:00 +02:00
Ingo Molnar fb4405ae6e Linux 5.8-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8UzA4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGQ7cH/3v+Gv+SmHJCvaT2
 CSu0+7okVnYbY3UTb3hykk7/aOqb6284KjxR03r0CWFzsEsZVhC5pvvruASSiMQg
 Pi04sLqv6CsGLHd1n+pl4AUYEaxq6k4KS3uU3HHSWxrahDDApQoRUx2F8lpOxyj8
 RiwnoO60IMPA7IFJqzcZuFqsgdxqiiYvnzT461KX8Mrw6fyMXeR2KAj2NwMX8dZN
 At21Sf8+LSoh6q2HnugfiUd/jR10XbfxIIx2lXgIinb15GXgWydEQVrDJ7cUV7ix
 Jd0S+dtOtp+lWtFHDoyjjqqsMV7+G8i/rFNZoxSkyZqsUTaKzaR6JD3moSyoYZgG
 0+eXO4A=
 =9EpR
 -----END PGP SIGNATURE-----

Merge tag 'v5.8-rc6' into x86/cpu, to refresh the branch before adding new commits

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-25 12:16:16 +02:00
Thomas Gleixner a27a0a5549 x86/entry: Cleanup idtentry_enter/exit
Remove the temporary defines and fixup all references.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220520.855839271@linutronix.de
2020-07-24 15:05:01 +02:00
Thomas Gleixner 517e499227 x86/entry: Cleanup idtentry_entry/exit_user
Cleanup the temporary defines and use irqentry_ instead of idtentry_.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220520.602603691@linutronix.de
2020-07-24 15:05:00 +02:00
Thomas Gleixner 167fd210ec x86/entry: Use generic syscall exit functionality
Replace the x86 variant with the generic version. Provide the relevant
architecture specific helper functions and defines.

Use a temporary define for idtentry_exit_user which will be cleaned up
seperately.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220520.494648601@linutronix.de
2020-07-24 15:04:59 +02:00
Thomas Gleixner b35ad8405d Merge branch 'core/entry' into x86/entry
Pick up generic entry code to migrate x86 over.
2020-07-24 15:03:59 +02:00
Ira Weiny 7f6fa101df x86: Correct noinstr qualifiers
The noinstr qualifier is to be specified before the return type in the
same way inline is used.

These 2 cases were missed by previous patches.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200723161405.852613-1-ira.weiny@intel.com
2020-07-24 09:54:15 +02:00
Dave Airlie 41206a073c Linux 5.8-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8UzA4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGQ7cH/3v+Gv+SmHJCvaT2
 CSu0+7okVnYbY3UTb3hykk7/aOqb6284KjxR03r0CWFzsEsZVhC5pvvruASSiMQg
 Pi04sLqv6CsGLHd1n+pl4AUYEaxq6k4KS3uU3HHSWxrahDDApQoRUx2F8lpOxyj8
 RiwnoO60IMPA7IFJqzcZuFqsgdxqiiYvnzT461KX8Mrw6fyMXeR2KAj2NwMX8dZN
 At21Sf8+LSoh6q2HnugfiUd/jR10XbfxIIx2lXgIinb15GXgWydEQVrDJ7cUV7ix
 Jd0S+dtOtp+lWtFHDoyjjqqsMV7+G8i/rFNZoxSkyZqsUTaKzaR6JD3moSyoYZgG
 0+eXO4A=
 =9EpR
 -----END PGP SIGNATURE-----

Merge v5.8-rc6 into drm-next

I've got a silent conflict + two trees based on fixes to merge.

Fixes a silent merge with amdgpu

Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-07-24 08:48:05 +10:00
Jon Derrick ec0160891e irqdomain/treewide: Free firmware node after domain removal
Commit 711419e504 ("irqdomain: Add the missing assignment of
domain->fwnode for named fwnode") unintentionally caused a dangling pointer
page fault issue on firmware nodes that were freed after IRQ domain
allocation. Commit e3beca48a4 fixed that dangling pointer issue by only
freeing the firmware node after an IRQ domain allocation failure. That fix
no longer frees the firmware node immediately, but leaves the firmware node
allocated after the domain is removed.

The firmware node must be kept around through irq_domain_remove, but should be
freed it afterwards.

Add the missing free operations after domain removal where where appropriate.

Fixes: e3beca48a4 ("irqdomain/treewide: Keep firmware node unconditionally allocated")
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>	# drivers/pci
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1595363169-7157-1-git-send-email-jonathan.derrick@intel.com
2020-07-23 00:08:52 +02:00
Dmitry Safonov ef2ff0f5d6 x86/dumpstack: Show registers dump with trace's log level
show_trace_log_lvl() provides x86 platform-specific way to unwind
backtrace with a given log level. Unfortunately, registers dump(s) are
not printed with the same log level - instead, KERN_DEFAULT is always
used.

Arista's switches uses quite common setup with rsyslog, where only
urgent messages goes to console (console_log_level=KERN_ERR), everything
else goes into /var/log/ as the console baud-rate often is indecently
slow (9600 bps).

Backtrace dumps without registers printed have proven to be as useful as
morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED
(which I believe is still the most elegant way to fix raciness of sysrq[1])
the log level should be passed down the stack to register dumping
functions. Besides, there is a potential use-case for printing traces
with KERN_DEBUG level [2] (where registers dump shouldn't appear with
higher log level).

After all preparations are done, provide log_lvl parameter for
show_regs_if_on_stack() and wire up to actual log level used as
an argument for show_trace_log_lvl().

[1]: https://lore.kernel.org/lkml/20190528002412.1625-1-dima@arista.com/
[2]: https://lore.kernel.org/linux-doc/20190724170249.9644-1-dima@arista.com/

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Petr Mladek <pmladek@suse.com>
Link: https://lkml.kernel.org/r/20200629144847.492794-4-dima@arista.com
2020-07-22 23:56:54 +02:00
Dmitry Safonov 44e215352c x86/dumpstack: Add log_lvl to __show_regs()
show_trace_log_lvl() provides x86 platform-specific way to unwind
backtrace with a given log level. Unfortunately, registers dump(s) are
not printed with the same log level - instead, KERN_DEFAULT is always
used.

Arista's switches uses quite common setup with rsyslog, where only
urgent messages goes to console (console_log_level=KERN_ERR), everything
else goes into /var/log/ as the console baud-rate often is indecently
slow (9600 bps).

Backtrace dumps without registers printed have proven to be as useful as
morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED
(which I believe is still the most elegant way to fix raciness of sysrq[1])
the log level should be passed down the stack to register dumping
functions. Besides, there is a potential use-case for printing traces
with KERN_DEBUG level [2] (where registers dump shouldn't appear with
higher log level).

Add log_lvl parameter to __show_regs().
Keep the used log level intact to separate visible change.

[1]: https://lore.kernel.org/lkml/20190528002412.1625-1-dima@arista.com/
[2]: https://lore.kernel.org/linux-doc/20190724170249.9644-1-dima@arista.com/

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Petr Mladek <pmladek@suse.com>
Link: https://lkml.kernel.org/r/20200629144847.492794-3-dima@arista.com
2020-07-22 23:56:53 +02:00
Dmitry Safonov fd07f802a7 x86/dumpstack: Add log_lvl to show_iret_regs()
show_trace_log_lvl() provides x86 platform-specific way to unwind
backtrace with a given log level. Unfortunately, registers dump(s) are
not printed with the same log level - instead, KERN_DEFAULT is always
used.

Arista's switches uses quite common setup with rsyslog, where only
urgent messages goes to console (console_log_level=KERN_ERR), everything
else goes into /var/log/ as the console baud-rate often is indecently
slow (9600 bps).

Backtrace dumps without registers printed have proven to be as useful as
morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED
(which I believe is still the most elegant way to fix raciness of sysrq[1])
the log level should be passed down the stack to register dumping
functions. Besides, there is a potential use-case for printing traces
with KERN_DEBUG level [2] (where registers dump shouldn't appear with
higher log level).

Add log_lvl parameter to show_iret_regs() as a preparation to add it
to __show_regs() and show_regs_if_on_stack().

[1]: https://lore.kernel.org/lkml/20190528002412.1625-1-dima@arista.com/
[2]: https://lore.kernel.org/linux-doc/20190724170249.9644-1-dima@arista.com/

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Petr Mladek <pmladek@suse.com>
Link: https://lkml.kernel.org/r/20200629144847.492794-2-dima@arista.com
2020-07-22 23:56:53 +02:00
Thomas Gleixner d181d2da01 x86/dumpstack: Dump user space code correctly again
H.J. reported that post 5.7 a segfault of a user space task does not longer
dump the Code bytes when /proc/sys/debug/exception-trace is enabled. It
prints 'Code: Bad RIP value.' instead.

This was broken by a recent change which made probe_kernel_read() reject
non-kernel addresses.

Update show_opcodes() so it retrieves user space opcodes via
copy_from_user_nmi().

Fixes: 98a23609b1 ("maccess: always use strict semantics for probe_kernel_read")
Reported-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/87h7tz306w.fsf@nanos.tec.linutronix.de
2020-07-22 23:47:48 +02:00
Josh Poimboeuf 039a7a30ec x86/stacktrace: Fix reliable check for empty user task stacks
If a user task's stack is empty, or if it only has user regs, ORC
reports it as a reliable empty stack.  But arch_stack_walk_reliable()
incorrectly treats it as unreliable.

That happens because the only success path for user tasks is inside the
loop, which only iterates on non-empty stacks.  Generally, a user task
must end in a user regs frame, but an empty stack is an exception to
that rule.

Thanks to commit 71c9582528 ("x86/unwind/orc: Fix error handling in
__unwind_start()"), unwind_start() now sets state->error appropriately.
So now for both ORC and FP unwinders, unwind_done() and !unwind_error()
always means the end of the stack was successfully reached.  So the
success path for kthreads is no longer needed -- it can also be used for
empty user tasks.

Reported-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Link: https://lkml.kernel.org/r/f136a4e5f019219cbc4f4da33b30c2f44fa65b84.1594994374.git.jpoimboe@redhat.com
2020-07-22 23:47:47 +02:00
Josh Poimboeuf 372a8eaa05 x86/unwind/orc: Fix ORC for newly forked tasks
The ORC unwinder fails to unwind newly forked tasks which haven't yet
run on the CPU.  It correctly reads the 'ret_from_fork' instruction
pointer from the stack, but it incorrectly interprets that value as a
call stack address rather than a "signal" one, so the address gets
incorrectly decremented in the call to orc_find(), resulting in bad ORC
data.

Fix it by forcing 'ret_from_fork' frames to be signal frames.

Reported-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Link: https://lkml.kernel.org/r/f91a8778dde8aae7f71884b5df2b16d552040441.1594994374.git.jpoimboe@redhat.com
2020-07-22 23:47:47 +02:00
Peter Zijlstra 015dc08918 Merge branch 'sched/urgent' 2020-07-22 10:22:02 +02:00
Joerg Roedel de2b41be8f x86, vmlinux.lds: Page-align end of ..page_aligned sections
On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is
page-aligned, but the end of the .bss..page_aligned section is not
guaranteed to be page-aligned.

As a result, objects from other .bss sections may end up on the same 4k
page as the idt_table, and will accidentially get mapped read-only during
boot, causing unexpected page-faults when the kernel writes to them.

This could be worked around by making the objects in the page aligned
sections page sized, but that's wrong.

Explicit sections which store only page aligned objects have an implicit
guarantee that the object is alone in the page in which it is placed. That
works for all objects except the last one. That's inconsistent.

Enforcing page sized objects for these sections would wreckage memory
sanitizers, because the object becomes artificially larger than it should
be and out of bound access becomes legit.

Align the end of the .bss..page_aligned and .data..page_aligned section on
page-size so all objects places in these sections are guaranteed to have
their own page.

[ tglx: Amended changelog ]

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200721093448.10417-1-joro@8bytes.org
2020-07-22 09:38:37 +02:00
Eric W. Biederman be619f7f06 exec: Implement kernel_execve
To allow the kernel not to play games with set_fs to call exec
implement kernel_execve.  The function kernel_execve takes pointers
into kernel memory and copies the values pointed to onto the new
userspace stack.

The calls with arguments from kernel space of do_execve are replaced
with calls to kernel_execve.

The calls do_execve and do_execveat are made static as there are now
no callers outside of exec.

The comments that mention do_execve are updated to refer to
kernel_execve or execve depending on the circumstances.  In addition
to correcting the comments, this makes it easy to grep for do_execve
and verify it is not used.

Inspired-by: https://lkml.kernel.org/r/20200627072704.2447163-1-hch@lst.de
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/87wo365ikj.fsf@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-07-21 08:24:52 -05:00
Kevin Buettner 5714ee50bb copy_xstate_to_kernel: Fix typo which caused GDB regression
This fixes a regression encountered while running the
gdb.base/corefile.exp test in GDB's test suite.

In my testing, the typo prevented the sw_reserved field of struct
fxregs_state from being output to the kernel XSAVES area.  Thus the
correct mask corresponding to XCR0 was not present in the core file for
GDB to interrogate, resulting in the following behavior:

   [kev@f32-1 gdb]$ ./gdb -q testsuite/outputs/gdb.base/corefile/corefile testsuite/outputs/gdb.base/corefile/corefile.core
   Reading symbols from testsuite/outputs/gdb.base/corefile/corefile...
   [New LWP 232880]

   warning: Unexpected size of section `.reg-xstate/232880' in core file.

With the typo fixed, the test works again as expected.

Signed-off-by: Kevin Buettner <kevinb@redhat.com>
Fixes: 9e46365459 ("copy_xstate_to_kernel(): don't leave parts of destination uninitialized")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Airlie <airlied@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-19 17:09:10 -07:00
Linus Torvalds efb9666e90 A pile of fixes for x86:
- Fix the I/O bitmap invalidation on XEN PV, which was overlooked in the
    recent ioperm/iopl rework. This caused the TSS and XEN's I/O bitmap to
    get out of sync.
 
  - Use the proper vectors for HYPERV.
 
  - Make disabling of stack protector for the entry code work with GCC
    builds which enable stack protector by default. Removing the option is
    not sufficient, it needs an explicit -fno-stack-protector to shut it
    off.
 
  - Mark check_user_regs() noinstr as it is called from noinstr code. The
    missing annotation causes it to be placed in the text section which
    makes it instrumentable.
 
  - Add the missing interrupt disable in exc_alignment_check()
 
  - Fixup a XEN_PV build dependency in the 32bit entry code
 
  - A few fixes to make the Clang integrated assembler happy
 
  - Move EFI stub build to the right place for out of tree builds
 
  - Make prepare_exit_to_usermode() static. It's not longer called from ASM
    code.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8UR+MTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoQCUD/4/9W5FFvdZvQPwmXsHPaVnW9hUsXxG
 0tjc34xqDcgEl1U3khu+6jj+oHx+JM+4wGP/V49Wqx6xkrJ33/a8uYErAgI7+Pmp
 s3T2gXMWkgJtYFlDQdAMbeuuM2cOFZJw4BxxvTMth5EixQvk1EkX6QyBjLaSGo8y
 78sWtZ6Oh5Ql9ua/9TOilewLsCsQSFIFn0o/hawwwPUMrwGvD29scha0XHom+AO7
 uwejfU8klq2HJJaLaaiUaiNBkFz0TNGJtY+3mQpw8BPjCuuBQhYygrS0X4uQzo01
 4XJzhDnOVbAYWqi0/T+mAEmuJ9NBZJwYiYrwBYCkZgELwJKLzhzO2GOgP9xEsFY4
 VUNgqHFhKrQp10k2k4L/A5tmr+0GntiCQhdZi+/gty6oO/t3ni57pRcAhA9qBNOb
 8ZqumBwgaaAIqcmdtoyXAIveWOHnzwKEg6wmIGFbyCwHjeLJKJG7KhpXIpEuX+j2
 DC7EfYvRB+jllAk1CBypBvzD0DHfMZ0myPxCcZiW2wHTVAlkpY7hiIyPHqocjE9L
 OjOQ7FS6E2/p24lYVcLUFWcESxGFvQjjxwXk7htjpGUIZsQOhz/LOW+CIPCsfbqE
 HoEsHmNltksYYV9FDfevXRp5sbxpx3wQSLOgqNqiOpy4cTCG8boalUqHQ0OsN8Oa
 EgU067yF77ymRg==
 =QAeH
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master

Pull x86 fixes from Thomas Gleixner:
 "A pile of fixes for x86:

   - Fix the I/O bitmap invalidation on XEN PV, which was overlooked in
     the recent ioperm/iopl rework. This caused the TSS and XEN's I/O
     bitmap to get out of sync.

   - Use the proper vectors for HYPERV.

   - Make disabling of stack protector for the entry code work with GCC
     builds which enable stack protector by default. Removing the option
     is not sufficient, it needs an explicit -fno-stack-protector to
     shut it off.

   - Mark check_user_regs() noinstr as it is called from noinstr code.
     The missing annotation causes it to be placed in the text section
     which makes it instrumentable.

   - Add the missing interrupt disable in exc_alignment_check()

   - Fixup a XEN_PV build dependency in the 32bit entry code

   - A few fixes to make the Clang integrated assembler happy

   - Move EFI stub build to the right place for out of tree builds

   - Make prepare_exit_to_usermode() static. It's not longer called from
     ASM code"

* tag 'x86-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Don't add the EFI stub to targets
  x86/entry: Actually disable stack protector
  x86/ioperm: Fix io bitmap invalidation on Xen PV
  x86: math-emu: Fix up 'cmp' insn for clang ias
  x86/entry: Fix vectors to IDTENTRY_SYSVEC for CONFIG_HYPERV
  x86/entry: Add compatibility with IAS
  x86/entry/common: Make prepare_exit_to_usermode() static
  x86/entry: Mark check_user_regs() noinstr
  x86/traps: Disable interrupts in exc_aligment_check()
  x86/entry/32: Fix XEN_PV build dependency
2020-07-19 12:16:09 -07:00
Andy Lutomirski cadfad8701 x86/ioperm: Fix io bitmap invalidation on Xen PV
tss_invalidate_io_bitmap() wasn't wired up properly through the pvop
machinery, so the TSS and Xen's io bitmap would get out of sync
whenever disabling a valid io bitmap.

Add a new pvop for tss_invalidate_io_bitmap() to fix it.

This is XSA-329.

Fixes: 22fe5b0439 ("x86/ioperm: Move TSS bitmap update to exit to user work")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/d53075590e1f91c19f8af705059d3ff99424c020.1595030016.git.luto@kernel.org
2020-07-18 12:31:49 +02:00
Thomas Gleixner baedb87d1b genirq/affinity: Handle affinity setting on inactive interrupts correctly
Setting interrupt affinity on inactive interrupts is inconsistent when
hierarchical irq domains are enabled. The core code should just store the
affinity and not call into the irq chip driver for inactive interrupts
because the chip drivers may not be in a state to handle such requests.

X86 has a hacky workaround for that but all other irq chips have not which
causes problems e.g. on GIC V3 ITS.

Instead of adding more ugly hacks all over the place, solve the problem in
the core code. If the affinity is set on an inactive interrupt then:

    - Store it in the irq descriptors affinity mask
    - Update the effective affinity to reflect that so user space has
      a consistent view
    - Don't call into the irq chip driver

This is the core equivalent of the X86 workaround and works correctly
because the affinity setting is established in the irq chip when the
interrupt is activated later on.

Note, that this is only effective when hierarchical irq domains are enabled
by the architecture. Doing it unconditionally would break legacy irq chip
implementations.

For hierarchial irq domains this works correctly as none of the drivers can
have a dependency on affinity setting in inactive state by design.

Remove the X86 workaround as it is not longer required.

Fixes: 02edee152d ("x86/apic/vector: Ignore set_affinity call for inactive interrupts")
Reported-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200529015501.15771-1-alisaidi@amazon.com
Link: https://lkml.kernel.org/r/877dv2rv25.fsf@nanos.tec.linutronix.de
2020-07-17 23:30:43 +02:00
steve.wahl@hpe.com 66d67fecd8 x86/efi: Remove references to no-longer-used efi_have_uv1_memmap()
In removing UV1 support, efi_have_uv1_memmap is no longer used.

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20200713212955.786177105@hpe.com
2020-07-17 16:47:47 +02:00
steve.wahl@hpe.com a6b740f173 x86/platform/uv: Remove support for UV1 platform from x2apic_uv_x
UV1 is not longer supported.

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200713212954.846026992@hpe.com
2020-07-17 16:47:44 +02:00
Kees Cook 3f649ab728 treewide: Remove uninitialized_var() usage
Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.

In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:

git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
	xargs perl -pi -e \
		's/\buninitialized_var\(([^\)]+)\)/\1/g;
		 s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'

drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.

No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/

Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB
Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers
Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-16 12:35:15 -07:00
Thomas Gleixner e3beca48a4 irqdomain/treewide: Keep firmware node unconditionally allocated
Quite some non OF/ACPI users of irqdomains allocate firmware nodes of type
IRQCHIP_FWNODE_NAMED or IRQCHIP_FWNODE_NAMED_ID and free them right after
creating the irqdomain. The only purpose of these FW nodes is to convey
name information. When this was introduced the core code did not store the
pointer to the node in the irqdomain. A recent change stored the firmware
node pointer in irqdomain for other reasons and missed to notice that the
usage sites which do the alloc_fwnode/create_domain/free_fwnode sequence
are broken by this. Storing a dangling pointer is dangerous itself, but in
case that the domain is destroyed later on this leads to a double free.

Remove the freeing of the firmware node after creating the irqdomain from
all affected call sites to cure this.

Fixes: 711419e504 ("irqdomain: Add the missing assignment of domain->fwnode for named fwnode")
Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/873661qakd.fsf@nanos.tec.linutronix.de
2020-07-14 17:44:42 +02:00
Peter Zijlstra ba1f2b2eaa x86/entry: Fix NMI vs IRQ state tracking
While the nmi_enter() users did
trace_hardirqs_{off_prepare,on_finish}() there was no matching
lockdep_hardirqs_*() calls to complete the picture.

Introduce idtentry_{enter,exit}_nmi() to enable proper IRQ state
tracking across the NMIs.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200623083721.216740948@infradead.org
2020-07-10 12:00:01 +02:00
Thomas Gleixner bc916e67c0 Merge branch 'x86/urgent' into x86/entry to pick up upstream fixes. 2020-07-10 11:45:22 +02:00
Thomas Gleixner bce9b042ec x86/traps: Disable interrupts in exc_aligment_check()
exc_alignment_check() fails to disable interrupts before returning to the
entry code.

Fixes: ca4c6a9858 ("x86/traps: Make interrupt enable/disable symmetric in C code")
Reported-by: syzbot+0889df9502bc0f112b31@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200708192934.076519438@linutronix.de
2020-07-09 11:18:29 +02:00
Zhenzhong Duan 05eee619ed x86/kvm: Add "nopvspin" parameter to disable PV spinlocks
There are cases where a guest tries to switch spinlocks to bare metal
behavior (e.g. by setting "xen_nopvspin" on XEN platform and
"hv_nopvspin" on HYPER_V).

That feature is missed on KVM, add a new parameter "nopvspin" to disable
PV spinlocks for KVM guest.

The new 'nopvspin' parameter will also replace Xen and Hyper-V specific
parameters in future patches.

Define variable nopvsin as global because it will be used in future
patches as above.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08 16:21:57 -04:00
Zhenzhong Duan 5aefd786fb x86/kvm: Change print code to use pr_*() format
pr_*() is preferred than printk(KERN_* ...), after change all the print
in arch/x86/kernel/kvm.c will have "kvm-guest: xxx" style.

No functional change.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08 16:21:56 -04:00
Zhenzhong Duan de585020da Revert "KVM: X86: Fix setup the virt_spin_lock_key before static key get initialized"
This reverts commit 34226b6b70.

Commit 8990cac6e5 ("x86/jump_label: Initialize static branching
early") adds jump_label_init() call in setup_arch() to make static
keys initialized early, so we could use the original simpler code
again.

The similar change for XEN is in commit 090d54bcbc ("Revert
"x86/paravirt: Set up the virt_spin_lock_key after static keys get
initialized"")

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08 16:21:56 -04:00
Paolo Bonzini 26d05b368a Merge branch 'kvm-async-pf-int' into HEAD 2020-07-08 16:20:30 -04:00