mirror of https://gitee.com/openkylin/linux.git
1474 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
Linus Torvalds | cea061e455 |
Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Ingo Molnar: "The main changes in this cycle were: - Add "Jailhouse" hypervisor support (Jan Kiszka) - Update DeviceTree support (Ivan Gorinov) - Improve DMI date handling (Andy Shevchenko)" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/PCI: Fix a potential regression when using dmi_get_bios_year() firmware/dmi_scan: Uninline dmi_get_bios_year() helper x86/devicetree: Use CPU description from Device Tree of/Documentation: Specify local APIC ID in "reg" MAINTAINERS: Add entry for Jailhouse x86/jailhouse: Allow to use PCI_MMCONFIG without ACPI x86: Consolidate PCI_MMCONFIG configs x86: Align x86_64 PCI_MMCONFIG with 32-bit variant x86/jailhouse: Enable PCI mmconfig access in inmates PCI: Scan all functions when running over Jailhouse jailhouse: Provide detection for non-x86 systems x86/devicetree: Fix device IRQ settings in DT x86/devicetree: Initialize device tree before using it pci: Simplify code by using the new dmi_get_bios_year() helper ACPI/sleep: Simplify code by using the new dmi_get_bios_year() helper x86/pci: Simplify code by using the new dmi_get_bios_year() helper dmi: Introduce the dmi_get_bios_year() helper function x86/platform/quark: Re-use DEFINE_SHOW_ATTRIBUTE() macro x86/platform/atom: Re-use DEFINE_SHOW_ATTRIBUTE() macro |
|
Linus Torvalds | d22fff8141 |
Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Ingo Molnar: - Extend the memmap= boot parameter syntax to allow the redeclaration and dropping of existing ranges, and to support all e820 range types (Jan H. Schönherr) - Improve the W+X boot time security checks to remove false positive warnings on Xen (Jan Beulich) - Support booting as Xen PVH guest (Juergen Gross) - Improved 5-level paging (LA57) support, in particular it's possible now to have a single kernel image for both 4-level and 5-level hardware (Kirill A. Shutemov) - AMD hardware RAM encryption support (SME/SEV) fixes (Tom Lendacky) - Preparatory commits for hardware-encrypted RAM support on Intel CPUs. (Kirill A. Shutemov) - Improved Intel-MID support (Andy Shevchenko) - Show EFI page tables in page_tables debug files (Andy Lutomirski) - ... plus misc fixes and smaller cleanups * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits) x86/cpu/tme: Fix spelling: "configuation" -> "configuration" x86/boot: Fix SEV boot failure from change to __PHYSICAL_MASK_SHIFT x86/mm: Update comment in detect_tme() regarding x86_phys_bits x86/mm/32: Remove unused node_memmap_size_bytes() & CONFIG_NEED_NODE_MEMMAP_SIZE logic x86/mm: Remove pointless checks in vmalloc_fault x86/platform/intel-mid: Add special handling for ACPI HW reduced platforms ACPI, x86/boot: Introduce the ->reduced_hw_early_init() ACPI callback ACPI, x86/boot: Split out acpi_generic_reduce_hw_init() and export x86/pconfig: Provide defines and helper to run MKTME_KEY_PROG leaf x86/pconfig: Detect PCONFIG targets x86/tme: Detect if TME and MKTME is activated by BIOS x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G x86/boot/compressed/64: Use page table in trampoline memory x86/boot/compressed/64: Use stack from trampoline memory x86/boot/compressed/64: Make sure we have a 32-bit code segment x86/mm: Do not use paravirtualized calls in native_set_p4d() kdump, vmcoreinfo: Export pgtable_l5_enabled value x86/boot/compressed/64: Prepare new top-level page table for trampoline x86/boot/compressed/64: Set up trampoline memory x86/boot/compressed/64: Save and restore trampoline memory ... |
|
Jaak Ristioja | 1897a9691e |
Documentation: Fix early-microcode.txt references after file rename
The file Documentation/x86/early-microcode.txt was renamed to Documentation/x86/microcode.txt in |
|
David Rientjes | fc5d1073ca |
x86/mm/32: Remove unused node_memmap_size_bytes() & CONFIG_NEED_NODE_MEMMAP_SIZE logic
node_memmap_size_bytes() has been unused since the v3.9 kernel, so remove it.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Fixes:
|
|
Peter Zijlstra | d0266046ad |
x86: Remove FAST_FEATURE_TESTS
Since we want to rely on static branches to avoid speculation, remove any possible fallback code for static_cpu_has. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: torvalds@linux-foundation.org Link: https://lkml.kernel.org/r/20180319154717.705383007@infradead.org |
|
Christoph Hellwig | b6e05477c1 |
dma/direct: Handle the memory encryption bit in common code
Give the basic phys_to_dma() and dma_to_phys() helpers a __-prefix and add the memory encryption mask to the non-prefixed versions. Use the __-prefixed versions directly instead of clearing the mask again in various places. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-13-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Christoph Hellwig | fec777c385 |
x86/dma: Use DMA-direct (CONFIG_DMA_DIRECT_OPS=y)
The generic DMA-direct (CONFIG_DMA_DIRECT_OPS=y) implementation is now functionally equivalent to the x86 nommu dma_map implementation, so switch over to using it. That includes switching from using x86_dma_supported in various IOMMU drivers to use dma_direct_supported instead, which provides the same functionality. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-4-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Linus Walleij | 95260c17b2 |
Linux 4.16-rc5
-----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlqlyPEeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGNa0H/RIa/StQuYu/SBwa JRqQFmkIsx+gG+FyamJrqGzRfyjounES8PbfyaN3cCrzYgeRwMp1U/bZW6/l5tkb OjTtrCJ6CJaa21fC/7aqn3rhejHciKyk83EinMu5WjDpsQcaF2xKr3SaPa62Ja24 fhawKq3CnUa+OUuAbicVX8yn4viUB6x8FjSN/IWfp3Cs4IBR7SGxxD7A4MET9FbQ 5OOu0al8ly9QeCggTtJyk+cApeLfexEBTbUur9gm7GcH9jhUtJSyZCZsDJx6M2yb CwdgF4fyk58c1fuHvTFb0AdUns55ba3nicybRHHMVbDpZIG9v4/M1yJETHHf5cD7 t3rFjrY= =+Ldf -----END PGP SIGNATURE----- Merge tag 'v4.16-rc5' into devel Linux 4.16-rc5 merged into the GPIO devel branch to resolve a nasty conflict between fixes and devel in the RCAR driver. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> |
|
Ingo Molnar | 3c76db70eb |
Merge branch 'x86/pti' into x86/mm, to pick up dependencies
Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Linus Torvalds | ed58d66f60 |
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/pti updates from Thomas Gleixner: "Yet another pile of melted spectrum related updates: - Drop native vsyscall support finally as it causes more trouble than benefit. - Make microcode loading more robust. There were a few issues especially related to late loading which are now surfacing because late loading of the IB* microcodes addressing spectre issues has become more widely used. - Simplify and robustify the syscall handling in the entry code - Prevent kprobes on the entry trampoline code which lead to kernel crashes when the probe hits before CR3 is updated - Don't check microcode versions when running on hypervisors as they are considered as lying anyway. - Fix the 32bit objtool build and a coment typo" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kprobes: Fix kernel crash when probing .entry_trampoline code x86/pti: Fix a comment typo x86/microcode: Synchronize late microcode loading x86/microcode: Request microcode on the BSP x86/microcode/intel: Look into the patch cache first x86/microcode: Do not upload microcode if CPUs are offline x86/microcode/intel: Writeback and invalidate caches before updating microcode x86/microcode/intel: Check microcode revision before updating sibling threads x86/microcode: Get rid of struct apply_microcode_ctx x86/spectre_v2: Don't check microcode versions when running under hypervisors x86/vsyscall/64: Drop "native" vsyscalls x86/entry/64/compat: Save one instruction in entry_INT80_compat() x86/entry: Do not special-case clone(2) in compat entry x86/syscalls: Use COMPAT_SYSCALL_DEFINEx() macros for x86-only compat syscalls x86/syscalls: Use proper syscall definition for sys_ioperm() x86/entry: Remove stale syscall prototype x86/syscalls/32: Simplify $entry == $compat entries objtool: Fix 32-bit build |
|
Jan Kiszka | 8364e1f837 |
x86/jailhouse: Allow to use PCI_MMCONFIG without ACPI
Jailhouse does not use ACPI, but it does support MMCONFIG. Make sure the latter can be built without having to enable ACPI as well. Primarily, its required to make the AMD mmconf-fam10h_64 depend upon MMCONFIG and ACPI, instead of just the former. Saves some bytes in the Jailhouse non-root kernel. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: jailhouse-dev@googlegroups.com Cc: linux-pci@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: https://lkml.kernel.org/r/788bbd5325d1922235e9562c213057425fbc548c.1520408357.git.jan.kiszka@siemens.com |
|
Jan Kiszka | b45c9f3656 |
x86: Consolidate PCI_MMCONFIG configs
Since
|
|
Jan Kiszka | 55027a7772 |
x86: Align x86_64 PCI_MMCONFIG with 32-bit variant
Allow to enable PCI_MMCONFIG when only SFI is present and make this option default on. This will help consolidating both into one Kconfig statement. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: jailhouse-dev@googlegroups.com Cc: linux-pci@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: https://lkml.kernel.org/r/a2faf78c54f340f5549149e8b679c95950dae83d.1520408357.git.jan.kiszka@siemens.com |
|
Andy Lutomirski | 076ca272a1 |
x86/vsyscall/64: Drop "native" vsyscalls
Since Linux v3.2, vsyscalls have been deprecated and slow. From v3.2 on, Linux had three vsyscall modes: "native", "emulate", and "none". "emulate" is the default. All known user programs work correctly in emulate mode, but vsyscalls turn into page faults and are emulated. This is very slow. In "native" mode, the vsyscall page is easily usable as an exploit gadget, but vsyscalls are a bit faster -- they turn into normal syscalls. (This is in contrast to vDSO functions, which can be much faster than syscalls.) In "none" mode, there are no vsyscalls. For all practical purposes, "native" was really just a chicken bit in case something went wrong with the emulation. It's been over six years, and nothing has gone wrong. Delete it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/519fee5268faea09ae550776ce969fa6e88668b0.1520449896.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Linus Torvalds | 85a2d939c0 |
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner: "Yet another pile of melted spectrum related changes: - sanitize the array_index_nospec protection mechanism: Remove the overengineered array_index_nospec_mask_check() magic and allow const-qualified types as index to avoid temporary storage in a non-const local variable. - make the microcode loader more robust by properly propagating error codes. Provide information about new feature bits after micro code was updated so administrators can act upon. - optimizations of the entry ASM code which reduce code footprint and make the code simpler and faster. - fix the {pmd,pud}_{set,clear}_flags() implementations to work properly on paravirt kernels by removing the address translation operations. - revert the harmful vmexit_fill_RSB() optimization - use IBRS around firmware calls - teach objtool about retpolines and add annotations for indirect jumps and calls. - explicitly disable jumplabel patching in __init code and handle patching failures properly instead of silently ignoring them. - remove indirect paravirt calls for writing the speculation control MSR as these calls are obviously proving the same attack vector which is tried to be mitigated. - a few small fixes which address build issues with recent compiler and assembler versions" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits) KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely() KVM/x86: Remove indirect MSR op calls from SPEC_CTRL objtool, retpolines: Integrate objtool with retpoline support more closely x86/entry/64: Simplify ENCODE_FRAME_POINTER extable: Make init_kernel_text() global jump_label: Warn on failed jump_label patching attempt jump_label: Explicitly disable jump labels in __init code x86/entry/64: Open-code switch_to_thread_stack() x86/entry/64: Move ASM_CLAC to interrupt_entry() x86/entry/64: Remove 'interrupt' macro x86/entry/64: Move the switch_to_thread_stack() call to interrupt_entry() x86/entry/64: Move ENTER_IRQ_STACK from interrupt macro to interrupt_entry x86/entry/64: Move PUSH_AND_CLEAR_REGS from interrupt macro to helper function x86/speculation: Move firmware_restrict_branch_speculation_*() from C to CPP objtool: Add module specific retpoline rules objtool: Add retpoline validation objtool: Use existing global variables for options x86/mm/sme, objtool: Annotate indirect call in sme_encrypt_execute() x86/boot, objtool: Annotate indirect jump in secondary_startup_64() x86/paravirt, objtool: Annotate indirect calls ... |
|
Ingo Molnar | 3f7df3efeb |
Linux 4.16-rc3
-----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlqTdg8eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG10wH/iSt+OKmBdUZSAYv ADvfifLynLgugFYNzuijj8/gVt6b0ZIB2/wSYfdPjDErLFogis6wjnxl0lf3sEMB g7Oy8SE+pPPQ7587lFkg6Pj53405b6BwCbSkg8PLlwepSGiu0JmGvUYmz753tIeP kRIIQk/KrLlxNFixhGWNfQ9k8PqJ0NCgcbj+mTxmFkfIw2FKnBtYz72LR7Eut3Mt PJFh4pLKsHKlcjvX8+SehDdLwlEBv/ohDP7S7gRyR+QX1aNZhZAXyHQ0C8/tw8h6 DnRvlTWp9EGTFxp8bYie5xcWusIcfy1eAA8yiG2kH+Mx7kLa8cmU234bHhUiu9yT YJSLoI4= =XBoV -----END PGP SIGNATURE----- Merge tag 'v4.16-rc3' into x86/mm, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
William Breathitt Gray | 17a2a129b3 |
isa: Remove ISA_BUS_API selection for ISA_BUS
ISA_BUS_API is selected by drivers themselves when necessary. The ISA_BUS Kconfig option is now simply a mask for true ISA device drivers and relevant configuration. For now, the ISA_BUS Kconfig option is only available for X86, but may be added for other arch builds in the future if the need arises. Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> |
|
Peter Zijlstra | d5028ba8ee |
objtool, retpolines: Integrate objtool with retpoline support more closely
Disable retpoline validation in objtool if your compiler sucks, and otherwise select the validation stuff for CONFIG_RETPOLINE=y (most builds would already have it set due to ORC). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Kirill A. Shutemov | 6657fca06e |
x86/mm: Allow to boot without LA57 if CONFIG_X86_5LEVEL=y
All pieces of the puzzle are in place and we can now allow to boot with CONFIG_X86_5LEVEL=y on a machine without LA57 support. Kernel will detect that LA57 is missing and fold p4d at runtime. Update the documentation and the Kconfig option description to reflect the change. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20180214182542.69302-10-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Matthew Whitehead | 69b8d3fcab |
x86/Kconfig: Exclude i586-class CPUs lacking PAE support from the HIGHMEM64G Kconfig group
i586-class machines also lack support for Physical Address Extension (PAE), so add them to the exclusion list. Signed-off-by: Matthew Whitehead <tedheadster@gmail.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1518713696-11360-2-git-send-email-tedheadster@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Kirill A. Shutemov | 162434e7f5 |
x86/mm: Make MAX_PHYSADDR_BITS and MAX_PHYSMEM_BITS dynamic
For boot-time switching between paging modes, we need to be able to adjust size of physical address space at runtime. As part of making physical address space size variable, we have to make X86_5LEVEL dependent on SPARSEMEM_VMEMMAP. !SPARSEMEM_VMEMMAP configuration doesn't build with variable MAX_PHYSMEM_BITS. For !SPARSEMEM_VMEMMAP SECTIONS_WIDTH depends on MAX_PHYSMEM_BITS: SECTIONS_WIDTH SECTIONS_SHIFT MAX_PHYSMEM_BITS And SECTIONS_WIDTH is used on pre-processor stage, it doesn't work if it's dyncamic. See include/linux/page-flags-layout.h. Effect on kernel image size: text data bss dec hex filename 8628393 4734340 1368064 14730797 e0c62d vmlinux.before 8628892 4734340 1368064 14731296 e0c820 vmlinux.after Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20180214111656.88514-8-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Kirill A. Shutemov | eedb92abb9 |
x86/mm: Make virtual memory layout dynamic for CONFIG_X86_5LEVEL=y
We need to be able to adjust virtual memory layout at runtime to be able to switch between 4- and 5-level paging at boot-time. KASLR already has movable __VMALLOC_BASE, __VMEMMAP_BASE and __PAGE_OFFSET. Let's re-use it. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20180214111656.88514-4-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Ingo Molnar | aec6487e99 |
x86/Kconfig: Further simplify the NR_CPUS config
Clean up various aspects of the x86 CONFIG_NR_CPUS configuration switches: - Rename the three CONFIG_NR_CPUS related variables to create a common namespace for them: RANGE_BEGIN_CPUS => NR_CPUS_RANGE_BEGIN RANGE_END_CPUS => NR_CPUS_RANGE_END DEF_CONFIG_CPUS => NR_CPUS_DEFAULT - Align them vertically, such as: config NR_CPUS_RANGE_END int depends on X86_64 default 8192 if SMP && ( MAXSMP || CPUMASK_OFFSTACK) default 512 if SMP && (!MAXSMP && !CPUMASK_OFFSTACK) default 1 if !SMP - Update help text, add more comments. Test results: # i386 allnoconfig: CONFIG_NR_CPUS_RANGE_BEGIN=1 CONFIG_NR_CPUS_RANGE_END=1 CONFIG_NR_CPUS_DEFAULT=1 CONFIG_NR_CPUS=1 # i386 defconfig: CONFIG_NR_CPUS_RANGE_BEGIN=2 CONFIG_NR_CPUS_RANGE_END=8 CONFIG_NR_CPUS_DEFAULT=8 CONFIG_NR_CPUS=8 # i386 allyesconfig: CONFIG_NR_CPUS_RANGE_BEGIN=2 CONFIG_NR_CPUS_RANGE_END=64 CONFIG_NR_CPUS_DEFAULT=32 CONFIG_NR_CPUS=32 # x86_64 allnoconfig: CONFIG_NR_CPUS_RANGE_BEGIN=1 CONFIG_NR_CPUS_RANGE_END=1 CONFIG_NR_CPUS_DEFAULT=1 CONFIG_NR_CPUS=1 # x86_64 defconfig: CONFIG_NR_CPUS_RANGE_BEGIN=2 CONFIG_NR_CPUS_RANGE_END=512 CONFIG_NR_CPUS_DEFAULT=64 CONFIG_NR_CPUS=64 # x86_64 allyesconfig: CONFIG_NR_CPUS_RANGE_BEGIN=8192 CONFIG_NR_CPUS_RANGE_END=8192 CONFIG_NR_CPUS_DEFAULT=8192 CONFIG_NR_CPUS=8192 Acked-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180210113629.jcv6su3r4suuno63@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Randy Dunlap | a0d0bb4deb |
x86/Kconfig: Simplify NR_CPUS config
Clean up and simplify the X86 NR_CPUS Kconfig symbol/option by introducing RANGE_BEGIN_CPUS, RANGE_END_CPUS, and DEF_CONFIG_CPUS. Then combine some default values when their conditionals can be reduced. Also move the X86_BIGSMP kconfig option inside an "if X86_32"/"endif" config block and drop its explicit "depends on X86_32". Combine the max. 8192 cases of RANGE_END_CPUS (X86_64 only). Split RANGE_END_CPUS and DEF_CONFIG_CPUS into separate cases for X86_32 and X86_64. Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/0b833246-ed4b-e451-c426-c4464725be92@infradead.org Link: lkml.kernel.org/r/CA+55aFzOd3j6ZUSkEwTdk85qtt1JywOtm3ZAb-qAvt8_hJ6D4A@mail.gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Linus Torvalds | a2e5790d84 |
Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton: - kasan updates - procfs - lib/bitmap updates - other lib/ updates - checkpatch tweaks - rapidio - ubsan - pipe fixes and cleanups - lots of other misc bits * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (114 commits) Documentation/sysctl/user.txt: fix typo MAINTAINERS: update ARM/QUALCOMM SUPPORT patterns MAINTAINERS: update various PALM patterns MAINTAINERS: update "ARM/OXNAS platform support" patterns MAINTAINERS: update Cortina/Gemini patterns MAINTAINERS: remove ARM/CLKDEV SUPPORT file pattern MAINTAINERS: remove ANDROID ION pattern mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors mm: docs: fix parameter names mismatch mm: docs: fixup punctuation pipe: read buffer limits atomically pipe: simplify round_pipe_size() pipe: reject F_SETPIPE_SZ with size over UINT_MAX pipe: fix off-by-one error when checking buffer limits pipe: actually allow root to exceed the pipe buffer limits pipe, sysctl: remove pipe_proc_fn() pipe, sysctl: drop 'min' parameter from pipe-max-size converter kasan: rework Kconfig settings crash_dump: is_kdump_kernel can be boolean kernel/mutex: mutex_is_locked can be boolean ... |
|
Kees Cook | 2bc2f688fd |
Makefile: move stack-protector availability out of Kconfig
Various portions of the kernel, especially per-architecture pieces, need to know if the compiler is building with the stack protector. This was done in the arch/Kconfig with 'select', but this doesn't allow a way to do auto-detected compiler support. In preparation for creating an on-if-available default, move the logic for the definition of CONFIG_CC_STACKPROTECTOR into the Makefile. Link: http://lkml.kernel.org/r/1510076320-69931-3-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Ingo Molnar | 8284507916 |
Merge branch 'linus' into sched/urgent, to resolve conflicts
Conflicts: arch/arm64/kernel/entry.S arch/x86/Kconfig include/linux/sched/mm.h kernel/fork.c Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Mathieu Desnoyers | 10bcc80e9d |
membarrier/x86: Provide core serializing command
There are two places where core serialization is needed by membarrier: 1) When returning from the membarrier IPI, 2) After scheduler updates curr to a thread with a different mm, before going back to user-space, since the curr->mm is used by membarrier to check whether it needs to send an IPI to that CPU. x86-32 uses IRET as return from interrupt, and both IRET and SYSEXIT to go back to user-space. The IRET instruction is core serializing, but not SYSEXIT. x86-64 uses IRET as return from interrupt, which takes care of the IPI. However, it can return to user-space through either SYSRETL (compat code), SYSRETQ, or IRET. Given that SYSRET{L,Q} is not core serializing, we rely instead on write_cr3() performed by switch_mm() to provide core serialization after changing the current mm, and deal with the special case of kthread -> uthread (temporarily keeping current mm into active_mm) by adding a sync_core() in that specific case. Use the new sync_core_before_usermode() to guarantee this. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-10-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Mathieu Desnoyers | ac1ab12a3e |
lockin/x86: Implement sync_core_before_usermode()
Ensure that a core serializing instruction is issued before returning to user-mode. x86 implements return to user-space through sysexit, sysrel, and sysretq, which are not core serializing. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-8-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Linus Torvalds | 617aebe6a9 |
Currently, hardened usercopy performs dynamic bounds checking on slab
cache objects. This is good, but still leaves a lot of kernel memory available to be copied to/from userspace in the face of bugs. To further restrict what memory is available for copying, this creates a way to whitelist specific areas of a given slab cache object for copying to/from userspace, allowing much finer granularity of access control. Slab caches that are never exposed to userspace can declare no whitelist for their objects, thereby keeping them unavailable to userspace via dynamic copy operations. (Note, an implicit form of whitelisting is the use of constant sizes in usercopy operations and get_user()/put_user(); these bypass all hardened usercopy checks since these sizes cannot change at runtime.) This new check is WARN-by-default, so any mistakes can be found over the next several releases without breaking anyone's system. The series has roughly the following sections: - remove %p and improve reporting with offset - prepare infrastructure and whitelist kmalloc - update VFS subsystem with whitelists - update SCSI subsystem with whitelists - update network subsystem with whitelists - update process memory with whitelists - update per-architecture thread_struct with whitelists - update KVM with whitelists and fix ioctl bug - mark all other allocations as not whitelisted - update lkdtm for more sensible test overage -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Kees Cook <kees@outflux.net> iQIcBAABCgAGBQJabvleAAoJEIly9N/cbcAmO1kQAJnjVPutnLSbnUteZxtsv7W4 43Cggvokfxr6l08Yh3hUowNxZVKjhF9uwMVgRRg9Nl5WdYCN+vCQbHz+ZdzGJXKq cGqdKWgexMKX+aBdNDrK7BphUeD46sH7JWR+a/lDV/BgPxBCm9i5ZZCgXbPP89AZ NpLBji7gz49wMsnm/x135xtNlZ3dG0oKETzi7MiR+NtKtUGvoIszSKy5JdPZ4m8q 9fnXmHqmwM6uQFuzDJPt1o+D1fusTuYnjI7EgyrJRRhQ+BB3qEFZApXnKNDRS9Dm uB7jtcwefJCjlZVCf2+PWTOEifH2WFZXLPFlC8f44jK6iRW2Nc+wVRisJ3vSNBG1 gaRUe/FSge68eyfQj5OFiwM/2099MNkKdZ0fSOjEBeubQpiFChjgWgcOXa5Bhlrr C4CIhFV2qg/tOuHDAF+Q5S96oZkaTy5qcEEwhBSW15ySDUaRWFSrtboNt6ZVOhug d8JJvDCQWoNu1IQozcbv6xW/Rk7miy8c0INZ4q33YUvIZpH862+vgDWfTJ73Zy9H jR/8eG6t3kFHKS1vWdKZzOX1bEcnd02CGElFnFYUEewKoV7ZeeLsYX7zodyUAKyi Yp5CImsDbWWTsptBg6h9nt2TseXTxYCt2bbmpJcqzsqSCUwOQNQ4/YpuzLeG0ihc JgOmUnQNJWCTwUUw5AS1 =tzmJ -----END PGP SIGNATURE----- Merge tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardened usercopy whitelisting from Kees Cook: "Currently, hardened usercopy performs dynamic bounds checking on slab cache objects. This is good, but still leaves a lot of kernel memory available to be copied to/from userspace in the face of bugs. To further restrict what memory is available for copying, this creates a way to whitelist specific areas of a given slab cache object for copying to/from userspace, allowing much finer granularity of access control. Slab caches that are never exposed to userspace can declare no whitelist for their objects, thereby keeping them unavailable to userspace via dynamic copy operations. (Note, an implicit form of whitelisting is the use of constant sizes in usercopy operations and get_user()/put_user(); these bypass all hardened usercopy checks since these sizes cannot change at runtime.) This new check is WARN-by-default, so any mistakes can be found over the next several releases without breaking anyone's system. The series has roughly the following sections: - remove %p and improve reporting with offset - prepare infrastructure and whitelist kmalloc - update VFS subsystem with whitelists - update SCSI subsystem with whitelists - update network subsystem with whitelists - update process memory with whitelists - update per-architecture thread_struct with whitelists - update KVM with whitelists and fix ioctl bug - mark all other allocations as not whitelisted - update lkdtm for more sensible test overage" * tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (38 commits) lkdtm: Update usercopy tests for whitelisting usercopy: Restrict non-usercopy caches to size 0 kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl kvm: whitelist struct kvm_vcpu_arch arm: Implement thread_struct whitelist for hardened usercopy arm64: Implement thread_struct whitelist for hardened usercopy x86: Implement thread_struct whitelist for hardened usercopy fork: Provide usercopy whitelisting for task_struct fork: Define usercopy region in thread_stack slab caches fork: Define usercopy region in mm_struct slab caches net: Restrict unwhitelisted proto caches to size 0 sctp: Copy struct sctp_sock.autoclose to userspace using put_user() sctp: Define usercopy region in SCTP proto slab cache caif: Define usercopy region in caif proto slab cache ip: Define usercopy region in IP proto slab cache net: Define usercopy region in struct proto slab cache scsi: Define usercopy region in scsi_sense_cache slab cache cifs: Define usercopy region in cifs_request slab cache vxfs: Define usercopy region in vxfs_inode slab cache ufs: Define usercopy region in ufs_inode_cache slab cache ... |
|
Linus Torvalds | 47fcc0360c |
Driver Core updates for 4.16-rc1
Here is the set of "big" driver core patches for 4.16-rc1. The majority of the work here is in the firmware subsystem, with reworks to try to attempt to make the code easier to handle in the long run, but no functional change. There's also some tree-wide sysfs attribute fixups with lots of acks from the various subsystem maintainers, as well as a handful of other normal fixes and changes. And finally, some license cleanups for the driver core and sysfs code. All have been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWnLvPw8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ynNzACgkzjPoBytJWbpWFt6SR6L33/u4kEAnRFvVCGL s6ygQPQhZIjKk2Lxa2hC =Zihy -----END PGP SIGNATURE----- Merge tag 'driver-core-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the set of "big" driver core patches for 4.16-rc1. The majority of the work here is in the firmware subsystem, with reworks to try to attempt to make the code easier to handle in the long run, but no functional change. There's also some tree-wide sysfs attribute fixups with lots of acks from the various subsystem maintainers, as well as a handful of other normal fixes and changes. And finally, some license cleanups for the driver core and sysfs code. All have been in linux-next for a while with no reported issues" * tag 'driver-core-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (48 commits) device property: Define type of PROPERTY_ENRTY_*() macros device property: Reuse property_entry_free_data() device property: Move property_entry_free_data() upper firmware: Fix up docs referring to FIRMWARE_IN_KERNEL firmware: Drop FIRMWARE_IN_KERNEL Kconfig option USB: serial: keyspan: Drop firmware Kconfig options sysfs: remove DEBUG defines sysfs: use SPDX identifiers drivers: base: add coredump driver ops sysfs: add attribute specification for /sysfs/devices/.../coredump test_firmware: fix missing unlock on error in config_num_requests_store() test_firmware: make local symbol test_fw_config static sysfs: turn WARN() into pr_warn() firmware: Fix a typo in fallback-mechanisms.rst treewide: Use DEVICE_ATTR_WO treewide: Use DEVICE_ATTR_RO treewide: Use DEVICE_ATTR_RW sysfs.h: Use octal permissions component: add debugfs support bus: simple-pm-bus: convert bool SIMPLE_PM_BUS to tristate ... |
|
Linus Torvalds | 73da9e1a9f |
Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton: - misc fixes - ocfs2 updates - most of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits) mm: remove PG_highmem description tools, vm: new option to specify kpageflags file mm/swap.c: make functions and their kernel-doc agree mm, memory_hotplug: fix memmap initialization mm: correct comments regarding do_fault_around() mm: numa: do not trap faults on shared data section pages. hugetlb, mbind: fall back to default policy if vma is NULL hugetlb, mempolicy: fix the mbind hugetlb migration mm, hugetlb: further simplify hugetlb allocation API mm, hugetlb: get rid of surplus page accounting tricks mm, hugetlb: do not rely on overcommit limit during migration mm, hugetlb: integrate giga hugetlb more naturally to the allocation path mm, hugetlb: unify core page allocation accounting and initialization mm/memcontrol.c: try harder to decrease [memory,memsw].limit_in_bytes mm/memcontrol.c: make local symbol static mm/hmm: fix uninitialized use of 'entry' in hmm_vma_walk_pmd() include/linux/mmzone.h: fix explanation of lower bits in the SPARSEMEM mem_map pointer mm/compaction.c: fix comment for try_to_compact_pages() mm/page_ext.c: make page_ext_init a noop when CONFIG_PAGE_EXTENSION but nothing uses it zsmalloc: use U suffix for negative literals being shifted ... |
|
Pavel Tatashin | 2e3ca40f03 |
mm: relax deferred struct page requirements
There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT, as all the page initialization code is in common code. Also, there is no need to depend on MEMORY_HOTPLUG, as initialization code does not really use hotplug memory functionality. So, we can remove this requirement as well. This patch allows to use deferred struct page initialization on all platforms with memblock allocator. Tested on x86, arm64, and sparc. Also, verified that code compiles on PPC with CONFIG_MEMORY_HOTPLUG disabled. Link: http://lkml.kernel.org/r/20171117014601.31606-1-pasha.tatashin@oracle.com Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390] Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Reza Arbab <arbab@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Linus Torvalds | b2fe5fa686 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller: 1) Significantly shrink the core networking routing structures. Result of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf 2) Add netdevsim driver for testing various offloads, from Jakub Kicinski. 3) Support cross-chip FDB operations in DSA, from Vivien Didelot. 4) Add a 2nd listener hash table for TCP, similar to what was done for UDP. From Martin KaFai Lau. 5) Add eBPF based queue selection to tun, from Jason Wang. 6) Lockless qdisc support, from John Fastabend. 7) SCTP stream interleave support, from Xin Long. 8) Smoother TCP receive autotuning, from Eric Dumazet. 9) Lots of erspan tunneling enhancements, from William Tu. 10) Add true function call support to BPF, from Alexei Starovoitov. 11) Add explicit support for GRO HW offloading, from Michael Chan. 12) Support extack generation in more netlink subsystems. From Alexander Aring, Quentin Monnet, and Jakub Kicinski. 13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From Russell King. 14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso. 15) Many improvements and simplifications to the NFP driver bpf JIT, from Jakub Kicinski. 16) Support for ipv6 non-equal cost multipath routing, from Ido Schimmel. 17) Add resource abstration to devlink, from Arkadi Sharshevsky. 18) Packet scheduler classifier shared filter block support, from Jiri Pirko. 19) Avoid locking in act_csum, from Davide Caratti. 20) devinet_ioctl() simplifications from Al viro. 21) More TCP bpf improvements from Lawrence Brakmo. 22) Add support for onlink ipv6 route flag, similar to ipv4, from David Ahern. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits) tls: Add support for encryption using async offload accelerator ip6mr: fix stale iterator net/sched: kconfig: Remove blank help texts openvswitch: meter: Use 64-bit arithmetic instead of 32-bit tcp_nv: fix potential integer overflow in tcpnv_acked r8169: fix RTL8168EP take too long to complete driver initialization. qmi_wwan: Add support for Quectel EP06 rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK ipmr: Fix ptrdiff_t print formatting ibmvnic: Wait for device response when changing MAC qlcnic: fix deadlock bug tcp: release sk_frag.page in tcp_disconnect ipv4: Get the address of interface correctly. net_sched: gen_estimator: fix lockdep splat net: macb: Handle HRESP error net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring ipv6: addrconf: break critical section in addrconf_verify_rtnl() ipv6: change route cache aging logic i40e/i40evf: Update DESC_NEEDED value to reflect larger value bnxt_en: cleanup DIM work on device shutdown ... |
|
Linus Torvalds | 2382dc9a3e |
dma mapping changes for Linux 4.16:
This pull requests contains a consolidation of the generic no-IOMMU code, a well as the glue code for swiotlb. All the code is based on the x86 implementation with hooks to allow all architectures that aren't cache coherent to use it. The x86 conversion itself has been deferred because the x86 maintainers were a little busy in the last months. -----BEGIN PGP SIGNATURE----- iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlpxcVoLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYN/Lw/+Je9teM4NPQ8lU/ncbJN/bUzCFGJ6dFt2eVX/6xs3 sfl8vBdeHt6CBM02rRNecEr31z3+orjQes5JnlEJFYeG3jumV0zCPw/zbxqjzbJ1 3n6cckLxbxzy8Ca1G/BVjHLAUX5eWp1ujn/Q4d03VKVQZhJvFYlqDbP3TrNVx7xn k86u37p/o+ngjwX66UdZ3C4iIBF8zqy6n2kkpv4HUQtHHzPwEvliN39eNilovb56 iGOzjDX1UWHAu4xCTVnPHSG4fA4XU41NWzIN3DIVPE25lYSISSl9TFAdR8GeZA0G 0Yj6sW53pRSoUwco1ocoS44/FgrPOB5/vHIL06pABvicXBiomje1QylqcK7zAczk esjkfPEZrmZuu99GtqFyDNKEvKKdy+aBGaTZ3y+NxsuBs+0xS2Owz1IE4Tk28xaw xh7zn+CVdk2fJh6ZIdw5Eu9b9VN08UriqDmDzO/ylDlcNGcDi7wcxiSTEkHJ1ON/ g9nletV6f3egL0wljDcOnhCJCHTvmWEeq3z8lE55QzPzSH0hHpnGQ2WD0tKrroxz kjOZp0TdXa4F5iysOHe2xl2sftOH0zIkBQJ+oBcK12mTaLu21+yeuCggQXJ/CBdk 1Ol7l9g9T0TDuZPfiTHt5+6jmECQs92LElWA8x7uF7Fpix3BpnafWaaSMSsosF3F D1Y= =Nrl9 -----END PGP SIGNATURE----- Merge tag 'dma-mapping-4.16' of git://git.infradead.org/users/hch/dma-mapping Pull dma mapping updates from Christoph Hellwig: "Except for a runtime warning fix from Christian this is all about consolidation of the generic no-IOMMU code, a well as the glue code for swiotlb. All the code is based on the x86 implementation with hooks to allow all architectures that aren't cache coherent to use it. The x86 conversion itself has been deferred because the x86 maintainers were a little busy in the last months" * tag 'dma-mapping-4.16' of git://git.infradead.org/users/hch/dma-mapping: (57 commits) MAINTAINERS: add the iommu list for swiotlb and xen-swiotlb arm64: use swiotlb_alloc and swiotlb_free arm64: replace ZONE_DMA with ZONE_DMA32 mips: use swiotlb_{alloc,free} mips/netlogic: remove swiotlb support tile: use generic swiotlb_ops tile: replace ZONE_DMA with ZONE_DMA32 unicore32: use generic swiotlb_ops ia64: remove an ifdef around the content of pci-dma.c ia64: clean up swiotlb support ia64: use generic swiotlb_ops ia64: replace ZONE_DMA with ZONE_DMA32 swiotlb: remove various exports swiotlb: refactor coherent buffer allocation swiotlb: refactor coherent buffer freeing swiotlb: wire up ->dma_supported in swiotlb_dma_ops swiotlb: add common swiotlb_map_ops swiotlb: rename swiotlb_free to swiotlb_exit x86: rename swiotlb_dma_ops powerpc: rename swiotlb_dma_ops ... |
|
Linus Torvalds | 669c0f762e |
Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Thomas Gleixner: "The platform support for x86 contains the following updates: - A set of updates for the UV platform to support new CPUs and to fix some of the UV4A BAU MRRs - The initial platform support for the jailhouse hypervisor to allow native Linux guests (inmates) in non-root cells. - A fix for the PCI initialization on Intel MID platforms" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/jailhouse: Respect pci=lastbus command line settings x86/jailhouse: Set X86_FEATURE_TSC_KNOWN_FREQ x86/platform/intel-mid: Move PCI initialization to arch_init() x86/platform/uv/BAU: Replace hard-coded values with MMR definitions x86/platform/UV: Fix UV4A BAU MMRs x86/platform/UV: Fix GAM MMR references in the UV x2apic code x86/platform/UV: Fix GAM MMR changes in UV4A x86/platform/UV: Add references to access fixed UV4A HUB MMRs x86/platform/UV: Fix UV4A support on new Intel Processors x86/platform/UV: Update uv_mmrs.h to prepare for UV4A fixes x86/jailhouse: Add PCI dependency x86/jailhouse: Hide x2apic code when CONFIG_X86_X2APIC=n x86/jailhouse: Initialize PCI support x86/jailhouse: Wire up IOAPIC for legacy UART ports x86/jailhouse: Halt instead of failing to restart x86/jailhouse: Silence ACPI warning x86/jailhouse: Avoid access of unsupported platform resources x86/jailhouse: Set up timekeeping x86/jailhouse: Enable PMTIMER x86/jailhouse: Enable APIC and SMP support ... |
|
Benjamin Gilbert | c508c46e6e |
firmware: Fix up docs referring to FIRMWARE_IN_KERNEL
We've removed the option, so stop talking about it. Signed-off-by: Benjamin Gilbert <benjamin.gilbert@coreos.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
|
David S. Miller | c02b3741eb |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Overlapping changes all over. The mini-qdisc bits were a little bit tricky, however. Signed-off-by: David S. Miller <davem@davemloft.net> |
|
Kees Cook | f7d83c1cf3 |
x86: Implement thread_struct whitelist for hardened usercopy
This whitelists the FPU register state portion of the thread_struct for copying to userspace, instead of the default entire struct. This is needed because FPU register state is dynamically sized, so it doesn't bypass the hardened usercopy checks. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Mathias Krause <minipli@googlemail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Rik van Riel <riel@redhat.com> |
|
Arnd Bergmann | abde587b61 |
x86/jailhouse: Add PCI dependency
Building jailhouse support without PCI results in a link error:
arch/x86/kernel/jailhouse.o: In function `jailhouse_init_platform':
jailhouse.c:(.init.text+0x235): undefined reference to `pci_probe'
arch/x86/kernel/jailhouse.o: In function `jailhouse_pci_arch_init':
jailhouse.c:(.init.text+0x265): undefined reference to `pci_direct_init'
jailhouse.c:(.init.text+0x26c): undefined reference to `pcibios_last_bus'
Add the missing Kconfig dependency.
Fixes:
|
|
Jan Kiszka | 87e65d05bb |
x86/jailhouse: Enable PMTIMER
Jailhouse exposes the PMTIMER as only reference clock to all cells. Pick up its address from the setup data. Allow to enable the Linux support of it by relaxing its strict dependency on ACPI. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: jailhouse-dev@googlegroups.com Link: https://lkml.kernel.org/r/6d5c3fadd801eb3fba9510e2d3db14a9c404a1a0.1511770314.git.jan.kiszka@siemens.com |
|
Jan Kiszka | 4a362601ba |
x86/jailhouse: Add infrastructure for running in non-root cell
The Jailhouse hypervisor is able to statically partition a multicore system into multiple so-called cells. Linux is used as boot loader and continues to run in the root cell after Jailhouse is enabled. Linux can also run in non-root cells. Jailhouse does not emulate usual x86 devices. It also provides no complex ACPI but basic platform information that the boot loader forwards via setup data. This adds the infrastructure to detect when running in a non-root cell so that the platform can be configured as required in succeeding steps. Support is limited to x86-64 so far, primarily because no boot loader stub exists for i386 and, thus, we wouldn't be able to test the 32-bit path. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: jailhouse-dev@googlegroups.com Link: https://lkml.kernel.org/r/7f823d077b38b1a70c526b40b403f85688c137d3.1511770314.git.jan.kiszka@siemens.com |
|
Linus Torvalds | 40548c6b6c |
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 pti updates from Thomas Gleixner: "This contains: - a PTI bugfix to avoid setting reserved CR3 bits when PCID is disabled. This seems to cause issues on a virtual machine at least and is incorrect according to the AMD manual. - a PTI bugfix which disables the perf BTS facility if PTI is enabled. The BTS AUX buffer is not globally visible and causes the CPU to fault when the mapping disappears on switching CR3 to user space. A full fix which restores BTS on PTI is non trivial and will be worked on. - PTI bugfixes for EFI and trusted boot which make sure that the user space visible page table entries have the NX bit cleared - removal of dead code in the PTI pagetable setup functions - add PTI documentation - add a selftest for vsyscall to verify that the kernel actually implements what it advertises. - a sysfs interface to expose vulnerability and mitigation information so there is a coherent way for users to retrieve the status. - the initial spectre_v2 mitigations, aka retpoline: + The necessary ASM thunk and compiler support + The ASM variants of retpoline and the conversion of affected ASM code + Make LFENCE serializing on AMD so it can be used as speculation trap + The RSB fill after vmexit - initial objtool support for retpoline As I said in the status mail this is the most of the set of patches which should go into 4.15 except two straight forward patches still on hold: - the retpoline add on of LFENCE which waits for ACKs - the RSB fill after context switch Both should be ready to go early next week and with that we'll have covered the major holes of spectre_v2 and go back to normality" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits) x86,perf: Disable intel_bts when PTI security/Kconfig: Correct the Documentation reference for PTI x86/pti: Fix !PCID and sanitize defines selftests/x86: Add test_vsyscall x86/retpoline: Fill return stack buffer on vmexit x86/retpoline/irq32: Convert assembler indirect jumps x86/retpoline/checksum32: Convert assembler indirect jumps x86/retpoline/xen: Convert Xen hypercall indirect jumps x86/retpoline/hyperv: Convert assembler indirect jumps x86/retpoline/ftrace: Convert ftrace assembler indirect jumps x86/retpoline/entry: Convert entry assembler indirect jumps x86/retpoline/crypto: Convert crypto assembler indirect jumps x86/spectre: Add boot time option to select Spectre v2 mitigation x86/retpoline: Add initial retpoline support objtool: Allow alternatives to be ignored objtool: Detect jumps to retpoline thunks x86/pti: Make unpoison of pgd for trusted boot work for real x86/alternatives: Fix optimize_nops() checking sysfs/cpu: Fix typos in vulnerability documentation x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC ... |
|
Masami Hiramatsu | 540adea380 |
error-injection: Separate error-injection from kprobe
Since error-injection framework is not limited to be used by kprobes, nor bpf. Other kernel subsystems can use it freely for checking safeness of error-injection, e.g. livepatch, ftrace etc. So this separate error-injection framework from kprobes. Some differences has been made: - "kprobe" word is removed from any APIs/structures. - BPF_ALLOW_ERROR_INJECTION() is renamed to ALLOW_ERROR_INJECTION() since it is not limited for BPF too. - CONFIG_FUNCTION_ERROR_INJECTION is the config item of this feature. It is automatically enabled if the arch supports error injection feature for kprobe or ftrace etc. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
|
David Woodhouse | 76b043848f |
x86/retpoline: Add initial retpoline support
Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide the corresponding thunks. Provide assembler macros for invoking the thunks in the same way that GCC does, from native and inline assembler. This adds X86_FEATURE_RETPOLINE and sets it by default on all CPUs. In some circumstances, IBRS microcode features may be used instead, and the retpoline can be disabled. On AMD CPUs if lfence is serialising, the retpoline can be dramatically simplified to a simple "lfence; jmp *\reg". A future patch, after it has been verified that lfence really is serialising in all circumstances, can enable this by setting the X86_FEATURE_RETPOLINE_AMD feature bit in addition to X86_FEATURE_RETPOLINE. Do not align the retpoline in the altinstr section, because there is no guarantee that it stays aligned when it's copied over the oldinstr during alternative patching. [ Andi Kleen: Rename the macros, add CONFIG_RETPOLINE option, export thunks] [ tglx: Put actual function CALL/JMP in front of the macros, convert to symbolic labels ] [ dwmw2: Convert back to numeric labels, merge objtool fixes ] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-4-git-send-email-dwmw@amazon.co.uk |
|
Christoph Hellwig | ea8c64ace8 |
dma-mapping: move swiotlb arch helpers to a new header
phys_to_dma, dma_to_phys and dma_capable are helpers published by architecture code for use of swiotlb and xen-swiotlb only. Drivers are not supposed to use these directly, but use the DMA API instead. Move these to a new asm/dma-direct.h helper, included by a linux/dma-direct.h wrapper that provides the default linear mapping unless the architecture wants to override it. In the MIPS case the existing dma-coherent.h is reused for now as untangling it will take a bit of work. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Robin Murphy <robin.murphy@arm.com> |
|
Eric Biggers | f328299e54 |
locking/refcounts: Remove stale comment from the ARCH_HAS_REFCOUNT Kconfig entry
ARCH_HAS_REFCOUNT is no longer marked as broken ('if BROKEN'), so remove the stale comment regarding it being broken. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171229195303.17781-1-ebiggers3@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Thomas Gleixner | 61dc0f555b |
x86/cpu: Implement CPU vulnerabilites sysfs functions
Implement the CPU vulnerabilty show functions for meltdown, spectre_v1 and spectre_v2. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180107214913.177414879@linutronix.de |
|
David S. Miller | 6bb8824732 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
net/ipv6/ip6_gre.c is a case of parallel adds. include/trace/events/tcp.h is a little bit more tricky. The removal of in-trace-macro ifdefs in 'net' paralleled with moving show_tcp_state_name and friends over to include/trace/events/sock.h in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net> |
|
Linus Torvalds | caf9a82657 |
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 PTI preparatory patches from Thomas Gleixner: "Todays Advent calendar window contains twentyfour easy to digest patches. The original plan was to have twenty three matching the date, but a late fixup made that moot. - Move the cpu_entry_area mapping out of the fixmap into a separate address space. That's necessary because the fixmap becomes too big with NRCPUS=8192 and this caused already subtle and hard to diagnose failures. The top most patch is fresh from today and cures a brain slip of that tall grumpy german greybeard, who ignored the intricacies of 32bit wraparounds. - Limit the number of CPUs on 32bit to 64. That's insane big already, but at least it's small enough to prevent address space issues with the cpu_entry_area map, which have been observed and debugged with the fixmap code - A few TLB flush fixes in various places plus documentation which of the TLB functions should be used for what. - Rename the SYSENTER stack to CPU_ENTRY_AREA stack as it is used for more than sysenter now and keeping the name makes backtraces confusing. - Prevent LDT inheritance on exec() by moving it to arch_dup_mmap(), which is only invoked on fork(). - Make vysycall more robust. - A few fixes and cleanups of the debug_pagetables code. Check PAGE_PRESENT instead of checking the PTE for 0 and a cleanup of the C89 initialization of the address hint array which already was out of sync with the index enums. - Move the ESPFIX init to a different place to prepare for PTI. - Several code moves with no functional change to make PTI integration simpler and header files less convoluted. - Documentation fixes and clarifications" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/cpu_entry_area: Prevent wraparound in setup_cpu_entry_area_ptes() on 32bit init: Invoke init_espfix_bsp() from mm_init() x86/cpu_entry_area: Move it out of the fixmap x86/cpu_entry_area: Move it to a separate unit x86/mm: Create asm/invpcid.h x86/mm: Put MMU to hardware ASID translation in one place x86/mm: Remove hard-coded ASID limit checks x86/mm: Move the CR3 construction functions to tlbflush.h x86/mm: Add comments to clarify which TLB-flush functions are supposed to flush what x86/mm: Remove superfluous barriers x86/mm: Use __flush_tlb_one() for kernel memory x86/microcode: Dont abuse the TLB-flush interface x86/uv: Use the right TLB-flush API x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack x86/doc: Remove obvious weirdnesses from the x86 MM layout documentation x86/mm/64: Improve the memory map documentation x86/ldt: Prevent LDT inheritance on exec x86/ldt: Rework locking arch, mm: Allow arch_dup_mmap() to fail x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode ... |
|
Thomas Gleixner | 7bbcbd3d1c |
x86/Kconfig: Limit NR_CPUS on 32-bit to a sane amount
The recent cpu_entry_area changes fail to compile on 32-bit when BIGSMP=y and NR_CPUS=512, because the fixmap area becomes too big. Limit the number of CPUs with BIGSMP to 64, which is already way to big for 32-bit, but it's at least a working limitation. We performed a quick survey of 32-bit-only machines that might be affected by this change negatively, but found none. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Andrey Ryabinin | 2aeb07365b |
x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow
[ Note, this is a Git cherry-pick of the following commit: d17a1d97dc20: ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow") ... for easier x86 PTI code testing and back-porting. ] The KASAN shadow is currently mapped using vmemmap_populate() since that provides a semi-convenient way to map pages into init_top_pgt. However, since that no longer zeroes the mapped pages, it is not suitable for KASAN, which requires zeroed shadow memory. Add kasan_populate_shadow() interface and use it instead of vmemmap_populate(). Besides, this allows us to take advantage of gigantic pages and use them to populate the shadow, which should save us some memory wasted on page tables and reduce TLB pressure. Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Bob Picco <bob.picco@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Alexander Potapenko <glider@google.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Josef Bacik | 9802d86585 |
bpf: add a bpf_override_function helper
Error injection is sloppy and very ad-hoc. BPF could fill this niche perfectly with it's kprobe functionality. We could make sure errors are only triggered in specific call chains that we care about with very specific situations. Accomplish this with the bpf_override_funciton helper. This will modify the probe'd callers return value to the specified value and set the PC to an override function that simply returns, bypassing the originally probed function. This gives us a nice clean way to implement systematic error injection for all of our code paths. Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
|
Linus Torvalds | 02fc87b117 |
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar: - topology enumeration fixes - KASAN fix - two entry fixes (not yet the big series related to KASLR) - remove obsolete code - instruction decoder fix - better /dev/mem sanity checks, hopefully working better this time - pkeys fixes - two ACPI fixes - 5-level paging related fixes - UMIP fixes that should make application visible faults more debuggable - boot fix for weird virtualization environment * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/decoder: Add new TEST instruction pattern x86/PCI: Remove unused HyperTransport interrupt support x86/umip: Fix insn_get_code_seg_params()'s return value x86/boot/KASLR: Remove unused variable x86/entry/64: Add missing irqflags tracing to native_load_gs_index() x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing x86/pkeys/selftests: Fix protection keys write() warning x86/pkeys/selftests: Rename 'si_pkey' to 'siginfo_pkey' x86/mpx/selftests: Fix up weird arrays x86/pkeys: Update documentation about availability x86/umip: Print a warning into the syslog if UMIP-protected instructions are used x86/smpboot: Fix __max_logical_packages estimate x86/topology: Avoid wasting 128k for package id array perf/x86/intel/uncore: Cache logical pkg id in uncore driver x86/acpi: Reduce code duplication in mp_override_legacy_irq() x86/acpi: Handle SCI interrupts above legacy space gracefully x86/boot: Fix boot failure when SMP MP-table is based at 0 x86/mm: Limit mmap() of /dev/mem to valid physical addresses x86/selftests: Add test for mapping placement for 5-level paging ... |
|
Andrey Ryabinin | f68d62a567 |
x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow
[ Note, this commit is a cherry-picked version of: d17a1d97dc20: ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow") ... for easier x86 entry code testing and back-porting. ] The KASAN shadow is currently mapped using vmemmap_populate() since that provides a semi-convenient way to map pages into init_top_pgt. However, since that no longer zeroes the mapped pages, it is not suitable for KASAN, which requires zeroed shadow memory. Add kasan_populate_shadow() interface and use it instead of vmemmap_populate(). Besides, this allows us to take advantage of gigantic pages and use them to populate the shadow, which should save us some memory wasted on page tables and reduce TLB pressure. Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Bob Picco <bob.picco@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Alexander Potapenko <glider@google.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Andrey Ryabinin | d17a1d97dc |
x86/mm/kasan: don't use vmemmap_populate() to initialize shadow
The kasan shadow is currently mapped using vmemmap_populate() since that provides a semi-convenient way to map pages into init_top_pgt. However, since that no longer zeroes the mapped pages, it is not suitable for kasan, which requires zeroed shadow memory. Add kasan_populate_shadow() interface and use it instead of vmemmap_populate(). Besides, this allows us to take advantage of gigantic pages and use them to populate the shadow, which should save us some memory wasted on page tables and reduce TLB pressure. Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Bob Picco <bob.picco@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Alexander Potapenko <glider@google.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Levin, Alexander (Sasha Levin) | 4675ff05de |
kmemcheck: rip it out
Fix up makefiles, remove references, and git rm kmemcheck. Link: http://lkml.kernel.org/r/20171007030159.22241-4-alexander.levin@verizon.com Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Vegard Nossum <vegardno@ifi.uio.no> Cc: Pekka Enberg <penberg@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexander Potapenko <glider@google.com> Cc: Tim Hansen <devtimhansen@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Ricardo Neri | 796ebc81b9 |
x86/umip: Select X86_INTEL_UMIP by default
UMIP does cause any performance penalty to the vast majority of x86 code that does not use the legacy instructions affected by UMIP. Also describe UMIP more accurately and explain the behavior that can be expected by the (few) applications that use the affected instructions. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: ricardo.neri@intel.com Link: http://lkml.kernel.org/r/1510640985-18412-2-git-send-email-ricardo.neri-calderon@linux.intel.com [ Spelling fixes, rewrote the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Linus Torvalds | b18d62891a |
Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 APIC updates from Thomas Gleixner: "This update provides a major overhaul of the APIC initialization and vector allocation code: - Unification of the APIC and interrupt mode setup which was scattered all over the place and was hard to follow. This also distangles the timer setup from the APIC initialization which brings a clear separation of functionality. Great detective work from Dou Lyiang! - Refactoring of the x86 vector allocation mechanism. The existing code was based on nested loops and rather convoluted APIC callbacks which had a horrible worst case behaviour and tried to serve all different use cases in one go. This led to quite odd hacks when supporting the new managed interupt facility for multiqueue devices and made it more or less impossible to deal with the vector space exhaustion which was a major roadblock for server hibernation. Aside of that the code dealing with cpu hotplug and the system vectors was disconnected from the actual vector management and allocation code, which made it hard to follow and maintain. Utilizing the new bitmap matrix allocator core mechanism, the new allocator and management code consolidates the handling of system vectors, legacy vectors, cpu hotplug mechanisms and the actual allocation which needs to be aware of system and legacy vectors and hotplug constraints into a single consistent entity. This has one visible change: The support for multi CPU targets of interrupts, which is only available on a certain subset of CPUs/APIC variants has been removed in favour of single interrupt targets. A proper analysis of the multi CPU target feature revealed that there is no real advantage as the vast majority of interrupts end up on the CPU with the lowest APIC id in the set of target CPUs anyway. That change was agreed on by the relevant folks and allowed to simplify the implementation significantly and to replace rather fragile constructs like the vector cleanup IPI with straight forward and solid code. Furthermore this allowed to cleanly separate the allocation details for legacy, normal and managed interrupts: * Legacy interrupts are not longer wasting 16 vectors unconditionally * Managed interrupts have now a guaranteed vector reservation, but the actual vector assignment happens when the interrupt is requested. It's guaranteed not to fail. * Normal interrupts no longer allocate vectors unconditionally when the interrupt is set up (IO/APIC init or MSI(X) enable). The mechanism has been switched to a best effort reservation mode. The actual allocation happens when the interrupt is requested. Contrary to managed interrupts the request can fail due to vector space exhaustion, but drivers must handle a fail of request_irq() anyway. When the interrupt is freed, the vector is handed back as well. This solves a long standing problem with large unconditional vector allocations for a certain class of enterprise devices which prevented server hibernation due to vector space exhaustion when the unused allocated vectors had to be migrated to CPU0 while unplugging all non boot CPUs. The code has been equipped with trace points and detailed debugfs information to aid analysis of the vector space" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) x86/vector/msi: Select CONFIG_GENERIC_IRQ_RESERVATION_MODE PCI/MSI: Set MSI_FLAG_MUST_REACTIVATE in core code genirq: Add config option for reservation mode x86/vector: Use correct per cpu variable in free_moved_vector() x86/apic/vector: Ignore set_affinity call for inactive interrupts x86/apic: Fix spelling mistake: "symmectic" -> "symmetric" x86/apic: Use dead_cpu instead of current CPU when cleaning up ACPI/init: Invoke early ACPI initialization earlier x86/vector: Respect affinity mask in irq descriptor x86/irq: Simplify hotplug vector accounting x86/vector: Switch IOAPIC to global reservation mode x86/vector/msi: Switch to global reservation mode x86/vector: Handle managed interrupts proper x86/io_apic: Reevaluate vector configuration on activate() iommu/amd: Reevaluate vector configuration on activate() iommu/vt-d: Reevaluate vector configuration on activate() x86/apic/msi: Force reactivation of interrupts at startup time x86/vector: Untangle internal state from irq_cfg x86/vector: Compile SMP only code conditionally x86/apic: Remove unused callbacks ... |
|
Linus Torvalds | d6ec9d9a4d |
Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Ingo Molnar: "Note that in this cycle most of the x86 topics interacted at a level that caused them to be merged into tip:x86/asm - but this should be a temporary phenomenon, hopefully we'll back to the usual patterns in the next merge window. The main changes in this cycle were: Hardware enablement: - Add support for the Intel UMIP (User Mode Instruction Prevention) CPU feature. This is a security feature that disables certain instructions such as SGDT, SLDT, SIDT, SMSW and STR. (Ricardo Neri) [ Note that this is disabled by default for now, there are some smaller enhancements in the pipeline that I'll follow up with in the next 1-2 days, which allows this to be enabled by default.] - Add support for the AMD SEV (Secure Encrypted Virtualization) CPU feature, on top of SME (Secure Memory Encryption) support that was added in v4.14. (Tom Lendacky, Brijesh Singh) - Enable new SSE/AVX/AVX512 CPU features: AVX512_VBMI2, GFNI, VAES, VPCLMULQDQ, AVX512_VNNI, AVX512_BITALG. (Gayatri Kammela) Other changes: - A big series of entry code simplifications and enhancements (Andy Lutomirski) - Make the ORC unwinder default on x86 and various objtool enhancements. (Josh Poimboeuf) - 5-level paging enhancements (Kirill A. Shutemov) - Micro-optimize the entry code a bit (Borislav Petkov) - Improve the handling of interdependent CPU features in the early FPU init code (Andi Kleen) - Build system enhancements (Changbin Du, Masahiro Yamada) - ... plus misc enhancements, fixes and cleanups" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (118 commits) x86/build: Make the boot image generation less verbose selftests/x86: Add tests for the STR and SLDT instructions selftests/x86: Add tests for User-Mode Instruction Prevention x86/traps: Fix up general protection faults caused by UMIP x86/umip: Enable User-Mode Instruction Prevention at runtime x86/umip: Force a page fault when unable to copy emulated result to user x86/umip: Add emulation code for UMIP instructions x86/cpufeature: Add User-Mode Instruction Prevention definitions x86/insn-eval: Add support to resolve 16-bit address encodings x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode x86/insn-eval: Add wrapper function for 32 and 64-bit addresses x86/insn-eval: Add support to resolve 32-bit address encodings x86/insn-eval: Compute linear address in several utility functions resource: Fix resource_size.cocci warnings X86/KVM: Clear encryption attribute when SEV is active X86/KVM: Decrypt shared per-cpu variables when SEV is active percpu: Introduce DEFINE_PER_CPU_DECRYPTED x86: Add support for changing memory encryption attribute in early boot x86/io: Unroll string I/O when SEV is active x86/boot: Add early boot support when running with SEV active ... |
|
Ricardo Neri | aa35f89697 |
x86/umip: Enable User-Mode Instruction Prevention at runtime
User-Mode Instruction Prevention (UMIP) is enabled by setting/clearing a bit in %cr4. It makes sense to enable UMIP at some point while booting, before user spaces come up. Like SMAP and SMEP, is not critical to have it enabled very early during boot. This is because UMIP is relevant only when there is a user space to be protected from. Given these similarities, UMIP can be enabled along with SMAP and SMEP. At the moment, UMIP is disabled by default at build time. It can be enabled at build time by selecting CONFIG_X86_INTEL_UMIP. If enabled at build time, it can be disabled at run time by adding clearcpuid=514 to the kernel parameters. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: ricardo.neri@intel.com Link: http://lkml.kernel.org/r/1509935277-22138-10-git-send-email-ricardo.neri-calderon@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Ingo Molnar | b3d9a13681 |
Merge branch 'linus' into x86/asm, to pick up fixes and resolve conflicts
Conflicts: arch/x86/kernel/cpu/Makefile Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Ingo Molnar | 141d3b1daa |
Merge branch 'linus' into x86/apic, to resolve conflicts
Conflicts: arch/x86/include/asm/x2apic.h Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Ingo Molnar | 8c5db92a70 |
Merge branch 'linus' into locking/core, to resolve conflicts
Conflicts: include/linux/compiler-clang.h include/linux/compiler-gcc.h include/linux/compiler-intel.h include/uapi/linux/stddef.h Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Ingo Molnar | 75ec4eb3dc |
Merge branch 'x86/mm' into x86/asm, to pick up pending changes
Concentrate x86 MM and asm related changes into a single super-topic, in preparation for larger changes. Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Greg Kroah-Hartman | b24413180f |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
|
Andrey Ryabinin | 12a8cc7fcf |
x86/kasan: Use the same shadow offset for 4- and 5-level paging
We are going to support boot-time switching between 4- and 5-level paging. For KASAN it means we cannot have different KASAN_SHADOW_OFFSET for different paging modes: the constant is passed to gcc to generate code and cannot be changed at runtime. This patch changes KASAN code to use 0xdffffc0000000000 as shadow offset for both 4- and 5-level paging. For 5-level paging it means that shadow memory region is not aligned to PGD boundary anymore and we have to handle unaligned parts of the region properly. In addition, we have to exclude paravirt code from KASAN instrumentation as we now use set_pgd() before KASAN is fully ready. [kirill.shutemov@linux.intel.com: clenaup, changelog message] Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@suse.de> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170929140821.37654-4-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Thomas Gleixner | c201c91799 |
x86/vector/msi: Select CONFIG_GENERIC_IRQ_RESERVATION_MODE
Select CONFIG_GENERIC_IRQ_RESERVATION_MODE so PCI/MSI domains get the
MSI_FLAG_MUST_REACTIVATE flag set in pci_msi_create_irq_domain().
Remove the explicit setters of this flag in the apic/msi code as they are
not longer required.
Fixes:
|
|
Josh Poimboeuf | 11af847446 |
x86/unwind: Rename unwinder config options to 'CONFIG_UNWINDER_*'
Rename the unwinder config options from: CONFIG_ORC_UNWINDER CONFIG_FRAME_POINTER_UNWINDER CONFIG_GUESS_UNWINDER to: CONFIG_UNWINDER_ORC CONFIG_UNWINDER_FRAME_POINTER CONFIG_UNWINDER_GUESS ... in order to give them a more logical config namespace. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/73972fc7e2762e91912c6b9584582703d6f1b8cc.1507924831.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Kees Cook | 39208aa7ec |
locking/refcounts, x86/asm: Enable CONFIG_ARCH_HAS_REFCOUNT
With the section inlining bug fixed for the x86 refcount protection, we can turn the config back on. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Elena <elena.reshetova@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/1504382986-49301-3-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Thomas Gleixner | 0fa115da40 |
x86/irq/vector: Initialize matrix allocator
Initialize the matrix allocator and add the proper accounting points to the code. No functional change, just preparation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Yu Chen <yu.c.chen@intel.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Rui Zhang <rui.zhang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Len Brown <lenb@kernel.org> Link: https://lkml.kernel.org/r/20170913213155.108410660@linutronix.de |
|
Linus Torvalds | 89fd915c40 |
libnvdimm for 4.14
* Media error handling support in the Block Translation Table (BTT) driver is reworked to address sleeping-while-atomic locking and memory-allocation-context conflicts. * The dax_device lookup overhead for xfs and ext4 is moved out of the iomap hot-path to a mount-time lookup. * A new 'ecc_unit_size' sysfs attribute is added to advertise the read-modify-write boundary property of a persistent memory range. * Preparatory fix-ups for arm and powerpc pmem support are included along with other miscellaneous fixes. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJZtsAGAAoJEB7SkWpmfYgCrzMP/2vPvZvrFjZn5pAoZjlmTmHM ySceoOC7vwvVXIsSs52FhSjcxEoXo9cklXPwhXOPVtVUFdSDJBUOIUxwIziE6Y+5 sFJ2xT9K+5zKBUiXJwqFQDg52dn//eBNnnnDz+HQrBSzGrbWQhIZY2m19omPzv1I BeN0OCGOdW3cjSo3BCFl1d+KrSl704e7paeKq/TO3GIiAilIXleTVxcefEEodV2K ZvWHpFIhHeyN8dsF8teI952KcCT92CT/IaabxQIwCxX0/8/GFeDc5aqf77qiYWKi uxCeQXdgnaE8EZNWZWGWIWul6eYEkoCNbLeUQ7eJnECq61VxVajJS0NyGa5T9OiM P046Bo2b1b3R0IHxVIyVG0ZCm3YUMAHSn/3uRxPgESJ4bS/VQ3YP5M6MLxDOlc90 IisLilagitkK6h8/fVuVrwciRNQ71XEC34t6k7GCl/1ZnLlLT+i4/jc5NRZnGEZh aXAAGHdteQ+/mSz6p2UISFUekbd6LerwzKRw8ibDvH6pTud8orYR7g2+JoGhgb6Y pyFVE8DhIcqNKAMxBsjiRZ46OQ7qrT+AemdAG3aVv6FaNoe4o5jPLdw2cEtLqtpk +DNm0/lSWxxxozjrvu6EUZj6hk8R5E19XpRzV5QJkcKUXMu7oSrFLdMcC4FeIjl9 K4hXLV3fVBVRMiS0RA6z =5iGY -----END PGP SIGNATURE----- Merge tag 'libnvdimm-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm from Dan Williams: "A rework of media error handling in the BTT driver and other updates. It has appeared in a few -next releases and collected some late- breaking build-error and warning fixups as a result. Summary: - Media error handling support in the Block Translation Table (BTT) driver is reworked to address sleeping-while-atomic locking and memory-allocation-context conflicts. - The dax_device lookup overhead for xfs and ext4 is moved out of the iomap hot-path to a mount-time lookup. - A new 'ecc_unit_size' sysfs attribute is added to advertise the read-modify-write boundary property of a persistent memory range. - Preparatory fix-ups for arm and powerpc pmem support are included along with other miscellaneous fixes" * tag 'libnvdimm-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (26 commits) libnvdimm, btt: fix format string warnings libnvdimm, btt: clean up warning and error messages ext4: fix null pointer dereference on sbi libnvdimm, nfit: move the check on nd_reserved2 to the endpoint dax: fix FS_DAX=n BLOCK=y compilation libnvdimm: fix integer overflow static analysis warning libnvdimm, nd_blk: remove mmio_flush_range() libnvdimm, btt: rework error clearing libnvdimm: fix potential deadlock while clearing errors libnvdimm, btt: cache sector_size in arena_info libnvdimm, btt: ensure that flags were also unchanged during a map_read libnvdimm, btt: refactor map entry operations with macros libnvdimm, btt: fix a missed NVDIMM_IO_ATOMIC case in the write path libnvdimm, nfit: export an 'ecc_unit_size' sysfs attribute ext4: perform dax_device lookup at mount ext2: perform dax_device lookup at mount xfs: perform dax_device lookup at mount dax: introduce a fs_dax_get_by_bdev() helper libnvdimm, btt: check memory allocation failure libnvdimm, label: fix index block size calculation ... |
|
Michal Hocko | 3072e413e3 |
mm/memory_hotplug: introduce add_pages
There are new users of memory hotplug emerging. Some of them require different subset of arch_add_memory. There are some which only require allocation of struct pages without mapping those pages to the kernel address space. We currently have __add_pages for that purpose. But this is rather lowlevel and not very suitable for the code outside of the memory hotplug. E.g. x86_64 wants to update max_pfn which should be done by the caller. Introduce add_pages() which should care about those details if they are needed. Each architecture should define its implementation and select CONFIG_ARCH_HAS_ADD_PAGES. All others use the currently existing __add_pages. Link: http://lkml.kernel.org/r/20170817000548.32038-7-jglisse@redhat.com Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Evgeny Baskakov <ebaskakov@nvidia.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mark Hairgrove <mhairgrove@nvidia.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Sherry Cheung <SCheung@nvidia.com> Cc: Subhash Gutti <sgutti@nvidia.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Bob Liu <liubo95@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Naoya Horiguchi | 9c670ea379 |
mm: thp: introduce CONFIG_ARCH_ENABLE_THP_MIGRATION
Introduce CONFIG_ARCH_ENABLE_THP_MIGRATION to limit thp migration functionality to x86_64, which should be safer at the first step. Link: http://lkml.kernel.org/r/20170717193955.20207-5-zi.yan@sent.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Rik van Riel | df3735c5b4 |
x86,mpx: make mpx depend on x86-64 to free up VMA flag
Patch series "mm,fork,security: introduce MADV_WIPEONFORK", v4. If a child process accesses memory that was MADV_WIPEONFORK, it will get zeroes. The address ranges are still valid, they are just empty. If a child process accesses memory that was MADV_DONTFORK, it will get a segmentation fault, since those address ranges are no longer valid in the child after fork. Since MADV_DONTFORK also seems to be used to allow very large programs to fork in systems with strict memory overcommit restrictions, changing the semantics of MADV_DONTFORK might break existing programs. The use case is libraries that store or cache information, and want to know that they need to regenerate it in the child process after fork. Examples of this would be: - systemd/pulseaudio API checks (fail after fork) (replacing a getpid check, which is too slow without a PID cache) - PKCS#11 API reinitialization check (mandated by specification) - glibc's upcoming PRNG (reseed after fork) - OpenSSL PRNG (reseed after fork) The security benefits of a forking server having a re-inialized PRNG in every child process are pretty obvious. However, due to libraries having all kinds of internal state, and programs getting compiled with many different versions of each library, it is unreasonable to expect calling programs to re-initialize everything manually after fork. A further complication is the proliferation of clone flags, programs bypassing glibc's functions to call clone directly, and programs calling unshare, causing the glibc pthread_atfork hook to not get called. It would be better to have the kernel take care of this automatically. The patchset also adds MADV_KEEPONFORK, to undo the effects of a prior MADV_WIPEONFORK. This is similar to the OpenBSD minherit syscall with MAP_INHERIT_ZERO: https://man.openbsd.org/minherit.2 This patch (of 2): MPX only seems to be available on 64 bit CPUs, starting with Skylake and Goldmont. Move VM_MPX into the 64 bit only portion of vma->vm_flags, in order to free up a VMA flag. Link: http://lkml.kernel.org/r/20170811212829.29186-2-riel@redhat.com Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Drewry <wad@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Matthew Wilcox <willy@infradead.org> Cc: Colm MacCártaigh <colm@allcosts.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Linus Torvalds | f57091767a |
Merge branch 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cache quality monitoring update from Thomas Gleixner: "This update provides a complete rewrite of the Cache Quality Monitoring (CQM) facility. The existing CQM support was duct taped into perf with a lot of issues and the attempts to fix those turned out to be incomplete and horrible. After lengthy discussions it was decided to integrate the CQM support into the Resource Director Technology (RDT) facility, which is the obvious choise as in hardware CQM is part of RDT. This allowed to add Memory Bandwidth Monitoring support on top. As a result the mechanisms for allocating cache/memory bandwidth and the corresponding monitoring mechanisms are integrated into a single management facility with a consistent user interface" * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) x86/intel_rdt: Turn off most RDT features on Skylake x86/intel_rdt: Add command line options for resource director technology x86/intel_rdt: Move special case code for Haswell to a quirk function x86/intel_rdt: Remove redundant ternary operator on return x86/intel_rdt/cqm: Improve limbo list processing x86/intel_rdt/mbm: Fix MBM overflow handler during CPU hotplug x86/intel_rdt: Modify the intel_pqr_state for better performance x86/intel_rdt/cqm: Clear the default RMID during hotcpu x86/intel_rdt: Show bitmask of shareable resource with other executing units x86/intel_rdt/mbm: Handle counter overflow x86/intel_rdt/mbm: Add mbm counter initialization x86/intel_rdt/mbm: Basic counting of MBM events (total and local) x86/intel_rdt/cqm: Add CPU hotplug support x86/intel_rdt/cqm: Add sched_in support x86/intel_rdt: Introduce rdt_enable_key for scheduling x86/intel_rdt/cqm: Add mount,umount support x86/intel_rdt/cqm: Add rmdir support x86/intel_rdt: Separate the ctrl bits from rmdir x86/intel_rdt/cqm: Add mon_data x86/intel_rdt: Prepare for RDT monitor data support ... |
|
Linus Torvalds | b1b6f83ac9 |
Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm changes from Ingo Molnar: "PCID support, 5-level paging support, Secure Memory Encryption support The main changes in this cycle are support for three new, complex hardware features of x86 CPUs: - Add 5-level paging support, which is a new hardware feature on upcoming Intel CPUs allowing up to 128 PB of virtual address space and 4 PB of physical RAM space - a 512-fold increase over the old limits. (Supercomputers of the future forecasting hurricanes on an ever warming planet can certainly make good use of more RAM.) Many of the necessary changes went upstream in previous cycles, v4.14 is the first kernel that can enable 5-level paging. This feature is activated via CONFIG_X86_5LEVEL=y - disabled by default. (By Kirill A. Shutemov) - Add 'encrypted memory' support, which is a new hardware feature on upcoming AMD CPUs ('Secure Memory Encryption', SME) allowing system RAM to be encrypted and decrypted (mostly) transparently by the CPU, with a little help from the kernel to transition to/from encrypted RAM. Such RAM should be more secure against various attacks like RAM access via the memory bus and should make the radio signature of memory bus traffic harder to intercept (and decrypt) as well. This feature is activated via CONFIG_AMD_MEM_ENCRYPT=y - disabled by default. (By Tom Lendacky) - Enable PCID optimized TLB flushing on newer Intel CPUs: PCID is a hardware feature that attaches an address space tag to TLB entries and thus allows to skip TLB flushing in many cases, even if we switch mm's. (By Andy Lutomirski) All three of these features were in the works for a long time, and it's coincidence of the three independent development paths that they are all enabled in v4.14 at once" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (65 commits) x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y) x86/mm: Use pr_cont() in dump_pagetable() x86/mm: Fix SME encryption stack ptr handling kvm/x86: Avoid clearing the C-bit in rsvd_bits() x86/CPU: Align CR3 defines x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages acpi, x86/mm: Remove encryption mask from ACPI page protection type x86/mm, kexec: Fix memory corruption with SME on successive kexecs x86/mm/pkeys: Fix typo in Documentation/x86/protection-keys.txt x86/mm/dump_pagetables: Speed up page tables dump for CONFIG_KASAN=y x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y x86/mm: Allow userspace have mappings above 47-bit x86/mm: Prepare to expose larger address space to userspace x86/mpx: Do not allow MPX if we have mappings above 47-bit x86/mm: Rename tasksize_32bit/64bit to task_size_32bit/64bit() x86/xen: Redefine XEN_ELFNOTE_INIT_P2M using PUD_SIZE * PTRS_PER_PUD x86/mm/dump_pagetables: Fix printout of p4d level x86/mm/dump_pagetables: Generalize address normalization x86/boot: Fix memremap() related build failure ... |
|
Linus Torvalds | 5f82e71a00 |
Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar: - Add 'cross-release' support to lockdep, which allows APIs like completions, where it's not the 'owner' who releases the lock, to be tracked. It's all activated automatically under CONFIG_PROVE_LOCKING=y. - Clean up (restructure) the x86 atomics op implementation to be more readable, in preparation of KASAN annotations. (Dmitry Vyukov) - Fix static keys (Paolo Bonzini) - Add killable versions of down_read() et al (Kirill Tkhai) - Rework and fix jump_label locking (Marc Zyngier, Paolo Bonzini) - Rework (and fix) tlb_flush_pending() barriers (Peter Zijlstra) - Remove smp_mb__before_spinlock() and convert its usages, introduce smp_mb__after_spinlock() (Peter Zijlstra) * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits) locking/lockdep/selftests: Fix mixed read-write ABBA tests sched/completion: Avoid unnecessary stack allocation for COMPLETION_INITIALIZER_ONSTACK() acpi/nfit: Fix COMPLETION_INITIALIZER_ONSTACK() abuse locking/pvqspinlock: Relax cmpxchg's to improve performance on some architectures smp: Avoid using two cache lines for struct call_single_data locking/lockdep: Untangle xhlock history save/restore from task independence locking/refcounts, x86/asm: Disable CONFIG_ARCH_HAS_REFCOUNT for the time being futex: Remove duplicated code and fix undefined behaviour Documentation/locking/atomic: Finish the document... locking/lockdep: Fix workqueue crossrelease annotation workqueue/lockdep: 'Fix' flush_work() annotation locking/lockdep/selftests: Add mixed read-write ABBA tests mm, locking/barriers: Clarify tlb_flush_pending() barriers locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS truly non-interactive locking/lockdep: Explicitly initialize wq_barrier::done::map locking/lockdep: Rename CONFIG_LOCKDEP_COMPLETE to CONFIG_LOCKDEP_COMPLETIONS locking/lockdep: Reword title of LOCKDEP_CROSSRELEASE config locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE part of CONFIG_PROVE_LOCKING locking/refcounts, x86/asm: Implement fast refcount overflow protection locking/lockdep: Fix the rollback and overwrite detection logic in crossrelease ... |
|
Linus Torvalds | b0c79f49c3 |
Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm updates from Ingo Molnar: - Introduce the ORC unwinder, which can be enabled via CONFIG_ORC_UNWINDER=y. The ORC unwinder is a lightweight, Linux kernel specific debuginfo implementation, which aims to be DWARF done right for unwinding. Objtool is used to generate the ORC unwinder tables during build, so the data format is flexible and kernel internal: there's no dependency on debuginfo created by an external toolchain. The ORC unwinder is almost two orders of magnitude faster than the (out of tree) DWARF unwinder - which is important for perf call graph profiling. It is also significantly simpler and is coded defensively: there has not been a single ORC related kernel crash so far, even with early versions. (knock on wood!) But the main advantage is that enabling the ORC unwinder allows CONFIG_FRAME_POINTERS to be turned off - which speeds up the kernel measurably: With frame pointers disabled, GCC does not have to add frame pointer instrumentation code to every function in the kernel. The kernel's .text size decreases by about 3.2%, resulting in better cache utilization and fewer instructions executed, resulting in a broad kernel-wide speedup. Average speedup of system calls should be roughly in the 1-3% range - measurements by Mel Gorman [1] have shown a speedup of 5-10% for some function execution intense workloads. The main cost of the unwinder is that the unwinder data has to be stored in RAM: the memory cost is 2-4MB of RAM, depending on kernel config - which is a modest cost on modern x86 systems. Given how young the ORC unwinder code is it's not enabled by default - but given the performance advantages the plan is to eventually make it the default unwinder on x86. See Documentation/x86/orc-unwinder.txt for more details. - Remove lguest support: its intended role was that of a temporary proof of concept for virtualization, plus its removal will enable the reduction (removal) of the paravirt API as well, so Rusty agreed to its removal. (Juergen Gross) - Clean up and fix FSGS related functionality (Andy Lutomirski) - Clean up IO access APIs (Andy Shevchenko) - Enhance the symbol namespace (Jiri Slaby) * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits) objtool: Handle GCC stack pointer adjustment bug x86/entry/64: Use ENTRY() instead of ALIGN+GLOBAL for stub32_clone() x86/fpu/math-emu: Add ENDPROC to functions x86/boot/64: Extract efi_pe_entry() from startup_64() x86/boot/32: Extract efi_pe_entry() from startup_32() x86/lguest: Remove lguest support x86/paravirt/xen: Remove xen_patch() objtool: Fix objtool fallthrough detection with function padding x86/xen/64: Fix the reported SS and CS in SYSCALL objtool: Track DRAP separately from callee-saved registers objtool: Fix validate_branch() return codes x86: Clarify/fix no-op barriers for text_poke_bp() x86/switch_to/64: Rewrite FS/GS switching yet again to fix AMD CPUs selftests/x86/fsgsbase: Test selectors 1, 2, and 3 x86/fsgsbase/64: Report FSBASE and GSBASE correctly in core dumps x86/fsgsbase/64: Fully initialize FS and GS state in start_thread_common x86/asm: Fix UNWIND_HINT_REGS macro for older binutils x86/asm/32: Fix regs_get_register() on segment registers x86/xen/64: Rearrange the SYSCALL entries x86/asm/32: Remove a bunch of '& 0xffff' from pt_regs segment reads ... |
|
Dan Williams | 8f98ae0c9b | Merge branch 'for-4.14/fs' into libnvdimm-for-next | |
Robin Murphy | 5deb67f77a |
libnvdimm, nd_blk: remove mmio_flush_range()
mmio_flush_range() suffers from a lack of clearly-defined semantics,
and is somewhat ambiguous to port to other architectures where the
scope of the writeback implied by "flush" and ordering might matter,
but MMIO would tend to imply non-cacheable anyway. Per the rationale
in
|
|
Vitaly Kuznetsov | 9e52fc2b50 |
x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)
There's a subtle bug in how some of the paravirt guest code handles page table freeing on x86: On x86 software page table walkers depend on the fact that remote TLB flush does an IPI: walk is performed lockless but with interrupts disabled and in case the page table is freed the freeing CPU will get blocked as remote TLB flush is required. On other architectures which don't require an IPI to do remote TLB flush we have an RCU-based mechanism (see include/asm-generic/tlb.h for more details). In virtualized environments we may want to override the ->flush_tlb_others callback in pv_mmu_ops and use a hypercall asking the hypervisor to do a remote TLB flush for us. This breaks the assumption about IPIs. Xen PV has been doing this for years and the upcoming remote TLB flush for Hyper-V will do it too. This is not safe, as software page table walkers may step on an already freed page. Fix the bug by enabling the RCU-based page table freeing mechanism, CONFIG_HAVE_RCU_TABLE_FREE=y. Testing with kernbench and mmap/munmap microbenchmarks, and neither showed any noticeable performance impact. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Juergen Gross <jgross@suse.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: KY Srinivasan <kys@microsoft.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20170828082251.5562-1-vkuznets@redhat.com [ Rewrote/fixed/clarified the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Ingo Molnar | 7b3d61cc73 |
locking/refcounts, x86/asm: Disable CONFIG_ARCH_HAS_REFCOUNT for the time being
Mike Galbraith bisected a boot crash back to the following commit:
|
|
Ingo Molnar | 413d63d71b |
Merge branch 'linus' into x86/mm to pick up fixes and to fix conflicts
Conflicts: arch/x86/kernel/head64.c arch/x86/mm/mmap.c Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Ingo Molnar | 10c9850cb2 |
Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Juergen Gross | ecda85e702 |
x86/lguest: Remove lguest support
Lguest seems to be rather unused these days. It has seen only patches ensuring it still builds the last two years and its official state is "Odd Fixes". Remove it in order to be able to clean up the paravirt code. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: boris.ostrovsky@oracle.com Cc: lguest@lists.ozlabs.org Cc: rusty@rustcorp.com.au Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20170816173157.8633-3-jgross@suse.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Linus Torvalds | e18a5ebc2d |
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull watchdog fix from Thomas Gleixner: "A fix for the hardlockup watchdog to prevent false positives with extreme Turbo-Modes which make the perf/NMI watchdog fire faster than the hrtimer which is used to verify. Slightly larger than the minimal fix, which just would increase the hrtimer frequency, but comes with extra overhead of more watchdog timer interrupts and thread wakeups for all users. With this change we restrict the overhead to the extreme Turbo-Mode systems" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kernel/watchdog: Prevent false positives with turbo modes |
|
Nicholas Piggin | 92e5aae457 |
kernel/watchdog: fix Kconfig constraints for perf hardlockup watchdog
Commit |
|
Thomas Gleixner | 7edaeb6841 |
kernel/watchdog: Prevent false positives with turbo modes
The hardlockup detector on x86 uses a performance counter based on unhalted
CPU cycles and a periodic hrtimer. The hrtimer period is about 2/5 of the
performance counter period, so the hrtimer should fire 2-3 times before the
performance counter NMI fires. The NMI code checks whether the hrtimer
fired since the last invocation. If not, it assumess a hard lockup.
The calculation of those periods is based on the nominal CPU
frequency. Turbo modes increase the CPU clock frequency and therefore
shorten the period of the perf/NMI watchdog. With extreme Turbo-modes (3x
nominal frequency) the perf/NMI period is shorter than the hrtimer period
which leads to false positives.
A simple fix would be to shorten the hrtimer period, but that comes with
the side effect of more frequent hrtimer and softlockup thread wakeups,
which is not desired.
Implement a low pass filter, which checks the perf/NMI period against
kernel time. If the perf/NMI fires before 4/5 of the watchdog period has
elapsed then the event is ignored and postponed to the next perf/NMI.
That solves the problem and avoids the overhead of shorter hrtimer periods
and more frequent softlockup thread wakeups.
Fixes:
|
|
Kees Cook | 7a46ec0e2f |
locking/refcounts, x86/asm: Implement fast refcount overflow protection
This implements refcount_t overflow protection on x86 without a noticeable performance impact, though without the fuller checking of REFCOUNT_FULL. This is done by duplicating the existing atomic_t refcount implementation but with normally a single instruction added to detect if the refcount has gone negative (e.g. wrapped past INT_MAX or below zero). When detected, the handler saturates the refcount_t to INT_MIN / 2. With this overflow protection, the erroneous reference release that would follow a wrap back to zero is blocked from happening, avoiding the class of refcount-overflow use-after-free vulnerabilities entirely. Only the overflow case of refcounting can be perfectly protected, since it can be detected and stopped before the reference is freed and left to be abused by an attacker. There isn't a way to block early decrements, and while REFCOUNT_FULL stops increment-from-zero cases (which would be the state _after_ an early decrement and stops potential double-free conditions), this fast implementation does not, since it would require the more expensive cmpxchg loops. Since the overflow case is much more common (e.g. missing a "put" during an error path), this protection provides real-world protection. For example, the two public refcount overflow use-after-free exploits published in 2016 would have been rendered unexploitable: http://perception-point.io/2016/01/14/analysis-and-exploitation-of-a-linux-kernel-vulnerability-cve-2016-0728/ http://cyseclabs.com/page?n=02012016 This implementation does, however, notice an unchecked decrement to zero (i.e. caller used refcount_dec() instead of refcount_dec_and_test() and it resulted in a zero). Decrements under zero are noticed (since they will have resulted in a negative value), though this only indicates that a use-after-free may have already happened. Such notifications are likely avoidable by an attacker that has already exploited a use-after-free vulnerability, but it's better to have them reported than allow such conditions to remain universally silent. On first overflow detection, the refcount value is reset to INT_MIN / 2 (which serves as a saturation value) and a report and stack trace are produced. When operations detect only negative value results (such as changing an already saturated value), saturation still happens but no notification is performed (since the value was already saturated). On the matter of races, since the entire range beyond INT_MAX but before 0 is negative, every operation at INT_MIN / 2 will trap, leaving no overflow-only race condition. As for performance, this implementation adds a single "js" instruction to the regular execution flow of a copy of the standard atomic_t refcount operations. (The non-"and_test" refcount_dec() function, which is uncommon in regular refcount design patterns, has an additional "jz" instruction to detect reaching exactly zero.) Since this is a forward jump, it is by default the non-predicted path, which will be reinforced by dynamic branch prediction. The result is this protection having virtually no measurable change in performance over standard atomic_t operations. The error path, located in .text.unlikely, saves the refcount location and then uses UD0 to fire a refcount exception handler, which resets the refcount, handles reporting, and returns to regular execution. This keeps the changes to .text size minimal, avoiding return jumps and open-coded calls to the error reporting routine. Example assembly comparison: refcount_inc() before: .text: ffffffff81546149: f0 ff 45 f4 lock incl -0xc(%rbp) refcount_inc() after: .text: ffffffff81546149: f0 ff 45 f4 lock incl -0xc(%rbp) ffffffff8154614d: 0f 88 80 d5 17 00 js ffffffff816c36d3 ... .text.unlikely: ffffffff816c36d3: 48 8d 4d f4 lea -0xc(%rbp),%rcx ffffffff816c36d7: 0f ff (bad) These are the cycle counts comparing a loop of refcount_inc() from 1 to INT_MAX and back down to 0 (via refcount_dec_and_test()), between unprotected refcount_t (atomic_t), fully protected REFCOUNT_FULL (refcount_t-full), and this overflow-protected refcount (refcount_t-fast): 2147483646 refcount_inc()s and 2147483647 refcount_dec_and_test()s: cycles protections atomic_t 82249267387 none refcount_t-fast 82211446892 overflow, untested dec-to-zero refcount_t-full 144814735193 overflow, untested dec-to-zero, inc-from-zero This code is a modified version of the x86 PAX_REFCOUNT atomic_t overflow defense from the last public patch of PaX/grsecurity, based on my understanding of the code. Changes or omissions from the original code are mine and don't reflect the original grsecurity/PaX code. Thanks to PaX Team for various suggestions for improvement for repurposing this code to be a refcount-only protection. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Hans Liljestrand <ishkamiel@gmail.com> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Jann Horn <jannh@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arozansk@redhat.com Cc: axboe@kernel.dk Cc: kernel-hardening@lists.openwall.com Cc: linux-arch <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/20170815161924.GA133115@beast Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Vikas Shivappa | f01d7d51f5 |
x86/intel_rdt: Introduce a common compile option for RDT
We currently have a CONFIG_RDT_A which is for RDT(Resource directory technology) allocation based resctrl filesystem interface. As a preparation to add support for RDT monitoring as well into the same resctrl filesystem, change the config option to be CONFIG_RDT which would include both RDT allocation and monitoring code. No functional change. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-4-git-send-email-vikas.shivappa@linux.intel.com |
|
Josh Poimboeuf | 81d3871900 |
x86/kconfig: Consolidate unwinders into multiple choice selection
There are three mutually exclusive unwinders. Make that more obvious by combining them into a multiple-choice selection: CONFIG_FRAME_POINTER_UNWINDER CONFIG_ORC_UNWINDER CONFIG_GUESS_UNWINDER (if CONFIG_EXPERT=y) Frame pointers are still the default (for now). The old CONFIG_FRAME_POINTER option is still used in some arch-independent places, so keep it around, but make it invisible to the user on x86 - it's now selected by CONFIG_FRAME_POINTER_UNWINDER=y. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/20170725135424.zukjmgpz3plf5pmt@treble Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Josh Poimboeuf | ee9f8fce99 |
x86/unwind: Add the ORC unwinder
Add the new ORC unwinder which is enabled by CONFIG_ORC_UNWINDER=y. It plugs into the existing x86 unwinder framework. It relies on objtool to generate the needed .orc_unwind and .orc_unwind_ip sections. For more details on why ORC is used instead of DWARF, see Documentation/x86/orc-unwinder.txt - but the short version is that it's a simplified, fundamentally more robust debugninfo data structure, which also allows up to two orders of magnitude faster lookups than the DWARF unwinder - which matters to profiling workloads like perf. Thanks to Andy Lutomirski for the performance improvement ideas: splitting the ORC unwind table into two parallel arrays and creating a fast lookup table to search a subset of the unwind table. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/0a6cbfb40f8da99b7a45a1a8302dc6aef16ec812.1500938583.git.jpoimboe@redhat.com [ Extended the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Kirill A. Shutemov | 77ef56e4f0 |
x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y
Most of things are in place and we can enable support for 5-level paging. The patch makes XEN_PV and XEN_PVH dependent on !X86_5LEVEL. Both are not ready to work with 5-level paging. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170716225954.74185-9-kirill.shutemov@linux.intel.com [ Minor readability edits. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Tom Lendacky | f88a68facd |
x86/mm: Extend early_memremap() support with additional attrs
Add early_memremap() support to be able to specify encrypted and decrypted mappings with and without write-protection. The use of write-protection is necessary when encrypting data "in place". The write-protect attribute is considered cacheable for loads, but not stores. This implies that the hardware will never give the core a dirty line with this memtype. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Dave Young <dyoung@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Toshimitsu Kani <toshi.kani@hpe.com> Cc: kasan-dev@googlegroups.com Cc: kvm@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-efi@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/479b5832c30fae3efa7932e48f81794e86397229.1500319216.git.thomas.lendacky@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Tom Lendacky | 7744ccdbc1 |
x86/mm: Add Secure Memory Encryption (SME) support
Add support for Secure Memory Encryption (SME). This initial support provides a Kconfig entry to build the SME support into the kernel and defines the memory encryption mask that will be used in subsequent patches to mark pages as encrypted. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Dave Young <dyoung@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Toshimitsu Kani <toshi.kani@hpe.com> Cc: kasan-dev@googlegroups.com Cc: kvm@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-efi@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/a6c34d16caaed3bc3e2d6f0987554275bd291554.1500319216.git.thomas.lendacky@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Daniel Micay | 6974f0c455 |
include/linux/string.h: add the option of fortified string.h functions
This adds support for compiling with a rough equivalent to the glibc _FORTIFY_SOURCE=1 feature, providing compile-time and runtime buffer overflow checks for string.h functions when the compiler determines the size of the source or destination buffer at compile-time. Unlike glibc, it covers buffer reads in addition to writes. GNU C __builtin_*_chk intrinsics are avoided because they would force a much more complex implementation. They aren't designed to detect read overflows and offer no real benefit when using an implementation based on inline checks. Inline checks don't add up to much code size and allow full use of the regular string intrinsics while avoiding the need for a bunch of _chk functions and per-arch assembly to avoid wrapper overhead. This detects various overflows at compile-time in various drivers and some non-x86 core kernel code. There will likely be issues caught in regular use at runtime too. Future improvements left out of initial implementation for simplicity, as it's all quite optional and can be done incrementally: * Some of the fortified string functions (strncpy, strcat), don't yet place a limit on reads from the source based on __builtin_object_size of the source buffer. * Extending coverage to more string functions like strlcat. * It should be possible to optionally use __builtin_object_size(x, 1) for some functions (C strings) to detect intra-object overflows (like glibc's _FORTIFY_SOURCE=2), but for now this takes the conservative approach to avoid likely compatibility issues. * The compile-time checks should be made available via a separate config option which can be enabled by default (or always enabled) once enough time has passed to get the issues it catches fixed. Kees said: "This is great to have. While it was out-of-tree code, it would have blocked at least CVE-2016-3858 from being exploitable (improper size argument to strlcpy()). I've sent a number of fixes for out-of-bounds-reads that this detected upstream already" [arnd@arndb.de: x86: fix fortified memcpy] Link: http://lkml.kernel.org/r/20170627150047.660360-1-arnd@arndb.de [keescook@chromium.org: avoid panic() in favor of BUG()] Link: http://lkml.kernel.org/r/20170626235122.GA25261@beast [keescook@chromium.org: move from -mm, add ARCH_HAS_FORTIFY_SOURCE, tweak Kconfig help] Link: http://lkml.kernel.org/r/20170526095404.20439-1-danielmicay@gmail.com Link: http://lkml.kernel.org/r/1497903987-21002-8-git-send-email-keescook@chromium.org Signed-off-by: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Daniel Axtens <dja@axtens.net> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Nicholas Piggin | 05a4a95279 |
kernel/watchdog: split up config options
Split SOFTLOCKUP_DETECTOR from LOCKUP_DETECTOR, and split HARDLOCKUP_DETECTOR_PERF from HARDLOCKUP_DETECTOR. LOCKUP_DETECTOR implies the general boot, sysctl, and programming interfaces for the lockup detectors. An architecture that wants to use a hard lockup detector must define HAVE_HARDLOCKUP_DETECTOR_PERF or HAVE_HARDLOCKUP_DETECTOR_ARCH. Alternatively an arch can define HAVE_NMI_WATCHDOG, which provides the minimum arch_touch_nmi_watchdog, and it otherwise does its own thing and does not implement the LOCKUP_DETECTOR interfaces. sparc is unusual in that it has started to implement some of the interfaces, but not fully yet. It should probably be converted to a full HAVE_HARDLOCKUP_DETECTOR_ARCH. [npiggin@gmail.com: fix] Link: http://lkml.kernel.org/r/20170617223522.66c0ad88@roar.ozlabs.ibm.com Link: http://lkml.kernel.org/r/20170616065715.18390-4-npiggin@gmail.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Don Zickus <dzickus@redhat.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Tested-by: Babu Moger <babu.moger@oracle.com> [sparc] Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Linus Torvalds | d691b7e7d1 |
powerpc updates for 4.13
Highlights include: - Support for STRICT_KERNEL_RWX on 64-bit server CPUs. - Platform support for FSP2 (476fpe) board - Enable ZONE_DEVICE on 64-bit server CPUs. - Generic & powerpc spin loop primitives to optimise busy waiting - Convert VDSO update function to use new update_vsyscall() interface - Optimisations to hypercall/syscall/context-switch paths - Improvements to the CPU idle code on Power8 and Power9. As well as many other fixes and improvements. Thanks to: Akshay Adiga, Andrew Donnellan, Andrew Jeffery, Anshuman Khandual, Anton Blanchard, Balbir Singh, Benjamin Herrenschmidt, Christophe Leroy, Christophe Lombard, Colin Ian King, Dan Carpenter, Gautham R. Shenoy, Hari Bathini, Ian Munsie, Ivan Mikhaylov, Javier Martinez Canillas, Madhavan Srinivasan, Masahiro Yamada, Matt Brown, Michael Neuling, Michal Suchanek, Murilo Opsfelder Araujo, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pavel Machek, Russell Currey, Santosh Sivaraj, Stephen Rothwell, Thiago Jung Bauermann, Yang Li. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJZXyPCAAoJEFHr6jzI4aWAI9QQAISf2x5y//cqCi4ISyQB5pTq KLS/yQajNkQOw7c0fzBZOaH5Xd/SJ6AcKWDg8yDlpDR3+sRRsr98iIRECgKS5I7/ DxD9ywcbSoMXFQQo1ZMCp5CeuMUIJRtugBnUQM+JXCSUCPbznCY5DchDTLyTBTpO MeMVhI//JxthhoOMA9MudiEGaYCU9ho442Z4OJUSiLUv8WRbvQX9pTqoc4vx1fxA BWf2mflztBVcIfKIyxIIIlDLukkMzix6gSYPMCbC7lzkbnU7JSqKiheJXjo1gJS2 ePHKDxeNR2/QP0g/j3aT/MR1uTt9MaNBSX3gANE1xQ9OoJ8m1sOtCO4gNbSdLWae eXhDnoiEp930DRZOeEioOItuWWoxFaMyYk3BMmRKV4QNdYL3y3TRVeL2/XmRGqYL Lxz4IY/x5TteFEJNGcRX90uizNSi8AaEXPF16pUk8Ctt6eH3ZSwPMv2fHeYVCMr0 KFlKHyaPEKEoztyzLcUR6u9QB56yxDN58bvLpd32AeHvKhqyxFoySy59x9bZbatn B2y8mmDItg860e0tIG6jrtplpOVvL8i5jla5RWFVoQDuxxrLAds3vG9JZQs+eRzx Fiic93bqeUAS6RzhXbJ6QQJYIyhE7yqpcgv7ME1W87SPef3HPBk9xlp3yIDwdA2z bBDvrRnvupusz8qCWrxe =w8rj -----END PGP SIGNATURE----- Merge tag 'powerpc-4.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights include: - Support for STRICT_KERNEL_RWX on 64-bit server CPUs. - Platform support for FSP2 (476fpe) board - Enable ZONE_DEVICE on 64-bit server CPUs. - Generic & powerpc spin loop primitives to optimise busy waiting - Convert VDSO update function to use new update_vsyscall() interface - Optimisations to hypercall/syscall/context-switch paths - Improvements to the CPU idle code on Power8 and Power9. As well as many other fixes and improvements. Thanks to: Akshay Adiga, Andrew Donnellan, Andrew Jeffery, Anshuman Khandual, Anton Blanchard, Balbir Singh, Benjamin Herrenschmidt, Christophe Leroy, Christophe Lombard, Colin Ian King, Dan Carpenter, Gautham R. Shenoy, Hari Bathini, Ian Munsie, Ivan Mikhaylov, Javier Martinez Canillas, Madhavan Srinivasan, Masahiro Yamada, Matt Brown, Michael Neuling, Michal Suchanek, Murilo Opsfelder Araujo, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pavel Machek, Russell Currey, Santosh Sivaraj, Stephen Rothwell, Thiago Jung Bauermann, Yang Li" * tag 'powerpc-4.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (158 commits) powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs powerpc/mm/radix: Implement STRICT_RWX/mark_rodata_ro() for Radix powerpc/mm/hash: Implement mark_rodata_ro() for hash powerpc/vmlinux.lds: Align __init_begin to 16M powerpc/lib/code-patching: Use alternate map for patch_instruction() powerpc/xmon: Add patch_instruction() support for xmon powerpc/kprobes/optprobes: Use patch_instruction() powerpc/kprobes: Move kprobes over to patch_instruction() powerpc/mm/radix: Fix execute permissions for interrupt_vectors powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() powerpc/64s: Blacklist rtas entry/exit from kprobes powerpc/64s: Blacklist functions invoked on a trap powerpc/64s: Un-blacklist system_call() from kprobes powerpc/64s: Move system_call() symbol to just after setting MSR_EE powerpc/64s: Blacklist system_call() and system_call_common() from kprobes powerpc/64s: Convert .L__replay_interrupt_return to a local label powerpc64/elfv1: Only dereference function descriptor for non-text symbols cxl: Export library to support IBM XSL powerpc/dts: Use #include "..." to include local DT powerpc/perf/hv-24x7: Aggregate result elements on POWER9 SMT8 ... |
|
Linus Torvalds | b6ffe9ba46 |
libnvdimm for 4.13
* Introduce the _flushcache() family of memory copy helpers and use them for persistent memory write operations on x86. The _flushcache() semantic indicates that the cache is either bypassed for the copy operation (movnt) or any lines dirtied by the copy operation are written back (clwb, clflushopt, or clflush). * Extend dax_operations with ->copy_from_iter() and ->flush() operations. These operations and other infrastructure updates allow all persistent memory specific dax functionality to be pushed into libnvdimm and the pmem driver directly. It also allows dax-specific sysfs attributes to be linked to a host device, for example: /sys/block/pmem0/dax/write_cache * Add support for the new NVDIMM platform/firmware mechanisms introduced in ACPI 6.2 and UEFI 2.7. This support includes the v1.2 namespace label format, extensions to the address-range-scrub command set, new error injection commands, and a new BTT (block-translation-table) layout. These updates support inter-OS and pre-OS compatibility. * Fix a longstanding memory corruption bug in nfit_test. * Make the pmem and nvdimm-region 'badblocks' sysfs files poll(2) capable. * Miscellaneous fixes and small updates across libnvdimm and the nfit driver. Acknowledgements that came after the branch was pushed: commit |
|
Aneesh Kumar K.V | e1073d1e79 |
mm/hugetlb: clean up ARCH_HAS_GIGANTIC_PAGE
This moves the #ifdef in C code to a Kconfig dependency. Also we move the gigantic_page_supported() function to be arch specific. This allows architectures to conditionally enable runtime allocation of gigantic huge page. Architectures like ppc64 supports different gigantic huge page size (16G and 1G) based on the translation mode selected. This provides an opportunity for ppc64 to enable runtime allocation only w.r.t 1G hugepage. No functional change in this patch. Link: http://lkml.kernel.org/r/1494995292-4443-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Huang Ying | 38d8b4e6bd |
mm, THP, swap: delay splitting THP during swap out
Patch series "THP swap: Delay splitting THP during swapping out", v11. This patchset is to optimize the performance of Transparent Huge Page (THP) swap. Recently, the performance of the storage devices improved so fast that we cannot saturate the disk bandwidth with single logical CPU when do page swap out even on a high-end server machine. Because the performance of the storage device improved faster than that of single logical CPU. And it seems that the trend will not change in the near future. On the other hand, the THP becomes more and more popular because of increased memory size. So it becomes necessary to optimize THP swap performance. The advantages of the THP swap support include: - Batch the swap operations for the THP to reduce lock acquiring/releasing, including allocating/freeing the swap space, adding/deleting to/from the swap cache, and writing/reading the swap space, etc. This will help improve the performance of the THP swap. - The THP swap space read/write will be 2M sequential IO. It is particularly helpful for the swap read, which are usually 4k random IO. This will improve the performance of the THP swap too. - It will help the memory fragmentation, especially when the THP is heavily used by the applications. The 2M continuous pages will be free up after THP swapping out. - It will improve the THP utilization on the system with the swap turned on. Because the speed for khugepaged to collapse the normal pages into the THP is quite slow. After the THP is split during the swapping out, it will take quite long time for the normal pages to collapse back into the THP after being swapped in. The high THP utilization helps the efficiency of the page based memory management too. There are some concerns regarding THP swap in, mainly because possible enlarged read/write IO size (for swap in/out) may put more overhead on the storage device. To deal with that, the THP swap in should be turned on only when necessary. For example, it can be selected via "always/never/madvise" logic, to be turned on globally, turned off globally, or turned on only for VMA with MADV_HUGEPAGE, etc. This patchset is the first step for the THP swap support. The plan is to delay splitting THP step by step, finally avoid splitting THP during the THP swapping out and swap out/in the THP as a whole. As the first step, in this patchset, the splitting huge page is delayed from almost the first step of swapping out to after allocating the swap space for the THP and adding the THP into the swap cache. This will reduce lock acquiring/releasing for the locks used for the swap cache management. With the patchset, the swap out throughput improves 15.5% (from about 3.73GB/s to about 4.31GB/s) in the vm-scalability swap-w-seq test case with 8 processes. The test is done on a Xeon E5 v3 system. The swap device used is a RAM simulated PMEM (persistent memory) device. To test the sequential swapping out, the test case creates 8 processes, which sequentially allocate and write to the anonymous pages until the RAM and part of the swap device is used up. This patch (of 5): In this patch, splitting huge page is delayed from almost the first step of swapping out to after allocating the swap space for the THP (Transparent Huge Page) and adding the THP into the swap cache. This will batch the corresponding operation, thus improve THP swap out throughput. This is the first step for the THP swap optimization. The plan is to delay splitting the THP step by step and avoid splitting the THP finally. In this patch, one swap cluster is used to hold the contents of each THP swapped out. So, the size of the swap cluster is changed to that of the THP (Transparent Huge Page) on x86_64 architecture (512). For other architectures which want such THP swap optimization, ARCH_USES_THP_SWAP_CLUSTER needs to be selected in the Kconfig file for the architecture. In effect, this will enlarge swap cluster size by 2 times on x86_64. Which may make it harder to find a free cluster when the swap space becomes fragmented. So that, this may reduce the continuous swap space allocation and sequential write in theory. The performance test in 0day shows no regressions caused by this. In the future of THP swap optimization, some information of the swapped out THP (such as compound map count) will be recorded in the swap_cluster_info data structure. The mem cgroup swap accounting functions are enhanced to support charge or uncharge a swap cluster backing a THP as a whole. The swap cluster allocate/free functions are added to allocate/free a swap cluster for a THP. A fair simple algorithm is used for swap cluster allocation, that is, only the first swap device in priority list will be tried to allocate the swap cluster. The function will fail if the trying is not successful, and the caller will fallback to allocate a single swap slot instead. This works good enough for normal cases. If the difference of the number of the free swap clusters among multiple swap devices is significant, it is possible that some THPs are split earlier than necessary. For example, this could be caused by big size difference among multiple swap devices. The swap cache functions is enhanced to support add/delete THP to/from the swap cache as a set of (HPAGE_PMD_NR) sub-pages. This may be enhanced in the future with multi-order radix tree. But because we will split the THP soon during swapping out, that optimization doesn't make much sense for this first step. The THP splitting functions are enhanced to support to split THP in swap cache during swapping out. The page lock will be held during allocating the swap cluster, adding the THP into the swap cache and splitting the THP. So in the code path other than swapping out, if the THP need to be split, the PageSwapCache(THP) will be always false. The swap cluster is only available for SSD, so the THP swap optimization in this patchset has no effect for HDD. [ying.huang@intel.com: fix two issues in THP optimize patch] Link: http://lkml.kernel.org/r/87k25ed8zo.fsf@yhuang-dev.intel.com [hannes@cmpxchg.org: extensive cleanups and simplifications, reduce code size] Link: http://lkml.kernel.org/r/20170515112522.32457-2-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Suggested-by: Andrew Morton <akpm@linux-foundation.org> [for config option] Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> [for changes in huge_memory.c and huge_mm.h] Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Ebru Akagunduz <ebru.akagunduz@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Shaohua Li <shli@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Linus Torvalds | 4422d80ed7 |
Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Thomas Gleixner: "The RAS updates for the 4.13 merge window: - Cleanup of the MCE injection facility (Borsilav Petkov) - Rework of the AMD/SMCA handling (Yazen Ghannam) - Enhancements for ACPI/APEI to handle new notitication types (Shiju Jose) - atomic_t to refcount_t conversion (Elena Reshetova) - A few fixes and enhancements all over the place" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: RAS/CEC: Check the correct variable in the debugfs error handling x86/mce: Always save severity in machine_check_poll() x86/MCE, xen/mcelog: Make /dev/mcelog registration messages more precise x86/mce: Update bootlog description to reflect behavior on AMD x86/mce: Don't disable MCA banks when offlining a CPU on AMD x86/mce/mce-inject: Preset the MCE injection struct x86/mce: Clean up include files x86/mce: Get rid of register_mce_write_callback() x86/mce: Merge mce_amd_inj into mce-inject x86/mce/AMD: Use saved threshold block info in interrupt handler x86/mce/AMD: Use msr_stat when clearing MCA_STATUS x86/mce/AMD: Carve out SMCA bank configuration x86/mce/AMD: Redo error logging from APIC LVT interrupt handlers x86/mce: Convert threshold_bank.cpus from atomic_t to refcount_t RAS: Make local function parse_ras_param() static ACPI/APEI: Handle GSIV and GPIO notification types |
|
Linus Torvalds | 8c073517a9 |
Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 PCI updates from Thomas Gleixner: "This update provides the seperation of x86 PCI accessors from the global PCI lock in the generic PCI config space accessors. The reasons for this are: - x86 has it's own PCI config lock for various reasons, so the accessors have to lock two locks nested. - The ECAM (mmconfig) access to the extended configuration space does not require locking. The existing generic locking causes a massive lock contention when accessing the extended config space of the Uncore facility for performance monitoring. The commit which switched the access to the primary config space over to ECAM mode has been removed from the branch, so the primary config space is still accessed with type1 accessors properly serialized by the x86 internal locking. Bjorn agreed on merging this through the x86 tree" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/PCI: Select CONFIG_PCI_LOCKLESS_CONFIG PCI: Provide Kconfig option for lockless config space accessors x86/PCI/ce4100: Properly lock accessor functions x86/PCI: Abort if legacy init fails x86/PCI: Remove duplicate defines |
|
Linus Torvalds | 03ffbcdd78 |
Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner: "The irq department delivers: - Expand the generic infrastructure handling the irq migration on CPU hotplug and convert X86 over to it. (Thomas Gleixner) Aside of consolidating code this is a preparatory change for: - Finalizing the affinity management for multi-queue devices. The main change here is to shut down interrupts which are affine to a outgoing CPU and reenabling them when the CPU comes online again. That avoids moving interrupts pointlessly around and breaking and reestablishing affinities for no value. (Christoph Hellwig) Note: This contains also the BLOCK-MQ and NVME changes which depend on the rework of the irq core infrastructure. Jens acked them and agreed that they should go with the irq changes. - Consolidation of irq domain code (Marc Zyngier) - State tracking consolidation in the core code (Jeffy Chen) - Add debug infrastructure for hierarchical irq domains (Thomas Gleixner) - Infrastructure enhancement for managing generic interrupt chips via devmem (Bartosz Golaszewski) - Constification work all over the place (Tobias Klauser) - Two new interrupt controller drivers for MVEBU (Thomas Petazzoni) - The usual set of fixes, updates and enhancements all over the place" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits) irqchip/or1k-pic: Fix interrupt acknowledgement irqchip/irq-mvebu-gicp: Allocate enough memory for spi_bitmap irqchip/gic-v3: Fix out-of-bound access in gic_set_affinity nvme: Allocate queues for all possible CPUs blk-mq: Create hctx for each present CPU blk-mq: Include all present CPUs in the default queue mapping genirq: Avoid unnecessary low level irq function calls genirq: Set irq masked state when initializing irq_desc genirq/timings: Add infrastructure for estimating the next interrupt arrival time genirq/timings: Add infrastructure to track the interrupt timings genirq/debugfs: Remove pointless NULL pointer check irqchip/gic-v3-its: Don't assume GICv3 hardware supports 16bit INTID irqchip/gic-v3-its: Add ACPI NUMA node mapping irqchip/gic-v3-its-platform-msi: Make of_device_ids const irqchip/gic-v3-its: Make of_device_ids const irqchip/irq-mvebu-icu: Add new driver for Marvell ICU irqchip/irq-mvebu-gicp: Add new driver for Marvell GICP dt-bindings/interrupt-controller: Add DT binding for the Marvell ICU genirq/irqdomain: Remove auto-recursive hierarchy support irqchip/MSI: Use irq_domain_update_bus_token instead of an open coded access ... |
|
Oliver O'Halloran | 65f7d04978 |
mm, x86: Add ARCH_HAS_ZONE_DEVICE to Kconfig
Currently ZONE_DEVICE depends on X86_64 and this will get unwieldly as new architectures (and platforms) get ZONE_DEVICE support. Move to an arch selected Kconfig option to save us the trouble. Cc: linux-mm@kvack.org Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> |
|
Thomas Gleixner | df65c1bcd9 |
x86/PCI: Select CONFIG_PCI_LOCKLESS_CONFIG
All x86 PCI configuration space accessors have either their own serialization or can operate completely lockless (ECAM). Disable the global lock in the generic PCI configuration space accessors. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <helgaas@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: linux-pci@vger.kernel.org Link: http://lkml.kernel.org/r/20170316215057.295079391@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
|
Thomas Gleixner | c7d6c9dd87 |
x86/apic: Implement effective irq mask update
Add the effective irq mask update to the apic implementations and enable effective irq masks for x86. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.878370703@linutronix.de |
|
Thomas Gleixner | ad7a929fa4 |
x86/irq: Use irq_migrate_all_off_this_cpu()
The generic migration code supports all the required features already. Remove the x86 specific implementation and use the generic one. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235445.851311033@linutronix.de |
|
Ingo Molnar | a4eb8b9935 |
Merge branch 'linus' into x86/mm, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Borislav Petkov | bc8e80d56c |
x86/mce: Merge mce_amd_inj into mce-inject
Reuse mce_amd_inj's debugfs interface so that mce-inject can benefit from it too. The old functionality is still preserved under CONFIG_X86_MCELOG_LEGACY. Tested-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Yazen Ghannam <yazen.ghannam@amd.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/20170613162835.30750-4-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Kirill A. Shutemov | e585513b76 |
x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation
This patch provides all required callbacks required by the generic get_user_pages_fast() code and switches x86 over - and removes the platform specific implementation. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170606113133.22974-2-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Dan Williams | 0aed55af88 |
x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations
The pmem driver has a need to transfer data with a persistent memory destination and be able to rely on the fact that the destination writes are not cached. It is sufficient for the writes to be flushed to a cpu-store-buffer (non-temporal / "movnt" in x86 terms), as we expect userspace to call fsync() to ensure data-writes have reached a power-fail-safe zone in the platform. The fsync() triggers a REQ_FUA or REQ_FLUSH to the pmem driver which will turn around and fence previous writes with an "sfence". Implement a __copy_from_user_inatomic_flushcache, memcpy_page_flushcache, and memcpy_flushcache, that guarantee that the destination buffer is not dirty in the cpu cache on completion. The new copy_from_iter_flushcache and sub-routines will be used to replace the "pmem api" (include/linux/pmem.h + arch/x86/include/asm/pmem.h). The availability of copy_from_iter_flushcache() and memcpy_flushcache() are gated by the CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE config symbol, and fallback to copy_from_iter_nocache() and plain memcpy() otherwise. This is meant to satisfy the concern from Linus that if a driver wants to do something beyond the normal nocache semantics it should be something private to that driver [1], and Al's concern that anything uaccess related belongs with the rest of the uaccess code [2]. The first consumer of this interface is a new 'copy_from_iter' dax operation so that pmem can inject cache maintenance operations without imposing this overhead on other dax-capable drivers. [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html Cc: <x86@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> |
|
Bilal Amarni | 47b2c3fff4 |
security/keys: add CONFIG_KEYS_COMPAT to Kconfig
CONFIG_KEYS_COMPAT is defined in arch-specific Kconfigs and is missing for several 64-bit architectures : mips, parisc, tile. At the moment and for those architectures, calling in 32-bit userspace the keyctl syscall would return an ENOSYS error. This patch moves the CONFIG_KEYS_COMPAT option to security/keys/Kconfig, to make sure the compatibility wrapper is registered by default for any 64-bit architecture as long as it is configured with CONFIG_COMPAT. [DH: Modified to remove arm64 compat enablement also as requested by Eric Biggers] Signed-off-by: Bilal Amarni <bilal.amarni@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> cc: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: James Morris <james.l.morris@oracle.com> |
|
Andy Lutomirski | ce4a4e565f |
x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code
The UP asm/tlbflush.h generates somewhat nicer code than the SMP version. Aside from that, it's fallen quite a bit behind the SMP code: - flush_tlb_mm_range() didn't flush individual pages if the range was small. - The lazy TLB code was much weaker. This usually wouldn't matter, but, if a kernel thread flushed its lazy "active_mm" more than once (due to reclaim or similar), it wouldn't be unlazied and would instead pointlessly flush repeatedly. - Tracepoints were missing. Aside from that, simply having the UP code around was a maintanence burden, since it means that any change to the TLB flush code had to make sure not to break it. Simplify everything by deleting the UP code. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Benjamin Peterson | c9525a3fab |
x86/watchdog: Fix Kconfig help text file path reference to lockup watchdog documentation
Signed-off-by: Benjamin Peterson <bp@benjamin.pe>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes:
|
|
Linus Torvalds | 76f1948a79 |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching
Pull livepatch updates from Jiri Kosina: - a per-task consistency model is being added for architectures that support reliable stack dumping (extending this, currently rather trivial set, is currently in the works). This extends the nature of the types of patches that can be applied by live patching infrastructure. The code stems from the design proposal made [1] back in November 2014. It's a hybrid of SUSE's kGraft and RH's kpatch, combining advantages of both: it uses kGraft's per-task consistency and syscall barrier switching combined with kpatch's stack trace switching. There are also a number of fallback options which make it quite flexible. Most of the heavy lifting done by Josh Poimboeuf with help from Miroslav Benes and Petr Mladek [1] https://lkml.kernel.org/r/20141107140458.GA21774@suse.cz - module load time patch optimization from Zhou Chengming - a few assorted small fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch: add missing printk newlines livepatch: Cancel transition a safe way for immediate patches livepatch: Reduce the time of finding module symbols livepatch: make klp_mutex proper part of API livepatch: allow removal of a disabled patch livepatch: add /proc/<pid>/patch_state livepatch: change to a per-task consistency model livepatch: store function sizes livepatch: use kstrtobool() in enabled_store() livepatch: move patching functions into patch.c livepatch: remove unnecessary object loaded check livepatch: separate enabled and patched states livepatch/s390: add TIF_PATCH_PENDING thread flag livepatch/s390: reorganize TIF thread flag bits livepatch/powerpc: add TIF_PATCH_PENDING thread flag livepatch/x86: add TIF_PATCH_PENDING thread flag livepatch: create temporary klp_update_patch_state() stub x86/entry: define _TIF_ALLWORK_MASK flags explicitly stacktrace/x86: add function for detecting reliable stack traces |
|
Linus Torvalds | d3b5d35290 |
Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Ingo Molnar: "The main x86 MM changes in this cycle were: - continued native kernel PCID support preparation patches to the TLB flushing code (Andy Lutomirski) - various fixes related to 32-bit compat syscall returning address over 4Gb in applications, launched from 64-bit binaries - motivated by C/R frameworks such as Virtuozzo. (Dmitry Safonov) - continued Intel 5-level paging enablement: in particular the conversion of x86 GUP to the generic GUP code. (Kirill A. Shutemov) - x86/mpx ABI corner case fixes/enhancements (Joerg Roedel) - ... plus misc updates, fixes and cleanups" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (62 commits) mm, zone_device: Replace {get, put}_zone_device_page() with a single reference to fix pmem crash x86/mm: Fix flush_tlb_page() on Xen x86/mm: Make flush_tlb_mm_range() more predictable x86/mm: Remove flush_tlb() and flush_tlb_current_task() x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly() x86/mm/64: Fix crash in remove_pagetable() Revert "x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation" x86/boot/e820: Remove a redundant self assignment x86/mm: Fix dump pagetables for 4 levels of page tables x86/mpx, selftests: Only check bounds-vs-shadow when we keep shadow x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space Revert "x86/mm/numa: Remove numa_nodemask_from_meminfo()" x86/espfix: Add support for 5-level paging x86/kasan: Extend KASAN to support 5-level paging x86/mm: Add basic defines/helpers for CONFIG_X86_5LEVEL=y x86/paravirt: Add 5-level support to the paravirt code x86/mm: Define virtual memory map for 5-level paging x86/asm: Remove __VIRTUAL_MASK_SHIFT==47 assert x86/boot: Detect 5-level paging support x86/mm/numa: Remove numa_nodemask_from_meminfo() ... |
|
Linus Torvalds | 3fb9268e43 |
Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm updates from Ingo Molnar: "The main changes in this cycle were: - unwinder fixes and enhancements - improve ftrace interaction with the unwinder - optimize the code footprint of WARN() and related debugging constructs - ... plus misc updates, cleanups and fixes" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/unwind: Dump all stacks in unwind_dump() x86/unwind: Silence more entry-code related warnings x86/ftrace: Fix ebp in ftrace_regs_caller that screws up unwinder x86/unwind: Remove unused 'sp' parameter in unwind_dump() x86/unwind: Prepend hex mask value with '0x' in unwind_dump() x86/unwind: Properly zero-pad 32-bit values in unwind_dump() x86/unwind: Ensure stack pointer is aligned debug: Avoid setting BUGFLAG_WARNING twice x86/unwind: Silence entry-related warnings x86/unwind: Read stack return address in update_stack_state() x86/unwind: Move common code into update_stack_state() debug: Fix __bug_table[] in arch linker scripts debug: Add _ONCE() logic to report_bug() x86/debug: Define BUG() again for !CONFIG_BUG x86/debug: Implement __WARN() using UD0 x86/ftrace: Use Makefile logic instead of #ifdef for compiling ftrace_*.o x86/ftrace: Add -mfentry support to x86_32 with DYNAMIC_FTRACE set x86/ftrace: Clean up ftrace_regs_caller x86/ftrace: Add stack frame pointer to ftrace_caller x86/ftrace: Move the ftrace specific code out of entry_32.S ... |
|
Linus Torvalds | 16b76293c5 |
Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar: "The biggest changes in this cycle were: - reworking of the e820 code: separate in-kernel and boot-ABI data structures and apply a whole range of cleanups to the kernel side. No change in functionality. - enable KASLR by default: it's used by all major distros and it's out of the experimental stage as well. - ... misc fixes and cleanups" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits) x86/KASLR: Fix kexec kernel boot crash when KASLR randomization fails x86/reboot: Turn off KVM when halting a CPU x86/boot: Fix BSS corruption/overwrite bug in early x86 kernel startup x86: Enable KASLR by default boot/param: Move next_arg() function to lib/cmdline.c for later reuse x86/boot: Fix Sparse warning by including required header file x86/boot/64: Rename start_cpu() x86/xen: Update e820 table handling to the new core x86 E820 code x86/boot: Fix pr_debug() API braindamage xen, x86/headers: Add <linux/device.h> dependency to <asm/xen/page.h> x86/boot/e820: Simplify e820__update_table() x86/boot/e820: Separate the E820 ABI structures from the in-kernel structures x86/boot/e820: Fix and clean up e820_type switch() statements x86/boot/e820: Rename the remaining E820 APIs to the e820__*() prefix x86/boot/e820: Remove unnecessary #include's x86/boot/e820: Rename e820_mark_nosave_regions() to e820__register_nosave_regions() x86/boot/e820: Rename e820_reserve_resources*() to e820__reserve_resources*() x86/boot/e820: Use bool in query APIs x86/boot/e820: Document e820__reserve_setup_data() x86/boot/e820: Clean up __e820__update_table() et al ... |
|
Linus Torvalds | 3dee9fb2a4 |
Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Ingo Molnar: "The main changes in this cycle were: - add the 'Corrected Errors Collector' kernel feature which collect and monitor correctable errors statistics and will preemptively (soft-)offline physical pages that have a suspiciously high error count. - handle MCE errors during kexec() more gracefully - factor out and deprecate the /dev/mcelog driver - ... plus misc fixes and cleanpus" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Check MCi_STATUS[MISCV] for usable addr on Intel only ACPI/APEI: Use setup_deferrable_timer() x86/mce: Update notifier priority check x86/mce: Enable PPIN for Knights Landing/Mill x86/mce: Do not register notifiers with invalid prio x86/mce: Factor out and deprecate the /dev/mcelog driver RAS: Add a Corrected Errors Collector x86/mce: Rename mce_log to mce_log_buffer x86/mce: Rename mce_log()'s argument x86/mce: Init some CPU features early x86/mce: Handle broadcasted MCE gracefully with kexec |
|
Al Viro | 2fefc97b21 |
HAVE_ARCH_HARDENED_USERCOPY is unconditional now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
|
Al Viro | 701cac61d0 |
CONFIG_ARCH_HAS_RAW_COPY_USER is unconditional now
all architectures converted Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
|
Ingo Molnar | 6dd29b3df9 |
Revert "x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation"
This reverts commit |
|
Ingo Molnar | 6807c84652 |
x86: Enable KASLR by default
KASLR is mature (and important) enough to be enabled by default on x86. Also enable it by default in the defconfigs. Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: dan.j.williams@intel.com Cc: dave.jiang@intel.com Cc: dyoung@redhat.com Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Kirill A. Shutemov | 4c7c44837b |
x86/mm: Define virtual memory map for 5-level paging
The first part of memory map (up to %esp fixup) simply scales existing map for 4-level paging by factor of 9 -- number of bits addressed by the additional page table level. The rest of the map is unchanged. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170330080731.65421-4-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Al Viro | beba3a20bf |
x86: switch to RAW_COPY_USER
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
|
Tony Luck | 5de97c9f6d |
x86/mce: Factor out and deprecate the /dev/mcelog driver
Move all code relating to /dev/mcelog to a separate source file. /dev/mcelog driver can now operate from the machine check notifier with lowest prio. Signed-off-by: Tony Luck <tony.luck@intel.com> [ Move the mce_helper and trigger functionality behind CONFIG_X86_MCELOG_LEGACY. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170327093304.10683-6-bp@alien8.de [ Renamed CONFIG_X86_MCELOG to CONFIG_X86_MCELOG_LEGACY. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Steven Rostedt (VMware) | 644e0e8dc7 |
x86/ftrace: Add -mfentry support to x86_32 with DYNAMIC_FTRACE set
x86_64 has had fentry support for some time. I did not add support to x86_32 as I was unsure if it will be used much in the future. It is still very much used, and there's issues with function graph tracing with gcc playing around with the mcount frames, causing function graph to panic. The fentry code does not have this issue, and is able to cope as there is no frame to mess up. Note, this only adds support for fentry when DYNAMIC_FTRACE is set. There's really no reason to not have that set, because the performance of the machine drops significantly when it's not enabled. Keep !DYNAMIC_FTRACE around to test it off, as there's still some archs that have FTRACE but not DYNAMIC_FTRACE. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20170323143446.052202377@goodmis.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
|
Kirill A. Shutemov | 2947ba054a |
x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation
This patch provides all required callbacks required by the generic get_user_pages_fast() code and switches x86 over - and removes the platform specific implementation. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Aneesh Kumar K . V <aneesh.kumar@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dann Frazier <dann.frazier@canonical.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170316213906.89528-1-kirill.shutemov@linux.intel.com [ Minor readability edits. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Dmitry Safonov | 1b028f784e |
x86/mm: Introduce mmap_compat_base() for 32-bit mmap()
mmap() uses a base address, from which it starts to look for a free space for allocation. The base address is stored in mm->mmap_base, which is calculated during exec(). The address depends on task's size, set rlimit for stack, ASLR randomization. The base depends on the task size and the number of random bits which are different for 64-bit and 32bit applications. Due to the fact, that the base address is fixed, its mmap() from a compat (32bit) syscall issued by a 64bit task will return a address which is based on the 64bit base address and does not fit into the 32bit address space (4GB). The returned pointer is truncated to 32bit, which results in an invalid address. To solve store a seperate compat address base plus a compat legacy address base in mm_struct. These bases are calculated at exec() time and can be used later to address the 32bit compat mmap() issued by 64 bit applications. As a consequence of this change 32-bit applications issuing a 64-bit syscall (after doing a long jump) will get a 64-bit mapping now. Before this change 32-bit applications always got a 32bit mapping. [ tglx: Massaged changelog and added a comment ] Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: 0x7f454c46@gmail.com Cc: linux-mm@kvack.org Cc: Andy Lutomirski <luto@kernel.org> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Borislav Petkov <bp@suse.de> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/20170306141721.9188-4-dsafonov@virtuozzo.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
|
Josh Poimboeuf | af085d9084 |
stacktrace/x86: add function for detecting reliable stack traces
For live patching and possibly other use cases, a stack trace is only useful if it can be assured that it's completely reliable. Add a new save_stack_trace_tsk_reliable() function to achieve that. Note that if the target task isn't the current task, and the target task is allowed to run, then it could be writing the stack while the unwinder is reading it, resulting in possible corruption. So the caller of save_stack_trace_tsk_reliable() must ensure that the task is either 'current' or inactive. save_stack_trace_tsk_reliable() relies on the x86 unwinder's detection of pt_regs on the stack. If the pt_regs are not user-mode registers from a syscall, then they indicate an in-kernel interrupt or exception (e.g. preemption or a page fault), in which case the stack is considered unreliable due to the nature of frame pointers. It also relies on the x86 unwinder's detection of other issues, such as: - corrupted stack data - stack grows the wrong way - stack walk doesn't reach the bottom - user didn't provide a large enough entries array Such issues are reported by checking unwind_error() and !unwind_done(). Also add CONFIG_HAVE_RELIABLE_STACKTRACE so arch-independent code can determine at build time whether the function is implemented. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Ingo Molnar <mingo@kernel.org> # for the x86 changes Signed-off-by: Jiri Kosina <jkosina@suse.cz> |
|
Linus Torvalds | 5d8a00eee2 |
The usual collection of new drivers, non-critical fixes, and updates
to existing clk drivers. The bulk of the work is on Allwinner and Rockchip SoCs, but there's also an Intel Atom driver in here too. New Drivers: - Tegra BPMP firmware - Hisilicon hi3660 SoCs - Rockchip rk3328 SoCs - Intel Atom PMC - STM32F746 - IDT VersaClock 5P49V5923 and 5P49V5933 - Marvell mv98dx3236 SoCs - Allwinner V3s SoCs Removed Drivers: - Samsung Exynos4415 SoCs Updates: - Migrate ABx500 to OF - Qualcomm IPQ4019 CPU clks and general PLL support - Qualcomm MSM8974 RPM - Rockchip non-critical fixes and clk id additions - Samsung Exynos4412 CPUs - Socionext UniPhier NAND and eMMC support - ZTE zx296718 i2s and other audio clks - Renesas CAN and MSIOF clks for R-Car M3-W - Renesas resets for R-Car Gen2 and Gen3 and RZ/G1 - TI CDCE913, CDCE937, and CDCE949 clk generators - Marvell Armada ap806 CPU frequencies - STM32F4* I2S/SAI support - Broadcom BCM2835 DSI support - Allwinner sun5i and A80 conversion to new style clk bindings -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJYsLxxAAoJEK0CiJfG5JUl0p0P/AiBaYvrmHBx3H9jdC3iQxd2 7luFN3OqpykmZc3xx2xO3WaZ96kwwxiMu8sj3+VQo6oCkEuOY2ru6uPiDOcF4P3+ 8ku2taoWlESDbVLebVTNJoRXBaBLaV+9BCN7AKvXpVw+/UkJI5hgr0yMdh4tgtvu K08tTMkDNDbA33KXuJo8/chQFqi2W6XBXk22YMkqqA8jx0F4EM759LcgUlD1YfBS HKkgSOgsW3Zwhl27ZEAJMthcmS4+wFaEgFBeipg/hxTLI3aQtmDtRfXwg0wkbBx2 8sVz9SyBwkjOT9+41kve+Je94NK3blnJEjbxPASveMwyhdX1TlDQCPfrXya/1zxz N1By1NpA6iEYwi4hy+OtBYlcsBHztAM/+eljDY2kEDvfiKjMa44GYmgBu4n8pq+n 75NJxws6ZkzPs5/QsLT3hvTaL1SNX6PaEW8HabDXO40ccZc4CYvFZVOXMAnKaXzZ 31hj8EvQ5x6hci+SPYyVu6j3ipOxN96VcZqEJ+hWyyuZEMK6Up1o/0lGZFgwa0UD SIl7RiTFKO6ko+8hYlk1g0DGtEyWDsdso1Bw4zaHwMngM/CwjJVzpK5T2t1fJyEh lN5MdhcOi0nsiRWdRxOwOlHDLf93qSo87mvseU1MCEXYN1aqTV3VxSm1YU8ZgQVk sAjpsJqj45enfDa9BmIt =o8o/ -----END PGP SIGNATURE----- Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk updates from Stephen Boyd: "The usual collection of new drivers, non-critical fixes, and updates to existing clk drivers. The bulk of the work is on Allwinner and Rockchip SoCs, but there's also an Intel Atom driver in here too. New Drivers: - Tegra BPMP firmware - Hisilicon hi3660 SoCs - Rockchip rk3328 SoCs - Intel Atom PMC - STM32F746 - IDT VersaClock 5P49V5923 and 5P49V5933 - Marvell mv98dx3236 SoCs - Allwinner V3s SoCs Removed Drivers: - Samsung Exynos4415 SoCs Updates: - Migrate ABx500 to OF - Qualcomm IPQ4019 CPU clks and general PLL support - Qualcomm MSM8974 RPM - Rockchip non-critical fixes and clk id additions - Samsung Exynos4412 CPUs - Socionext UniPhier NAND and eMMC support - ZTE zx296718 i2s and other audio clks - Renesas CAN and MSIOF clks for R-Car M3-W - Renesas resets for R-Car Gen2 and Gen3 and RZ/G1 - TI CDCE913, CDCE937, and CDCE949 clk generators - Marvell Armada ap806 CPU frequencies - STM32F4* I2S/SAI support - Broadcom BCM2835 DSI support - Allwinner sun5i and A80 conversion to new style clk bindings" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (130 commits) clk: renesas: mstp: ensure register writes complete clk: qcom: Do not drop device node twice clk: mvebu: adjust clock handling for the CP110 system controller clk: mvebu: Expand mv98dx3236-core-clock support clk: zte: add i2s clocks for zx296718 clk: sunxi-ng: sun9i-a80: Fix wrong pointer passed to PTR_ERR() clk: sunxi-ng: select SUNXI_CCU_MULT for sun5i clk: sunxi-ng: Check kzalloc() for errors and cleanup error path clk: tegra: Add BPMP clock driver clk: uniphier: add eMMC clock for LD11 and LD20 SoCs clk: uniphier: add NAND clock for all UniPhier SoCs ARM: dts: sun9i: Switch to new clock bindings clk: sunxi-ng: Add A80 Display Engine CCU clk: sunxi-ng: Add A80 USB CCU clk: sunxi-ng: Add A80 CCU clk: sunxi-ng: Support separately grouped PLL lock status register clk: sunxi-ng: mux: Get closest parent rate possible with CLK_SET_RATE_PARENT clk: sunxi-ng: mux: honor CLK_SET_RATE_NO_REPARENT flag clk: sunxi-ng: mux: Fix determine_rate for mux clocks with pre-dividers clk: qcom: SDHCI enablement on Nexus 5X / 6P ... |
|
Matthew Wilcox | a00cc7d9dd |
mm, x86: add support for PUD-sized transparent hugepages
The current transparent hugepage code only supports PMDs. This patch adds support for transparent use of PUDs with DAX. It does not include support for anonymous pages. x86 support code also added. Most of this patch simply parallels the work that was done for huge PMDs. The only major difference is how the new ->pud_entry method in mm_walk works. The ->pmd_entry method replaces the ->pte_entry method, whereas the ->pud_entry method works along with either ->pmd_entry or ->pte_entry. The pagewalk code takes care of locking the PUD before calling ->pud_walk, so handlers do not need to worry whether the PUD is stable. [dave.jiang@intel.com: fix SMP x86 32bit build for native_pud_clear()] Link: http://lkml.kernel.org/r/148719066814.31111.3239231168815337012.stgit@djiang5-desk3.ch.intel.com [dave.jiang@intel.com: native_pud_clear missing on i386 build] Link: http://lkml.kernel.org/r/148640375195.69754.3315433724330910314.stgit@djiang5-desk3.ch.intel.com Link: http://lkml.kernel.org/r/148545059381.17912.8602162635537598445.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Alexander Kapshuk <alexander.kapshuk@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Nilesh Choudhury <nilesh.choudhury@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Linus Torvalds | ca78d3173c |
arm64 updates for 4.11:
- Errata workarounds for Qualcomm's Falkor CPU - Qualcomm L2 Cache PMU driver - Qualcomm SMCCC firmware quirk - Support for DEBUG_VIRTUAL - CPU feature detection for userspace via MRS emulation - Preliminary work for the Statistical Profiling Extension - Misc cleanups and non-critical fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCgAGBQJYpIxqAAoJELescNyEwWM0xdwH/AsTYAXPZDMdRnrQUyV0Fd2H /9pMzww6dHXEmCMKkImf++otUD6S+gTCJTsj7kEAXT5sZzLk27std5lsW7R9oPjc bGQMalZy+ovLR1gJ6v072seM3In4xph/qAYOpD8Q0AfYCLHjfMMArQfoLa8Esgru eSsrAgzVAkrK7XHi3sYycUjr9Hac9tvOOuQ3SaZkDz4MfFIbI4b43+c1SCF7wgT9 tQUHLhhxzGmgxjViI2lLYZuBWsIWsE+algvOe1qocvA9JEIXF+W8NeOuCjdL8WwX 3aoqYClC+qD/9+/skShFv5gM5fo0/IweLTUNIHADXpB6OkCYDyg+sxNM+xnEWQU= =YrPg -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: - Errata workarounds for Qualcomm's Falkor CPU - Qualcomm L2 Cache PMU driver - Qualcomm SMCCC firmware quirk - Support for DEBUG_VIRTUAL - CPU feature detection for userspace via MRS emulation - Preliminary work for the Statistical Profiling Extension - Misc cleanups and non-critical fixes * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (74 commits) arm64/kprobes: consistently handle MRS/MSR with XZR arm64: cpufeature: correctly handle MRS to XZR arm64: traps: correctly handle MRS/MSR with XZR arm64: ptrace: add XZR-safe regs accessors arm64: include asm/assembler.h in entry-ftrace.S arm64: fix warning about swapper_pg_dir overflow arm64: Work around Falkor erratum 1003 arm64: head.S: Enable EL1 (host) access to SPE when entered at EL2 arm64: arch_timer: document Hisilicon erratum 161010101 arm64: use is_vmalloc_addr arm64: use linux/sizes.h for constants arm64: uaccess: consistently check object sizes perf: add qcom l2 cache perf events driver arm64: remove wrong CONFIG_PROC_SYSCTL ifdef ARM: smccc: Update HVC comment to describe new quirk parameter arm64: do not trace atomic operations ACPI/IORT: Fix the error return code in iort_add_smmu_platform_device() ACPI/IORT: Fix iort_node_get_id() mapping entries indexing arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA perf: xgene: Include module.h ... |
|
Linus Torvalds | 3051bf36c2 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller: "Highlights: 1) Support TX_RING in AF_PACKET TPACKET_V3 mode, from Sowmini Varadhan. 2) Simplify classifier state on sk_buff in order to shrink it a bit. From Willem de Bruijn. 3) Introduce SIPHASH and it's usage for secure sequence numbers and syncookies. From Jason A. Donenfeld. 4) Reduce CPU usage for ICMP replies we are going to limit or suppress, from Jesper Dangaard Brouer. 5) Introduce Shared Memory Communications socket layer, from Ursula Braun. 6) Add RACK loss detection and allow it to actually trigger fast recovery instead of just assisting after other algorithms have triggered it. From Yuchung Cheng. 7) Add xmit_more and BQL support to mvneta driver, from Simon Guinot. 8) skb_cow_data avoidance in esp4 and esp6, from Steffen Klassert. 9) Export MPLS packet stats via netlink, from Robert Shearman. 10) Significantly improve inet port bind conflict handling, especially when an application is restarted and changes it's setting of reuseport. From Josef Bacik. 11) Implement TX batching in vhost_net, from Jason Wang. 12) Extend the dummy device so that VF (virtual function) features, such as configuration, can be more easily tested. From Phil Sutter. 13) Avoid two atomic ops per page on x86 in bnx2x driver, from Eric Dumazet. 14) Add new bpf MAP, implementing a longest prefix match trie. From Daniel Mack. 15) Packet sample offloading support in mlxsw driver, from Yotam Gigi. 16) Add new aquantia driver, from David VomLehn. 17) Add bpf tracepoints, from Daniel Borkmann. 18) Add support for port mirroring to b53 and bcm_sf2 drivers, from Florian Fainelli. 19) Remove custom busy polling in many drivers, it is done in the core networking since 4.5 times. From Eric Dumazet. 20) Support XDP adjust_head in virtio_net, from John Fastabend. 21) Fix several major holes in neighbour entry confirmation, from Julian Anastasov. 22) Add XDP support to bnxt_en driver, from Michael Chan. 23) VXLAN offloads for enic driver, from Govindarajulu Varadarajan. 24) Add IPVTAP driver (IP-VLAN based tap driver) from Sainath Grandhi. 25) Support GRO in IPSEC protocols, from Steffen Klassert" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1764 commits) Revert "ath10k: Search SMBIOS for OEM board file extension" net: socket: fix recvmmsg not returning error from sock_error bnxt_en: use eth_hw_addr_random() bpf: fix unlocking of jited image when module ronx not set arch: add ARCH_HAS_SET_MEMORY config net: napi_watchdog() can use napi_schedule_irqoff() tcp: Revert "tcp: tcp_probe: use spin_lock_bh()" net/hsr: use eth_hw_addr_random() net: mvpp2: enable building on 64-bit platforms net: mvpp2: switch to build_skb() in the RX path net: mvpp2: simplify MVPP2_PRS_RI_* definitions net: mvpp2: fix indentation of MVPP2_EXT_GLOBAL_CTRL_DEFAULT net: mvpp2: remove unused register definitions net: mvpp2: simplify mvpp2_bm_bufs_add() net: mvpp2: drop useless fields in mvpp2_bm_pool and related code net: mvpp2: remove unused 'tx_skb' field of 'struct mvpp2_tx_queue' net: mvpp2: release reference to txq_cpu[] entry after unmapping net: mvpp2: handle too large value in mvpp2_rx_time_coal_set() net: mvpp2: handle too large value handling in mvpp2_rx_pkts_coal_set() net: mvpp2: remove useless arguments in mvpp2_rx_{pkts, time}_coal_set ... |
|
Linus Torvalds | 7bb033829e |
This renames the (now inaccurate) CONFIG_DEBUG_RODATA and related config
CONFIG_SET_MODULE_RONX to the more sensible CONFIG_STRICT_KERNEL_RWX and CONFIG_STRICT_MODULE_RWX. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Kees Cook <kees@outflux.net> iQIcBAABCgAGBQJYrJ2ZAAoJEIly9N/cbcAmb4UQAIDnJYF4xecUfxofypQwt7ey DcR8SH+g/Rkm3v2bUOrVdlP333ePRUEs6C47PgYSLlKsZiQA3H6bsTILHJZGHZ3j laNH4sjQ0j+Sr2rHXk8fLz3YpHHwIy49bfu2ERXFH92BMnTMCv1h9IWFgOMH+4y5 09n16TPHMUj1k0DGjHO/n03qLIKOo3Xy/Va5dhQ/6dGU4zR4KhOBnhLlG3IU7Atd KTR+ba/qym7bDQbTezMuaajTiZctr6a45yBKDWu4Knu+ot2a7K7fYvfRT3YVb5SU aTSYps7NKQbewcQSqNdek1zytoy2Ck7CH511e+3ypwNmao5KQwRgH7OX1pDEXyZv rGDaVzKMTSddH23jLEKUbpR847Lza9+V3h5YtbMG8GgiCKs91Ec666iEE3NVZBO8 1hiiYhE2iDxi10B/EZZcn2gOt2JaB2m2GxWIrJOz4txtDAWbUYlhUpWEUynBTPQ0 cYBZVnge81awipZJTWUv57LyufnTnMSK3i8Q8t0woj4C7pFbPYfjnKCrgwTQyAvr mD4uFBrgFb1lftbc3kfTdeoZmXerzvubsstWdr3rU3nsiJFzY1SwJZe8n0THyL4g DzURFrj/8UXb32Kavysz6FTxFO9u87mJm6yqHn/Y3bEK7Y7cch/NYjRC9Q6dpH+4 ld9apHF6iRrqgf+x6oOh =7KhR -----END PGP SIGNATURE----- Merge tag 'rodata-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull rodata updates from Kees Cook: "This renames the (now inaccurate) DEBUG_RODATA and related SET_MODULE_RONX configs to the more sensible STRICT_KERNEL_RWX and STRICT_MODULE_RWX" * tag 'rodata-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: arch: Rename CONFIG_DEBUG_RODATA and CONFIG_DEBUG_MODULE_RONX arch: Move CONFIG_DEBUG_RODATA and CONFIG_SET_MODULE_RONX to be common |
|
Daniel Borkmann | d2852a2240 |
arch: add ARCH_HAS_SET_MEMORY config
Currently, there's no good way to test for the presence of set_memory_ro/rw/x/nx() helpers implemented by archs such as x86, arm, arm64 and s390. There's DEBUG_SET_MODULE_RONX and DEBUG_RODATA, however both don't really reflect that: set_memory_*() are also available even when DEBUG_SET_MODULE_RONX is turned off, and DEBUG_RODATA is set by parisc, but doesn't implement above functions. Thus, add ARCH_HAS_SET_MEMORY that is selected by mentioned archs, where generic code can test against this. This also allows later on to move DEBUG_SET_MODULE_RONX out of the arch specific Kconfig to define it only once depending on ARCH_HAS_SET_MEMORY. Suggested-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
Linus Torvalds | 292d386743 |
Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar: "Misc updates: - fix e820 error handling - convert page table setup code from assembly to C - fix kexec environment bug - ... plus small cleanups" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kconfig: Remove misleading note regarding hibernation and KASLR x86/boot: Fix KASLR and memmap= collision x86/e820/32: Fix e820_search_gap() error handling on x86-32 x86/boot/32: Convert the 32-bit pgtable setup code from assembly to C x86/e820: Make e820_search_gap() static and remove unused variables |
|
Laura Abbott | ad21fc4faa |
arch: Move CONFIG_DEBUG_RODATA and CONFIG_SET_MODULE_RONX to be common
There are multiple architectures that support CONFIG_DEBUG_RODATA and CONFIG_SET_MODULE_RONX. These options also now have the ability to be turned off at runtime. Move these to an architecture independent location and make these options def_bool y for almost all of those arches. Signed-off-by: Laura Abbott <labbott@redhat.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Kees Cook <keescook@chromium.org> |
|
Niklas Cassel | 5773ebfee7 |
x86/kconfig: Remove misleading note regarding hibernation and KASLR
There used to be a restriction with KASLR and hibernation, but this is no
longer true, and since commit:
|
|
Irina Tirdea | 80a7581f38 |
arch/x86/platform/atom: Move pmc_atom to drivers/platform/x86
The pmc_atom driver does not contain any architecture specific code. It only enables the SoC Power Management Controller driver for BayTrail and CherryTrail platforms. Move the pmc_atom driver from arch/x86/platform/atom to drivers/platform/x86. Also clean-up and reorder include files by alphabetical order in pmc_atom.h Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> |
|
Borislav Petkov | d4b2ac63b0 |
x86/ras/inject: Make it depend on X86_LOCAL_APIC=y
... and get rid of the annoying: arch/x86/kernel/cpu/mcheck/mce-inject.c:97:13: warning: ‘mce_irq_ipi’ defined but not used [-Wunused-function] when doing randconfig builds. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-2-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Laura Abbott | fa5b6ec9e5 |
lib/Kconfig.debug: Add ARCH_HAS_DEBUG_VIRTUAL
DEBUG_VIRTUAL currently depends on DEBUG_KERNEL && X86. arm64 is getting the same support. Rather than add a list of architectures, switch this to ARCH_HAS_DEBUG_VIRTUAL and let architectures select it as appropriate. Acked-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com> |
|
Linus Torvalds | eb254f323b |
Merge branch 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cache allocation interface from Thomas Gleixner: "This provides support for Intel's Cache Allocation Technology, a cache partitioning mechanism. The interface is odd, but the hardware interface of that CAT stuff is odd as well. We tried hard to come up with an abstraction, but that only allows rather simple partitioning, but no way of sharing and dealing with the per package nature of this mechanism. In the end we decided to expose the allocation bitmaps directly so all combinations of the hardware can be utilized. There are two ways of associating a cache partition: - Task A task can be added to a resource group. It uses the cache partition associated to the group. - CPU All tasks which are not member of a resource group use the group to which the CPU they are running on is associated with. That allows for simple CPU based partitioning schemes. The main expected user sare: - Virtualization so a VM can only trash only the associated part of the cash w/o disturbing others - Real-Time systems to seperate RT and general workloads. - Latency sensitive enterprise workloads - In theory this also can be used to protect against cache side channel attacks" [ Intel RDT is "Resource Director Technology". The interface really is rather odd and very specific, which delayed this pull request while I was thinking about it. The pull request itself came in early during the merge window, I just delayed it until things had calmed down and I had more time. But people tell me they'll use this, and the good news is that it is _so_ specific that it's rather independent of anything else, and no user is going to depend on the interface since it's pretty rare. So if push comes to shove, we can just remove the interface and nothing will break ] * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) x86/intel_rdt: Implement show_options() for resctrlfs x86/intel_rdt: Call intel_rdt_sched_in() with preemption disabled x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount x86/intel_rdt: Fix setting of closid when adding CPUs to a group x86/intel_rdt: Update percpu closid immeditately on CPUs affected by changee x86/intel_rdt: Reset per cpu closids on unmount x86/intel_rdt: Select KERNFS when enabling INTEL_RDT_A x86/intel_rdt: Prevent deadlock against hotplug lock x86/intel_rdt: Protect info directory from removal x86/intel_rdt: Add info files to Documentation x86/intel_rdt: Export the minimum number of set mask bits in sysfs x86/intel_rdt: Propagate error in rdt_mount() properly x86/intel_rdt: Add a missing #include MAINTAINERS: Add maintainer for Intel RDT resource allocation x86/intel_rdt: Add scheduler hook x86/intel_rdt: Add schemata file x86/intel_rdt: Add tasks files x86/intel_rdt: Add cpus file x86/intel_rdt: Add mkdir to resctrl file system x86/intel_rdt: Add "info" files to resctrl file system ... |
|
Linus Torvalds | 8421c60446 |
platform-drivers-x86 for 4.10-2
Move and add registration for the mlx-platform driver. Introduce button and lid drivers for the surface3 (different from the surface3-pro). Add BXT PMIC TMU support. Add Y700 to existing ideapad-laptop quirk. ideapad-laptop: - Add Y700 15-ACZ to no_hw_rfkill DMI list surface3_button: - Introduce button support for the Surface 3 surface3-wmi: - Add custom surface3 platform device for controlling LID - Balance locking on error path mlx-platform: - Add mlxcpld-hotplug driver registration - Fix semicolon.cocci warnings - Move module from arch/x86 platform/x86: - Add Whiskey Cove PMIC TMU support -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJYVxXxAAoJEKbMaAwKp364mDAIAL750zoO/liY1qfFGktkY2zS xknJqh4UeIVW6iN9hEq7jQD3jkFl7hNuXgFyogLZgJQge8gfufsOD7ljkhp2wVF2 3Ir437f4GdcUeuhO6PwuPZnn++Y6oH2rLZc1d+5anZRikW470tnXitrXs5hsqG57 74KiSFa+v+Uj1kqU4Gjt0QDSfmQc185Wn5xieHPpZrcjDaut+N4t06+MDMcH7fjk 6nJ+tvo/dreZojA+AklljaNB2BioIGbEOGamnxOJ9Rjs0jV2d0A1c/6XBHDe+Xtu SkWqvjdrLOSxAjErBpB97VYnYihjCDXfl+DIABOdumGO6hJz01DvbZ+VVLwGM0E= =YH31 -----END PGP SIGNATURE----- Merge tag 'platform-drivers-x86-v4.10-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86 Pull more x86 platform driver updates from Darren Hart: "Move and add registration for the mlx-platform driver. Introduce button and lid drivers for the surface3 (different from the surface3-pro). Add BXT PMIC TMU support. Add Y700 to existing ideapad-laptop quirk. Summary: ideapad-laptop: - Add Y700 15-ACZ to no_hw_rfkill DMI list surface3_button: - Introduce button support for the Surface 3 surface3-wmi: - Add custom surface3 platform device for controlling LID - Balance locking on error path mlx-platform: - Add mlxcpld-hotplug driver registration - Fix semicolon.cocci warnings - Move module from arch/x86 platform/x86: - Add Whiskey Cove PMIC TMU support" * tag 'platform-drivers-x86-v4.10-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86: platform/x86: surface3-wmi: Balance locking on error path platform/x86: Add Whiskey Cove PMIC TMU support platform/x86: ideapad-laptop: Add Y700 15-ACZ to no_hw_rfkill DMI list platform/x86: Introduce button support for the Surface 3 platform/x86: Add custom surface3 platform device for controlling LID platform/x86: mlx-platform: Add mlxcpld-hotplug driver registration platform/x86: mlx-platform: Fix semicolon.cocci warnings platform/x86: mlx-platform: Move module from arch/x86 |
|
Vadim Pasternak | 6613d18e90 |
platform/x86: mlx-platform: Move module from arch/x86
Since mlx-platform is not an architectural driver, it is moved out of arch/x86/platform to drivers/platform/x86. Relevant Makefile and Kconfig are updated. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> |
|
Linus Torvalds | e7aa8c2eb1 |
These are the documentation changes for 4.10.
It's another busy cycle for the docs tree, as the sphinx conversion continues. Highlights include: - Further work on PDF output, which remains a bit of a pain but should be more solid now. - Five more DocBook template files converted to Sphinx. Only 27 to go... Lots of plain-text files have also been converted and integrated. - Images in binary formats have been replaced with more source-friendly versions. - Various bits of organizational work, including the renaming of various files discussed at the kernel summit. - New documentation for the device_link mechanism. ...and, of course, lots of typo fixes and small updates. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYTbl7AAoJEI3ONVYwIuV63NIP/REwzThnGWFJMRSuq8Ieq2r9 sFSQsaGTGlhyKiDoEooo+SO/Za3uTonjK+e7WZg8mhdiEdamta5aociU/71C1Yy/ T9ur0FhcGblrvZ1NidSDvCLwuECZOMMei7mgLZ9a+KCpc4ANqqTVZSUm1blKcqhF XelhVXxBa0ar35l/pVzyCxkdNXRWXv+MJZE8hp5XAdTdr11DS7UY9zrZdH31axtf BZlbYJrvB8WPydU6myTjRpirA17Hu7uU64MsL3bNIEiRQ+nVghEzQC8uxeUCvfVx r0H5AgGGQeir+e8GEv2T20SPZ+dumXs+y/HehKNb3jS3gV0mo+pKPeUhwLIxr+Zh QY64gf+jYf5ISHwAJRnU0Ima72ehObzSbx9Dko10nhq2OvbR5f83gjz9t9jKYFU7 RDowICA8lwqyRbHRoVfyoW8CpVhWFpMFu3yNeJMckeTish3m7ANqzaWslbsqIP5G zxgFMIrVVSbeae+sUeygtEJAnWI09aZ4tuaUXYtGWwu6ikC/3aV6DryP4bthG2LF A19uV4nMrLuuh8g2wiTHHjMfjYRwvSn+f9yaolwJhwyNDXQzRPy+ZJ3W/6olOkXC bAxTmVRCW5GA/fmSrfXmW1KbnxlWfP2C62hzZQ09UHxzTHdR97oFLDQdZhKo1uwf pmSJR0hVeRUmA4uw6+Su =A0EV -----END PGP SIGNATURE----- Merge tag 'docs-4.10' of git://git.lwn.net/linux Pull documentation update from Jonathan Corbet: "These are the documentation changes for 4.10. It's another busy cycle for the docs tree, as the sphinx conversion continues. Highlights include: - Further work on PDF output, which remains a bit of a pain but should be more solid now. - Five more DocBook template files converted to Sphinx. Only 27 to go... Lots of plain-text files have also been converted and integrated. - Images in binary formats have been replaced with more source-friendly versions. - Various bits of organizational work, including the renaming of various files discussed at the kernel summit. - New documentation for the device_link mechanism. ... and, of course, lots of typo fixes and small updates" * tag 'docs-4.10' of git://git.lwn.net/linux: (193 commits) dma-buf: Extract dma-buf.rst Update Documentation/00-INDEX docs: 00-INDEX: document directories/files with no docs docs: 00-INDEX: remove non-existing entries docs: 00-INDEX: add missing entries for documentation files/dirs docs: 00-INDEX: consolidate process/ and admin-guide/ description scripts: add a script to check if Documentation/00-INDEX is sane Docs: change sh -> awk in REPORTING-BUGS Documentation/core-api/device_link: Add initial documentation core-api: remove an unexpected unident ppc/idle: Add documentation for powersave=off Doc: Correct typo, "Introdution" => "Introduction" Documentation/atomic_ops.txt: convert to ReST markup Documentation/local_ops.txt: convert to ReST markup Documentation/assoc_array.txt: convert to ReST markup docs-rst: parse-headers.pl: cleanup the documentation docs-rst: fix media cleandocs target docs-rst: media/Makefile: reorganize the rules docs-rst: media: build SVG from graphviz files docs-rst: replace bayer.png by a SVG image ... |
|
Linus Torvalds | 5fc0363d43 |
Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build updates from Ingo Molnar: "The main changes in this cycle were: - Makefile improvements (Paul Bolle) - KConfig cleanups to better separate 32-bit only, 64-bit only and generic feature enablement sections (Ingo Molnar)" * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/build: Remove three unneeded genhdr-y entries x86/build: Don't use $(LINUXINCLUDE) twice x86/kconfig: Sort the 'config X86' selects alphabetically x86/kconfig: Clean up 32-bit compat options x86/kconfig: Clean up IA32_EMULATION select x86/kconfig, x86/pkeys: Move pkeys selects to X86_INTEL_MEMORY_PROTECTION_KEYS x86/kconfig: Move 64-bit only arch Kconfig selects to 'config X86_64' x86/kconfig: Move 32-bit only arch Kconfig selects to 'config X86_32' |
|
Linus Torvalds | df5f0f0a02 |
Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 RAS updates from Ingo Molnar: "The main changes in this development cycle were: - more AMD northbridge support work, mostly in preparation for Fam17h CPUs (Yazen Ghannam, Borislav Petkov) - cleanups/refactorings and fixes (Borislav Petkov, Tony Luck, Yinghai Lu)" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Include the PPIN in MCE records when available x86/mce/AMD: Add system physical address translation for AMD Fam17h x86/amd_nb: Add SMN and Indirect Data Fabric access for AMD Fam17h x86/amd_nb: Add Fam17h Data Fabric as "Northbridge" x86/amd_nb: Make all exports EXPORT_SYMBOL_GPL x86/amd_nb: Make amd_northbridges internal to amd_nb.c x86/mce/AMD: Reset Threshold Limit after logging error x86/mce/AMD: Fix HWID_MCATYPE calculation by grouping arguments x86/MCE: Correct TSC timestamping of error records x86/RAS: Hide SMCA bank names x86/RAS: Rename smca_bank_names to smca_names x86/RAS: Simplify SMCA HWID descriptor struct x86/RAS: Simplify SMCA bank descriptor struct x86/MCE: Dump MCE to dmesg if no consumers x86/RAS: Add TSC timestamp to the injected MCE x86/MCE: Do not look at panic_on_oops in the severity grading |