Commit Graph

123867 Commits

Author SHA1 Message Date
Kevin Hao c12e6f24d4 powerpc: Add option to use jump label for mmu_has_feature()
As we just did for CPU features.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:06 +10:00
Kevin Hao 4db7327194 powerpc: Add option to use jump label for cpu_has_feature()
We do binary patching of asm code using CPU features, which is a
one-time operation, done during early boot. However checks of CPU
features in C code are currently done at run time, even though the set
of CPU features can never change after boot.

We can optimise this by using jump labels to implement cpu_has_feature(),
meaning checks in C code are binary patched into a single nop or branch.

For a C sequence along the lines of:

    if (cpu_has_feature(FOO))
         return 2;

The generated code before is roughly:

    ld      r9,-27640(r2)
    ld      r9,0(r9)
    lwz     r9,32(r9)
    cmpwi   cr7,r9,0
    bge     cr7, 1f
    li      r3,2
    blr
1:  ...

After (true):
    nop
    li      r3,2
    blr

After (false):
    b	1f
    li      r3,2
    blr
1:  ...

mpe: Rename MAX_CPU_FEATURES as we already have a #define with that
name, and define it simply as a constant, rather than doing tricks with
sizeof and NULL pointers. Rename the array to cpu_feature_keys. Use the
kconfig we added to guard it. Add BUILD_BUG_ON() if the feature is not a
compile time constant. Rewrite the change log.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:05 +10:00
Michael Ellerman bfbfc8a43c powerpc: Add kconfig option to use jump labels for cpu/mmu_has_feature()
Add a kconfig option to control whether we use jump label for the
cpu/mmu_has_feature() checks. Currently this does nothing, but we will
enabled it in the subsequent patches.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:04 +10:00
Kevin Hao b92a226e52 powerpc: Move cpu_has_feature() to a separate file
We plan to use jump label for cpu_has_feature(). In order to implement
this we need to include the linux/jump_label.h in asm/cputable.h.

Unfortunately if we do that it leads to an include loop. The root of the
problem seems to be that reg.h needs cputable.h (for CPU_FTRs), and then
cputable.h via jump_label.h eventually pulls in hw_irq.h which needs
reg.h (for MSR_EE).

So move cpu_has_feature() to a separate file on its own.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Rename to cpu_has_feature.h and flesh out change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:03 +10:00
Kevin Hao 905259e33d powerpc: Remove mfvtb()
This function is only used by get_vtb(). They are almost the same except
the reading from the real register. Move the mfspr() to get_vtb() and
kill the function mfvtb(). With this, we can eliminate the use of
cpu_has_feature() in very core header file like reg.h. This is a
preparation for the use of jump label for cpu_has_feature().

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:03 +10:00
Aneesh Kumar K.V 309b315b6e powerpc: Call jump_label_init() in apply_feature_fixups()
Call jump_label_init() early so that we can use static keys for CPU and
MMU feature checks.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:02 +10:00
Aneesh Kumar K.V b8f1b4f860 powerpc/mm: Convert early cpu/mmu feature check to use the new helpers
This switches early feature checks to use the non static key variant of
the function. In later patches we will be switching cpu_has_feature()
and mmu_has_feature() to use static keys and we can use them only after
static key/jump label is initialized. Any check for feature before jump
label init should be done using this new helper.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:01 +10:00
Michael Ellerman a141cca389 powerpc/mm: Add early_[cpu|mmu]_has_feature()
In later patches, we will be switching CPU and MMU feature checks to
use static keys.

For checks in early boot before jump label is initialized we need a
variant of [cpu|mmu]_has_feature() that doesn't use jump labels.

So create those called, unimaginatively, early_[cpu|mmu]_has_feature().

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:15:00 +10:00
Michael Ellerman bab4c8de62 powerpc/mm: Define radix_enabled() in one place & use static inline
Currently we have radix_enabled() three times, twice in asm/book3s/64/mmu.h
and then a fallback in asm/mmu.h.

Consolidate them in asm/mmu.h. While we're at it convert them to be
static inlines, and change the fallback case to returning a bool, like
mmu_has_feature().

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:14:59 +10:00
Michael Ellerman 6574ba950b powerpc/kernel: Convert cpu_has_feature() to returning bool
The intention is that the result is only used as a boolean, so enforce
that by changing the return type to bool.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:14:58 +10:00
Michael Ellerman a81dc9d995 powerpc/kernel: Convert mmu_has_feature() to returning bool
The intention is that the result is only used as a boolean, so enforce
that by changing the return type to bool.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:14:58 +10:00
Aneesh Kumar K.V 5a25b6f527 powerpc/mm: Make MMU_FTR_RADIX a MMU family feature
MMU feature bits are defined such that we use the lower half to
present MMU family features. Remove the strict split of half and
also move Radix to a mmu family feature. Radix introduce a new MMU
model and strictly speaking it is a new MMU family. This also free
up bits which can be used for individual features later.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:14:57 +10:00
Michael Ellerman a28e46f109 powerpc/kernel: Check features don't change after patching
Early in boot we binary patch some sections of code based on the CPU and
MMU feature bits. But it is a one-time patching, there is no facility
for repatching the code later if the set of features change.

It is a major bug if the set of features changes after we've done the
code patching - so add a check for it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:14:56 +10:00
Michael Ellerman 9e8066f398 powerpc/64: Do feature patching before MMU init
Up until now we needed to do the MMU init before feature patching,
because part of the MMU init was scanning the device tree and setting
and/or clearing some MMU feature bits.

Now that we have split that MMU feature modification out into routines
called from early_init_devtree() (called earlier) we can now do feature
patching before calling MMU init.

The advantage of this is it means the remainder of the MMU init runs
with the final set of features which will apply for the rest of the life
of the system. This means we don't have to special case anything called
from MMU init to deal with a changing set of feature bits.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:14:56 +10:00
Michael Ellerman 2537b09c93 powerpc/mm: Do radix device tree scanning earlier
Like we just did for hash, split the device tree scanning parts out and
call them from mmu_early_init_devtree().

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:14:55 +10:00
Michael Ellerman bacf9cf883 powerpc/mm: Do hash device tree scanning earlier
Currently MMU initialisation (early_init_mmu()) consists of a mixture of
scanning the device tree, setting MMU feature bits, and then also doing
actual initialisation of MMU data structures.

We'd like to decouple the setting of the MMU features from the actual
setup. So split out the device tree scanning, and associated code, and
call it from mmu_init_early_devtree().

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:14:54 +10:00
Michael Ellerman c610ec60ed powerpc/mm: Move disable_radix handling into mmu_early_init_devtree()
Move the handling of the disable_radix command line argument into the
newly created mmu_early_init_devtree().

It's an MMU option so it's preferable to have it in an mm related file,
and it also means platforms that don't support radix don't have to carry
the code.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:14:53 +10:00
Michael Ellerman 1a01dc87e0 powerpc/mm: Add mmu_early_init_devtree()
Empty for now, but we'll add to it in the next patch.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01 11:14:53 +10:00
Linus Torvalds bad60e6f25 powerpc updates for 4.8 # 1
Highlights:
  - PowerNV PCI hotplug support.
  - Lots more Power9 support.
  - eBPF JIT support on ppc64le.
  - Lots of cxl updates.
  - Boot code consolidation.
 
 Bug fixes:
  - Fix spin_unlock_wait() from Boqun Feng
  - Fix stack pointer corruption in __tm_recheckpoint() from Michael Neuling
  - Fix multiple bugs in memory_hotplug_max() from Bharata B Rao
  - mm: Ensure "special" zones are empty from Oliver O'Halloran
  - ftrace: Separate the heuristics for checking call sites from Michael Ellerman
  - modules: Never restore r2 for a mprofile-kernel style mcount() call from Michael Ellerman
  - Fix endianness when reading TCEs from Alexey Kardashevskiy
  - start rtasd before PCI probing from Greg Kurz
  - PCI: rpaphp: Fix slot registration for multiple slots under a PHB from Tyrel Datwyler
  - powerpc/mm: Add memory barrier in __hugepte_alloc() from Sukadev Bhattiprolu
 
 Cleanups & fixes:
  - Drop support for MPIC in pseries from Rashmica Gupta
  - Define and use PPC64_ELF_ABI_v2/v1 from Michael Ellerman
  - Remove unused symbols in asm-offsets.c from Rashmica Gupta
  - Fix SRIOV not building without EEH enabled from Russell Currey
  - Remove kretprobe_trampoline_holder. from Thiago Jung Bauermann
  - Reduce log level of PCI I/O space warning from Benjamin Herrenschmidt
  - Add array bounds checking to crash_shutdown_handlers from Suraj Jitindar Singh
  - Avoid -maltivec when using clang integrated assembler from Anton Blanchard
  - Fix array overrun in ppc_rtas() syscall from Andrew Donnellan
  - Fix error return value in cmm_mem_going_offline() from Rasmus Villemoes
  - export cpu_to_core_id() from Mauricio Faria de Oliveira
  - Remove old symbols from defconfigs from Andrew Donnellan
  - Update obsolete comments in setup_32.c about entry conditions from Benjamin Herrenschmidt
  - Add comment explaining the purpose of setup_kdump_trampoline() from Benjamin Herrenschmidt
  - Merge the RELOCATABLE config entries for ppc32 and ppc64 from Kevin Hao
  - Remove RELOCATABLE_PPC32 from Kevin Hao
  - Fix .long's in tlb-radix.c to more meaningful from Balbir Singh
 
 Minor cleanups & fixes:
  - Andrew Donnellan, Anna-Maria Gleixner, Anton Blanchard, Benjamin
    Herrenschmidt, Bharata B Rao, Christophe Leroy, Colin Ian King, Geliang
    Tang, Greg Kurz, Madhavan Srinivasan, Michael Ellerman, Michael Ellerman,
    Stephen Rothwell, Stewart Smith.
 
 Freescale updates from Scott:
  - "Highlights include more 8xx optimizations, device tree updates,
    and MVME7100 support."
 
 PowerNV PCI hotplug from Gavin Shan:
  - PCI: Add pcibios_setup_bridge()
  - Override pcibios_setup_bridge()
  - Remove PCI_RESET_DELAY_US
  - Move pnv_pci_ioda_setup_opal_tce_kill() around
  - Increase PE# capacity
  - Allocate PE# in reverse order
  - Create PEs in pcibios_setup_bridge()
  - Setup PE for root bus
  - Extend PCI bridge resources
  - Make pnv_ioda_deconfigure_pe() visible
  - Dynamically release PE
  - Update bridge windows on PCI plug
  - Delay populating pdn
  - Support PCI slot ID
  - Use PCI slot reset infrastructure
  - Introduce pnv_pci_get_slot_id()
  - Functions to get/set PCI slot state
  - PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  - Print correct PHB type names
 
 Power9 idle support from Shreyas B. Prabhu:
  - set power_save func after the idle states are initialized
  - Use PNV_THREAD_WINKLE macro while requesting for winkle
  - make hypervisor state restore a function
  - Rename idle_power7.S to idle_book3s.S
  - Rename reusable idle functions to hardware agnostic names
  - Make pnv_powersave_common more generic
  - abstraction for saving SPRs before entering deep idle states
  - Add platform support for stop instruction
  - cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES
  - cpuidle/powernv: cleanup cpuidle-powernv.c
  - cpuidle/powernv: Add support for POWER ISA v3 idle states
  - Use deepest stop state when cpu is offlined
 
 Power9 PMU from Madhavan Srinivasan:
  - factor out power8 pmu macros and defines
  - factor out power8 pmu functions
  - factor out power8 __init_pmu code
  - Add power9 event list macros for generic and cache events
  - Power9 PMU support
  - Export Power9 generic and cache events to sysfs
 
 Power9 preliminary interrupt & PCI support from Benjamin Herrenschmidt:
  - Add XICS emulation APIs
  - Move a few exception common handlers to make room
  - Add support for HV virtualization interrupts
  - Add mechanism to force a replay of interrupts
  - Add ICP OPAL backend
  - Discover IODA3 PHBs
  - pci: Remove obsolete SW invalidate
  - opal: Add real mode call wrappers
  - Rename TCE invalidation calls
  - Remove SWINV constants and obsolete TCE code
  - Rework accessing the TCE invalidate register
  - Fallback to OPAL for TCE invalidations
  - Use the device-tree to get available range of M64's
  - Check status of a PHB before using it
  - pci: Don't try to allocate resources that will be reassigned
 
 Other Power9:
  - Send SIGBUS on unaligned copy and paste from Chris Smart
  - Large Decrementer support from Oliver O'Halloran
  - Load Monitor Register Support from Jack Miller
 
 Performance improvements from Anton Blanchard:
  - Avoid load hit store in __giveup_fpu() and __giveup_altivec()
  - Avoid load hit store in setup_sigcontext()
  - Remove assembly versions of strcpy, strcat, strlen and strcmp
  - Align hot loops of some string functions
 
 eBPF JIT from Naveen N. Rao:
  - Fix/enhance 32-bit Load Immediate implementation
  - Optimize 64-bit Immediate loads
  - Introduce rotate immediate instructions
  - A few cleanups
  - Isolate classic BPF JIT specifics into a separate header
  - Implement JIT compiler for extended BPF
 
 Operator Panel driver from Suraj Jitindar Singh:
  - devicetree/bindings: Add binding for operator panel on FSP machines
  - Add inline function to get rc from an ASYNC_COMP opal_msg
  - Add driver for operator panel on FSP machines
 
 Sparse fixes from Daniel Axtens:
  - make some things static
  - Introduce asm-prototypes.h
  - Include headers containing prototypes
  - Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE
  - kvm: Clarify __user annotations
  - Pass endianness to sparse
  - Make ppc_md.{halt, restart} __noreturn
 
 MM fixes & cleanups from Aneesh Kumar K.V:
  - radix: Update LPCR HR bit as per ISA
  - use _raw variant of page table accessors
  - Compile out radix related functions if RADIX_MMU is disabled
  - Clear top 16 bits of va only on older cpus
  - Print formation regarding the the MMU mode
  - hash: Update SDR1 size encoding as documented in ISA 3.0
  - radix: Update PID switch sequence
  - radix: Update machine call back to support new HCALL.
  - radix: Add LPID based tlb flush helpers
  - radix: Add a kernel command line to disable radix
  - Cleanup LPCR defines
 
 Boot code consolidation from Benjamin Herrenschmidt:
  - Move epapr_paravirt_early_init() to early_init_devtree()
  - cell: Don't use flat device-tree after boot
  - ge_imp3a: Don't use the flat device-tree after boot
  - mpc85xx_ds: Don't use the flat device-tree after boot
  - mpc85xx_rdb: Don't use the flat device-tree after boot
  - Don't test for machine type in rtas_initialize()
  - Don't test for machine type in smp_setup_cpu_maps()
  - dt: Add of_device_compatible_match()
  - Factor do_feature_fixup calls
  - Move 64-bit feature fixup earlier
  - Move 64-bit memory reserves to setup_arch()
  - Use a cachable DART
  - Move FW feature probing out of pseries probe()
  - Put exception configuration in a common place
  - Remove early allocation of the SMU command buffer
  - Move MMU backend selection out of platform code
  - pasemi: Remove IOBMAP allocation from platform probe()
  - mm/hash: Don't use machine_is() early during boot
  - Don't test for machine type to detect HEA special case
  - pmac: Remove spurrious machine type test
  - Move hash table ops to a separate structure
  - Ensure that ppc_md is empty before probing for machine type
  - Move 64-bit probe_machine() to later in the boot process
  - Move 32-bit probe() machine to later in the boot process
  - Get rid of ppc_md.init_early()
  - Move the boot time info banner to a separate function
  - Move setting of {i,d}cache_bsize to initialize_cache_info()
  - Move the content of setup_system() to setup_arch()
  - Move cache info inits to a separate function
  - Re-order the call to smp_setup_cpu_maps()
  - Re-order setup_panic()
  - Make a few boot functions __init
  - Merge 32-bit and 64-bit setup_arch()
 
 Other new features:
  - tty/hvc: Use IRQF_SHARED for OPAL hvc consoles from Sam Mendoza-Jonas
  - tty/hvc: Use opal irqchip interface if available from Sam Mendoza-Jonas
  - powerpc: Add module autoloading based on CPU features from Alastair D'Silva
  - crypto: vmx - Convert to CPU feature based module autoloading from Alastair D'Silva
  - Wake up kopald polling thread before waiting for events from Benjamin Herrenschmidt
  - xmon: Dump ISA 2.06 SPRs from Michael Ellerman
  - xmon: Dump ISA 2.07 SPRs from Michael Ellerman
  - Add a parameter to disable 1TB segs from Oliver O'Halloran
  - powerpc/boot: Add OPAL console to epapr wrappers from Oliver O'Halloran
  - Assign fixed PHB number based on device-tree properties from Guilherme G. Piccoli
  - pseries: Add pseries hotplug workqueue from John Allen
  - pseries: Add support for hotplug interrupt source from John Allen
  - pseries: Use kernel hotplug queue for PowerVM hotplug events from John Allen
  - pseries: Move property cloning into its own routine from Nathan Fontenot
  - pseries: Dynamic add entires to associativity lookup array from Nathan Fontenot
  - pseries: Auto-online hotplugged memory from Nathan Fontenot
  - pseries: Remove call to memblock_add() from Nathan Fontenot
 
 cxl:
  - Add set and get private data to context struct from Michael Neuling
  - make base more explicitly non-modular from Paul Gortmaker
  - Use for_each_compatible_node() macro from Wei Yongjun
  - Frederic Barrat
    - Abstract the differences between the PSL and XSL
    - Make vPHB device node match adapter's
  - Philippe Bergheaud
    - Add mechanism for delivering AFU driver specific events
    - Ignore CAPI adapters misplaced in switched slots
    - Refine slice error debug messages
  - Andrew Donnellan
    - static-ify variables to fix sparse warnings
    - PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
    - PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
    - Add cxl_check_and_switch_mode() API to switch bi-modal cards
    - remove dead Kconfig options
    - fix potential NULL dereference in free_adapter()
  - Ian Munsie
    - Update process element after allocating interrupts
    - Add support for CAPP DMA mode
    - Fix allowing bogus AFU descriptors with 0 maximum processes
    - Fix allocating a minimum of 2 pages for the SPA
    - Fix bug where AFU disable operation had no effect
    - Workaround XSL bug that does not clear the RA bit after a reset
    - Fix NULL pointer dereference on kernel contexts with no AFU interrupts
    - powerpc/powernv: Split cxl code out into a separate file
    - Add cxl_slot_is_supported API
    - Enable bus mastering for devices using CAPP DMA mode
    - Move cxl_afu_get / cxl_afu_put to base
    - Allow a default context to be associated with an external pci_dev
    - Do not create vPHB if there are no AFU configuration records
    - powerpc/powernv: Add support for the cxl kernel api on the real phb
    - Add support for using the kernel API with a real PHB
    - Add kernel APIs to get & set the max irqs per context
    - Add preliminary workaround for CX4 interrupt limitation
    - Add support for interrupts on the Mellanox CX4
    - Workaround PE=0 hardware limitation in Mellanox CX4
    - powerpc/powernv: Fix pci-cxl.c build when CONFIG_MODULES=n
 
 selftests:
  - Test unaligned copy and paste from Chris Smart
  - Load Monitor Register Tests from Jack Miller
  - Cyril Bur
    - exec() with suspended transaction
    - Use signed long to read perf_event_paranoid
    - Fix usage message in context_switch
    - Fix generation of vector instructions/types in context_switch
  - Michael Ellerman
    - Use "Delta" rather than "Error" in normal output
    - Import Anton's mmap & futex micro benchmarks
    - Add a test for PROT_SAO
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXnWchAAoJEFHr6jzI4aWAe64P/36Vd9yJLptjkoyZp8/IQtu1
 Cv8buQwGdKuSMzdkcUAOXcC3fe2u70ZWXMKKLfY3koIV1IAiqdWk5/XWRKMP2XmE
 dG0LhSf0uu7uh+mE0WvQnRu46ImeKtQ+mPp4Hbs/s9SxMSeYjruv3vdWWmgUq0cl
 Gac2qJSRtAMmgLuHWMjf7N5mxOTOnKejU4o2i9cJ+YHmWKOdCigv2Ge1UadOQFlC
 E7tRPiUR3asfDfj+e+LVTTdToH6p8pk+mOUzIoZ8jIkQ+IXzi62UDl5+Rw9mqiuX
 1CtqEMUXxo2qwX+d4TcV/QUOp0YKPuIcUZ9NMMS+S3lOyJ4NFt+j2Izk7QJp5kNP
 gKVqB68TjDQsBuDr3P9ynlHbduxTIhZAqopbTrLe0FIg48nUe4n1yHJBVzqaVajX
 rFBJSsSUffBLAARNPSXJJhIgc2C1/qOC8dgMeDMcR2kPirDHaQZ/lY1yEpq1yiqR
 q6e3v5hvIAm4IjbYk0mF7TUxBrPGVE/ExyBINyASRoYxAJ1PyeD/iljZ9vI3asRA
 s+hhxT8H3f7lnqTrmJqMjHgAdGkmag07EdmvFNX4xK4aADSy7Y6g4dw25ffRopo9
 p9Jf9HX+dZv65Y3UjbV/6HuXcaSEBJJLSVWvii65PebqSN0LuHEFvNeIJ6Iblx0B
 AWh/hd0Iin2gdkcG39Mr
 =Z5kM
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights:
   - PowerNV PCI hotplug support.
   - Lots more Power9 support.
   - eBPF JIT support on ppc64le.
   - Lots of cxl updates.
   - Boot code consolidation.

  Bug fixes:
   - Fix spin_unlock_wait() from Boqun Feng
   - Fix stack pointer corruption in __tm_recheckpoint() from Michael
     Neuling
   - Fix multiple bugs in memory_hotplug_max() from Bharata B Rao
   - mm: Ensure "special" zones are empty from Oliver O'Halloran
   - ftrace: Separate the heuristics for checking call sites from
     Michael Ellerman
   - modules: Never restore r2 for a mprofile-kernel style mcount() call
     from Michael Ellerman
   - Fix endianness when reading TCEs from Alexey Kardashevskiy
   - start rtasd before PCI probing from Greg Kurz
   - PCI: rpaphp: Fix slot registration for multiple slots under a PHB
     from Tyrel Datwyler
   - powerpc/mm: Add memory barrier in __hugepte_alloc() from Sukadev
     Bhattiprolu

  Cleanups & fixes:
   - Drop support for MPIC in pseries from Rashmica Gupta
   - Define and use PPC64_ELF_ABI_v2/v1 from Michael Ellerman
   - Remove unused symbols in asm-offsets.c from Rashmica Gupta
   - Fix SRIOV not building without EEH enabled from Russell Currey
   - Remove kretprobe_trampoline_holder from Thiago Jung Bauermann
   - Reduce log level of PCI I/O space warning from Benjamin
     Herrenschmidt
   - Add array bounds checking to crash_shutdown_handlers from Suraj
     Jitindar Singh
   - Avoid -maltivec when using clang integrated assembler from Anton
     Blanchard
   - Fix array overrun in ppc_rtas() syscall from Andrew Donnellan
   - Fix error return value in cmm_mem_going_offline() from Rasmus
     Villemoes
   - export cpu_to_core_id() from Mauricio Faria de Oliveira
   - Remove old symbols from defconfigs from Andrew Donnellan
   - Update obsolete comments in setup_32.c about entry conditions from
     Benjamin Herrenschmidt
   - Add comment explaining the purpose of setup_kdump_trampoline() from
     Benjamin Herrenschmidt
   - Merge the RELOCATABLE config entries for ppc32 and ppc64 from Kevin
     Hao
   - Remove RELOCATABLE_PPC32 from Kevin Hao
   - Fix .long's in tlb-radix.c to more meaningful from Balbir Singh

  Minor cleanups & fixes:
   - Andrew Donnellan, Anna-Maria Gleixner, Anton Blanchard, Benjamin
     Herrenschmidt, Bharata B Rao, Christophe Leroy, Colin Ian King,
     Geliang Tang, Greg Kurz, Madhavan Srinivasan, Michael Ellerman,
     Michael Ellerman, Stephen Rothwell, Stewart Smith.

  Freescale updates from Scott:
   - "Highlights include more 8xx optimizations, device tree updates,
     and MVME7100 support."

  PowerNV PCI hotplug from Gavin Shan:
   - PCI: Add pcibios_setup_bridge()
   - Override pcibios_setup_bridge()
   - Remove PCI_RESET_DELAY_US
   - Move pnv_pci_ioda_setup_opal_tce_kill() around
   - Increase PE# capacity
   - Allocate PE# in reverse order
   - Create PEs in pcibios_setup_bridge()
   - Setup PE for root bus
   - Extend PCI bridge resources
   - Make pnv_ioda_deconfigure_pe() visible
   - Dynamically release PE
   - Update bridge windows on PCI plug
   - Delay populating pdn
   - Support PCI slot ID
   - Use PCI slot reset infrastructure
   - Introduce pnv_pci_get_slot_id()
   - Functions to get/set PCI slot state
   - PCI/hotplug: PowerPC PowerNV PCI hotplug driver
   - Print correct PHB type names

  Power9 idle support from Shreyas B. Prabhu:
   - set power_save func after the idle states are initialized
   - Use PNV_THREAD_WINKLE macro while requesting for winkle
   - make hypervisor state restore a function
   - Rename idle_power7.S to idle_book3s.S
   - Rename reusable idle functions to hardware agnostic names
   - Make pnv_powersave_common more generic
   - abstraction for saving SPRs before entering deep idle states
   - Add platform support for stop instruction
   - cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES
   - cpuidle/powernv: cleanup cpuidle-powernv.c
   - cpuidle/powernv: Add support for POWER ISA v3 idle states
   - Use deepest stop state when cpu is offlined

  Power9 PMU from Madhavan Srinivasan:
   - factor out power8 pmu macros and defines
   - factor out power8 pmu functions
   - factor out power8 __init_pmu code
   - Add power9 event list macros for generic and cache events
   - Power9 PMU support
   - Export Power9 generic and cache events to sysfs

  Power9 preliminary interrupt & PCI support from Benjamin Herrenschmidt:
   - Add XICS emulation APIs
   - Move a few exception common handlers to make room
   - Add support for HV virtualization interrupts
   - Add mechanism to force a replay of interrupts
   - Add ICP OPAL backend
   - Discover IODA3 PHBs
   - pci: Remove obsolete SW invalidate
   - opal: Add real mode call wrappers
   - Rename TCE invalidation calls
   - Remove SWINV constants and obsolete TCE code
   - Rework accessing the TCE invalidate register
   - Fallback to OPAL for TCE invalidations
   - Use the device-tree to get available range of M64's
   - Check status of a PHB before using it
   - pci: Don't try to allocate resources that will be reassigned

  Other Power9:
   - Send SIGBUS on unaligned copy and paste from Chris Smart
   - Large Decrementer support from Oliver O'Halloran
   - Load Monitor Register Support from Jack Miller

  Performance improvements from Anton Blanchard:
   - Avoid load hit store in __giveup_fpu() and __giveup_altivec()
   - Avoid load hit store in setup_sigcontext()
   - Remove assembly versions of strcpy, strcat, strlen and strcmp
   - Align hot loops of some string functions

  eBPF JIT from Naveen N. Rao:
   - Fix/enhance 32-bit Load Immediate implementation
   - Optimize 64-bit Immediate loads
   - Introduce rotate immediate instructions
   - A few cleanups
   - Isolate classic BPF JIT specifics into a separate header
   - Implement JIT compiler for extended BPF

  Operator Panel driver from Suraj Jitindar Singh:
   - devicetree/bindings: Add binding for operator panel on FSP machines
   - Add inline function to get rc from an ASYNC_COMP opal_msg
   - Add driver for operator panel on FSP machines

  Sparse fixes from Daniel Axtens:
   - make some things static
   - Introduce asm-prototypes.h
   - Include headers containing prototypes
   - Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE
   - kvm: Clarify __user annotations
   - Pass endianness to sparse
   - Make ppc_md.{halt, restart} __noreturn

  MM fixes & cleanups from Aneesh Kumar K.V:
   - radix: Update LPCR HR bit as per ISA
   - use _raw variant of page table accessors
   - Compile out radix related functions if RADIX_MMU is disabled
   - Clear top 16 bits of va only on older cpus
   - Print formation regarding the the MMU mode
   - hash: Update SDR1 size encoding as documented in ISA 3.0
   - radix: Update PID switch sequence
   - radix: Update machine call back to support new HCALL.
   - radix: Add LPID based tlb flush helpers
   - radix: Add a kernel command line to disable radix
   - Cleanup LPCR defines

  Boot code consolidation from Benjamin Herrenschmidt:
   - Move epapr_paravirt_early_init() to early_init_devtree()
   - cell: Don't use flat device-tree after boot
   - ge_imp3a: Don't use the flat device-tree after boot
   - mpc85xx_ds: Don't use the flat device-tree after boot
   - mpc85xx_rdb: Don't use the flat device-tree after boot
   - Don't test for machine type in rtas_initialize()
   - Don't test for machine type in smp_setup_cpu_maps()
   - dt: Add of_device_compatible_match()
   - Factor do_feature_fixup calls
   - Move 64-bit feature fixup earlier
   - Move 64-bit memory reserves to setup_arch()
   - Use a cachable DART
   - Move FW feature probing out of pseries probe()
   - Put exception configuration in a common place
   - Remove early allocation of the SMU command buffer
   - Move MMU backend selection out of platform code
   - pasemi: Remove IOBMAP allocation from platform probe()
   - mm/hash: Don't use machine_is() early during boot
   - Don't test for machine type to detect HEA special case
   - pmac: Remove spurrious machine type test
   - Move hash table ops to a separate structure
   - Ensure that ppc_md is empty before probing for machine type
   - Move 64-bit probe_machine() to later in the boot process
   - Move 32-bit probe() machine to later in the boot process
   - Get rid of ppc_md.init_early()
   - Move the boot time info banner to a separate function
   - Move setting of {i,d}cache_bsize to initialize_cache_info()
   - Move the content of setup_system() to setup_arch()
   - Move cache info inits to a separate function
   - Re-order the call to smp_setup_cpu_maps()
   - Re-order setup_panic()
   - Make a few boot functions __init
   - Merge 32-bit and 64-bit setup_arch()

  Other new features:
   - tty/hvc: Use IRQF_SHARED for OPAL hvc consoles from Sam Mendoza-Jonas
   - tty/hvc: Use opal irqchip interface if available from Sam Mendoza-Jonas
   - powerpc: Add module autoloading based on CPU features from Alastair D'Silva
   - crypto: vmx - Convert to CPU feature based module autoloading from Alastair D'Silva
   - Wake up kopald polling thread before waiting for events from Benjamin Herrenschmidt
   - xmon: Dump ISA 2.06 SPRs from Michael Ellerman
   - xmon: Dump ISA 2.07 SPRs from Michael Ellerman
   - Add a parameter to disable 1TB segs from Oliver O'Halloran
   - powerpc/boot: Add OPAL console to epapr wrappers from Oliver O'Halloran
   - Assign fixed PHB number based on device-tree properties from Guilherme G. Piccoli
   - pseries: Add pseries hotplug workqueue from John Allen
   - pseries: Add support for hotplug interrupt source from John Allen
   - pseries: Use kernel hotplug queue for PowerVM hotplug events from John Allen
   - pseries: Move property cloning into its own routine from Nathan Fontenot
   - pseries: Dynamic add entires to associativity lookup array from Nathan Fontenot
   - pseries: Auto-online hotplugged memory from Nathan Fontenot
   - pseries: Remove call to memblock_add() from Nathan Fontenot

  cxl:
   - Add set and get private data to context struct from Michael Neuling
   - make base more explicitly non-modular from Paul Gortmaker
   - Use for_each_compatible_node() macro from Wei Yongjun
   - Frederic Barrat
   - Abstract the differences between the PSL and XSL
   - Make vPHB device node match adapter's
   - Philippe Bergheaud
   - Add mechanism for delivering AFU driver specific events
   - Ignore CAPI adapters misplaced in switched slots
   - Refine slice error debug messages
   - Andrew Donnellan
   - static-ify variables to fix sparse warnings
   - PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
   - PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
   - Add cxl_check_and_switch_mode() API to switch bi-modal cards
   - remove dead Kconfig options
   - fix potential NULL dereference in free_adapter()
   - Ian Munsie
   - Update process element after allocating interrupts
   - Add support for CAPP DMA mode
   - Fix allowing bogus AFU descriptors with 0 maximum processes
   - Fix allocating a minimum of 2 pages for the SPA
   - Fix bug where AFU disable operation had no effect
   - Workaround XSL bug that does not clear the RA bit after a reset
   - Fix NULL pointer dereference on kernel contexts with no AFU interrupts
   - powerpc/powernv: Split cxl code out into a separate file
   - Add cxl_slot_is_supported API
   - Enable bus mastering for devices using CAPP DMA mode
   - Move cxl_afu_get / cxl_afu_put to base
   - Allow a default context to be associated with an external pci_dev
   - Do not create vPHB if there are no AFU configuration records
   - powerpc/powernv: Add support for the cxl kernel api on the real phb
   - Add support for using the kernel API with a real PHB
   - Add kernel APIs to get & set the max irqs per context
   - Add preliminary workaround for CX4 interrupt limitation
   - Add support for interrupts on the Mellanox CX4
   - Workaround PE=0 hardware limitation in Mellanox CX4
   - powerpc/powernv: Fix pci-cxl.c build when CONFIG_MODULES=n

  selftests:
   - Test unaligned copy and paste from Chris Smart
   - Load Monitor Register Tests from Jack Miller
   - Cyril Bur
   - exec() with suspended transaction
   - Use signed long to read perf_event_paranoid
   - Fix usage message in context_switch
   - Fix generation of vector instructions/types in context_switch
   - Michael Ellerman
   - Use "Delta" rather than "Error" in normal output
   - Import Anton's mmap & futex micro benchmarks
   - Add a test for PROT_SAO"

* tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (263 commits)
  powerpc/mm: Parenthesise IS_ENABLED() in if condition
  tty/hvc: Use opal irqchip interface if available
  tty/hvc: Use IRQF_SHARED for OPAL hvc consoles
  selftests/powerpc: exec() with suspended transaction
  powerpc: Improve comment explaining why we modify VRSAVE
  powerpc/mm: Drop unused externs for hpte_init_beat[_v3]()
  powerpc/mm: Rename hpte_init_lpar() and move the fallback to a header
  powerpc/mm: Fix build break when PPC_NATIVE=n
  crypto: vmx - Convert to CPU feature based module autoloading
  powerpc: Add module autoloading based on CPU features
  powerpc/powernv/ioda: Fix endianness when reading TCEs
  powerpc/mm: Add memory barrier in __hugepte_alloc()
  powerpc/modules: Never restore r2 for a mprofile-kernel style mcount() call
  powerpc/ftrace: Separate the heuristics for checking call sites
  powerpc: Merge 32-bit and 64-bit setup_arch()
  powerpc/64: Make a few boot functions __init
  powerpc: Re-order setup_panic()
  powerpc: Re-order the call to smp_setup_cpu_maps()
  powerpc/32: Move cache info inits to a separate function
  powerpc/64: Move the content of setup_system() to setup_arch()
  ...
2016-07-30 21:01:36 -07:00
Linus Torvalds d761f3ed6e Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 microcode updates from Thomas Gleixner:

 - more work to make the microcode loader robust

 - a fix for the micro code load precedence

 - fixes for initrd loading with randomized memory

 - less printk noise on SMP machines

* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm, x86/microcode: Add __PAGE_OFFSET_BASE define on 32-bit
  x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y
  x86/microcode: Remove unused symbol exports
  x86/microcode/intel: Do not issue microcode updates messages on each CPU
  Documentation/microcode: Document some aspects for more clarity
  x86/microcode/AMD: Make amd_ucode_patch[] static
  x86/microcode/intel: Unexport save_mc_for_early()
  x86/microcode/intel: Rename load_microcode_early() to find_microcode_patch()
  x86/microcode: Propagate save_microcode_in_initrd() retval
  x86/microcode: Get rid of find_cpio_data()'s dummy offset arg
  lib/cpio: Make find_cpio_data()'s offset arg optional
  x86/microcode: Fix suspend to RAM with builtin microcode
  x86/microcode: Fix loading precedence
2016-07-30 13:18:33 -07:00
Linus Torvalds b325e04ea2 Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpufeature updates from Thomas Gleixner:

 - a workaround for the MONITOR instruction erratum of Goldmont CPUs

 - small fixes and cleanups here and there

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Add workaround for MONITOR instruction erratum on Goldmont based CPUs
  x86/cpu: Rename "WESTMERE2" family to "NEHALEM_G"
  x86/amd_nb: Clean up init path
  x86/cpufeature: Add helper macro for mask check macros
  x86/cpufeature: Make sure DISABLED/REQUIRED macros are updated
  x86/cpufeature: Update cpufeaure macros
2016-07-30 12:56:26 -07:00
Linus Torvalds f64d6e2aaa DeviceTree update for 4.8:
- Removal of most of_platform_populate() calls in arch code. Now the DT
 core code calls it in the default case and platforms only need to call
 it if they have special needs.
 
 - Use pr_fmt on all the DT core print statements.
 
 - CoreSight binding doc improvements to block name descriptions.
 
 - Add dt_to_config script which can parse dts files and list
 corresponding kernel config options.
 
 - Fix memory leak hit with a PowerMac DT.
 
 - Correct a bunch of STMicro compatible strings to use the correct
 vendor prefix.
 
 - Fix DA9052 PMIC binding doc to match what is actually used in dts
 files.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXm9KcAAoJEPr7XbWNvGHDRT4QAIIIOSB4AWHardnMLROgGge9
 aOQKZ/05O9feOcxYKe8FkQbcH+IujJjrUL+yrRD36yGQPAyBP21gtcmmfrkCcwFM
 kH915f/JbGvXpfwEf8dcarHhzYH6FFJiQGduPpWfwSSWynx+xq5EKPwCqYzMg8bN
 SExxt7vUx1MKFOExZ0K8BNCo8VMVLUWQoJ1DNeJDuL25Op4EU3i2l1HQNYV/3XDk
 BSA3x7Lw3GjrWEH20VWYn2Azq1OFLY+E2FC2lnG4nbkk5X8dZbUH9PR1Sk7uTQDj
 uxTjWe59NBpliCxKSAbMbTAU/WwSB1pJ0I+zDJBiQsdFT+nb5F4zOrs3qSKHa/A9
 Rv6AC8k5gdSMrDB1dOspfF2vWvOOInXgNV4/Kza0D92mbCpwyUuF+vhE6rfcMrZU
 OiD7rj2/fvO7Y9fUAhrp6zrfrOfH9B1Z9vS+940AlK96YwPE2+J0SA2vBxR/wg8H
 7fj4Ud5X+SFisXWQhh5Wlv0W9o6e7C7fsi8vpkQ7gufmezLFWVnJKsUfQaxGEwhG
 Hkhm9kuSHHMd+6dEnn2756DnNfJAtQv6rSR0/QR4Lf9y5L4dvR3kAQIci8X/nx4P
 sIk+IJWGZG6wziZq59hh+SO6HEqdSNuvh+5sbR0iUimdE/1HsDBdPiocXf/r8iwK
 NY9nGeZPRrXmFgdpoZfm
 =wLMr
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull DeviceTree updates from Rob Herring:

 - remove most of_platform_populate() calls in arch code.  Now the DT
   core code calls it in the default case and platforms only need to
   call it if they have special needs

 - use pr_fmt on all the DT core print statements

 - CoreSight binding doc improvements to block name descriptions

 - add dt_to_config script which can parse dts files and list
   corresponding kernel config options

 - fix memory leak hit with a PowerMac DT

 - correct a bunch of STMicro compatible strings to use the correct
   vendor prefix

 - fix DA9052 PMIC binding doc to match what is actually used in dts
   files

* tag 'devicetree-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (35 commits)
  documentation: da9052: Update regulator bindings names to match DA9052/53 DTS expectations
  xtensa: Partially Revert "xtensa: Remove unnecessary of_platform_populate with default match table"
  xtensa: Fix build error due to missing include file
  MIPS: ath79: Add missing include file
  Fix spelling errors in Documentation/devicetree
  ARM: dts: fix STMicroelectronics compatible strings
  powerpc/dts: fix STMicroelectronics compatible strings
  Documentation: dt: i2c: use correct STMicroelectronics vendor prefix
  scripts/dtc: dt_to_config - kernel config options for a devicetree
  of: fdt: mark unflattened tree as detached
  of: overlay: add resolver error prints
  coresight: document binding acronyms
  Documentation/devicetree: document cavium-pip rx-delay/tx-delay properties
  of: use pr_fmt prefix for all console printing
  of/irq: Mark initialised interrupt controllers as populated
  of: fix memory leak related to safe_name()
  Revert "of/platform: export of_default_bus_match_table"
  of: unittest: use of_platform_default_populate() to populate default bus
  memory: omap-gpmc: use of_platform_default_populate() to populate default bus
  bus: uniphier-system-bus: use of_platform_default_populate() to populate default bus
  ...
2016-07-30 11:32:01 -07:00
Linus Torvalds 1056c9bd27 The bulk of the changes are updates and fixes to existing clk provider
drivers, along with a pretty standard number of new drivers. The core
 recieved a small number of updates as well.
 
 Core changes of note:
 * Removed CLK_IS_ROOT flag
 
 New clk provider drivers:
 * Renesas r8a7796 Clock Pulse Generator / Module Standby and Software
   Reset
 * Allwinner sun8i H3 Clock Controller Unit
 * AmLogic meson8b clock controller (rewritten)
 * AmLogic gxbb clock controller
 * Support for some new ICs was added by simple changes to static data
   tables for chips sharing the same family
 
 Driver updates of note:
 * the Allwinner sunxi clock driver infrastucture was rewritten to
   comform to the state of the art at drivers/clk/sunxi-ng. The old
   implementation is still supported for backwards compatibility with the
   DT ABI
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJXm9LNAAoJEKI6nJvDJaTUHYcP/1oKTHA4uSPux2+5l9vApEJc
 eSV0oSEP/zDDwYfV4xRt1byAOtzHe3R4p5zDGShnk3+CCwXbQSLGmFJFmZDoSs5E
 ulftXA8uRV4ac0SEh86BOlstdJHGOVo7Q38XtdPvgKRTv58+ZqHuFcvB726bWY/I
 GMzVTkpugbn5U7e3MLM58gkBoqa8BS9uIdsf5q/JIxdpe+VgUtjmioH9Rz6G5U5Y
 T6ObeU6HNusDaz6kUJIiAEkl4UNHk+aebnY7FTbwd9JGGySp5nEknVY7tSMeC6iU
 hG/knzdMiasa5yv0xdW3/OxnKCQtEeuPPHnJDSXWc9yb20vNs2QhRtd8uckMGHOH
 X7fXq9xoLxDH81AggW188g7+xfhx7J+ASpjHmHs74itUIoqhhy61PM3I95yQ8m6s
 QDkInqSszFd2C7OiYsy6m5XS0K8g0jyA2tzWhxlIZNVAl7yP71+HVLJ45NNKCPkb
 oYbMx4ho538Tl6+c6HHJG0vC3XWSdHbvwMOKyE9k/GyrIJ7FxLZS3BLOYPm6u9t4
 9n041MbtBb/UrWVi8kMSdmBjIIs6VSGjq5gwRFAqoPQt3VkPuy0IfnwIpjdE1aFz
 PbDJQF4ze7D3JgH2rWtivXbCTRthTANmoar299xjSwwb99TSlP2IABY3cNnJrhPr
 9/PGShM3hqRpBEUnEmt+
 =1dpY
 -----END PGP SIGNATURE-----

Merge tag 'clk-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk updates from Michael Turquette:
 "The bulk of the changes are updates and fixes to existing clk provider
  drivers, along with a pretty standard number of new drivers.  The core
  recieved a small number of updates as well.

  Core changes of note:
   - removed CLK_IS_ROOT flag

  New clk provider drivers:
   - Renesas r8a7796 clock pulse generator / module standby and
     software reset
   - Allwinner sun8i H3 clock controller unit
   - AmLogic meson8b clock controller (rewritten)
   - AmLogic gxbb clock controller
   - support for some new ICs was added by simple changes to static
     data tables for chips sharing the same family

  Driver updates of note:
   - the Allwinner sunxi clock driver infrastucture was rewritten to
     comform to the state of the art at drivers/clk/sunxi-ng.  The old
     implementation is still supported for backwards compatibility with
     the DT ABI"

* tag 'clk-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (162 commits)
  clk: Makefile: re-sort and clean up
  Revert "clk: gxbb: expose CLKID_MMC_PCLK"
  clk: samsung: Allow modular build of the Audio Subsystem CLKCON driver
  clk: samsung: make clk-s5pv210-audss explicitly non-modular
  clk: exynos5433: remove CLK_IGNORE_UNUSED flag from SPI clocks
  clk: oxnas: Add hardware dependencies
  clk: imx7d: do not set parent of ethernet time/ref clocks
  ARM: dt: sun8i: switch the H3 to the new CCU driver
  clk: sunxi-ng: h3: Fix Kconfig symbol typo
  clk: sunxi-ng: h3: Fix audio clock divider offset
  clk: sunxi-ng: Add H3 clocks
  clk: sunxi-ng: Add N-K-M-P factor clock
  clk: sunxi-ng: Add N-K-M Factor clock
  clk: sunxi-ng: Add N-M-factor clock support
  clk: sunxi-ng: Add N-K-factor clock support
  clk: sunxi-ng: Add M-P factor clock support
  clk: sunxi-ng: Add divider
  clk: sunxi-ng: Add phase clock support
  clk: sunxi-ng: Add mux clock support
  clk: sunxi-ng: Add gate clock support
  ...
2016-07-30 11:20:02 -07:00
Michael Ellerman 719dbb2df7 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Highlights include more 8xx optimizations, device tree updates,
and MVME7100 support."
2016-07-30 13:43:19 +10:00
Linus Torvalds 797cee982e Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit
Pull audit updates from Paul Moore:
 "Six audit patches for 4.8.

  There are a couple of style and minor whitespace tweaks for the logs,
  as well as a minor fixup to catch errors on user filter rules, however
  the major improvements are a fix to the s390 syscall argument masking
  code (reviewed by the nice s390 folks), some consolidation around the
  exclude filtering (less code, always a win), and a double-fetch fix
  for recording the execve arguments"

* 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit:
  audit: fix a double fetch in audit_log_single_execve_arg()
  audit: fix whitespace in CWD record
  audit: add fields to exclude filter by reusing user filter
  s390: ensure that syscall arguments are properly masked on s390
  audit: fix some horrible switch statement style crimes
  audit: fixup: log on errors from filter user rules
2016-07-29 17:54:17 -07:00
Linus Torvalds 7a1e8b80fb Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

   - TPM core and driver updates/fixes
   - IPv6 security labeling (CALIPSO)
   - Lots of Apparmor fixes
   - Seccomp: remove 2-phase API, close hole where ptrace can change
     syscall #"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits)
  apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
  tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family)
  tpm: Factor out common startup code
  tpm: use devm_add_action_or_reset
  tpm2_i2c_nuvoton: add irq validity check
  tpm: read burstcount from TPM_STS in one 32-bit transaction
  tpm: fix byte-order for the value read by tpm2_get_tpm_pt
  tpm_tis_core: convert max timeouts from msec to jiffies
  apparmor: fix arg_size computation for when setprocattr is null terminated
  apparmor: fix oops, validate buffer size in apparmor_setprocattr()
  apparmor: do not expose kernel stack
  apparmor: fix module parameters can be changed after policy is locked
  apparmor: fix oops in profile_unpack() when policy_db is not present
  apparmor: don't check for vmalloc_addr if kvzalloc() failed
  apparmor: add missing id bounds check on dfa verification
  apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
  apparmor: use list_next_entry instead of list_entry_next
  apparmor: fix refcount race when finding a child profile
  apparmor: fix ref count leak when profile sha1 hash is read
  apparmor: check that xindex is in trans_table bounds
  ...
2016-07-29 17:38:46 -07:00
Linus Torvalds 601f887d61 Power management fix for v4.8-rc1
Fix a nasty (and really hard to debug) memory corruption during
 resume from hibernation on x86-64 (that leads to a kernel panic
 most of the time) due to the use of a stale stack pointer value
 in FRAME_BEGIN (Josh Poimboeuf).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJXm76YAAoJEILEb/54YlRxBu0QAIwOnS0YJO/AuXNNEUxzK3xu
 /+nRda38MJVUeoU/Z8wBkf5l8XEzNgYZPabj87r4Dkyg32dWKXHwW8R9ApyO1RfA
 7CReTgO2rKfOrDWCq2uYdea0hKM36vavjUHGJ+fSScaqHWaGSqvqeHusroUDL+3y
 grRI8GDNyfG1eV9IxrJwcE0Oegp0QKX5saSKXXVoeaM63YnNUwv9usdEOpdE1lmk
 r6WwpYNZSh/vjaNlLEcrmEAbU3Nv/wBNqGBkoZoATgTT0YsONfqUJcU3fhVXeNMR
 u3Tnk2HJZUmJ3HQgFBcpGU5r1/c/qX8PzPK7e3D60QDgztkADCroW9WOYfwICs+A
 qmsCUs0bVyqxUOBWLIbyJ/n9/VzkeE/cD3lOjkGh/ChRZWpUeDobJWeoseMxDzG5
 21kOwEQvel9Itk8C57zoWCvuSNddg9M2f/GL7yoYtyqKrMriRwwqftIOhJocySP0
 2eONtBn2be9IUX/cdSuVVnqvWP9pGCXqr4IJZoormX3lf1Zi+/G9buPo0At35k5F
 kHxRhnAGsnYxolAhVboxQdkGItrP0LlDvXLnLzPhWjz8qOLOkZpnIfpjsj+bwF7K
 2S+Yr3oBldxIYu1A+jne/fOiTUiQ5iAicWnLwNQpmETWu/+Rz72bweWRUYiE71mW
 H0M0HaD9UzhockpqpeEM
 =O25M
 -----END PGP SIGNATURE-----

Merge tag 'pm-urgent-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "Fix a nasty (and really hard to debug) memory corruption during resume
  from hibernation on x86-64 (that leads to a kernel panic most of the
  time) due to the use of a stale stack pointer value in FRAME_BEGIN
  (Josh Poimboeuf)"

* tag 'pm-urgent-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  x86/power/64: Fix hibernation return address corruption
2016-07-29 14:34:55 -07:00
Linus Torvalds a6408f6cb6 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull smp hotplug updates from Thomas Gleixner:
 "This is the next part of the hotplug rework.

   - Convert all notifiers with a priority assigned

   - Convert all CPU_STARTING/DYING notifiers

     The final removal of the STARTING/DYING infrastructure will happen
     when the merge window closes.

  Another 700 hundred line of unpenetrable maze gone :)"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
  timers/core: Correct callback order during CPU hot plug
  leds/trigger/cpu: Move from CPU_STARTING to ONLINE level
  powerpc/numa: Convert to hotplug state machine
  arm/perf: Fix hotplug state machine conversion
  irqchip/armada: Avoid unused function warnings
  ARC/time: Convert to hotplug state machine
  clocksource/atlas7: Convert to hotplug state machine
  clocksource/armada-370-xp: Convert to hotplug state machine
  clocksource/exynos_mct: Convert to hotplug state machine
  clocksource/arm_global_timer: Convert to hotplug state machine
  rcu: Convert rcutree to hotplug state machine
  KVM/arm/arm64/vgic-new: Convert to hotplug state machine
  smp/cfd: Convert core to hotplug state machine
  x86/x2apic: Convert to CPU hotplug state machine
  profile: Convert to hotplug state machine
  timers/core: Convert to hotplug state machine
  hrtimer: Convert to hotplug state machine
  x86/tboot: Convert to hotplug state machine
  arm64/armv8 deprecated: Convert to hotplug state machine
  hwtracing/coresight-etm4x: Convert to hotplug state machine
  ...
2016-07-29 13:55:30 -07:00
Linus Torvalds 86505fc06b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc updates from David Miller:

 1) Double spin lock bug in sunhv serial driver, from Dan Carpenter.

 2) Use correct RSS estimate when determining whether to grow the huge
    TSB or not, from Mike Kravetz.

 3) Don't use full three level page tables for hugepages, PMD level is
    sufficient.  From Nitin Gupta.

 4) Mask out extraneous bits from TSB_TAG_ACCESS register, we only want
    the address bits.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: Trim page tables for 8M hugepages
  sparc64 mm: Fix base TSB sizing when hugetlb pages are used
  sparc: serial: sunhv: fix a double lock bug
  sparc32: off by ones in BUG_ON()
  sparc: Don't leak context bits into thread->fault_address
2016-07-29 13:23:18 -07:00
Linus Torvalds 9d3bc3d4a4 ARC updates for 4.8-rc1
Things have been calm here - nothing much except for a few fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXm7MUAAoJEGnX8d3iisJeTbkP/3z5vrczERECRsaHGU6KpaUP
 F9+SpDG+BmCzcKhz60GzY64p6gl2KWtbE3es66lQf45yHD7s3UhAH1Zxc6pFgZjn
 JTF8z3LFRDlKQ0H4gVxWsDp4+QXn3LOklckUpgqTocLxNpg9qXVXCVZhoUb12A3d
 AxPtOUGsFTtC1e6wnYofGknJjApRls8f11CYmJdQ8aS9lLC/pA+fC0U0fM7LUzx5
 DE+KLB3LithYxQ9TBfVrFSbCTbyxxDmYE59v9DEZQftn9pwVMtZLpGEs4BVO31fw
 bQLvROfx5xn1yElN4yH2XT6q+N47XA6MtJ3qDvjPN59yWYwP9mOCWlfoIJ0q8UY0
 sduU+9K1qZ5WQkXjh66+tPUKpm01YrZy2vghkCJ6YWXnOg9WzbYZQtkwia5TyU8h
 lQ36ri72ncK9gPfOSxQGmxY19o7ujX9our9T+bQ4JyjtMicCVKlxSGJLruyTd2ma
 LqOjqpLuZv2Ryf/2UbpoOmCsynfjqSLLc2CR+jlGJvH1vD9ycvfjMS1dcTcpcJ3Z
 AcsEDBoviMRbM2mWCybtT5gs35vzWWCpgsi+hG4lg0kYtrclGWsc2/uG13BFNK3w
 N9/aKgy6a8hYWWOEpKzqzuonR13oob91pDbkUD/m8uS9PSg/5W4WFlvgAjOIog9G
 pfL3M6oF/gEZ4CByLS7J
 =/rhI
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC updates from Vineet Gupta:
 "Things have been calm here - nothing much except for a few fixes"

* tag 'arc-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: mm: don't loose PTE_SPECIAL in pte_modify()
  ARC: dma: fix address translation in arc_dma_free
  ARC: typo fix in mm/ioremap.c
  ARC: fix linux-next build breakage
2016-07-29 13:17:34 -07:00
Rafael J. Wysocki e148d0f85c Merge branch 'pm-sleep'
* pm-sleep:
  x86/power/64: Fix hibernation return address corruption
2016-07-29 22:12:54 +02:00
Linus Torvalds befff3bfb3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32
Pull AVR32 updates from Hans-Christian Noren Egtvedt.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
  avr32: off by one in at32_init_pio()
  avr32: fixup code style in unistd.h and syscall_table.S
  avr32: wire up preadv2 and pwritev2 syscalls
2016-07-29 13:09:55 -07:00
Linus Torvalds b5f00d18cc Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:
 "Included in this update are:

   - Patches from Gregory Clement to fix the coherent DMA cases in our
     dma-mapping code.

   - A number of CPU errata updates and fixes.

   - ARM cpuidle improvements from Jisheng Zhang.

   - Fix from Kees for the location of _etext.

   - Cleanups from Masahiro Yamada to avoid duplicated messages during
     the kernel build, and remove CONFIG_ARCH_HAS_BARRIERS.

   - Remove a udelay loop limitation, allowing for faster CPUs to
     calibrate the delay correctly.

   - Cleanup some left-overs from the SW PAN implementation.

   - Ensure that a modified address limit is not visible to exception
     handlers"

* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (21 commits)
  ARM: 8586/1: cpuidle: make arm_cpuidle_suspend() a bit more efficient
  ARM: 8585/1: cpuidle: fix !cpuidle_ops[cpu].init case during init
  ARM: 8561/4: dma-mapping: Fix the coherent case when iommu is used
  ARM: 8561/3: dma-mapping: Don't use outer_flush_range when the L2C is coherent
  ARM: 8560/1: errata: Workaround errata A12 825619 / A17 852421
  ARM: 8559/1: errata: Workaround erratum A12 821420
  ARM: 8558/1: errata: Workaround errata A12 818325/852422 A17 852423
  ARM: save and reset the address limit when entering an exception
  ARM: 8577/1: Fix Cortex-A15 798181 errata initialization
  ARM: 8584/1: floppy: avoid gcc-6 warning
  ARM: 8583/1: mm: fix location of _etext
  ARM: 8582/1: remove unused CONFIG_ARCH_HAS_BARRIERS
  ARM: 8306/1: loop_udelay: remove bogomips value limitation
  ARM: 8581/1: add missing <asm/prom.h> to arch/arm/kernel/devtree.c
  ARM: 8576/1: avoid duplicating "Kernel: arch/arm/boot/*Image is ready"
  ARM: 8556/1: on a generic DT system: do not touch l2x0
  ARM: uaccess: remove put_user() code duplication
  ARM: 8580/1: Remove orphaned __addr_ok() definition
  ARM: get rid of horrible *(unsigned int *)(regs + 1)
  ARM: introduce svc_pt_regs structure
  ...
2016-07-29 13:03:49 -07:00
Nitin Gupta 7bc3777ca1 sparc64: Trim page tables for 8M hugepages
For PMD aligned (8M) hugepages, we currently allocate
all four page table levels which is wasteful. We now
allocate till PMD level only which saves memory usage
from page tables.

Also, when freeing page table for 8M hugepage backed region,
make sure we don't try to access non-existent PTE level.

Orabug: 22630259

Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-29 10:49:16 -07:00
Josh Poimboeuf 4ce827b4cc x86/power/64: Fix hibernation return address corruption
In kernel bug 150021, a kernel panic was reported when restoring a
hibernate image.  Only a picture of the oops was reported, so I can't
paste the whole thing here.  But here are the most interesting parts:

  kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
  BUG: unable to handle kernel paging request at ffff8804615cfd78
  ...
  RIP: ffff8804615cfd78
  RSP: ffff8804615f0000
  RBP: ffff8804615cfdc0
  ...
  Call Trace:
   do_signal+0x23
   exit_to_usermode_loop+0x64
   ...

The RIP is on the same page as RBP, so it apparently started executing
on the stack.

The bug was bisected to commit ef0f3ed5a4 (x86/asm/power: Create
stack frames in hibernate_asm_64.S), which in retrospect seems quite
dangerous, since that code saves and restores the stack pointer from a
global variable ('saved_context').

There are a lot of moving parts in the hibernate save and restore paths,
so I don't know exactly what caused the panic.  Presumably, a FRAME_END
was executed without the corresponding FRAME_BEGIN, or vice versa.  That
would corrupt the return address on the stack and would be consistent
with the details of the above panic.

[ rjw: One major problem is that by the time the FRAME_BEGIN in
  restore_registers() is executed, the stack pointer value may not
  be valid any more.  Namely, the stack area pointed to by it
  previously may have been overwritten by some image memory contents
  and that page frame may now be used for whatever different purpose
  it had been allocated for before hibernation.  In that case, the
  FRAME_BEGIN will corrupt that memory. ]

Instead of doing the frame pointer save/restore around the bounds of the
affected functions, just do it around the call to swsusp_save().

That has the same effect of ensuring that if swsusp_save() sleeps, the
frame pointers will be correct.  It's also a much more obviously safe
way to do it than the original patch.  And objtool still doesn't report
any warnings.

Fixes: ef0f3ed5a4 (x86/asm/power: Create stack frames in hibernate_asm_64.S)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=150021
Cc: 4.6+ <stable@vger.kernel.org> # 4.6+
Reported-by: Andre Reinke <andre.reinke@mailbox.org>
Tested-by: Andre Reinke <andre.reinke@mailbox.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-29 13:38:59 +02:00
Dan Carpenter 55f1cf83d5 avr32: off by one in at32_init_pio()
The pio_dev[] array has MAX_NR_PIO_DEVICES elements so the > should be
>=.

Fixes: 5f97f7f940 ('[PATCH] avr32 architecture')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2016-07-29 11:55:57 +02:00
Hans-Christian Noren Egtvedt 6ad4a21b67 avr32: fixup code style in unistd.h and syscall_table.S
This patch swaps the mix of tabs and space for alignment of comment
after code to use spaces only.

Also document why recvmmsg was defined twice in the syscall_table.S
table, but only once in unistd.h. In short, wired in the table by
generic arch patch, but forgotten in unistd.h (review slip).
2016-07-29 11:55:57 +02:00
Hans-Christian Noren Egtvedt 389ce5a961 avr32: wire up preadv2 and pwritev2 syscalls
This patch wires up the new preadv2 and pwritev2 syscall on AVR32.

On AVR32, all parameters beyond the 5th are passed on the stack. System
calls don't use the stack -- they borrow a callee-saved register
instead. This means that syscalls that take 6 parameters must be called
through a stub that pushes the last parameter on the stack.

Signed-off-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
2016-07-29 11:55:57 +02:00
Mike Kravetz af1b1a9b36 sparc64 mm: Fix base TSB sizing when hugetlb pages are used
do_sparc64_fault() calculates both the base and huge page RSS sizes and
uses this information in calls to tsb_grow().  The calculation for base
page TSB size is not correct if the task uses hugetlb pages.  hugetlb
pages are not accounted for in RSS, therefore the call to get_mm_rss(mm)
does not include hugetlb pages.  However, the number of pages based on
huge_pte_count (which does include hugetlb pages) is subtracted from
this value.  This will result in an artificially small and often negative
RSS calculation.  The base TSB size is then often set to max_tsb_size
as the passed RSS is unsigned, so a negative value looks really big.

THP pages are also accounted for in huge_pte_count, and THP pages are
accounted for in RSS so the calculation in do_sparc64_fault() is correct
if a task only uses THP pages.

A single huge_pte_count is not sufficient for TSB sizing if both hugetlb
and THP pages can be used.  Instead of a single counter, use two:  one
for hugetlb and one for THP.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-28 21:22:12 -07:00
Linus Torvalds f0c98ebc57 libnvdimm for 4.8
1/ Replace pcommit with ADR / directed-flushing:
    The pcommit instruction, which has not shipped on any product, is
    deprecated. Instead, the requirement is that platforms implement either
    ADR, or provide one or more flush addresses per nvdimm. ADR
    (Asynchronous DRAM Refresh) flushes data in posted write buffers to the
    memory controller on a power-fail event. Flush addresses are defined in
    ACPI 6.x as an NVDIMM Firmware Interface Table (NFIT) sub-structure:
    "Flush Hint Address Structure". A flush hint is an mmio address that
    when written and fenced assures that all previous posted writes
    targeting a given dimm have been flushed to media.
 
 2/ On-demand ARS (address range scrub):
    Linux uses the results of the ACPI ARS commands to track bad blocks
    in pmem devices.  When latent errors are detected we re-scrub the media
    to refresh the bad block list, userspace can also request a re-scrub at
    any time.
 
 3/ Support for the Microsoft DSM (device specific method) command format.
 
 4/ Support for EDK2/OVMF virtual disk device memory ranges.
 
 5/ Various fixes and cleanups across the subsystem.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXmXBsAAoJEB7SkWpmfYgCEwwP/1IOt9ocP+iHLMDH9KE7VaTZ
 NmUDR+Zy6g5cRQM7SgcuU5BXUcx+OsSrSrUTVF1cW994o9Gbz1mFotkv0ZAsPcYY
 ZVRQxo2oqHrssyOcg+PsgKWiXn68rJOCgmpEyzaJywl5qTMst7pzsT1s1f7rSh6h
 trCf4VaJJwxZR8fARGtlHUnnhPe2Orp99EZRKEWprAsIv2kPuWpPHSjRjuEgN1JG
 KW8AYwWqFTtiLRUk86I4KBB0wcDrfctsjgN9Ogd6+aHyQBRnVSr2U+vDCFkC8KLu
 qiDCpYp+yyxBjclnljz7tRRT3GtzfCUWd4v2KVWqgg2IaobUc0Lbukp/rmikUXQP
 WLikT2OCQ994eFK5OX3Q3cIU/4j459TQnof8q14yVSpjAKrNUXVSR5puN7Hxa+V7
 41wKrAsnsyY1oq+Yd/rMR8VfH7PHx3bFkrmRCGZCufLX1UQm4aYj+sWagDKiV3yA
 DiudghbOnhfurfGsnXUVw7y7GKs+gNWNBmB6ndAD6ZEHmKoGUhAEbJDLCc3DnANl
 b/2mv1MIdIcC1DlCmnbbcn6fv6bICe/r8poK3VrCK3UgOq/EOvKIWl7giP+k1JuC
 6DdVYhlNYIVFXUNSLFAwz8OkLu8byx7WDm36iEqrKHtPw+8qa/2bWVgOU6OBgpjV
 cN3edFVIdxvZeMgM5Ubq
 =xCBG
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:

 - Replace pcommit with ADR / directed-flushing.

   The pcommit instruction, which has not shipped on any product, is
   deprecated.  Instead, the requirement is that platforms implement
   either ADR, or provide one or more flush addresses per nvdimm.

   ADR (Asynchronous DRAM Refresh) flushes data in posted write buffers
   to the memory controller on a power-fail event.

   Flush addresses are defined in ACPI 6.x as an NVDIMM Firmware
   Interface Table (NFIT) sub-structure: "Flush Hint Address Structure".
   A flush hint is an mmio address that when written and fenced assures
   that all previous posted writes targeting a given dimm have been
   flushed to media.

 - On-demand ARS (address range scrub).

   Linux uses the results of the ACPI ARS commands to track bad blocks
   in pmem devices.  When latent errors are detected we re-scrub the
   media to refresh the bad block list, userspace can also request a
   re-scrub at any time.

 - Support for the Microsoft DSM (device specific method) command
   format.

 - Support for EDK2/OVMF virtual disk device memory ranges.

 - Various fixes and cleanups across the subsystem.

* tag 'libnvdimm-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (41 commits)
  libnvdimm-btt: Delete an unnecessary check before the function call "__nd_device_register"
  nfit: do an ARS scrub on hitting a latent media error
  nfit: move to nfit/ sub-directory
  nfit, libnvdimm: allow an ARS scrub to be triggered on demand
  libnvdimm: register nvdimm_bus devices with an nd_bus driver
  pmem: clarify a debug print in pmem_clear_poison
  x86/insn: remove pcommit
  Revert "KVM: x86: add pcommit support"
  nfit, tools/testing/nvdimm/: unify shutdown paths
  libnvdimm: move ->module to struct nvdimm_bus_descriptor
  nfit: cleanup acpi_nfit_init calling convention
  nfit: fix _FIT evaluation memory leak + use after free
  tools/testing/nvdimm: add manufacturing_{date|location} dimm properties
  tools/testing/nvdimm: add virtual ramdisk range
  acpi, nfit: treat virtual ramdisk SPA as pmem region
  pmem: kill __pmem address space
  pmem: kill wmb_pmem()
  libnvdimm, pmem: use nvdimm_flush() for namespace I/O writes
  fs/dax: remove wmb_pmem()
  libnvdimm, pmem: flush posted-write queues on shutdown
  ...
2016-07-28 17:38:16 -07:00
Linus Torvalds 1c88e19b0f Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "The rest of MM"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (101 commits)
  mm, compaction: simplify contended compaction handling
  mm, compaction: introduce direct compaction priority
  mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations
  mm, page_alloc: make THP-specific decisions more generic
  mm, page_alloc: restructure direct compaction handling in slowpath
  mm, page_alloc: don't retry initial attempt in slowpath
  mm, page_alloc: set alloc_flags only once in slowpath
  lib/stackdepot.c: use __GFP_NOWARN for stack allocations
  mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB
  mm, kasan: account for object redzone in SLUB's nearest_obj()
  mm: fix use-after-free if memory allocation failed in vma_adjust()
  zsmalloc: Delete an unnecessary check before the function call "iput"
  mm/memblock.c: fix index adjustment error in __next_mem_range_rev()
  mem-hotplug: alloc new page from a nearest neighbor node when mem-offline
  mm: optimize copy_page_to/from_iter_iovec
  mm: add cond_resched() to generic_swapfile_activate()
  Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are free elements"
  mm, compaction: don't isolate PageWriteback pages in MIGRATE_SYNC_LIGHT mode
  mm: hwpoison: remove incorrect comments
  make __section_nr() more efficient
  ...
2016-07-28 16:36:48 -07:00
Dennis Chen cb0a650213 arm64:acpi: fix the acpi alignment exception when 'mem=' specified
When booting an ACPI enabled kernel with 'mem=x', there is the
possibility that ACPI data regions from the firmware will lie above the
memory limit.  Ordinarily these will be removed by
memblock_enforce_memory_limit(.).

Unfortunately, this means that these regions will then be mapped by
acpi_os_ioremap(.) as device memory (instead of normal) thus unaligned
accessess will then provoke alignment faults.

In this patch we adopt memblock_mem_limit_remove_map instead, and this
preserves these ACPI data regions (marked NOMAP) thus ensuring that
these regions are not mapped as device memory.

For example, below is an alignment exception observed on ARM platform
when booting the kernel with 'acpi=on mem=8G':

  ...
  Unable to handle kernel paging request at virtual address ffff0000080521e7
  pgd = ffff000008aa0000
  [ffff0000080521e7] *pgd=000000801fffe003, *pud=000000801fffd003, *pmd=000000801fffc003, *pte=00e80083ff1c1707
  Internal error: Oops: 96000021 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc3-next-20160616+ #172
  Hardware name: AMD Overdrive/Supercharger/Default string, BIOS ROD1001A 02/09/2016
  task: ffff800001ef0000 ti: ffff800001ef8000 task.ti: ffff800001ef8000
  PC is at acpi_ns_lookup+0x520/0x734
  LR is at acpi_ns_lookup+0x4a4/0x734
  pc : [<ffff0000083b8b10>] lr : [<ffff0000083b8a94>] pstate: 60000045
  sp : ffff800001efb8b0
  x29: ffff800001efb8c0 x28: 000000000000001b
  x27: 0000000000000001 x26: 0000000000000000
  x25: ffff800001efb9e8 x24: ffff000008a10000
  x23: 0000000000000001 x22: 0000000000000001
  x21: ffff000008724000 x20: 000000000000001b
  x19: ffff0000080521e7 x18: 000000000000000d
  x17: 00000000000038ff x16: 0000000000000002
  x15: 0000000000000007 x14: 0000000000007fff
  x13: ffffff0000000000 x12: 0000000000000018
  x11: 000000001fffd200 x10: 00000000ffffff76
  x9 : 000000000000005f x8 : ffff000008725fa8
  x7 : ffff000008a8df70 x6 : ffff000008a8df70
  x5 : ffff000008a8d000 x4 : 0000000000000010
  x3 : 0000000000000010 x2 : 000000000000000c
  x1 : 0000000000000006 x0 : 0000000000000000
  ...
    acpi_ns_lookup+0x520/0x734
    acpi_ds_load1_begin_op+0x174/0x4fc
    acpi_ps_build_named_op+0xf8/0x220
    acpi_ps_create_op+0x208/0x33c
    acpi_ps_parse_loop+0x204/0x838
    acpi_ps_parse_aml+0x1bc/0x42c
    acpi_ns_one_complete_parse+0x1e8/0x22c
    acpi_ns_parse_table+0x8c/0x128
    acpi_ns_load_table+0xc0/0x1e8
    acpi_tb_load_namespace+0xf8/0x2e8
    acpi_load_tables+0x7c/0x110
    acpi_init+0x90/0x2c0
    do_one_initcall+0x38/0x12c
    kernel_init_freeable+0x148/0x1ec
    kernel_init+0x10/0xec
    ret_from_fork+0x10/0x40
  Code: b9009fbc 2a00037b 36380057 3219037b (b9400260)
  ---[ end trace 03381e5eb0a24de4 ]---
  Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

With 'efi=debug', we can see those ACPI regions loaded by firmware on
that board as:

  efi:   0x0083ff185000-0x0083ff1b4fff [Reserved           |   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
  efi:   0x0083ff1b5000-0x0083ff1c2fff [ACPI Reclaim Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
  efi:   0x0083ff223000-0x0083ff224fff [ACPI Memory NVS    |   |  |  |  |  |  |  |   |WB|WT|WC|UC]*

Link: http://lkml.kernel.org/r/1468475036-5852-3-git-send-email-dennis.chen@arm.com
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Dennis Chen <dennis.chen@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Kaly Xin <kaly.xin@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 11fb998986 mm: move most file-based accounting to the node
There are now a number of accounting oddities such as mapped file pages
being accounted for on the node while the total number of file pages are
accounted on the zone.  This can be coped with to some extent but it's
confusing so this patch moves the relevant file-based accounted.  Due to
throttling logic in the page allocator for reliable OOM detection, it is
still necessary to track dirty and writeback pages on a per-zone basis.

[mgorman@techsingularity.net: fix NR_ZONE_WRITE_PENDING accounting]
  Link: http://lkml.kernel.org/r/1468404004-5085-5-git-send-email-mgorman@techsingularity.net
Link: http://lkml.kernel.org/r/1467970510-21195-20-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 50658e2e04 mm: move page mapped accounting to the node
Reclaim makes decisions based on the number of pages that are mapped but
it's mixing node and zone information.  Account NR_FILE_MAPPED and
NR_ANON_PAGES pages on the node.

Link: http://lkml.kernel.org/r/1467970510-21195-18-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman 599d0c954f mm, vmscan: move LRU lists to node
This moves the LRU lists from the zone to the node and related data such
as counters, tracing, congestion tracking and writeback tracking.

Unfortunately, due to reclaim and compaction retry logic, it is
necessary to account for the number of LRU pages on both zone and node
logic.  Most reclaim logic is based on the node counters but the retry
logic uses the zone counters which do not distinguish inactive and
active sizes.  It would be possible to leave the LRU counters on a
per-zone basis but it's a heavier calculation across multiple cache
lines that is much more frequent than the retry checks.

Other than the LRU counters, this is mostly a mechanical patch but note
that it introduces a number of anomalies.  For example, the scans are
per-zone but using per-node counters.  We also mark a node as congested
when a zone is congested.  This causes weird problems that are fixed
later but is easier to review.

In the event that there is excessive overhead on 32-bit systems due to
the nodes being on LRU then there are two potential solutions

1. Long-term isolation of highmem pages when reclaim is lowmem

   When pages are skipped, they are immediately added back onto the LRU
   list. If lowmem reclaim persisted for long periods of time, the same
   highmem pages get continually scanned. The idea would be that lowmem
   keeps those pages on a separate list until a reclaim for highmem pages
   arrives that splices the highmem pages back onto the LRU. It potentially
   could be implemented similar to the UNEVICTABLE list.

   That would reduce the skip rate with the potential corner case is that
   highmem pages have to be scanned and reclaimed to free lowmem slab pages.

2. Linear scan lowmem pages if the initial LRU shrink fails

   This will break LRU ordering but may be preferable and faster during
   memory pressure than skipping LRU pages.

Link: http://lkml.kernel.org/r/1467970510-21195-4-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Linus Torvalds 69c4289449 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  fat: fix error message for bogus number of directory entries
  fat: fix typo s/supeblock/superblock/
  ASoC: max9877: Remove unused function declaration
  dw2102: don't output spurious blank lines to the kernel log
  init: fix Kconfig text
  ARM: io: fix comment grammar
  ocfs: fix ocfs2_xattr_user_get() argument name
  scsi/qla2xxx: Remove erroneous unused macro qla82xx_get_temp_val1()
2016-07-28 14:22:25 -07:00
Vineet Gupta 3925a16ae9 ARC: mm: don't loose PTE_SPECIAL in pte_modify()
LTP madvise05 was generating mm splat

| [ARCLinux]# /sd/ltp/testcases/bin/madvise05
| BUG: Bad page map in process madvise05  pte:80e08211 pmd:9f7d4000
| page:9fdcfc90 count:1 mapcount:-1 mapping:  (null) index:0x0 flags: 0x404(referenced|reserved)
| page dumped because: bad pte
| addr:200b8000 vm_flags:00000070 anon_vma:  (null) mapping:  (null) index:1005c
| file:  (null) fault:  (null) mmap:  (null) readpage:  (null)
| CPU: 2 PID: 6707 Comm: madvise05

And for newer kernels, the system was rendered unusable afterwards.

The problem was mprotect->pte_modify() clearing PTE_SPECIAL (which is
set to identify the special zero page wired to the pte).
When pte was finally unmapped, special casing for zero page was not
done, and instead it was treated as a "normal" page, tripping on the
map counts etc.

This fixes ARC STAR 9001053308

Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-07-28 12:38:17 -07:00
Russell King 5f5a00eaa1 Merge branches 'cpuidle', 'fixes' and 'misc' into for-linus 2016-07-28 15:28:11 +01:00
Dan Carpenter fa1608285b sparc32: off by ones in BUG_ON()
Smatch complains that these tests are off by one, which is true but not
life threatening.

	arch/sparc/kernel/irq_32.c:169 irq_link()
	error: buffer overflow 'irq_map' 384 <= 384

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-27 22:53:17 -07:00
Stephen Rothwell fbef66f0ad powerpc/mm: Parenthesise IS_ENABLED() in if condition
Currently IS_ENABLED() produces an expression surrounded by parentheses,
which allows this code to compile, generating eg:

    else if (1 || 0)
        hpte_init_native();

However a change to the macro in the kbuild tree will break this in
future by removing the parentheses.

Fixes: 7353644fa9 ("powerpc/mm: Fix build break when PPC_NATIVE=n")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-28 12:39:15 +10:00