powerpc updates for 4.12 part 1.

Highlights include:
 
  - Larger virtual address space on 64-bit server CPUs. By default we use a 128TB
    virtual address space, but a process can request access to the full 512TB by
    passing a hint to mmap().
 
  - Support for the new Power9 "XIVE" interrupt controller.
 
  - TLB flushing optimisations for the radix MMU on Power9.
 
  - Support for CAPI cards on Power9, using the "Coherent Accelerator Interface
    Architecture 2.0".
 
  - The ability to configure the mmap randomisation limits at build and runtime.
 
  - Several small fixes and cleanups to the kprobes code, as well as support for
    KPROBES_ON_FTRACE.
 
  - Major improvements to handling of system reset interrupts, correctly treating
    them as NMIs, giving them a dedicated stack and using a new hypervisor call
    to trigger them, all of which should aid debugging and robustness.
 
 Many fixes and other minor enhancements.
 
 Thanks to:
   Alastair D'Silva, Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan,
   Aneesh Kumar K.V, Anshuman Khandual, Anton Blanchard, Balbir Singh, Ben
   Hutchings, Benjamin Herrenschmidt, Bhupesh Sharma, Chris Packham, Christian
   Zigotzky, Christophe Leroy, Christophe Lombard, Daniel Axtens, David Gibson,
   Gautham R. Shenoy, Gavin Shan, Geert Uytterhoeven, Guilherme G. Piccoli,
   Hamish Martin, Hari Bathini, Kees Cook, Laurent Dufour, Madhavan Srinivasan,
   Mahesh J Salgaonkar, Mahesh Salgaonkar, Masami Hiramatsu, Matt Brown, Matthew
   R. Ochs, Michael Neuling, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran,
   Pan Xinhui, Paul Mackerras, Rashmica Gupta, Russell Currey, Sukadev
   Bhattiprolu, Thadeu Lima de Souza Cascardo, Tobin C. Harding, Tyrel Datwyler,
   Uma Krishnan, Vaibhav Jain, Vipin K Parashar, Yang Shi.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZDHUMAAoJEFHr6jzI4aWAT7oQALkE2Nj3gjcn1z0SkFhq/1iO
 Py9Elmqm4E+L6NKYtBY5dS8xVAJ088ffzERyqJ1FY1LHkB8tn8bWRcMQmbjAFzTI
 V4TAzDNI890BN/F4ptrYRwNFxRBHAvZ4NDunTzagwYnwmTzW9PYHmOi4pvWTo3Tw
 KFUQ0joLSEgHzyfXxYB3fyj41u8N0FZvhfazdNSqia2Y5Vwwv/ION5jKplDM+09Y
 EtVEXFvaKAS1sjbM/d/Jo5rblHfR0D9/lYV10+jjyIokjzslIpyTbnj3izeYoM5V
 I4h99372zfsEjBGPPXyM3khL3zizGMSDYRmJHQSaKxjtecS9SPywPTZ8ufO/aSzV
 Ngq6nlND+f1zep29VQ0cxd3Jh40skWOXzxJaFjfDT25xa6FbfsWP2NCtk8PGylZ7
 EyqTuCWkMgIP02KlX3oHvEB2LRRPCDmRU2zECecRGNJrIQwYC2xjoiVi7Q8Qe8rY
 gr7Ib5Jj/a+uiTcCIy37+5nXq2s14/JBOKqxuYZIxeuZFvKYuRUipbKWO05WDOAz
 m/pSzeC3J8AAoYiqR0gcSOuJTOnJpGhs7zrQFqnEISbXIwLW+ICumzOmTAiBqOEY
 Rt8uW2gYkPwKLrE05445RfVUoERaAjaE06eRMOWS6slnngHmmnRJbf3PcoALiJkT
 ediqGEj0/N1HMB31V5tS
 =vSF3
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights include:

   - Larger virtual address space on 64-bit server CPUs. By default we
     use a 128TB virtual address space, but a process can request access
     to the full 512TB by passing a hint to mmap().

   - Support for the new Power9 "XIVE" interrupt controller.

   - TLB flushing optimisations for the radix MMU on Power9.

   - Support for CAPI cards on Power9, using the "Coherent Accelerator
     Interface Architecture 2.0".

   - The ability to configure the mmap randomisation limits at build and
     runtime.

   - Several small fixes and cleanups to the kprobes code, as well as
     support for KPROBES_ON_FTRACE.

   - Major improvements to handling of system reset interrupts,
     correctly treating them as NMIs, giving them a dedicated stack and
     using a new hypervisor call to trigger them, all of which should
     aid debugging and robustness.

   - Many fixes and other minor enhancements.

  Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Alistair Popple,
  Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual, Anton
  Blanchard, Balbir Singh, Ben Hutchings, Benjamin Herrenschmidt,
  Bhupesh Sharma, Chris Packham, Christian Zigotzky, Christophe Leroy,
  Christophe Lombard, Daniel Axtens, David Gibson, Gautham R. Shenoy,
  Gavin Shan, Geert Uytterhoeven, Guilherme G. Piccoli, Hamish Martin,
  Hari Bathini, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mahesh J
  Salgaonkar, Mahesh Salgaonkar, Masami Hiramatsu, Matt Brown, Matthew
  R. Ochs, Michael Neuling, Naveen N. Rao, Nicholas Piggin, Oliver
  O'Halloran, Pan Xinhui, Paul Mackerras, Rashmica Gupta, Russell
  Currey, Sukadev Bhattiprolu, Thadeu Lima de Souza Cascardo, Tobin C.
  Harding, Tyrel Datwyler, Uma Krishnan, Vaibhav Jain, Vipin K Parashar,
  Yang Shi"

* tag 'powerpc-4.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (214 commits)
  powerpc/64s: Power9 has no LPCR[VRMASD] field so don't set it
  powerpc/powernv: Fix TCE kill on NVLink2
  powerpc/mm/radix: Drop support for CPUs without lockless tlbie
  powerpc/book3s/mce: Move add_taint() later in virtual mode
  powerpc/sysfs: Move #ifdef CONFIG_HOTPLUG_CPU out of the function body
  powerpc/smp: Document irq enable/disable after migrating IRQs
  powerpc/mpc52xx: Don't select user-visible RTAS_PROC
  powerpc/powernv: Document cxl dependency on special case in pnv_eeh_reset()
  powerpc/eeh: Clean up and document event handling functions
  powerpc/eeh: Avoid use after free in eeh_handle_special_event()
  cxl: Mask slice error interrupts after first occurrence
  cxl: Route eeh events to all drivers in cxl_pci_error_detected()
  cxl: Force context lock during EEH flow
  powerpc/64: Allow CONFIG_RELOCATABLE if COMPILE_TEST
  powerpc/xmon: Teach xmon oops about radix vectors
  powerpc/mm/hash: Fix off-by-one in comment about kernel contexts ids
  powerpc/pseries: Enable VFIO
  powerpc/powernv: Fix iommu table size calculation hook for small tables
  powerpc/powernv: Check kzalloc() return value in pnv_pci_table_alloc
  powerpc: Add arch/powerpc/tools directory
  ...
This commit is contained in:
Linus Torvalds 2017-05-05 11:36:44 -07:00
commit 7246f60068
212 changed files with 8702 additions and 2794 deletions

View File

@ -26,7 +26,7 @@
| nios2: | TODO |
| openrisc: | TODO |
| parisc: | TODO |
| powerpc: | TODO |
| powerpc: | ok |
| s390: | TODO |
| score: | TODO |
| sh: | TODO |

View File

@ -21,7 +21,7 @@ Introduction
Hardware overview
=================
POWER8 FPGA
POWER8/9 FPGA
+----------+ +---------+
| | | |
| CPU | | AFU |
@ -34,7 +34,7 @@ Hardware overview
| | CAPP |<------>| |
+---+------+ PCIE +---------+
The POWER8 chip has a Coherently Attached Processor Proxy (CAPP)
The POWER8/9 chip has a Coherently Attached Processor Proxy (CAPP)
unit which is part of the PCIe Host Bridge (PHB). This is managed
by Linux by calls into OPAL. Linux doesn't directly program the
CAPP.
@ -59,6 +59,17 @@ Hardware overview
the fault. The context to which this fault is serviced is based on
who owns that acceleration function.
POWER8 <-----> PSL Version 8 is compliant to the CAIA Version 1.0.
POWER9 <-----> PSL Version 9 is compliant to the CAIA Version 2.0.
This PSL Version 9 provides new features such as:
* Interaction with the nest MMU on the P9 chip.
* Native DMA support.
* Supports sending ASB_Notify messages for host thread wakeup.
* Supports Atomic operations.
* ....
Cards with a PSL9 won't work on a POWER8 system and cards with a
PSL8 won't work on a POWER9 system.
AFU Modes
=========

View File

@ -105,21 +105,21 @@ memory is held.
If there is no waiting dump data, then only the memory required
to hold CPU state, HPTE region, boot memory dump and elfcore
header, is reserved at the top of memory (see Fig. 1). This area
is *not* released: this region will be kept permanently reserved,
so that it can act as a receptacle for a copy of the boot memory
content in addition to CPU state and HPTE region, in the case a
crash does occur.
header, is usually reserved at an offset greater than boot memory
size (see Fig. 1). This area is *not* released: this region will
be kept permanently reserved, so that it can act as a receptacle
for a copy of the boot memory content in addition to CPU state
and HPTE region, in the case a crash does occur.
o Memory Reservation during first kernel
Low memory Top of memory
Low memory Top of memory
0 boot memory size |
| | |<--Reserved dump area -->|
V V | Permanent Reservation V
+-----------+----------/ /----------+---+----+-----------+----+
| | |CPU|HPTE| DUMP |ELF |
+-----------+----------/ /----------+---+----+-----------+----+
| | |<--Reserved dump area -->| |
V V | Permanent Reservation | V
+-----------+----------/ /---+---+----+-----------+----+------+
| | |CPU|HPTE| DUMP |ELF | |
+-----------+----------/ /---+---+----+-----------+----+------+
| ^
| |
\ /
@ -135,12 +135,12 @@ crash does occur.
0 boot memory size |
| |<------------- Reserved dump area ----------- -->|
V V V
+-----------+----------/ /----------+---+----+-----------+----+
| | |CPU|HPTE| DUMP |ELF |
+-----------+----------/ /----------+---+----+-----------+----+
| |
V V
Used by second /proc/vmcore
+-----------+----------/ /---+---+----+-----------+----+------+
| | |CPU|HPTE| DUMP |ELF | |
+-----------+----------/ /---+---+----+-----------+----+------+
| |
V V
Used by second /proc/vmcore
kernel to boot
Fig. 2

View File

@ -5310,6 +5310,7 @@ M: Scott Wood <oss@buserror.net>
L: linuxppc-dev@lists.ozlabs.org
L: linux-arm-kernel@lists.infradead.org
S: Maintained
F: Documentation/devicetree/bindings/powerpc/fsl/
F: drivers/soc/fsl/
F: include/linux/fsl/
@ -7568,7 +7569,7 @@ Q: http://patchwork.ozlabs.org/project/linuxppc-dev/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git
S: Supported
F: Documentation/ABI/stable/sysfs-firmware-opal-*
F: Documentation/devicetree/bindings/powerpc/opal/
F: Documentation/devicetree/bindings/powerpc/
F: Documentation/devicetree/bindings/rtc/rtc-opal.txt
F: Documentation/devicetree/bindings/i2c/i2c-opal.txt
F: Documentation/powerpc/

View File

@ -22,6 +22,48 @@ config MMU
bool
default y
config ARCH_MMAP_RND_BITS_MAX
# On Book3S 64, the default virtual address space for 64-bit processes
# is 2^47 (128TB). As a maximum, allow randomisation to consume up to
# 32T of address space (2^45), which should ensure a reasonable gap
# between bottom-up and top-down allocations for applications that
# consume "normal" amounts of address space. Book3S 64 only supports 64K
# and 4K page sizes.
default 29 if PPC_BOOK3S_64 && PPC_64K_PAGES # 29 = 45 (32T) - 16 (64K)
default 33 if PPC_BOOK3S_64 # 33 = 45 (32T) - 12 (4K)
#
# On all other 64-bit platforms (currently only Book3E), the virtual
# address space is 2^46 (64TB). Allow randomisation to consume up to 16T
# of address space (2^44). Only 4K page sizes are supported.
default 32 if 64BIT # 32 = 44 (16T) - 12 (4K)
#
# For 32-bit, use the compat values, as they're the same.
default ARCH_MMAP_RND_COMPAT_BITS_MAX
config ARCH_MMAP_RND_BITS_MIN
# Allow randomisation to consume up to 1GB of address space (2^30).
default 14 if 64BIT && PPC_64K_PAGES # 14 = 30 (1GB) - 16 (64K)
default 18 if 64BIT # 18 = 30 (1GB) - 12 (4K)
#
# For 32-bit, use the compat values, as they're the same.
default ARCH_MMAP_RND_COMPAT_BITS_MIN
config ARCH_MMAP_RND_COMPAT_BITS_MAX
# Total virtual address space for 32-bit processes is 2^31 (2GB).
# Allow randomisation to consume up to 512MB of address space (2^29).
default 11 if PPC_256K_PAGES # 11 = 29 (512MB) - 18 (256K)
default 13 if PPC_64K_PAGES # 13 = 29 (512MB) - 16 (64K)
default 15 if PPC_16K_PAGES # 15 = 29 (512MB) - 14 (16K)
default 17 # 17 = 29 (512MB) - 12 (4K)
config ARCH_MMAP_RND_COMPAT_BITS_MIN
# Total virtual address space for 32-bit processes is 2^31 (2GB).
# Allow randomisation to consume up to 8MB of address space (2^23).
default 5 if PPC_256K_PAGES # 5 = 23 (8MB) - 18 (256K)
default 7 if PPC_64K_PAGES # 7 = 23 (8MB) - 16 (64K)
default 9 if PPC_16K_PAGES # 9 = 23 (8MB) - 14 (16K)
default 11 # 11 = 23 (8MB) - 12 (4K)
config HAVE_SETUP_PER_CPU_AREA
def_bool PPC64
@ -38,6 +80,11 @@ config NR_IRQS
/proc/interrupts. If you configure your system to have too few,
drivers will fail to load or worse - handle with care.
config NMI_IPI
bool
depends on SMP && (DEBUGGER || KEXEC_CORE)
default y
config STACKTRACE_SUPPORT
bool
default y
@ -119,6 +166,8 @@ config PPC
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ARCH_JUMP_LABEL
select HAVE_ARCH_KGDB
select HAVE_ARCH_MMAP_RND_BITS
select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
select HAVE_CBPF_JIT if !PPC64
@ -141,6 +190,7 @@ config PPC
select HAVE_IRQ_EXIT_ON_IRQ_STACK
select HAVE_KERNEL_GZIP
select HAVE_KPROBES
select HAVE_KPROBES_ON_FTRACE
select HAVE_KRETPROBES
select HAVE_LIVEPATCH if HAVE_DYNAMIC_FTRACE_WITH_REGS
select HAVE_MEMBLOCK
@ -489,7 +539,7 @@ config KEXEC_FILE
config RELOCATABLE
bool "Build a relocatable kernel"
depends on (PPC64 && !COMPILE_TEST) || (FLATMEM && (44x || FSL_BOOKE))
depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
select NONSTATIC_KERNEL
select MODULE_REL_CRCS if MODVERSIONS
help
@ -523,7 +573,7 @@ config RELOCATABLE_TEST
config CRASH_DUMP
bool "Build a kdump crash kernel"
depends on PPC64 || 6xx || FSL_BOOKE || (44x && !SMP)
select RELOCATABLE if (PPC64 && !COMPILE_TEST) || 44x || FSL_BOOKE
select RELOCATABLE if PPC64 || 44x || FSL_BOOKE
help
Build a kernel suitable for use as a kdump capture kernel.
The same kernel binary can be used as production kernel and dump
@ -585,7 +635,7 @@ config ARCH_SPARSEMEM_ENABLE
config ARCH_SPARSEMEM_DEFAULT
def_bool y
depends on (SMP && PPC_PSERIES) || PPC_PS3
depends on PPC_BOOK3S_64
config SYS_SUPPORTS_HUGETLBFS
bool
@ -677,6 +727,16 @@ config PPC_256K_PAGES
endchoice
config THREAD_SHIFT
int "Thread shift" if EXPERT
range 13 15
default "15" if PPC_256K_PAGES
default "14" if PPC64
default "13"
help
Used to define the stack size. The default is almost always what you
want. Only change this if you know what you are doing.
config FORCE_MAX_ZONEORDER
int "Maximum zone order"
range 8 9 if PPC64 && PPC_64K_PAGES

View File

@ -136,7 +136,7 @@ CFLAGS-$(CONFIG_GENERIC_CPU) += -mcpu=powerpc64
endif
ifdef CONFIG_MPROFILE_KERNEL
ifeq ($(shell $(srctree)/arch/powerpc/scripts/gcc-check-mprofile-kernel.sh $(CC) -I$(srctree)/include -D__KERNEL__),OK)
ifeq ($(shell $(srctree)/arch/powerpc/tools/gcc-check-mprofile-kernel.sh $(CC) -I$(srctree)/include -D__KERNEL__),OK)
CC_FLAGS_FTRACE := -pg -mprofile-kernel
KBUILD_CPPFLAGS += -DCC_USING_MPROFILE_KERNEL
else
@ -274,17 +274,6 @@ PHONY += $(BOOT_TARGETS1) $(BOOT_TARGETS2)
boot := arch/$(ARCH)/boot
ifeq ($(CONFIG_RELOCATABLE),y)
quiet_cmd_relocs_check = CALL $<
cmd_relocs_check = $(CONFIG_SHELL) $< "$(OBJDUMP)" "$(obj)/vmlinux"
PHONY += relocs_check
relocs_check: arch/powerpc/relocs_check.sh vmlinux
$(call cmd,relocs_check)
zImage: relocs_check
endif
$(BOOT_TARGETS1): vmlinux
$(Q)$(MAKE) $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
$(BOOT_TARGETS2): vmlinux

View File

@ -0,0 +1,34 @@
# ===========================================================================
# Post-link powerpc pass
# ===========================================================================
#
# 1. Check that vmlinux relocations look sane
PHONY := __archpost
__archpost:
include include/config/auto.conf
include scripts/Kbuild.include
quiet_cmd_relocs_check = CHKREL $@
cmd_relocs_check = $(CONFIG_SHELL) $(srctree)/arch/powerpc/tools/relocs_check.sh "$(OBJDUMP)" "$@"
# `@true` prevents complaint when there is nothing to be done
vmlinux: FORCE
@true
ifdef CONFIG_RELOCATABLE
$(call if_changed,relocs_check)
endif
%.ko: FORCE
@true
clean:
@true
PHONY += FORCE clean
FORCE:
.PHONY: $(PHONY)

View File

@ -33,7 +33,7 @@ CONFIG_BLK_DEV_INITRD=y
CONFIG_BPF_SYSCALL=y
# CONFIG_COMPAT_BRK is not set
CONFIG_PROFILING=y
CONFIG_OPROFILE=y
CONFIG_OPROFILE=m
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
CONFIG_MODULES=y
@ -261,7 +261,7 @@ CONFIG_NILFS2_FS=m
CONFIG_AUTOFS4_FS=m
CONFIG_FUSE_FS=m
CONFIG_OVERLAY_FS=m
CONFIG_ISO9660_FS=m
CONFIG_ISO9660_FS=y
CONFIG_UDF_FS=m
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=m
@ -306,7 +306,7 @@ CONFIG_CRYPTO_TEST=m
CONFIG_CRYPTO_CCM=m
CONFIG_CRYPTO_PCBC=m
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPT_CRC32C_VPMSUM=m
CONFIG_CRYPTO_CRC32C_VPMSUM=m
CONFIG_CRYPTO_MD5_PPC=m
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_SHA256=y

View File

@ -19,7 +19,7 @@ CONFIG_BLK_DEV_INITRD=y
CONFIG_BPF_SYSCALL=y
# CONFIG_COMPAT_BRK is not set
CONFIG_PROFILING=y
CONFIG_OPROFILE=y
CONFIG_OPROFILE=m
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
CONFIG_MODULES=y
@ -290,7 +290,7 @@ CONFIG_NILFS2_FS=m
CONFIG_AUTOFS4_FS=m
CONFIG_FUSE_FS=m
CONFIG_OVERLAY_FS=m
CONFIG_ISO9660_FS=m
CONFIG_ISO9660_FS=y
CONFIG_UDF_FS=m
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=m
@ -339,7 +339,7 @@ CONFIG_PPC_EARLY_DEBUG=y
CONFIG_CRYPTO_TEST=m
CONFIG_CRYPTO_PCBC=m
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPT_CRC32C_VPMSUM=m
CONFIG_CRYPTO_CRC32C_VPMSUM=m
CONFIG_CRYPTO_MD5_PPC=m
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_SHA256=y

View File

@ -34,7 +34,7 @@ CONFIG_BLK_DEV_INITRD=y
CONFIG_BPF_SYSCALL=y
# CONFIG_COMPAT_BRK is not set
CONFIG_PROFILING=y
CONFIG_OPROFILE=y
CONFIG_OPROFILE=m
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
CONFIG_MODULES=y
@ -259,7 +259,7 @@ CONFIG_NILFS2_FS=m
CONFIG_AUTOFS4_FS=m
CONFIG_FUSE_FS=m
CONFIG_OVERLAY_FS=m
CONFIG_ISO9660_FS=m
CONFIG_ISO9660_FS=y
CONFIG_UDF_FS=m
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=m
@ -303,7 +303,7 @@ CONFIG_XMON=y
CONFIG_CRYPTO_TEST=m
CONFIG_CRYPTO_PCBC=m
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPT_CRC32C_VPMSUM=m
CONFIG_CRYPTO_CRC32C_VPMSUM=m
CONFIG_CRYPTO_MD5_PPC=m
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_SHA256=y

View File

@ -17,6 +17,8 @@
#include <asm/checksum.h>
#include <linux/uaccess.h>
#include <asm/epapr_hcalls.h>
#include <asm/dcr.h>
#include <asm/mmu_context.h>
#include <uapi/asm/ucontext.h>
@ -120,6 +122,8 @@ extern s64 __ashrdi3(s64, int);
extern int __cmpdi2(s64, s64);
extern int __ucmpdi2(u64, u64);
/* tracing */
void _mcount(void);
unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip);
#endif /* _ASM_POWERPC_ASM_PROTOTYPES_H */

View File

@ -55,6 +55,14 @@
#define PPC_BITEXTRACT(bits, ppc_bit, dst_bit) \
((((bits) >> PPC_BITLSHIFT(ppc_bit)) & 1) << (dst_bit))
#define PPC_BITLSHIFT32(be) (32 - 1 - (be))
#define PPC_BIT32(bit) (1UL << PPC_BITLSHIFT32(bit))
#define PPC_BITMASK32(bs, be) ((PPC_BIT32(bs) - PPC_BIT32(be))|PPC_BIT32(bs))
#define PPC_BITLSHIFT8(be) (8 - 1 - (be))
#define PPC_BIT8(bit) (1UL << PPC_BITLSHIFT8(bit))
#define PPC_BITMASK8(bs, be) ((PPC_BIT8(bs) - PPC_BIT8(be))|PPC_BIT8(bs))
#include <asm/barrier.h>
/* Macro for generating the ***_bits() functions */

View File

@ -8,7 +8,7 @@
#define H_PTE_INDEX_SIZE 9
#define H_PMD_INDEX_SIZE 7
#define H_PUD_INDEX_SIZE 9
#define H_PGD_INDEX_SIZE 9
#define H_PGD_INDEX_SIZE 12
#ifndef __ASSEMBLY__
#define H_PTE_TABLE_SIZE (sizeof(pte_t) << H_PTE_INDEX_SIZE)

View File

@ -4,10 +4,14 @@
#define H_PTE_INDEX_SIZE 8
#define H_PMD_INDEX_SIZE 5
#define H_PUD_INDEX_SIZE 5
#define H_PGD_INDEX_SIZE 12
#define H_PGD_INDEX_SIZE 15
#define H_PAGE_COMBO 0x00001000 /* this is a combo 4k page */
#define H_PAGE_4K_PFN 0x00002000 /* PFN is for a single 4k page */
/*
* 64k aligned address free up few of the lower bits of RPN for us
* We steal that here. For more deatils look at pte_pfn/pfn_pte()
*/
#define H_PAGE_COMBO _RPAGE_RPN0 /* this is a combo 4k page */
#define H_PAGE_4K_PFN _RPAGE_RPN1 /* PFN is for a single 4k page */
/*
* We need to differentiate between explicit huge page and THP huge
* page, since THP huge page also need to track real subpage details

View File

@ -6,19 +6,13 @@
* Common bits between 4K and 64K pages in a linux-style PTE.
* Additional bits may be defined in pgtable-hash64-*.h
*
* Note: We only support user read/write permissions. Supervisor always
* have full read/write to pages above PAGE_OFFSET (pages below that
* always use the user access permissions).
*
* We could create separate kernel read-only if we used the 3 PP bits
* combinations that newer processors provide but we currently don't.
*/
#define H_PAGE_BUSY 0x00800 /* software: PTE & hash are busy */
#define H_PTE_NONE_MASK _PAGE_HPTEFLAGS
#define H_PAGE_F_GIX_SHIFT 57
#define H_PAGE_F_GIX (7ul << 57) /* HPTE index within HPTEG */
#define H_PAGE_F_SECOND (1ul << 60) /* HPTE is in 2ndary HPTEG */
#define H_PAGE_HASHPTE (1ul << 61) /* PTE has associated HPTE */
#define H_PAGE_F_GIX_SHIFT 56
#define H_PAGE_BUSY _RPAGE_RSV1 /* software: PTE & hash are busy */
#define H_PAGE_F_SECOND _RPAGE_RSV2 /* HPTE is in 2ndary HPTEG */
#define H_PAGE_F_GIX (_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
#define H_PAGE_HASHPTE _RPAGE_RPN43 /* PTE has associated HPTE */
#ifdef CONFIG_PPC_64K_PAGES
#include <asm/book3s/64/hash-64k.h>

View File

@ -46,7 +46,7 @@ static inline pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma,
*/
VM_WARN_ON(page_shift == mmu_psize_defs[MMU_PAGE_1G].shift);
if (page_shift == mmu_psize_defs[MMU_PAGE_2M].shift)
return __pte(pte_val(entry) | _PAGE_LARGE);
return __pte(pte_val(entry) | R_PAGE_LARGE);
else
return entry;
}

View File

@ -39,6 +39,7 @@
/* Bits in the SLB VSID word */
#define SLB_VSID_SHIFT 12
#define SLB_VSID_SHIFT_256M SLB_VSID_SHIFT
#define SLB_VSID_SHIFT_1T 24
#define SLB_VSID_SSIZE_SHIFT 62
#define SLB_VSID_B ASM_CONST(0xc000000000000000)
@ -408,7 +409,7 @@ static inline unsigned long hpt_vpn(unsigned long ea,
static inline unsigned long hpt_hash(unsigned long vpn,
unsigned int shift, int ssize)
{
int mask;
unsigned long mask;
unsigned long hash, vsid;
/* VPN_SHIFT can be atmost 12 */
@ -491,13 +492,14 @@ extern void slb_set_size(u16 size);
* We first generate a 37-bit "proto-VSID". Proto-VSIDs are generated
* from mmu context id and effective segment id of the address.
*
* For user processes max context id is limited to ((1ul << 19) - 5)
* for kernel space, we use the top 4 context ids to map address as below
* For user processes max context id is limited to MAX_USER_CONTEXT.
* For kernel space, we use context ids 1-4 to map addresses as below:
* NOTE: each context only support 64TB now.
* 0x7fffc - [ 0xc000000000000000 - 0xc0003fffffffffff ]
* 0x7fffd - [ 0xd000000000000000 - 0xd0003fffffffffff ]
* 0x7fffe - [ 0xe000000000000000 - 0xe0003fffffffffff ]
* 0x7ffff - [ 0xf000000000000000 - 0xf0003fffffffffff ]
* 0x00001 - [ 0xc000000000000000 - 0xc0003fffffffffff ]
* 0x00002 - [ 0xd000000000000000 - 0xd0003fffffffffff ]
* 0x00003 - [ 0xe000000000000000 - 0xe0003fffffffffff ]
* 0x00004 - [ 0xf000000000000000 - 0xf0003fffffffffff ]
*
* The proto-VSIDs are then scrambled into real VSIDs with the
* multiplicative hash:
@ -511,20 +513,28 @@ extern void slb_set_size(u16 size);
* robust scattering in the hash table (at least based on some initial
* results).
*
* We also consider VSID 0 special. We use VSID 0 for slb entries mapping
* bad address. This enables us to consolidate bad address handling in
* hash_page.
* We use VSID 0 to indicate an invalid VSID. The means we can't use context id
* 0, because a context id of 0 and an EA of 0 gives a proto-VSID of 0, which
* will produce a VSID of 0.
*
* We also need to avoid the last segment of the last context, because that
* would give a protovsid of 0x1fffffffff. That will result in a VSID 0
* because of the modulo operation in vsid scramble. But the vmemmap
* (which is what uses region 0xf) will never be close to 64TB in size
* (it's 56 bytes per page of system memory).
* because of the modulo operation in vsid scramble.
*/
/*
* Max Va bits we support as of now is 68 bits. We want 19 bit
* context ID.
* Restrictions:
* GPU has restrictions of not able to access beyond 128TB
* (47 bit effective address). We also cannot do more than 20bit PID.
* For p4 and p5 which can only do 65 bit VA, we restrict our CONTEXT_BITS
* to 16 bits (ie, we can only have 2^16 pids at the same time).
*/
#define VA_BITS 68
#define CONTEXT_BITS 19
#define ESID_BITS 18
#define ESID_BITS_1T 6
#define ESID_BITS (VA_BITS - (SID_SHIFT + CONTEXT_BITS))
#define ESID_BITS_1T (VA_BITS - (SID_SHIFT_1T + CONTEXT_BITS))
#define ESID_BITS_MASK ((1 << ESID_BITS) - 1)
#define ESID_BITS_1T_MASK ((1 << ESID_BITS_1T) - 1)
@ -532,63 +542,70 @@ extern void slb_set_size(u16 size);
/*
* 256MB segment
* The proto-VSID space has 2^(CONTEX_BITS + ESID_BITS) - 1 segments
* available for user + kernel mapping. The top 4 contexts are used for
* kernel mapping. Each segment contains 2^28 bytes. Each
* context maps 2^46 bytes (64TB) so we can support 2^19-1 contexts
* (19 == 37 + 28 - 46).
* available for user + kernel mapping. VSID 0 is reserved as invalid, contexts
* 1-4 are used for kernel mapping. Each segment contains 2^28 bytes. Each
* context maps 2^49 bytes (512TB).
*
* We also need to avoid the last segment of the last context, because that
* would give a protovsid of 0x1fffffffff. That will result in a VSID 0
* because of the modulo operation in vsid scramble.
*/
#define MAX_USER_CONTEXT ((ASM_CONST(1) << CONTEXT_BITS) - 5)
#define MAX_USER_CONTEXT ((ASM_CONST(1) << CONTEXT_BITS) - 2)
#define MIN_USER_CONTEXT (5)
/* Would be nice to use KERNEL_REGION_ID here */
#define KERNEL_REGION_CONTEXT_OFFSET (0xc - 1)
/*
* For platforms that support on 65bit VA we limit the context bits
*/
#define MAX_USER_CONTEXT_65BIT_VA ((ASM_CONST(1) << (65 - (SID_SHIFT + ESID_BITS))) - 2)
/*
* This should be computed such that protovosid * vsid_mulitplier
* doesn't overflow 64 bits. It should also be co-prime to vsid_modulus
* doesn't overflow 64 bits. The vsid_mutliplier should also be
* co-prime to vsid_modulus. We also need to make sure that number
* of bits in multiplied result (dividend) is less than twice the number of
* protovsid bits for our modulus optmization to work.
*
* The below table shows the current values used.
* |-------+------------+----------------------+------------+-------------------|
* | | Prime Bits | proto VSID_BITS_65VA | Total Bits | 2* prot VSID_BITS |
* |-------+------------+----------------------+------------+-------------------|
* | 1T | 24 | 25 | 49 | 50 |
* |-------+------------+----------------------+------------+-------------------|
* | 256MB | 24 | 37 | 61 | 74 |
* |-------+------------+----------------------+------------+-------------------|
*
* |-------+------------+----------------------+------------+--------------------|
* | | Prime Bits | proto VSID_BITS_68VA | Total Bits | 2* proto VSID_BITS |
* |-------+------------+----------------------+------------+--------------------|
* | 1T | 24 | 28 | 52 | 56 |
* |-------+------------+----------------------+------------+--------------------|
* | 256MB | 24 | 40 | 64 | 80 |
* |-------+------------+----------------------+------------+--------------------|
*
*/
#define VSID_MULTIPLIER_256M ASM_CONST(12538073) /* 24-bit prime */
#define VSID_BITS_256M (CONTEXT_BITS + ESID_BITS)
#define VSID_MODULUS_256M ((1UL<<VSID_BITS_256M)-1)
#define VSID_BITS_256M (VA_BITS - SID_SHIFT)
#define VSID_BITS_65_256M (65 - SID_SHIFT)
/*
* Modular multiplicative inverse of VSID_MULTIPLIER under modulo VSID_MODULUS
*/
#define VSID_MULINV_256M ASM_CONST(665548017062)
#define VSID_MULTIPLIER_1T ASM_CONST(12538073) /* 24-bit prime */
#define VSID_BITS_1T (CONTEXT_BITS + ESID_BITS_1T)
#define VSID_MODULUS_1T ((1UL<<VSID_BITS_1T)-1)
#define VSID_BITS_1T (VA_BITS - SID_SHIFT_1T)
#define VSID_BITS_65_1T (65 - SID_SHIFT_1T)
#define VSID_MULINV_1T ASM_CONST(209034062)
/* 1TB VSID reserved for VRMA */
#define VRMA_VSID 0x1ffffffUL
#define USER_VSID_RANGE (1UL << (ESID_BITS + SID_SHIFT))
/*
* This macro generates asm code to compute the VSID scramble
* function. Used in slb_allocate() and do_stab_bolted. The function
* computed is: (protovsid*VSID_MULTIPLIER) % VSID_MODULUS
*
* rt = register containing the proto-VSID and into which the
* VSID will be stored
* rx = scratch register (clobbered)
*
* - rt and rx must be different registers
* - The answer will end up in the low VSID_BITS bits of rt. The higher
* bits may contain other garbage, so you may need to mask the
* result.
*/
#define ASM_VSID_SCRAMBLE(rt, rx, size) \
lis rx,VSID_MULTIPLIER_##size@h; \
ori rx,rx,VSID_MULTIPLIER_##size@l; \
mulld rt,rt,rx; /* rt = rt * MULTIPLIER */ \
\
srdi rx,rt,VSID_BITS_##size; \
clrldi rt,rt,(64-VSID_BITS_##size); \
add rt,rt,rx; /* add high and low bits */ \
/* NOTE: explanation based on VSID_BITS_##size = 36 \
* Now, r3 == VSID (mod 2^36-1), and lies between 0 and \
* 2^36-1+2^28-1. That in particular means that if r3 >= \
* 2^36-1, then r3+1 has the 2^36 bit set. So, if r3+1 has \
* the bit clear, r3 already has the answer we want, if it \
* doesn't, the answer is the low 36 bits of r3+1. So in all \
* cases the answer is the low 36 bits of (r3 + ((r3+1) >> 36))*/\
addi rx,rt,1; \
srdi rx,rx,VSID_BITS_##size; /* extract 2^VSID_BITS bit */ \
add rt,rt,rx
/* 4 bits per slice and we have one slice per 1TB */
#define SLICE_ARRAY_SIZE (H_PGTABLE_RANGE >> 41)
#define SLICE_ARRAY_SIZE (H_PGTABLE_RANGE >> 41)
#define TASK_SLICE_ARRAY_SZ(x) ((x)->context.addr_limit >> 41)
#ifndef __ASSEMBLY__
@ -634,7 +651,7 @@ static inline void subpage_prot_init_new_context(struct mm_struct *mm) { }
#define vsid_scramble(protovsid, size) \
((((protovsid) * VSID_MULTIPLIER_##size) % VSID_MODULUS_##size))
#else /* 1 */
/* simplified form avoiding mod operation */
#define vsid_scramble(protovsid, size) \
({ \
unsigned long x; \
@ -642,6 +659,21 @@ static inline void subpage_prot_init_new_context(struct mm_struct *mm) { }
x = (x >> VSID_BITS_##size) + (x & VSID_MODULUS_##size); \
(x + ((x+1) >> VSID_BITS_##size)) & VSID_MODULUS_##size; \
})
#else /* 1 */
static inline unsigned long vsid_scramble(unsigned long protovsid,
unsigned long vsid_multiplier, int vsid_bits)
{
unsigned long vsid;
unsigned long vsid_modulus = ((1UL << vsid_bits) - 1);
/*
* We have same multipler for both 256 and 1T segements now
*/
vsid = protovsid * vsid_multiplier;
vsid = (vsid >> vsid_bits) + (vsid & vsid_modulus);
return (vsid + ((vsid + 1) >> vsid_bits)) & vsid_modulus;
}
#endif /* 1 */
/* Returns the segment size indicator for a user address */
@ -656,36 +688,56 @@ static inline int user_segment_size(unsigned long addr)
static inline unsigned long get_vsid(unsigned long context, unsigned long ea,
int ssize)
{
unsigned long va_bits = VA_BITS;
unsigned long vsid_bits;
unsigned long protovsid;
/*
* Bad address. We return VSID 0 for that
*/
if ((ea & ~REGION_MASK) >= H_PGTABLE_RANGE)
return 0;
if (ssize == MMU_SEGSIZE_256M)
return vsid_scramble((context << ESID_BITS)
| ((ea >> SID_SHIFT) & ESID_BITS_MASK), 256M);
return vsid_scramble((context << ESID_BITS_1T)
| ((ea >> SID_SHIFT_1T) & ESID_BITS_1T_MASK), 1T);
if (!mmu_has_feature(MMU_FTR_68_BIT_VA))
va_bits = 65;
if (ssize == MMU_SEGSIZE_256M) {
vsid_bits = va_bits - SID_SHIFT;
protovsid = (context << ESID_BITS) |
((ea >> SID_SHIFT) & ESID_BITS_MASK);
return vsid_scramble(protovsid, VSID_MULTIPLIER_256M, vsid_bits);
}
/* 1T segment */
vsid_bits = va_bits - SID_SHIFT_1T;
protovsid = (context << ESID_BITS_1T) |
((ea >> SID_SHIFT_1T) & ESID_BITS_1T_MASK);
return vsid_scramble(protovsid, VSID_MULTIPLIER_1T, vsid_bits);
}
/*
* This is only valid for addresses >= PAGE_OFFSET
*
* For kernel space, we use the top 4 context ids to map address as below
* 0x7fffc - [ 0xc000000000000000 - 0xc0003fffffffffff ]
* 0x7fffd - [ 0xd000000000000000 - 0xd0003fffffffffff ]
* 0x7fffe - [ 0xe000000000000000 - 0xe0003fffffffffff ]
* 0x7ffff - [ 0xf000000000000000 - 0xf0003fffffffffff ]
*/
static inline unsigned long get_kernel_vsid(unsigned long ea, int ssize)
{
unsigned long context;
if (!is_kernel_addr(ea))
return 0;
/*
* kernel take the top 4 context from the available range
* For kernel space, we use context ids 1-4 to map the address space as
* below:
*
* 0x00001 - [ 0xc000000000000000 - 0xc0003fffffffffff ]
* 0x00002 - [ 0xd000000000000000 - 0xd0003fffffffffff ]
* 0x00003 - [ 0xe000000000000000 - 0xe0003fffffffffff ]
* 0x00004 - [ 0xf000000000000000 - 0xf0003fffffffffff ]
*
* So we can compute the context from the region (top nibble) by
* subtracting 11, or 0xc - 1.
*/
context = (MAX_USER_CONTEXT) + ((ea >> 60) - 0xc) + 1;
context = (ea >> 60) - KERNEL_REGION_CONTEXT_OFFSET;
return get_vsid(context, ea, ssize);
}

View File

@ -65,6 +65,8 @@ extern struct patb_entry *partition_tb;
* MAX_USER_CONTEXT * 16 bytes of space.
*/
#define PRTB_SIZE_SHIFT (CONTEXT_BITS + 4)
#define PRTB_ENTRIES (1ul << CONTEXT_BITS)
/*
* Power9 currently only support 64K partition table size.
*/
@ -73,13 +75,20 @@ extern struct patb_entry *partition_tb;
typedef unsigned long mm_context_id_t;
struct spinlock;
/* Maximum possible number of NPUs in a system. */
#define NV_MAX_NPUS 8
typedef struct {
mm_context_id_t id;
u16 user_psize; /* page size index */
/* NPU NMMU context */
struct npu_context *npu_context;
#ifdef CONFIG_PPC_MM_SLICES
u64 low_slices_psize; /* SLB page size encodings */
unsigned char high_slices_psize[SLICE_ARRAY_SIZE];
unsigned long addr_limit;
#else
u16 sllp; /* SLB page size encoding */
#endif

View File

@ -13,6 +13,7 @@
#define _PAGE_BIT_SWAP_TYPE 0
#define _PAGE_RO 0
#define _PAGE_SHARED 0
#define _PAGE_EXEC 0x00001 /* execute permission */
#define _PAGE_WRITE 0x00002 /* write access allowed */
@ -37,21 +38,47 @@
#define _RPAGE_RSV3 0x0400000000000000UL
#define _RPAGE_RSV4 0x0200000000000000UL
#ifdef CONFIG_MEM_SOFT_DIRTY
#define _PAGE_SOFT_DIRTY _RPAGE_SW3 /* software: software dirty tracking */
#else
#define _PAGE_SOFT_DIRTY 0x00000
#endif
#define _PAGE_SPECIAL _RPAGE_SW2 /* software: special page */
#define _PAGE_PTE 0x4000000000000000UL /* distinguishes PTEs from pointers */
#define _PAGE_PRESENT 0x8000000000000000UL /* pte contains a translation */
/*
* For P9 DD1 only, we need to track whether the pte's huge.
* Top and bottom bits of RPN which can be used by hash
* translation mode, because we expect them to be zero
* otherwise.
*/
#define _PAGE_LARGE _RPAGE_RSV1
#define _RPAGE_RPN0 0x01000
#define _RPAGE_RPN1 0x02000
#define _RPAGE_RPN44 0x0100000000000000UL
#define _RPAGE_RPN43 0x0080000000000000UL
#define _RPAGE_RPN42 0x0040000000000000UL
#define _RPAGE_RPN41 0x0020000000000000UL
/* Max physical address bit as per radix table */
#define _RPAGE_PA_MAX 57
#define _PAGE_PTE (1ul << 62) /* distinguishes PTEs from pointers */
#define _PAGE_PRESENT (1ul << 63) /* pte contains a translation */
/*
* Max physical address bit we will use for now.
*
* This is mostly a hardware limitation and for now Power9 has
* a 51 bit limit.
*
* This is different from the number of physical bit required to address
* the last byte of memory. That is defined by MAX_PHYSMEM_BITS.
* MAX_PHYSMEM_BITS is a linux limitation imposed by the maximum
* number of sections we can support (SECTIONS_SHIFT).
*
* This is different from Radix page table limitation above and
* should always be less than that. The limit is done such that
* we can overload the bits between _RPAGE_PA_MAX and _PAGE_PA_MAX
* for hash linux page table specific bits.
*
* In order to be compatible with future hardware generations we keep
* some offsets and limit this for now to 53
*/
#define _PAGE_PA_MAX 53
#define _PAGE_SOFT_DIRTY _RPAGE_SW3 /* software: software dirty tracking */
#define _PAGE_SPECIAL _RPAGE_SW2 /* software: special page */
/*
* Drivers request for cache inhibited pte mapping using _PAGE_NO_CACHE
* Instead of fixing all of them, add an alternate define which
@ -59,10 +86,11 @@
*/
#define _PAGE_NO_CACHE _PAGE_TOLERANT
/*
* We support 57 bit real address in pte. Clear everything above 57, and
* every thing below PAGE_SHIFT;
* We support _RPAGE_PA_MAX bit real address in pte. On the linux side
* we are limited by _PAGE_PA_MAX. Clear everything above _PAGE_PA_MAX
* and every thing below PAGE_SHIFT;
*/
#define PTE_RPN_MASK (((1UL << 57) - 1) & (PAGE_MASK))
#define PTE_RPN_MASK (((1UL << _PAGE_PA_MAX) - 1) & (PAGE_MASK))
/*
* set of bits not changed in pmd_modify. Even though we have hash specific bits
* in here, on radix we expect them to be zero.
@ -205,10 +233,6 @@ extern unsigned long __pte_frag_nr;
extern unsigned long __pte_frag_size_shift;
#define PTE_FRAG_SIZE_SHIFT __pte_frag_size_shift
#define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT)
/*
* Pgtable size used by swapper, init in asm code
*/
#define MAX_PGD_TABLE_SIZE (sizeof(pgd_t) << RADIX_PGD_INDEX_SIZE)
#define PTRS_PER_PTE (1 << PTE_INDEX_SIZE)
#define PTRS_PER_PMD (1 << PMD_INDEX_SIZE)

View File

@ -11,6 +11,12 @@
#include <asm/book3s/64/radix-4k.h>
#endif
/*
* For P9 DD1 only, we need to track whether the pte's huge.
*/
#define R_PAGE_LARGE _RPAGE_RSV1
#ifndef __ASSEMBLY__
#include <asm/book3s/64/tlbflush-radix.h>
#include <asm/cpu_has_feature.h>
@ -252,7 +258,7 @@ static inline int radix__pmd_trans_huge(pmd_t pmd)
static inline pmd_t radix__pmd_mkhuge(pmd_t pmd)
{
if (cpu_has_feature(CPU_FTR_POWER9_DD1))
return __pmd(pmd_val(pmd) | _PAGE_PTE | _PAGE_LARGE);
return __pmd(pmd_val(pmd) | _PAGE_PTE | R_PAGE_LARGE);
return __pmd(pmd_val(pmd) | _PAGE_PTE);
}
static inline void radix__pmdp_huge_split_prepare(struct vm_area_struct *vma,

View File

@ -12,6 +12,8 @@
#include <asm/types.h>
#include <asm/ppc-opcode.h>
#include <linux/string.h>
#include <linux/kallsyms.h>
/* Flags for create_branch:
* "b" == create_branch(addr, target, 0);
@ -99,6 +101,45 @@ static inline unsigned long ppc_global_function_entry(void *func)
#endif
}
/*
* Wrapper around kallsyms_lookup() to return function entry address:
* - For ABIv1, we lookup the dot variant.
* - For ABIv2, we return the local entry point.
*/
static inline unsigned long ppc_kallsyms_lookup_name(const char *name)
{
unsigned long addr;
#ifdef PPC64_ELF_ABI_v1
/* check for dot variant */
char dot_name[1 + KSYM_NAME_LEN];
bool dot_appended = false;
if (strnlen(name, KSYM_NAME_LEN) >= KSYM_NAME_LEN)
return 0;
if (name[0] != '.') {
dot_name[0] = '.';
dot_name[1] = '\0';
strlcat(dot_name, name, sizeof(dot_name));
dot_appended = true;
} else {
dot_name[0] = '\0';
strlcat(dot_name, name, sizeof(dot_name));
}
addr = kallsyms_lookup_name(dot_name);
if (!addr && dot_appended)
/* Let's try the original non-dot symbol lookup */
addr = kallsyms_lookup_name(name);
#elif defined(PPC64_ELF_ABI_v2)
addr = kallsyms_lookup_name(name);
if (addr)
addr = ppc_function_entry((void *)addr);
#else
addr = kallsyms_lookup_name(name);
#endif
return addr;
}
#ifdef CONFIG_PPC64
/*
* Some instruction encodings commonly used in dynamic ftracing

View File

@ -2,13 +2,39 @@
#define _ASM_POWERPC_CPUIDLE_H
#ifdef CONFIG_PPC_POWERNV
/* Used in powernv idle state management */
/* Thread state used in powernv idle state management */
#define PNV_THREAD_RUNNING 0
#define PNV_THREAD_NAP 1
#define PNV_THREAD_SLEEP 2
#define PNV_THREAD_WINKLE 3
#define PNV_CORE_IDLE_LOCK_BIT 0x100
#define PNV_CORE_IDLE_THREAD_BITS 0x0FF
/*
* Core state used in powernv idle for POWER8.
*
* The lock bit synchronizes updates to the state, as well as parts of the
* sleep/wake code (see kernel/idle_book3s.S).
*
* Bottom 8 bits track the idle state of each thread. Bit is cleared before
* the thread executes an idle instruction (nap/sleep/winkle).
*
* Then there is winkle tracking. A core does not lose complete state
* until every thread is in winkle. So the winkle count field counts the
* number of threads in winkle (small window of false positives is okay
* around the sleep/wake, so long as there are no false negatives).
*
* When the winkle count reaches 8 (the COUNT_ALL_BIT becomes set), then
* the THREAD_WINKLE_BITS are set, which indicate which threads have not
* yet woken from the winkle state.
*/
#define PNV_CORE_IDLE_LOCK_BIT 0x10000000
#define PNV_CORE_IDLE_WINKLE_COUNT 0x00010000
#define PNV_CORE_IDLE_WINKLE_COUNT_ALL_BIT 0x00080000
#define PNV_CORE_IDLE_WINKLE_COUNT_BITS 0x000F0000
#define PNV_CORE_IDLE_THREAD_WINKLE_BITS_SHIFT 8
#define PNV_CORE_IDLE_THREAD_WINKLE_BITS 0x0000FF00
#define PNV_CORE_IDLE_THREAD_BITS 0x000000FF
/*
* ============================ NOTE =================================
@ -46,6 +72,7 @@ extern u32 pnv_fastsleep_workaround_at_exit[];
extern u64 pnv_first_deep_stop_state;
unsigned long pnv_cpu_offline(unsigned int cpu);
int validate_psscr_val_mask(u64 *psscr_val, u64 *psscr_mask, u32 flags);
static inline void report_invalid_psscr_val(u64 psscr_val, int err)
{

View File

@ -471,10 +471,11 @@ enum {
CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
CPU_FTR_DSCR | CPU_FTR_SAO | \
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_ICSWX | CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_300)
#define CPU_FTRS_POWER9_DD1 (CPU_FTRS_POWER9 | CPU_FTR_POWER9_DD1)
#define CPU_FTRS_POWER9_DD1 ((CPU_FTRS_POWER9 | CPU_FTR_POWER9_DD1) & \
(~CPU_FTR_SAO))
#define CPU_FTRS_CELL (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \

View File

@ -35,33 +35,53 @@ enum ppc_dbell {
#ifdef CONFIG_PPC_BOOK3S
#define PPC_DBELL_MSGTYPE PPC_DBELL_SERVER
#define SPRN_DOORBELL_CPUTAG SPRN_TIR
#define PPC_DBELL_TAG_MASK 0x7f
static inline void _ppc_msgsnd(u32 msg)
{
if (cpu_has_feature(CPU_FTR_HVMODE))
__asm__ __volatile__ (PPC_MSGSND(%0) : : "r" (msg));
else
__asm__ __volatile__ (PPC_MSGSNDP(%0) : : "r" (msg));
__asm__ __volatile__ (ASM_FTR_IFSET(PPC_MSGSND(%1), PPC_MSGSNDP(%1), %0)
: : "i" (CPU_FTR_HVMODE), "r" (msg));
}
/* sync before sending message */
static inline void ppc_msgsnd_sync(void)
{
__asm__ __volatile__ ("sync" : : : "memory");
}
/* sync after taking message interrupt */
static inline void ppc_msgsync(void)
{
/* sync is not required when taking messages from the same core */
__asm__ __volatile__ (ASM_FTR_IFSET(PPC_MSGSYNC " ; lwsync", "", %0)
: : "i" (CPU_FTR_HVMODE|CPU_FTR_ARCH_300));
}
#else /* CONFIG_PPC_BOOK3S */
#define PPC_DBELL_MSGTYPE PPC_DBELL
#define SPRN_DOORBELL_CPUTAG SPRN_PIR
#define PPC_DBELL_TAG_MASK 0x3fff
static inline void _ppc_msgsnd(u32 msg)
{
__asm__ __volatile__ (PPC_MSGSND(%0) : : "r" (msg));
}
/* sync before sending message */
static inline void ppc_msgsnd_sync(void)
{
__asm__ __volatile__ ("sync" : : : "memory");
}
/* sync after taking message interrupt */
static inline void ppc_msgsync(void)
{
}
#endif /* CONFIG_PPC_BOOK3S */
extern void doorbell_cause_ipi(int cpu, unsigned long data);
extern void doorbell_global_ipi(int cpu);
extern void doorbell_core_ipi(int cpu);
extern int doorbell_try_core_ipi(int cpu);
extern void doorbell_exception(struct pt_regs *regs);
extern void doorbell_setup_this_cpu(void);
static inline void ppc_msgsnd(enum ppc_dbell type, u32 flags, u32 tag)
{

View File

@ -8,8 +8,6 @@
struct pt_regs;
extern struct dentry *powerpc_debugfs_root;
#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC_CORE)
extern int (*__debugger)(struct pt_regs *regs);

View File

@ -0,0 +1,17 @@
#ifndef _ASM_POWERPC_DEBUGFS_H
#define _ASM_POWERPC_DEBUGFS_H
/*
* Copyright 2017, Michael Ellerman, IBM Corporation.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
#include <linux/debugfs.h>
extern struct dentry *powerpc_debugfs_root;
#endif /* _ASM_POWERPC_DEBUGFS_H */

View File

@ -167,17 +167,14 @@ BEGIN_FTR_SECTION_NESTED(943) \
std ra,offset(r13); \
END_FTR_SECTION_NESTED(ftr,ftr,943)
#define EXCEPTION_PROLOG_0_PACA(area) \
#define EXCEPTION_PROLOG_0(area) \
GET_PACA(r13); \
std r9,area+EX_R9(r13); /* save r9 */ \
OPT_GET_SPR(r9, SPRN_PPR, CPU_FTR_HAS_PPR); \
HMT_MEDIUM; \
std r10,area+EX_R10(r13); /* save r10 - r12 */ \
OPT_GET_SPR(r10, SPRN_CFAR, CPU_FTR_CFAR)
#define EXCEPTION_PROLOG_0(area) \
GET_PACA(r13); \
EXCEPTION_PROLOG_0_PACA(area)
#define __EXCEPTION_PROLOG_1(area, extra, vec) \
OPT_SAVE_REG_TO_PACA(area+EX_PPR, r9, CPU_FTR_HAS_PPR); \
OPT_SAVE_REG_TO_PACA(area+EX_CFAR, r10, CPU_FTR_CFAR); \
@ -203,17 +200,26 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
#define EXCEPTION_PROLOG_PSERIES_1(label, h) \
__EXCEPTION_PROLOG_PSERIES_1(label, h)
/* _NORI variant keeps MSR_RI clear */
#define __EXCEPTION_PROLOG_PSERIES_1_NORI(label, h) \
ld r10,PACAKMSR(r13); /* get MSR value for kernel */ \
xori r10,r10,MSR_RI; /* Clear MSR_RI */ \
mfspr r11,SPRN_##h##SRR0; /* save SRR0 */ \
LOAD_HANDLER(r12,label) \
mtspr SPRN_##h##SRR0,r12; \
mfspr r12,SPRN_##h##SRR1; /* and SRR1 */ \
mtspr SPRN_##h##SRR1,r10; \
h##rfid; \
b . /* prevent speculative execution */
#define EXCEPTION_PROLOG_PSERIES_1_NORI(label, h) \
__EXCEPTION_PROLOG_PSERIES_1_NORI(label, h)
#define EXCEPTION_PROLOG_PSERIES(area, label, h, extra, vec) \
EXCEPTION_PROLOG_0(area); \
EXCEPTION_PROLOG_1(area, extra, vec); \
EXCEPTION_PROLOG_PSERIES_1(label, h);
/* Have the PACA in r13 already */
#define EXCEPTION_PROLOG_PSERIES_PACA(area, label, h, extra, vec) \
EXCEPTION_PROLOG_0_PACA(area); \
EXCEPTION_PROLOG_1(area, extra, vec); \
EXCEPTION_PROLOG_PSERIES_1(label, h);
#define __KVMTEST(h, n) \
lbz r10,HSTATE_IN_GUEST(r13); \
cmpwi r10,0; \
@ -256,11 +262,6 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
ld r9,area+EX_R9(r13); \
bctr
#define BRANCH_TO_KVM(reg, label) \
__LOAD_FAR_HANDLER(reg, label); \
mtctr reg; \
bctr
#else
#define BRANCH_TO_COMMON(reg, label) \
b label
@ -268,15 +269,18 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
#define BRANCH_LINK_TO_FAR(label) \
bl label
#define BRANCH_TO_KVM(reg, label) \
b label
#define __BRANCH_TO_KVM_EXIT(area, label) \
ld r9,area+EX_R9(r13); \
b label
#endif
/* Do not enable RI */
#define EXCEPTION_PROLOG_PSERIES_NORI(area, label, h, extra, vec) \
EXCEPTION_PROLOG_0(area); \
EXCEPTION_PROLOG_1(area, extra, vec); \
EXCEPTION_PROLOG_PSERIES_1_NORI(label, h);
#define __KVM_HANDLER(area, h, n) \
BEGIN_FTR_SECTION_NESTED(947) \
@ -325,6 +329,15 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
#define NOTEST(n)
#define EXCEPTION_PROLOG_COMMON_1() \
std r9,_CCR(r1); /* save CR in stackframe */ \
std r11,_NIP(r1); /* save SRR0 in stackframe */ \
std r12,_MSR(r1); /* save SRR1 in stackframe */ \
std r10,0(r1); /* make stack chain pointer */ \
std r0,GPR0(r1); /* save r0 in stackframe */ \
std r10,GPR1(r1); /* save r1 in stackframe */ \
/*
* The common exception prolog is used for all except a few exceptions
* such as a segment miss on a kernel address. We have to be prepared
@ -349,12 +362,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
addi r3,r13,area; /* r3 -> where regs are saved*/ \
RESTORE_CTR(r1, area); \
b bad_stack; \
3: std r9,_CCR(r1); /* save CR in stackframe */ \
std r11,_NIP(r1); /* save SRR0 in stackframe */ \
std r12,_MSR(r1); /* save SRR1 in stackframe */ \
std r10,0(r1); /* make stack chain pointer */ \
std r0,GPR0(r1); /* save r0 in stackframe */ \
std r10,GPR1(r1); /* save r1 in stackframe */ \
3: EXCEPTION_PROLOG_COMMON_1(); \
beq 4f; /* if from kernel mode */ \
ACCOUNT_CPU_USER_ENTRY(r13, r9, r10); \
SAVE_PPR(area, r9, r10); \
@ -522,7 +530,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
#define MASKABLE_RELON_EXCEPTION_HV_OOL(vec, label) \
EXCEPTION_PROLOG_1(PACA_EXGEN, SOFTEN_TEST_HV, vec); \
EXCEPTION_PROLOG_PSERIES_1(label, EXC_HV)
EXCEPTION_RELON_PROLOG_PSERIES_1(label, EXC_HV)
/*
* Our exception common code can be passed various "additions"
@ -547,26 +555,39 @@ BEGIN_FTR_SECTION \
beql ppc64_runlatch_on_trampoline; \
END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
#define EXCEPTION_COMMON(trap, label, hdlr, ret, additions) \
EXCEPTION_PROLOG_COMMON(trap, PACA_EXGEN); \
#define EXCEPTION_COMMON(area, trap, label, hdlr, ret, additions) \
EXCEPTION_PROLOG_COMMON(trap, area); \
/* Volatile regs are potentially clobbered here */ \
additions; \
addi r3,r1,STACK_FRAME_OVERHEAD; \
bl hdlr; \
b ret
/*
* Exception where stack is already set in r1, r1 is saved in r10, and it
* continues rather than returns.
*/
#define EXCEPTION_COMMON_NORET_STACK(area, trap, label, hdlr, additions) \
EXCEPTION_PROLOG_COMMON_1(); \
EXCEPTION_PROLOG_COMMON_2(area); \
EXCEPTION_PROLOG_COMMON_3(trap); \
/* Volatile regs are potentially clobbered here */ \
additions; \
addi r3,r1,STACK_FRAME_OVERHEAD; \
bl hdlr
#define STD_EXCEPTION_COMMON(trap, label, hdlr) \
EXCEPTION_COMMON(trap, label, hdlr, ret_from_except, \
ADD_NVGPRS;ADD_RECONCILE)
EXCEPTION_COMMON(PACA_EXGEN, trap, label, hdlr, \
ret_from_except, ADD_NVGPRS;ADD_RECONCILE)
/*
* Like STD_EXCEPTION_COMMON, but for exceptions that can occur
* in the idle task and therefore need the special idle handling
* (finish nap and runlatch)
*/
#define STD_EXCEPTION_COMMON_ASYNC(trap, label, hdlr) \
EXCEPTION_COMMON(trap, label, hdlr, ret_from_except_lite, \
FINISH_NAP;ADD_RECONCILE;RUNLATCH_ON)
#define STD_EXCEPTION_COMMON_ASYNC(trap, label, hdlr) \
EXCEPTION_COMMON(PACA_EXGEN, trap, label, hdlr, \
ret_from_except_lite, FINISH_NAP;ADD_RECONCILE;RUNLATCH_ON)
/*
* When the idle code in power4_idle puts the CPU into NAP mode,

View File

@ -66,6 +66,9 @@ label##5: \
#define END_FTR_SECTION(msk, val) \
END_FTR_SECTION_NESTED(msk, val, 97)
#define END_FTR_SECTION_NESTED_IFSET(msk, label) \
END_FTR_SECTION_NESTED((msk), (msk), label)
#define END_FTR_SECTION_IFSET(msk) END_FTR_SECTION((msk), (msk))
#define END_FTR_SECTION_IFCLR(msk) END_FTR_SECTION((msk), 0)

View File

@ -213,6 +213,7 @@ end_##sname:
USE_TEXT_SECTION(); \
.balign IFETCH_ALIGN_BYTES; \
.global name; \
_ASM_NOKPROBE_SYMBOL(name); \
DEFINE_FIXED_SYMBOL(name); \
name:

View File

@ -377,16 +377,6 @@ long plpar_hcall_raw(unsigned long opcode, unsigned long *retbuf, ...);
long plpar_hcall9(unsigned long opcode, unsigned long *retbuf, ...);
long plpar_hcall9_raw(unsigned long opcode, unsigned long *retbuf, ...);
/* For hcall instrumentation. One structure per-hcall, per-CPU */
struct hcall_stats {
unsigned long num_calls; /* number of calls (on this CPU) */
unsigned long tb_total; /* total wall time (mftb) of calls. */
unsigned long purr_total; /* total cpu time (PURR) of calls. */
unsigned long tb_start;
unsigned long purr_start;
};
#define HCALL_STAT_ARRAY_SIZE ((MAX_HCALL_OPCODE >> 2) + 1)
struct hvcall_mpp_data {
unsigned long entitled_mem;
unsigned long mapped_mem;

View File

@ -25,8 +25,6 @@ extern struct pci_dev *isa_bridge_pcidev;
#endif
#include <linux/device.h>
#include <linux/io.h>
#include <linux/compiler.h>
#include <asm/page.h>
#include <asm/byteorder.h>
@ -192,24 +190,8 @@ DEF_MMIO_OUT_D(out_le32, 32, stw);
#endif /* __BIG_ENDIAN */
/*
* Cache inhibitied accessors for use in real mode, you don't want to use these
* unless you know what you're doing.
*
* NB. These use the cpu byte ordering.
*/
DEF_MMIO_OUT_X(out_rm8, 8, stbcix);
DEF_MMIO_OUT_X(out_rm16, 16, sthcix);
DEF_MMIO_OUT_X(out_rm32, 32, stwcix);
DEF_MMIO_IN_X(in_rm8, 8, lbzcix);
DEF_MMIO_IN_X(in_rm16, 16, lhzcix);
DEF_MMIO_IN_X(in_rm32, 32, lwzcix);
#ifdef __powerpc64__
DEF_MMIO_OUT_X(out_rm64, 64, stdcix);
DEF_MMIO_IN_X(in_rm64, 64, ldcix);
#ifdef __BIG_ENDIAN__
DEF_MMIO_OUT_D(out_be64, 64, std);
DEF_MMIO_IN_D(in_be64, 64, ld);
@ -242,35 +224,6 @@ static inline void out_be64(volatile u64 __iomem *addr, u64 val)
#endif
#endif /* __powerpc64__ */
/*
* Simple Cache inhibited accessors
* Unlike the DEF_MMIO_* macros, these don't include any h/w memory
* barriers, callers need to manage memory barriers on their own.
* These can only be used in hypervisor real mode.
*/
static inline u32 _lwzcix(unsigned long addr)
{
u32 ret;
__asm__ __volatile__("lwzcix %0,0, %1"
: "=r" (ret) : "r" (addr) : "memory");
return ret;
}
static inline void _stbcix(u64 addr, u8 val)
{
__asm__ __volatile__("stbcix %0,0,%1"
: : "r" (val), "r" (addr) : "memory");
}
static inline void _stwcix(u64 addr, u32 val)
{
__asm__ __volatile__("stwcix %0,0,%1"
: : "r" (val), "r" (addr) : "memory");
}
/*
* Low level IO stream instructions are defined out of line for now
*/
@ -417,15 +370,64 @@ static inline void __raw_writeq(unsigned long v, volatile void __iomem *addr)
}
/*
* Real mode version of the above. stdcix is only supposed to be used
* in hypervisor real mode as per the architecture spec.
* Real mode versions of the above. Those instructions are only supposed
* to be used in hypervisor real mode as per the architecture spec.
*/
static inline void __raw_rm_writeb(u8 val, volatile void __iomem *paddr)
{
__asm__ __volatile__("stbcix %0,0,%1"
: : "r" (val), "r" (paddr) : "memory");
}
static inline void __raw_rm_writew(u16 val, volatile void __iomem *paddr)
{
__asm__ __volatile__("sthcix %0,0,%1"
: : "r" (val), "r" (paddr) : "memory");
}
static inline void __raw_rm_writel(u32 val, volatile void __iomem *paddr)
{
__asm__ __volatile__("stwcix %0,0,%1"
: : "r" (val), "r" (paddr) : "memory");
}
static inline void __raw_rm_writeq(u64 val, volatile void __iomem *paddr)
{
__asm__ __volatile__("stdcix %0,0,%1"
: : "r" (val), "r" (paddr) : "memory");
}
static inline u8 __raw_rm_readb(volatile void __iomem *paddr)
{
u8 ret;
__asm__ __volatile__("lbzcix %0,0, %1"
: "=r" (ret) : "r" (paddr) : "memory");
return ret;
}
static inline u16 __raw_rm_readw(volatile void __iomem *paddr)
{
u16 ret;
__asm__ __volatile__("lhzcix %0,0, %1"
: "=r" (ret) : "r" (paddr) : "memory");
return ret;
}
static inline u32 __raw_rm_readl(volatile void __iomem *paddr)
{
u32 ret;
__asm__ __volatile__("lwzcix %0,0, %1"
: "=r" (ret) : "r" (paddr) : "memory");
return ret;
}
static inline u64 __raw_rm_readq(volatile void __iomem *paddr)
{
u64 ret;
__asm__ __volatile__("ldcix %0,0, %1"
: "=r" (ret) : "r" (paddr) : "memory");
return ret;
}
#endif /* __powerpc64__ */
/*
@ -757,6 +759,8 @@ extern void __iomem *ioremap_prot(phys_addr_t address, unsigned long size,
extern void __iomem *ioremap_wc(phys_addr_t address, unsigned long size);
#define ioremap_nocache(addr, size) ioremap((addr), (size))
#define ioremap_uc(addr, size) ioremap((addr), (size))
#define ioremap_cache(addr, size) \
ioremap_prot((addr), (size), pgprot_val(PAGE_KERNEL))
extern void iounmap(volatile void __iomem *addr);

View File

@ -64,6 +64,11 @@ struct iommu_table_ops {
long index,
unsigned long *hpa,
enum dma_data_direction *direction);
/* Real mode */
int (*exchange_rm)(struct iommu_table *tbl,
long index,
unsigned long *hpa,
enum dma_data_direction *direction);
#endif
void (*clear)(struct iommu_table *tbl,
long index, long npages);
@ -114,6 +119,7 @@ struct iommu_table {
struct list_head it_group_list;/* List of iommu_table_group_link */
unsigned long *it_userspace; /* userspace view of the table */
struct iommu_table_ops *it_ops;
struct kref it_kref;
};
#define IOMMU_TABLE_USERSPACE_ENTRY(tbl, entry) \
@ -146,8 +152,8 @@ static inline void *get_iommu_table_base(struct device *dev)
extern int dma_iommu_dma_supported(struct device *dev, u64 mask);
/* Frees table for an individual device node */
extern void iommu_free_table(struct iommu_table *tbl, const char *node_name);
extern struct iommu_table *iommu_tce_table_get(struct iommu_table *tbl);
extern int iommu_tce_table_put(struct iommu_table *tbl);
/* Initializes an iommu_table based in values set in the passed-in
* structure
@ -208,6 +214,8 @@ extern void iommu_del_device(struct device *dev);
extern int __init tce_iommu_bus_notifier_init(void);
extern long iommu_tce_xchg(struct iommu_table *tbl, unsigned long entry,
unsigned long *hpa, enum dma_data_direction *direction);
extern long iommu_tce_xchg_rm(struct iommu_table *tbl, unsigned long entry,
unsigned long *hpa, enum dma_data_direction *direction);
#else
static inline void iommu_register_group(struct iommu_table_group *table_group,
int pci_domain_number,

View File

@ -61,59 +61,6 @@ extern kprobe_opcode_t optprobe_template_end[];
#define MAX_OPTINSN_SIZE (optprobe_template_end - optprobe_template_entry)
#define RELATIVEJUMP_SIZE sizeof(kprobe_opcode_t) /* 4 bytes */
#ifdef PPC64_ELF_ABI_v2
/* PPC64 ABIv2 needs local entry point */
#define kprobe_lookup_name(name, addr) \
{ \
addr = (kprobe_opcode_t *)kallsyms_lookup_name(name); \
if (addr) \
addr = (kprobe_opcode_t *)ppc_function_entry(addr); \
}
#elif defined(PPC64_ELF_ABI_v1)
/*
* 64bit powerpc ABIv1 uses function descriptors:
* - Check for the dot variant of the symbol first.
* - If that fails, try looking up the symbol provided.
*
* This ensures we always get to the actual symbol and not the descriptor.
* Also handle <module:symbol> format.
*/
#define kprobe_lookup_name(name, addr) \
{ \
char dot_name[MODULE_NAME_LEN + 1 + KSYM_NAME_LEN]; \
const char *modsym; \
bool dot_appended = false; \
if ((modsym = strchr(name, ':')) != NULL) { \
modsym++; \
if (*modsym != '\0' && *modsym != '.') { \
/* Convert to <module:.symbol> */ \
strncpy(dot_name, name, modsym - name); \
dot_name[modsym - name] = '.'; \
dot_name[modsym - name + 1] = '\0'; \
strncat(dot_name, modsym, \
sizeof(dot_name) - (modsym - name) - 2);\
dot_appended = true; \
} else { \
dot_name[0] = '\0'; \
strncat(dot_name, name, sizeof(dot_name) - 1); \
} \
} else if (name[0] != '.') { \
dot_name[0] = '.'; \
dot_name[1] = '\0'; \
strncat(dot_name, name, KSYM_NAME_LEN - 2); \
dot_appended = true; \
} else { \
dot_name[0] = '\0'; \
strncat(dot_name, name, KSYM_NAME_LEN - 1); \
} \
addr = (kprobe_opcode_t *)kallsyms_lookup_name(dot_name); \
if (!addr && dot_appended) { \
/* Let's try the original non-dot symbol lookup */ \
addr = (kprobe_opcode_t *)kallsyms_lookup_name(name); \
} \
}
#endif
#define flush_insn_slot(p) do { } while (0)
#define kretprobe_blacklist_size 0
@ -156,6 +103,16 @@ extern int kprobe_exceptions_notify(struct notifier_block *self,
extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
extern int kprobe_handler(struct pt_regs *regs);
extern int kprobe_post_handler(struct pt_regs *regs);
#ifdef CONFIG_KPROBES_ON_FTRACE
extern int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
struct kprobe_ctlblk *kcb);
#else
static inline int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
struct kprobe_ctlblk *kcb)
{
return 0;
}
#endif
#else
static inline int kprobe_handler(struct pt_regs *regs) { return 0; }
static inline int kprobe_post_handler(struct pt_regs *regs) { return 0; }

View File

@ -49,8 +49,6 @@ static inline bool kvm_is_radix(struct kvm *kvm)
#define KVM_DEFAULT_HPT_ORDER 24 /* 16MB HPT by default */
#endif
#define VRMA_VSID 0x1ffffffUL /* 1TB VSID reserved for VRMA */
/*
* We use a lock bit in HPTE dword 0 to synchronize updates and
* accesses to each HPTE, and another bit to indicate non-present

View File

@ -110,7 +110,7 @@ struct kvmppc_host_state {
u8 ptid;
struct kvm_vcpu *kvm_vcpu;
struct kvmppc_vcore *kvm_vcore;
unsigned long xics_phys;
void __iomem *xics_phys;
u32 saved_xirr;
u64 dabr;
u64 host_mmcr[7]; /* MMCR 0,1,A, SIAR, SDAR, MMCR2, SIER */

View File

@ -409,7 +409,7 @@ struct openpic;
extern void kvm_cma_reserve(void) __init;
static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
{
paca[cpu].kvm_hstate.xics_phys = addr;
paca[cpu].kvm_hstate.xics_phys = (void __iomem *)addr;
}
static inline u32 kvmppc_get_xics_latch(void)
@ -478,8 +478,6 @@ extern void kvmppc_free_host_rm_ops(void);
extern void kvmppc_free_pimap(struct kvm *kvm);
extern int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall);
extern void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu);
extern int kvmppc_xics_create_icp(struct kvm_vcpu *vcpu, unsigned long server);
extern int kvm_vm_ioctl_xics_irq(struct kvm *kvm, struct kvm_irq_level *args);
extern int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd);
extern u64 kvmppc_xics_get_icp(struct kvm_vcpu *vcpu);
extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval);
@ -507,12 +505,6 @@ static inline int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall)
static inline int kvmppc_xics_enabled(struct kvm_vcpu *vcpu)
{ return 0; }
static inline void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu) { }
static inline int kvmppc_xics_create_icp(struct kvm_vcpu *vcpu,
unsigned long server)
{ return -EINVAL; }
static inline int kvm_vm_ioctl_xics_irq(struct kvm *kvm,
struct kvm_irq_level *args)
{ return -ENOTTY; }
static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd)
{ return 0; }
#endif

View File

@ -24,97 +24,6 @@
#include <linux/bitops.h>
/*
* Machine Check bits on power7 and power8
*/
#define P7_SRR1_MC_LOADSTORE(srr1) ((srr1) & PPC_BIT(42)) /* P8 too */
/* SRR1 bits for machine check (On Power7 and Power8) */
#define P7_SRR1_MC_IFETCH(srr1) ((srr1) & PPC_BITMASK(43, 45)) /* P8 too */
#define P7_SRR1_MC_IFETCH_UE (0x1 << PPC_BITLSHIFT(45)) /* P8 too */
#define P7_SRR1_MC_IFETCH_SLB_PARITY (0x2 << PPC_BITLSHIFT(45)) /* P8 too */
#define P7_SRR1_MC_IFETCH_SLB_MULTIHIT (0x3 << PPC_BITLSHIFT(45)) /* P8 too */
#define P7_SRR1_MC_IFETCH_SLB_BOTH (0x4 << PPC_BITLSHIFT(45))
#define P7_SRR1_MC_IFETCH_TLB_MULTIHIT (0x5 << PPC_BITLSHIFT(45)) /* P8 too */
#define P7_SRR1_MC_IFETCH_UE_TLB_RELOAD (0x6 << PPC_BITLSHIFT(45)) /* P8 too */
#define P7_SRR1_MC_IFETCH_UE_IFU_INTERNAL (0x7 << PPC_BITLSHIFT(45))
/* SRR1 bits for machine check (On Power8) */
#define P8_SRR1_MC_IFETCH_ERAT_MULTIHIT (0x4 << PPC_BITLSHIFT(45))
/* DSISR bits for machine check (On Power7 and Power8) */
#define P7_DSISR_MC_UE (PPC_BIT(48)) /* P8 too */
#define P7_DSISR_MC_UE_TABLEWALK (PPC_BIT(49)) /* P8 too */
#define P7_DSISR_MC_ERAT_MULTIHIT (PPC_BIT(52)) /* P8 too */
#define P7_DSISR_MC_TLB_MULTIHIT_MFTLB (PPC_BIT(53)) /* P8 too */
#define P7_DSISR_MC_SLB_PARITY_MFSLB (PPC_BIT(55)) /* P8 too */
#define P7_DSISR_MC_SLB_MULTIHIT (PPC_BIT(56)) /* P8 too */
#define P7_DSISR_MC_SLB_MULTIHIT_PARITY (PPC_BIT(57)) /* P8 too */
/*
* DSISR bits for machine check (Power8) in addition to above.
* Secondary DERAT Multihit
*/
#define P8_DSISR_MC_ERAT_MULTIHIT_SEC (PPC_BIT(54))
/* SLB error bits */
#define P7_DSISR_MC_SLB_ERRORS (P7_DSISR_MC_ERAT_MULTIHIT | \
P7_DSISR_MC_SLB_PARITY_MFSLB | \
P7_DSISR_MC_SLB_MULTIHIT | \
P7_DSISR_MC_SLB_MULTIHIT_PARITY)
#define P8_DSISR_MC_SLB_ERRORS (P7_DSISR_MC_SLB_ERRORS | \
P8_DSISR_MC_ERAT_MULTIHIT_SEC)
/*
* Machine Check bits on power9
*/
#define P9_SRR1_MC_LOADSTORE(srr1) (((srr1) >> PPC_BITLSHIFT(42)) & 1)
#define P9_SRR1_MC_IFETCH(srr1) ( \
PPC_BITEXTRACT(srr1, 45, 0) | \
PPC_BITEXTRACT(srr1, 44, 1) | \
PPC_BITEXTRACT(srr1, 43, 2) | \
PPC_BITEXTRACT(srr1, 36, 3) )
/* 0 is reserved */
#define P9_SRR1_MC_IFETCH_UE 1
#define P9_SRR1_MC_IFETCH_SLB_PARITY 2
#define P9_SRR1_MC_IFETCH_SLB_MULTIHIT 3
#define P9_SRR1_MC_IFETCH_ERAT_MULTIHIT 4
#define P9_SRR1_MC_IFETCH_TLB_MULTIHIT 5
#define P9_SRR1_MC_IFETCH_UE_TLB_RELOAD 6
/* 7 is reserved */
#define P9_SRR1_MC_IFETCH_LINK_TIMEOUT 8
#define P9_SRR1_MC_IFETCH_LINK_TABLEWALK_TIMEOUT 9
/* 10 ? */
#define P9_SRR1_MC_IFETCH_RA 11
#define P9_SRR1_MC_IFETCH_RA_TABLEWALK 12
#define P9_SRR1_MC_IFETCH_RA_ASYNC_STORE 13
#define P9_SRR1_MC_IFETCH_LINK_ASYNC_STORE_TIMEOUT 14
#define P9_SRR1_MC_IFETCH_RA_TABLEWALK_FOREIGN 15
/* DSISR bits for machine check (On Power9) */
#define P9_DSISR_MC_UE (PPC_BIT(48))
#define P9_DSISR_MC_UE_TABLEWALK (PPC_BIT(49))
#define P9_DSISR_MC_LINK_LOAD_TIMEOUT (PPC_BIT(50))
#define P9_DSISR_MC_LINK_TABLEWALK_TIMEOUT (PPC_BIT(51))
#define P9_DSISR_MC_ERAT_MULTIHIT (PPC_BIT(52))
#define P9_DSISR_MC_TLB_MULTIHIT_MFTLB (PPC_BIT(53))
#define P9_DSISR_MC_USER_TLBIE (PPC_BIT(54))
#define P9_DSISR_MC_SLB_PARITY_MFSLB (PPC_BIT(55))
#define P9_DSISR_MC_SLB_MULTIHIT_MFSLB (PPC_BIT(56))
#define P9_DSISR_MC_RA_LOAD (PPC_BIT(57))
#define P9_DSISR_MC_RA_TABLEWALK (PPC_BIT(58))
#define P9_DSISR_MC_RA_TABLEWALK_FOREIGN (PPC_BIT(59))
#define P9_DSISR_MC_RA_FOREIGN (PPC_BIT(60))
/* SLB error bits */
#define P9_DSISR_MC_SLB_ERRORS (P9_DSISR_MC_ERAT_MULTIHIT | \
P9_DSISR_MC_SLB_PARITY_MFSLB | \
P9_DSISR_MC_SLB_MULTIHIT_MFSLB)
enum MCE_Version {
MCE_V1 = 1,
};
@ -298,7 +207,8 @@ extern void save_mce_event(struct pt_regs *regs, long handled,
extern int get_mce_event(struct machine_check_event *mce, bool release);
extern void release_mce_event(void);
extern void machine_check_queue_event(void);
extern void machine_check_print_event_info(struct machine_check_event *evt);
extern void machine_check_print_event_info(struct machine_check_event *evt,
bool user_mode);
extern uint64_t get_mce_fault_addr(struct machine_check_event *evt);
#endif /* __ASM_PPC64_MCE_H__ */

View File

@ -229,11 +229,6 @@ typedef struct {
unsigned int id;
unsigned int active;
unsigned long vdso_base;
#ifdef CONFIG_PPC_MM_SLICES
u64 low_slices_psize; /* SLB page size encodings */
u64 high_slices_psize; /* 4 bits per slice for now */
u16 user_psize; /* page size index */
#endif
#ifdef CONFIG_PPC_64K_PAGES
/* for 4K PTE fragment support */
void *pte_frag;

View File

@ -28,6 +28,10 @@
* Individual features below.
*/
/*
* Support for 68 bit VA space. We added that from ISA 2.05
*/
#define MMU_FTR_68_BIT_VA ASM_CONST(0x00002000)
/*
* Kernel read only support.
* We added the ppp value 0b110 in ISA 2.04.
@ -109,10 +113,10 @@
#define MMU_FTRS_POWER4 MMU_FTRS_DEFAULT_HPTE_ARCH_V2
#define MMU_FTRS_PPC970 MMU_FTRS_POWER4 | MMU_FTR_TLBIE_CROP_VA
#define MMU_FTRS_POWER5 MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE
#define MMU_FTRS_POWER6 MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE | MMU_FTR_KERNEL_RO
#define MMU_FTRS_POWER7 MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE | MMU_FTR_KERNEL_RO
#define MMU_FTRS_POWER8 MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE | MMU_FTR_KERNEL_RO
#define MMU_FTRS_POWER9 MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE | MMU_FTR_KERNEL_RO
#define MMU_FTRS_POWER6 MMU_FTRS_POWER5 | MMU_FTR_KERNEL_RO | MMU_FTR_68_BIT_VA
#define MMU_FTRS_POWER7 MMU_FTRS_POWER6
#define MMU_FTRS_POWER8 MMU_FTRS_POWER6
#define MMU_FTRS_POWER9 MMU_FTRS_POWER6
#define MMU_FTRS_CELL MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \
MMU_FTR_CI_LARGE_PAGE
#define MMU_FTRS_PA6T MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \
@ -136,7 +140,7 @@ enum {
MMU_FTR_NO_SLBIE_B | MMU_FTR_16M_PAGE | MMU_FTR_TLBIEL |
MMU_FTR_LOCKLESS_TLBIE | MMU_FTR_CI_LARGE_PAGE |
MMU_FTR_1T_SEGMENT | MMU_FTR_TLBIE_CROP_VA |
MMU_FTR_KERNEL_RO |
MMU_FTR_KERNEL_RO | MMU_FTR_68_BIT_VA |
#ifdef CONFIG_PPC_RADIX_MMU
MMU_FTR_TYPE_RADIX |
#endif
@ -290,7 +294,10 @@ static inline bool early_radix_enabled(void)
#define MMU_PAGE_16G 14
#define MMU_PAGE_64G 15
/* N.B. we need to change the type of hpte_page_sizes if this gets to be > 16 */
/*
* N.B. we need to change the type of hpte_page_sizes if this gets to be > 16
* Also we need to change he type of mm_context.low/high_slices_psize.
*/
#define MMU_PAGE_COUNT 16
#ifdef CONFIG_PPC_BOOK3S_64

View File

@ -29,10 +29,14 @@ extern void mm_iommu_init(struct mm_struct *mm);
extern void mm_iommu_cleanup(struct mm_struct *mm);
extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup(struct mm_struct *mm,
unsigned long ua, unsigned long size);
extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup_rm(
struct mm_struct *mm, unsigned long ua, unsigned long size);
extern struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
unsigned long ua, unsigned long entries);
extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
unsigned long ua, unsigned long *hpa);
extern long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
unsigned long ua, unsigned long *hpa);
extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
extern void mm_iommu_mapped_dec(struct mm_iommu_table_group_mem_t *mem);
#endif
@ -51,7 +55,8 @@ static inline void switch_mmu_context(struct mm_struct *prev,
return switch_slb(tsk, next);
}
extern int __init_new_context(void);
extern int hash__alloc_context_id(void);
extern void hash__reserve_context_id(int id);
extern void __destroy_context(int context_id);
static inline void mmu_context_init(void) { }
#else
@ -70,8 +75,9 @@ extern void drop_cop(unsigned long acop, struct mm_struct *mm);
* switch_mm is the entry point called from the architecture independent
* code in kernel/sched/core.c
*/
static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
struct task_struct *tsk)
static inline void switch_mm_irqs_off(struct mm_struct *prev,
struct mm_struct *next,
struct task_struct *tsk)
{
/* Mark this context has been used on the new CPU */
if (!cpumask_test_cpu(smp_processor_id(), mm_cpumask(next)))
@ -110,6 +116,18 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
switch_mmu_context(prev, next, tsk);
}
static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
struct task_struct *tsk)
{
unsigned long flags;
local_irq_save(flags);
switch_mm_irqs_off(prev, next, tsk);
local_irq_restore(flags);
}
#define switch_mm_irqs_off switch_mm_irqs_off
#define deactivate_mm(tsk,mm) do { } while (0)
/*

View File

@ -88,11 +88,6 @@
#include <asm/nohash/pte-book3e.h>
#include <asm/pte-common.h>
#ifdef CONFIG_PPC_MM_SLICES
#define HAVE_ARCH_UNMAPPED_AREA
#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
#endif /* CONFIG_PPC_MM_SLICES */
#ifndef __ASSEMBLY__
/* pte_clear moved to later in this file */

View File

@ -40,6 +40,8 @@
#define OPAL_I2C_ARBT_LOST -22
#define OPAL_I2C_NACK_RCVD -23
#define OPAL_I2C_STOP_ERR -24
#define OPAL_XIVE_PROVISIONING -31
#define OPAL_XIVE_FREE_ACTIVE -32
/* API Tokens (in r0) */
#define OPAL_INVALID_CALL -1
@ -168,7 +170,27 @@
#define OPAL_INT_SET_MFRR 125
#define OPAL_PCI_TCE_KILL 126
#define OPAL_NMMU_SET_PTCR 127
#define OPAL_LAST 127
#define OPAL_XIVE_RESET 128
#define OPAL_XIVE_GET_IRQ_INFO 129
#define OPAL_XIVE_GET_IRQ_CONFIG 130
#define OPAL_XIVE_SET_IRQ_CONFIG 131
#define OPAL_XIVE_GET_QUEUE_INFO 132
#define OPAL_XIVE_SET_QUEUE_INFO 133
#define OPAL_XIVE_DONATE_PAGE 134
#define OPAL_XIVE_ALLOCATE_VP_BLOCK 135
#define OPAL_XIVE_FREE_VP_BLOCK 136
#define OPAL_XIVE_GET_VP_INFO 137
#define OPAL_XIVE_SET_VP_INFO 138
#define OPAL_XIVE_ALLOCATE_IRQ 139
#define OPAL_XIVE_FREE_IRQ 140
#define OPAL_XIVE_SYNC 141
#define OPAL_XIVE_DUMP 142
#define OPAL_XIVE_RESERVED3 143
#define OPAL_XIVE_RESERVED4 144
#define OPAL_NPU_INIT_CONTEXT 146
#define OPAL_NPU_DESTROY_CONTEXT 147
#define OPAL_NPU_MAP_LPAR 148
#define OPAL_LAST 148
/* Device tree flags */
@ -928,6 +950,59 @@ enum {
OPAL_PCI_TCE_KILL_ALL,
};
/* The xive operation mode indicates the active "API" and
* corresponds to the "mode" parameter of the opal_xive_reset()
* call
*/
enum {
OPAL_XIVE_MODE_EMU = 0,
OPAL_XIVE_MODE_EXPL = 1,
};
/* Flags for OPAL_XIVE_GET_IRQ_INFO */
enum {
OPAL_XIVE_IRQ_TRIGGER_PAGE = 0x00000001,
OPAL_XIVE_IRQ_STORE_EOI = 0x00000002,
OPAL_XIVE_IRQ_LSI = 0x00000004,
OPAL_XIVE_IRQ_SHIFT_BUG = 0x00000008,
OPAL_XIVE_IRQ_MASK_VIA_FW = 0x00000010,
OPAL_XIVE_IRQ_EOI_VIA_FW = 0x00000020,
};
/* Flags for OPAL_XIVE_GET/SET_QUEUE_INFO */
enum {
OPAL_XIVE_EQ_ENABLED = 0x00000001,
OPAL_XIVE_EQ_ALWAYS_NOTIFY = 0x00000002,
OPAL_XIVE_EQ_ESCALATE = 0x00000004,
};
/* Flags for OPAL_XIVE_GET/SET_VP_INFO */
enum {
OPAL_XIVE_VP_ENABLED = 0x00000001,
};
/* "Any chip" replacement for chip ID for allocation functions */
enum {
OPAL_XIVE_ANY_CHIP = 0xffffffff,
};
/* Xive sync options */
enum {
/* This bits are cumulative, arg is a girq */
XIVE_SYNC_EAS = 0x00000001, /* Sync irq source */
XIVE_SYNC_QUEUE = 0x00000002, /* Sync irq target */
};
/* Dump options */
enum {
XIVE_DUMP_TM_HYP = 0,
XIVE_DUMP_TM_POOL = 1,
XIVE_DUMP_TM_OS = 2,
XIVE_DUMP_TM_USER = 3,
XIVE_DUMP_VP = 4,
XIVE_DUMP_EMU_STATE = 5,
};
#endif /* __ASSEMBLY__ */
#endif /* __OPAL_API_H */

View File

@ -29,6 +29,11 @@ extern struct device_node *opal_node;
/* API functions */
int64_t opal_invalid_call(void);
int64_t opal_npu_destroy_context(uint64_t phb_id, uint64_t pid, uint64_t bdf);
int64_t opal_npu_init_context(uint64_t phb_id, int pasid, uint64_t msr,
uint64_t bdf);
int64_t opal_npu_map_lpar(uint64_t phb_id, uint64_t bdf, uint64_t lparid,
uint64_t lpcr);
int64_t opal_console_write(int64_t term_number, __be64 *length,
const uint8_t *buffer);
int64_t opal_console_read(int64_t term_number, __be64 *length,
@ -226,6 +231,42 @@ int64_t opal_pci_tce_kill(uint64_t phb_id, uint32_t kill_type,
uint32_t pe_num, uint32_t tce_size,
uint64_t dma_addr, uint32_t npages);
int64_t opal_nmmu_set_ptcr(uint64_t chip_id, uint64_t ptcr);
int64_t opal_xive_reset(uint64_t version);
int64_t opal_xive_get_irq_info(uint32_t girq,
__be64 *out_flags,
__be64 *out_eoi_page,
__be64 *out_trig_page,
__be32 *out_esb_shift,
__be32 *out_src_chip);
int64_t opal_xive_get_irq_config(uint32_t girq, __be64 *out_vp,
uint8_t *out_prio, __be32 *out_lirq);
int64_t opal_xive_set_irq_config(uint32_t girq, uint64_t vp, uint8_t prio,
uint32_t lirq);
int64_t opal_xive_get_queue_info(uint64_t vp, uint32_t prio,
__be64 *out_qpage,
__be64 *out_qsize,
__be64 *out_qeoi_page,
__be32 *out_escalate_irq,
__be64 *out_qflags);
int64_t opal_xive_set_queue_info(uint64_t vp, uint32_t prio,
uint64_t qpage,
uint64_t qsize,
uint64_t qflags);
int64_t opal_xive_donate_page(uint32_t chip_id, uint64_t addr);
int64_t opal_xive_alloc_vp_block(uint32_t alloc_order);
int64_t opal_xive_free_vp_block(uint64_t vp);
int64_t opal_xive_get_vp_info(uint64_t vp,
__be64 *out_flags,
__be64 *out_cam_value,
__be64 *out_report_cl_pair,
__be32 *out_chip_id);
int64_t opal_xive_set_vp_info(uint64_t vp,
uint64_t flags,
uint64_t report_cl_pair);
int64_t opal_xive_allocate_irq(uint32_t chip_id);
int64_t opal_xive_free_irq(uint32_t girq);
int64_t opal_xive_sync(uint32_t type, uint32_t id);
int64_t opal_xive_dump(uint32_t type, uint32_t id);
/* Internal functions */
extern int early_init_dt_scan_opal(unsigned long node, const char *uname,

View File

@ -99,7 +99,6 @@ struct paca_struct {
*/
/* used for most interrupts/exceptions */
u64 exgen[13] __attribute__((aligned(0x80)));
u64 exmc[13]; /* used for machine checks */
u64 exslb[13]; /* used for SLB/segment table misses
* on the linear mapping */
/* SLB related definitions */
@ -139,6 +138,7 @@ struct paca_struct {
#ifdef CONFIG_PPC_MM_SLICES
u64 mm_ctx_low_slices_psize;
unsigned char mm_ctx_high_slices_psize[SLICE_ARRAY_SIZE];
unsigned long addr_limit;
#else
u16 mm_ctx_user_psize;
u16 mm_ctx_sllp;
@ -172,17 +172,31 @@ struct paca_struct {
u8 thread_mask;
/* Mask to denote subcore sibling threads */
u8 subcore_sibling_mask;
/*
* Pointer to an array which contains pointer
* to the sibling threads' paca.
*/
struct paca_struct **thread_sibling_pacas;
#endif
#ifdef CONFIG_PPC_STD_MMU_64
/* Non-maskable exceptions that are not performance critical */
u64 exnmi[13]; /* used for system reset (nmi) */
u64 exmc[13]; /* used for machine checks */
#endif
#ifdef CONFIG_PPC_BOOK3S_64
/* Exclusive emergency stack pointer for machine check exception. */
/* Exclusive stacks for system reset and machine check exception. */
void *nmi_emergency_sp;
void *mc_emergency_sp;
u16 in_nmi; /* In nmi handler */
/*
* Flag to check whether we are in machine check early handler
* and already using emergency stack.
*/
u16 in_mce;
u8 hmi_event_available; /* HMI event is available */
u8 hmi_event_available; /* HMI event is available */
#endif
/* Stuff for accurate time accounting */
@ -206,23 +220,7 @@ struct paca_struct {
#endif
};
#ifdef CONFIG_PPC_BOOK3S
static inline void copy_mm_to_paca(mm_context_t *context)
{
get_paca()->mm_ctx_id = context->id;
#ifdef CONFIG_PPC_MM_SLICES
get_paca()->mm_ctx_low_slices_psize = context->low_slices_psize;
memcpy(&get_paca()->mm_ctx_high_slices_psize,
&context->high_slices_psize, SLICE_ARRAY_SIZE);
#else
get_paca()->mm_ctx_user_psize = context->user_psize;
get_paca()->mm_ctx_sllp = context->sllp;
#endif
}
#else
static inline void copy_mm_to_paca(mm_context_t *context){}
#endif
extern void copy_mm_to_paca(struct mm_struct *mm);
extern struct paca_struct *paca;
extern void initialise_paca(struct paca_struct *new_paca, int cpu);
extern void setup_paca(struct paca_struct *new_paca);

View File

@ -98,21 +98,7 @@ extern u64 ppc64_pft_size;
#define GET_LOW_SLICE_INDEX(addr) ((addr) >> SLICE_LOW_SHIFT)
#define GET_HIGH_SLICE_INDEX(addr) ((addr) >> SLICE_HIGH_SHIFT)
/*
* 1 bit per slice and we have one slice per 1TB
* Right now we support only 64TB.
* IF we change this we will have to change the type
* of high_slices
*/
#define SLICE_MASK_SIZE 8
#ifndef __ASSEMBLY__
struct slice_mask {
u16 low_slices;
u64 high_slices;
};
struct mm_struct;
extern unsigned long slice_get_unmapped_area(unsigned long addr,

View File

@ -38,6 +38,9 @@ struct power_pmu {
unsigned long *valp);
int (*get_alternatives)(u64 event_id, unsigned int flags,
u64 alt[]);
void (*get_mem_data_src)(union perf_mem_data_src *dsrc,
u32 flags, struct pt_regs *regs);
void (*get_mem_weight)(u64 *weight);
u64 (*bhrb_filter_map)(u64 branch_sample_type);
void (*config_bhrb)(u64 pmu_bhrb_filter);
void (*disable_pmc)(unsigned int pmc, unsigned long mmcr[]);

View File

@ -11,9 +11,31 @@
#define _ASM_POWERNV_H
#ifdef CONFIG_PPC_POWERNV
#define NPU2_WRITE 1
extern void powernv_set_nmmu_ptcr(unsigned long ptcr);
extern struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
unsigned long flags,
struct npu_context *(*cb)(struct npu_context *, void *),
void *priv);
extern void pnv_npu2_destroy_context(struct npu_context *context,
struct pci_dev *gpdev);
extern int pnv_npu2_handle_fault(struct npu_context *context, uintptr_t *ea,
unsigned long *flags, unsigned long *status,
int count);
#else
static inline void powernv_set_nmmu_ptcr(unsigned long ptcr) { }
static inline struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
unsigned long flags,
struct npu_context *(*cb)(struct npu_context *, void *),
void *priv) { return ERR_PTR(-ENODEV); }
static inline void pnv_npu2_destroy_context(struct npu_context *context,
struct pci_dev *gpdev) { }
static inline int pnv_npu2_handle_fault(struct npu_context *context,
uintptr_t *ea, unsigned long *flags,
unsigned long *status, int count) {
return -ENODEV;
}
#endif
#endif /* _ASM_POWERNV_H */

View File

@ -161,6 +161,7 @@
#define PPC_INST_MFTMR 0x7c0002dc
#define PPC_INST_MSGSND 0x7c00019c
#define PPC_INST_MSGCLR 0x7c0001dc
#define PPC_INST_MSGSYNC 0x7c0006ec
#define PPC_INST_MSGSNDP 0x7c00011c
#define PPC_INST_MTTMR 0x7c0003dc
#define PPC_INST_NOP 0x60000000
@ -345,6 +346,7 @@
___PPC_RB(b) | __PPC_EH(eh))
#define PPC_MSGSND(b) stringify_in_c(.long PPC_INST_MSGSND | \
___PPC_RB(b))
#define PPC_MSGSYNC stringify_in_c(.long PPC_INST_MSGSYNC)
#define PPC_MSGCLR(b) stringify_in_c(.long PPC_INST_MSGCLR | \
___PPC_RB(b))
#define PPC_MSGSNDP(b) stringify_in_c(.long PPC_INST_MSGSNDP | \

View File

@ -102,11 +102,25 @@ void release_thread(struct task_struct *);
#endif
#ifdef CONFIG_PPC64
/* 64-bit user address space is 46-bits (64TB user VM) */
#define TASK_SIZE_USER64 (0x0000400000000000UL)
/*
* 64-bit user address space can have multiple limits
* For now supported values are:
*/
#define TASK_SIZE_64TB (0x0000400000000000UL)
#define TASK_SIZE_128TB (0x0000800000000000UL)
#define TASK_SIZE_512TB (0x0002000000000000UL)
/*
* 32-bit user address space is 4GB - 1 page
#ifdef CONFIG_PPC_BOOK3S_64
/*
* Max value currently used:
*/
#define TASK_SIZE_USER64 TASK_SIZE_512TB
#else
#define TASK_SIZE_USER64 TASK_SIZE_64TB
#endif
/*
* 32-bit user address space is 4GB - 1 page
* (this 1 page is needed so referencing of 0xFFFFFFFF generates EFAULT
*/
#define TASK_SIZE_USER32 (0x0000000100000000UL - (1*PAGE_SIZE))
@ -114,26 +128,37 @@ void release_thread(struct task_struct *);
#define TASK_SIZE_OF(tsk) (test_tsk_thread_flag(tsk, TIF_32BIT) ? \
TASK_SIZE_USER32 : TASK_SIZE_USER64)
#define TASK_SIZE TASK_SIZE_OF(current)
/* This decides where the kernel will search for a free chunk of vm
* space during mmap's.
*/
#define TASK_UNMAPPED_BASE_USER32 (PAGE_ALIGN(TASK_SIZE_USER32 / 4))
#define TASK_UNMAPPED_BASE_USER64 (PAGE_ALIGN(TASK_SIZE_USER64 / 4))
#define TASK_UNMAPPED_BASE_USER64 (PAGE_ALIGN(TASK_SIZE_128TB / 4))
#define TASK_UNMAPPED_BASE ((is_32bit_task()) ? \
TASK_UNMAPPED_BASE_USER32 : TASK_UNMAPPED_BASE_USER64 )
#endif
/*
* Initial task size value for user applications. For book3s 64 we start
* with 128TB and conditionally enable upto 512TB
*/
#ifdef CONFIG_PPC_BOOK3S_64
#define DEFAULT_MAP_WINDOW ((is_32bit_task()) ? \
TASK_SIZE_USER32 : TASK_SIZE_128TB)
#else
#define DEFAULT_MAP_WINDOW TASK_SIZE
#endif
#ifdef __powerpc64__
#define STACK_TOP_USER64 TASK_SIZE_USER64
/* Limit stack to 128TB */
#define STACK_TOP_USER64 TASK_SIZE_128TB
#define STACK_TOP_USER32 TASK_SIZE_USER32
#define STACK_TOP (is_32bit_task() ? \
STACK_TOP_USER32 : STACK_TOP_USER64)
#define STACK_TOP_MAX STACK_TOP_USER64
#define STACK_TOP_MAX TASK_SIZE_USER64
#else /* __powerpc64__ */

View File

@ -310,6 +310,7 @@
#define SPRN_PMCR 0x374 /* Power Management Control Register */
/* HFSCR and FSCR bit numbers are the same */
#define FSCR_SCV_LG 12 /* Enable System Call Vectored */
#define FSCR_MSGP_LG 10 /* Enable MSGP */
#define FSCR_TAR_LG 8 /* Enable Target Address Register */
#define FSCR_EBB_LG 7 /* Enable Event Based Branching */
@ -320,6 +321,7 @@
#define FSCR_VECVSX_LG 1 /* Enable VMX/VSX */
#define FSCR_FP_LG 0 /* Enable Floating Point */
#define SPRN_FSCR 0x099 /* Facility Status & Control Register */
#define FSCR_SCV __MASK(FSCR_SCV_LG)
#define FSCR_TAR __MASK(FSCR_TAR_LG)
#define FSCR_EBB __MASK(FSCR_EBB_LG)
#define FSCR_DSCR __MASK(FSCR_DSCR_LG)
@ -365,6 +367,7 @@
#define LPCR_MER_SH 11
#define LPCR_GTSE ASM_CONST(0x0000000000000400) /* Guest Translation Shootdown Enable */
#define LPCR_TC ASM_CONST(0x0000000000000200) /* Translation control */
#define LPCR_HEIC ASM_CONST(0x0000000000000010) /* Hypervisor External Interrupt Control */
#define LPCR_LPES 0x0000000c
#define LPCR_LPES0 ASM_CONST(0x0000000000000008) /* LPAR Env selector 0 */
#define LPCR_LPES1 ASM_CONST(0x0000000000000004) /* LPAR Env selector 1 */
@ -656,6 +659,7 @@
#define SRR1_ISI_PROT 0x08000000 /* ISI: Other protection fault */
#define SRR1_WAKEMASK 0x00380000 /* reason for wakeup */
#define SRR1_WAKEMASK_P8 0x003c0000 /* reason for wakeup on POWER8 and 9 */
#define SRR1_WAKEMCE_RESVD 0x003c0000 /* Unused/reserved value used by MCE wakeup to indicate cause to idle wakeup handler */
#define SRR1_WAKESYSERR 0x00300000 /* System error */
#define SRR1_WAKEEE 0x00200000 /* External interrupt */
#define SRR1_WAKEHVI 0x00240000 /* Hypervisor Virtualization Interrupt (P9) */

View File

@ -6,6 +6,8 @@
#include <linux/uaccess.h>
#include <asm-generic/sections.h>
extern char __head_end[];
#ifdef __powerpc64__
extern char __start_interrupts[];

View File

@ -40,10 +40,12 @@ extern int cpu_to_chip_id(int cpu);
struct smp_ops_t {
void (*message_pass)(int cpu, int msg);
#ifdef CONFIG_PPC_SMP_MUXED_IPI
void (*cause_ipi)(int cpu, unsigned long data);
void (*cause_ipi)(int cpu);
#endif
int (*cause_nmi_ipi)(int cpu);
void (*probe)(void);
int (*kick_cpu)(int nr);
int (*prepare_cpu)(int nr);
void (*setup_cpu)(int nr);
void (*bringup_done)(void);
void (*take_timebase)(void);
@ -61,7 +63,6 @@ extern void smp_generic_take_timebase(void);
DECLARE_PER_CPU(unsigned int, cpu_pvr);
#ifdef CONFIG_HOTPLUG_CPU
extern void migrate_irqs(void);
int generic_cpu_disable(void);
void generic_cpu_die(unsigned int cpu);
void generic_set_cpu_dead(unsigned int cpu);
@ -112,23 +113,31 @@ extern int cpu_to_core_id(int cpu);
*
* Make sure this matches openpic_request_IPIs in open_pic.c, or what shows up
* in /proc/interrupts will be wrong!!! --Troy */
#define PPC_MSG_CALL_FUNCTION 0
#define PPC_MSG_RESCHEDULE 1
#define PPC_MSG_CALL_FUNCTION 0
#define PPC_MSG_RESCHEDULE 1
#define PPC_MSG_TICK_BROADCAST 2
#define PPC_MSG_DEBUGGER_BREAK 3
#define PPC_MSG_NMI_IPI 3
/* This is only used by the powernv kernel */
#define PPC_MSG_RM_HOST_ACTION 4
#define NMI_IPI_ALL_OTHERS -2
#ifdef CONFIG_NMI_IPI
extern int smp_handle_nmi_ipi(struct pt_regs *regs);
#else
static inline int smp_handle_nmi_ipi(struct pt_regs *regs) { return 0; }
#endif
/* for irq controllers that have dedicated ipis per message (4) */
extern int smp_request_message_ipi(int virq, int message);
extern const char *smp_ipi_name[];
/* for irq controllers with only a single ipi */
extern void smp_muxed_ipi_set_data(int cpu, unsigned long data);
extern void smp_muxed_ipi_message_pass(int cpu, int msg);
extern void smp_muxed_ipi_set_message(int cpu, int msg);
extern irqreturn_t smp_ipi_demux(void);
extern irqreturn_t smp_ipi_demux_relaxed(void);
void smp_init_pSeries(void);
void smp_init_cell(void);

View File

@ -8,10 +8,10 @@
struct rtas_args;
asmlinkage unsigned long sys_mmap(unsigned long addr, size_t len,
asmlinkage long sys_mmap(unsigned long addr, size_t len,
unsigned long prot, unsigned long flags,
unsigned long fd, off_t offset);
asmlinkage unsigned long sys_mmap2(unsigned long addr, size_t len,
asmlinkage long sys_mmap2(unsigned long addr, size_t len,
unsigned long prot, unsigned long flags,
unsigned long fd, unsigned long pgoff);
asmlinkage long ppc64_personality(unsigned long personality);

View File

@ -10,15 +10,7 @@
#ifdef __KERNEL__
/* We have 8k stacks on ppc32 and 16k on ppc64 */
#if defined(CONFIG_PPC64)
#define THREAD_SHIFT 14
#elif defined(CONFIG_PPC_256K_PAGES)
#define THREAD_SHIFT 15
#else
#define THREAD_SHIFT 13
#endif
#define THREAD_SHIFT CONFIG_THREAD_SHIFT
#define THREAD_SIZE (1 << THREAD_SHIFT)

View File

@ -57,7 +57,7 @@ struct icp_ops {
void (*teardown_cpu)(void);
void (*flush_ipi)(void);
#ifdef CONFIG_SMP
void (*cause_ipi)(int cpu, unsigned long data);
void (*cause_ipi)(int cpu);
irq_handler_t ipi_action;
#endif
};

View File

@ -0,0 +1,97 @@
/*
* Copyright 2016,2017 IBM Corporation.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
#ifndef _ASM_POWERPC_XIVE_REGS_H
#define _ASM_POWERPC_XIVE_REGS_H
/*
* Thread Management (aka "TM") registers
*/
/* TM register offsets */
#define TM_QW0_USER 0x000 /* All rings */
#define TM_QW1_OS 0x010 /* Ring 0..2 */
#define TM_QW2_HV_POOL 0x020 /* Ring 0..1 */
#define TM_QW3_HV_PHYS 0x030 /* Ring 0..1 */
/* Byte offsets inside a QW QW0 QW1 QW2 QW3 */
#define TM_NSR 0x0 /* + + - + */
#define TM_CPPR 0x1 /* - + - + */
#define TM_IPB 0x2 /* - + + + */
#define TM_LSMFB 0x3 /* - + + + */
#define TM_ACK_CNT 0x4 /* - + - - */
#define TM_INC 0x5 /* - + - + */
#define TM_AGE 0x6 /* - + - + */
#define TM_PIPR 0x7 /* - + - + */
#define TM_WORD0 0x0
#define TM_WORD1 0x4
/*
* QW word 2 contains the valid bit at the top and other fields
* depending on the QW.
*/
#define TM_WORD2 0x8
#define TM_QW0W2_VU PPC_BIT32(0)
#define TM_QW0W2_LOGIC_SERV PPC_BITMASK32(1,31) // XX 2,31 ?
#define TM_QW1W2_VO PPC_BIT32(0)
#define TM_QW1W2_OS_CAM PPC_BITMASK32(8,31)
#define TM_QW2W2_VP PPC_BIT32(0)
#define TM_QW2W2_POOL_CAM PPC_BITMASK32(8,31)
#define TM_QW3W2_VT PPC_BIT32(0)
#define TM_QW3W2_LP PPC_BIT32(6)
#define TM_QW3W2_LE PPC_BIT32(7)
#define TM_QW3W2_T PPC_BIT32(31)
/*
* In addition to normal loads to "peek" and writes (only when invalid)
* using 4 and 8 bytes accesses, the above registers support these
* "special" byte operations:
*
* - Byte load from QW0[NSR] - User level NSR (EBB)
* - Byte store to QW0[NSR] - User level NSR (EBB)
* - Byte load/store to QW1[CPPR] and QW3[CPPR] - CPPR access
* - Byte load from QW3[TM_WORD2] - Read VT||00000||LP||LE on thrd 0
* otherwise VT||0000000
* - Byte store to QW3[TM_WORD2] - Set VT bit (and LP/LE if present)
*
* Then we have all these "special" CI ops at these offset that trigger
* all sorts of side effects:
*/
#define TM_SPC_ACK_EBB 0x800 /* Load8 ack EBB to reg*/
#define TM_SPC_ACK_OS_REG 0x810 /* Load16 ack OS irq to reg */
#define TM_SPC_PUSH_USR_CTX 0x808 /* Store32 Push/Validate user context */
#define TM_SPC_PULL_USR_CTX 0x808 /* Load32 Pull/Invalidate user context */
#define TM_SPC_SET_OS_PENDING 0x812 /* Store8 Set OS irq pending bit */
#define TM_SPC_PULL_OS_CTX 0x818 /* Load32/Load64 Pull/Invalidate OS context to reg */
#define TM_SPC_PULL_POOL_CTX 0x828 /* Load32/Load64 Pull/Invalidate Pool context to reg*/
#define TM_SPC_ACK_HV_REG 0x830 /* Load16 ack HV irq to reg */
#define TM_SPC_PULL_USR_CTX_OL 0xc08 /* Store8 Pull/Inval usr ctx to odd line */
#define TM_SPC_ACK_OS_EL 0xc10 /* Store8 ack OS irq to even line */
#define TM_SPC_ACK_HV_POOL_EL 0xc20 /* Store8 ack HV evt pool to even line */
#define TM_SPC_ACK_HV_EL 0xc30 /* Store8 ack HV irq to even line */
/* XXX more... */
/* NSR fields for the various QW ack types */
#define TM_QW0_NSR_EB PPC_BIT8(0)
#define TM_QW1_NSR_EO PPC_BIT8(0)
#define TM_QW3_NSR_HE PPC_BITMASK8(0,1)
#define TM_QW3_NSR_HE_NONE 0
#define TM_QW3_NSR_HE_POOL 1
#define TM_QW3_NSR_HE_PHYS 2
#define TM_QW3_NSR_HE_LSI 3
#define TM_QW3_NSR_I PPC_BIT8(2)
#define TM_QW3_NSR_GRP_LVL PPC_BIT8(3,7)
/* Utilities to manipulate these (originaly from OPAL) */
#define MASK_TO_LSH(m) (__builtin_ffsl(m) - 1)
#define GETFIELD(m, v) (((v) & (m)) >> MASK_TO_LSH(m))
#define SETFIELD(m, v, val) \
(((v) & ~(m)) | ((((typeof(v))(val)) << MASK_TO_LSH(m)) & (m)))
#endif /* _ASM_POWERPC_XIVE_REGS_H */

View File

@ -0,0 +1,163 @@
/*
* Copyright 2016,2017 IBM Corporation.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
#ifndef _ASM_POWERPC_XIVE_H
#define _ASM_POWERPC_XIVE_H
#define XIVE_INVALID_VP 0xffffffff
#ifdef CONFIG_PPC_XIVE
/*
* Thread Interrupt Management Area (TIMA)
*
* This is a global MMIO region divided in 4 pages of varying access
* permissions, providing access to per-cpu interrupt management
* functions. It always identifies the CPU doing the access based
* on the PowerBus initiator ID, thus we always access via the
* same offset regardless of where the code is executing
*/
extern void __iomem *xive_tima;
/*
* Offset in the TM area of our current execution level (provided by
* the backend)
*/
extern u32 xive_tima_offset;
/*
* Per-irq data (irq_get_handler_data for normal IRQs), IPIs
* have it stored in the xive_cpu structure. We also cache
* for normal interrupts the current target CPU.
*
* This structure is setup by the backend for each interrupt.
*/
struct xive_irq_data {
u64 flags;
u64 eoi_page;
void __iomem *eoi_mmio;
u64 trig_page;
void __iomem *trig_mmio;
u32 esb_shift;
int src_chip;
/* Setup/used by frontend */
int target;
bool saved_p;
};
#define XIVE_IRQ_FLAG_STORE_EOI 0x01
#define XIVE_IRQ_FLAG_LSI 0x02
#define XIVE_IRQ_FLAG_SHIFT_BUG 0x04
#define XIVE_IRQ_FLAG_MASK_FW 0x08
#define XIVE_IRQ_FLAG_EOI_FW 0x10
#define XIVE_INVALID_CHIP_ID -1
/* A queue tracking structure in a CPU */
struct xive_q {
__be32 *qpage;
u32 msk;
u32 idx;
u32 toggle;
u64 eoi_phys;
u32 esc_irq;
atomic_t count;
atomic_t pending_count;
};
/*
* "magic" Event State Buffer (ESB) MMIO offsets.
*
* Each interrupt source has a 2-bit state machine called ESB
* which can be controlled by MMIO. It's made of 2 bits, P and
* Q. P indicates that an interrupt is pending (has been sent
* to a queue and is waiting for an EOI). Q indicates that the
* interrupt has been triggered while pending.
*
* This acts as a coalescing mechanism in order to guarantee
* that a given interrupt only occurs at most once in a queue.
*
* When doing an EOI, the Q bit will indicate if the interrupt
* needs to be re-triggered.
*
* The following offsets into the ESB MMIO allow to read or
* manipulate the PQ bits. They must be used with an 8-bytes
* load instruction. They all return the previous state of the
* interrupt (atomically).
*
* Additionally, some ESB pages support doing an EOI via a
* store at 0 and some ESBs support doing a trigger via a
* separate trigger page.
*/
#define XIVE_ESB_GET 0x800
#define XIVE_ESB_SET_PQ_00 0xc00
#define XIVE_ESB_SET_PQ_01 0xd00
#define XIVE_ESB_SET_PQ_10 0xe00
#define XIVE_ESB_SET_PQ_11 0xf00
#define XIVE_ESB_MASK XIVE_ESB_SET_PQ_01
#define XIVE_ESB_VAL_P 0x2
#define XIVE_ESB_VAL_Q 0x1
/* Global enable flags for the XIVE support */
extern bool __xive_enabled;
static inline bool xive_enabled(void) { return __xive_enabled; }
extern bool xive_native_init(void);
extern void xive_smp_probe(void);
extern int xive_smp_prepare_cpu(unsigned int cpu);
extern void xive_smp_setup_cpu(void);
extern void xive_smp_disable_cpu(void);
extern void xive_kexec_teardown_cpu(int secondary);
extern void xive_shutdown(void);
extern void xive_flush_interrupt(void);
/* xmon hook */
extern void xmon_xive_do_dump(int cpu);
/* APIs used by KVM */
extern u32 xive_native_default_eq_shift(void);
extern u32 xive_native_alloc_vp_block(u32 max_vcpus);
extern void xive_native_free_vp_block(u32 vp_base);
extern int xive_native_populate_irq_data(u32 hw_irq,
struct xive_irq_data *data);
extern void xive_cleanup_irq_data(struct xive_irq_data *xd);
extern u32 xive_native_alloc_irq(void);
extern void xive_native_free_irq(u32 irq);
extern int xive_native_configure_irq(u32 hw_irq, u32 target, u8 prio, u32 sw_irq);
extern int xive_native_configure_queue(u32 vp_id, struct xive_q *q, u8 prio,
__be32 *qpage, u32 order, bool can_escalate);
extern void xive_native_disable_queue(u32 vp_id, struct xive_q *q, u8 prio);
extern bool __xive_irq_trigger(struct xive_irq_data *xd);
extern bool __xive_irq_retrigger(struct xive_irq_data *xd);
extern void xive_do_source_eoi(u32 hw_irq, struct xive_irq_data *xd);
extern bool is_xive_irq(struct irq_chip *chip);
#else
static inline bool xive_enabled(void) { return false; }
static inline bool xive_native_init(void) { return false; }
static inline void xive_smp_probe(void) { }
extern inline int xive_smp_prepare_cpu(unsigned int cpu) { return -EINVAL; }
static inline void xive_smp_setup_cpu(void) { }
static inline void xive_smp_disable_cpu(void) { }
static inline void xive_kexec_teardown_cpu(int secondary) { }
static inline void xive_shutdown(void) { }
static inline void xive_flush_interrupt(void) { }
static inline u32 xive_native_alloc_vp_block(u32 max_vcpus) { return XIVE_INVALID_VP; }
static inline void xive_native_free_vp_block(u32 vp_base) { }
#endif
#endif /* _ASM_POWERPC_XIVE_H */

View File

@ -29,5 +29,7 @@ static inline void xmon_register_spus(struct list_head *list) { };
extern int cpus_are_in_xmon(void);
#endif
extern void xmon_printf(const char *format, ...);
#endif /* __KERNEL __ */
#endif /* __ASM_POWERPC_XMON_H */

View File

@ -29,4 +29,20 @@
#define MAP_STACK 0x20000 /* give out an address that is best suited for process/thread stacks */
#define MAP_HUGETLB 0x40000 /* create a huge page mapping */
/*
* When MAP_HUGETLB is set, bits [26:31] of the flags argument to mmap(2),
* encode the log2 of the huge page size. A value of zero indicates that the
* default huge page size should be used. To use a non-default huge page size,
* one of these defines can be used, or the size can be encoded by hand. Note
* that on most systems only a subset, or possibly none, of these sizes will be
* available.
*/
#define MAP_HUGE_512KB (19 << MAP_HUGE_SHIFT) /* 512KB HugeTLB Page */
#define MAP_HUGE_1MB (20 << MAP_HUGE_SHIFT) /* 1MB HugeTLB Page */
#define MAP_HUGE_2MB (21 << MAP_HUGE_SHIFT) /* 2MB HugeTLB Page */
#define MAP_HUGE_8MB (23 << MAP_HUGE_SHIFT) /* 8MB HugeTLB Page */
#define MAP_HUGE_16MB (24 << MAP_HUGE_SHIFT) /* 16MB HugeTLB Page */
#define MAP_HUGE_1GB (30 << MAP_HUGE_SHIFT) /* 1GB HugeTLB Page */
#define MAP_HUGE_16GB (34 << MAP_HUGE_SHIFT) /* 16GB HugeTLB Page */
#endif /* _UAPI_ASM_POWERPC_MMAN_H */

View File

@ -25,8 +25,6 @@ CFLAGS_REMOVE_cputable.o = -mno-sched-epilog $(CC_FLAGS_FTRACE)
CFLAGS_REMOVE_prom_init.o = -mno-sched-epilog $(CC_FLAGS_FTRACE)
CFLAGS_REMOVE_btext.o = -mno-sched-epilog $(CC_FLAGS_FTRACE)
CFLAGS_REMOVE_prom.o = -mno-sched-epilog $(CC_FLAGS_FTRACE)
# do not trace tracer code
CFLAGS_REMOVE_ftrace.o = -mno-sched-epilog $(CC_FLAGS_FTRACE)
# timers used by tracing
CFLAGS_REMOVE_time.o = -mno-sched-epilog $(CC_FLAGS_FTRACE)
endif
@ -97,6 +95,7 @@ obj-$(CONFIG_BOOTX_TEXT) += btext.o
obj-$(CONFIG_SMP) += smp.o
obj-$(CONFIG_KPROBES) += kprobes.o
obj-$(CONFIG_OPTPROBES) += optprobes.o optprobes_head.o
obj-$(CONFIG_KPROBES_ON_FTRACE) += kprobes-ftrace.o
obj-$(CONFIG_UPROBES) += uprobes.o
obj-$(CONFIG_PPC_UDBG_16550) += legacy_serial.o udbg_16550.o
obj-$(CONFIG_STACKTRACE) += stacktrace.o
@ -118,10 +117,7 @@ obj64-$(CONFIG_AUDIT) += compat_audit.o
obj-$(CONFIG_PPC_IO_WORKAROUNDS) += io-workarounds.o
obj-$(CONFIG_DYNAMIC_FTRACE) += ftrace.o
obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o
obj-$(CONFIG_FTRACE_SYSCALLS) += ftrace.o
obj-$(CONFIG_TRACING) += trace_clock.o
obj-y += trace/
ifneq ($(CONFIG_PPC_INDIRECT_PIO),y)
obj-y += iomap.o
@ -142,14 +138,14 @@ obj-$(CONFIG_KVM_GUEST) += kvm.o kvm_emul.o
# Disable GCOV & sanitizers in odd or sensitive code
GCOV_PROFILE_prom_init.o := n
UBSAN_SANITIZE_prom_init.o := n
GCOV_PROFILE_ftrace.o := n
UBSAN_SANITIZE_ftrace.o := n
GCOV_PROFILE_machine_kexec_64.o := n
UBSAN_SANITIZE_machine_kexec_64.o := n
GCOV_PROFILE_machine_kexec_32.o := n
UBSAN_SANITIZE_machine_kexec_32.o := n
GCOV_PROFILE_kprobes.o := n
UBSAN_SANITIZE_kprobes.o := n
GCOV_PROFILE_kprobes-ftrace.o := n
UBSAN_SANITIZE_kprobes-ftrace.o := n
UBSAN_SANITIZE_vdso.o := n
extra-$(CONFIG_PPC_FPU) += fpu.o

View File

@ -185,6 +185,7 @@ int main(void)
#ifdef CONFIG_PPC_MM_SLICES
OFFSET(PACALOWSLICESPSIZE, paca_struct, mm_ctx_low_slices_psize);
OFFSET(PACAHIGHSLICEPSIZE, paca_struct, mm_ctx_high_slices_psize);
DEFINE(PACA_ADDR_LIMIT, offsetof(struct paca_struct, addr_limit));
DEFINE(MMUPSIZEDEFSIZE, sizeof(struct mmu_psize_def));
#endif /* CONFIG_PPC_MM_SLICES */
#endif
@ -219,6 +220,7 @@ int main(void)
OFFSET(PACA_EXGEN, paca_struct, exgen);
OFFSET(PACA_EXMC, paca_struct, exmc);
OFFSET(PACA_EXSLB, paca_struct, exslb);
OFFSET(PACA_EXNMI, paca_struct, exnmi);
OFFSET(PACALPPACAPTR, paca_struct, lppaca_ptr);
OFFSET(PACA_SLBSHADOWPTR, paca_struct, slb_shadow_ptr);
OFFSET(SLBSHADOW_STACKVSID, slb_shadow, save_area[SLB_NUM_BOLTED - 1].vsid);
@ -232,7 +234,9 @@ int main(void)
OFFSET(PACAEMERGSP, paca_struct, emergency_sp);
#ifdef CONFIG_PPC_BOOK3S_64
OFFSET(PACAMCEMERGSP, paca_struct, mc_emergency_sp);
OFFSET(PACA_NMI_EMERG_SP, paca_struct, nmi_emergency_sp);
OFFSET(PACA_IN_MCE, paca_struct, in_mce);
OFFSET(PACA_IN_NMI, paca_struct, in_nmi);
#endif
OFFSET(PACAHWCPUID, paca_struct, hw_cpu_id);
OFFSET(PACAKEXECSTATE, paca_struct, kexec_state);
@ -399,8 +403,8 @@ int main(void)
DEFINE(BUG_ENTRY_SIZE, sizeof(struct bug_entry));
#endif
#ifdef MAX_PGD_TABLE_SIZE
DEFINE(PGD_TABLE_SIZE, MAX_PGD_TABLE_SIZE);
#ifdef CONFIG_PPC_BOOK3S_64
DEFINE(PGD_TABLE_SIZE, (sizeof(pgd_t) << max(RADIX_PGD_INDEX_SIZE, H_PGD_INDEX_SIZE)));
#else
DEFINE(PGD_TABLE_SIZE, PGD_TABLE_SIZE);
#endif
@ -727,6 +731,7 @@ int main(void)
OFFSET(PACA_THREAD_IDLE_STATE, paca_struct, thread_idle_state);
OFFSET(PACA_THREAD_MASK, paca_struct, thread_mask);
OFFSET(PACA_SUBCORE_SIBLING_MASK, paca_struct, subcore_sibling_mask);
OFFSET(PACA_SIBLING_PACA_PTRS, paca_struct, thread_sibling_pacas);
#endif
DEFINE(PPC_DBELL_SERVER, PPC_DBELL_SERVER);

View File

@ -29,7 +29,8 @@ _GLOBAL(__setup_cpu_power7)
li r0,0
mtspr SPRN_LPID,r0
mfspr r3,SPRN_LPCR
bl __init_LPCR
li r4,(LPCR_LPES1 >> LPCR_LPES_SH)
bl __init_LPCR_ISA206
bl __init_tlb_power7
mtlr r11
blr
@ -42,7 +43,8 @@ _GLOBAL(__restore_cpu_power7)
li r0,0
mtspr SPRN_LPID,r0
mfspr r3,SPRN_LPCR
bl __init_LPCR
li r4,(LPCR_LPES1 >> LPCR_LPES_SH)
bl __init_LPCR_ISA206
bl __init_tlb_power7
mtlr r11
blr
@ -59,7 +61,8 @@ _GLOBAL(__setup_cpu_power8)
mtspr SPRN_LPID,r0
mfspr r3,SPRN_LPCR
ori r3, r3, LPCR_PECEDH
bl __init_LPCR
li r4,0 /* LPES = 0 */
bl __init_LPCR_ISA206
bl __init_HFSCR
bl __init_tlb_power8
bl __init_PMU_HV
@ -80,7 +83,8 @@ _GLOBAL(__restore_cpu_power8)
mtspr SPRN_LPID,r0
mfspr r3,SPRN_LPCR
ori r3, r3, LPCR_PECEDH
bl __init_LPCR
li r4,0 /* LPES = 0 */
bl __init_LPCR_ISA206
bl __init_HFSCR
bl __init_tlb_power8
bl __init_PMU_HV
@ -99,11 +103,12 @@ _GLOBAL(__setup_cpu_power9)
mtspr SPRN_PSSCR,r0
mtspr SPRN_LPID,r0
mfspr r3,SPRN_LPCR
LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE)
LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE | LPCR_HEIC)
or r3, r3, r4
LOAD_REG_IMMEDIATE(r4, LPCR_UPRT | LPCR_HR)
andc r3, r3, r4
bl __init_LPCR
li r4,0 /* LPES = 0 */
bl __init_LPCR_ISA300
bl __init_HFSCR
bl __init_tlb_power9
bl __init_PMU_HV
@ -122,11 +127,12 @@ _GLOBAL(__restore_cpu_power9)
mtspr SPRN_PSSCR,r0
mtspr SPRN_LPID,r0
mfspr r3,SPRN_LPCR
LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE)
LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE | LPCR_HEIC)
or r3, r3, r4
LOAD_REG_IMMEDIATE(r4, LPCR_UPRT | LPCR_HR)
andc r3, r3, r4
bl __init_LPCR
li r4,0 /* LPES = 0 */
bl __init_LPCR_ISA300
bl __init_HFSCR
bl __init_tlb_power9
bl __init_PMU_HV
@ -144,9 +150,9 @@ __init_hvmode_206:
std r5,CPU_SPEC_FEATURES(r4)
blr
__init_LPCR:
__init_LPCR_ISA206:
/* Setup a sane LPCR:
* Called with initial LPCR in R3
* Called with initial LPCR in R3 and desired LPES 2-bit value in R4
*
* LPES = 0b01 (HSRR0/1 used for 0x500)
* PECE = 0b111
@ -157,16 +163,18 @@ __init_LPCR:
*
* Other bits untouched for now
*/
li r5,1
rldimi r3,r5, LPCR_LPES_SH, 64-LPCR_LPES_SH-2
li r5,0x10
rldimi r3,r5, LPCR_VRMASD_SH, 64-LPCR_VRMASD_SH-5
/* POWER9 has no VRMASD */
__init_LPCR_ISA300:
rldimi r3,r4, LPCR_LPES_SH, 64-LPCR_LPES_SH-2
ori r3,r3,(LPCR_PECE0|LPCR_PECE1|LPCR_PECE2)
li r5,4
rldimi r3,r5, LPCR_DPFD_SH, 64-LPCR_DPFD_SH-3
clrrdi r3,r3,1 /* clear HDICE */
li r5,4
rldimi r3,r5, LPCR_VC_SH, 0
li r5,0x10
rldimi r3,r5, LPCR_VRMASD_SH, 64-LPCR_VRMASD_SH-5
mtspr SPRN_LPCR,r3
isync
blr

View File

@ -20,18 +20,60 @@
#include <asm/kvm_ppc.h>
#ifdef CONFIG_SMP
void doorbell_setup_this_cpu(void)
{
unsigned long tag = mfspr(SPRN_DOORBELL_CPUTAG) & PPC_DBELL_TAG_MASK;
smp_muxed_ipi_set_data(smp_processor_id(), tag);
/*
* Doorbells must only be used if CPU_FTR_DBELL is available.
* msgsnd is used in HV, and msgsndp is used in !HV.
*
* These should be used by platform code that is aware of restrictions.
* Other arch code should use ->cause_ipi.
*
* doorbell_global_ipi() sends a dbell to any target CPU.
* Must be used only by architectures that address msgsnd target
* by PIR/get_hard_smp_processor_id.
*/
void doorbell_global_ipi(int cpu)
{
u32 tag = get_hard_smp_processor_id(cpu);
kvmppc_set_host_ipi(cpu, 1);
/* Order previous accesses vs. msgsnd, which is treated as a store */
ppc_msgsnd_sync();
ppc_msgsnd(PPC_DBELL_MSGTYPE, 0, tag);
}
void doorbell_cause_ipi(int cpu, unsigned long data)
/*
* doorbell_core_ipi() sends a dbell to a target CPU in the same core.
* Must be used only by architectures that address msgsnd target
* by TIR/cpu_thread_in_core.
*/
void doorbell_core_ipi(int cpu)
{
u32 tag = cpu_thread_in_core(cpu);
kvmppc_set_host_ipi(cpu, 1);
/* Order previous accesses vs. msgsnd, which is treated as a store */
mb();
ppc_msgsnd(PPC_DBELL_MSGTYPE, 0, data);
ppc_msgsnd_sync();
ppc_msgsnd(PPC_DBELL_MSGTYPE, 0, tag);
}
/*
* Attempt to cause a core doorbell if destination is on the same core.
* Returns 1 on success, 0 on failure.
*/
int doorbell_try_core_ipi(int cpu)
{
int this_cpu = get_cpu();
int ret = 0;
if (cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
doorbell_core_ipi(cpu);
ret = 1;
}
put_cpu();
return ret;
}
void doorbell_exception(struct pt_regs *regs)
@ -40,12 +82,14 @@ void doorbell_exception(struct pt_regs *regs)
irq_enter();
ppc_msgsync();
may_hard_irq_enable();
kvmppc_set_host_ipi(smp_processor_id(), 0);
__this_cpu_inc(irq_stat.doorbell_irqs);
smp_ipi_demux();
smp_ipi_demux_relaxed(); /* already performed the barrier */
irq_exit();
set_irq_regs(old_regs);

View File

@ -22,7 +22,6 @@
*/
#include <linux/delay.h>
#include <linux/debugfs.h>
#include <linux/sched.h>
#include <linux/init.h>
#include <linux/list.h>
@ -37,7 +36,7 @@
#include <linux/of.h>
#include <linux/atomic.h>
#include <asm/debug.h>
#include <asm/debugfs.h>
#include <asm/eeh.h>
#include <asm/eeh_event.h>
#include <asm/io.h>

View File

@ -724,7 +724,16 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
*/
#define MAX_WAIT_FOR_RECOVERY 300
static void eeh_handle_normal_event(struct eeh_pe *pe)
/**
* eeh_handle_normal_event - Handle EEH events on a specific PE
* @pe: EEH PE
*
* Attempts to recover the given PE. If recovery fails or the PE has failed
* too many times, remove the PE.
*
* Returns true if @pe should no longer be used, else false.
*/
static bool eeh_handle_normal_event(struct eeh_pe *pe)
{
struct pci_bus *frozen_bus;
struct eeh_dev *edev, *tmp;
@ -736,13 +745,18 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)
if (!frozen_bus) {
pr_err("%s: Cannot find PCI bus for PHB#%x-PE#%x\n",
__func__, pe->phb->global_number, pe->addr);
return;
return false;
}
eeh_pe_update_time_stamp(pe);
pe->freeze_count++;
if (pe->freeze_count > eeh_max_freezes)
goto excess_failures;
if (pe->freeze_count > eeh_max_freezes) {
pr_err("EEH: PHB#%x-PE#%x has failed %d times in the\n"
"last hour and has been permanently disabled.\n",
pe->phb->global_number, pe->addr,
pe->freeze_count);
goto hard_fail;
}
pr_warn("EEH: This PCI device has failed %d times in the last hour\n",
pe->freeze_count);
@ -870,27 +884,18 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)
pr_info("EEH: Notify device driver to resume\n");
eeh_pe_dev_traverse(pe, eeh_report_resume, NULL);
return;
return false;
excess_failures:
hard_fail:
/*
* About 90% of all real-life EEH failures in the field
* are due to poorly seated PCI cards. Only 10% or so are
* due to actual, failed cards.
*/
pr_err("EEH: PHB#%x-PE#%x has failed %d times in the\n"
"last hour and has been permanently disabled.\n"
"Please try reseating or replacing it.\n",
pe->phb->global_number, pe->addr,
pe->freeze_count);
goto perm_error;
hard_fail:
pr_err("EEH: Unable to recover from failure from PHB#%x-PE#%x.\n"
"Please try reseating or replacing it\n",
pe->phb->global_number, pe->addr);
perm_error:
eeh_slot_error_detail(pe, EEH_LOG_PERM);
/* Notify all devices that they're about to go down. */
@ -915,10 +920,21 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)
pci_lock_rescan_remove();
pci_hp_remove_devices(frozen_bus);
pci_unlock_rescan_remove();
/* The passed PE should no longer be used */
return true;
}
}
return false;
}
/**
* eeh_handle_special_event - Handle EEH events without a specific failing PE
*
* Called when an EEH event is detected but can't be narrowed down to a
* specific PE. Iterates through possible failures and handles them as
* necessary.
*/
static void eeh_handle_special_event(void)
{
struct eeh_pe *pe, *phb_pe;
@ -982,7 +998,14 @@ static void eeh_handle_special_event(void)
*/
if (rc == EEH_NEXT_ERR_FROZEN_PE ||
rc == EEH_NEXT_ERR_FENCED_PHB) {
eeh_handle_normal_event(pe);
/*
* eeh_handle_normal_event() can make the PE stale if it
* determines that the PE cannot possibly be recovered.
* Don't modify the PE state if that's the case.
*/
if (eeh_handle_normal_event(pe))
continue;
eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
} else {
pci_lock_rescan_remove();

View File

@ -31,7 +31,6 @@
#include <asm/ppc_asm.h>
#include <asm/asm-offsets.h>
#include <asm/unistd.h>
#include <asm/ftrace.h>
#include <asm/ptrace.h>
#include <asm/export.h>
@ -1315,109 +1314,3 @@ machine_check_in_rtas:
/* XXX load up BATs and panic */
#endif /* CONFIG_PPC_RTAS */
#ifdef CONFIG_FUNCTION_TRACER
#ifdef CONFIG_DYNAMIC_FTRACE
_GLOBAL(mcount)
_GLOBAL(_mcount)
/*
* It is required that _mcount on PPC32 must preserve the
* link register. But we have r0 to play with. We use r0
* to push the return address back to the caller of mcount
* into the ctr register, restore the link register and
* then jump back using the ctr register.
*/
mflr r0
mtctr r0
lwz r0, 4(r1)
mtlr r0
bctr
_GLOBAL(ftrace_caller)
MCOUNT_SAVE_FRAME
/* r3 ends up with link register */
subi r3, r3, MCOUNT_INSN_SIZE
.globl ftrace_call
ftrace_call:
bl ftrace_stub
nop
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
.globl ftrace_graph_call
ftrace_graph_call:
b ftrace_graph_stub
_GLOBAL(ftrace_graph_stub)
#endif
MCOUNT_RESTORE_FRAME
/* old link register ends up in ctr reg */
bctr
#else
_GLOBAL(mcount)
_GLOBAL(_mcount)
MCOUNT_SAVE_FRAME
subi r3, r3, MCOUNT_INSN_SIZE
LOAD_REG_ADDR(r5, ftrace_trace_function)
lwz r5,0(r5)
mtctr r5
bctrl
nop
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
b ftrace_graph_caller
#endif
MCOUNT_RESTORE_FRAME
bctr
#endif
EXPORT_SYMBOL(_mcount)
_GLOBAL(ftrace_stub)
blr
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
_GLOBAL(ftrace_graph_caller)
/* load r4 with local address */
lwz r4, 44(r1)
subi r4, r4, MCOUNT_INSN_SIZE
/* Grab the LR out of the caller stack frame */
lwz r3,52(r1)
bl prepare_ftrace_return
nop
/*
* prepare_ftrace_return gives us the address we divert to.
* Change the LR in the callers stack frame to this.
*/
stw r3,52(r1)
MCOUNT_RESTORE_FRAME
/* old link register ends up in ctr reg */
bctr
_GLOBAL(return_to_handler)
/* need to save return values */
stwu r1, -32(r1)
stw r3, 20(r1)
stw r4, 16(r1)
stw r31, 12(r1)
mr r31, r1
bl ftrace_return_to_handler
nop
/* return value has real return address */
mtlr r3
lwz r3, 20(r1)
lwz r4, 16(r1)
lwz r31,12(r1)
lwz r1, 0(r1)
/* Jump back to real return address */
blr
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
#endif /* CONFIG_FUNCTION_TRACER */

View File

@ -20,7 +20,6 @@
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/magic.h>
#include <asm/unistd.h>
#include <asm/processor.h>
#include <asm/page.h>
@ -33,7 +32,6 @@
#include <asm/bug.h>
#include <asm/ptrace.h>
#include <asm/irqflags.h>
#include <asm/ftrace.h>
#include <asm/hw_irq.h>
#include <asm/context_tracking.h>
#include <asm/tm.h>
@ -1173,381 +1171,3 @@ _GLOBAL(enter_prom)
ld r0,16(r1)
mtlr r0
blr
#ifdef CONFIG_FUNCTION_TRACER
#ifdef CONFIG_DYNAMIC_FTRACE
_GLOBAL(mcount)
_GLOBAL(_mcount)
EXPORT_SYMBOL(_mcount)
mflr r12
mtctr r12
mtlr r0
bctr
#ifndef CC_USING_MPROFILE_KERNEL
_GLOBAL_TOC(ftrace_caller)
/* Taken from output of objdump from lib64/glibc */
mflr r3
ld r11, 0(r1)
stdu r1, -112(r1)
std r3, 128(r1)
ld r4, 16(r11)
subi r3, r3, MCOUNT_INSN_SIZE
.globl ftrace_call
ftrace_call:
bl ftrace_stub
nop
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
.globl ftrace_graph_call
ftrace_graph_call:
b ftrace_graph_stub
_GLOBAL(ftrace_graph_stub)
#endif
ld r0, 128(r1)
mtlr r0
addi r1, r1, 112
#else /* CC_USING_MPROFILE_KERNEL */
/*
*
* ftrace_caller() is the function that replaces _mcount() when ftrace is
* active.
*
* We arrive here after a function A calls function B, and we are the trace
* function for B. When we enter r1 points to A's stack frame, B has not yet
* had a chance to allocate one yet.
*
* Additionally r2 may point either to the TOC for A, or B, depending on
* whether B did a TOC setup sequence before calling us.
*
* On entry the LR points back to the _mcount() call site, and r0 holds the
* saved LR as it was on entry to B, ie. the original return address at the
* call site in A.
*
* Our job is to save the register state into a struct pt_regs (on the stack)
* and then arrange for the ftrace function to be called.
*/
_GLOBAL(ftrace_caller)
/* Save the original return address in A's stack frame */
std r0,LRSAVE(r1)
/* Create our stack frame + pt_regs */
stdu r1,-SWITCH_FRAME_SIZE(r1)
/* Save all gprs to pt_regs */
SAVE_8GPRS(0,r1)
SAVE_8GPRS(8,r1)
SAVE_8GPRS(16,r1)
SAVE_8GPRS(24,r1)
/* Load special regs for save below */
mfmsr r8
mfctr r9
mfxer r10
mfcr r11
/* Get the _mcount() call site out of LR */
mflr r7
/* Save it as pt_regs->nip & pt_regs->link */
std r7, _NIP(r1)
std r7, _LINK(r1)
/* Save callee's TOC in the ABI compliant location */
std r2, 24(r1)
ld r2,PACATOC(r13) /* get kernel TOC in r2 */
addis r3,r2,function_trace_op@toc@ha
addi r3,r3,function_trace_op@toc@l
ld r5,0(r3)
#ifdef CONFIG_LIVEPATCH
mr r14,r7 /* remember old NIP */
#endif
/* Calculate ip from nip-4 into r3 for call below */
subi r3, r7, MCOUNT_INSN_SIZE
/* Put the original return address in r4 as parent_ip */
mr r4, r0
/* Save special regs */
std r8, _MSR(r1)
std r9, _CTR(r1)
std r10, _XER(r1)
std r11, _CCR(r1)
/* Load &pt_regs in r6 for call below */
addi r6, r1 ,STACK_FRAME_OVERHEAD
/* ftrace_call(r3, r4, r5, r6) */
.globl ftrace_call
ftrace_call:
bl ftrace_stub
nop
/* Load ctr with the possibly modified NIP */
ld r3, _NIP(r1)
mtctr r3
#ifdef CONFIG_LIVEPATCH
cmpd r14,r3 /* has NIP been altered? */
#endif
/* Restore gprs */
REST_8GPRS(0,r1)
REST_8GPRS(8,r1)
REST_8GPRS(16,r1)
REST_8GPRS(24,r1)
/* Restore callee's TOC */
ld r2, 24(r1)
/* Pop our stack frame */
addi r1, r1, SWITCH_FRAME_SIZE
/* Restore original LR for return to B */
ld r0, LRSAVE(r1)
mtlr r0
#ifdef CONFIG_LIVEPATCH
/* Based on the cmpd above, if the NIP was altered handle livepatch */
bne- livepatch_handler
#endif
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
stdu r1, -112(r1)
.globl ftrace_graph_call
ftrace_graph_call:
b ftrace_graph_stub
_GLOBAL(ftrace_graph_stub)
addi r1, r1, 112
#endif
ld r0,LRSAVE(r1) /* restore callee's lr at _mcount site */
mtlr r0
bctr /* jump after _mcount site */
#endif /* CC_USING_MPROFILE_KERNEL */
_GLOBAL(ftrace_stub)
blr
#ifdef CONFIG_LIVEPATCH
/*
* This function runs in the mcount context, between two functions. As
* such it can only clobber registers which are volatile and used in
* function linkage.
*
* We get here when a function A, calls another function B, but B has
* been live patched with a new function C.
*
* On entry:
* - we have no stack frame and can not allocate one
* - LR points back to the original caller (in A)
* - CTR holds the new NIP in C
* - r0 & r12 are free
*
* r0 can't be used as the base register for a DS-form load or store, so
* we temporarily shuffle r1 (stack pointer) into r0 and then put it back.
*/
livepatch_handler:
CURRENT_THREAD_INFO(r12, r1)
/* Save stack pointer into r0 */
mr r0, r1
/* Allocate 3 x 8 bytes */
ld r1, TI_livepatch_sp(r12)
addi r1, r1, 24
std r1, TI_livepatch_sp(r12)
/* Save toc & real LR on livepatch stack */
std r2, -24(r1)
mflr r12
std r12, -16(r1)
/* Store stack end marker */
lis r12, STACK_END_MAGIC@h
ori r12, r12, STACK_END_MAGIC@l
std r12, -8(r1)
/* Restore real stack pointer */
mr r1, r0
/* Put ctr in r12 for global entry and branch there */
mfctr r12
bctrl
/*
* Now we are returning from the patched function to the original
* caller A. We are free to use r0 and r12, and we can use r2 until we
* restore it.
*/
CURRENT_THREAD_INFO(r12, r1)
/* Save stack pointer into r0 */
mr r0, r1
ld r1, TI_livepatch_sp(r12)
/* Check stack marker hasn't been trashed */
lis r2, STACK_END_MAGIC@h
ori r2, r2, STACK_END_MAGIC@l
ld r12, -8(r1)
1: tdne r12, r2
EMIT_BUG_ENTRY 1b, __FILE__, __LINE__ - 1, 0
/* Restore LR & toc from livepatch stack */
ld r12, -16(r1)
mtlr r12
ld r2, -24(r1)
/* Pop livepatch stack frame */
CURRENT_THREAD_INFO(r12, r0)
subi r1, r1, 24
std r1, TI_livepatch_sp(r12)
/* Restore real stack pointer */
mr r1, r0
/* Return to original caller of live patched function */
blr
#endif
#else
_GLOBAL_TOC(_mcount)
EXPORT_SYMBOL(_mcount)
/* Taken from output of objdump from lib64/glibc */
mflr r3
ld r11, 0(r1)
stdu r1, -112(r1)
std r3, 128(r1)
ld r4, 16(r11)
subi r3, r3, MCOUNT_INSN_SIZE
LOAD_REG_ADDR(r5,ftrace_trace_function)
ld r5,0(r5)
ld r5,0(r5)
mtctr r5
bctrl
nop
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
b ftrace_graph_caller
#endif
ld r0, 128(r1)
mtlr r0
addi r1, r1, 112
_GLOBAL(ftrace_stub)
blr
#endif /* CONFIG_DYNAMIC_FTRACE */
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
#ifndef CC_USING_MPROFILE_KERNEL
_GLOBAL(ftrace_graph_caller)
/* load r4 with local address */
ld r4, 128(r1)
subi r4, r4, MCOUNT_INSN_SIZE
/* Grab the LR out of the caller stack frame */
ld r11, 112(r1)
ld r3, 16(r11)
bl prepare_ftrace_return
nop
/*
* prepare_ftrace_return gives us the address we divert to.
* Change the LR in the callers stack frame to this.
*/
ld r11, 112(r1)
std r3, 16(r11)
ld r0, 128(r1)
mtlr r0
addi r1, r1, 112
blr
#else /* CC_USING_MPROFILE_KERNEL */
_GLOBAL(ftrace_graph_caller)
/* with -mprofile-kernel, parameter regs are still alive at _mcount */
std r10, 104(r1)
std r9, 96(r1)
std r8, 88(r1)
std r7, 80(r1)
std r6, 72(r1)
std r5, 64(r1)
std r4, 56(r1)
std r3, 48(r1)
/* Save callee's TOC in the ABI compliant location */
std r2, 24(r1)
ld r2, PACATOC(r13) /* get kernel TOC in r2 */
mfctr r4 /* ftrace_caller has moved local addr here */
std r4, 40(r1)
mflr r3 /* ftrace_caller has restored LR from stack */
subi r4, r4, MCOUNT_INSN_SIZE
bl prepare_ftrace_return
nop
/*
* prepare_ftrace_return gives us the address we divert to.
* Change the LR to this.
*/
mtlr r3
ld r0, 40(r1)
mtctr r0
ld r10, 104(r1)
ld r9, 96(r1)
ld r8, 88(r1)
ld r7, 80(r1)
ld r6, 72(r1)
ld r5, 64(r1)
ld r4, 56(r1)
ld r3, 48(r1)
/* Restore callee's TOC */
ld r2, 24(r1)
addi r1, r1, 112
mflr r0
std r0, LRSAVE(r1)
bctr
#endif /* CC_USING_MPROFILE_KERNEL */
_GLOBAL(return_to_handler)
/* need to save return values */
std r4, -32(r1)
std r3, -24(r1)
/* save TOC */
std r2, -16(r1)
std r31, -8(r1)
mr r31, r1
stdu r1, -112(r1)
/*
* We might be called from a module.
* Switch to our TOC to run inside the core kernel.
*/
ld r2, PACATOC(r13)
bl ftrace_return_to_handler
nop
/* return value has real return address */
mtlr r3
ld r1, 0(r1)
ld r4, -32(r1)
ld r3, -24(r1)
ld r2, -16(r1)
ld r31, -8(r1)
/* Jump back to real return address */
blr
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
#endif /* CONFIG_FUNCTION_TRACER */

View File

@ -116,9 +116,11 @@ EXC_VIRT_NONE(0x4000, 0x100)
EXC_REAL_BEGIN(system_reset, 0x100, 0x100)
SET_SCRATCH0(r13)
GET_PACA(r13)
clrrdi r13,r13,1 /* Last bit of HSPRG0 is set if waking from winkle */
EXCEPTION_PROLOG_PSERIES_PACA(PACA_EXGEN, system_reset_common, EXC_STD,
/*
* MSR_RI is not enabled, because PACA_EXNMI and nmi stack is
* being used, so a nested NMI exception would corrupt it.
*/
EXCEPTION_PROLOG_PSERIES_NORI(PACA_EXNMI, system_reset_common, EXC_STD,
IDLETEST, 0x100)
EXC_REAL_END(system_reset, 0x100, 0x100)
@ -126,34 +128,37 @@ EXC_VIRT_NONE(0x4100, 0x100)
#ifdef CONFIG_PPC_P7_NAP
EXC_COMMON_BEGIN(system_reset_idle_common)
BEGIN_FTR_SECTION
GET_PACA(r13) /* Restore HSPRG0 to get the winkle bit in r13 */
END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
bl pnv_restore_hyp_resource
li r0,PNV_THREAD_RUNNING
stb r0,PACA_THREAD_IDLE_STATE(r13) /* Clear thread state */
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
li r0,KVM_HWTHREAD_IN_KERNEL
stb r0,HSTATE_HWTHREAD_STATE(r13)
/* Order setting hwthread_state vs. testing hwthread_req */
sync
lbz r0,HSTATE_HWTHREAD_REQ(r13)
cmpwi r0,0
beq 1f
BRANCH_TO_KVM(r10, kvm_start_guest)
1:
b pnv_powersave_wakeup
#endif
/* Return SRR1 from power7_nap() */
mfspr r3,SPRN_SRR1
blt cr3,2f
b pnv_wakeup_loss
2: b pnv_wakeup_noloss
#endif
EXC_COMMON_BEGIN(system_reset_common)
/*
* Increment paca->in_nmi then enable MSR_RI. SLB or MCE will be able
* to recover, but nested NMI will notice in_nmi and not recover
* because of the use of the NMI stack. in_nmi reentrancy is tested in
* system_reset_exception.
*/
lhz r10,PACA_IN_NMI(r13)
addi r10,r10,1
sth r10,PACA_IN_NMI(r13)
li r10,MSR_RI
mtmsrd r10,1
EXC_COMMON(system_reset_common, 0x100, system_reset_exception)
mr r10,r1
ld r1,PACA_NMI_EMERG_SP(r13)
subi r1,r1,INT_FRAME_SIZE
EXCEPTION_COMMON_NORET_STACK(PACA_EXNMI, 0x100,
system_reset, system_reset_exception,
ADD_NVGPRS;ADD_RECONCILE)
/*
* The stack is no longer in use, decrement in_nmi.
*/
lhz r10,PACA_IN_NMI(r13)
subi r10,r10,1
sth r10,PACA_IN_NMI(r13)
b ret_from_except
#ifdef CONFIG_PPC_PSERIES
/*
@ -161,8 +166,9 @@ EXC_COMMON(system_reset_common, 0x100, system_reset_exception)
*/
TRAMP_REAL_BEGIN(system_reset_fwnmi)
SET_SCRATCH0(r13) /* save r13 */
EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common, EXC_STD,
NOTEST, 0x100)
/* See comment at system_reset exception */
EXCEPTION_PROLOG_PSERIES_NORI(PACA_EXNMI, system_reset_common,
EXC_STD, NOTEST, 0x100)
#endif /* CONFIG_PPC_PSERIES */
@ -172,14 +178,6 @@ EXC_REAL_BEGIN(machine_check, 0x200, 0x100)
* vector
*/
SET_SCRATCH0(r13) /* save r13 */
/*
* Running native on arch 2.06 or later, we may wakeup from winkle
* inside machine check. If yes, then last bit of HSPRG0 would be set
* to 1. Hence clear it unconditionally.
*/
GET_PACA(r13)
clrrdi r13,r13,1
SET_PACA(r13)
EXCEPTION_PROLOG_0(PACA_EXMC)
BEGIN_FTR_SECTION
b machine_check_powernv_early
@ -212,6 +210,12 @@ BEGIN_FTR_SECTION
* NOTE: We are here with MSR_ME=0 (off), which means we risk a
* checkstop if we get another machine check exception before we do
* rfid with MSR_ME=1.
*
* This interrupt can wake directly from idle. If that is the case,
* the machine check is handled then the idle wakeup code is called
* to restore state. In that case, the POWER9 DD1 idle PACA workaround
* is not applied in the early machine check code, which will cause
* bugs.
*/
mr r11,r1 /* Save r1 */
lhz r10,PACA_IN_MCE(r13)
@ -268,20 +272,11 @@ machine_check_fwnmi:
machine_check_pSeries_0:
EXCEPTION_PROLOG_1(PACA_EXMC, KVMTEST_PR, 0x200)
/*
* The following is essentially EXCEPTION_PROLOG_PSERIES_1 with the
* difference that MSR_RI is not enabled, because PACA_EXMC is being
* used, so nested machine check corrupts it. machine_check_common
* enables MSR_RI.
* MSR_RI is not enabled, because PACA_EXMC is being used, so a
* nested machine check corrupts it. machine_check_common enables
* MSR_RI.
*/
ld r10,PACAKMSR(r13)
xori r10,r10,MSR_RI
mfspr r11,SPRN_SRR0
LOAD_HANDLER(r12, machine_check_common)
mtspr SPRN_SRR0,r12
mfspr r12,SPRN_SRR1
mtspr SPRN_SRR1,r10
rfid
b . /* prevent speculative execution */
EXCEPTION_PROLOG_PSERIES_1_NORI(machine_check_common, EXC_STD)
TRAMP_KVM_SKIP(PACA_EXMC, 0x200)
@ -340,6 +335,37 @@ EXC_COMMON_BEGIN(machine_check_common)
/* restore original r1. */ \
ld r1,GPR1(r1)
#ifdef CONFIG_PPC_P7_NAP
/*
* This is an idle wakeup. Low level machine check has already been
* done. Queue the event then call the idle code to do the wake up.
*/
EXC_COMMON_BEGIN(machine_check_idle_common)
bl machine_check_queue_event
/*
* We have not used any non-volatile GPRs here, and as a rule
* most exception code including machine check does not.
* Therefore PACA_NAPSTATELOST does not need to be set. Idle
* wakeup will restore volatile registers.
*
* Load the original SRR1 into r3 for pnv_powersave_wakeup_mce.
*
* Then decrement MCE nesting after finishing with the stack.
*/
ld r3,_MSR(r1)
lhz r11,PACA_IN_MCE(r13)
subi r11,r11,1
sth r11,PACA_IN_MCE(r13)
/* Turn off the RI bit because SRR1 is used by idle wakeup code. */
/* Recoverability could be improved by reducing the use of SRR1. */
li r11,0
mtmsrd r11,1
b pnv_powersave_wakeup_mce
#endif
/*
* Handle machine check early in real mode. We come here with
* ME=1, MMU (IR=0 and DR=0) off and using MC emergency stack.
@ -352,6 +378,7 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
bl machine_check_early
std r3,RESULT(r1) /* Save result */
ld r12,_MSR(r1)
#ifdef CONFIG_PPC_P7_NAP
/*
* Check if thread was in power saving mode. We come here when any
@ -362,48 +389,14 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
*
* Go back to nap/sleep/winkle mode again if (b) is true.
*/
rlwinm. r11,r12,47-31,30,31 /* Was it in power saving mode? */
beq 4f /* No, it wasn;t */
/* Thread was in power saving mode. Go back to nap again. */
cmpwi r11,2
blt 3f
/* Supervisor/Hypervisor state loss */
li r0,1
stb r0,PACA_NAPSTATELOST(r13)
3: bl machine_check_queue_event
MACHINE_CHECK_HANDLER_WINDUP
GET_PACA(r13)
ld r1,PACAR1(r13)
/*
* Check what idle state this CPU was in and go back to same mode
* again.
*/
lbz r3,PACA_THREAD_IDLE_STATE(r13)
cmpwi r3,PNV_THREAD_NAP
bgt 10f
IDLE_STATE_ENTER_SEQ_NORET(PPC_NAP)
/* No return */
10:
cmpwi r3,PNV_THREAD_SLEEP
bgt 2f
IDLE_STATE_ENTER_SEQ_NORET(PPC_SLEEP)
/* No return */
2:
/*
* Go back to winkle. Please note that this thread was woken up in
* machine check from winkle and have not restored the per-subcore
* state. Hence before going back to winkle, set last bit of HSPRG0
* to 1. This will make sure that if this thread gets woken up
* again at reset vector 0x100 then it will get chance to restore
* the subcore state.
*/
ori r13,r13,1
SET_PACA(r13)
IDLE_STATE_ENTER_SEQ_NORET(PPC_WINKLE)
/* No return */
BEGIN_FTR_SECTION
rlwinm. r11,r12,47-31,30,31
beq- 4f
BRANCH_TO_COMMON(r10, machine_check_idle_common)
4:
END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
#endif
/*
* Check if we are coming from hypervisor userspace. If yes then we
* continue in host kernel in V mode to deliver the MC event.
@ -968,17 +961,12 @@ EXC_VIRT_NONE(0x4e60, 0x20)
TRAMP_KVM_HV(PACA_EXGEN, 0xe60)
TRAMP_REAL_BEGIN(hmi_exception_early)
EXCEPTION_PROLOG_1(PACA_EXGEN, KVMTEST_HV, 0xe60)
mr r10,r1 /* Save r1 */
ld r1,PACAEMERGSP(r13) /* Use emergency stack */
mr r10,r1 /* Save r1 */
ld r1,PACAEMERGSP(r13) /* Use emergency stack for realmode */
subi r1,r1,INT_FRAME_SIZE /* alloc stack frame */
std r9,_CCR(r1) /* save CR in stackframe */
mfspr r11,SPRN_HSRR0 /* Save HSRR0 */
std r11,_NIP(r1) /* save HSRR0 in stackframe */
mfspr r12,SPRN_HSRR1 /* Save SRR1 */
std r12,_MSR(r1) /* save SRR1 in stackframe */
std r10,0(r1) /* make stack chain pointer */
std r0,GPR0(r1) /* save r0 in stackframe */
std r10,GPR1(r1) /* save r1 in stackframe */
mfspr r12,SPRN_HSRR1 /* Save HSRR1 */
EXCEPTION_PROLOG_COMMON_1()
EXCEPTION_PROLOG_COMMON_2(PACA_EXGEN)
EXCEPTION_PROLOG_COMMON_3(0xe60)
addi r3,r1,STACK_FRAME_OVERHEAD

View File

@ -30,17 +30,16 @@
#include <linux/string.h>
#include <linux/memblock.h>
#include <linux/delay.h>
#include <linux/debugfs.h>
#include <linux/seq_file.h>
#include <linux/crash_dump.h>
#include <linux/kobject.h>
#include <linux/sysfs.h>
#include <asm/debugfs.h>
#include <asm/page.h>
#include <asm/prom.h>
#include <asm/rtas.h>
#include <asm/fadump.h>
#include <asm/debug.h>
#include <asm/setup.h>
static struct fw_dump fw_dump;
@ -319,15 +318,34 @@ int __init fadump_reserve_mem(void)
pr_debug("fadumphdr_addr = %p\n",
(void *) fw_dump.fadumphdr_addr);
} else {
/* Reserve the memory at the top of memory. */
size = get_fadump_area_size();
base = memory_boundary - size;
memblock_reserve(base, size);
printk(KERN_INFO "Reserved %ldMB of memory at %ldMB "
"for firmware-assisted dump\n",
(unsigned long)(size >> 20),
(unsigned long)(base >> 20));
/*
* Reserve memory at an offset closer to bottom of the RAM to
* minimize the impact of memory hot-remove operation. We can't
* use memblock_find_in_range() here since it doesn't allocate
* from bottom to top.
*/
for (base = fw_dump.boot_memory_size;
base <= (memory_boundary - size);
base += size) {
if (memblock_is_region_memory(base, size) &&
!memblock_is_region_reserved(base, size))
break;
}
if ((base > (memory_boundary - size)) ||
memblock_reserve(base, size)) {
pr_err("Failed to reserve memory\n");
return 0;
}
pr_info("Reserved %ldMB of memory at %ldMB for firmware-"
"assisted dump (System RAM: %ldMB)\n",
(unsigned long)(size >> 20),
(unsigned long)(base >> 20),
(unsigned long)(memblock_phys_mem_size() >> 20));
}
fw_dump.reserve_dump_area_start = base;
fw_dump.reserve_dump_area_size = size;
return 1;

View File

@ -735,11 +735,7 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_NEED_DTLB_SW_LRU)
EXCEPTION(0x2c00, Trap_2c, unknown_exception, EXC_XFER_EE)
EXCEPTION(0x2d00, Trap_2d, unknown_exception, EXC_XFER_EE)
EXCEPTION(0x2e00, Trap_2e, unknown_exception, EXC_XFER_EE)
EXCEPTION(0x2f00, MOLTrampoline, unknown_exception, EXC_XFER_EE_LITE)
.globl mol_trampoline
.set mol_trampoline, i0x2f00
EXPORT_SYMBOL(mol_trampoline)
EXCEPTION(0x2f00, Trap_2f, unknown_exception, EXC_XFER_EE)
. = 0x3000
@ -1278,16 +1274,6 @@ EXPORT_SYMBOL(empty_zero_page)
swapper_pg_dir:
.space PGD_TABLE_SIZE
.globl intercept_table
intercept_table:
.long 0, 0, i0x200, i0x300, i0x400, 0, i0x600, i0x700
.long i0x800, 0, 0, 0, 0, i0xd00, 0, 0
.long 0, 0, 0, i0x1300, 0, 0, 0, 0
.long 0, 0, 0, 0, 0, 0, 0, 0
.long 0, 0, 0, 0, 0, 0, 0, 0
.long 0, 0, 0, 0, 0, 0, 0, 0
EXPORT_SYMBOL(intercept_table)
/* Room for two PTE pointers, usually the kernel and current user pointers
* to their respective root page table.
*/

View File

@ -949,7 +949,8 @@ start_here_multiplatform:
LOAD_REG_ADDR(r3,init_thread_union)
/* set up a stack pointer */
addi r1,r3,THREAD_SIZE
LOAD_REG_IMMEDIATE(r1,THREAD_SIZE)
add r1,r3,r1
li r0,0
stdu r0,-STACK_FRAME_OVERHEAD(r1)

View File

@ -20,6 +20,7 @@
#include <asm/kvm_book3s_asm.h>
#include <asm/opal.h>
#include <asm/cpuidle.h>
#include <asm/exception-64s.h>
#include <asm/book3s/64/mmu-hash.h>
#include <asm/mmu.h>
@ -94,12 +95,12 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
core_idle_lock_held:
HMT_LOW
3: lwz r15,0(r14)
andi. r15,r15,PNV_CORE_IDLE_LOCK_BIT
andis. r15,r15,PNV_CORE_IDLE_LOCK_BIT@h
bne 3b
HMT_MEDIUM
lwarx r15,0,r14
andi. r9,r15,PNV_CORE_IDLE_LOCK_BIT
bne core_idle_lock_held
andis. r9,r15,PNV_CORE_IDLE_LOCK_BIT@h
bne- core_idle_lock_held
blr
/*
@ -113,7 +114,7 @@ core_idle_lock_held:
*
* Address to 'rfid' to in r5
*/
_GLOBAL(pnv_powersave_common)
pnv_powersave_common:
/* Use r3 to pass state nap/sleep/winkle */
/* NAP is a state loss, we create a regs frame on the
* stack, fill it up with the state we care about and
@ -188,8 +189,8 @@ pnv_enter_arch207_idle_mode:
/* The following store to HSTATE_HWTHREAD_STATE(r13) */
/* MUST occur in real mode, i.e. with the MMU off, */
/* and the MMU must stay off until we clear this flag */
/* and test HSTATE_HWTHREAD_REQ(r13) in the system */
/* reset interrupt vector in exceptions-64s.S. */
/* and test HSTATE_HWTHREAD_REQ(r13) in */
/* pnv_powersave_wakeup in this file. */
/* The reason is that another thread can switch the */
/* MMU to a guest context whenever this flag is set */
/* to KVM_HWTHREAD_IN_IDLE, and if the MMU was on, */
@ -209,15 +210,20 @@ pnv_enter_arch207_idle_mode:
/* Sleep or winkle */
lbz r7,PACA_THREAD_MASK(r13)
ld r14,PACA_CORE_IDLE_STATE_PTR(r13)
li r5,0
beq cr3,3f
lis r5,PNV_CORE_IDLE_WINKLE_COUNT@h
3:
lwarx_loop1:
lwarx r15,0,r14
andi. r9,r15,PNV_CORE_IDLE_LOCK_BIT
bnel core_idle_lock_held
andis. r9,r15,PNV_CORE_IDLE_LOCK_BIT@h
bnel- core_idle_lock_held
add r15,r15,r5 /* Add if winkle */
andc r15,r15,r7 /* Clear thread bit */
andi. r15,r15,PNV_CORE_IDLE_THREAD_BITS
andi. r9,r15,PNV_CORE_IDLE_THREAD_BITS
/*
* If cr0 = 0, then current thread is the last thread of the core entering
@ -240,7 +246,7 @@ common_enter: /* common code for all the threads entering sleep or winkle */
IDLE_STATE_ENTER_SEQ_NORET(PPC_SLEEP)
fastsleep_workaround_at_entry:
ori r15,r15,PNV_CORE_IDLE_LOCK_BIT
oris r15,r15,PNV_CORE_IDLE_LOCK_BIT@h
stwcx. r15,0,r14
bne- lwarx_loop1
isync
@ -250,10 +256,10 @@ fastsleep_workaround_at_entry:
li r4,1
bl opal_config_cpu_idle_state
/* Clear Lock bit */
li r0,0
/* Unlock */
xoris r15,r15,PNV_CORE_IDLE_LOCK_BIT@h
lwsync
stw r0,0(r14)
stw r15,0(r14)
b common_enter
enter_winkle:
@ -301,8 +307,8 @@ power_enter_stop:
lwarx_loop_stop:
lwarx r15,0,r14
andi. r9,r15,PNV_CORE_IDLE_LOCK_BIT
bnel core_idle_lock_held
andis. r9,r15,PNV_CORE_IDLE_LOCK_BIT@h
bnel- core_idle_lock_held
andc r15,r15,r7 /* Clear thread bit */
stwcx. r15,0,r14
@ -375,17 +381,113 @@ _GLOBAL(power9_idle_stop)
li r4,1
b pnv_powersave_common
/* No return */
/*
* Called from reset vector. Check whether we have woken up with
* hypervisor state loss. If yes, restore hypervisor state and return
* back to reset vector.
* On waking up from stop 0,1,2 with ESL=1 on POWER9 DD1,
* HSPRG0 will be set to the HSPRG0 value of one of the
* threads in this core. Thus the value we have in r13
* may not be this thread's paca pointer.
*
* r13 - Contents of HSPRG0
* Fortunately, the TIR remains invariant. Since this thread's
* paca pointer is recorded in all its sibling's paca, we can
* correctly recover this thread's paca pointer if we
* know the index of this thread in the core.
*
* This index can be obtained from the TIR.
*
* i.e, thread's position in the core = TIR.
* If this value is i, then this thread's paca is
* paca->thread_sibling_pacas[i].
*/
power9_dd1_recover_paca:
mfspr r4, SPRN_TIR
/*
* Since each entry in thread_sibling_pacas is 8 bytes
* we need to left-shift by 3 bits. Thus r4 = i * 8
*/
sldi r4, r4, 3
/* Get &paca->thread_sibling_pacas[0] in r5 */
ld r5, PACA_SIBLING_PACA_PTRS(r13)
/* Load paca->thread_sibling_pacas[i] into r13 */
ldx r13, r4, r5
SET_PACA(r13)
/*
* Indicate that we have lost NVGPR state
* which needs to be restored from the stack.
*/
li r3, 1
stb r0,PACA_NAPSTATELOST(r13)
blr
/*
* Called from machine check handler for powersave wakeups.
* Low level machine check processing has already been done. Now just
* go through the wake up path to get everything in order.
*
* r3 - The original SRR1 value.
* Original SRR[01] have been clobbered.
* MSR_RI is clear.
*/
.global pnv_powersave_wakeup_mce
pnv_powersave_wakeup_mce:
/* Set cr3 for pnv_powersave_wakeup */
rlwinm r11,r3,47-31,30,31
cmpwi cr3,r11,2
/*
* Now put the original SRR1 with SRR1_WAKEMCE_RESVD as the wake
* reason into SRR1, which allows reuse of the system reset wakeup
* code without being mistaken for another type of wakeup.
*/
oris r3,r3,SRR1_WAKEMCE_RESVD@h
mtspr SPRN_SRR1,r3
b pnv_powersave_wakeup
/*
* Called from reset vector for powersave wakeups.
* cr3 - set to gt if waking up with partial/complete hypervisor state loss
*/
_GLOBAL(pnv_restore_hyp_resource)
.global pnv_powersave_wakeup
pnv_powersave_wakeup:
ld r2, PACATOC(r13)
BEGIN_FTR_SECTION
ld r2,PACATOC(r13);
BEGIN_FTR_SECTION_NESTED(70)
bl power9_dd1_recover_paca
END_FTR_SECTION_NESTED_IFSET(CPU_FTR_POWER9_DD1, 70)
bl pnv_restore_hyp_resource_arch300
FTR_SECTION_ELSE
bl pnv_restore_hyp_resource_arch207
ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
li r0,PNV_THREAD_RUNNING
stb r0,PACA_THREAD_IDLE_STATE(r13) /* Clear thread state */
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
li r0,KVM_HWTHREAD_IN_KERNEL
stb r0,HSTATE_HWTHREAD_STATE(r13)
/* Order setting hwthread_state vs. testing hwthread_req */
sync
lbz r0,HSTATE_HWTHREAD_REQ(r13)
cmpwi r0,0
beq 1f
b kvm_start_guest
1:
#endif
/* Return SRR1 from power7_nap() */
mfspr r3,SPRN_SRR1
blt cr3,pnv_wakeup_noloss
b pnv_wakeup_loss
/*
* Check whether we have woken up with hypervisor state loss.
* If yes, restore hypervisor state and return back to link.
*
* cr3 - set to gt if waking up with partial/complete hypervisor state loss
*/
pnv_restore_hyp_resource_arch300:
/*
* POWER ISA 3. Use PSSCR to determine if we
* are waking up from deep idle state
@ -400,31 +502,19 @@ BEGIN_FTR_SECTION
*/
rldicl r5,r5,4,60
cmpd cr4,r5,r4
bge cr4,pnv_wakeup_tb_loss
/*
* Waking up without hypervisor state loss. Return to
* reset vector
*/
blr
bge cr4,pnv_wakeup_tb_loss /* returns to caller */
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
blr /* Waking up without hypervisor state loss. */
/* Same calling convention as arch300 */
pnv_restore_hyp_resource_arch207:
/*
* POWER ISA 2.07 or less.
* Check if last bit of HSPGR0 is set. This indicates whether we are
* waking up from winkle.
* Check if we slept with sleep or winkle.
*/
clrldi r5,r13,63
clrrdi r13,r13,1
/* Now that we are sure r13 is corrected, load TOC */
ld r2,PACATOC(r13);
cmpwi cr4,r5,1
mtspr SPRN_HSPRG0,r13
lbz r0,PACA_THREAD_IDLE_STATE(r13)
cmpwi cr2,r0,PNV_THREAD_NAP
bgt cr2,pnv_wakeup_tb_loss /* Either sleep or Winkle */
lbz r4,PACA_THREAD_IDLE_STATE(r13)
cmpwi cr2,r4,PNV_THREAD_NAP
bgt cr2,pnv_wakeup_tb_loss /* Either sleep or Winkle */
/*
* We fall through here if PACA_THREAD_IDLE_STATE shows we are waking
@ -433,8 +523,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
*/
bgt cr3,.
blr /* Return back to System Reset vector from where
pnv_restore_hyp_resource was invoked */
blr /* Waking up without hypervisor state loss */
/*
* Called if waking up from idle state which can cause either partial or
@ -444,9 +533,14 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
*
* r13 - PACA
* cr3 - gt if waking up with partial/complete hypervisor state loss
*
* If ISA300:
* cr4 - gt or eq if waking up from complete hypervisor state loss.
*
* If ISA207:
* r4 - PACA_THREAD_IDLE_STATE
*/
_GLOBAL(pnv_wakeup_tb_loss)
pnv_wakeup_tb_loss:
ld r1,PACAR1(r13)
/*
* Before entering any idle state, the NVGPRs are saved in the stack.
@ -473,18 +567,19 @@ _GLOBAL(pnv_wakeup_tb_loss)
* is required to return back to reset vector after hypervisor state
* restore is complete.
*/
mr r18,r4
mflr r17
mfspr r16,SPRN_SRR1
BEGIN_FTR_SECTION
CHECK_HMI_INTERRUPT
END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
lbz r7,PACA_THREAD_MASK(r13)
ld r14,PACA_CORE_IDLE_STATE_PTR(r13)
lwarx_loop2:
lwarx r15,0,r14
andi. r9,r15,PNV_CORE_IDLE_LOCK_BIT
lbz r7,PACA_THREAD_MASK(r13)
/*
* Take the core lock to synchronize against other threads.
*
* Lock bit is set in one of the 2 cases-
* a. In the sleep/winkle enter path, the last thread is executing
* fastsleep workaround code.
@ -492,23 +587,93 @@ lwarx_loop2:
* workaround undo code or resyncing timebase or restoring context
* In either case loop until the lock bit is cleared.
*/
bnel core_idle_lock_held
1:
lwarx r15,0,r14
andis. r9,r15,PNV_CORE_IDLE_LOCK_BIT@h
bnel- core_idle_lock_held
oris r15,r15,PNV_CORE_IDLE_LOCK_BIT@h
stwcx. r15,0,r14
bne- 1b
isync
cmpwi cr2,r15,0
andi. r9,r15,PNV_CORE_IDLE_THREAD_BITS
cmpwi cr2,r9,0
/*
* At this stage
* cr2 - eq if first thread to wakeup in core
* cr3- gt if waking up with partial/complete hypervisor state loss
* ISA300:
* cr4 - gt or eq if waking up from complete hypervisor state loss.
*/
ori r15,r15,PNV_CORE_IDLE_LOCK_BIT
stwcx. r15,0,r14
bne- lwarx_loop2
isync
BEGIN_FTR_SECTION
/*
* Were we in winkle?
* If yes, check if all threads were in winkle, decrement our
* winkle count, set all thread winkle bits if all were in winkle.
* Check if our thread has a winkle bit set, and set cr4 accordingly
* (to match ISA300, above). Pseudo-code for core idle state
* transitions for ISA207 is as follows (everything happens atomically
* due to store conditional and/or lock bit):
*
* nap_idle() { }
* nap_wake() { }
*
* sleep_idle()
* {
* core_idle_state &= ~thread_in_core
* }
*
* sleep_wake()
* {
* bool first_in_core, first_in_subcore;
*
* first_in_core = (core_idle_state & IDLE_THREAD_BITS) == 0;
* first_in_subcore = (core_idle_state & SUBCORE_SIBLING_MASK) == 0;
*
* core_idle_state |= thread_in_core;
* }
*
* winkle_idle()
* {
* core_idle_state &= ~thread_in_core;
* core_idle_state += 1 << WINKLE_COUNT_SHIFT;
* }
*
* winkle_wake()
* {
* bool first_in_core, first_in_subcore, winkle_state_lost;
*
* first_in_core = (core_idle_state & IDLE_THREAD_BITS) == 0;
* first_in_subcore = (core_idle_state & SUBCORE_SIBLING_MASK) == 0;
*
* core_idle_state |= thread_in_core;
*
* if ((core_idle_state & WINKLE_MASK) == (8 << WINKLE_COUNT_SIHFT))
* core_idle_state |= THREAD_WINKLE_BITS;
* core_idle_state -= 1 << WINKLE_COUNT_SHIFT;
*
* winkle_state_lost = core_idle_state &
* (thread_in_core << WINKLE_THREAD_SHIFT);
* core_idle_state &= ~(thread_in_core << WINKLE_THREAD_SHIFT);
* }
*
*/
cmpwi r18,PNV_THREAD_WINKLE
bne 2f
andis. r9,r15,PNV_CORE_IDLE_WINKLE_COUNT_ALL_BIT@h
subis r15,r15,PNV_CORE_IDLE_WINKLE_COUNT@h
beq 2f
ori r15,r15,PNV_CORE_IDLE_THREAD_WINKLE_BITS /* all were winkle */
2:
/* Shift thread bit to winkle mask, then test if this thread is set,
* and remove it from the winkle bits */
slwi r8,r7,8
and r8,r8,r15
andc r15,r15,r8
cmpwi cr4,r8,1 /* cr4 will be gt if our bit is set, lt if not */
lbz r4,PACA_SUBCORE_SIBLING_MASK(r13)
and r4,r4,r15
cmpwi r4,0 /* Check if first in subcore */
@ -593,7 +758,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
mtspr SPRN_WORC,r4
clear_lock:
andi. r15,r15,PNV_CORE_IDLE_THREAD_BITS
xoris r15,r15,PNV_CORE_IDLE_LOCK_BIT@h
lwsync
stw r15,0(r14)
@ -651,8 +816,7 @@ hypervisor_state_restored:
mtspr SPRN_SRR1,r16
mtlr r17
blr /* Return back to System Reset vector from where
pnv_restore_hyp_resource was invoked */
blr /* return to pnv_powersave_wakeup */
fastsleep_workaround_at_exit:
li r3,1
@ -664,7 +828,8 @@ fastsleep_workaround_at_exit:
* R3 here contains the value that will be returned to the caller
* of power7_nap.
*/
_GLOBAL(pnv_wakeup_loss)
.global pnv_wakeup_loss
pnv_wakeup_loss:
ld r1,PACAR1(r13)
BEGIN_FTR_SECTION
CHECK_HMI_INTERRUPT
@ -684,7 +849,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
* R3 here contains the value that will be returned to the caller
* of power7_nap.
*/
_GLOBAL(pnv_wakeup_noloss)
pnv_wakeup_noloss:
lbz r0,PACA_NAPSTATELOST(r13)
cmpwi r0,0
bne pnv_wakeup_loss

View File

@ -711,13 +711,16 @@ struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid)
return tbl;
}
void iommu_free_table(struct iommu_table *tbl, const char *node_name)
static void iommu_table_free(struct kref *kref)
{
unsigned long bitmap_sz;
unsigned int order;
struct iommu_table *tbl;
if (!tbl)
return;
tbl = container_of(kref, struct iommu_table, it_kref);
if (tbl->it_ops->free)
tbl->it_ops->free(tbl);
if (!tbl->it_map) {
kfree(tbl);
@ -733,7 +736,7 @@ void iommu_free_table(struct iommu_table *tbl, const char *node_name)
/* verify that table contains no entries */
if (!bitmap_empty(tbl->it_map, tbl->it_size))
pr_warn("%s: Unexpected TCEs for %s\n", __func__, node_name);
pr_warn("%s: Unexpected TCEs\n", __func__);
/* calculate bitmap size in bytes */
bitmap_sz = BITS_TO_LONGS(tbl->it_size) * sizeof(unsigned long);
@ -746,6 +749,24 @@ void iommu_free_table(struct iommu_table *tbl, const char *node_name)
kfree(tbl);
}
struct iommu_table *iommu_tce_table_get(struct iommu_table *tbl)
{
if (kref_get_unless_zero(&tbl->it_kref))
return tbl;
return NULL;
}
EXPORT_SYMBOL_GPL(iommu_tce_table_get);
int iommu_tce_table_put(struct iommu_table *tbl)
{
if (WARN_ON(!tbl))
return 0;
return kref_put(&tbl->it_kref, iommu_table_free);
}
EXPORT_SYMBOL_GPL(iommu_tce_table_put);
/* Creates TCEs for a user provided buffer. The user buffer must be
* contiguous real kernel storage (not vmalloc). The address passed here
* comprises a page address and offset into that page. The dma_addr_t
@ -1004,6 +1025,31 @@ long iommu_tce_xchg(struct iommu_table *tbl, unsigned long entry,
}
EXPORT_SYMBOL_GPL(iommu_tce_xchg);
#ifdef CONFIG_PPC_BOOK3S_64
long iommu_tce_xchg_rm(struct iommu_table *tbl, unsigned long entry,
unsigned long *hpa, enum dma_data_direction *direction)
{
long ret;
ret = tbl->it_ops->exchange_rm(tbl, entry, hpa, direction);
if (!ret && ((*direction == DMA_FROM_DEVICE) ||
(*direction == DMA_BIDIRECTIONAL))) {
struct page *pg = realmode_pfn_to_page(*hpa >> PAGE_SHIFT);
if (likely(pg)) {
SetPageDirty(pg);
} else {
tbl->it_ops->exchange_rm(tbl, entry, hpa, direction);
ret = -EFAULT;
}
}
return ret;
}
EXPORT_SYMBOL_GPL(iommu_tce_xchg_rm);
#endif
int iommu_take_ownership(struct iommu_table *tbl)
{
unsigned long flags, i, sz = (tbl->it_size + 7) >> 3;

View File

@ -65,7 +65,6 @@
#include <asm/machdep.h>
#include <asm/udbg.h>
#include <asm/smp.h>
#include <asm/debug.h>
#include <asm/livepatch.h>
#include <asm/asm-prototypes.h>
@ -442,46 +441,6 @@ u64 arch_irq_stat_cpu(unsigned int cpu)
return sum;
}
#ifdef CONFIG_HOTPLUG_CPU
void migrate_irqs(void)
{
struct irq_desc *desc;
unsigned int irq;
static int warned;
cpumask_var_t mask;
const struct cpumask *map = cpu_online_mask;
alloc_cpumask_var(&mask, GFP_KERNEL);
for_each_irq_desc(irq, desc) {
struct irq_data *data;
struct irq_chip *chip;
data = irq_desc_get_irq_data(desc);
if (irqd_is_per_cpu(data))
continue;
chip = irq_data_get_irq_chip(data);
cpumask_and(mask, irq_data_get_affinity_mask(data), map);
if (cpumask_any(mask) >= nr_cpu_ids) {
pr_warn("Breaking affinity for irq %i\n", irq);
cpumask_copy(mask, map);
}
if (chip->irq_set_affinity)
chip->irq_set_affinity(data, mask, true);
else if (desc->action && !(warned++))
pr_err("Cannot set affinity for irq %i\n", irq);
}
free_cpumask_var(mask);
local_irq_enable();
mdelay(1);
local_irq_disable();
}
#endif
static inline void check_stack_overflow(void)
{
#ifdef CONFIG_DEBUG_STACKOVERFLOW

View File

@ -0,0 +1,104 @@
/*
* Dynamic Ftrace based Kprobes Optimization
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
*
* Copyright (C) Hitachi Ltd., 2012
* Copyright 2016 Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
* IBM Corporation
*/
#include <linux/kprobes.h>
#include <linux/ptrace.h>
#include <linux/hardirq.h>
#include <linux/preempt.h>
#include <linux/ftrace.h>
static nokprobe_inline
int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
struct kprobe_ctlblk *kcb, unsigned long orig_nip)
{
/*
* Emulate singlestep (and also recover regs->nip)
* as if there is a nop
*/
regs->nip = (unsigned long)p->addr + MCOUNT_INSN_SIZE;
if (unlikely(p->post_handler)) {
kcb->kprobe_status = KPROBE_HIT_SSDONE;
p->post_handler(p, regs, 0);
}
__this_cpu_write(current_kprobe, NULL);
if (orig_nip)
regs->nip = orig_nip;
return 1;
}
int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
struct kprobe_ctlblk *kcb)
{
if (kprobe_ftrace(p))
return __skip_singlestep(p, regs, kcb, 0);
else
return 0;
}
NOKPROBE_SYMBOL(skip_singlestep);
/* Ftrace callback handler for kprobes */
void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
struct ftrace_ops *ops, struct pt_regs *regs)
{
struct kprobe *p;
struct kprobe_ctlblk *kcb;
unsigned long flags;
/* Disable irq for emulating a breakpoint and avoiding preempt */
local_irq_save(flags);
hard_irq_disable();
p = get_kprobe((kprobe_opcode_t *)nip);
if (unlikely(!p) || kprobe_disabled(p))
goto end;
kcb = get_kprobe_ctlblk();
if (kprobe_running()) {
kprobes_inc_nmissed_count(p);
} else {
unsigned long orig_nip = regs->nip;
/*
* On powerpc, NIP is *before* this instruction for the
* pre handler
*/
regs->nip -= MCOUNT_INSN_SIZE;
__this_cpu_write(current_kprobe, p);
kcb->kprobe_status = KPROBE_HIT_ACTIVE;
if (!p->pre_handler || !p->pre_handler(p, regs))
__skip_singlestep(p, regs, kcb, orig_nip);
/*
* If pre_handler returns !0, it sets regs->nip and
* resets current kprobe.
*/
}
end:
local_irq_restore(flags);
}
NOKPROBE_SYMBOL(kprobe_ftrace_handler);
int arch_prepare_kprobe_ftrace(struct kprobe *p)
{
p->ainsn.insn = NULL;
p->ainsn.boostable = -1;
return 0;
}

View File

@ -35,6 +35,7 @@
#include <asm/code-patching.h>
#include <asm/cacheflush.h>
#include <asm/sstep.h>
#include <asm/sections.h>
#include <linux/uaccess.h>
DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
@ -42,7 +43,86 @@ DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
struct kretprobe_blackpoint kretprobe_blacklist[] = {{NULL, NULL}};
int __kprobes arch_prepare_kprobe(struct kprobe *p)
bool arch_within_kprobe_blacklist(unsigned long addr)
{
return (addr >= (unsigned long)__kprobes_text_start &&
addr < (unsigned long)__kprobes_text_end) ||
(addr >= (unsigned long)_stext &&
addr < (unsigned long)__head_end);
}
kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
{
kprobe_opcode_t *addr;
#ifdef PPC64_ELF_ABI_v2
/* PPC64 ABIv2 needs local entry point */
addr = (kprobe_opcode_t *)kallsyms_lookup_name(name);
if (addr && !offset) {
#ifdef CONFIG_KPROBES_ON_FTRACE
unsigned long faddr;
/*
* Per livepatch.h, ftrace location is always within the first
* 16 bytes of a function on powerpc with -mprofile-kernel.
*/
faddr = ftrace_location_range((unsigned long)addr,
(unsigned long)addr + 16);
if (faddr)
addr = (kprobe_opcode_t *)faddr;
else
#endif
addr = (kprobe_opcode_t *)ppc_function_entry(addr);
}
#elif defined(PPC64_ELF_ABI_v1)
/*
* 64bit powerpc ABIv1 uses function descriptors:
* - Check for the dot variant of the symbol first.
* - If that fails, try looking up the symbol provided.
*
* This ensures we always get to the actual symbol and not
* the descriptor.
*
* Also handle <module:symbol> format.
*/
char dot_name[MODULE_NAME_LEN + 1 + KSYM_NAME_LEN];
const char *modsym;
bool dot_appended = false;
if ((modsym = strchr(name, ':')) != NULL) {
modsym++;
if (*modsym != '\0' && *modsym != '.') {
/* Convert to <module:.symbol> */
strncpy(dot_name, name, modsym - name);
dot_name[modsym - name] = '.';
dot_name[modsym - name + 1] = '\0';
strncat(dot_name, modsym,
sizeof(dot_name) - (modsym - name) - 2);
dot_appended = true;
} else {
dot_name[0] = '\0';
strncat(dot_name, name, sizeof(dot_name) - 1);
}
} else if (name[0] != '.') {
dot_name[0] = '.';
dot_name[1] = '\0';
strncat(dot_name, name, KSYM_NAME_LEN - 2);
dot_appended = true;
} else {
dot_name[0] = '\0';
strncat(dot_name, name, KSYM_NAME_LEN - 1);
}
addr = (kprobe_opcode_t *)kallsyms_lookup_name(dot_name);
if (!addr && dot_appended) {
/* Let's try the original non-dot symbol lookup */
addr = (kprobe_opcode_t *)kallsyms_lookup_name(name);
}
#else
addr = (kprobe_opcode_t *)kallsyms_lookup_name(name);
#endif
return addr;
}
int arch_prepare_kprobe(struct kprobe *p)
{
int ret = 0;
kprobe_opcode_t insn = *p->addr;
@ -74,30 +154,34 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
p->ainsn.boostable = 0;
return ret;
}
NOKPROBE_SYMBOL(arch_prepare_kprobe);
void __kprobes arch_arm_kprobe(struct kprobe *p)
void arch_arm_kprobe(struct kprobe *p)
{
*p->addr = BREAKPOINT_INSTRUCTION;
flush_icache_range((unsigned long) p->addr,
(unsigned long) p->addr + sizeof(kprobe_opcode_t));
}
NOKPROBE_SYMBOL(arch_arm_kprobe);
void __kprobes arch_disarm_kprobe(struct kprobe *p)
void arch_disarm_kprobe(struct kprobe *p)
{
*p->addr = p->opcode;
flush_icache_range((unsigned long) p->addr,
(unsigned long) p->addr + sizeof(kprobe_opcode_t));
}
NOKPROBE_SYMBOL(arch_disarm_kprobe);
void __kprobes arch_remove_kprobe(struct kprobe *p)
void arch_remove_kprobe(struct kprobe *p)
{
if (p->ainsn.insn) {
free_insn_slot(p->ainsn.insn, 0);
p->ainsn.insn = NULL;
}
}
NOKPROBE_SYMBOL(arch_remove_kprobe);
static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs)
static nokprobe_inline void prepare_singlestep(struct kprobe *p, struct pt_regs *regs)
{
enable_single_step(regs);
@ -110,37 +194,80 @@ static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs)
regs->nip = (unsigned long)p->ainsn.insn;
}
static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb)
static nokprobe_inline void save_previous_kprobe(struct kprobe_ctlblk *kcb)
{
kcb->prev_kprobe.kp = kprobe_running();
kcb->prev_kprobe.status = kcb->kprobe_status;
kcb->prev_kprobe.saved_msr = kcb->kprobe_saved_msr;
}
static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
static nokprobe_inline void restore_previous_kprobe(struct kprobe_ctlblk *kcb)
{
__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
kcb->kprobe_status = kcb->prev_kprobe.status;
kcb->kprobe_saved_msr = kcb->prev_kprobe.saved_msr;
}
static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
static nokprobe_inline void set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
struct kprobe_ctlblk *kcb)
{
__this_cpu_write(current_kprobe, p);
kcb->kprobe_saved_msr = regs->msr;
}
void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
struct pt_regs *regs)
bool arch_function_offset_within_entry(unsigned long offset)
{
#ifdef PPC64_ELF_ABI_v2
#ifdef CONFIG_KPROBES_ON_FTRACE
return offset <= 16;
#else
return offset <= 8;
#endif
#else
return !offset;
#endif
}
void arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
{
ri->ret_addr = (kprobe_opcode_t *)regs->link;
/* Replace the return addr with trampoline addr */
regs->link = (unsigned long)kretprobe_trampoline;
}
NOKPROBE_SYMBOL(arch_prepare_kretprobe);
int __kprobes kprobe_handler(struct pt_regs *regs)
int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
{
int ret;
unsigned int insn = *p->ainsn.insn;
/* regs->nip is also adjusted if emulate_step returns 1 */
ret = emulate_step(regs, insn);
if (ret > 0) {
/*
* Once this instruction has been boosted
* successfully, set the boostable flag
*/
if (unlikely(p->ainsn.boostable == 0))
p->ainsn.boostable = 1;
} else if (ret < 0) {
/*
* We don't allow kprobes on mtmsr(d)/rfi(d), etc.
* So, we should never get here... but, its still
* good to catch them, just in case...
*/
printk("Can't step on instruction %x\n", insn);
BUG();
} else if (ret == 0)
/* This instruction can't be boosted */
p->ainsn.boostable = -1;
return ret;
}
NOKPROBE_SYMBOL(try_to_emulate);
int kprobe_handler(struct pt_regs *regs)
{
struct kprobe *p;
int ret = 0;
@ -177,10 +304,17 @@ int __kprobes kprobe_handler(struct pt_regs *regs)
*/
save_previous_kprobe(kcb);
set_current_kprobe(p, regs, kcb);
kcb->kprobe_saved_msr = regs->msr;
kprobes_inc_nmissed_count(p);
prepare_singlestep(p, regs);
kcb->kprobe_status = KPROBE_REENTER;
if (p->ainsn.boostable >= 0) {
ret = try_to_emulate(p, regs);
if (ret > 0) {
restore_previous_kprobe(kcb);
return 1;
}
}
return 1;
} else {
if (*addr != BREAKPOINT_INSTRUCTION) {
@ -197,7 +331,9 @@ int __kprobes kprobe_handler(struct pt_regs *regs)
}
p = __this_cpu_read(current_kprobe);
if (p->break_handler && p->break_handler(p, regs)) {
goto ss_probe;
if (!skip_singlestep(p, regs, kcb))
goto ss_probe;
ret = 1;
}
}
goto no_kprobe;
@ -235,18 +371,9 @@ int __kprobes kprobe_handler(struct pt_regs *regs)
ss_probe:
if (p->ainsn.boostable >= 0) {
unsigned int insn = *p->ainsn.insn;
ret = try_to_emulate(p, regs);
/* regs->nip is also adjusted if emulate_step returns 1 */
ret = emulate_step(regs, insn);
if (ret > 0) {
/*
* Once this instruction has been boosted
* successfully, set the boostable flag
*/
if (unlikely(p->ainsn.boostable == 0))
p->ainsn.boostable = 1;
if (p->post_handler)
p->post_handler(p, regs, 0);
@ -254,17 +381,7 @@ int __kprobes kprobe_handler(struct pt_regs *regs)
reset_current_kprobe();
preempt_enable_no_resched();
return 1;
} else if (ret < 0) {
/*
* We don't allow kprobes on mtmsr(d)/rfi(d), etc.
* So, we should never get here... but, its still
* good to catch them, just in case...
*/
printk("Can't step on instruction %x\n", insn);
BUG();
} else if (ret == 0)
/* This instruction can't be boosted */
p->ainsn.boostable = -1;
}
}
prepare_singlestep(p, regs);
kcb->kprobe_status = KPROBE_HIT_SS;
@ -274,6 +391,7 @@ int __kprobes kprobe_handler(struct pt_regs *regs)
preempt_enable_no_resched();
return ret;
}
NOKPROBE_SYMBOL(kprobe_handler);
/*
* Function return probe trampoline:
@ -291,8 +409,7 @@ asm(".global kretprobe_trampoline\n"
/*
* Called when the probe at kretprobe trampoline is hit
*/
static int __kprobes trampoline_probe_handler(struct kprobe *p,
struct pt_regs *regs)
static int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
{
struct kretprobe_instance *ri = NULL;
struct hlist_head *head, empty_rp;
@ -361,6 +478,7 @@ static int __kprobes trampoline_probe_handler(struct kprobe *p,
*/
return 1;
}
NOKPROBE_SYMBOL(trampoline_probe_handler);
/*
* Called after single-stepping. p->addr is the address of the
@ -370,7 +488,7 @@ static int __kprobes trampoline_probe_handler(struct kprobe *p,
* single-stepped a copy of the instruction. The address of this
* copy is p->ainsn.insn.
*/
int __kprobes kprobe_post_handler(struct pt_regs *regs)
int kprobe_post_handler(struct pt_regs *regs)
{
struct kprobe *cur = kprobe_running();
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
@ -410,8 +528,9 @@ int __kprobes kprobe_post_handler(struct pt_regs *regs)
return 1;
}
NOKPROBE_SYMBOL(kprobe_post_handler);
int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
{
struct kprobe *cur = kprobe_running();
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
@ -474,13 +593,15 @@ int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
}
return 0;
}
NOKPROBE_SYMBOL(kprobe_fault_handler);
unsigned long arch_deref_entry_point(void *entry)
{
return ppc_global_function_entry(entry);
}
NOKPROBE_SYMBOL(arch_deref_entry_point);
int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
int setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
struct jprobe *jp = container_of(p, struct jprobe, kp);
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
@ -497,17 +618,20 @@ int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
return 1;
}
NOKPROBE_SYMBOL(setjmp_pre_handler);
void __used __kprobes jprobe_return(void)
void __used jprobe_return(void)
{
asm volatile("trap" ::: "memory");
}
NOKPROBE_SYMBOL(jprobe_return);
static void __used __kprobes jprobe_return_end(void)
static void __used jprobe_return_end(void)
{
};
}
NOKPROBE_SYMBOL(jprobe_return_end);
int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
{
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
@ -520,6 +644,7 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
preempt_enable_no_resched();
return 1;
}
NOKPROBE_SYMBOL(longjmp_break_handler);
static struct kprobe trampoline_p = {
.addr = (kprobe_opcode_t *) &kretprobe_trampoline,
@ -531,10 +656,11 @@ int __init arch_init_kprobes(void)
return register_kprobe(&trampoline_p);
}
int __kprobes arch_trampoline_kprobe(struct kprobe *p)
int arch_trampoline_kprobe(struct kprobe *p)
{
if (p->addr == (kprobe_opcode_t *)&kretprobe_trampoline)
return 1;
return 0;
}
NOKPROBE_SYMBOL(arch_trampoline_kprobe);

View File

@ -221,6 +221,8 @@ static void machine_check_process_queued_event(struct irq_work *work)
{
int index;
add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
/*
* For now just print it to console.
* TODO: log this error event to FSP or nvram.
@ -228,12 +230,13 @@ static void machine_check_process_queued_event(struct irq_work *work)
while (__this_cpu_read(mce_queue_count) > 0) {
index = __this_cpu_read(mce_queue_count) - 1;
machine_check_print_event_info(
this_cpu_ptr(&mce_event_queue[index]));
this_cpu_ptr(&mce_event_queue[index]), false);
__this_cpu_dec(mce_queue_count);
}
}
void machine_check_print_event_info(struct machine_check_event *evt)
void machine_check_print_event_info(struct machine_check_event *evt,
bool user_mode)
{
const char *level, *sevstr, *subtype;
static const char *mc_ue_types[] = {
@ -310,7 +313,16 @@ void machine_check_print_event_info(struct machine_check_event *evt)
printk("%s%s Machine check interrupt [%s]\n", level, sevstr,
evt->disposition == MCE_DISPOSITION_RECOVERED ?
"Recovered" : "[Not recovered");
"Recovered" : "Not recovered");
if (user_mode) {
printk("%s NIP: [%016llx] PID: %d Comm: %s\n", level,
evt->srr0, current->pid, current->comm);
} else {
printk("%s NIP [%016llx]: %pS\n", level, evt->srr0,
(void *)evt->srr0);
}
printk("%s Initiator: %s\n", level,
evt->initiator == MCE_INITIATOR_CPU ? "CPU" : "Unknown");
switch (evt->error_type) {

View File

@ -72,10 +72,14 @@ void __flush_tlb_power8(unsigned int action)
void __flush_tlb_power9(unsigned int action)
{
if (radix_enabled())
flush_tlb_206(POWER9_TLB_SETS_RADIX, action);
unsigned int num_sets;
flush_tlb_206(POWER9_TLB_SETS_HASH, action);
if (radix_enabled())
num_sets = POWER9_TLB_SETS_RADIX;
else
num_sets = POWER9_TLB_SETS_HASH;
flush_tlb_206(num_sets, action);
}
@ -147,159 +151,365 @@ static int mce_flush(int what)
return 0;
}
static int mce_handle_flush_derrors(uint64_t dsisr, uint64_t slb, uint64_t tlb, uint64_t erat)
#define SRR1_MC_LOADSTORE(srr1) ((srr1) & PPC_BIT(42))
struct mce_ierror_table {
unsigned long srr1_mask;
unsigned long srr1_value;
bool nip_valid; /* nip is a valid indicator of faulting address */
unsigned int error_type;
unsigned int error_subtype;
unsigned int initiator;
unsigned int severity;
};
static const struct mce_ierror_table mce_p7_ierror_table[] = {
{ 0x00000000001c0000, 0x0000000000040000, true,
MCE_ERROR_TYPE_UE, MCE_UE_ERROR_IFETCH,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000001c0000, 0x0000000000080000, true,
MCE_ERROR_TYPE_SLB, MCE_SLB_ERROR_PARITY,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000001c0000, 0x00000000000c0000, true,
MCE_ERROR_TYPE_SLB, MCE_SLB_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000001c0000, 0x0000000000100000, true,
MCE_ERROR_TYPE_SLB, MCE_SLB_ERROR_INDETERMINATE, /* BOTH */
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000001c0000, 0x0000000000140000, true,
MCE_ERROR_TYPE_TLB, MCE_TLB_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000001c0000, 0x0000000000180000, true,
MCE_ERROR_TYPE_UE, MCE_UE_ERROR_PAGE_TABLE_WALK_IFETCH,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000001c0000, 0x00000000001c0000, true,
MCE_ERROR_TYPE_UE, MCE_UE_ERROR_IFETCH,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0, 0, 0, 0, 0, 0 } };
static const struct mce_ierror_table mce_p8_ierror_table[] = {
{ 0x00000000081c0000, 0x0000000000040000, true,
MCE_ERROR_TYPE_UE, MCE_UE_ERROR_IFETCH,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x0000000000080000, true,
MCE_ERROR_TYPE_SLB, MCE_SLB_ERROR_PARITY,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x00000000000c0000, true,
MCE_ERROR_TYPE_SLB, MCE_SLB_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x0000000000100000, true,
MCE_ERROR_TYPE_ERAT,MCE_ERAT_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x0000000000140000, true,
MCE_ERROR_TYPE_TLB, MCE_TLB_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x0000000000180000, true,
MCE_ERROR_TYPE_UE, MCE_UE_ERROR_PAGE_TABLE_WALK_IFETCH,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x00000000001c0000, true,
MCE_ERROR_TYPE_UE, MCE_UE_ERROR_IFETCH,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x0000000008000000, true,
MCE_ERROR_TYPE_LINK,MCE_LINK_ERROR_IFETCH_TIMEOUT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x0000000008040000, true,
MCE_ERROR_TYPE_LINK,MCE_LINK_ERROR_PAGE_TABLE_WALK_IFETCH_TIMEOUT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0, 0, 0, 0, 0, 0 } };
static const struct mce_ierror_table mce_p9_ierror_table[] = {
{ 0x00000000081c0000, 0x0000000000040000, true,
MCE_ERROR_TYPE_UE, MCE_UE_ERROR_IFETCH,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x0000000000080000, true,
MCE_ERROR_TYPE_SLB, MCE_SLB_ERROR_PARITY,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x00000000000c0000, true,
MCE_ERROR_TYPE_SLB, MCE_SLB_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x0000000000100000, true,
MCE_ERROR_TYPE_ERAT,MCE_ERAT_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x0000000000140000, true,
MCE_ERROR_TYPE_TLB, MCE_TLB_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x0000000000180000, true,
MCE_ERROR_TYPE_UE, MCE_UE_ERROR_PAGE_TABLE_WALK_IFETCH,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x0000000008000000, true,
MCE_ERROR_TYPE_LINK,MCE_LINK_ERROR_IFETCH_TIMEOUT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x0000000008040000, true,
MCE_ERROR_TYPE_LINK,MCE_LINK_ERROR_PAGE_TABLE_WALK_IFETCH_TIMEOUT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x00000000080c0000, true,
MCE_ERROR_TYPE_RA, MCE_RA_ERROR_IFETCH,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x0000000008100000, true,
MCE_ERROR_TYPE_RA, MCE_RA_ERROR_PAGE_TABLE_WALK_IFETCH,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000000081c0000, 0x0000000008140000, false,
MCE_ERROR_TYPE_RA, MCE_RA_ERROR_STORE,
MCE_INITIATOR_CPU, MCE_SEV_FATAL, }, /* ASYNC is fatal */
{ 0x00000000081c0000, 0x0000000008180000, false,
MCE_ERROR_TYPE_LINK,MCE_LINK_ERROR_STORE_TIMEOUT,
MCE_INITIATOR_CPU, MCE_SEV_FATAL, }, /* ASYNC is fatal */
{ 0x00000000081c0000, 0x00000000081c0000, true,
MCE_ERROR_TYPE_RA, MCE_RA_ERROR_PAGE_TABLE_WALK_IFETCH_FOREIGN,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0, 0, 0, 0, 0, 0 } };
struct mce_derror_table {
unsigned long dsisr_value;
bool dar_valid; /* dar is a valid indicator of faulting address */
unsigned int error_type;
unsigned int error_subtype;
unsigned int initiator;
unsigned int severity;
};
static const struct mce_derror_table mce_p7_derror_table[] = {
{ 0x00008000, false,
MCE_ERROR_TYPE_UE, MCE_UE_ERROR_LOAD_STORE,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00004000, true,
MCE_ERROR_TYPE_UE, MCE_UE_ERROR_PAGE_TABLE_WALK_LOAD_STORE,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000800, true,
MCE_ERROR_TYPE_ERAT, MCE_ERAT_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000400, true,
MCE_ERROR_TYPE_TLB, MCE_TLB_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000100, true,
MCE_ERROR_TYPE_SLB, MCE_SLB_ERROR_PARITY,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000080, true,
MCE_ERROR_TYPE_SLB, MCE_SLB_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000040, true,
MCE_ERROR_TYPE_SLB, MCE_SLB_ERROR_INDETERMINATE, /* BOTH */
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0, false, 0, 0, 0, 0 } };
static const struct mce_derror_table mce_p8_derror_table[] = {
{ 0x00008000, false,
MCE_ERROR_TYPE_UE, MCE_UE_ERROR_LOAD_STORE,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00004000, true,
MCE_ERROR_TYPE_UE, MCE_UE_ERROR_PAGE_TABLE_WALK_LOAD_STORE,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00002000, true,
MCE_ERROR_TYPE_LINK, MCE_LINK_ERROR_LOAD_TIMEOUT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00001000, true,
MCE_ERROR_TYPE_LINK, MCE_LINK_ERROR_PAGE_TABLE_WALK_LOAD_STORE_TIMEOUT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000800, true,
MCE_ERROR_TYPE_ERAT, MCE_ERAT_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000400, true,
MCE_ERROR_TYPE_TLB, MCE_TLB_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000200, true,
MCE_ERROR_TYPE_ERAT, MCE_ERAT_ERROR_MULTIHIT, /* SECONDARY ERAT */
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000100, true,
MCE_ERROR_TYPE_SLB, MCE_SLB_ERROR_PARITY,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000080, true,
MCE_ERROR_TYPE_SLB, MCE_SLB_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0, false, 0, 0, 0, 0 } };
static const struct mce_derror_table mce_p9_derror_table[] = {
{ 0x00008000, false,
MCE_ERROR_TYPE_UE, MCE_UE_ERROR_LOAD_STORE,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00004000, true,
MCE_ERROR_TYPE_UE, MCE_UE_ERROR_PAGE_TABLE_WALK_LOAD_STORE,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00002000, true,
MCE_ERROR_TYPE_LINK, MCE_LINK_ERROR_LOAD_TIMEOUT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00001000, true,
MCE_ERROR_TYPE_LINK, MCE_LINK_ERROR_PAGE_TABLE_WALK_LOAD_STORE_TIMEOUT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000800, true,
MCE_ERROR_TYPE_ERAT, MCE_ERAT_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000400, true,
MCE_ERROR_TYPE_TLB, MCE_TLB_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000200, false,
MCE_ERROR_TYPE_USER, MCE_USER_ERROR_TLBIE,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000100, true,
MCE_ERROR_TYPE_SLB, MCE_SLB_ERROR_PARITY,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000080, true,
MCE_ERROR_TYPE_SLB, MCE_SLB_ERROR_MULTIHIT,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000040, true,
MCE_ERROR_TYPE_RA, MCE_RA_ERROR_LOAD,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000020, false,
MCE_ERROR_TYPE_RA, MCE_RA_ERROR_PAGE_TABLE_WALK_LOAD_STORE,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000010, false,
MCE_ERROR_TYPE_RA, MCE_RA_ERROR_PAGE_TABLE_WALK_LOAD_STORE_FOREIGN,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0x00000008, false,
MCE_ERROR_TYPE_RA, MCE_RA_ERROR_LOAD_STORE_FOREIGN,
MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
{ 0, false, 0, 0, 0, 0 } };
static int mce_handle_ierror(struct pt_regs *regs,
const struct mce_ierror_table table[],
struct mce_error_info *mce_err, uint64_t *addr)
{
if ((dsisr & slb) && mce_flush(MCE_FLUSH_SLB))
dsisr &= ~slb;
if ((dsisr & erat) && mce_flush(MCE_FLUSH_ERAT))
dsisr &= ~erat;
if ((dsisr & tlb) && mce_flush(MCE_FLUSH_TLB))
dsisr &= ~tlb;
/* Any other errors we don't understand? */
if (dsisr)
return 0;
return 1;
}
uint64_t srr1 = regs->msr;
int handled = 0;
int i;
static long mce_handle_derror(uint64_t dsisr, uint64_t slb_error_bits)
{
long handled = 1;
*addr = 0;
/*
* flush and reload SLBs for SLB errors and flush TLBs for TLB errors.
* reset the error bits whenever we handle them so that at the end
* we can check whether we handled all of them or not.
* */
#ifdef CONFIG_PPC_STD_MMU_64
if (dsisr & slb_error_bits) {
flush_and_reload_slb();
/* reset error bits */
dsisr &= ~(slb_error_bits);
}
if (dsisr & P7_DSISR_MC_TLB_MULTIHIT_MFTLB) {
if (cur_cpu_spec && cur_cpu_spec->flush_tlb)
cur_cpu_spec->flush_tlb(TLB_INVAL_SCOPE_GLOBAL);
/* reset error bits */
dsisr &= ~P7_DSISR_MC_TLB_MULTIHIT_MFTLB;
}
#endif
/* Any other errors we don't understand? */
if (dsisr & 0xffffffffUL)
handled = 0;
for (i = 0; table[i].srr1_mask; i++) {
if ((srr1 & table[i].srr1_mask) != table[i].srr1_value)
continue;
return handled;
}
static long mce_handle_derror_p7(uint64_t dsisr)
{
return mce_handle_derror(dsisr, P7_DSISR_MC_SLB_ERRORS);
}
static long mce_handle_common_ierror(uint64_t srr1)
{
long handled = 0;
switch (P7_SRR1_MC_IFETCH(srr1)) {
case 0:
break;
#ifdef CONFIG_PPC_STD_MMU_64
case P7_SRR1_MC_IFETCH_SLB_PARITY:
case P7_SRR1_MC_IFETCH_SLB_MULTIHIT:
/* flush and reload SLBs for SLB errors. */
flush_and_reload_slb();
handled = 1;
break;
case P7_SRR1_MC_IFETCH_TLB_MULTIHIT:
if (cur_cpu_spec && cur_cpu_spec->flush_tlb) {
cur_cpu_spec->flush_tlb(TLB_INVAL_SCOPE_GLOBAL);
handled = 1;
/* attempt to correct the error */
switch (table[i].error_type) {
case MCE_ERROR_TYPE_SLB:
handled = mce_flush(MCE_FLUSH_SLB);
break;
case MCE_ERROR_TYPE_ERAT:
handled = mce_flush(MCE_FLUSH_ERAT);
break;
case MCE_ERROR_TYPE_TLB:
handled = mce_flush(MCE_FLUSH_TLB);
break;
}
break;
#endif
default:
break;
/* now fill in mce_error_info */
mce_err->error_type = table[i].error_type;
switch (table[i].error_type) {
case MCE_ERROR_TYPE_UE:
mce_err->u.ue_error_type = table[i].error_subtype;
break;
case MCE_ERROR_TYPE_SLB:
mce_err->u.slb_error_type = table[i].error_subtype;
break;
case MCE_ERROR_TYPE_ERAT:
mce_err->u.erat_error_type = table[i].error_subtype;
break;
case MCE_ERROR_TYPE_TLB:
mce_err->u.tlb_error_type = table[i].error_subtype;
break;
case MCE_ERROR_TYPE_USER:
mce_err->u.user_error_type = table[i].error_subtype;
break;
case MCE_ERROR_TYPE_RA:
mce_err->u.ra_error_type = table[i].error_subtype;
break;
case MCE_ERROR_TYPE_LINK:
mce_err->u.link_error_type = table[i].error_subtype;
break;
}
mce_err->severity = table[i].severity;
mce_err->initiator = table[i].initiator;
if (table[i].nip_valid)
*addr = regs->nip;
return handled;
}
return handled;
mce_err->error_type = MCE_ERROR_TYPE_UNKNOWN;
mce_err->severity = MCE_SEV_ERROR_SYNC;
mce_err->initiator = MCE_INITIATOR_CPU;
return 0;
}
static long mce_handle_ierror_p7(uint64_t srr1)
static int mce_handle_derror(struct pt_regs *regs,
const struct mce_derror_table table[],
struct mce_error_info *mce_err, uint64_t *addr)
{
long handled = 0;
uint64_t dsisr = regs->dsisr;
int handled = 0;
int found = 0;
int i;
handled = mce_handle_common_ierror(srr1);
*addr = 0;
#ifdef CONFIG_PPC_STD_MMU_64
if (P7_SRR1_MC_IFETCH(srr1) == P7_SRR1_MC_IFETCH_SLB_BOTH) {
flush_and_reload_slb();
handled = 1;
for (i = 0; table[i].dsisr_value; i++) {
if (!(dsisr & table[i].dsisr_value))
continue;
/* attempt to correct the error */
switch (table[i].error_type) {
case MCE_ERROR_TYPE_SLB:
if (mce_flush(MCE_FLUSH_SLB))
handled = 1;
break;
case MCE_ERROR_TYPE_ERAT:
if (mce_flush(MCE_FLUSH_ERAT))
handled = 1;
break;
case MCE_ERROR_TYPE_TLB:
if (mce_flush(MCE_FLUSH_TLB))
handled = 1;
break;
}
/*
* Attempt to handle multiple conditions, but only return
* one. Ensure uncorrectable errors are first in the table
* to match.
*/
if (found)
continue;
/* now fill in mce_error_info */
mce_err->error_type = table[i].error_type;
switch (table[i].error_type) {
case MCE_ERROR_TYPE_UE:
mce_err->u.ue_error_type = table[i].error_subtype;
break;
case MCE_ERROR_TYPE_SLB:
mce_err->u.slb_error_type = table[i].error_subtype;
break;
case MCE_ERROR_TYPE_ERAT:
mce_err->u.erat_error_type = table[i].error_subtype;
break;
case MCE_ERROR_TYPE_TLB:
mce_err->u.tlb_error_type = table[i].error_subtype;
break;
case MCE_ERROR_TYPE_USER:
mce_err->u.user_error_type = table[i].error_subtype;
break;
case MCE_ERROR_TYPE_RA:
mce_err->u.ra_error_type = table[i].error_subtype;
break;
case MCE_ERROR_TYPE_LINK:
mce_err->u.link_error_type = table[i].error_subtype;
break;
}
mce_err->severity = table[i].severity;
mce_err->initiator = table[i].initiator;
if (table[i].dar_valid)
*addr = regs->dar;
found = 1;
}
#endif
return handled;
}
static void mce_get_common_ierror(struct mce_error_info *mce_err, uint64_t srr1)
{
switch (P7_SRR1_MC_IFETCH(srr1)) {
case P7_SRR1_MC_IFETCH_SLB_PARITY:
mce_err->error_type = MCE_ERROR_TYPE_SLB;
mce_err->u.slb_error_type = MCE_SLB_ERROR_PARITY;
break;
case P7_SRR1_MC_IFETCH_SLB_MULTIHIT:
mce_err->error_type = MCE_ERROR_TYPE_SLB;
mce_err->u.slb_error_type = MCE_SLB_ERROR_MULTIHIT;
break;
case P7_SRR1_MC_IFETCH_TLB_MULTIHIT:
mce_err->error_type = MCE_ERROR_TYPE_TLB;
mce_err->u.tlb_error_type = MCE_TLB_ERROR_MULTIHIT;
break;
case P7_SRR1_MC_IFETCH_UE:
case P7_SRR1_MC_IFETCH_UE_IFU_INTERNAL:
mce_err->error_type = MCE_ERROR_TYPE_UE;
mce_err->u.ue_error_type = MCE_UE_ERROR_IFETCH;
break;
case P7_SRR1_MC_IFETCH_UE_TLB_RELOAD:
mce_err->error_type = MCE_ERROR_TYPE_UE;
mce_err->u.ue_error_type =
MCE_UE_ERROR_PAGE_TABLE_WALK_IFETCH;
break;
}
}
if (found)
return handled;
static void mce_get_ierror_p7(struct mce_error_info *mce_err, uint64_t srr1)
{
mce_get_common_ierror(mce_err, srr1);
if (P7_SRR1_MC_IFETCH(srr1) == P7_SRR1_MC_IFETCH_SLB_BOTH) {
mce_err->error_type = MCE_ERROR_TYPE_SLB;
mce_err->u.slb_error_type = MCE_SLB_ERROR_INDETERMINATE;
}
}
mce_err->error_type = MCE_ERROR_TYPE_UNKNOWN;
mce_err->severity = MCE_SEV_ERROR_SYNC;
mce_err->initiator = MCE_INITIATOR_CPU;
static void mce_get_derror_p7(struct mce_error_info *mce_err, uint64_t dsisr)
{
if (dsisr & P7_DSISR_MC_UE) {
mce_err->error_type = MCE_ERROR_TYPE_UE;
mce_err->u.ue_error_type = MCE_UE_ERROR_LOAD_STORE;
} else if (dsisr & P7_DSISR_MC_UE_TABLEWALK) {
mce_err->error_type = MCE_ERROR_TYPE_UE;
mce_err->u.ue_error_type =
MCE_UE_ERROR_PAGE_TABLE_WALK_LOAD_STORE;
} else if (dsisr & P7_DSISR_MC_ERAT_MULTIHIT) {
mce_err->error_type = MCE_ERROR_TYPE_ERAT;
mce_err->u.erat_error_type = MCE_ERAT_ERROR_MULTIHIT;
} else if (dsisr & P7_DSISR_MC_SLB_MULTIHIT) {
mce_err->error_type = MCE_ERROR_TYPE_SLB;
mce_err->u.slb_error_type = MCE_SLB_ERROR_MULTIHIT;
} else if (dsisr & P7_DSISR_MC_SLB_PARITY_MFSLB) {
mce_err->error_type = MCE_ERROR_TYPE_SLB;
mce_err->u.slb_error_type = MCE_SLB_ERROR_PARITY;
} else if (dsisr & P7_DSISR_MC_TLB_MULTIHIT_MFTLB) {
mce_err->error_type = MCE_ERROR_TYPE_TLB;
mce_err->u.tlb_error_type = MCE_TLB_ERROR_MULTIHIT;
} else if (dsisr & P7_DSISR_MC_SLB_MULTIHIT_PARITY) {
mce_err->error_type = MCE_ERROR_TYPE_SLB;
mce_err->u.slb_error_type = MCE_SLB_ERROR_INDETERMINATE;
}
return 0;
}
static long mce_handle_ue_error(struct pt_regs *regs)
@ -320,292 +530,42 @@ static long mce_handle_ue_error(struct pt_regs *regs)
return handled;
}
long __machine_check_early_realmode_p7(struct pt_regs *regs)
static long mce_handle_error(struct pt_regs *regs,
const struct mce_derror_table dtable[],
const struct mce_ierror_table itable[])
{
uint64_t srr1, nip, addr;
long handled = 1;
struct mce_error_info mce_error_info = { 0 };
struct mce_error_info mce_err = { 0 };
uint64_t addr;
uint64_t srr1 = regs->msr;
long handled;
mce_error_info.severity = MCE_SEV_ERROR_SYNC;
mce_error_info.initiator = MCE_INITIATOR_CPU;
if (SRR1_MC_LOADSTORE(srr1))
handled = mce_handle_derror(regs, dtable, &mce_err, &addr);
else
handled = mce_handle_ierror(regs, itable, &mce_err, &addr);
srr1 = regs->msr;
nip = regs->nip;
/*
* Handle memory errors depending whether this was a load/store or
* ifetch exception. Also, populate the mce error_type and
* type-specific error_type from either SRR1 or DSISR, depending
* whether this was a load/store or ifetch exception
*/
if (P7_SRR1_MC_LOADSTORE(srr1)) {
handled = mce_handle_derror_p7(regs->dsisr);
mce_get_derror_p7(&mce_error_info, regs->dsisr);
addr = regs->dar;
} else {
handled = mce_handle_ierror_p7(srr1);
mce_get_ierror_p7(&mce_error_info, srr1);
addr = regs->nip;
}
/* Handle UE error. */
if (mce_error_info.error_type == MCE_ERROR_TYPE_UE)
if (!handled && mce_err.error_type == MCE_ERROR_TYPE_UE)
handled = mce_handle_ue_error(regs);
save_mce_event(regs, handled, &mce_error_info, nip, addr);
save_mce_event(regs, handled, &mce_err, regs->nip, addr);
return handled;
}
static void mce_get_ierror_p8(struct mce_error_info *mce_err, uint64_t srr1)
long __machine_check_early_realmode_p7(struct pt_regs *regs)
{
mce_get_common_ierror(mce_err, srr1);
if (P7_SRR1_MC_IFETCH(srr1) == P8_SRR1_MC_IFETCH_ERAT_MULTIHIT) {
mce_err->error_type = MCE_ERROR_TYPE_ERAT;
mce_err->u.erat_error_type = MCE_ERAT_ERROR_MULTIHIT;
}
}
/* P7 DD1 leaves top bits of DSISR undefined */
regs->dsisr &= 0x0000ffff;
static void mce_get_derror_p8(struct mce_error_info *mce_err, uint64_t dsisr)
{
mce_get_derror_p7(mce_err, dsisr);
if (dsisr & P8_DSISR_MC_ERAT_MULTIHIT_SEC) {
mce_err->error_type = MCE_ERROR_TYPE_ERAT;
mce_err->u.erat_error_type = MCE_ERAT_ERROR_MULTIHIT;
}
}
static long mce_handle_ierror_p8(uint64_t srr1)
{
long handled = 0;
handled = mce_handle_common_ierror(srr1);
#ifdef CONFIG_PPC_STD_MMU_64
if (P7_SRR1_MC_IFETCH(srr1) == P8_SRR1_MC_IFETCH_ERAT_MULTIHIT) {
flush_and_reload_slb();
handled = 1;
}
#endif
return handled;
}
static long mce_handle_derror_p8(uint64_t dsisr)
{
return mce_handle_derror(dsisr, P8_DSISR_MC_SLB_ERRORS);
return mce_handle_error(regs, mce_p7_derror_table, mce_p7_ierror_table);
}
long __machine_check_early_realmode_p8(struct pt_regs *regs)
{
uint64_t srr1, nip, addr;
long handled = 1;
struct mce_error_info mce_error_info = { 0 };
mce_error_info.severity = MCE_SEV_ERROR_SYNC;
mce_error_info.initiator = MCE_INITIATOR_CPU;
srr1 = regs->msr;
nip = regs->nip;
if (P7_SRR1_MC_LOADSTORE(srr1)) {
handled = mce_handle_derror_p8(regs->dsisr);
mce_get_derror_p8(&mce_error_info, regs->dsisr);
addr = regs->dar;
} else {
handled = mce_handle_ierror_p8(srr1);
mce_get_ierror_p8(&mce_error_info, srr1);
addr = regs->nip;
}
/* Handle UE error. */
if (mce_error_info.error_type == MCE_ERROR_TYPE_UE)
handled = mce_handle_ue_error(regs);
save_mce_event(regs, handled, &mce_error_info, nip, addr);
return handled;
}
static int mce_handle_derror_p9(struct pt_regs *regs)
{
uint64_t dsisr = regs->dsisr;
return mce_handle_flush_derrors(dsisr,
P9_DSISR_MC_SLB_PARITY_MFSLB |
P9_DSISR_MC_SLB_MULTIHIT_MFSLB,
P9_DSISR_MC_TLB_MULTIHIT_MFTLB,
P9_DSISR_MC_ERAT_MULTIHIT);
}
static int mce_handle_ierror_p9(struct pt_regs *regs)
{
uint64_t srr1 = regs->msr;
switch (P9_SRR1_MC_IFETCH(srr1)) {
case P9_SRR1_MC_IFETCH_SLB_PARITY:
case P9_SRR1_MC_IFETCH_SLB_MULTIHIT:
return mce_flush(MCE_FLUSH_SLB);
case P9_SRR1_MC_IFETCH_TLB_MULTIHIT:
return mce_flush(MCE_FLUSH_TLB);
case P9_SRR1_MC_IFETCH_ERAT_MULTIHIT:
return mce_flush(MCE_FLUSH_ERAT);
default:
return 0;
}
}
static void mce_get_derror_p9(struct pt_regs *regs,
struct mce_error_info *mce_err, uint64_t *addr)
{
uint64_t dsisr = regs->dsisr;
mce_err->severity = MCE_SEV_ERROR_SYNC;
mce_err->initiator = MCE_INITIATOR_CPU;
if (dsisr & P9_DSISR_MC_USER_TLBIE)
*addr = regs->nip;
else
*addr = regs->dar;
if (dsisr & P9_DSISR_MC_UE) {
mce_err->error_type = MCE_ERROR_TYPE_UE;
mce_err->u.ue_error_type = MCE_UE_ERROR_LOAD_STORE;
} else if (dsisr & P9_DSISR_MC_UE_TABLEWALK) {
mce_err->error_type = MCE_ERROR_TYPE_UE;
mce_err->u.ue_error_type = MCE_UE_ERROR_PAGE_TABLE_WALK_LOAD_STORE;
} else if (dsisr & P9_DSISR_MC_LINK_LOAD_TIMEOUT) {
mce_err->error_type = MCE_ERROR_TYPE_LINK;
mce_err->u.link_error_type = MCE_LINK_ERROR_LOAD_TIMEOUT;
} else if (dsisr & P9_DSISR_MC_LINK_TABLEWALK_TIMEOUT) {
mce_err->error_type = MCE_ERROR_TYPE_LINK;
mce_err->u.link_error_type = MCE_LINK_ERROR_PAGE_TABLE_WALK_LOAD_STORE_TIMEOUT;
} else if (dsisr & P9_DSISR_MC_ERAT_MULTIHIT) {
mce_err->error_type = MCE_ERROR_TYPE_ERAT;
mce_err->u.erat_error_type = MCE_ERAT_ERROR_MULTIHIT;
} else if (dsisr & P9_DSISR_MC_TLB_MULTIHIT_MFTLB) {
mce_err->error_type = MCE_ERROR_TYPE_TLB;
mce_err->u.tlb_error_type = MCE_TLB_ERROR_MULTIHIT;
} else if (dsisr & P9_DSISR_MC_USER_TLBIE) {
mce_err->error_type = MCE_ERROR_TYPE_USER;
mce_err->u.user_error_type = MCE_USER_ERROR_TLBIE;
} else if (dsisr & P9_DSISR_MC_SLB_PARITY_MFSLB) {
mce_err->error_type = MCE_ERROR_TYPE_SLB;
mce_err->u.slb_error_type = MCE_SLB_ERROR_PARITY;
} else if (dsisr & P9_DSISR_MC_SLB_MULTIHIT_MFSLB) {
mce_err->error_type = MCE_ERROR_TYPE_SLB;
mce_err->u.slb_error_type = MCE_SLB_ERROR_MULTIHIT;
} else if (dsisr & P9_DSISR_MC_RA_LOAD) {
mce_err->error_type = MCE_ERROR_TYPE_RA;
mce_err->u.ra_error_type = MCE_RA_ERROR_LOAD;
} else if (dsisr & P9_DSISR_MC_RA_TABLEWALK) {
mce_err->error_type = MCE_ERROR_TYPE_RA;
mce_err->u.ra_error_type = MCE_RA_ERROR_PAGE_TABLE_WALK_LOAD_STORE;
} else if (dsisr & P9_DSISR_MC_RA_TABLEWALK_FOREIGN) {
mce_err->error_type = MCE_ERROR_TYPE_RA;
mce_err->u.ra_error_type = MCE_RA_ERROR_PAGE_TABLE_WALK_LOAD_STORE_FOREIGN;
} else if (dsisr & P9_DSISR_MC_RA_FOREIGN) {
mce_err->error_type = MCE_ERROR_TYPE_RA;
mce_err->u.ra_error_type = MCE_RA_ERROR_LOAD_STORE_FOREIGN;
}
}
static void mce_get_ierror_p9(struct pt_regs *regs,
struct mce_error_info *mce_err, uint64_t *addr)
{
uint64_t srr1 = regs->msr;
switch (P9_SRR1_MC_IFETCH(srr1)) {
case P9_SRR1_MC_IFETCH_RA_ASYNC_STORE:
case P9_SRR1_MC_IFETCH_LINK_ASYNC_STORE_TIMEOUT:
mce_err->severity = MCE_SEV_FATAL;
break;
default:
mce_err->severity = MCE_SEV_ERROR_SYNC;
break;
}
mce_err->initiator = MCE_INITIATOR_CPU;
*addr = regs->nip;
switch (P9_SRR1_MC_IFETCH(srr1)) {
case P9_SRR1_MC_IFETCH_UE:
mce_err->error_type = MCE_ERROR_TYPE_UE;
mce_err->u.ue_error_type = MCE_UE_ERROR_IFETCH;
break;
case P9_SRR1_MC_IFETCH_SLB_PARITY:
mce_err->error_type = MCE_ERROR_TYPE_SLB;
mce_err->u.slb_error_type = MCE_SLB_ERROR_PARITY;
break;
case P9_SRR1_MC_IFETCH_SLB_MULTIHIT:
mce_err->error_type = MCE_ERROR_TYPE_SLB;
mce_err->u.slb_error_type = MCE_SLB_ERROR_MULTIHIT;
break;
case P9_SRR1_MC_IFETCH_ERAT_MULTIHIT:
mce_err->error_type = MCE_ERROR_TYPE_ERAT;
mce_err->u.erat_error_type = MCE_ERAT_ERROR_MULTIHIT;
break;
case P9_SRR1_MC_IFETCH_TLB_MULTIHIT:
mce_err->error_type = MCE_ERROR_TYPE_TLB;
mce_err->u.tlb_error_type = MCE_TLB_ERROR_MULTIHIT;
break;
case P9_SRR1_MC_IFETCH_UE_TLB_RELOAD:
mce_err->error_type = MCE_ERROR_TYPE_UE;
mce_err->u.ue_error_type = MCE_UE_ERROR_PAGE_TABLE_WALK_IFETCH;
break;
case P9_SRR1_MC_IFETCH_LINK_TIMEOUT:
mce_err->error_type = MCE_ERROR_TYPE_LINK;
mce_err->u.link_error_type = MCE_LINK_ERROR_IFETCH_TIMEOUT;
break;
case P9_SRR1_MC_IFETCH_LINK_TABLEWALK_TIMEOUT:
mce_err->error_type = MCE_ERROR_TYPE_LINK;
mce_err->u.link_error_type = MCE_LINK_ERROR_PAGE_TABLE_WALK_IFETCH_TIMEOUT;
break;
case P9_SRR1_MC_IFETCH_RA:
mce_err->error_type = MCE_ERROR_TYPE_RA;
mce_err->u.ra_error_type = MCE_RA_ERROR_IFETCH;
break;
case P9_SRR1_MC_IFETCH_RA_TABLEWALK:
mce_err->error_type = MCE_ERROR_TYPE_RA;
mce_err->u.ra_error_type = MCE_RA_ERROR_PAGE_TABLE_WALK_IFETCH;
break;
case P9_SRR1_MC_IFETCH_RA_ASYNC_STORE:
mce_err->error_type = MCE_ERROR_TYPE_RA;
mce_err->u.ra_error_type = MCE_RA_ERROR_STORE;
break;
case P9_SRR1_MC_IFETCH_LINK_ASYNC_STORE_TIMEOUT:
mce_err->error_type = MCE_ERROR_TYPE_LINK;
mce_err->u.link_error_type = MCE_LINK_ERROR_STORE_TIMEOUT;
break;
case P9_SRR1_MC_IFETCH_RA_TABLEWALK_FOREIGN:
mce_err->error_type = MCE_ERROR_TYPE_RA;
mce_err->u.ra_error_type = MCE_RA_ERROR_PAGE_TABLE_WALK_IFETCH_FOREIGN;
break;
default:
break;
}
return mce_handle_error(regs, mce_p8_derror_table, mce_p8_ierror_table);
}
long __machine_check_early_realmode_p9(struct pt_regs *regs)
{
uint64_t nip, addr;
long handled;
struct mce_error_info mce_error_info = { 0 };
nip = regs->nip;
if (P9_SRR1_MC_LOADSTORE(regs->msr)) {
handled = mce_handle_derror_p9(regs);
mce_get_derror_p9(regs, &mce_error_info, &addr);
} else {
handled = mce_handle_ierror_p9(regs);
mce_get_ierror_p9(regs, &mce_error_info, &addr);
}
/* Handle UE error. */
if (mce_error_info.error_type == MCE_ERROR_TYPE_UE)
handled = mce_handle_ue_error(regs);
save_mce_event(regs, handled, &mce_error_info, nip, addr);
return handled;
return mce_handle_error(regs, mce_p9_derror_table, mce_p9_ierror_table);
}

View File

@ -243,10 +243,10 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p)
/*
* 2. branch to optimized_callback() and emulate_step()
*/
kprobe_lookup_name("optimized_callback", op_callback_addr);
kprobe_lookup_name("emulate_step", emulate_step_addr);
op_callback_addr = (kprobe_opcode_t *)ppc_kallsyms_lookup_name("optimized_callback");
emulate_step_addr = (kprobe_opcode_t *)ppc_kallsyms_lookup_name("emulate_step");
if (!op_callback_addr || !emulate_step_addr) {
WARN(1, "kprobe_lookup_name() failed\n");
WARN(1, "Unable to lookup optimized_callback()/emulate_step()\n");
goto error;
}

View File

@ -245,3 +245,24 @@ void __init free_unused_pacas(void)
free_lppacas();
}
void copy_mm_to_paca(struct mm_struct *mm)
{
#ifdef CONFIG_PPC_BOOK3S
mm_context_t *context = &mm->context;
get_paca()->mm_ctx_id = context->id;
#ifdef CONFIG_PPC_MM_SLICES
VM_BUG_ON(!mm->context.addr_limit);
get_paca()->addr_limit = mm->context.addr_limit;
get_paca()->mm_ctx_low_slices_psize = context->low_slices_psize;
memcpy(&get_paca()->mm_ctx_high_slices_psize,
&context->high_slices_psize, TASK_SLICE_ARRAY_SZ(mm));
#else /* CONFIG_PPC_MM_SLICES */
get_paca()->mm_ctx_user_psize = context->user_psize;
get_paca()->mm_ctx_sllp = context->sllp;
#endif
#else /* CONFIG_PPC_BOOK3S */
return;
#endif
}

View File

@ -55,7 +55,6 @@
#include <asm/kexec.h>
#include <asm/opal.h>
#include <asm/fadump.h>
#include <asm/debug.h>
#include <asm/epapr_hcalls.h>
#include <asm/firmware.h>

View File

@ -815,7 +815,7 @@ struct ibm_arch_vec __cacheline_aligned ibm_architecture_vec = {
.virt_base = cpu_to_be32(0xffffffff),
.virt_size = cpu_to_be32(0xffffffff),
.load_base = cpu_to_be32(0xffffffff),
.min_rma = cpu_to_be32(256), /* 256MB min RMA */
.min_rma = cpu_to_be32(512), /* 512MB min RMA */
.min_load = cpu_to_be32(0xffffffff), /* full client load */
.min_rma_percent = 0, /* min RMA percentage of total RAM */
.max_pft_size = 48, /* max log_2(hash table size) */

View File

@ -31,11 +31,11 @@
#include <linux/unistd.h>
#include <linux/serial.h>
#include <linux/serial_8250.h>
#include <linux/debugfs.h>
#include <linux/percpu.h>
#include <linux/memblock.h>
#include <linux/of_platform.h>
#include <linux/hugetlb.h>
#include <asm/debugfs.h>
#include <asm/io.h>
#include <asm/paca.h>
#include <asm/prom.h>
@ -920,6 +920,15 @@ void __init setup_arch(char **cmdline_p)
init_mm.end_code = (unsigned long) _etext;
init_mm.end_data = (unsigned long) _edata;
init_mm.brk = klimit;
#ifdef CONFIG_PPC_MM_SLICES
#ifdef CONFIG_PPC64
init_mm.context.addr_limit = TASK_SIZE_128TB;
#else
#error "context.addr_limit not initialized."
#endif
#endif
#ifdef CONFIG_PPC_64K_PAGES
init_mm.context.pte_frag = NULL;
#endif

View File

@ -230,8 +230,8 @@ static void cpu_ready_for_interrupts(void)
* If we are not in hypervisor mode the job is done once for
* the whole partition in configure_exceptions().
*/
if (early_cpu_has_feature(CPU_FTR_HVMODE) &&
early_cpu_has_feature(CPU_FTR_ARCH_207S)) {
if (cpu_has_feature(CPU_FTR_HVMODE) &&
cpu_has_feature(CPU_FTR_ARCH_207S)) {
unsigned long lpcr = mfspr(SPRN_LPCR);
mtspr(SPRN_LPCR, lpcr | LPCR_AIL_3);
}
@ -637,6 +637,11 @@ void __init emergency_stack_init(void)
paca[i].emergency_sp = (void *)ti + THREAD_SIZE;
#ifdef CONFIG_PPC_BOOK3S_64
/* emergency stack for NMI exception handling. */
ti = __va(memblock_alloc_base(THREAD_SIZE, THREAD_SIZE, limit));
klp_init_thread_info(ti);
paca[i].nmi_emergency_sp = (void *)ti + THREAD_SIZE;
/* emergency stack for machine check exception handling. */
ti = __va(memblock_alloc_base(THREAD_SIZE, THREAD_SIZE, limit));
klp_init_thread_info(ti);

View File

@ -39,6 +39,7 @@
#include <asm/irq.h>
#include <asm/hw_irq.h>
#include <asm/kvm_ppc.h>
#include <asm/dbell.h>
#include <asm/page.h>
#include <asm/pgtable.h>
#include <asm/prom.h>
@ -86,8 +87,6 @@ volatile unsigned int cpu_callin_map[NR_CPUS];
int smt_enabled_at_boot = 1;
static void (*crash_ipi_function_ptr)(struct pt_regs *) = NULL;
/*
* Returns 1 if the specified cpu should be brought up during boot.
* Used to inhibit booting threads if they've been disabled or
@ -158,32 +157,33 @@ static irqreturn_t tick_broadcast_ipi_action(int irq, void *data)
return IRQ_HANDLED;
}
static irqreturn_t debug_ipi_action(int irq, void *data)
#ifdef CONFIG_NMI_IPI
static irqreturn_t nmi_ipi_action(int irq, void *data)
{
if (crash_ipi_function_ptr) {
crash_ipi_function_ptr(get_irq_regs());
return IRQ_HANDLED;
}
#ifdef CONFIG_DEBUGGER
debugger_ipi(get_irq_regs());
#endif /* CONFIG_DEBUGGER */
smp_handle_nmi_ipi(get_irq_regs());
return IRQ_HANDLED;
}
#endif
static irq_handler_t smp_ipi_action[] = {
[PPC_MSG_CALL_FUNCTION] = call_function_action,
[PPC_MSG_RESCHEDULE] = reschedule_action,
[PPC_MSG_TICK_BROADCAST] = tick_broadcast_ipi_action,
[PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action,
#ifdef CONFIG_NMI_IPI
[PPC_MSG_NMI_IPI] = nmi_ipi_action,
#endif
};
/*
* The NMI IPI is a fallback and not truly non-maskable. It is simpler
* than going through the call function infrastructure, and strongly
* serialized, so it is more appropriate for debugging.
*/
const char *smp_ipi_name[] = {
[PPC_MSG_CALL_FUNCTION] = "ipi call function",
[PPC_MSG_RESCHEDULE] = "ipi reschedule",
[PPC_MSG_TICK_BROADCAST] = "ipi tick-broadcast",
[PPC_MSG_DEBUGGER_BREAK] = "ipi debugger",
[PPC_MSG_NMI_IPI] = "nmi ipi",
};
/* optional function to request ipi, for controllers with >= 4 ipis */
@ -191,14 +191,13 @@ int smp_request_message_ipi(int virq, int msg)
{
int err;
if (msg < 0 || msg > PPC_MSG_DEBUGGER_BREAK) {
if (msg < 0 || msg > PPC_MSG_NMI_IPI)
return -EINVAL;
}
#if !defined(CONFIG_DEBUGGER) && !defined(CONFIG_KEXEC_CORE)
if (msg == PPC_MSG_DEBUGGER_BREAK) {
#ifndef CONFIG_NMI_IPI
if (msg == PPC_MSG_NMI_IPI)
return 1;
}
#endif
err = request_irq(virq, smp_ipi_action[msg],
IRQF_PERCPU | IRQF_NO_THREAD | IRQF_NO_SUSPEND,
smp_ipi_name[msg], NULL);
@ -211,17 +210,9 @@ int smp_request_message_ipi(int virq, int msg)
#ifdef CONFIG_PPC_SMP_MUXED_IPI
struct cpu_messages {
long messages; /* current messages */
unsigned long data; /* data for cause ipi */
};
static DEFINE_PER_CPU_SHARED_ALIGNED(struct cpu_messages, ipi_message);
void smp_muxed_ipi_set_data(int cpu, unsigned long data)
{
struct cpu_messages *info = &per_cpu(ipi_message, cpu);
info->data = data;
}
void smp_muxed_ipi_set_message(int cpu, int msg)
{
struct cpu_messages *info = &per_cpu(ipi_message, cpu);
@ -236,14 +227,13 @@ void smp_muxed_ipi_set_message(int cpu, int msg)
void smp_muxed_ipi_message_pass(int cpu, int msg)
{
struct cpu_messages *info = &per_cpu(ipi_message, cpu);
smp_muxed_ipi_set_message(cpu, msg);
/*
* cause_ipi functions are required to include a full barrier
* before doing whatever causes the IPI.
*/
smp_ops->cause_ipi(cpu, info->data);
smp_ops->cause_ipi(cpu);
}
#ifdef __BIG_ENDIAN__
@ -254,11 +244,18 @@ void smp_muxed_ipi_message_pass(int cpu, int msg)
irqreturn_t smp_ipi_demux(void)
{
struct cpu_messages *info = this_cpu_ptr(&ipi_message);
unsigned long all;
mb(); /* order any irq clear */
return smp_ipi_demux_relaxed();
}
/* sync-free variant. Callers should ensure synchronization */
irqreturn_t smp_ipi_demux_relaxed(void)
{
struct cpu_messages *info;
unsigned long all;
info = this_cpu_ptr(&ipi_message);
do {
all = xchg(&info->messages, 0);
#if defined(CONFIG_KVM_XICS) && defined(CONFIG_KVM_BOOK3S_HV_POSSIBLE)
@ -278,8 +275,10 @@ irqreturn_t smp_ipi_demux(void)
scheduler_ipi();
if (all & IPI_MESSAGE(PPC_MSG_TICK_BROADCAST))
tick_broadcast_ipi_handler();
if (all & IPI_MESSAGE(PPC_MSG_DEBUGGER_BREAK))
debug_ipi_action(0, NULL);
#ifdef CONFIG_NMI_IPI
if (all & IPI_MESSAGE(PPC_MSG_NMI_IPI))
nmi_ipi_action(0, NULL);
#endif
} while (info->messages);
return IRQ_HANDLED;
@ -316,6 +315,187 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask)
do_message_pass(cpu, PPC_MSG_CALL_FUNCTION);
}
#ifdef CONFIG_NMI_IPI
/*
* "NMI IPI" system.
*
* NMI IPIs may not be recoverable, so should not be used as ongoing part of
* a running system. They can be used for crash, debug, halt/reboot, etc.
*
* NMI IPIs are globally single threaded. No more than one in progress at
* any time.
*
* The IPI call waits with interrupts disabled until all targets enter the
* NMI handler, then the call returns.
*
* No new NMI can be initiated until targets exit the handler.
*
* The IPI call may time out without all targets entering the NMI handler.
* In that case, there is some logic to recover (and ignore subsequent
* NMI interrupts that may eventually be raised), but the platform interrupt
* handler may not be able to distinguish this from other exception causes,
* which may cause a crash.
*/
static atomic_t __nmi_ipi_lock = ATOMIC_INIT(0);
static struct cpumask nmi_ipi_pending_mask;
static int nmi_ipi_busy_count = 0;
static void (*nmi_ipi_function)(struct pt_regs *) = NULL;
static void nmi_ipi_lock_start(unsigned long *flags)
{
raw_local_irq_save(*flags);
hard_irq_disable();
while (atomic_cmpxchg(&__nmi_ipi_lock, 0, 1) == 1) {
raw_local_irq_restore(*flags);
cpu_relax();
raw_local_irq_save(*flags);
hard_irq_disable();
}
}
static void nmi_ipi_lock(void)
{
while (atomic_cmpxchg(&__nmi_ipi_lock, 0, 1) == 1)
cpu_relax();
}
static void nmi_ipi_unlock(void)
{
smp_mb();
WARN_ON(atomic_read(&__nmi_ipi_lock) != 1);
atomic_set(&__nmi_ipi_lock, 0);
}
static void nmi_ipi_unlock_end(unsigned long *flags)
{
nmi_ipi_unlock();
raw_local_irq_restore(*flags);
}
/*
* Platform NMI handler calls this to ack
*/
int smp_handle_nmi_ipi(struct pt_regs *regs)
{
void (*fn)(struct pt_regs *);
unsigned long flags;
int me = raw_smp_processor_id();
int ret = 0;
/*
* Unexpected NMIs are possible here because the interrupt may not
* be able to distinguish NMI IPIs from other types of NMIs, or
* because the caller may have timed out.
*/
nmi_ipi_lock_start(&flags);
if (!nmi_ipi_busy_count)
goto out;
if (!cpumask_test_cpu(me, &nmi_ipi_pending_mask))
goto out;
fn = nmi_ipi_function;
if (!fn)
goto out;
cpumask_clear_cpu(me, &nmi_ipi_pending_mask);
nmi_ipi_busy_count++;
nmi_ipi_unlock();
ret = 1;
fn(regs);
nmi_ipi_lock();
nmi_ipi_busy_count--;
out:
nmi_ipi_unlock_end(&flags);
return ret;
}
static void do_smp_send_nmi_ipi(int cpu)
{
if (smp_ops->cause_nmi_ipi && smp_ops->cause_nmi_ipi(cpu))
return;
if (cpu >= 0) {
do_message_pass(cpu, PPC_MSG_NMI_IPI);
} else {
int c;
for_each_online_cpu(c) {
if (c == raw_smp_processor_id())
continue;
do_message_pass(c, PPC_MSG_NMI_IPI);
}
}
}
/*
* - cpu is the target CPU (must not be this CPU), or NMI_IPI_ALL_OTHERS.
* - fn is the target callback function.
* - delay_us > 0 is the delay before giving up waiting for targets to
* enter the handler, == 0 specifies indefinite delay.
*/
static int smp_send_nmi_ipi(int cpu, void (*fn)(struct pt_regs *), u64 delay_us)
{
unsigned long flags;
int me = raw_smp_processor_id();
int ret = 1;
BUG_ON(cpu == me);
BUG_ON(cpu < 0 && cpu != NMI_IPI_ALL_OTHERS);
if (unlikely(!smp_ops))
return 0;
/* Take the nmi_ipi_busy count/lock with interrupts hard disabled */
nmi_ipi_lock_start(&flags);
while (nmi_ipi_busy_count) {
nmi_ipi_unlock_end(&flags);
cpu_relax();
nmi_ipi_lock_start(&flags);
}
nmi_ipi_function = fn;
if (cpu < 0) {
/* ALL_OTHERS */
cpumask_copy(&nmi_ipi_pending_mask, cpu_online_mask);
cpumask_clear_cpu(me, &nmi_ipi_pending_mask);
} else {
/* cpumask starts clear */
cpumask_set_cpu(cpu, &nmi_ipi_pending_mask);
}
nmi_ipi_busy_count++;
nmi_ipi_unlock();
do_smp_send_nmi_ipi(cpu);
while (!cpumask_empty(&nmi_ipi_pending_mask)) {
udelay(1);
if (delay_us) {
delay_us--;
if (!delay_us)
break;
}
}
nmi_ipi_lock();
if (!cpumask_empty(&nmi_ipi_pending_mask)) {
/* Could not gather all CPUs */
ret = 0;
cpumask_clear(&nmi_ipi_pending_mask);
}
nmi_ipi_busy_count--;
nmi_ipi_unlock_end(&flags);
return ret;
}
#endif /* CONFIG_NMI_IPI */
#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
void tick_broadcast(const struct cpumask *mask)
{
@ -326,29 +506,22 @@ void tick_broadcast(const struct cpumask *mask)
}
#endif
#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC_CORE)
#ifdef CONFIG_DEBUGGER
void debugger_ipi_callback(struct pt_regs *regs)
{
debugger_ipi(regs);
}
void smp_send_debugger_break(void)
{
int cpu;
int me = raw_smp_processor_id();
if (unlikely(!smp_ops))
return;
for_each_online_cpu(cpu)
if (cpu != me)
do_message_pass(cpu, PPC_MSG_DEBUGGER_BREAK);
smp_send_nmi_ipi(NMI_IPI_ALL_OTHERS, debugger_ipi_callback, 1000000);
}
#endif
#ifdef CONFIG_KEXEC_CORE
void crash_send_ipi(void (*crash_ipi_callback)(struct pt_regs *))
{
crash_ipi_function_ptr = crash_ipi_callback;
if (crash_ipi_callback) {
mb();
smp_send_debugger_break();
}
smp_send_nmi_ipi(NMI_IPI_ALL_OTHERS, crash_ipi_callback, 1000000);
}
#endif
@ -439,7 +612,21 @@ int generic_cpu_disable(void)
#ifdef CONFIG_PPC64
vdso_data->processorCount--;
#endif
migrate_irqs();
/* Update affinity of all IRQs previously aimed at this CPU */
irq_migrate_all_off_this_cpu();
/*
* Depending on the details of the interrupt controller, it's possible
* that one of the interrupts we just migrated away from this CPU is
* actually already pending on this CPU. If we leave it in that state
* the interrupt will never be EOI'ed, and will never fire again. So
* temporarily enable interrupts here, to allow any pending interrupt to
* be received (and EOI'ed), before we take this CPU offline.
*/
local_irq_enable();
mdelay(1);
local_irq_disable();
return 0;
}
@ -521,6 +708,16 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle)
cpu_idle_thread_init(cpu, tidle);
/*
* The platform might need to allocate resources prior to bringing
* up the CPU
*/
if (smp_ops->prepare_cpu) {
rc = smp_ops->prepare_cpu(cpu);
if (rc)
return rc;
}
/* Make sure callin-map entry is 0 (can be leftover a CPU
* hotplug
*/

View File

@ -59,7 +59,14 @@ EXPORT_SYMBOL_GPL(save_stack_trace);
void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
{
save_context_stack(trace, tsk->thread.ksp, tsk, 0);
unsigned long sp;
if (tsk == current)
sp = current_stack_pointer();
else
sp = tsk->thread.ksp;
save_context_stack(trace, sp, tsk, 0);
}
EXPORT_SYMBOL_GPL(save_stack_trace_tsk);

View File

@ -10,6 +10,7 @@
*/
#include <linux/sched.h>
#include <linux/suspend.h>
#include <asm/current.h>
#include <asm/mmu_context.h>
#include <asm/switch_to.h>

View File

@ -42,11 +42,11 @@
#include <asm/unistd.h>
#include <asm/asm-prototypes.h>
static inline unsigned long do_mmap2(unsigned long addr, size_t len,
static inline long do_mmap2(unsigned long addr, size_t len,
unsigned long prot, unsigned long flags,
unsigned long fd, unsigned long off, int shift)
{
unsigned long ret = -EINVAL;
long ret = -EINVAL;
if (!arch_validate_prot(prot))
goto out;
@ -62,16 +62,16 @@ static inline unsigned long do_mmap2(unsigned long addr, size_t len,
return ret;
}
unsigned long sys_mmap2(unsigned long addr, size_t len,
unsigned long prot, unsigned long flags,
unsigned long fd, unsigned long pgoff)
SYSCALL_DEFINE6(mmap2, unsigned long, addr, size_t, len,
unsigned long, prot, unsigned long, flags,
unsigned long, fd, unsigned long, pgoff)
{
return do_mmap2(addr, len, prot, flags, fd, pgoff, PAGE_SHIFT-12);
}
unsigned long sys_mmap(unsigned long addr, size_t len,
unsigned long prot, unsigned long flags,
unsigned long fd, off_t offset)
SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len,
unsigned long, prot, unsigned long, flags,
unsigned long, fd, off_t, offset)
{
return do_mmap2(addr, len, prot, flags, fd, offset, PAGE_SHIFT);
}

View File

@ -710,6 +710,10 @@ static int register_cpu_online(unsigned int cpu)
struct device_attribute *attrs, *pmc_attrs;
int i, nattrs;
/* For cpus present at boot a reference was already grabbed in register_cpu() */
if (!s->of_node)
s->of_node = of_get_cpu_node(cpu, NULL);
#ifdef CONFIG_PPC64
if (cpu_has_feature(CPU_FTR_SMT))
device_create_file(s, &dev_attr_smt_snooze_delay);
@ -785,9 +789,9 @@ static int register_cpu_online(unsigned int cpu)
return 0;
}
#ifdef CONFIG_HOTPLUG_CPU
static int unregister_cpu_online(unsigned int cpu)
{
#ifdef CONFIG_HOTPLUG_CPU
struct cpu *c = &per_cpu(cpu_devices, cpu);
struct device *s = &c->dev;
struct device_attribute *attrs, *pmc_attrs;
@ -864,9 +868,13 @@ static int unregister_cpu_online(unsigned int cpu)
}
#endif
cacheinfo_cpu_offline(cpu);
#endif /* CONFIG_HOTPLUG_CPU */
of_node_put(s->of_node);
s->of_node = NULL;
return 0;
}
#else /* !CONFIG_HOTPLUG_CPU */
#define unregister_cpu_online NULL
#endif
#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
ssize_t arch_cpu_probe(const char *buf, size_t count)

View File

@ -0,0 +1,29 @@
#
# Makefile for the powerpc trace subsystem
#
subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror
ifdef CONFIG_FUNCTION_TRACER
# do not trace tracer code
CFLAGS_REMOVE_ftrace.o = -mno-sched-epilog $(CC_FLAGS_FTRACE)
endif
obj32-$(CONFIG_FUNCTION_TRACER) += ftrace_32.o
obj64-$(CONFIG_FUNCTION_TRACER) += ftrace_64.o
ifdef CONFIG_MPROFILE_KERNEL
obj64-$(CONFIG_FUNCTION_TRACER) += ftrace_64_mprofile.o
else
obj64-$(CONFIG_FUNCTION_TRACER) += ftrace_64_pg.o
endif
obj-$(CONFIG_DYNAMIC_FTRACE) += ftrace.o
obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o
obj-$(CONFIG_FTRACE_SYSCALLS) += ftrace.o
obj-$(CONFIG_TRACING) += trace_clock.o
obj-$(CONFIG_PPC64) += $(obj64-y)
obj-$(CONFIG_PPC32) += $(obj32-y)
# Disable GCOV & sanitizers in odd or sensitive code
GCOV_PROFILE_ftrace.o := n
UBSAN_SANITIZE_ftrace.o := n

View File

@ -21,6 +21,7 @@
#include <linux/init.h>
#include <linux/list.h>
#include <asm/asm-prototypes.h>
#include <asm/cacheflush.h>
#include <asm/code-patching.h>
#include <asm/ftrace.h>

View File

@ -0,0 +1,118 @@
/*
* Split from entry_32.S
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
#include <linux/magic.h>
#include <asm/reg.h>
#include <asm/ppc_asm.h>
#include <asm/asm-offsets.h>
#include <asm/ftrace.h>
#include <asm/export.h>
#ifdef CONFIG_DYNAMIC_FTRACE
_GLOBAL(mcount)
_GLOBAL(_mcount)
/*
* It is required that _mcount on PPC32 must preserve the
* link register. But we have r0 to play with. We use r0
* to push the return address back to the caller of mcount
* into the ctr register, restore the link register and
* then jump back using the ctr register.
*/
mflr r0
mtctr r0
lwz r0, 4(r1)
mtlr r0
bctr
_GLOBAL(ftrace_caller)
MCOUNT_SAVE_FRAME
/* r3 ends up with link register */
subi r3, r3, MCOUNT_INSN_SIZE
.globl ftrace_call
ftrace_call:
bl ftrace_stub
nop
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
.globl ftrace_graph_call
ftrace_graph_call:
b ftrace_graph_stub
_GLOBAL(ftrace_graph_stub)
#endif
MCOUNT_RESTORE_FRAME
/* old link register ends up in ctr reg */
bctr
#else
_GLOBAL(mcount)
_GLOBAL(_mcount)
MCOUNT_SAVE_FRAME
subi r3, r3, MCOUNT_INSN_SIZE
LOAD_REG_ADDR(r5, ftrace_trace_function)
lwz r5,0(r5)
mtctr r5
bctrl
nop
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
b ftrace_graph_caller
#endif
MCOUNT_RESTORE_FRAME
bctr
#endif
EXPORT_SYMBOL(_mcount)
_GLOBAL(ftrace_stub)
blr
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
_GLOBAL(ftrace_graph_caller)
/* load r4 with local address */
lwz r4, 44(r1)
subi r4, r4, MCOUNT_INSN_SIZE
/* Grab the LR out of the caller stack frame */
lwz r3,52(r1)
bl prepare_ftrace_return
nop
/*
* prepare_ftrace_return gives us the address we divert to.
* Change the LR in the callers stack frame to this.
*/
stw r3,52(r1)
MCOUNT_RESTORE_FRAME
/* old link register ends up in ctr reg */
bctr
_GLOBAL(return_to_handler)
/* need to save return values */
stwu r1, -32(r1)
stw r3, 20(r1)
stw r4, 16(r1)
stw r31, 12(r1)
mr r31, r1
bl ftrace_return_to_handler
nop
/* return value has real return address */
mtlr r3
lwz r3, 20(r1)
lwz r4, 16(r1)
lwz r31,12(r1)
lwz r1, 0(r1)
/* Jump back to real return address */
blr
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */

View File

@ -0,0 +1,85 @@
/*
* Split from entry_64.S
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
#include <linux/magic.h>
#include <asm/ppc_asm.h>
#include <asm/asm-offsets.h>
#include <asm/ftrace.h>
#include <asm/ppc-opcode.h>
#include <asm/export.h>
#ifdef CONFIG_DYNAMIC_FTRACE
_GLOBAL(mcount)
_GLOBAL(_mcount)
EXPORT_SYMBOL(_mcount)
mflr r12
mtctr r12
mtlr r0
bctr
#else /* CONFIG_DYNAMIC_FTRACE */
_GLOBAL_TOC(_mcount)
EXPORT_SYMBOL(_mcount)
/* Taken from output of objdump from lib64/glibc */
mflr r3
ld r11, 0(r1)
stdu r1, -112(r1)
std r3, 128(r1)
ld r4, 16(r11)
subi r3, r3, MCOUNT_INSN_SIZE
LOAD_REG_ADDR(r5,ftrace_trace_function)
ld r5,0(r5)
ld r5,0(r5)
mtctr r5
bctrl
nop
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
b ftrace_graph_caller
#endif
ld r0, 128(r1)
mtlr r0
addi r1, r1, 112
_GLOBAL(ftrace_stub)
blr
#endif /* CONFIG_DYNAMIC_FTRACE */
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
_GLOBAL(return_to_handler)
/* need to save return values */
std r4, -32(r1)
std r3, -24(r1)
/* save TOC */
std r2, -16(r1)
std r31, -8(r1)
mr r31, r1
stdu r1, -112(r1)
/*
* We might be called from a module.
* Switch to our TOC to run inside the core kernel.
*/
ld r2, PACATOC(r13)
bl ftrace_return_to_handler
nop
/* return value has real return address */
mtlr r3
ld r1, 0(r1)
ld r4, -32(r1)
ld r3, -24(r1)
ld r2, -16(r1)
ld r31, -8(r1)
/* Jump back to real return address */
blr
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */

View File

@ -0,0 +1,272 @@
/*
* Split from ftrace_64.S
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
#include <linux/magic.h>
#include <asm/ppc_asm.h>
#include <asm/asm-offsets.h>
#include <asm/ftrace.h>
#include <asm/ppc-opcode.h>
#include <asm/export.h>
#include <asm/thread_info.h>
#include <asm/bug.h>
#include <asm/ptrace.h>
#ifdef CONFIG_DYNAMIC_FTRACE
/*
*
* ftrace_caller() is the function that replaces _mcount() when ftrace is
* active.
*
* We arrive here after a function A calls function B, and we are the trace
* function for B. When we enter r1 points to A's stack frame, B has not yet
* had a chance to allocate one yet.
*
* Additionally r2 may point either to the TOC for A, or B, depending on
* whether B did a TOC setup sequence before calling us.
*
* On entry the LR points back to the _mcount() call site, and r0 holds the
* saved LR as it was on entry to B, ie. the original return address at the
* call site in A.
*
* Our job is to save the register state into a struct pt_regs (on the stack)
* and then arrange for the ftrace function to be called.
*/
_GLOBAL(ftrace_caller)
/* Save the original return address in A's stack frame */
std r0,LRSAVE(r1)
/* Create our stack frame + pt_regs */
stdu r1,-SWITCH_FRAME_SIZE(r1)
/* Save all gprs to pt_regs */
SAVE_8GPRS(0,r1)
SAVE_8GPRS(8,r1)
SAVE_8GPRS(16,r1)
SAVE_8GPRS(24,r1)
/* Load special regs for save below */
mfmsr r8
mfctr r9
mfxer r10
mfcr r11
/* Get the _mcount() call site out of LR */
mflr r7
/* Save it as pt_regs->nip */
std r7, _NIP(r1)
/* Save the read LR in pt_regs->link */
std r0, _LINK(r1)
/* Save callee's TOC in the ABI compliant location */
std r2, 24(r1)
ld r2,PACATOC(r13) /* get kernel TOC in r2 */
addis r3,r2,function_trace_op@toc@ha
addi r3,r3,function_trace_op@toc@l
ld r5,0(r3)
#ifdef CONFIG_LIVEPATCH
mr r14,r7 /* remember old NIP */
#endif
/* Calculate ip from nip-4 into r3 for call below */
subi r3, r7, MCOUNT_INSN_SIZE
/* Put the original return address in r4 as parent_ip */
mr r4, r0
/* Save special regs */
std r8, _MSR(r1)
std r9, _CTR(r1)
std r10, _XER(r1)
std r11, _CCR(r1)
/* Load &pt_regs in r6 for call below */
addi r6, r1 ,STACK_FRAME_OVERHEAD
/* ftrace_call(r3, r4, r5, r6) */
.globl ftrace_call
ftrace_call:
bl ftrace_stub
nop
/* Load ctr with the possibly modified NIP */
ld r3, _NIP(r1)
mtctr r3
#ifdef CONFIG_LIVEPATCH
cmpd r14,r3 /* has NIP been altered? */
#endif
/* Restore gprs */
REST_8GPRS(0,r1)
REST_8GPRS(8,r1)
REST_8GPRS(16,r1)
REST_8GPRS(24,r1)
/* Restore possibly modified LR */
ld r0, _LINK(r1)
mtlr r0
/* Restore callee's TOC */
ld r2, 24(r1)
/* Pop our stack frame */
addi r1, r1, SWITCH_FRAME_SIZE
#ifdef CONFIG_LIVEPATCH
/* Based on the cmpd above, if the NIP was altered handle livepatch */
bne- livepatch_handler
#endif
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
.globl ftrace_graph_call
ftrace_graph_call:
b ftrace_graph_stub
_GLOBAL(ftrace_graph_stub)
#endif
bctr /* jump after _mcount site */
_GLOBAL(ftrace_stub)
blr
#ifdef CONFIG_LIVEPATCH
/*
* This function runs in the mcount context, between two functions. As
* such it can only clobber registers which are volatile and used in
* function linkage.
*
* We get here when a function A, calls another function B, but B has
* been live patched with a new function C.
*
* On entry:
* - we have no stack frame and can not allocate one
* - LR points back to the original caller (in A)
* - CTR holds the new NIP in C
* - r0 & r12 are free
*
* r0 can't be used as the base register for a DS-form load or store, so
* we temporarily shuffle r1 (stack pointer) into r0 and then put it back.
*/
livepatch_handler:
CURRENT_THREAD_INFO(r12, r1)
/* Save stack pointer into r0 */
mr r0, r1
/* Allocate 3 x 8 bytes */
ld r1, TI_livepatch_sp(r12)
addi r1, r1, 24
std r1, TI_livepatch_sp(r12)
/* Save toc & real LR on livepatch stack */
std r2, -24(r1)
mflr r12
std r12, -16(r1)
/* Store stack end marker */
lis r12, STACK_END_MAGIC@h
ori r12, r12, STACK_END_MAGIC@l
std r12, -8(r1)
/* Restore real stack pointer */
mr r1, r0
/* Put ctr in r12 for global entry and branch there */
mfctr r12
bctrl
/*
* Now we are returning from the patched function to the original
* caller A. We are free to use r0 and r12, and we can use r2 until we
* restore it.
*/
CURRENT_THREAD_INFO(r12, r1)
/* Save stack pointer into r0 */
mr r0, r1
ld r1, TI_livepatch_sp(r12)
/* Check stack marker hasn't been trashed */
lis r2, STACK_END_MAGIC@h
ori r2, r2, STACK_END_MAGIC@l
ld r12, -8(r1)
1: tdne r12, r2
EMIT_BUG_ENTRY 1b, __FILE__, __LINE__ - 1, 0
/* Restore LR & toc from livepatch stack */
ld r12, -16(r1)
mtlr r12
ld r2, -24(r1)
/* Pop livepatch stack frame */
CURRENT_THREAD_INFO(r12, r0)
subi r1, r1, 24
std r1, TI_livepatch_sp(r12)
/* Restore real stack pointer */
mr r1, r0
/* Return to original caller of live patched function */
blr
#endif /* CONFIG_LIVEPATCH */
#endif /* CONFIG_DYNAMIC_FTRACE */
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
_GLOBAL(ftrace_graph_caller)
stdu r1, -112(r1)
/* with -mprofile-kernel, parameter regs are still alive at _mcount */
std r10, 104(r1)
std r9, 96(r1)
std r8, 88(r1)
std r7, 80(r1)
std r6, 72(r1)
std r5, 64(r1)
std r4, 56(r1)
std r3, 48(r1)
/* Save callee's TOC in the ABI compliant location */
std r2, 24(r1)
ld r2, PACATOC(r13) /* get kernel TOC in r2 */
mfctr r4 /* ftrace_caller has moved local addr here */
std r4, 40(r1)
mflr r3 /* ftrace_caller has restored LR from stack */
subi r4, r4, MCOUNT_INSN_SIZE
bl prepare_ftrace_return
nop
/*
* prepare_ftrace_return gives us the address we divert to.
* Change the LR to this.
*/
mtlr r3
ld r0, 40(r1)
mtctr r0
ld r10, 104(r1)
ld r9, 96(r1)
ld r8, 88(r1)
ld r7, 80(r1)
ld r6, 72(r1)
ld r5, 64(r1)
ld r4, 56(r1)
ld r3, 48(r1)
/* Restore callee's TOC */
ld r2, 24(r1)
addi r1, r1, 112
mflr r0
std r0, LRSAVE(r1)
bctr
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */

View File

@ -0,0 +1,68 @@
/*
* Split from ftrace_64.S
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
#include <linux/magic.h>
#include <asm/ppc_asm.h>
#include <asm/asm-offsets.h>
#include <asm/ftrace.h>
#include <asm/ppc-opcode.h>
#include <asm/export.h>
#ifdef CONFIG_DYNAMIC_FTRACE
_GLOBAL_TOC(ftrace_caller)
/* Taken from output of objdump from lib64/glibc */
mflr r3
ld r11, 0(r1)
stdu r1, -112(r1)
std r3, 128(r1)
ld r4, 16(r11)
subi r3, r3, MCOUNT_INSN_SIZE
.globl ftrace_call
ftrace_call:
bl ftrace_stub
nop
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
.globl ftrace_graph_call
ftrace_graph_call:
b ftrace_graph_stub
_GLOBAL(ftrace_graph_stub)
#endif
ld r0, 128(r1)
mtlr r0
addi r1, r1, 112
_GLOBAL(ftrace_stub)
blr
#endif /* CONFIG_DYNAMIC_FTRACE */
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
_GLOBAL(ftrace_graph_caller)
/* load r4 with local address */
ld r4, 128(r1)
subi r4, r4, MCOUNT_INSN_SIZE
/* Grab the LR out of the caller stack frame */
ld r11, 112(r1)
ld r3, 16(r11)
bl prepare_ftrace_return
nop
/*
* prepare_ftrace_return gives us the address we divert to.
* Change the LR in the callers stack frame to this.
*/
ld r11, 112(r1)
std r3, 16(r11)
ld r0, 128(r1)
mtlr r0
addi r1, r1, 112
blr
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */

View File

@ -35,13 +35,13 @@
#include <linux/backlight.h>
#include <linux/bug.h>
#include <linux/kdebug.h>
#include <linux/debugfs.h>
#include <linux/ratelimit.h>
#include <linux/context_tracking.h>
#include <asm/emulated_ops.h>
#include <asm/pgtable.h>
#include <linux/uaccess.h>
#include <asm/debugfs.h>
#include <asm/io.h>
#include <asm/machdep.h>
#include <asm/rtas.h>
@ -279,18 +279,35 @@ void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
void system_reset_exception(struct pt_regs *regs)
{
/*
* Avoid crashes in case of nested NMI exceptions. Recoverability
* is determined by RI and in_nmi
*/
bool nested = in_nmi();
if (!nested)
nmi_enter();
/* See if any machine dependent calls */
if (ppc_md.system_reset_exception) {
if (ppc_md.system_reset_exception(regs))
return;
goto out;
}
die("System Reset", regs, SIGABRT);
out:
#ifdef CONFIG_PPC_BOOK3S_64
BUG_ON(get_paca()->in_nmi == 0);
if (get_paca()->in_nmi > 1)
panic("Unrecoverable nested System Reset");
#endif
/* Must die if the interrupt is not recoverable */
if (!(regs->msr & MSR_RI))
panic("Unrecoverable System Reset");
if (!nested)
nmi_exit();
/* What should we do here? We could issue a shutdown or hard reset. */
}
@ -306,8 +323,6 @@ long machine_check_early(struct pt_regs *regs)
__this_cpu_inc(irq_stat.mce_exceptions);
add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
if (cur_cpu_spec && cur_cpu_spec->machine_check_early)
handled = cur_cpu_spec->machine_check_early(regs);
return handled;
@ -741,6 +756,8 @@ void machine_check_exception(struct pt_regs *regs)
__this_cpu_inc(irq_stat.mce_exceptions);
add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
/* See if any machine dependent calls. In theory, we would want
* to call the CPU first, and call the ppc_md. one if the CPU
* one returns a positive number. However there is existing code
@ -1440,6 +1457,8 @@ void facility_unavailable_exception(struct pt_regs *regs)
[FSCR_TM_LG] = "TM",
[FSCR_EBB_LG] = "EBB",
[FSCR_TAR_LG] = "TAR",
[FSCR_MSGP_LG] = "MSGP",
[FSCR_SCV_LG] = "SCV",
};
char *facility = "unknown";
u64 value;

View File

@ -77,6 +77,8 @@ SECTIONS
#endif
} :kernel
__head_end = .;
/*
* If the build dies here, it's likely code in head_64.S is referencing
* labels it can't reach, and the linker inserting stubs without the

View File

@ -20,6 +20,10 @@
#include <linux/slab.h>
#include <linux/module.h>
#include <linux/miscdevice.h>
#include <linux/gfp.h>
#include <linux/sched.h>
#include <linux/vmalloc.h>
#include <linux/highmem.h>
#include <asm/reg.h>
#include <asm/cputable.h>
@ -31,10 +35,6 @@
#include <asm/kvm_book3s.h>
#include <asm/mmu_context.h>
#include <asm/page.h>
#include <linux/gfp.h>
#include <linux/sched.h>
#include <linux/vmalloc.h>
#include <linux/highmem.h>
#include "book3s.h"
#include "trace.h"

View File

@ -229,6 +229,7 @@ void kvmppc_mmu_unmap_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte)
static struct kvmppc_sid_map *create_sid_map(struct kvm_vcpu *vcpu, u64 gvsid)
{
unsigned long vsid_bits = VSID_BITS_65_256M;
struct kvmppc_sid_map *map;
struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
u16 sid_map_mask;
@ -257,7 +258,12 @@ static struct kvmppc_sid_map *create_sid_map(struct kvm_vcpu *vcpu, u64 gvsid)
kvmppc_mmu_pte_flush(vcpu, 0, 0);
kvmppc_mmu_flush_segments(vcpu);
}
map->host_vsid = vsid_scramble(vcpu_book3s->proto_vsid_next++, 256M);
if (mmu_has_feature(MMU_FTR_68_BIT_VA))
vsid_bits = VSID_BITS_256M;
map->host_vsid = vsid_scramble(vcpu_book3s->proto_vsid_next++,
VSID_MULTIPLIER_256M, vsid_bits);
map->guest_vsid = gvsid;
map->valid = true;
@ -390,7 +396,7 @@ int kvmppc_mmu_init(struct kvm_vcpu *vcpu)
struct kvmppc_vcpu_book3s *vcpu3s = to_book3s(vcpu);
int err;
err = __init_new_context();
err = hash__alloc_context_id();
if (err < 0)
return -1;
vcpu3s->context_id[0] = err;

Some files were not shown because too many files have changed in this diff Show More