Commit Graph

793 Commits

Author SHA1 Message Date
Nicholas Piggin 544686cae8 powerpc/64s: Stop using bit in HSPRG0 to test winkle
The POWER8 idle code has a neat trick of programming the power on engine
to restore a low bit into HSPRG0, so idle wakeup code can test and see
if it has been programmed this way and therefore lost all state. Restore
time can be reduced if winkle has not been reached.

However this messes with our r13 PACA pointer, and requires HSPRG0 to be
written to. It also optimizes the slowest and most uncommon case at the
expense of another SPR write in the common nap state wakeup.

Remove this complexity and assume winkle sleeps always require a state
restore. This speedup could be made entirely contained within the winkle
idle code by counting per-core winkles and setting a thread bitmap when
all have gone to winkle.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23 20:31:39 +10:00
Yongji Xie 3827463769 powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned
Override pcibios_default_alignment() to set default alignment to PAGE_SIZE
for all PCI devices on PowerNV platform.  Thus sub-page BARs would not
share a page and could be mapped into guest when VFIO passthrough them.

Signed-off-by: Yongji Xie <elohimes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-19 12:51:26 -05:00
Michael Ellerman 40e275653e powerpc/powernv: Always enable SMP when building powernv
The powernv platform supports Power7 and later CPUs, all of which are
multithreaded and multicore.

As such we never build a SMP=n kernel for those machines, other than
possibly for debugging or running in a simulator.

In the debugging case we can get a similar effect by booting with
nr_cpus=1, or there's always the option of building a custom kernel with
SMP hacked out.

For running in simulators the code size reduction from building without
SMP is not particularly important, what matters is the number of
instructions executed. A quick test shows that a SMP=y kernel takes ~6%
more instructions to boot to a shell. Booting with nr_cpus=1 recovers
about half that deficit.

On the flip side, keeping the SMP=n kernel building can be a pain at
times. And although we've mostly kept it building in recent years, no
one is regularly testing that the SMP=n kernel actually boots and works
well on these machines.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-13 23:37:17 +10:00
Nicholas Piggin 6b3edefefa powerpc/powernv: POWER9 support for msgsnd/doorbell IPI
POWER9 requires msgsync for receiver-side synchronization, and a DD1
workaround restricts IPIs to core-local.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Drop no longer needed asm feature macro changes]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-13 23:34:34 +10:00
Nicholas Piggin b866cc2199 powerpc: Change the doorbell IPI calling convention
Change the doorbell callers to know about their msgsnd addressing,
rather than have them set a per-cpu target data tag at boot that gets
sent to the cause_ipi functions. The data is only used for doorbell IPI
functions, no other IPI types, so it makes sense to keep that detail
local to doorbell.

Have the platform code understand doorbell IPIs, rather than the
interrupt controller code understand them. Platform code can look at
capabilities it has available and decide which to use.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-13 23:34:33 +10:00
Michael Ellerman 3c19d5ada1 Merge branch 'topic/xive' (early part) into next
This merges the arch part of the XIVE support, leaving the final commit
with the KVM specific pieces dangling on the branch for Paul to merge
via the kvm-ppc tree.
2017-04-12 22:31:37 +10:00
Gautham R. Shenoy 17ed4c8f81 powerpc/powernv: Recover correct PACA on wakeup from a stop on P9 DD1
POWER9 DD1.0 hardware has a bug where the SPRs of a thread waking up
from stop 0,1,2 with ESL=1 can endup being misplaced in the core. Thus
the HSPRG0 of a thread waking up from can contain the paca pointer of
its sibling.

This patch implements a context recovery framework within threads of a
core, by provisioning space in paca_struct for saving every sibling
threads's paca pointers. Basically, we should be able to arrive at the
right paca pointer from any of the thread's existing paca pointer.

At bootup, during powernv idle-init, we save the paca address of every
CPU in each one its siblings paca_struct in the slot corresponding to
this CPU's index in the core.

On wakeup from a stop, the thread will determine its index in the core
from the TIR register and recover its PACA pointer by indexing into
the correct slot in the provisioned space in the current PACA.

Furthermore, ensure that the NVGPRs are restored from the stack on the
way out by setting the NAPSTATELOST in paca.

[Changelog written with inputs from svaidy@linux.vnet.ibm.com]
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Call it a bug]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-11 08:45:09 +10:00
Gautham R. Shenoy f3b3f28493 powerpc/powernv/idle: Don't override default/deepest directly in kernel
Currently during idle-init on power9, if we don't find suitable stop
states in the device tree that can be used as the
default_stop/deepest_stop, we set stop0 (ESL=1,EC=1) as the default
stop state psscr to be used by power9_idle and deepest stop state
which is used by CPU-Hotplug.

However, if the platform firmware has not configured or enabled a stop
state, the kernel should not make any assumptions and fallback to a
default choice.

If the kernel uses a stop state that is not configured by the platform
firmware, it may lead to further failures which should be avoided.

In this patch, we modify the init code to ensure that the kernel uses
only the stop states exposed by the firmware through the device
tree. When a suitable default stop state isn't found, we disable
ppc_md.power_save for power9. Similarly, when a suitable
deepest_stop_state is not found in the device tree exported by the
firmware, fall back to the default busy-wait loop in the CPU-Hotplug
code.

[Changelog written with inputs from svaidy@linux.vnet.ibm.com]
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-11 08:45:09 +10:00
Gautham R. Shenoy 9006123157 powerpc/powernv/smp: Add busy-wait loop as fall back for CPU-Hotplug
Currently, the powernv cpu-offline function assumes that platform idle
states such as stop on POWER9, winkle/sleep/nap on POWER8 are always
available. On POWER8, it picks nap as the default state if other deep
idle states like sleep/winkle are not available and enabled in the
platform.

On POWER9, nap is not available and all idle states are managed by
STOP instruction.  The parameters to the idle state are passed through
processor stop status control register (PSSCR).  Hence as such
executing STOP would take parameters from current PSSCR. We do not
want to make any assumptions in kernel on what STOP states and PSSCR
features are configured by the platform.

Ideally platform will configure a good set of stop states that can be
used in the kernel.  We would like to start with a clean slate, if the
platform choose to not configure any state or there is an error in
platform firmware that lead to no stop states being configured or
allowed to be requested.

This patch adds a fallback method for CPU-Hotplug that is similar to
snooze loop at idle where the threads are left to spin at low priority
and hence reduce the cycles consumed.

This is a safe fallback mechanism in the case when no stop state would
be requested if the platform firmware did not configure them most
likely due to an error condition.

Requesting a stop state when the platform has not configured them or
enabled them would lead to further error conditions which could be
difficult to debug.

[Changelog written with inputs from svaidy@linux.vnet.ibm.com]
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-11 08:45:09 +10:00
Gautham R. Shenoy a7cd88da97 powerpc/powernv: Move CPU-Offline idle state invocation from smp.c to idle.c
Move the piece of code in powernv/smp.c::pnv_smp_cpu_kill_self() which
transitions the CPU to the deepest available platform idle state to a
new function named pnv_cpu_offline() in powernv/idle.c. The rationale
behind this code movement is that the data required to determine the
deepest available platform state resides in powernv/idle.c.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-11 08:45:09 +10:00
Michael Ellerman 7644d5819c powerpc: Create asm/debugfs.h and move powerpc_debugfs_root there
powerpc_debugfs_root is the dentry representing the root of the
"powerpc" directory tree in debugfs.

Currently it sits in asm/debug.h, a long with some other things that
have "debug" in the name, but are otherwise unrelated.

Pull it out into a separate header, which also includes linux/debugfs.h,
and convert all the users to include debugfs.h instead of debug.h.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-11 07:46:03 +10:00
Alistair Popple abfe8026b5 powerpc/powernv: Require MMU_NOTIFIER to fix NPU build
In the recent commit 1ab66d1fba ("powerpc/powernv: Introduce address
translation services for Nvlink2") the NPU code gained a dependency on MMU
notifiers.

All our defconfigs have KVM enabled, which selects MMU_NOTIFIER, but if KVM is
not enabled then the build breaks.

Fix it by always selecting MMU_NOTIFIER when we're building powernv.

Fixes: 1ab66d1fba ("powerpc/powernv: Introduce address translation services for Nvlink2")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
[mpe: Reword change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-11 07:46:03 +10:00
Benjamin Herrenschmidt d381d7caf8 powerpc: Consolidate variants of real-mode MMIOs
We have all sort of variants of MMIO accessors for the real mode
instructions. This creates a clean set of accessors based on
Linux normal naming conventions, replacing all occurrences of
the old ones in the tree.

I have purposefully removed the "out/in" variants in favor of
only including __raw variants. Any code using these is already
pretty much hand tuned to operate in a very specific environment.
I've fixed up the 2 users (only one of them actually needed
a barrier in the first place).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-10 21:43:16 +10:00
Benjamin Herrenschmidt 243e25112d powerpc/xive: Native exploitation of the XIVE interrupt controller
The XIVE interrupt controller is the new interrupt controller
found in POWER9. It supports advanced virtualization capabilities
among other things.

Currently we use a set of firmware calls that simulate the old
"XICS" interrupt controller but this is fairly inefficient.

This adds the framework for using XIVE along with a native
backend which OPAL for configuration. Later, a backend allowing
the use in a KVM or PowerVM guest will also be provided.

This disables some fast path for interrupts in KVM when XIVE is
enabled as these rely on the firmware emulation code which is no
longer available when the XIVE is used natively by Linux.

A latter patch will make KVM also directly exploit the XIVE, thus
recovering the lost performance (and more).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Fixup pr_xxx("XIVE:"...), don't split pr_xxx() strings,
 tweak Kconfig so XIVE_NATIVE selects XIVE and depends on POWERNV,
 fix build errors when SMP=n, fold in fixes from Ben:
   Don't call cpu_online() on an invalid CPU number
   Fix irq target selection returning out of bounds cpu#
   Extra sanity checks on cpu numbers
 ]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-10 21:41:34 +10:00
Benjamin Herrenschmidt eeea1a434d powerpc/powernv: Add XIVE related definitions to opal-api.h
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-06 19:58:46 +10:00
Matt Brown 11fe909d23 powerpc/powernv: Add OPAL exports attributes to sysfs
New versions of OPAL have a device node /ibm,opal/firmware/exports, each
property of which describes a range of memory in OPAL that Linux might
want to export to userspace for debugging.

This patch adds a sysfs file under 'opal/exports' for each property
found there, and makes it read-only by root.

Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com>
[mpe: Drop counting of props, rename to attr, free on sysfs error, c'log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-04 14:11:22 +10:00
Alistair Popple 1ab66d1fba powerpc/powernv: Introduce address translation services for Nvlink2
Nvlink2 supports address translation services (ATS) allowing devices
to request address translations from an mmu known as the nest MMU
which is setup to walk the CPU page tables.

To access this functionality certain firmware calls are required to
setup and manage hardware context tables in the nvlink processing unit
(NPU). The NPU also manages forwarding of TLB invalidates (known as
address translation shootdowns/ATSDs) to attached devices.

This patch exports several methods to allow device drivers to register
a process id (PASID/PID) in the hardware tables and to receive
notification of when a device should stop issuing address translation
requests (ATRs). It also adds a fault handler to allow device drivers
to demand fault pages in.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
[mpe: Fix up comment formatting, use flush_tlb_mm()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-04 13:27:26 +10:00
Alistair Popple 4c3b89effc powerpc/powernv: Add sanity checks to pnv_pci_get_{gpu|npu}_dev
The pnv_pci_get_{gpu|npu}_dev functions are used to find associations
between nvlink PCIe devices and standard PCIe devices. However they
lacked basic sanity checking which results in NULL pointer
dereferencing if they are incorrect called can be harder to spot than
an explicit WARN_ON.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-03 23:36:00 +10:00
Michael Ellerman 63f44d6514 powerpc/book3s: Print task info if we take a machine check in user mode
For an MCE (Machine Check Exception) that hits while in user mode
MSR(PR=1), print the task info to the console MCE error log. This may
help to identify an application that triggered the MCE.

After this patch the MCE console looks like:

  Severe Machine check interrupt [Recovered]
    NIP: [0000000010039778] PID: 762 Comm: ebizzy
    Initiator: CPU
    Error type: SLB [Multihit]
      Effective address: 0000000010039778

  Severe Machine check interrupt [Not recovered]
    NIP: [0000000010039778] PID: 763 Comm: ebizzy
    Initiator: CPU
    Error type: UE [Page table walk ifetch]
      Effective address: 0000000010039778
  ebizzy[763]: unhandled signal 7 at 0000000010039778 nip 0000000010039778 lr 0000000010001b44 code 30004

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-03 16:12:00 +10:00
Aneesh Kumar K.V 3a4c26011d powerpc/mm: Add translation mode information in /proc/cpuinfo
With this we have on powernv and pseries /proc/cpuinfo reporting

timebase        : 512000000
platform        : PowerNV
model           : 8247-22L
machine         : PowerNV 8247-22L
firmware        : OPAL
MMU		: Hash

Reviewed-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31 23:09:50 +11:00
Vipin K Parashar 17bb69515c powerpc/powernv: Handle OPAL_WRONG_STATE in opal_get_sensor_data()
OPAL returns OPAL_WRONG_STATE upon failing to provide sensor data due to
core sleeping/offline. Add a check in opal_get_sensor_data() for sensor
read failure with OPAL_WRONG_STATE return code and return -EIO.

Signed-off-by: Vipin K Parashar <vipin@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-31 15:22:56 +11:00
Alexey Kardashevskiy e5afdf9dd5 powerpc/vfio_spapr_tce: Add reference counting to iommu_table
So far iommu_table obejcts were only used in virtual mode and had
a single owner. We are going to change this by implementing in-kernel
acceleration of DMA mapping requests. The proposed acceleration
will handle requests in real mode and KVM will keep references to tables.

This adds a kref to iommu_table and defines new helpers to update it.
This replaces iommu_free_table() with iommu_tce_table_put() and makes
iommu_free_table() static. iommu_tce_table_get() is not used in this patch
but it will be in the following patch.

Since this touches prototypes, this also removes @node_name parameter as
it has never been really useful on powernv and carrying it for
the pseries platform code to iommu_free_table() seems to be quite
useless as well.

This should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-30 21:42:11 +11:00
Alexey Kardashevskiy 11edf116e3 powerpc/iommu/vfio_spapr_tce: Cleanup iommu_table disposal
At the moment iommu_table can be disposed by either calling
iommu_table_free() directly or it_ops::free(); the only implementation
of free() is in IODA2 - pnv_ioda2_table_free() - and it calls
iommu_table_free() anyway.

As we are going to have reference counting on tables, we need an unified
way of disposing tables.

This moves it_ops::free() call into iommu_free_table() and makes use
of the latter. The free() callback now handles only platform-specific
data.

As from now on the iommu_free_table() calls it_ops->free(), we need
to have it_ops initialized before calling iommu_free_table() so this
moves this initialization in pnv_pci_ioda2_create_table().

This should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-30 21:42:11 +11:00
Alexey Kardashevskiy a540aa56ba powerpc/powernv/iommu: Add real mode version of iommu_table_ops::exchange()
In real mode, TCE tables are invalidated using special
cache-inhibited store instructions which are not available in
virtual mode

This defines and implements exchange_rm() callback. This does not
define set_rm/clear_rm/flush_rm callbacks as there is no user for those -
exchange/exchange_rm are only to be used by KVM for VFIO.

The exchange_rm callback is defined for IODA1/IODA2 powernv platforms.

This replaces list_for_each_entry_rcu with its lockless version as
from now on pnv_pci_ioda2_tce_invalidate() can be called in
the real mode too.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-30 21:42:01 +11:00
Michael Neuling 517c27570c powerpc/powernv: Fix XSCOM address mangling for form 1 indirect
POWER9 adds form 1 scoms. The form of the indirection is specified in
the top nibble of the scom address.

Currently we do some (ugly) bit mangling so that we can fit a 64 bit
scom address into the debugfs interface. The current code only shifts
the top bit (indirect bit).

This patch changes it to shift the whole top nibble so that the form
of the indirection is also shifted.

This patch is backwards compatible with older scoms.

(This change isn't required in the arch/powerpc/platforms/powernv/opal-prd.c
scom interface as it passes the whole 64bit scom address without any bit
mangling)

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-28 10:52:03 +11:00
Oliver O'Halloran c3a08e93d6 powerpc/powernv: de-deuplicate OPAL call wrappers
Currently the code to perform an OPAL call is duplicated between the
normal path and path taken when tracepoints are enabled. There's no
real need for this and combining them makes opal_tracepoint_entry
considerably easier to understand.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-28 10:52:02 +11:00
Alexey Kardashevskiy 20f13b95ee powerpc/powernv/npu: Remove dead iommu code
PNV_IODA_PE_DEV is only used for NPU devices (emulated PCI bridges
representing NVLink). These are added to IOMMU groups with corresponding
NVIDIA devices after all non-NPU PEs are setup; a special helper -
pnv_pci_ioda_setup_iommu_api() - handles this in pnv_pci_ioda_fixup().

The pnv_pci_ioda2_setup_dma_pe() helper sets up DMA for a PE. It is called
for VFs (so it does not handle NPU case) and PCI bridges but only
IODA1 and IODA2 types. An NPU bridge has its own type id (PNV_PHB_NPU)
so pnv_pci_ioda2_setup_dma_pe() cannot be called on NPU and therefore
(pe->flags & PNV_IODA_PE_DEV) is always "false".

This removes not used iommu_add_device(). This should not cause any
behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-20 19:02:49 +11:00
Alexey Kardashevskiy 81d5fe1a3b powerpc/powernv: Fix it_ops::get() callback to return in cpu endian
The iommu_table_ops callbacks are declared CPU endian as they take and
return "unsigned long"; underlying hardware tables are big-endian.

However get() was missing be64_to_cpu(), this adds the missing conversion.

The only caller of this is crash dump at arch/powerpc/kernel/iommu.c,
iommu_table_clear() which only compares TCE to zero so this change
should not cause behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-20 19:02:49 +11:00
Nicholas Piggin 1363875bdb powerpc/64s: fix handling of non-synchronous machine checks
A synchronous machine check is an exception raised by the attempt to
execute the current instruction. If the error can't be corrected, it
can make sense to SIGBUS the currently running process.

In other cases, the error condition is not related to the current
instruction, so killing the current process is not the right thing to
do.

Today, all machine checks are MCE_SEV_ERROR_SYNC, so this has no
practical change. It will be used to handle POWER9 asynchronous
machine checks.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-10 16:32:06 +11:00
Alexey Kardashevskiy db08e1d530 powerpc/powernv/ioda2: Update iommu table base on ownership change
On POWERNV platform, in order to do DMA via IOMMU (i.e. 32bit DMA in
our case), a device needs an iommu_table pointer set via
set_iommu_table_base().

The codeflow is:
- pnv_pci_ioda2_setup_dma_pe()
	- pnv_pci_ioda2_setup_default_config()
	- pnv_ioda_setup_bus_dma() [1]

pnv_pci_ioda2_setup_dma_pe() creates IOMMU groups,
pnv_pci_ioda2_setup_default_config() does default DMA setup,
pnv_ioda_setup_bus_dma() takes a bus PE (on IODA2, all physical function
PEs as bus PEs except NPU), walks through all underlying buses and
devices, adds all devices to an IOMMU group and sets iommu_table.

On IODA2, when VFIO is used, it takes ownership over a PE which means it
removes all tables and creates new ones (with a possibility of sharing
them among PEs). So when the ownership is returned from VFIO to
the kernel, the iommu_table pointer written to a device at [1] is
stale and needs an update.

This adds an "add_to_group" parameter to pnv_ioda_setup_bus_dma()
(in fact re-adds as it used to be there a while ago for different
reasons) to tell the helper if a device needs to be added to
an IOMMU group with an iommu_table update or just the latter.

This calls pnv_ioda_setup_bus_dma(..., false) from
pnv_ioda2_release_ownership() so when the ownership is restored,
32bit DMA can work again for a device. This does the same thing
on obtaining ownership as the iommu_table point is stale at this point
anyway and it is safer to have NULL there.

We did not hit this earlier as all tested devices in recent years were
only using 64bit DMA; the rare exception for this is MPT3 SAS adapter
which uses both 32bit and 64bit DMA access and it has not been tested
with VFIO much.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-09 20:21:18 +11:00
Alexey Kardashevskiy 7aafac11e3 powerpc/powernv/ioda2: Gracefully fail if too many TCE levels requested
The IODA2 specification says that a 64 DMA address cannot use top 4 bits
(3 are reserved and one is a "TVE select"); bottom page_shift bits
cannot be used for multilevel table addressing either.

The existing IODA2 table allocation code aligns the minimum TCE table
size to PAGE_SIZE so in the case of 64K system pages and 4K IOMMU pages,
we have 64-4-12=48 bits. Since 64K page stores 8192 TCEs, i.e. needs
13 bits, the maximum number of levels is 48/13 = 3 so we physically
cannot address more and EEH happens on DMA accesses.

This adds a check that too many levels were requested.

It is still possible to have 5 levels in the case of 4K system page size.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-09 19:07:12 +11:00
Linus Torvalds f7d6a7283a powerpc fixes for 4.11 #3
Five fairly small fixes for things that went in this cycle.
 
 A fairly large patch to rework the CAS logic on Power9, necessitated by a late
 change to the firmware API, and we can't boot without it.
 
 Three fixes going to stable, allowing more instructions to be emulated on LE,
 fixing a boot crash on 32-bit Freescale BookE machines, and the OPAL XICS
 workaround.
 
 And a patch from me to sort the selects under CONFIG PPC. Annoying churn, but
 worth it in the long run, and best for it to go in now to avoid conflicts.
 
 Thanks to:
   Alexey Kardashevskiy, Anton Blanchard, Balbir Singh, Gautham R. Shenoy,
   Laurentiu Tudor, Nicholas Piggin, Paul Mackerras, Ravi Bangoria, Sachin Sant,
   Shile Zhang, Suraj Jitindar Singh.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYvqSxAAoJEFHr6jzI4aWAjMQP/06OFGz3VQvO5Q8jPsqRF22y
 Wr+04OKFmKnYVObdQk15HGOagp1fSkWWHfP/eu50kx1WNCzq7tQdLjNSi7H4F3s1
 4NwlaOfSQoxctsVtfnITJkfVScjcxK7XVagswtb3wvBpBx4lwD8fGwxkSxj6NhRw
 PNxLi44wobb8mDyR6L/6tJKBI2Jt12qXZY+kBQIleun5+lF8fNXIu4qPiglMOia6
 oPhXlp4RASt8wz74H8JuMTwGv17MxG+zvbkDPwQC7PI/fohJLybgWEfByN4H5UMy
 7Xi/lWHlShAyc7ulAIN+A1mHKY9LSv45U6qrrHFUJgRftZihoZHe6ekcI+h5oFVX
 chP9oUrQNeeZ5QqUC4rYdWwsMfiXBI0y5+BCupItixXc1LANBH9Ym9IECbgPRP93
 LQVqiS4958KijHlYBOA2zPicl/FnVO16orqakyRS0B3lQ54XBvhcgG8gIXjQr8PM
 Mt2W4r6RtGJ4ddhUPpF/W4lEuR4+dmXfEqs7DkgBKRbvi8XYkiLx2byBNh/OMRUG
 T4ILXsYf50AKRAq/jFTs9A0zkjtmtBeDdn96Mcan8i3WZuTQ7b8mQlC46zEg23A8
 XmTG2xt7N1dMjjwS78CfnvQ8sIVtA9AUfK37aTc0ICMsBCqEcWLAhHKZyCw0h25C
 wq9BMn4e5Gdg2xLTHKlL
 =SxON
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Five fairly small fixes for things that went in this cycle.

  A fairly large patch to rework the CAS logic on Power9, necessitated
  by a late change to the firmware API, and we can't boot without it.

  Three fixes going to stable, allowing more instructions to be emulated
  on LE, fixing a boot crash on 32-bit Freescale BookE machines, and the
  OPAL XICS workaround.

  And a patch from me to sort the selects under CONFIG PPC. Annoying
  churn, but worth it in the long run, and best for it to go in now to
  avoid conflicts.

  Thanks to:
    Alexey Kardashevskiy, Anton Blanchard, Balbir Singh, Gautham R.
    Shenoy, Laurentiu Tudor, Nicholas Piggin, Paul Mackerras, Ravi
    Bangoria, Sachin Sant, Shile Zhang, Suraj Jitindar Singh"

* tag 'powerpc-4.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: Sort the selects under CONFIG_PPC
  powerpc/64: Fix L1D cache shape vector reporting L1I values
  powerpc/64: Avoid panic during boot due to divide by zero in init_cache_info()
  powerpc: Update to new option-vector-5 format for CAS
  powerpc: Parse the command line before calling CAS
  powerpc/xics: Work around limitations of OPAL XICS priority handling
  powerpc/64: Fix checksum folding in csum_add()
  powerpc/powernv: Fix opal tracepoints with JUMP_LABEL=n
  powerpc/booke: Fix boot crash due to null hugepd
  powerpc: Fix compiling a BE kernel with a powerpc64le toolchain
  selftest/powerpc: Fix false failures for skipped tests
  powerpc/powernv: Fix bug due to labeling ambiguity in power_enter_stop
  powerpc/64: Invalidate process table caching after setting process table
  powerpc: emulate_step() tests for load/store instructions
  powerpc: Emulation support for load/store instructions on LE
2017-03-07 10:46:10 -08:00
Alexey Kardashevskiy 2a9c4f40ab powerpc/powernv: Fix opal tracepoints with JUMP_LABEL=n
The recent commit to allow calling OPAL calls in real mode, commit
ab9bad0ead ("powerpc/powernv: Remove separate entry for OPAL real mode
calls"), introduced a bug when CONFIG_JUMP_LABEL=n.

The commit moved the "mfmsr r12" prior to the call to OPAL_BRANCH, but
we missed that OPAL_BRANCH clobbers r12 when jump labels are disabled.
This leads to us using the tracepoint refcount as the MSR value,
typically zero, and saving that into PACASAVEDMSR. When we return from
OPAL we use that value as the MSR value for rfid, meaning we switch to
32-bit BE real mode - hilarity ensues.

Fix it by using r11 in OPAL_BRANCH, which is not live at the time the
macro is used in OPAL_CALL.

Fixes: ab9bad0ead ("powerpc/powernv: Remove separate entry for OPAL real mode calls")
Suggested-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-04 21:48:49 +11:00
Ingo Molnar ef8bd77f33 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/hotplug.h>
We are going to split <linux/sched/hotplug.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/hotplug.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:35 +01:00
Linus Torvalds b286cedd47 powerpc updates for 4.11 part 2
Highlights include:
 
  - An update of the disassembly code used by xmon to the latest versions in
    binutils. We've received permission from all the authors of the relevant
    binutils changes to relicense their changes to the relevant files from GPLv3
    to GPLv2, for inclusion in Linux. Thanks to Peter Bergner for doing the leg
    work to get permission from everyone.
 
  - Addition of the "architected" Power9 CPU table entry, allowing us to boot
    in Power9 architected mode under a hypervisor.
 
  - Updates to the Power9 PMU code.
 
  - Implementation of clear_bit_unlock_is_negative_byte() to optimise
    unlock_page().
 
  - Freescale updates from Scott: "Highlights include 8xx breakpoints and perf,
    t1042rdb display support, and board updates."
 
 Thanks to:
   Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Balbir Singh, Douglas Miller,
   Frédéric Weisbecker, Gavin Shan, Madhavan Srinivasan, Michael Roth, Nathan
   Fontenot, Naveen N. Rao, Nicholas Piggin, Peter Bergner, Paul E. McKenney,
   Rashmica Gupta, Russell Currey, Sahil Mehta, Stewart Smith.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYthsKAAoJEFHr6jzI4aWAaWMQAJ7mAwX98ncoYschPgRmmIun
 f6DtE4IonrxiZ22gp1ct4+c9OFtA+B5FXMcEhOKpfh93lg38PTDjHs9e5kfauD7+
 oTQ2Bg1eXaL48FKdmC5Vs4Kt+/J8e9guGafUC1OVIpTyyRPoZeUDH0lx+kSPV5bd
 PkL+wY/k3W0Njo8WgD1P9u3W15+BxISo/k8c7ajzKTHGBZlAvj5h2gO6XUBNMLyy
 YClB/qIymjZriSB+AeWYD79k8gPbBZPsmZG0ZF1hY060894LgqLB9mPOJdffx/DY
 H7/uP6jcsRDOXTOmyueW1SEmPoQbtysiMd1lNrCXKtC/Okr5uhn2cUhi88AsgWvd
 1QFly2lobcDAKPah/yB7YQGMAcmYvGGNuqrWaosaV2T7r0KprzUYYgCOqzvC3WSJ
 QtVatBzMIqRTMYq+3U4G1aHeCXlRazVQHDuvPby8RdR5b2gIexiqMab2eS7tSMIH
 mCOIunRIvT14g/7wxUV7tahN+ifncNxzAk4DvPO+Wc4FQ4sy7wArv2YipSaWRWtE
 u7tNdBkEwlDkKhJgRU5T0Op2PyMbHwCP8pWuz7PQIhKIcgwmP9wb07BIWG/GGIqn
 07TxJYX2ItabyEMZMsYhzILZqjLyiAaCARANB7ScbQbdP8wdcGZcwismhwnfROIU
 NuxsZg63BUDMoxk7Sauu
 =rspd
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull more powerpc updates from Michael Ellerman:
 "Highlights include:

   - an update of the disassembly code used by xmon to the latest
     versions in binutils. We've received permission from all the
     authors of the relevant binutils changes to relicense their changes
     to the relevant files from GPLv3 to GPLv2, for inclusion in Linux.
     Thanks to Peter Bergner for doing the leg work to get permission
     from everyone.

   - addition of the "architected" Power9 CPU table entry, allowing us
     to boot in Power9 architected mode under a hypervisor.

   - updates to the Power9 PMU code.

   - implementation of clear_bit_unlock_is_negative_byte() to optimise
     unlock_page().

   - Freescale updates from Scott: "Highlights include 8xx breakpoints
     and perf, t1042rdb display support, and board updates."

  Thanks to:
    Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Balbir Singh, Douglas
    Miller, Frédéric Weisbecker, Gavin Shan, Madhavan Srinivasan,
    Michael Roth, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Peter
    Bergner, Paul E. McKenney, Rashmica Gupta, Russell Currey, Sahil
    Mehta, Stewart Smith"

* tag 'powerpc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (48 commits)
  powerpc: Remove leftover cputime_to_nsecs call causing build error
  powerpc/mm/hash: Always clear UPRT and Host Radix bits when setting up CPU
  powerpc/optprobes: Fix TOC handling in optprobes trampoline
  powerpc/pseries: Advertise Hot Plug Event support to firmware
  cxl: fix nested locking hang during EEH hotplug
  powerpc/xmon: Dump memory in CPU endian format
  powerpc/pseries: Revert 'Auto-online hotplugged memory'
  powerpc/powernv: Make PCI non-optional
  powerpc/64: Implement clear_bit_unlock_is_negative_byte()
  powerpc/powernv: Remove unused variable in pnv_pci_sriov_disable()
  powerpc/kernel: Remove error message in pcibios_setup_phb_resources()
  powerpc/mm: Fix typo in set_pte_at()
  pci/hotplug/pnv-php: Disable MSI and PCI device properly
  pci/hotplug/pnv-php: Disable surprise hotplug capability on conflicts
  pci/hotplug/pnv-php: Remove WARN_ON() in pnv_php_put_slot()
  powerpc: Add POWER9 architected mode to cputable
  powerpc/perf: use is_kernel_addr macro in perf_get_misc_flags()
  powerpc/perf: Avoid FAB_*_MATCH checks for power9
  powerpc/perf: Add restrictions to PMC5 in power9 DD1
  powerpc/perf: Use Instruction Counter value
  ...
2017-03-01 10:10:16 -08:00
Masahiro Yamada 03671057c3 scripts/spelling.txt: add "overrided" pattern and fix typo instances
Fix typos and add the following to the scripts/spelling.txt:

  overrided||overridden

Link: http://lkml.kernel.org/r/1481573103-11329-22-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:47 -08:00
Linus Torvalds ac1820fb28 This is a tree wide change and has been kept separate for that reason.
Bart Van Assche noted that the ib DMA mapping code was significantly
 similar enough to the core DMA mapping code that with a few changes
 it was possible to remove the IB DMA mapping code entirely and
 switch the RDMA stack to use the core DMA mapping code.  This resulted
 in a nice set of cleanups, but touched the entire tree.  This branch
 will be submitted separately to Linus at the end of the merge window
 as per normal practice for tree wide changes like this.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYo06oAAoJELgmozMOVy/d9Z8QALedWHdu98St1L0u2c8sxnR9
 2zo/4sF5Vb9u7FpmdIX32L4SQ9s9KhPE8Qp8NtZLf9v10zlDebIRJDpXknXtKooV
 CAXxX4sxBXV27/UrhbZEfXiPrmm6ccJFyIfRnMU6NlMqh2AtAsRa5AC2/RMp8oUD
 Med97PFiF0o6TD22/UH1VFbRpX1zjaKyqm7a3as5sJfzNA+UGIZAQ7Euz8000DKZ
 xCgVLTEwS0FmOujtBkCst7xa9TjuqR1HLOB4DdGvAhP6BHdz2yamM7Qmh9NN+NEX
 0BtjsuXomtn6j6AszGC+bpipCZh3NUigcwoFAARXCYFHibBvo4DPdFeGsraFgXdy
 1+KyR8CCeQG3Aly5Vwr264RFPGkGpwMj8PsBlXgQVtrlg4rriaCzOJNmIIbfdADw
 ftqhxBOzReZw77aH2s+9p2ILRfcAmPqhynLvFGFo9LBvsik8LVso7YgZN0xGxwcI
 IjI/XGC8UskPVsIZBIYA6sl2bYzgOjtBIHiXjRrPlW3uhduIXLrvKFfLPP/5XLAG
 ehLXK+J0bfsyY9ClmlNS8oH/WdLhXAyy/KNmnj5bRRm9qg6BRJR3bsOBhZJODuoC
 XgEXFfF6/7roNESWxowff7pK0rTkRg/m/Pa4VQpeO+6NWHE7kgZhL6kyIp5nKcwS
 3e7mgpcwC+3XfA/6vU3F
 =e0Si
 -----END PGP SIGNATURE-----

Merge tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma DMA mapping updates from Doug Ledford:
 "Drop IB DMA mapping code and use core DMA code instead.

  Bart Van Assche noted that the ib DMA mapping code was significantly
  similar enough to the core DMA mapping code that with a few changes it
  was possible to remove the IB DMA mapping code entirely and switch the
  RDMA stack to use the core DMA mapping code.

  This resulted in a nice set of cleanups, but touched the entire tree
  and has been kept separate for that reason."

* tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits)
  IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it
  IB/core: Remove ib_device.dma_device
  nvme-rdma: Switch from dma_device to dev.parent
  RDS: net: Switch from dma_device to dev.parent
  IB/srpt: Modify a debug statement
  IB/srp: Switch from dma_device to dev.parent
  IB/iser: Switch from dma_device to dev.parent
  IB/IPoIB: Switch from dma_device to dev.parent
  IB/rxe: Switch from dma_device to dev.parent
  IB/vmw_pvrdma: Switch from dma_device to dev.parent
  IB/usnic: Switch from dma_device to dev.parent
  IB/qib: Switch from dma_device to dev.parent
  IB/qedr: Switch from dma_device to dev.parent
  IB/ocrdma: Switch from dma_device to dev.parent
  IB/nes: Remove a superfluous assignment statement
  IB/mthca: Switch from dma_device to dev.parent
  IB/mlx5: Switch from dma_device to dev.parent
  IB/mlx4: Switch from dma_device to dev.parent
  IB/i40iw: Remove a superfluous assignment statement
  IB/hns: Switch from dma_device to dev.parent
  ...
2017-02-25 13:45:43 -08:00
Linus Torvalds 38705613b7 powerpc updates for 4.11 part 1.
Highlights include:
 
  - Support for direct mapped LPC on POWER9, giving Linux direct access to
    devices that may be on there such as a UART.
 
  - Memory hotplug support for the Power9 Radix MMU.
 
  - Add new AUX vectors describing the processor's cache geometry, to be used by
    glibc.
 
  - The ability for a guest to ask the hypervisor to resize the guest's hash
    table, and in addition support for doing so automatically when memory is
    hotplugged into/out-of the guest. This allows the hash table to be sized
    based on the current memory usage of the guest, rather than the maximum
    possible memory usage.
 
  - Implementation of optprobes (kprobe optimisation) for powerpc.
 
 In addition there's the topic branch shared with the KVM tree, which includes
 support for guests to use the Radix MMU on Power9.
 
 Thanks to:
   Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T, Anton Blanchard,
   Benjamin Herrenschmidt, Chris Packham, Daniel Axtens, Daniel Borkmann, David
   Gibson, Finn Thain, Gautham R. Shenoy, Gavin Shan, Greg Kurz, Joel Stanley,
   John Allen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael
   Neuling, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Ravi
   Bangoria, Reza Arbab, Shailendra Singh, Vaibhav Jain, Wei Yongjun.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYrWj0AAoJEFHr6jzI4aWAsn4P/08Kz3TtOvDuuPGVNoO7fWOn
 ag5/zVt8R4FuCALqWpAZbVqMuUU4wLxG0RuWlmBNYYhrjMC6JxHpOSjQXxM2D7YT
 CdGTJxG414r6HMOeToL9i/z33o0m+KT07tscer+QMKlXVKCR2z0fEJch+zoPBHYA
 5eVBFpLLTtbNiX9UnrcM/vYz61d56kT4YJey9/8qbAkTAc1rMPa8ucU5UiKYJ7yX
 mF8cd7WE+7aqif9V8yN59G2rcbz+h3pbMw/gzImiYsYrUj4fLjU+VTKL5PPT+UFy
 WWBXD3MAMm1dksMMZi3hgoo2BZhDn3RkymeYi6Jo4kDknNMPZzkMxGyvaJ8Eq5H/
 bIYXdS1AbTtvaaEEuWqDFjpnChOEvj/8IeqitU0jjlql8BVjNKg/ESaaKucZr+pO
 pk2Mfvw0Tb/lxJT5qj27yq4aRsxJwdFOoPYCN7MquPp/wV2Tg5M6h4nVQ4T6Wl0S
 tMFQeCqXflhDWh0Xgr2UXpF66/YTj3Du5LasOTkgGeU30Z8TcNGFEmDWShKP3cEm
 e0oQE+OHhIGN4WSBXBAto/Gw8/0v3pXlMs+VEfeHqenwPss2sWtSwXWe8khmiy9e
 DJ48sTzj75/Zx1fiqRldw9YEnrL+NK0eOOpvzxeyKpfvdUytc+chFfEqUmO6kl3Z
 DW2UmlZxmW+b0SfexCHL
 =Icle
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights include:

   - Support for direct mapped LPC on POWER9, giving Linux direct access
     to devices that may be on there such as a UART.

   - Memory hotplug support for the Power9 Radix MMU.

   - Add new AUX vectors describing the processor's cache geometry, to
     be used by glibc.

   - The ability for a guest to ask the hypervisor to resize the guest's
     hash table, and in addition support for doing so automatically when
     memory is hotplugged into/out-of the guest. This allows the hash
     table to be sized based on the current memory usage of the guest,
     rather than the maximum possible memory usage.

   - Implementation of optprobes (kprobe optimisation) for powerpc.

  In addition there's the topic branch shared with the KVM tree, which
  includes support for guests to use the Radix MMU on Power9.

  Thanks to:
    Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T, Anton
    Blanchard, Benjamin Herrenschmidt, Chris Packham, Daniel Axtens,
    Daniel Borkmann, David Gibson, Finn Thain, Gautham R. Shenoy, Gavin
    Shan, Greg Kurz, Joel Stanley, John Allen, Madhavan Srinivasan,
    Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Nathan Fontenot,
    Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Ravi Bangoria, Reza
    Arbab, Shailendra Singh, Vaibhav Jain, Wei Yongjun"

* tag 'powerpc-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (129 commits)
  powerpc/mm/radix: Skip ptesync in pte update helpers
  powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm
  powerpc/mm/radix: Update pte update sequence for pte clear case
  powerpc/mm: Update PROTFAULT handling in the page fault path
  powerpc/xmon: Fix data-breakpoint
  powerpc/mm: Fix build break with BOOK3S_64=n and MEMORY_HOTPLUG=y
  powerpc/mm: Fix build break when CMA=n && SPAPR_TCE_IOMMU=y
  powerpc/mm: Fix build break with RADIX=y & HUGETLBFS=n
  powerpc/pseries: Fix typo in parameter description
  powerpc/kprobes: Remove kprobe_exceptions_notify()
  kprobes: Introduce weak variant of kprobe_exceptions_notify()
  powerpc/ftrace: Fix confusing help text for DISABLE_MPROFILE_KERNEL
  powerpc/powernv: Fix opal_exit tracepoint opcode
  powerpc: Add a prototype for mcount() so it can be versioned
  powerpc: Drop GPL from of_node_to_nid() export to match other arches
  powerpc/kprobes: Optimize kprobe in kretprobe_trampoline()
  powerpc/kprobes: Implement Optprobes
  powerpc/kprobes: Fixes for kprobe_lookup_name() on BE
  powerpc: Add helper to check if offset is within relative branch range
  powerpc/bpf: Introduce __PPC_SH64()
  ...
2017-02-22 10:30:38 -08:00
Michael Ellerman a311e738b6 powerpc/powernv: Make PCI non-optional
Bare metal systems without PCI don't exist, so there's no real point in
making PCI optional, it just breaks the build from time to time. In fact
the build is broken now if you turn off PCI_MSI but enable KVM.

Using select for PCI is OK because we (powerpc) define config PCI, and it
has no dependencies. Selecting PCI_MSI is slightly fishy, because it's
in drivers/pci and it is user-visible, but its only dependency is PCI,
so selecting it can't actually lead to breakage.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-20 14:56:26 +11:00
Gavin Shan 02983449c8 powerpc/powernv: Remove unused variable in pnv_pci_sriov_disable()
The local variable @iov isn't used, to remove it.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-17 22:18:43 +11:00
Michael Ellerman da0e7e6276 Merge branch 'topic/ppc-kvm' into next
Merge the topic branch we're sharing with the kvm-ppc tree.
2017-02-14 17:18:29 +11:00
Michael Ellerman a7e0fb6c20 powerpc/powernv: Fix opal_exit tracepoint opcode
Currently the opal_exit tracepoint usually shows the opcode as 0:

  <idle>-0     [047] d.h.   635.654292: opal_entry: opcode=63
  <idle>-0     [047] d.h.   635.654296: opal_exit: opcode=0 retval=0
  kopald-1209  [019] d...   636.420943: opal_entry: opcode=10
  kopald-1209  [019] d...   636.420959: opal_exit: opcode=0 retval=0

This is because we incorrectly load the opcode into r0 before calling
__trace_opal_exit(), whereas it expects the opcode in r3 (first function
parameter). In fact we are leaving the retval in r3, so opcode and
retval will always show the same value.

Instead load the opcode into r3, resulting in:

  <idle>-0     [040] d.h.   636.618625: opal_entry: opcode=63
  <idle>-0     [040] d.h.   636.618627: opal_exit: opcode=63 retval=0

Fixes: c49f63530b ("powernv: Add OPAL tracepoints")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-10 13:40:22 +11:00
Benjamin Herrenschmidt 9b25671497 powerpc/powernv: Fix CPU hotplug to handle waking on HVI
The IPIs come in as HVI not EE, so we need to test the appropriate
SRR1 bits. The encoding is such that it won't have false positives
on P7 and P8 so we can just test it like that. We also need to handle
the icp-opal variant of the flush.

Fixes: d74361881f ("powerpc/xics: Add ICP OPAL backend")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-09 15:19:57 +11:00
Benjamin Herrenschmidt 86f7ce4b47 powerpc/opal-lpc: Remove unneeded include
We don't need asm/xics.h

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-09 10:31:37 +11:00
Benjamin Herrenschmidt 2717a33d60 powerpc/opal-irqchip: Use interrupt names if present
Recent versions of OPAL can provide names for the various OPAL interrupts,
so let's use them. This also modernises the code that fetches the
interrupt array to use the helpers provided by the generic code instead
of hand-parsing the property.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Free irqs on error, check allocation of names, consolidate error
      handling, whitespace.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-09 10:31:37 +11:00
Mahesh Salgaonkar 470a36a8c0 powerpc/powernv: Display the correct error info for CAPP errors.
On some CAPP errors we see console messages that prints unknown HMIs for
which CAPI recovery is in progress. This patch fixes this by printing
correct error info for HMI generated due to CAPP recovery.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-09 10:31:36 +11:00
Benjamin Herrenschmidt ab9bad0ead powerpc/powernv: Remove separate entry for OPAL real mode calls
All entry points already read the MSR so they can easily do
the right thing.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-07 16:40:18 +11:00
Michael Ellerman 0b1c764339 powerpc/powernv: Fix section mismatch from opal_lpc_init()
opal_lpc_init() is called from an __init routine, and calls other __init
routines, so should also be __init, init?

Fixes: 023b13a501 ("powerpc/powernv: Add support for direct mapped LPC on POWER9")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-02 16:35:10 +11:00
Benjamin Herrenschmidt 023b13a501 powerpc/powernv: Add support for direct mapped LPC on POWER9
Use the new non-PCI ISA bridge support to expose the POWER9
LPC bus as direct mapped via the ISA IO port range. This
enables direct access via drivers such as 8250

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 13:54:18 +11:00
Benjamin Herrenschmidt 38e9d36bc1 powerpc: Move isa bridge definitions to separate include
We'll be adding non-PCI isa bridge support so let's not
have all the definition in pci-bridge.h

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 13:54:17 +11:00
Gautham R. Shenoy 09206b600c powernv: Pass PSSCR value and mask to power9_idle_stop
The power9_idle_stop method currently takes only the requested stop
level as a parameter and picks up the rest of the PSSCR bits from a
hand-coded macro. This is not a very flexible design, especially when
the firmware has the capability to communicate the psscr value and the
mask associated with a particular stop state via device tree.

This patch modifies the power9_idle_stop API to take as parameters the
PSSCR value and the PSSCR mask corresponding to the stop state that
needs to be set. These PSSCR value and mask are respectively obtained
by parsing the "ibm,cpu-idle-state-psscr" and
"ibm,cpu-idle-state-psscr-mask" fields from the device tree.

In addition to this, the patch adds support for handling stop states
for which ESL and EC bits in the PSSCR are zero. As per the
architecture, a wakeup from these stop states resumes execution from
the subsequent instruction as opposed to waking up at the System
Vector.

The older firmware sets only the Requested Level (RL) field in the
psscr and psscr-mask exposed in the device tree. For older firmware
where psscr-mask=0xf, this patch will set the default sane values that
the set for for remaining PSSCR fields (i.e PSLL, MTL, ESL, EC, and
TR). For the new firmware, the patch will validate that the invariants
required by the ISA for the psscr values are maintained by the
firmware.

This skiboot patch that exports fully populated PSSCR values and the
mask for all the stop states can be found here:
https://lists.ozlabs.org/pipermail/skiboot/2016-September/004869.html

[Optimize the number of instructions before entering STOP with
ESL=EC=0, validate the PSSCR values provided by the firimware
maintains the invariants required as per the ISA suggested by Balbir
Singh]

Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 08:32:13 +11:00
Gautham R. Shenoy dd34c74c97 powernv:stop: Rename pnv_arch300_idle_init to pnv_power9_idle_init
Balbir pointed out that the name of the function pnv_arch300_idle_init
was inconsistent with the names of the variables and functions
pertaining to POWER9 features in book3s_idle.S.

This patch renames pnv_arch300_idle_init to pnv_power9_idle_init.

This patch does not change any behaviour.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 08:32:12 +11:00
Alistair Popple 616badd2fb powerpc/powernv: Use OPAL call for TCE kill on NVLink2
Add detection of NPU2 PHBs. NPU2/NVLink2 has a different register
layout for the TCE kill register therefore TCE invalidation should be
done via the OPAL call rather than using the register directly as it
is for PHB3 and NVLink1. This changes TCE invalidation to use the OPAL
call in the case of a NPU2 PHB model.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-30 20:34:53 +11:00
Alistair Popple 1d0761d255 powerpc/powernv: Initialise nest mmu
POWER9 contains an off core mmu called the nest mmu (NMMU). This is
used by other hardware units on the chip to translate virtual
addresses into real addresses. The unit attempting an address
translation provides the majority of the context required for the
translation request except for the base address of the partition table
(ie. the PTCR) which needs to be programmed into the NMMU.

This patch adds a call to OPAL to set the PTCR for the nest mmu in
opal_init().

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-30 20:24:33 +11:00
Markus Elfring fb37e12896 powerpc/powernv/pci: Use kmalloc_array() in two functions
Use kmalloc_array(), which checks for overflow of the multiplication,
rather than doing it by hand.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-25 13:34:19 +11:00
Joel Stanley 14a41d6b75 powerpc/powernv: Report size of OPAL memcons log
The OPAL memory console is reported to be size zero, as we do not
initialise the struct attr with any size information due to the size
being variable. This leads users to think that the console is empty.

Instead report the maximum size.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-25 13:33:55 +11:00
Bart Van Assche 5299709d0a treewide: Constify most dma_map_ops structures
Most dma_map_ops structures are never modified. Constify these
structures such that these can be write-protected. This patch
has been generated as follows:

git grep -l 'struct dma_map_ops' |
  xargs -d\\n sed -i \
    -e 's/struct dma_map_ops/const struct dma_map_ops/g' \
    -e 's/const struct dma_map_ops {/struct dma_map_ops {/g' \
    -e 's/^const struct dma_map_ops;$/struct dma_map_ops;/' \
    -e 's/const const struct dma_map_ops /const struct dma_map_ops /g';
sed -i -e 's/const \(struct dma_map_ops intel_dma_ops\)/\1/' \
  $(git grep -l 'struct dma_map_ops intel_dma_ops');
sed -i -e 's/const \(struct dma_map_ops dma_iommu_ops\)/\1/' \
  $(git grep -l 'struct dma_map_ops' | grep ^arch/powerpc);
sed -i -e '/^struct vmd_dev {$/,/^};$/ s/const \(struct dma_map_ops[[:blank:]]dma_ops;\)/\1/' \
       -e '/^static void vmd_setup_dma_ops/,/^}$/ s/const \(struct dma_map_ops \*dest\)/\1/' \
       -e 's/const \(struct dma_map_ops \*dest = \&vmd->dma_ops\)/\1/' \
    drivers/pci/host/*.c
sed -i -e '/^void __init pci_iommu_alloc(void)$/,/^}$/ s/dma_ops->/intel_dma_ops./' arch/ia64/kernel/pci-dma.c
sed -i -e 's/static const struct dma_map_ops sn_dma_ops/static struct dma_map_ops sn_dma_ops/' arch/ia64/sn/pci/pci_dma.c
sed -i -e 's/(const struct dma_map_ops \*)//' drivers/misc/mic/bus/vop_bus.c

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: x86@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24 12:23:35 -05:00
Linus Torvalds 7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
Linus Torvalds de399813b5 powerpc updates for 4.10
Highlights include:
 
  - Support for the kexec_file_load() syscall, which is a prereq for secure and
    trusted boot.
 
  - Prevent kernel execution of userspace on P9 Radix (similar to SMEP/PXN).
 
  - Sort the exception tables at build time, to save time at boot, and store
    them as relative offsets to save space in the kernel image & memory.
 
  - Allow building the kernel with thin archives, which should allow us to build
    an allyesconfig once some other fixes land.
 
  - Build fixes to allow us to correctly rebuild when changing the kernel endian
    from big to little or vice versa.
 
  - Plumbing so that we can avoid doing a full mm TLB flush on P9 Radix.
 
  - Initial stack protector support (-fstack-protector).
 
  - Support for dumping the radix (aka. Linux) and hash page tables via debugfs.
 
  - Fix an oops in cxl coredump generation when cxl_get_fd() is used.
 
  - Freescale updates from Scott: "Highlights include 8xx hugepage support,
    qbman fixes/cleanup, device tree updates, and some misc cleanup."
 
  - Many and varied fixes and minor enhancements as always.
 
 Thanks to:
   Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual,
   Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz, Christophe Jaillet,
   Christophe Leroy, Denis Kirjanov, Elimar Riesebieter, Frederic Barrat,
   Gautham R. Shenoy, Geliang Tang, Geoff Levand, Jack Miller, Johan Hovold,
   Lars-Peter Clausen, Libin, Madhavan Srinivasan, Michael Neuling, Nathan
   Fontenot, Naveen N. Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin,
   Rashmica Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj
   Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYU4YSAAoJEFHr6jzI4aWAC4gQALtIAqqPon0Cd5b/FVVcMbW7
 mMqB2b/0FGEl5GoRTzGUDaQqElilm6AEVfHO86C7DFji/a6olneFfw87iz+mtWuZ
 JvrNq68ZiSnoeszdUy4MgtXFLb5sTzNMev4skaHfjI9E5CepWBoR0zH4G+kNVnd5
 WSgudv8Cq4Px+MEuTOigt3QYjHzZ3cw/XNOOm9c+oGj+PDW4O9UItVI+S1WLoey4
 rAB2nRcLMDPuwfRQC9XsF3zEbkv4h1dEXo/EBRuRpcF+0lLTzFw1lv1WE8OxlUmS
 kAXbty3dIytBfSbtJT0c0Ps6sfQ4HFhu6ZV2fjnxNTz2KDkBIN7LBYHmBYiqY9oZ
 9zvbUWtfiTu5ocfRtTq7rC/Hcj4Kbr9S9F/FvXR0WyDsKgu4xxAovqC3gcn6YjYK
 Rr1tcCI4nUzyhVJVmd+OEhUvc5JbFy9aGage+YeOyejfvvSbXIunaxWlPjoDkvim
 Vjl+UKU8gw51XFssqY5ZBi/HNlMFKYedLpMFp/fItnLglhj50V0eFWkpDgdSCYom
 vo9ifPLZx8n8m8De3H7TV4E0F4gCHcTeqZdu7tW9AAUVM6iLJcDLm3asGmtNh21t
 snOHNOJ5QSIno6ezUUg29T6VBjbPh46fdJJSlIZrEe8OzLZ1haGyttf0tD00PQvY
 Z2W/m3gxafnOeGgBqvyv
 =xOzf
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights include:

   - Support for the kexec_file_load() syscall, which is a prereq for
     secure and trusted boot.

   - Prevent kernel execution of userspace on P9 Radix (similar to
     SMEP/PXN).

   - Sort the exception tables at build time, to save time at boot, and
     store them as relative offsets to save space in the kernel image &
     memory.

   - Allow building the kernel with thin archives, which should allow us
     to build an allyesconfig once some other fixes land.

   - Build fixes to allow us to correctly rebuild when changing the
     kernel endian from big to little or vice versa.

   - Plumbing so that we can avoid doing a full mm TLB flush on P9
     Radix.

   - Initial stack protector support (-fstack-protector).

   - Support for dumping the radix (aka. Linux) and hash page tables via
     debugfs.

   - Fix an oops in cxl coredump generation when cxl_get_fd() is used.

   - Freescale updates from Scott: "Highlights include 8xx hugepage
     support, qbman fixes/cleanup, device tree updates, and some misc
     cleanup."

   - Many and varied fixes and minor enhancements as always.

  Thanks to:
    Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman
    Khandual, Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz,
    Christophe Jaillet, Christophe Leroy, Denis Kirjanov, Elimar
    Riesebieter, Frederic Barrat, Gautham R. Shenoy, Geliang Tang, Geoff
    Levand, Jack Miller, Johan Hovold, Lars-Peter Clausen, Libin,
    Madhavan Srinivasan, Michael Neuling, Nathan Fontenot, Naveen N.
    Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin, Rashmica
    Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj
    Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain"

[ And thanks to Michael, who took time off from a new baby to get this
  pull request done.   - Linus ]

* tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (174 commits)
  powerpc/fsl/dts: add FMan node for t1042d4rdb
  powerpc/fsl/dts: add sg_2500_aqr105_phy4 alias on t1024rdb
  powerpc/fsl/dts: add QMan and BMan nodes on t1024
  powerpc/fsl/dts: add QMan and BMan nodes on t1023
  soc/fsl/qman: test: use DEFINE_SPINLOCK()
  powerpc/fsl-lbc: use DEFINE_SPINLOCK()
  powerpc/8xx: Implement support of hugepages
  powerpc: get hugetlbpage handling more generic
  powerpc: port 64 bits pgtable_cache to 32 bits
  powerpc/boot: Request no dynamic linker for boot wrapper
  soc/fsl/bman: Use resource_size instead of computation
  soc/fsl/qe: use builtin_platform_driver
  powerpc/fsl_pmc: use builtin_platform_driver
  powerpc/83xx/suspend: use builtin_platform_driver
  powerpc/ftrace: Fix the comments for ftrace_modify_code
  powerpc/perf: macros for power9 format encoding
  powerpc/perf: power9 raw event format encoding
  powerpc/perf: update attribute_group data structure
  powerpc/perf: factor out the event format field
  powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown
  ...
2016-12-16 09:26:42 -08:00
Linus Torvalds 179a7ba680 This release has a few updates:
o STM can hook into the function tracer
  o Function filtering now supports more advance glob matching
  o Ftrace selftests updates and added tests
  o Softirq tag in traces now show only softirqs
  o ARM nop added to non traced locations at compile time
  o New trace_marker_raw file that allows for binary input
  o Optimizations to the ring buffer
  o Removal of kmap in trace_marker
  o Wakeup and irqsoff tracers now adhere to the set_graph_notrace file
  o Other various fixes and clean ups
 
 Note, there are two patches marked for stable. These were discovered
 near the end of the 4.9 rc release cycle. By the time I had them tested
 it was just a matter of days before 4.9 would be released, and I
 figured I would just submit them in the merge window. They are old
 bugs and not critical. Nothing non-root could abuse.
 -----BEGIN PGP SIGNATURE-----
 
 iQExBAABCAAbBQJYUrFHFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
 2+AIAIr20kSQV/nA5htGAeCTobVk3WUxY6bvjd9mIJDKPP19akNLyREW0G3KnfCr
 yhx4aFRZG98fRu/6F8qieRosyN36lADDVYHelMFHMpcTOpE2aZGjaaOuNGxOEA9v
 FmMPTX+K3+dzKyFP4l68R3+5JuQ1/AqLTioTWeLW8IDQ2OOVsjD8+0BuXrNKMJDY
 o6U4Hk5U/vn+zHc6BmgBzloAXemBd7iJ1t5V3FRRGvm8yv3HU85Twc5ofGeYTWvB
 J8PboEywRlIzxg0Kd8mxnMI5PgaKZSEc2ub8E7cY/CZ5PYpDE2xDA2hJmJgfYp00
 1VW+DHRpRZfElsCcya6S6P4bs5Y=
 =MGZ/
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "This release has a few updates:

   - STM can hook into the function tracer
   - Function filtering now supports more advance glob matching
   - Ftrace selftests updates and added tests
   - Softirq tag in traces now show only softirqs
   - ARM nop added to non traced locations at compile time
   - New trace_marker_raw file that allows for binary input
   - Optimizations to the ring buffer
   - Removal of kmap in trace_marker
   - Wakeup and irqsoff tracers now adhere to the set_graph_notrace file
   - Other various fixes and clean ups"

* tag 'trace-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (42 commits)
  selftests: ftrace: Shift down default message verbosity
  kprobes/trace: Fix kprobe selftest for newer gcc
  tracing/kprobes: Add a helper method to return number of probe hits
  tracing/rb: Init the CPU mask on allocation
  tracing: Use SOFTIRQ_OFFSET for softirq dectection for more accurate results
  tracing/fgraph: Have wakeup and irqsoff tracers ignore graph functions too
  fgraph: Handle a case where a tracer ignores set_graph_notrace
  tracing: Replace kmap with copy_from_user() in trace_marker writing
  ftrace/x86_32: Set ftrace_stub to weak to prevent gcc from using short jumps to it
  tracing: Allow benchmark to be enabled at early_initcall()
  tracing: Have system enable return error if one of the events fail
  tracing: Do not start benchmark on boot up
  tracing: Have the reg function allow to fail
  ring-buffer: Force rb_end_commit() and rb_set_commit_to_write() inline
  ring-buffer: Froce rb_update_write_stamp() to be inlined
  ring-buffer: Force inline of hotpath helper functions
  tracing: Make __buffer_unlock_commit() always_inline
  tracing: Make tracepoint_printk a static_key
  ring-buffer: Always inline rb_event_data()
  ring-buffer: Make rb_reserve_next_event() always inlined
  ...
2016-12-15 13:49:34 -08:00
Steven Rostedt (Red Hat) 8cf868affd tracing: Have the reg function allow to fail
Some tracepoints have a registration function that gets enabled when the
tracepoint is enabled. There may be cases that the registraction function
must fail (for example, can't allocate enough memory). In this case, the
tracepoint should also fail to register, otherwise the user would not know
why the tracepoint is not working.

Cc: David Howells <dhowells@redhat.com>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-12-09 09:13:30 -05:00
Thiago Jung Bauermann da6658859b powerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE instead.
Commit 2965faa5e0 ("kexec: split kexec_load syscall from kexec core
code") introduced CONFIG_KEXEC_CORE so that CONFIG_KEXEC means whether
the kexec_load system call should be compiled-in and CONFIG_KEXEC_FILE
means whether the kexec_file_load system call should be compiled-in.
These options can be set independently from each other.

Since until now powerpc only supported kexec_load, CONFIG_KEXEC and
CONFIG_KEXEC_CORE were synonyms. That is not the case anymore, so we
need to make a distinction. Almost all places where CONFIG_KEXEC was
being used should be using CONFIG_KEXEC_CORE instead, since
kexec_file_load also needs that code compiled in.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-30 23:15:11 +11:00
Michael Ellerman ddbefe7e77 Merge branch 'topic/ppc-kvm' into next
Merge the topic branch we're sharing with the kvm-ppc tree.
2016-11-24 22:14:52 +11:00
Paul Mackerras ffe6d810fe powerpc/powernv: Define real-mode versions of OPAL XICS accessors
This defines real-mode versions of opal_int_get_xirr(), opal_int_eoi()
and opal_int_set_mfrr(), for use by KVM real-mode code.

It also exports opal_int_set_mfrr() so that the modular part of KVM
can use it to send IPIs.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-23 10:32:11 +11:00
Russell Currey 1f52f17614 powerpc/pci: Always print PHB and PE numbers as hexadecimal
PHB, PE (and by association MVE) numbers are printed as a mix of decimal
and hexadecimal throughout the kernel.  This can be misleading, so make
them all hexadecimal.

Standardising on hex instead of dec because:

 - PHB numbers are presented in hex in sysfs/debugfs (and lspci, etc)
 - PE numbers are presented as hex in sysfs and parsed in hex in debugfs

The only place I think this could cause confusing are the messages during
boot, i.e.

	pci 000a:01     : [PE# 000] Secondary bus 1 associated with PE#0

which can be a quick way to check PE numbers.  pe_level_printk() will
only print two characters instead of three, so the above would be

	pci 000a:01     : [PE# 00] Secondary bus 1 associated with PE#0

which gives a hint it's in hex.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-22 11:57:07 +11:00
Russell Currey d4791db527 powerpc/powernv: Don't warn on PE init if unfreeze is unsupported
Whenever a PE is initialised in powernv, opal_pci_eeh_freeze_clear() is
called.  This is to remove any existing freeze, and has no negative side
effects if the PE is already in an unfrozen state.  On PHB backends that
don't support this operation and return OPAL_UNSUPPORTED, this creates a
scary and misleading warning message.

Skip the warning message on init if OPAL_UNSUPPORTED is returned.

As far as I'm aware, this currently only affects NPUs.

Fixes: 313483d ("powerpc/powernv: Unfreeze PE on allocation")
Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-22 11:57:07 +11:00
Jack Miller 9e4f51bdaf powerpc/powernv: Simplify searching for compatible device nodes
This condenses the opal node searching into a single function that finds
all compatible nodes, instead of just searching the ibm,opal children,
for ipmi, flash, and prd similar to how opal-i2c nodes are found.

Signed-off-by: Jack Miller <jack@codezen.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-14 20:05:57 +11:00
Linus Torvalds 07021b4359 powerpc updates for 4.9
Highlights:
  - Major rework of Book3S 64-bit exception vectors (Nicholas Piggin)
    - Use gas sections for arranging exception vectors et. al.
  - Large set of TM cleanups and selftests (Cyril Bur)
  - Enable transactional memory (TM) lazily for userspace (Cyril Bur)
  - Support for XZ compression in the zImage wrapper (Oliver O'Halloran)
  - Add support for bpf constant blinding (Naveen N. Rao)
  - Beginnings of upstream support for PA Semi Nemo motherboards (Darren Stevens)
 
 Fixes:
  - Ensure .mem(init|exit).text are within _stext/_etext (Michael Ellerman)
  - xmon: Don't use ld on 32-bit (Michael Ellerman)
  - vdso64: Use double word compare on pointers (Anton Blanchard)
  - powerpc/nvram: Fix an incorrect partition merge (Pan Xinhui)
  - powerpc: Fix usage of _PAGE_RO in hugepage (Christophe Leroy)
  - powerpc/mm: Update FORCE_MAX_ZONEORDER range to allow hugetlb w/4K (Aneesh Kumar K.V)
  - Fix memory leak in queue_hotplug_event() error path (Andrew Donnellan)
  - Replay hypervisor maintenance interrupt first (Nicholas Piggin)
 
 Cleanups & features:
  - Sparse fixes/cleanups (Daniel Axtens)
  - Preserve CFAR value on SLB miss caused by access to bogus address (Paul Mackerras)
  - Radix MMU fixups for POWER9 (Aneesh Kumar K.V)
  - Support for setting used_(vsr|vr|spe) in sigreturn path (for CRIU) (Simon Guo)
  - Optimise syscall entry for virtual, relocatable case (Nicholas Piggin)
  - Optimise MSR handling in exception handling (Nicholas Piggin)
  - Support for kexec with Radix MMU (Benjamin Herrenschmidt)
  - powernv EEH fixes (Russell Currey)
  - Suprise PCI hotplug support for powernv (Gavin Shan)
  - Endian/sparse fixes for powernv PCI (Gavin Shan)
  - Defconfig updates (Anton Blanchard)
  - Various performance optimisations (Anton Blanchard)
    - Align hot loops of memset() and backwards_memcpy()
    - During context switch, check before setting mm_cpumask
    - Remove static branch prediction in atomic{, 64}_add_unless
    - Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little endian
    - Set default CPU type to POWER8 for little endian builds
 
  - KVM: PPC: Book3S HV: Migrate pinned pages out of CMA (Balbir Singh)
  - cxl: Flush PSL cache before resetting the adapter (Frederic Barrat)
  - cxl: replace loop with for_each_child_of_node(), remove unneeded of_node_put() (Andrew Donnellan)
  - Fix HV facility unavailable to use correct handler (Nicholas Piggin)
  - Remove unnecessary syscall trampoline (Nicholas Piggin)
  - fadump: Fix build break when CONFIG_PROC_VMCORE=n (Michael Ellerman)
  - Quieten EEH message when no adapters are found (Anton Blanchard)
  - powernv: Add PHB register dump debugfs handle (Russell Currey)
  - Use kprobe blacklist for exception handlers & asm functions (Nicholas Piggin)
  - Document the syscall ABI (Nicholas Piggin)
  - MAINTAINERS: Update cxl maintainers (Michael Neuling)
  - powerpc: Remove all usages of NO_IRQ (Michael Ellerman)
 
 Minor cleanups:
  - Andrew Donnellan, Christophe Leroy, Colin Ian King, Cyril Bur, Frederic Barrat,
    Pan Xinhui, PrasannaKumar Muralidharan, Rui Teng, Simon Guo.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX9x5ZAAoJEFHr6jzI4aWAWQ0P+gOhdtayMsRY0k0dzPmYaFr0
 Ha5v968RJaNIyGGM9ARJg8h27PGMaSlBp/9zaYdk1G7xfv/DMR0uq8d8l5pjy/Zw
 Jm72WE4PEX/zAcQxry6Y2fDdumO09crTBA/W0hM1UZzqu0bcVUfD+E51ZFYWW7yh
 fyhT2YnlucxIcT34pxsLqwTIiZYG4xgN3+YGo0wohY1D1GHE3UZ7SXIglb49yM6v
 ZeXrL7SOdERR1w88rC+g99P/cWng5HDS0wPLUbxGT5KIpoOSXOs7EbZwFqQBUy5O
 37PB07K5dDyUbrm++l5lUigldF3W1OZQBN5+n8PciulxxwFX84pllTlAxv1p60JR
 piEKZ8pl023IF7zMGatUG9qcNOcnbxdMsAhoEhlcFi9ulM/yLzbmRTKVfDYm+O/J
 UI+YtcbsgdyOXMdGXCqdpeBNuuypgLG/g7gC8bnk3taS0LUUZLcXtRNuE4tcPJJe
 v8FnszaLkjAi83Lmzt3fgZo7DI1RIPwDSw6fY+nBrxCRfEPRVx3f7KhmUXvSeol5
 Ln9xpk4AtyQt1RHhckxXwWSUgvXVg2ltmz7ElqK4sQ9mO/D2ZIs6R6fPY4VlJLc4
 /2yIV4RLIsbHmdv9IbJ8PBp0VTugSNdicZ904QiAHSZQv/i1mgYuXw3tjR6kuy9f
 bKOzNJTwLV1WUsOlUpiq
 =Jnn8
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights:
   - Major rework of Book3S 64-bit exception vectors (Nicholas Piggin)
   - Use gas sections for arranging exception vectors et. al.
   - Large set of TM cleanups and selftests (Cyril Bur)
   - Enable transactional memory (TM) lazily for userspace (Cyril Bur)
   - Support for XZ compression in the zImage wrapper (Oliver
     O'Halloran)
   - Add support for bpf constant blinding (Naveen N. Rao)
   - Beginnings of upstream support for PA Semi Nemo motherboards
     (Darren Stevens)

  Fixes:
   - Ensure .mem(init|exit).text are within _stext/_etext (Michael
     Ellerman)
   - xmon: Don't use ld on 32-bit (Michael Ellerman)
   - vdso64: Use double word compare on pointers (Anton Blanchard)
   - powerpc/nvram: Fix an incorrect partition merge (Pan Xinhui)
   - powerpc: Fix usage of _PAGE_RO in hugepage (Christophe Leroy)
   - powerpc/mm: Update FORCE_MAX_ZONEORDER range to allow hugetlb w/4K
     (Aneesh Kumar K.V)
   - Fix memory leak in queue_hotplug_event() error path (Andrew
     Donnellan)
   - Replay hypervisor maintenance interrupt first (Nicholas Piggin)

  Various performance optimisations (Anton Blanchard):
   - Align hot loops of memset() and backwards_memcpy()
   - During context switch, check before setting mm_cpumask
   - Remove static branch prediction in atomic{, 64}_add_unless
   - Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little
     endian
   - Set default CPU type to POWER8 for little endian builds

  Cleanups & features:
   - Sparse fixes/cleanups (Daniel Axtens)
   - Preserve CFAR value on SLB miss caused by access to bogus address
     (Paul Mackerras)
   - Radix MMU fixups for POWER9 (Aneesh Kumar K.V)
   - Support for setting used_(vsr|vr|spe) in sigreturn path (for CRIU)
     (Simon Guo)
   - Optimise syscall entry for virtual, relocatable case (Nicholas
     Piggin)
   - Optimise MSR handling in exception handling (Nicholas Piggin)
   - Support for kexec with Radix MMU (Benjamin Herrenschmidt)
   - powernv EEH fixes (Russell Currey)
   - Suprise PCI hotplug support for powernv (Gavin Shan)
   - Endian/sparse fixes for powernv PCI (Gavin Shan)
   - Defconfig updates (Anton Blanchard)
   - KVM: PPC: Book3S HV: Migrate pinned pages out of CMA (Balbir Singh)
   - cxl: Flush PSL cache before resetting the adapter (Frederic Barrat)
   - cxl: replace loop with for_each_child_of_node(), remove unneeded
     of_node_put() (Andrew Donnellan)
   - Fix HV facility unavailable to use correct handler (Nicholas
     Piggin)
   - Remove unnecessary syscall trampoline (Nicholas Piggin)
   - fadump: Fix build break when CONFIG_PROC_VMCORE=n (Michael
     Ellerman)
   - Quieten EEH message when no adapters are found (Anton Blanchard)
   - powernv: Add PHB register dump debugfs handle (Russell Currey)
   - Use kprobe blacklist for exception handlers & asm functions
     (Nicholas Piggin)
   - Document the syscall ABI (Nicholas Piggin)
   - MAINTAINERS: Update cxl maintainers (Michael Neuling)
   - powerpc: Remove all usages of NO_IRQ (Michael Ellerman)

  Minor cleanups:
   - Andrew Donnellan, Christophe Leroy, Colin Ian King, Cyril Bur,
     Frederic Barrat, Pan Xinhui, PrasannaKumar Muralidharan, Rui Teng,
     Simon Guo"

* tag 'powerpc-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (156 commits)
  powerpc/bpf: Add support for bpf constant blinding
  powerpc/bpf: Implement support for tail calls
  powerpc/bpf: Introduce accessors for using the tmp local stack space
  powerpc/fadump: Fix build break when CONFIG_PROC_VMCORE=n
  powerpc: tm: Enable transactional memory (TM) lazily for userspace
  powerpc/tm: Add TM Unavailable Exception
  powerpc: Remove do_load_up_transact_{fpu,altivec}
  powerpc: tm: Rename transct_(*) to ck(\1)_state
  powerpc: tm: Always use fp_state and vr_state to store live registers
  selftests/powerpc: Add checks for transactional VSXs in signal contexts
  selftests/powerpc: Add checks for transactional VMXs in signal contexts
  selftests/powerpc: Add checks for transactional FPUs in signal contexts
  selftests/powerpc: Add checks for transactional GPRs in signal contexts
  selftests/powerpc: Check that signals always get delivered
  selftests/powerpc: Add TM tcheck helpers in C
  selftests/powerpc: Allow tests to extend their kill timeout
  selftests/powerpc: Introduce GPR asm helper header file
  selftests/powerpc: Move VMX stack frame macros to header file
  selftests/powerpc: Rework FPU stack placement macros and move to header file
  selftests/powerpc: Check for VSX preservation across userspace preemption
  ...
2016-10-07 20:19:31 -07:00
Linus Torvalds 6218590bcb KVM updates for v4.9-rc1
All architectures:
   Move `make kvmconfig` stubs from x86;  use 64 bits for debugfs stats.
 
 ARM:
   Important fixes for not using an in-kernel irqchip; handle SError
   exceptions and present them to guests if appropriate; proxying of GICV
   access at EL2 if guest mappings are unsafe; GICv3 on AArch32 on ARMv8;
   preparations for GICv3 save/restore, including ABI docs; cleanups and
   a bit of optimizations.
 
 MIPS:
   A couple of fixes in preparation for supporting MIPS EVA host kernels;
   MIPS SMP host & TLB invalidation fixes.
 
 PPC:
   Fix the bug which caused guests to falsely report lockups; other minor
   fixes; a small optimization.
 
 s390:
   Lazy enablement of runtime instrumentation; up to 255 CPUs for nested
   guests; rework of machine check deliver; cleanups and fixes.
 
 x86:
   IOMMU part of AMD's AVIC for vmexit-less interrupt delivery; Hyper-V
   TSC page; per-vcpu tsc_offset in debugfs; accelerated INS/OUTS in
   nVMX; cleanups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJX9iDrAAoJEED/6hsPKofoOPoIAIUlgojkb9l2l1XVDgsXdgQL
 sRVhYSVv7/c8sk9vFImrD5ElOPZd+CEAIqFOu45+NM3cNi7gxip9yftUVs7wI5aC
 eDZRWm1E4trDZLe54ZM9ThcqZzZZiELVGMfR1+ZndUycybwyWzafpXYsYyaXp3BW
 hyHM3qVkoWO3dxBWFwHIoO/AUJrWYkRHEByKyvlC6KPxSdBPSa5c1AQwMCoE0Mo4
 K/xUj4gBn9eMelNhg4Oqu/uh49/q+dtdoP2C+sVM8bSdquD+PmIeOhPFIcuGbGFI
 B+oRpUhIuntN39gz8wInJ4/GRSeTuR2faNPxMn4E1i1u4LiuJvipcsOjPfe0a18=
 =fZRB
 -----END PGP SIGNATURE-----

Merge tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Radim Krčmář:
 "All architectures:
   - move `make kvmconfig` stubs from x86
   - use 64 bits for debugfs stats

  ARM:
   - Important fixes for not using an in-kernel irqchip
   - handle SError exceptions and present them to guests if appropriate
   - proxying of GICV access at EL2 if guest mappings are unsafe
   - GICv3 on AArch32 on ARMv8
   - preparations for GICv3 save/restore, including ABI docs
   - cleanups and a bit of optimizations

  MIPS:
   - A couple of fixes in preparation for supporting MIPS EVA host
     kernels
   - MIPS SMP host & TLB invalidation fixes

  PPC:
   - Fix the bug which caused guests to falsely report lockups
   - other minor fixes
   - a small optimization

  s390:
   - Lazy enablement of runtime instrumentation
   - up to 255 CPUs for nested guests
   - rework of machine check deliver
   - cleanups and fixes

  x86:
   - IOMMU part of AMD's AVIC for vmexit-less interrupt delivery
   - Hyper-V TSC page
   - per-vcpu tsc_offset in debugfs
   - accelerated INS/OUTS in nVMX
   - cleanups and fixes"

* tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (140 commits)
  KVM: MIPS: Drop dubious EntryHi optimisation
  KVM: MIPS: Invalidate TLB by regenerating ASIDs
  KVM: MIPS: Split kernel/user ASID regeneration
  KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
  KVM: arm/arm64: vgic: Don't flush/sync without a working vgic
  KVM: arm64: Require in-kernel irqchip for PMU support
  KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register
  KVM: PPC: Book3S PR: Support 64kB page size on POWER8E and POWER8NVL
  KVM: PPC: Book3S: Remove duplicate setting of the B field in tlbie
  KVM: PPC: BookE: Fix a sanity check
  KVM: PPC: Book3S HV: Take out virtual core piggybacking code
  KVM: PPC: Book3S: Treat VTB as a per-subcore register, not per-thread
  ARM: gic-v3: Work around definition of gic_write_bpr1
  KVM: nVMX: Fix the NMI IDT-vectoring handling
  KVM: VMX: Enable MSR-BASED TPR shadow even if APICv is inactive
  KVM: nVMX: Fix reload apic access page warning
  kvmconfig: add virtio-gpu to config fragment
  config: move x86 kvm_guest.config to a common location
  arm64: KVM: Remove duplicating init code for setting VMID
  ARM: KVM: Support vgic-v3
  ...
2016-10-06 10:49:01 -07:00
Gavin Shan 0e7736c6b8 powerpc/powernv: Fix data type for @r in pnv_ioda_parse_m64_window()
This fixes warning reported from sparse:

  pci-ioda.c:451:49: warning: incorrect type in argument 2 (different base types)

Fixes: 262af557dd ("powerpc/powernv: Enable M64 aperatus for PHB3")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04 16:30:28 +11:00
Gavin Shan 5adaf8629b powerpc/powernv: Use CPU-endian PEST in pnv_pci_dump_p7ioc_diag_data()
This fixes the warnings reported from sparse:

  pci.c:312:33: warning: restricted __be64 degrades to integer
  pci.c:313:33: warning: restricted __be64 degrades to integer

Fixes: cee72d5bb4 ("powerpc/powernv: Display diag data on p7ioc EEH errors")
Cc: stable@vger.kernel.org # v3.3+
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04 16:29:59 +11:00
Gavin Shan a7032132d7 powerpc/powernv: Use CPU-endian hub diag-data type in pnv_eeh_get_and_dump_hub_diag()
The hub diag-data type is filled with big-endian data by OPAL call
opal_pci_get_hub_diag_data(). We need convert it to CPU-endian value
before using it. The issue is reported by sparse as pointed by Michael
Ellerman:

  eeh-powernv.c:1309:21: warning: restricted __be16 degrades to integer

This converts hub diag-data type to CPU-endian before using it in
pnv_eeh_get_and_dump_hub_diag().

Fixes: 2a485ad7c8 ("powerpc/powernv: Drop PHB operation next_error()")
Cc: stable@vger.kernel.org # v4.1+
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04 16:29:23 +11:00
Gavin Shan d63e51b31e powerpc/powernv: Pass CPU-endian PE number to opal_pci_eeh_freeze_clear()
The PE number (@frozen_pe_no), filled by opal_pci_next_error() is in
big-endian format. It should be converted to CPU-endian before it is
passed to opal_pci_eeh_freeze_clear() when clearing the frozen state if
the PE is invalid one. As Michael Ellerman pointed out, the issue is
also detected by sparse:

  eeh-powernv.c:1541:41: warning: incorrect type in argument 2 (different base types)

This passes CPU-endian PE number to opal_pci_eeh_freeze_clear() and it
should be part of commit <0f36db77643b> ("powerpc/eeh: Fix wrong printed
PE number"), which was merged to 4.3 kernel.

Fixes: 71b540adff ("powerpc/powernv: Don't escalate non-existing frozen PE")
Cc: stable@vger.kernel.org # v4.3+
Suggested-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04 16:28:18 +11:00
Gavin Shan 313483dd72 powerpc/powernv: Unfreeze PE on allocation
This unfreezes PE when it's initialized because the PE might be put
into frozen state in the last hot remove path. It's not harmful to
do so if the PE is already in unfrozen state.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-29 15:01:53 +10:00
Gavin Shan fbce44d0ed powerpc/powernv: Call opal_pci_poll() if needed
When issuing PHB reset, OPAL API opal_pci_poll() is called to drive
the state machine in OPAL forward. However, we needn't always call
the function under some circumstances like reset deassert.

This avoids calling opal_pci_poll() when OPAL_SUCCESS is returned
from opal_pci_reset(). Except the overhead introduced by additional
one unnecessary OPAL call, I didn't run into real issue because of
this.

Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-29 14:50:51 +10:00
Andrew Donnellan 6060e9ea8d powerpc/powernv: Fix comment style and spelling
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23 07:54:23 +10:00
Russell Currey e98ddb7716 powerpc/powernv/eeh: Skip finding bus for VF resets
When the PE used in pnv_eeh_reset() is that of a VF,
pnv_eeh_reset_vf_pe() is used.  Unlike the other reset functions called
in pnv_eeh_reset(), the VF reset doesn't require a bus, and if a bus was
missing the function would error out before resetting the VF PE.

To avoid this, reorder the VF reset function to occur before finding and
checking the bus.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23 07:54:21 +10:00
Russell Currey 04fec21c06 powerpc/eeh: Null check uses of eeh_pe_bus_get
eeh_pe_bus_get() can return NULL if a PCI bus isn't found for a given PE.
Some callers don't check this, and can cause a null pointer dereference
under certain circumstances.

Fix this by checking NULL everywhere eeh_pe_bus_get() is called.

Fixes: 8a6b1bc70d ("powerpc/eeh: EEH core to handle special event")
Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23 07:54:20 +10:00
Russell Currey 98b665da57 powerpc/powernv/pci: Add PHB register dump debugfs handle
On EEH events the kernel will print a dump of relevant registers.
If EEH is unavailable (i.e. CONFIG_EEH is disabled, a new platform
doesn't have EEH support, etc) this information isn't readily available.

Add a new debugfs handler to trigger a PHB register dump, so that this
information can be made available on demand.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23 07:54:19 +10:00
Russell Currey b79331a5eb powerpc/powernv/pci: Fix m64 checks for SR-IOV and window alignment
Commit 5958d19a14 checks for prefetchable m64 BARs by comparing the
addresses instead of using resource flags.  This broke SR-IOV as the m64
check in pnv_pci_ioda_fixup_iov_resources() fails.

The condition in pnv_pci_window_alignment() also changed to checking
only IORESOURCE_MEM_64 instead of both IORESOURCE_MEM_64 and
IORESOURCE_PREFETCH.

Revert these cases to the previous behaviour, adding a new helper function
to do so.  This is named pnv_pci_is_m64_flags() to make it clear this
function is only looking at resource flags and should not be relied on for
non-SRIOV resources.

Fixes: 5958d19a14 ("Fix incorrect PE reservation attempt on some 64-bit BARs")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-21 14:04:13 +10:00
Michael Ellerman ef24ba7091 powerpc: Remove all usages of NO_IRQ
NO_IRQ has been == 0 on powerpc for just over ten years (since commit
0ebfff1491 ("[POWERPC] Add new interrupt mapping core and change
platforms to use it")). It's also 0 on most other arches.

Although it's fairly harmless, every now and then it causes confusion
when a driver is built on powerpc and another arch which doesn't define
NO_IRQ. There's at least 6 definitions of NO_IRQ in drivers/, at least
some of which are to work around that problem.

So we'd like to remove it. This is fairly trivial in the arch code, we
just convert:

    if (irq == NO_IRQ)	to	if (!irq)
    if (irq != NO_IRQ)	to	if (irq)
    irq = NO_IRQ;	to	irq = 0;
    return NO_IRQ;	to	return 0;

And a few other odd cases as well.

At least for now we keep the #define NO_IRQ, because there is driver
code that uses NO_IRQ and the fixes to remove those will go via other
trees.

Note we also change some occurrences in PPC sound drivers, drivers/ps3,
and drivers/macintosh.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-20 20:57:12 +10:00
Michael Ellerman ed7d9a1d7d powerpc/powernv/pci: Fix missed TCE invalidations that should fallback to OPAL
In commit f0228c4130 ("powerpc/powernv/pci: Fallback to OPAL for TCE
invalidations"), we added logic to fallback to OPAL for doing TCE
invalidations if we can't do it in Linux.

Ben sent a v2 of the patch, containing these additional call sites, but
I had already applied v1 and didn't notice. So fix them now.

Fixes: f0228c4130 ("powerpc/powernv/pci: Fallback to OPAL for TCE invalidations")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-15 17:05:11 +10:00
Gavin Shan 29bf282dec powerpc/powernv: Detach from PE on releasing PCI device
The PCI hotplug can be part of EEH error recovery. The @pdn and
the device's PE number aren't removed and added afterwords. The
PE number in @pdn should be set to an invalid one. Otherwise, the
PE's device count is decreased on removing devices while failing
to be increased on adding devices. It leads to unbalanced PE's
device count and make normal PCI hotplug path broken.

Fixes: c5f7700bbd ("powerpc/powernv: Dynamically release PE")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-15 13:53:19 +10:00
Gavin Shan 6eaed1665f powerpc/powernv: Fix the state of root PE
The PE for root bus (root PE) can be removed because of PCI hot
remove in EEH recovery path for fenced PHB error. We need update
@phb->root_pe_populated accordingly so that the root PE can be
populated again in forthcoming PCI hot add path. Also, the PE
shouldn't be destroyed as it's global and reserved resource.

Fixes: c5f7700bbd ("powerpc/powernv: Dynamically release PE")
Reported-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-14 11:40:09 +10:00
Daniel Axtens 7c98bd7208 powerpc/sparse: Make a bunch of things static
Squash a bunch of sparse warnings by making things static.

Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13 17:35:47 +10:00
Paul Mackerras 5d375199ea KVM: PPC: Book3S HV: Set server for passed-through interrupts
When a guest has a PCI pass-through device with an interrupt, it
will direct the interrupt to a particular guest VCPU.  In fact the
physical interrupt might arrive on any CPU, and then get
delivered to the target VCPU in the emulated XICS (guest interrupt
controller), and eventually delivered to the target VCPU.

Now that we have code to handle device interrupts in real mode
without exiting to the host kernel, there is an advantage to having
the device interrupt arrive on the same sub(core) as the target
VCPU is running on.  In this situation, the interrupt can be
delivered to the target VCPU without any exit to the host kernel
(using a hypervisor doorbell interrupt between threads if
necessary).

This patch aims to get passed-through device interrupts arriving
on the correct core by setting the interrupt server in the real
hardware XICS for the interrupt to the first thread in the (sub)core
where its target VCPU is running.  We do this in the real-mode H_EOI
code because the H_EOI handler already needs to look at the
emulated ICS state for the interrupt (whereas the H_XIRR handler
doesn't), and we know we are running in the target VCPU context
at that point.

We set the server CPU in hardware using an OPAL call, regardless of
what the IRQ affinity mask for the interrupt says, and without
updating the affinity mask.  This amounts to saying that when an
interrupt is passed through to a guest, as a matter of policy we
allow the guest's affinity for the interrupt to override the host's.

This is inspired by an earlier patch from Suresh Warrier, although
none of this code came from that earlier patch.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-12 10:12:28 +10:00
Suresh Warrier 4ee11c1a9f powerpc/powernv: Provide facilities for EOI, usable from real mode
This adds a new function pnv_opal_pci_msi_eoi() which does the part of
end-of-interrupt (EOI) handling of an MSI which involves doing an
OPAL call.  This function can be called in real mode.  This doesn't
just export pnv_ioda2_msi_eoi() because that does a call to
icp_native_eoi(), which does not work in real mode.

This also adds a function, is_pnv_opal_msi(), which KVM can call to
check whether an interrupt is one for which we should be calling
pnv_opal_pci_msi_eoi() when we need to do an EOI.

[paulus@ozlabs.org - split out the addition of pnv_opal_pci_msi_eoi()
 from Suresh's patch "KVM: PPC: Book3S HV: Handle passthrough
 interrupts in guest"; added is_pnv_opal_msi(); wrote description.]

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-09 16:17:59 +10:00
Gavin Shan caa58f8088 powerpc/powernv: Fix corrupted PE allocation bitmap on releasing PE
In pnv_ioda_free_pe(), the PE object (including the associated PE
number) is cleared before resetting the corresponding bit in the
PE allocation bitmap. It means PE#0 is always released to the bitmap
wrongly.

This fixes above issue by caching the PE number before the PE object
is cleared.

Fixes: 1e9167726c ("powerpc/powernv: Use PE instead of number during setup and release"
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-08 13:12:52 +10:00
Gavin Shan b314427a52 powerpc/powernv: Fix crash on releasing compound PE
The compound PE is created to accommodate the devices attached to
one specific PCI bus that consume multiple M64 segments. The compound
PE is made up of one master PE and possibly multiple slave PEs. The
slave PEs should be destroyed when releasing the master PE. A kernel
crash happens when derferencing @pe->pdev on releasing the slave PE
in pnv_ioda_deconfigure_pe().

  # echo 0 > /sys/bus/pci/slots/C7/power
  iommu: Removing device 0000:01:00.1 from group 0
  iommu: Removing device 0000:01:00.0 from group 0
  Unable to handle kernel paging request for data at address 0x00000010
  Faulting instruction address: 0xc00000000005d898
  cpu 0x1: Vector: 300 (Data Access) at [c000000fe8217620]
      pc: c00000000005d898: pnv_ioda_release_pe+0x288/0x610
      lr: c00000000005dbdc: pnv_ioda_release_pe+0x5cc/0x610
      sp: c000000fe82178a0
     msr: 9000000000009033
     dar: 10
   dsisr: 40000000
    current = 0xc000000fe815ab80
    paca    = 0xc00000000ff00400	 softe: 0	 irq_happened: 0x01
      pid   = 2709, comm = sh
  Linux version 4.8.0-rc5-gavin-00006-g745efdb (gwshan@gwshan) \
  (gcc version 4.9.3 (Buildroot 2016.02-rc2-00093-g5ea3bce) ) #586 SMP \
  Tue Sep 6 13:37:29 AEST 2016
  enter ? for help
  [c000000fe8217940] c00000000005d684 pnv_ioda_release_pe+0x74/0x610
  [c000000fe82179e0] c000000000034460 pcibios_release_device+0x50/0x70
  [c000000fe8217a10] c0000000004aba80 pci_release_dev+0x50/0xa0
  [c000000fe8217a40] c000000000704898 device_release+0x58/0xf0
  [c000000fe8217ac0] c000000000470510 kobject_release+0x80/0xf0
  [c000000fe8217b00] c000000000704dd4 put_device+0x24/0x40
  [c000000fe8217b20] c0000000004af94c pci_remove_bus_device+0x12c/0x150
  [c000000fe8217b60] c000000000034244 pci_hp_remove_devices+0x94/0xd0
  [c000000fe8217ba0] c0000000004ca444 pnv_php_disable_slot+0x64/0xb0
  [c000000fe8217bd0] c0000000004c88c0 power_write_file+0xa0/0x190
  [c000000fe8217c50] c0000000004c248c pci_slot_attr_store+0x3c/0x60
  [c000000fe8217c70] c0000000002d6494 sysfs_kf_write+0x94/0xc0
  [c000000fe8217cb0] c0000000002d50f0 kernfs_fop_write+0x180/0x260
  [c000000fe8217d00] c0000000002334a0 __vfs_write+0x40/0x190
  [c000000fe8217d90] c000000000234738 vfs_write+0xc8/0x240
  [c000000fe8217de0] c000000000236250 SyS_write+0x60/0x110
  [c000000fe8217e30] c000000000009524 system_call+0x38/0x108

It fixes the kernel crash by bypassing releasing resources (DMA,
IO and memory segments, PELTM) because there are no resources assigned
to the slave PE.

Fixes: c5f7700bbd ("powerpc/powernv: Dynamically release PE")
Reported-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-06 14:54:46 +10:00
Mukesh Ojha a9cbf0b219 powerpc/powernv : Drop reference added by kset_find_obj()
In a situation, where Linux kernel gets notified about duplicate error log
from OPAL, it is been observed that kernel fails to remove sysfs entries
(/sys/firmware/opal/elog/0xXXXXXXXX) of such error logs. This is because,
we currently search the error log/dump kobject in the kset list via
'kset_find_obj()' routine. Which eventually increment the reference count
by one, once it founds the kobject.

So, unless we decrement the reference count by one after it found the kobject,
we would not be able to release the kobject properly later.

This patch adds the 'kobject_put()' which was missing earlier.

Signed-off-by: Mukesh Ojha <mukesh02@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-29 12:48:21 +10:00
Andrzej Hajda 6096481649 powerpc/powernv/pci: fix iterator signedness
Unsigned type is always non-negative, so the loop could not end in case
condition is never true.

The problem has been detected using semantic patch
scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
Benjamin Herrenschmidt 5958d19a14 powerpc/pnv/pci: Fix incorrect PE reservation attempt on some 64-bit BARs
The generic allocation code may sometimes decide to assign a prefetchable
64-bit BAR to the M32 window. In fact it may also decide to allocate
a 64-bit non-prefetchable BAR to the M64 one ! So using the resource
flags as a test to decide which window was used for PE allocation is
just wrong and leads to insane PE numbers.

Instead, compare the addresses to figure it out.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Rename the function as agreed by Ben & Gavin]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09 19:51:47 +10:00
Mahesh Salgaonkar c74dd88e77 powerpc/book3s: Fix MCE console messages for unrecoverable MCE.
When machine check occurs with MSR(RI=0), it means MC interrupt is
unrecoverable and kernel goes down to panic path. But the console
message still shows it as recovered. This patch fixes the MCE console
messages.

Fixes: 36df96f8ac ("powerpc/book3s: Decode and save machine check event.")
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09 19:46:54 +10:00
Alexey Kardashevskiy 4d9021957b powerpc/powernv/ioda: Fix TCE invalidate to work in real mode again
Commit fd141d1a99 ("powerpc/powernv/pci: Rework accessing the TCE
invalidate register") broke TCE invalidation on IODA2/PHB3 for real
mode.

This makes invalidate work again.

Fixes: fd141d1a99 ("powerpc/powernv/pci: Rework accessing the TCE invalidate register")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09 14:50:19 +10:00
Benjamin Herrenschmidt 880a3d6afd powerpc/xics: Properly set Edge/Level type and enable resend
This sets the type of the interrupt appropriately. We set it as follow:

 - If not mapped from the device-tree, we use edge. This is the case
of the virtual interrupts and PCI MSIs for example.

 - If mapped from the device-tree and #interrupt-cells is 2 (PAPR
compliant), we use the second cell to set the appropriate type

 - If mapped from the device-tree and #interrupt-cells is 1 (current
OPAL on P8 does that), we assume level sensitive since those are
typically going to be the PSI LSIs which are level sensitive.

Additionally, we mark the interrupts requested via the opal_interrupts
property all level. This is a bit fishy but the best we can do until we
fix OPAL to properly expose them with a complete descriptor. It is also
correct for the current HW anyway as OPAL interrupts are currently PCI
error and PSI interrupts which are level.

Finally now that edge interrupts are properly identified, we can enable
CONFIG_HARDIRQS_SW_RESEND which will make the core re-send them if
they occur while masked, which some drivers rely upon.

This fixes issues with lost interrupts on some Mellanox adapters.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09 14:50:18 +10:00
Krzysztof Kozlowski 00085f1efa dma-mapping: use unsigned long for dma_attrs
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer.  Thus the pointer can point to const data.
However the attributes do not have to be a bitfield.  Instead unsigned
long will do fine:

1. This is just simpler.  Both in terms of reading the code and setting
   attributes.  Instead of initializing local attributes on the stack
   and passing pointer to it to dma_set_attr(), just set the bits.

2. It brings safeness and checking for const correctness because the
   attributes are passed by value.

Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;

    @@
    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

and

    // Options: --all-includes
    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;

    @@
    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-04 08:50:07 -04:00
Linus Torvalds 221bb8a46e - ARM: GICv3 ITS emulation and various fixes. Removal of the old
VGIC implementation.
 
 - s390: support for trapping software breakpoints, nested virtualization
 (vSIE), the STHYI opcode, initial extensions for CPU model support.
 
 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots of cleanups,
 preliminary to this and the upcoming support for hardware virtualization
 extensions.
 
 - x86: support for execute-only mappings in nested EPT; reduced vmexit
 latency for TSC deadline timer (by about 30%) on Intel hosts; support for
 more than 255 vCPUs.
 
 - PPC: bugfixes.
 
 The ugly bit is the conflicts.  A couple of them are simple conflicts due
 to 4.7 fixes, but most of them are with other trees. There was definitely
 too much reliance on Acked-by here.  Some conflicts are for KVM patches
 where _I_ gave my Acked-by, but the worst are for this pull request's
 patches that touch files outside arch/*/kvm.  KVM submaintainers should
 probably learn to synchronize better with arch maintainers, with the
 latter providing topic branches whenever possible instead of Acked-by.
 This is what we do with arch/x86.  And I should learn to refuse pull
 requests when linux-next sends scary signals, even if that means that
 submaintainers have to rebase their branches.
 
 Anyhow, here's the list:
 
 - arch/x86/kvm/vmx.c: handle_pcommit and EXIT_REASON_PCOMMIT was removed
 by the nvdimm tree.  This tree adds handle_preemption_timer and
 EXIT_REASON_PREEMPTION_TIMER at the same place.  In general all mentions
 of pcommit have to go.
 
 There is also a conflict between a stable fix and this patch, where the
 stable fix removed the vmx_create_pml_buffer function and its call.
 
 - virt/kvm/kvm_main.c: kvm_cpu_notifier was removed by the hotplug tree.
 This tree adds kvm_io_bus_get_dev at the same place.
 
 - virt/kvm/arm/vgic.c: a few final bugfixes went into 4.7 before the
 file was completely removed for 4.8.
 
 - include/linux/irqchip/arm-gic-v3.h: this one is entirely our fault;
 this is a change that should have gone in through the irqchip tree and
 pulled by kvm-arm.  I think I would have rejected this kvm-arm pull
 request.  The KVM version is the right one, except that it lacks
 GITS_BASER_PAGES_SHIFT.
 
 - arch/powerpc: what a mess.  For the idle_book3s.S conflict, the KVM
 tree is the right one; everything else is trivial.  In this case I am
 not quite sure what went wrong.  The commit that is causing the mess
 (fd7bacbca4, "KVM: PPC: Book3S HV: Fix TB corruption in guest exit
 path on HMI interrupt", 2016-05-15) touches both arch/powerpc/kernel/
 and arch/powerpc/kvm/.  It's large, but at 396 insertions/5 deletions
 I guessed that it wasn't really possible to split it and that the 5
 deletions wouldn't conflict.  That wasn't the case.
 
 - arch/s390: also messy.  First is hypfs_diag.c where the KVM tree
 moved some code and the s390 tree patched it.  You have to reapply the
 relevant part of commits 6c22c98637, plus all of e030c1125e, to
 arch/s390/kernel/diag.c.  Or pick the linux-next conflict
 resolution from http://marc.info/?l=kvm&m=146717549531603&w=2.
 Second, there is a conflict in gmap.c between a stable fix and 4.8.
 The KVM version here is the correct one.
 
 I have pushed my resolution at refs/heads/merge-20160802 (commit
 3d1f53419842) at git://git.kernel.org/pub/scm/virt/kvm/kvm.git.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJXoGm7AAoJEL/70l94x66DugQIAIj703ePAFepB/fCrKHkZZia
 SGrsBdvAtNsOhr7FQ5qvvjLxiv/cv7CymeuJivX8H+4kuUHUllDzey+RPHYHD9X7
 U6n1PdCH9F15a3IXc8tDjlDdOMNIKJixYuq1UyNZMU6NFwl00+TZf9JF8A2US65b
 x/41W98ilL6nNBAsoDVmCLtPNWAqQ3lajaZELGfcqRQ9ZGKcAYOaLFXHv2YHf2XC
 qIDMf+slBGSQ66UoATnYV2gAopNlWbZ7n0vO6tE2KyvhHZ1m399aBX1+k8la/0JI
 69r+Tz7ZHUSFtmlmyByi5IAB87myy2WQHyAPwj+4vwJkDGPcl0TrupzbG7+T05Y=
 =42ti
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:

 - ARM: GICv3 ITS emulation and various fixes.  Removal of the
   old VGIC implementation.

 - s390: support for trapping software breakpoints, nested
   virtualization (vSIE), the STHYI opcode, initial extensions
   for CPU model support.

 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots
   of cleanups, preliminary to this and the upcoming support for
   hardware virtualization extensions.

 - x86: support for execute-only mappings in nested EPT; reduced
   vmexit latency for TSC deadline timer (by about 30%) on Intel
   hosts; support for more than 255 vCPUs.

 - PPC: bugfixes.

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
  KVM: PPC: Introduce KVM_CAP_PPC_HTM
  MIPS: Select HAVE_KVM for MIPS64_R{2,6}
  MIPS: KVM: Reset CP0_PageMask during host TLB flush
  MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
  MIPS: KVM: Sign extend MFC0/RDHWR results
  MIPS: KVM: Fix 64-bit big endian dynamic translation
  MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
  MIPS: KVM: Use 64-bit CP0_EBase when appropriate
  MIPS: KVM: Set CP0_Status.KX on MIPS64
  MIPS: KVM: Make entry code MIPS64 friendly
  MIPS: KVM: Use kmap instead of CKSEG0ADDR()
  MIPS: KVM: Use virt_to_phys() to get commpage PFN
  MIPS: Fix definition of KSEGX() for 64-bit
  KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
  kvm: x86: nVMX: maintain internal copy of current VMCS
  KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
  KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
  KVM: arm64: vgic-its: Simplify MAPI error handling
  KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
  KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
  ...
2016-08-02 16:11:27 -04:00
Alexey Kardashevskiy 802a345183 powerpc/powernv/ioda: Fix endianness when reading TCEs
The iommu_table_ops::exchange() callback writes new TCE to the table and
returns old value and permission mask. The old TCE value is correctly
converted from BE to CPU endian; however permission mask was calculated
from BE value and therefore always returned DMA_NONE which could cause
memory leak on LE systems using VFIO SPAPR TCE IOMMU v1 driver.

This fixes pnv_tce_xchg() to have @oldtce a CPU endian.

Fixes: 05c6cfb9dc ("powerpc/iommu/powernv: Release replaced TCE")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 20:21:06 +10:00
Benjamin Herrenschmidt f2d576948d powerpc: Get rid of ppc_md.init_early()
It is now called right after platform probe, so the probe function
can just do the job.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 19:07:26 +10:00
Benjamin Herrenschmidt 406b0b6ae3 powerpc/64: Move 64-bit probe_machine() to later in the boot process
We no long need the machine type that early, so we can move probe_machine()
to after the device-tree has been expanded. This will allow further
consolidation.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21 18:59:22 +10:00