Commit Graph

18 Commits

Author SHA1 Message Date
Paul Mackerras 44a3add863 KVM: PPC: Book3S HV: Better handling of exceptions that happen in real mode
When an interrupt or exception happens in the guest that comes to the
host, the CPU goes to hypervisor real mode (MMU off) to handle the
exception but doesn't change the MMU context.  After saving a few
registers, we then clear the "in guest" flag.  If, for any reason,
we get an exception in the real-mode code, that then gets handled
by the normal kernel exception handlers, which turn the MMU on.  This
is disastrous if the MMU is still set to the guest context, since we
end up executing instructions from random places in the guest kernel
with hypervisor privilege.

In order to catch this situation, we define a new value for the "in guest"
flag, KVM_GUEST_MODE_HOST_HV, to indicate that we are in hypervisor real
mode with guest MMU context.  If the "in guest" flag is set to this value,
we branch off to an emergency handler.  For the moment, this just does
a branch to self to stop the CPU from doing anything further.

While we're here, we define another new flag value to indicate that we
are in a HV guest, as distinct from a PR guest.  This will be useful
when we have a kernel that can support both PR and HV guests concurrently.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-17 14:49:37 +02:00
Paul Mackerras a4a0f2524a KVM: PPC: Book3S PR: Allow guest to use 64k pages
This adds the code to interpret 64k HPTEs in the guest hashed page
table (HPT), 64k SLB entries, and to tell the guest about 64k pages
in kvm_vm_ioctl_get_smmu_info().  Guest 64k pages are still shadowed
by 4k pages.

This also adds another hash table to the four we have already in
book3s_mmu_hpte.c to allow us to find all the PTEs that we have
instantiated that match a given 64k guest page.

The tlbie instruction changed starting with POWER6 to use a bit in
the RB operand to indicate large page invalidations, and to use other
RB bits to indicate the base and actual page sizes and the segment
size.  64k pages came in slightly earlier, with POWER5++.
We use one bit in vcpu->arch.hflags to indicate that the emulated
cpu supports 64k pages, and another to indicate that it has the new
tlbie definition.

The KVM_PPC_GET_SMMU_INFO ioctl presents a bit of a problem, because
the MMU capabilities depend on which CPU model we're emulating, but it
is a VM ioctl not a VCPU ioctl and therefore doesn't get passed a VCPU
fd.  In addition, commonly-used userspace (QEMU) calls it before
setting the PVR for any VCPU.  Therefore, as a best effort we look at
the first vcpu in the VM and return 64k pages or not depending on its
capabilities.  We also make the PVR default to the host PVR on recent
CPUs that support 1TB segments (and therefore multiple page sizes as
well) so that KVM_PPC_GET_SMMU_INFO will include 64k page and 1TB
segment support on those CPUs.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-17 14:45:03 +02:00
Mihai Caraman 4edd1ae91b kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakage
Interrupt numbers defined for Book3E follows IVORs definition. Align
BOOKE_INTERRUPT_ALTIVEC_UNAVAIL and BOOKE_INTERRUPT_ALTIVEC_ASSIST to this
rule which also fixes the build breakage.
IVORs 32 and 33 are shared so reflect this in the interrupts naming.

This fixes a build break for 64-bit booke KVM.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11 11:10:49 +03:00
Kumar Gala cd66cc2ee5 powerpc/85xx: Add AltiVec support for e6500
The e6500 core adds support for AltiVec on a Book-E class processor.
Connect up all the various exception handling code and build config
mechanisms to allow user spaces apps to utilize AltiVec.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2013-03-12 15:59:26 -05:00
Paul Mackerras 913d3ff9a3 KVM: PPC: Book3s HV: Don't access runnable threads list without vcore lock
There were a few places where we were traversing the list of runnable
threads in a virtual core, i.e. vc->runnable_threads, without holding
the vcore spinlock.  This extends the places where we hold the vcore
spinlock to cover everywhere that we traverse that list.

Since we possibly need to sleep inside kvmppc_book3s_hv_page_fault,
this moves the call of it from kvmppc_handle_exit out to
kvmppc_vcpu_run, where we don't hold the vcore lock.

In kvmppc_vcore_blocked, we don't actually need to check whether
all vcpus are ceded and don't have any pending exceptions, since the
caller has already done that.  The caller (kvmppc_run_vcpu) wasn't
actually checking for pending exceptions, so we add that.

The change of if to while in kvmppc_run_vcpu is to make sure that we
never call kvmppc_remove_runnable() when the vcore state is RUNNING or
EXITING.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-30 10:54:55 +01:00
Alexander Graf 3d4c6826ed KVM: PPC: Restrict PPC_[L|ST]D macro to asm code
We only want asm code macros to be accessible from asm code, so #ifdef it
depending on it.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-05-06 16:19:08 +02:00
Varun Sethi 185e4188da KVM: PPC: bookehv: Use a Macro for saving/restoring guest registers to/from their 64 bit copies.
Introduced PPC_STD/PPC_LD macros for saving/restoring guest registers to/from their 64 bit copies.

Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-05-06 16:19:08 +02:00
Scott Wood d30f6e4800 KVM: PPC: booke: category E.HV (GS-mode) support
Chips such as e500mc that implement category E.HV in Power ISA 2.06
provide hardware virtualization features, including a new MSR mode for
guest state.  The guest OS can perform many operations without trapping
into the hypervisor, including transitions to and from guest userspace.

Since we can use SRR1[GS] to reliably tell whether an exception came from
guest state, instead of messing around with IVPR, we use DO_KVM similarly
to book3s.

Current issues include:
 - Machine checks from guest state are not routed to the host handler.
 - The guest can cause a host oops by executing an emulated instruction
   in a page that lacks read permission.  Existing e500/4xx support has
   the same problem.

Includes work by Ashish Kalra <Ashish.Kalra@freescale.com>,
Varun Sethi <Varun.Sethi@freescale.com>, and
Liu Yu <yu.liu@freescale.com>.

Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: remove pt_regs usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:51:19 +03:00
Paul Mackerras de56a948b9 KVM: PPC: Add support for Book3S processors in hypervisor mode
This adds support for KVM running on 64-bit Book 3S processors,
specifically POWER7, in hypervisor mode.  Using hypervisor mode means
that the guest can use the processor's supervisor mode.  That means
that the guest can execute privileged instructions and access privileged
registers itself without trapping to the host.  This gives excellent
performance, but does mean that KVM cannot emulate a processor
architecture other than the one that the hardware implements.

This code assumes that the guest is running paravirtualized using the
PAPR (Power Architecture Platform Requirements) interface, which is the
interface that IBM's PowerVM hypervisor uses.  That means that existing
Linux distributions that run on IBM pSeries machines will also run
under KVM without modification.  In order to communicate the PAPR
hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code
to include/linux/kvm.h.

Currently the choice between book3s_hv support and book3s_pr support
(i.e. the existing code, which runs the guest in user mode) has to be
made at kernel configuration time, so a given kernel binary can only
do one or the other.

This new book3s_hv code doesn't support MMIO emulation at present.
Since we are running paravirtualized guests, this isn't a serious
restriction.

With the guest running in supervisor mode, most exceptions go straight
to the guest.  We will never get data or instruction storage or segment
interrupts, alignment interrupts, decrementer interrupts, program
interrupts, single-step interrupts, etc., coming to the hypervisor from
the guest.  Therefore this introduces a new KVMTEST_NONHV macro for the
exception entry path so that we don't have to do the KVM test on entry
to those exception handlers.

We do however get hypervisor decrementer, hypervisor data storage,
hypervisor instruction storage, and hypervisor emulation assist
interrupts, so we have to handle those.

In hypervisor mode, real-mode accesses can access all of RAM, not just
a limited amount.  Therefore we put all the guest state in the vcpu.arch
and use the shadow_vcpu in the PACA only for temporary scratch space.
We allocate the vcpu with kzalloc rather than vzalloc, and we don't use
anything in the kvmppc_vcpu_book3s struct, so we don't allocate it.
We don't have a shared page with the guest, but we still need a
kvm_vcpu_arch_shared struct to store the values of various registers,
so we include one in the vcpu_arch struct.

The POWER7 processor has a restriction that all threads in a core have
to be in the same partition.  MMU-on kernel code counts as a partition
(partition 0), so we have to do a partition switch on every entry to and
exit from the guest.  At present we require the host and guest to run
in single-thread mode because of this hardware restriction.

This code allocates a hashed page table for the guest and initializes
it with HPTEs for the guest's Virtual Real Memory Area (VRMA).  We
require that the guest memory is allocated using 16MB huge pages, in
order to simplify the low-level memory management.  This also means that
we can get away without tracking paging activity in the host for now,
since huge pages can't be paged or swapped.

This also adds a few new exports needed by the book3s_hv code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:54 +03:00
Benjamin Herrenschmidt a5d4f3ad3a powerpc: Base support for exceptions using HSRR0/1
Pass the register type to the prolog, also provides alternate "HV"
version of hardware interrupt (0x500) and adjust LPES accordingly

We tag those interrupts by setting bit 0x2 in the trap number

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:22 +10:00
Alexander Graf 17bd158006 KVM: PPC: Implement Level interrupts on Book3S
The current interrupt logic is just completely broken. We get a notification
from user space, telling us that an interrupt is there. But then user space
expects us that we just acknowledge an interrupt once we deliver it to the
guest.

This is not how real hardware works though. On real hardware, the interrupt
controller pulls the external interrupt line until it gets notified that the
interrupt was received.

So in reality we have two events: pulling and letting go of the interrupt line.

To maintain backwards compatibility, I added a new request for the pulling
part. The letting go part was implemented earlier already.

With this in place, we can now finally start guests that do not randomly stall
and stop to work at random times.

This patch implements above logic for Book3S.

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:19 +02:00
Alexander Graf b83d4a9cfc KVM: PPC: Enable native paired singles
When we're on a paired single capable host, we can just always enable
paired singles and expose them to the guest directly.

This approach breaks when multiple VMs run and access PS concurrently,
but this should suffice until we get a proper framework for it in Linux.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:19:08 +03:00
Alexander Graf 3c402a75ea KVM: PPC: Add hidden flag for paired singles
The Gekko implements an extension called paired singles. When the guest wants
to use that extension, we need to make sure we're not running the host FPU,
because all FPU instructions need to get emulated to accomodate for additional
operations that occur.

This patch adds an hflag to track if we're in paired single mode or not.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:34:50 +03:00
Alexander Graf b4433a7cce KVM: PPC: Implement 'skip instruction' mode
To fetch the last instruction we were interrupted on, we enable DR in early
exit code, where we are still in a very transitional phase between guest
and host state.

Most of the time this seemed to work, but another CPU can easily flush our
TLB and HTAB which makes us go in the Linux page fault handler which totally
breaks because we still use the guest's SLB entries.

To work around that, let's introduce a second KVM guest mode that defines
that whenever we get a trap, we don't call the Linux handler or go into
the KVM exit code, but just jump over the faulting instruction.

That way a potentially bad lwz doesn't trigger any faults and we can later
on interpret the invalid instruction we fetched as "fetch didn't work".

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:48 -03:00
Alexander Graf e15a113700 powerpc/kvm: Sync guest visible MMU state
Currently userspace has no chance to find out which virtual address space we're
in and resolve addresses. While that is a big problem for migration, it's also
unpleasent when debugging, as gdb and the monitor don't work on virtual
addresses.

This patch exports enough of the MMU segment state to userspace to make
debugging work and thus also includes the groundwork for migration.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-08 16:02:50 +11:00
Alexander Graf 83cd259d8e Add Book3s definitions
We need quite a bunch of new constants for KVM on Book3s,
so let's define them now.

These constants will be used in later patches.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:49:52 +11:00
Hollis Blanchard bb3a8a178d KVM: ppc: Add extra E500 exceptions
e500 has additional interrupt vectors (and corresponding IVORs) for SPE and
performance monitoring interrupts.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:02:59 +02:00
Stephen Rothwell b8b572e101 powerpc: Move include files to arch/powerpc/include/asm
from include/asm-powerpc.  This is the result of a

mkdir arch/powerpc/include/asm
git mv include/asm-powerpc/* arch/powerpc/include/asm

Followed by a few documentation/comment fixups and a couple of places
where <asm-powepc/...> was being used explicitly.  Of the latter only
one was outside the arch code and it is a driver only built for powerpc.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-08-04 12:02:00 +10:00